Find Investable Startups and Competitors
Search thousands of startups using natural language—just describe what you're looking for
Top 50 AI Model Hosting
Discover the top 50 AI Model Hosting startups. Browse funding data, key metrics, and company insights. Average funding: $96.2M.
Featherless.ai offers serverless AI hosting with a GPU orchestration system, simplifying the deployment and management of AI models. Their platform allows developers to run AI applications without managing underlying infrastructure, optimizing GPU utilization and reducing operational overhead.
Funding: $5.0M
Rough estimate of the amount of funding raised
Airbus Ventures
OpenGradient is a decentralized platform that enables secure hosting and inference execution of open-source AI models using EVM-compatible smart contracts and a heterogeneous AI compute architecture. It addresses the challenges of model deployment and verifiable inference in AI applications, allowing developers to build scalable and permissionless solutions.
Funding: $9.0M
Lilypad offers a platform for AI model deployment, distribution, and monetization, connecting model creators with compute providers. Their platform combines a model marketplace, MLOps tools, and a distributed compute network to simplify scaling AI inference across various applications. This allows AI model creators to generate revenue and compute providers to monetize their resources.
Funding: $1.9M
Provides a platform for deploying and optimizing generative AI models, including large language models (LLMs), with tools for fine-tuning, real-time monitoring, and autoscaling. Reduces GPU costs by over 50% and improves inference performance with techniques like iteration batching, native quantization, and dedicated GPU resource management, enabling businesses to scale AI applications efficiently and securely.
Funding: $6.8M
Provides a serverless machine learning inference platform that enables businesses to deploy and scale AI models via a simple API, eliminating the need for complex ML infrastructure. It reduces costs and improves efficiency by offering pay-per-use pricing, low-latency performance, and automatic scaling on dedicated A100 and H100 GPUs.
Funding: $20.6M
Tensorfuse provides a platform for deploying and managing large language model (LLM) pipelines on cloud infrastructure, allowing users to run serverless GPUs on AWS, Azure, or GCP. The solution enables businesses to scale generative AI models efficiently while keeping data secure within their private cloud, eliminating idle costs and reducing egress charges.
Funding: $500.0K
Y Combinator
Provides a cloud-based platform for training, fine-tuning, and deploying generative AI models, optimized with proprietary hardware-software integration to eliminate virtualization overhead. Radium’s architecture enables up to 50% faster training and 135% faster inference compared to traditional hyperscalers, supporting scalable, secure, and cost-efficient AI development for enterprises.
Funding: $4.7M
Crusoe provides a managed AI cloud platform that delivers low‑latency, high‑throughput inference for large‑context models using NVIDIA and AMD GPUs with its MemoryAlloy engine. The service abstracts cluster provisioning via an API‑key workflow, auto‑scales on Kubernetes/Slurm, and includes a web console for one‑click model deployment, while its renewable‑powered data centers reduce compute costs by up to 80%.
Funding: $1.4B
Mubadala Capital, Valor Equity Partners
Baseten provides a platform for deploying and serving machine learning models with optimized inference speed and autoscaling capabilities, enabling seamless transition from development to production. The solution addresses the complexities of model infrastructure management, allowing teams to focus on building and iterating on their AI applications without incurring excessive costs.
Funding: $60.0M
Spark Capital, IVP
Together AI provides a cloud-based platform for training, fine-tuning, and deploying open-source generative AI models using NVIDIA's latest GPUs, enabling users to achieve high performance at a lower cost. The platform addresses the need for scalable AI infrastructure by offering customizable solutions that support the entire generative AI lifecycle, from model development to production deployment.
Funding: $232.5M
Salesforce Ventures
Lepton AI Cloud provides a scalable platform for AI inference and training, utilizing high-performance GPU infrastructure and a fast LLM engine to achieve up to 600 tokens per second. The platform enables enterprises to efficiently deploy and manage AI models, processing over 20 billion tokens and generating 1 million images daily with 99.9% uptime.
Funding: $11.0M
Salt AI offers a development engine that accelerates AI adoption in life sciences by providing a platform for reproducible AI workflows. It enables faster time-to-output and reduced compute costs through optimized model hosting and visual workflow design, facilitating collaboration for drug discovery and biological research.
Funding: $3.0M
Morpheus Ventures
Provides a platform for decentralized AI model training by aggregating global compute resources and enabling multi-node GPU deployments across cloud providers. This reduces costs and increases accessibility for developing large-scale models, while allowing contributors to co-own and improve open-source AI innovations.
Funding: $15.0M
Founders Fund
Provides a fully managed AI cloud platform powered by NVIDIA® H100 and H200 Tensor Core GPUs, offering scalable GPU clusters with InfiniBand networking for high-speed data processing. Enables efficient model training, fine-tuning, and inference with tools like MLflow, PostgreSQL, and Apache Spark, reducing the complexity and cost of deploying AI applications at scale.
Funding: $700.0M
Ori provides on-demand access to top-tier GPUs and serverless Kubernetes for training and deploying machine learning models at scale. The platform offers cost-optimized solutions that allow users to pay only for the resources they utilize, addressing the need for flexible and efficient AI infrastructure.
Funding: $148.8M
Rolling AI provides a robust AI infrastructure platform that enables users to deploy and manage machine learning models at scale. The platform addresses the challenges of high computational costs and complex deployment processes, allowing businesses to efficiently harness AI capabilities for diverse applications.
Oumi offers an open-source platform for the end-to-end development and deployment of AI models, supporting pre-training, fine-tuning, and evaluation of text and multimodal architectures. It enables flexible deployment across diverse environments and helps users avoid vendor lock-in while accelerating AI research and development.
Funding: $10.0M
Obvious Ventures, Venrock
Nscale provides a GPU cloud platform optimized for AI workloads, featuring on-demand compute and inference services, dedicated training clusters, and scalable GPU nodes. The platform addresses the high costs and inefficiencies associated with AI model training and deployment by offering a fully integrated infrastructure powered by renewable energy in Europe.
Funding: $185.0M
Sandton Capital Partners
Lyceum simplifies AI model training by automating GPU infrastructure selection and deployment. The platform offers one-click GPU deployment, intelligent hardware matching, and predictive runtime analysis to optimize job scheduling for speed and cost efficiency. This allows AI developers and data scientists to focus on model development without managing complex infrastructure.
Funding: $12.0M
redalpine
Mistral AI provides open-weight generative AI models that developers and businesses can customize and deploy in various environments, including on-premise and cloud platforms. Their technology enhances AI application development by offering high-performance models with validated reasoning capabilities, ensuring independence from specific cloud providers.
Funding: $1.0M
General Catalyst, DST Global
BentoML provides a Unified Inference Platform that enables developers to build and deploy scalable AI systems using any model on their preferred cloud infrastructure. The platform addresses the challenges of slow iteration and high costs in AI deployment by offering features like auto-scaling, low-latency serving, and seamless integration with existing cloud resources.
Funding: $10.0M
DCM Ventures
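Several platforms in this list, BentoML among them, advertise auto-scaling for model serving. The core policy is usually concurrency-based: scale replicas so each handles roughly a target number of in-flight requests. A minimal sketch of that policy (illustrative only, not any vendor's actual implementation; real systems add smoothing windows and cooldowns):

```python
import math

def desired_replicas(in_flight_requests: int, target_concurrency: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return the replica count needed so that each replica serves
    roughly target_concurrency requests, clamped to [min, max]."""
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")
    wanted = math.ceil(in_flight_requests / target_concurrency)
    return max(min_replicas, min(max_replicas, wanted))

# 23 in-flight requests at a target of 5 per replica -> 5 replicas;
# zero traffic falls back to the configured minimum.
```

Clamping to a minimum of one replica is what distinguishes "always warm" serving from scale-to-zero serverless offerings, where cold-start latency becomes the trade-off.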
Hydra Host provides dedicated bare metal servers with full root access and optimized GPU configurations for AI and high-performance computing workloads, ensuring maximum processing capabilities without the overhead of shared resources. The platform addresses the need for enhanced privacy and security in multi-cloud environments by offering customizable solutions that eliminate vulnerabilities associated with multi-tenant setups.
Funding: $14.1M
Founders Fund
Simplismart provides a high-performance inference engine that enables rapid deployment and fine-tuning of generative AI models on-premises or across various cloud platforms. This technology reduces model deployment time from months to days, significantly lowering operational costs while enhancing inference speed and scalability.
Funding: $8.3M
Google for Startups
Replicate provides an API for software developers to run and fine-tune open-source AI models, enabling the deployment of custom models at scale with minimal code. This platform addresses the challenge of accessing and utilizing advanced AI capabilities without requiring extensive machine learning expertise or infrastructure management.
Funding: $57.8M
Andreessen Horowitz, Sequoia Capital, Y Combinator
RunPod is a cloud platform that provides globally distributed GPU resources for deploying and scaling machine learning applications, enabling developers to run AI workloads without managing infrastructure. The platform reduces cold-start times to under 250 milliseconds and offers flexible pricing, allowing users to efficiently handle fluctuating demand while minimizing operational costs.
Funding: $20.0M
Intel Capital, Dell Technologies Capital
TrueFoundry provides a platform that automates the deployment and management of machine learning models on users' own infrastructure, integrating seamlessly with GPUs and TPUs for efficient resource utilization. By simplifying the complexities of model training, inference, and monitoring, it enables data scientists and ML engineers to focus on delivering actionable insights while significantly reducing cloud costs.
Funding: $18.5M
Fireworks AI provides a serverless inference platform that enables the rapid deployment and fine-tuning of compound AI models, optimizing for speed and cost efficiency. The technology addresses the challenges of slow model inference and high operational costs, allowing businesses to scale AI applications effectively while maintaining low latency and high throughput.
Funding: $77.0M
Sequoia Capital
TitanML provides an enterprise-grade LLM cluster for high-performance language model inference, enabling organizations to deploy AI applications securely within their own infrastructure. This solution addresses the need for data privacy and control while optimizing operational costs and performance through advanced inference techniques.
Funding: $14.9M
This company provides on-demand GPU cloud infrastructure optimized for AI and machine learning workloads. Their platform offers scalable GPU clusters, high-speed storage, and secure networking, enabling teams to accelerate model training and deployment.
Bach is a platform-as-a-service that automates the setup and management of scalable cloud hosting environments specifically for AI and GPU workloads, eliminating the need for DevOps expertise. By utilizing multi-tenant cluster sharing and auto-scaling, Bach reduces infrastructure costs and accelerates application development, enabling teams to focus on building rather than managing complex cloud systems.
NetMind offers a unified platform for accessing and deploying diverse AI models, including LLMs and multimodal capabilities, through standard APIs and the Model Context Protocol. The service simplifies AI infrastructure by providing on-demand GPU cluster rentals and managed inference endpoints, enabling developers to integrate AI without managing complex deployments.
FlyMy.AI provides a cloud platform that enables businesses to run and integrate thousands of AI models with optimized inference times as low as 55.7 milliseconds, utilizing a compiler-first architecture for peak performance. This solution eliminates the need for extensive engineering teams and reduces operational costs by offering autoscaling and per-second billing, making advanced AI capabilities accessible to companies of all sizes.
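Per-second billing of the kind FlyMy.AI describes matters most for bursty inference traffic, where hourly billing charges every started hour in full. A rough cost comparison (the $2.00/hour GPU rate is a hypothetical figure for illustration):

```python
def hourly_cost(active_seconds: int, rate_per_hour: float) -> float:
    """Hourly billing: every started hour is charged in full."""
    hours_billed = -(-active_seconds // 3600)  # ceiling division
    return hours_billed * rate_per_hour

def per_second_cost(active_seconds: int, rate_per_hour: float) -> float:
    """Per-second billing: pay pro rata for seconds actually used."""
    return active_seconds * rate_per_hour / 3600

# At a hypothetical $2.00/hour, 90 seconds of active inference costs
# a full $2.00 under hourly billing but only $0.05 billed per second.
```

The gap shrinks as utilization rises; workloads that keep a GPU busy around the clock see little benefit from per-second granularity.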
Denvr Cloud provides on-demand and dedicated GPU computing for AI inference and model training, utilizing NVIDIA GPUs and Intel AI accelerators to enhance performance and scalability. The platform simplifies AI operations by offering transparent pricing and real-time cost monitoring, addressing the need for efficient and cost-effective infrastructure in AI development.
Funding: $10.8M
HostedAI is a software platform that enables service providers to efficiently manage and monetize GPU resources through dynamic allocation and consumption-based billing. By normalizing diverse GPU hardware, it simplifies AI workload management, allowing enterprises to scale operations flexibly while reducing operational complexity and costs.
Daemo AI offers an enterprise AI platform that streamlines model creation and deployment via a low‑code interface and a library of pre‑trained, domain‑agnostic models that can be fine‑tuned on proprietary data. The platform provides scalable MLOps pipelines, automated CI/CD, data versioning, and secure REST/gRPC inference APIs with built‑in monitoring, drift detection, and governance dashboards. It is delivered under a subscription model with tiered compute usage and pay‑per‑inference pricing.
FedDev, BDC Capital's Growth & Transition Capital
Doubleword AI provides an inference platform that lets enterprises run large language models securely across on‑premise, private‑cloud, and public‑cloud environments. Its Batch Inference service delivers high‑throughput, cost‑optimized token processing with 1‑hour and 24‑hour SLAs, while the Control Layer adds centralized authentication, role‑based access, usage metering, and audit‑ready logging. The platform auto‑generates OpenAI‑compatible endpoints, uses GPU‑aware autoscaling and infrastructure‑as‑code for reliable, self‑healing deployments, enabling AI/ML teams to serve models without building custom infrastructure.
The startup provides a unified API that grants access to over 200 AI models, including advanced options for chat, image generation, and music creation, ensuring high availability with 99% uptime. This solution enables developers to integrate diverse AI functionalities into their applications efficiently, reducing deployment complexity and operational costs.
The startup offers a subscription-based machine learning platform that enables users to design, train, and maintain their AI models by simply uploading data. This technology allows businesses to leverage their data for actionable insights, enhancing profitability and operational efficiency.
Funding: $490.0K
Clave
Autumn8 provides a Deployment and Inference Toolkit for Machine Learning that includes a Model Hosting Service and an SLA Optimizer, enabling users to efficiently select, deploy, and manage generative AI models. The toolkit reduces deployment time by 50% and optimizes inference costs through workload and system-level enhancements, ensuring compliance with performance requirements across various cloud platforms.
This startup offers a distributed inference protocol that enables hosting and running fine-tuned large language models (LLMs) on decentralized hardware. Their platform prioritizes data privacy, reduces hosting costs, and scales elastically for clients using generative AI models.
Elotl provides a serverless infrastructure platform designed for deploying and managing microservices, specifically tailored for AI applications. The platform enables organizations to self-host large language models, retrieval-augmented generation, and vector databases, mitigating the high costs and data privacy risks associated with public GenAI inference APIs.
Funding: $7.5M
Vertex Ventures US
mkinf aggregates idle GPU capacity from a global network of data centers, providing a single entry point for accessing distributed compute power. This platform minimizes response latency and reduces compute costs by up to 10x, enabling efficient deployment of AI models and real-time inference.
Inceptron provides a unified inference platform that compiles AI model graphs into optimized GPU binaries with automatic operator fusion and hardware‑aware code generation. The managed runtime offers serverless, autoscaling GPU replicas across multiple clouds, integrated MLOps hooks, and built‑in observability and security controls, enabling low‑latency, cost‑effective production inference. Usage is billed per token for serverless deployments or hourly for dedicated GPUs.
TensorChord offers ModelZ, a serverless infrastructure platform that enables organizations to deploy machine learning models with auto-scaling capabilities and support for popular ML frameworks. This solution addresses the challenges of infrastructure management and scaling, allowing users to focus on developing and refining their AI applications without upfront costs or long-term commitments.
Hillhouse Ventures
Provides a unified AI model management platform that enables seamless deployment, scaling, and monitoring of over 250 machine learning models through a single API. It simplifies multi-model integration by offering features like fallback mechanisms, load balancing, and detailed usage tracking, ensuring high availability and cost efficiency for AI-powered applications.
Inferex provides a cloud infrastructure tailored for the deployment and scaling of artificial intelligence applications, enabling businesses to integrate AI models into their workflows efficiently. The platform addresses the high computational demands and complex data management associated with AI workloads, allowing for rapid model deployment and reliable execution.
Funding: $1.4M
NeuroWatt provides a full-stack AI infrastructure platform that enables users to rent GPU computational power and access AI solutions for model training and deployment. The company supports AI project development through incubation funding and community collaboration, addressing the need for scalable resources in the rapidly growing AI sector.
Bineric is a Norwegian AI company specializing in model development and prompt engineering, with its flagship product, NorskGPT, a GDPR-compliant large language model securely hosted in Norway. The company provides tailored AI solutions that enhance natural language processing capabilities for businesses operating in the Norwegian market.
Impulse AI provides a chat‑driven platform that lets non‑technical users specify AI tasks in natural language, automatically selects appropriate model architectures, sources relevant datasets, and runs fine‑tuning pipelines on scalable GPU infrastructure. The trained model is deployed to a managed REST/gRPC inference endpoint within seconds, and creators can publish and monetize models through an integrated marketplace. The service includes version control, performance dashboards, and enterprise‑grade security.