Top 50 Data Warehouse Service - Series B
Discover the top 50 Data Warehouse Service startups at Series B. Browse funding data, key metrics, and company insights. Average funding: $57M.
Yellowbrick Data
Mountain View, United States
Yellowbrick Data provides a high-performance SQL data platform that supports enterprise data warehousing and streaming analytics with continuous data ingestion and low-latency query execution. This technology enables organizations to efficiently handle large-scale, concurrent workloads while minimizing unpredictable query runtimes, facilitating faster decision-making.
Funding: $50M+
Rough estimate of the amount of funding raised
MotherDuck
Seattle, United States
MotherDuck is a cloud-based data warehouse that enhances DuckDB's in-process analytics capabilities, enabling real-time, collaborative data analysis without the overhead of traditional systems. It provides users with fast performance and efficient pricing, allowing for the rapid onboarding of non-technical users and the creation of interactive data applications.
Ocient
Chicago, United States
The startup operates a data analytics platform that enables rapid analysis of large datasets, handling tens of terabytes to exabytes with trillions of rows. By ingesting billions of rows per second and providing filtered aggregate results, the platform simplifies complex data ecosystems for organizations.
Funding: $100M+
Onehouse
Menlo Park, United States
Onehouse is a fully managed cloud-native lakehouse service that ingests data from various sources in near real-time, enabling organizations to maintain a single source of truth without the need for complex data replication. By leveraging Apache Hudi and supporting multiple query engines, it reduces operational costs by over 50% while providing scalable access to analytics-ready data.
Funding: $50M+
RisingWave Labs
San Francisco, United States
RisingWave Labs provides a serverless stream processing platform and an open-source distributed streaming database that enable real-time ingestion, transformation, and analysis of high-velocity data streams alongside historical data. This unified solution supports use cases like continuous analytics, real-time ETL, and feature engineering, delivering sub-second query performance and integration with tools like Kafka, PostgreSQL, and Snowflake.
Funding: $20M+
Tabular (now part of Databricks)
Tabular develops a data automation platform built on Apache Iceberg, enabling table format interoperability across data lakes and warehouses. It addresses the challenges of data fragmentation and inefficiency by providing a unified standard for managing and processing large-scale datasets.
Funding: $20M+
SCUBA Analytics
Mountain View, United States
SCUBA is a Time-Series Data Warehouse optimized for real-time decision intelligence, enabling businesses to leverage first-party and AI-generated data without complex integrations. The platform provides in-the-moment insights and activation across channels while ensuring data privacy and enhancing audience profiles through agnostic identity matching.
Funding: $20M+
Revefi
Redmond, United States
Revefi develops Raden, an AI-powered data engineering platform that automates data observability, quality, and performance optimization for cloud data warehouses. The platform enables businesses to achieve a 30-50% reduction in data infrastructure costs and enhances operational efficiency by providing actionable insights within minutes.
Funding: $20M+
PrimeNumber
Tokyo, Japan
PrimeNumber offers a data integration service called TROCCO® that automates the data acquisition process, enabling engineers to efficiently manage and utilize data from various sources. This solution addresses the challenge of fragmented data environments by providing a centralized platform for data orchestration, enhancing data accessibility and operational efficiency for businesses.
Funding: $20M+
Vaultspeed
Leuven, Belgium
Vaultspeed provides a no-code, metadata-driven data modeling platform that automates data integration and business logic for cloud data warehouses, lakehouses, and meshes. This solution enables data teams to rapidly build and deploy data products in under two sprints, significantly reducing time-to-market and technical debt.
Funding: $20M+
Mesh-AI
London, United Kingdom
Mesh-AI provides data engineering and artificial intelligence consulting services to help organizations optimize their data workflows and implement machine learning solutions. By enhancing data accessibility and analytical capabilities, Mesh-AI enables businesses to make informed decisions and improve operational efficiency.
Funding: $20M+
Upsolver
Palo Alto, United States
Upsolver provides a platform for building continuous data pipelines for cloud data lakes using SQL and automated pipeline management. This technology enables organizations to efficiently process and analyze large volumes of data in real-time, enhancing data accessibility and decision-making.
Funding: $20M+
Bigeye
San Francisco, United States
Bigeye provides lineage-enabled data observability solutions that monitor data integrity across both modern and legacy data stacks. The platform enables teams to quickly identify and resolve data incidents, ensuring reliable data for analytics and decision-making.
Funding: $50M+
Hevo Data
San Francisco, United States
Hevo Data provides an automated data pipeline platform that integrates data from over 150 sources in near real-time, enabling users to prepare and load analytics-ready data with minimal maintenance. The platform ensures 100% data accuracy and 99.9% uptime, addressing the challenges of data management and operational inefficiencies for modern data teams.
Funding: $20M+
Reveal
Reveal is a cloud-based platform that enables organizations to capture and analyze critical business data efficiently. By streamlining data collection and reporting processes, it enhances decision-making and operational effectiveness for enterprises.
Funding: $200M+
Space and Time
San Francisco, United States
Space and Time provides a decentralized data platform that combines blockchain indexing, data warehousing, and API services, all secured by sub-second zero-knowledge (ZK) proofs for SQL queries. This technology enables smart contracts to access verified on-chain and off-chain data in real-time, enhancing the reliability and efficiency of data-driven applications.
Funding: $50M+
Census
San Francisco, United States
Census is a Data Activation and Reverse ETL platform that enables businesses to define and sync trusted data from their data warehouse to over 150 operational tools without the need for code or CSVs. This solution eliminates data silos, allowing marketing and data teams to collaborate effectively by providing real-time access to actionable insights and standardized datasets.
Funding: $50M+
Anomalo
Palo Alto, United States
Anomalo provides automated AI-driven data quality monitoring for enterprise data warehouses, utilizing unsupervised machine learning to detect anomalies and validate data integrity without requiring code. This solution addresses the issue of unreliable data by enabling rapid identification and resolution of data quality problems, ensuring accurate and trustworthy insights for business operations.
Funding: $100M+
Cube
Cube offers a universal semantic layer that standardizes data definitions and governance across multiple business intelligence tools, enabling consistent insights and efficient analytics workflows. By centralizing data modeling and access control, Cube reduces analytics downtime and accelerates the development of data applications, resulting in significant cost savings and improved performance.
Funding: $20M+
Harbr
London, United Kingdom
Harbr provides a cloud-based platform for creating, governing, and distributing data products across enterprises without moving data from its source. It enables data sharing, integration with tools and AI models, and the establishment of self-service data marketplaces, reducing the complexity and cost of data access and distribution.
Funding: $20M+
Rasgo
New York, United States
The startup offers a feature store workflow platform that streamlines data acquisition, integration, and feature engineering for data scientists. By automating repetitive data preparation tasks, it enables teams to focus on delivering actionable insights more efficiently.
Funding: $20M+
Tobiko
San Mateo, United States
The startup develops an open-source DataOps platform that enables data teams to transform large datasets efficiently, facilitating collaborative data management and testing of data pipeline changes. This solution addresses the challenges of data integration and decision-making by providing a framework that enhances the scalability and reliability of data operations.
Funding: $20M+
Voltron Data
Cranford, United States
Voltron Data provides Theseus, a GPU-accelerated SQL engine designed for processing petabyte-scale data without the need for indexing or data movement. It enables enterprises to significantly reduce query times, server counts, and operational costs, making it ideal for large-scale ETL and machine learning preprocessing tasks.
Funding: $100M+
Rivery
New York, United States
Rivery is a no-code ELT platform that enables rapid data ingestion, transformation, and orchestration from various sources into cloud data warehouses. It streamlines data workflows and reduces processing time, allowing businesses to efficiently manage their data operations and enhance analytics capabilities.
Funding: $20M+
WEKA
Campbell, United States
WEKA provides a cloud-native, software-defined data platform that enables organizations to efficiently store, process, and manage large volumes of data across on-premises and cloud environments. By transforming stagnant data silos into streaming data pipelines, WEKA enhances performance for AI and high-performance computing workloads while reducing energy consumption and carbon emissions.
Funding: $100M+
Dagster Labs
San Francisco, United States
Dagster Labs provides a cloud-native orchestration platform that enables data engineers to manage complex data pipelines using software-defined assets and a declarative programming model. This solution enhances data pipeline velocity and reliability through integrated lineage, observability, and first-class testing, addressing the challenges of data complexity and operational inefficiencies.
Funding: $20M+
Daasity
San Diego, United States
Daasity provides a unified data platform that integrates and analyzes omnichannel retail data, enabling brands to visualize key performance indicators and identify growth opportunities. By transforming raw data into actionable insights, Daasity helps businesses optimize inventory management, enhance customer segmentation, and improve demand forecasting.
Funding: $20M+
DataChat
Madison, United States
DataChat is a no-code conversational intelligence platform that enables business users to query their data in plain English, facilitating rapid access to insights without the need for technical expertise. By allowing iterative analytics and maintaining data security within the user's database, DataChat streamlines the decision-making process and enhances data-driven operations.
Funding: $20M+
StarTree
Mountain View, United States
StarTree provides a platform-as-a-service built on Apache Pinot, enabling real-time analytics with sub-second query response times on petabyte-scale data. This solution allows businesses to efficiently handle high concurrency demands while minimizing costs associated with data processing and analysis.
Funding: $50M+
Promethium
Menlo Park, United States
Promethium offers an AI-Powered Data Fabric that enables teams to access and analyze data products in real time without moving data, facilitating instant insights for both technical and non-technical users. This platform addresses the challenge of slow data access and complex integrations by providing a unified, self-service environment with enterprise-grade security and governance.
Funding: $20M+
Infoworks.io
Palo Alto, United States
Infoworks provides a unified platform that automates the migration of data, metadata, and workloads to the cloud, achieving three times faster migration at one-third the cost of traditional methods. The solution modernizes cloud data operations, enabling enterprises to efficiently leverage AI, machine learning, and analytics use cases with reduced resource requirements.
Funding: $20M+
Dasera
Mountain View, United States
Dasera is a Data Security Posture Management (DSPM) platform that automates the discovery, classification, and governance of structured and unstructured data across on-premises, cloud, and hybrid environments. By providing precise visibility and control over data access and usage, Dasera minimizes the risks associated with data breaches and regulatory non-compliance.
Funding: $20M+
Materialize
New York, United States
Materialize is a cloud operational data store that uses Differential Dataflow to provide strongly consistent, real-time views of operational data with sub-second latency. This technology enables businesses to quickly respond to changes by integrating and querying data from multiple sources without the complexity of traditional data processing methods.
Tembo
Cincinnati, United States
Tembo provides a fully managed PostgreSQL platform that supports over 200 extensions, enabling developers to build, scale, and optimize applications with features like data warehousing, message queuing, machine learning, and geospatial processing. By consolidating multiple data services into a single, extensible database solution, it reduces infrastructure sprawl and simplifies operational complexity for real-time and analytical workloads.
Funding: $20M+
Silk
Needham, United States
Silk provides a software-defined cloud storage platform that optimizes the performance and efficiency of cloud databases by utilizing a shared-compute architecture. It reduces costs through resource pooling and data reduction, while enabling up to 10x performance improvements, automated data lifecycle management, and zero RPO recovery across multiple cloud environments.
Funding: $20M+
Y42
Berlin, Germany
Y42 offers a turnkey data orchestration platform that enables users to build, monitor, and maintain data pipelines using Google BigQuery and Snowflake. It addresses fragmented data flows and inefficient maintenance by providing a unified architecture with built-in monitoring and version control, allowing teams to streamline their data operations.
Funding: $20M+
Datorios (formerly Metrolink.ai)
Palo Alto, United States
Datorios provides a specialized observability solution for Apache Flink, enabling real-time monitoring of data quality, operational continuity, and issue resolution across data, code, and infrastructure. This platform helps organizations mitigate risks associated with operational disruptions and poor data quality, ultimately protecting revenue and maintaining customer trust.
Funding: $20M+
Acceldata
Campbell, United States
Acceldata provides a unified data observability platform that enables businesses to monitor data pipelines, detect anomalies, and ensure data quality in real-time. This technology helps organizations prevent data failures and optimize costs, ultimately enhancing the reliability of their data infrastructure.
Funding: $100M+
Tinybird
New York, United States
Tinybird is a data platform that enables developers to ingest both batch and streaming data, query it using SQL, and publish APIs for user-facing analytics. By simplifying the creation of real-time data products, Tinybird addresses the challenge of rapidly delivering actionable insights from large datasets without the need for complex infrastructure management.
Funding: $20M+
Hydrolix
Portland, United States
Hydrolix is a streaming data lake that utilizes decoupled storage, indexed search, and stream processing to manage terabyte-scale log data efficiently. The platform reduces log data retention costs by 75% while enabling real-time query performance and eliminating the need for data aggregation or sampling.
Funding: $50M+
Decodable
San Francisco, United States
Decodable is a fully-managed, serverless platform that utilizes Apache Flink and Debezium for real-time ETL/ELT and stream processing, enabling users to efficiently ingest, transform, and deliver high-volume event data. The platform addresses the complexity of managing multiple tools by providing a unified solution that ensures data quality and compliance while minimizing operational overhead.
Funding: $20M+
Prophecy
San Francisco, United States
Prophecy offers a Data Transformation Copilot that enables users to build, deploy, and monitor data pipelines using an AI-powered visual interface that generates native Spark or SQL code. This platform addresses the challenge of inefficient data processing by allowing business users to self-serve, significantly reducing reliance on data engineers and accelerating analytics workflows.
Funding: $100M+
Qatalog
London, United Kingdom
Qatalog provides a data integration platform that connects and harmonizes information from emails, files, and various applications, including databases like BigQuery and Snowflake. It enables organizations to perform intelligent queries across their entire data ecosystem without requiring technical expertise, ensuring secure, permission-based access to insights. This streamlines data retrieval and collaboration for teams, reducing time spent on manual data aggregation and improving decision-making efficiency.
DataPelago
Mountain View, United States
The startup develops an analytics platform that provides real-time insights from diverse data sets, regardless of their scale or structure. This technology enables businesses to efficiently tackle complex data challenges while optimizing performance and cost.
Funding: $20M+
Striim
Bengaluru, India
Striim provides a unified data integration and streaming platform that enables real-time data ingestion and processing from diverse sources, including transaction logs and IoT sensors, using a SQL-like streaming engine. This technology allows businesses to achieve immediate insights and automate decision-making processes, enhancing operational efficiency and responsiveness to customer needs.
Funding: $50M+
Ascend.io
Menlo Park, United States
The startup offers an autonomous big data analytics platform that utilizes declarative configurations and automation to manage cloud infrastructure and optimize data pipelines. This technology reduces maintenance efforts throughout the data lifecycle, enabling business managers to initiate projects and make informed decisions with ease.
Funding: $50M+
Datafold
San Francisco, United States
Datafold provides a unified platform for proactive data quality management through automated CI testing, machine learning-powered monitoring, and cross-database data reconciliation. The solution enables data teams to prevent quality issues and accelerate data migrations, resulting in significant time savings and improved accuracy across workflows.
Funding: $20M+
Airbyte
San Francisco, United States
Airbyte is an open-source data integration engine that enables organizations to sync data from various applications to data warehouses, facilitating data movement across multi-cloud environments. By providing a platform for building custom connectors with low-code or no-code options, Airbyte addresses the challenge of managing diverse data sources while ensuring data privacy and governance.
Funding: $100M+
lakeFS
Santa Monica, United States
lakeFS provides a scalable data version control system that enables data engineers and scientists to manage data with Git-like operations, supporting reproducible pipelines and parallel experimentation. By integrating with cloud object storage and compute engines, it reduces storage costs, improves data quality, and enables rollback capabilities to maintain production integrity.
Funding: $20M+
Symmetry Systems
San Mateo, United States
Symmetry Systems offers a Data Security Posture Management (DSPM) platform that discovers, classifies, and monitors both structured and unstructured data across hybrid cloud environments. The platform enables organizations to detect data access anomalies and enforce compliance, thereby mitigating risks associated with data breaches and unauthorized access.
Funding: $20M+