Top 50 Data Annotation Platform - Late Stage
Discover the top 50 Data Annotation Platform startups at Late Stage. Browse funding data, key metrics, and company insights. Average funding: $117.7M.
SuperAnnotate
Location: San Mateo, Philippines
SuperAnnotate is an AI data platform that integrates dataset creation, curation, and model evaluation into a single workflow, enabling users to build and fine-tune high-quality models efficiently. The platform addresses the challenges of data annotation and model performance assessment by providing customizable tools and access to a global marketplace of trained annotation teams.
Encord
Location: San Francisco, United States
Encord is an AI data development platform that enables computer vision and multimodal AI teams to manage, curate, and annotate diverse data types, including images, videos, and documents, all in one place. By transforming unstructured data into high-quality training datasets, Encord enhances AI model performance and accelerates labeling processes, resulting in significant improvements in accuracy and efficiency.
Datagen
Datagen Technologies develops simulated data technology that generates scalable, bias-free datasets with automatic annotation capabilities. This technology addresses the challenges of data scarcity and bias in machine learning, enabling more accurate and reliable model training.
Funding: $50M+
Rough estimate of the amount of funding raised
Labelbox
Location: San Francisco, United States
Labelbox operates a data training platform that utilizes AI-assisted labeling and a global network of experts to provide high-quality data curation and evaluation for machine learning applications. This platform addresses the challenge of efficiently managing large-scale data labeling and evaluation, enabling businesses to accelerate model development and improve AI performance.
Snorkel AI
Location: Redwood City, United States
Snorkel Flow is an AI data development platform that enables data scientists to programmatically label and annotate large datasets, significantly reducing the time required for data preparation. By leveraging domain knowledge and automated techniques, the platform enhances the accuracy and efficiency of training data for specialized AI applications in fields like bioinformatics and natural language processing.
Funding: $100M+
Roboflow
Location: Washington, United States
Roboflow provides a platform for developers to manage image data and streamline the process of training and deploying computer vision models. By offering tools for dataset annotation, preprocessing, and one-click model training, it simplifies the complexities of computer vision projects, enabling faster development and deployment.
DatologyAI
Location: Redwood City, United States
DatologyAI develops automated data curation tools that utilize modality-agnostic algorithms to identify and eliminate redundant and noisy data points without requiring labels. This technology enables organizations to optimize their deep learning model training, significantly improving performance while reducing computational costs.
SafeGraph
Location: Denver, United States
The startup offers a machine learning-based data platform that integrates and verifies data from thousands of sources, including business names, addresses, and operational hours. This platform provides companies with accurate records essential for analyzing human movement patterns and making informed decisions.
Funding: $50M+
LANDING AI
Location: East New York, United States
LANDING AI provides a platform for building, deploying, and scaling computer vision models tailored to specific industry tasks, such as object detection and optical character recognition. By integrating with tools like Snowflake, it enables organizations to perform visual AI tasks directly on their data without moving it, reducing deployment time by 80% and supporting over 1 billion annual image inferences with 99.99% uptime.
Funding: $50M+
Clarifai
Location: Wilmington, United States
Clarifai offers an end-to-end AI lifecycle platform that automates data labeling, model training, and deployment, enabling organizations to build and operationalize AI applications efficiently. By standardizing workflows and optimizing compute resources, the platform reduces development time and costs, allowing enterprises to scale AI solutions rapidly.
Funding: $50M+
Dataiku
Location: City of New York, United States
Dataiku is an enterprise AI and machine learning platform that enables organizations to prepare data, build models, and deploy AI applications at scale. It addresses the challenge of fragmented data workflows by providing a unified environment for collaboration, governance, and operational efficiency across various teams and industries.
Funding: $200M+
Defined.ai
Location: Lisbon, Portugal
Defined.ai provides a marketplace for ethically sourced training data, specializing in diverse datasets for speech recognition, natural language processing, and medical image analysis. The company addresses the need for high-quality, bias-free data that complies with ethical and legal standards, enabling organizations to develop AI solutions responsibly and effectively.
Funding: $50M+
Edge Impulse
Location: San Jose, United States
Edge Impulse provides a platform for developing embedded machine learning models that run on various edge devices, including microcontrollers and gateways. This technology enables manufacturers to optimize sensor data processing, reduce bill of materials costs, and accelerate time to market for their products.
Funding: $50M+
Nozomi
Location: Singapore
The startup provides a straightforward tool for collecting and organizing data from API endpoints, enabling users to efficiently manage their data flow. This solution addresses the challenge of data fragmentation by simplifying the integration and accessibility of diverse API data sources.
Funding: $100M+
dotData
Location: San Mateo, Philippines
DotData is an end-to-end data science automation platform that utilizes AI and machine learning to extract actionable insights from complex, multi-source data sets in minutes. It enables organizations to identify key performance drivers and enhance predictive model accuracy without requiring specialized coding skills.
Funding: $50M+
Anomalo
Location: Palo Alto, United States
Anomalo provides automated AI-driven data quality monitoring for enterprise data warehouses, utilizing unsupervised machine learning to detect anomalies and validate data integrity without requiring code. This solution addresses the issue of unreliable data by enabling rapid identification and resolution of data quality problems, ensuring accurate and trustworthy insights for business operations.
Funding: $100M+
Flatfile
Location: Denver, United States
Flatfile provides a data onboarding platform that utilizes a JavaScript snippet to import, map, and normalize customer data from spreadsheets into software applications. This technology reduces the time and cost associated with manual data cleanup, ensuring high-quality, validated data for seamless integration into business systems.
BigID
Location: East New York, United States
BigID provides a cloud-native platform that utilizes machine learning for data discovery, classification, and security across hybrid environments. The solution enables organizations to manage sensitive data, ensure regulatory compliance, and mitigate risks associated with data privacy and security breaches.
Ataccama
Location: Toronto, Canada
Ataccama is an AI-powered enterprise platform that integrates data quality, master data management, and metadata management to enhance data governance. The platform enables organizations to maintain accurate and consistent data across systems, improving decision-making and operational efficiency.
Funding: $100M+
Tecton
Location: San Francisco, United States
Tecton provides an enterprise-ready feature store that automates the creation and management of data pipelines for machine learning applications, enabling data scientists to focus on feature engineering without the complexities of infrastructure. By delivering real-time, accurate data at scale, Tecton accelerates model deployment by up to 80% and enhances model performance through rapid feature experimentation.
Acceldata
Location: Campbell, United States
Acceldata provides a unified data observability platform that enables businesses to monitor data pipelines, detect anomalies, and ensure data quality in real-time. This technology helps organizations prevent data failures and optimize costs, ultimately enhancing the reliability of their data infrastructure.
Funding: $100M+
Aumni
Location: Salt Lake City, United States
Aumni is an investment analytics platform that centralizes venture portfolio management by extracting and analyzing data from legal deal documents, enabling precise monitoring of capitalization, valuations, and performance metrics. The platform addresses inefficiencies in portfolio reporting and data collection, allowing venture capital firms to generate actionable insights and streamline workflows.
Funding: $50M+
Unsupervised
Location: Boulder, United States
Unsupervised is an automated analytics platform that employs AI-powered data agents to analyze complex datasets and generate actionable insights, answers, and predictions. By continuously learning from connected data sources, it significantly reduces the time spent on manual data preparation, enabling organizations to uncover hidden value and improve decision-making efficiency.
Funding: $50M+
Dune
Location: Oslo, Norway
Dune Analytics is an Ethereum-centric analytics platform that provides real-time access to on-chain crypto data across more than 93 blockchains, enabling users to analyze and visualize blockchain activity. The platform addresses the challenge of data accessibility in the crypto space, allowing developers and analysts to derive actionable insights from complex on-chain information.
Funding: $50M+
People Data Labs
Location: City of New York, United States
The startup offers a technical recruiting and data analytics platform that standardizes and enhances the searchability of personnel profiles across various sectors, including sales, marketing, and talent acquisition. By providing data-driven insights, the platform enables organizations to efficiently identify and resolve identity discrepancies in recruiting and market research processes.
Funding: $50M+
Ascend.io
Location: Menlo Park, United States
The startup offers an autonomous big data analytics platform that utilizes declarative configurations and automation to manage cloud infrastructure and optimize data pipelines. This technology reduces maintenance efforts throughout the data lifecycle, enabling business managers to initiate projects and make informed decisions with ease.
percipient.ai
Location: Santa Clara, Cuba
Percipient.ai offers the Mirage® Intelligence Analysis Platform, which utilizes artificial intelligence to analyze unstructured multimedia and intelligence data in real time. This technology enhances decision-making capabilities for organizations by providing accurate insights into patterns and relationships, enabling them to protect critical resources and maintain operational effectiveness.
Funding: $100M+
Coda
Location: Singapore
Coda Wallet is a self-custody cryptocurrency wallet that enables users to securely manage their digital assets and private keys without relying on third-party services. This solution addresses the risks of centralized exchanges by providing users full control over their crypto holdings and personal data.
Funding: $500M+
Reveal
Reveal is a cloud-based platform that enables organizations to capture and analyze critical business data efficiently. By streamlining data collection and reporting processes, it enhances decision-making and operational effectiveness for enterprises.
Funding: $200M+
InfoSum
Location: Basingstoke, United Kingdom
The startup offers a decentralized data platform that enables secure connections between multiple parties, utilizing patented non-movement data technology to facilitate data collaboration without exposing sensitive information. This approach enhances customer experiences while ensuring privacy, allowing companies to leverage their customer data effectively and safely.
Funding: $50M+
Rewind
Location: Ottawa, Canada
The startup offers an automatic data backup platform that quickly backs up and restores business data without requiring technical expertise. By facilitating daily updates and ensuring data integrity, the platform protects critical information stored in various applications.
Funding: $50M+
Prophecy
Location: San Francisco, United States
Prophecy offers a Data Transformation Copilot that enables users to build, deploy, and monitor data pipelines using an AI-powered visual interface that generates native Spark or SQL code. This platform addresses the challenge of inefficient data processing by allowing business users to self-serve, significantly reducing reliance on data engineers and accelerating analytics workflows.
Funding: $100M+
Cyera
Location: East New York, United States
Cyera is an AI-driven data security platform that provides enterprises with real-time visibility into their data landscape, identifying sensitive data, access points, and associated risks. This enables organizations to mitigate data security risks, ensure compliance, and enhance their incident response capabilities.
Coalesce Automation
Location: San Francisco, United States
Coalesce provides a data transformation platform that enables users to visually develop and automate data pipelines, significantly reducing the time required for complex data operations. This solution addresses the inefficiencies of manual coding in data management, allowing teams to deliver high-quality analytics-ready data at scale.
Funding: $50M+
Versana
Location: City of New York, United States
The startup offers a loan data platform that captures agent banks' reference data directly from its source, providing real-time insights into loan-level details and portfolio positions. This technology enhances transparency and efficiency in the syndicated loan market by eliminating discrepancies through direct digital access.
Funding: $50M+
Great Expectations
Location: Salt Lake City, United States
Great Expectations offers GX Cloud, an end-to-end data quality platform that utilizes an Expectation-based approach to testing, enabling organizations to establish verifiable assertions about their data. This solution enhances data integrity and collaboration by providing a unified framework for monitoring data quality across various business functions, ensuring reliable input for critical decision-making.
Funding: $50M+
Hyperscience
Location: City of New York, United States
Hyperscience is an intelligent document-processing platform that utilizes machine learning to automate the extraction and validation of data from various document types, achieving over 96% accuracy and 99% automation. The platform addresses the inefficiencies in manual document processing, enabling enterprises to significantly reduce turnaround times and operational costs.
DevRev
Location: Palo Alto, United States
The startup offers an artificial intelligence-native platform that integrates data analytics and machine learning to enhance product development and customer service. Its developer-centric customer relationship management software enables teams to efficiently build and support products, significantly increasing productivity for clients.
Petuum
Location: Pittsburgh, United States
The startup offers a machine learning infrastructure platform that provides a flexible operating system and virtualization interface for building and deploying machine learning and deep learning applications at scale. This technology enables enterprises to manage applications and hardware from a single terminal, resulting in increased productivity, reduced operational costs, and faster delivery times.
Cognite
Location: Sake, Norway
Cognite develops an industrial IoT data platform, Cognite Data Fusion®, that creates AI-populated digital twins and knowledge graphs to provide real-time insights for heavy-asset industries. This technology enables organizations to enhance productivity, optimize maintenance, and achieve a 400% ROI by simplifying access to complex industrial data.
Funding: $200M+
Gretel
Location: San Diego, United States
Gretel is a multimodal synthetic data platform that utilizes generative AI and privacy-enhancing technologies to create artificial datasets that mirror the statistical properties of real data. This enables developers to train and validate AI models while maintaining data privacy and accelerating access to high-quality data.
Funding: $50M+
Zoomin
Location: East New York, United States
Zoomin provides a data governance and LLM readiness platform that integrates and enriches unstructured enterprise data from various knowledge repositories, enabling organizations to enhance their AI applications. By streamlining data ingestion and applying advanced retrieval strategies, Zoomin improves the relevance and performance of AI-driven insights across multiple customer touchpoints.
Funding: $50M+
Cyberhaven
Location: Palo Alto, United States
Cyberhaven provides a data lineage technology that traces the flow of sensitive information across systems, enabling organizations to understand data movement and prevent unauthorized exfiltration. By combining data loss prevention, insider risk management, and cloud data security, Cyberhaven effectively mitigates insider threats and protects critical data in real-time.
Stockmark Inc.
Location: Tokyo, Japan
The startup offers an enterprise platform that leverages artificial intelligence to analyze market trends and internal operations, enabling businesses to make data-driven decisions. By providing actionable insights, the platform enhances operational efficiency and supports scalable growth in dynamic environments.
Funding: $50M+
Bigeye
Location: San Francisco, United States
Bigeye provides lineage-enabled data observability solutions that monitor data integrity across both modern and legacy data stacks. The platform enables teams to quickly identify and resolve data incidents, ensuring reliable data for analytics and decision-making.
Funding: $50M+
Benchling
Location: San Francisco, United States
Benchling provides a cloud-based platform that digitizes laboratory workflows and automates data management for biotechnology research, enabling scientists to efficiently plan, record, and analyze experiments. By reducing time spent on manual data capture and enhancing collaboration, Benchling accelerates the development of biopharmaceuticals and other biotech products.
Funding: $100M+
DroneDeploy
Location: San Francisco, United States
DroneDeploy is a cloud-based platform that integrates data from drones, robots, and 360 cameras to provide real-time mapping and analytics for asset management. It reduces the need for manual inspections, enhancing safety and operational efficiency across various industries, including construction and energy.
Funding: $50M+
Inveniam
Location: City of New York, United States
The startup offers a data operating platform that enhances the liquidity of private market assets, specifically in private equity and commercial real estate, by ensuring data integrity and provenance. This enables asset owners, valuation firms, and investors to efficiently buy and sell assets while maintaining privacy and facilitating accurate price discovery.
Funding: $100M+
Atlan
Location: Singapore
Atlan is an active metadata platform that consolidates and enriches metadata from various data sources, enabling data teams to visualize column-level lineage and implement role-based access controls. This platform addresses the challenge of data discovery and governance by providing a centralized control plane for trusted, AI-ready data, enhancing compliance and user adoption across organizations.
Funding: $100M+
Darrow AI
Location: City of New York, United States
The startup operates an AI-powered justice intelligence platform that analyzes publicly available data to identify significant legal violations. By connecting plaintiff lawyers with high-value cases, the platform reduces the time spent on business development, allowing law firms to concentrate on litigation.
Funding: $50M+