Top 50 Data Labeling Service - Late Stage
Discover the top 50 Data Labeling Service startups at Late Stage. Browse funding data, key metrics, and company insights. Average funding: $129.2M.
Labelbox
San Francisco, United States
Labelbox operates a data training platform that utilizes AI-assisted labeling and a global network of experts to provide high-quality data curation and evaluation for machine learning applications. This platform addresses the challenge of efficiently managing large-scale data labeling and evaluation, enabling businesses to accelerate model development and improve AI performance.
Snorkel AI
Redwood City, United States
Snorkel Flow is an AI data development platform that enables data scientists to programmatically label and annotate large datasets, significantly reducing the time required for data preparation. By leveraging domain knowledge and automated techniques, the platform enhances the accuracy and efficiency of training data for specialized AI applications in fields like bioinformatics and natural language processing.
Funding: $100M+
Rough estimate of the amount of funding raised
DatologyAI
Redwood City, United States
DatologyAI develops automated data curation tools that utilize modality-agnostic algorithms to identify and eliminate redundant and noisy data points without requiring labels. This technology enables organizations to optimize their deep learning model training, significantly improving performance while reducing computational costs.
Datagen
Datagen Technologies develops simulated data technology that generates scalable, bias-free datasets with automatic annotation capabilities. This technology addresses the challenges of data scarcity and bias in machine learning, enabling more accurate and reliable model training.
Funding: $50M+
SuperAnnotate
San Mateo, United States
SuperAnnotate is an AI data platform that integrates dataset creation, curation, and model evaluation into a single workflow, enabling users to build and fine-tune high-quality models efficiently. The platform addresses the challenges of data annotation and model performance assessment by providing customizable tools and access to a global marketplace of trained annotation teams.
Encord
San Francisco, United States
Encord is an AI data development platform that enables computer vision and multimodal AI teams to manage, curate, and annotate diverse data types, including images, videos, and documents, all in one place. By transforming unstructured data into high-quality training datasets, Encord enhances AI model performance and accelerates labeling processes, resulting in significant improvements in accuracy and efficiency.
Clarifai
Wilmington, United States
Clarifai offers an end-to-end AI lifecycle platform that automates data labeling, model training, and deployment, enabling organizations to build and operationalize AI applications efficiently. By standardizing workflows and optimizing compute resources, the platform reduces development time and costs, allowing enterprises to scale AI solutions rapidly.
Funding: $50M+
Defined.ai
Lisbon, Portugal
Defined.ai provides a marketplace for ethically sourced training data, specializing in diverse datasets for speech recognition, natural language processing, and medical image analysis. The company addresses the need for high-quality, bias-free data that complies with ethical and legal standards, enabling organizations to develop AI solutions responsibly and effectively.
Funding: $50M+
LANDING AI
East New York, United States
LANDING AI provides a platform for building, deploying, and scaling computer vision models tailored to specific industry tasks, such as object detection and optical character recognition. By integrating with tools like Snowflake, it enables organizations to perform visual AI tasks directly on their data without moving it, reducing deployment time by 80% and supporting over 1 billion annual image inferences with 99.99% uptime.
Funding: $50M+
SafeGraph
Denver, United States
The startup offers a machine learning-based data platform that integrates and verifies data from thousands of sources, including business names, addresses, and operational hours. This platform provides companies with accurate records essential for analyzing human movement patterns and making informed decisions.
Funding: $50M+
Roboflow
Washington, United States
Roboflow provides a platform for developers to manage image data and streamline the process of training and deploying computer vision models. By offering tools for dataset annotation, preprocessing, and one-click model training, it simplifies the complexities of computer vision projects, enabling faster development and deployment.
BigID
East New York, United States
BigID provides a cloud-native platform that utilizes machine learning for data discovery, classification, and security across hybrid environments. The solution enables organizations to manage sensitive data, ensure regulatory compliance, and mitigate risks associated with data privacy and security breaches.
People Data Labs
City of New York, United States
The startup offers a technical recruiting and data analytics platform that standardizes and enhances the searchability of personnel profiles across various sectors, including sales, marketing, and talent acquisition. By providing data-driven insights, the platform enables organizations to efficiently identify and resolve identity discrepancies in recruiting and market research processes.
Funding: $50M+
Petuum
Pittsburgh, United States
The startup offers a machine learning infrastructure platform that provides a flexible operating system and virtualization interface for building and deploying machine learning and deep learning applications at scale. This technology enables enterprises to manage applications and hardware from a single terminal, resulting in increased productivity, reduced operational costs, and faster delivery times.
Dataiku
City of New York, United States
Dataiku is an enterprise AI and machine learning platform that enables organizations to prepare data, build models, and deploy AI applications at scale. It addresses the challenge of fragmented data workflows by providing a unified environment for collaboration, governance, and operational efficiency across various teams and industries.
Funding: $200M+
dotData
San Mateo, United States
dotData is an end-to-end data science automation platform that utilizes AI and machine learning to extract actionable insights from complex, multi-source data sets in minutes. It enables organizations to identify key performance drivers and enhance predictive model accuracy without requiring specialized coding skills.
Funding: $50M+
Bigeye
San Francisco, United States
Bigeye provides lineage-enabled data observability solutions that monitor data integrity across both modern and legacy data stacks. The platform enables teams to quickly identify and resolve data incidents, ensuring reliable data for analytics and decision-making.
Funding: $50M+
Comet
East New York, United States
Comet provides an end-to-end model evaluation platform that enables AI developers to track datasets, code changes, and experimentation history while monitoring model performance in production. This platform addresses the challenges of reproducibility and performance degradation in machine learning workflows by offering tools for experiment management, model versioning, and real-time performance monitoring.
Acceldata
Campbell, United States
Acceldata provides a unified data observability platform that enables businesses to monitor data pipelines, detect anomalies, and ensure data quality in real-time. This technology helps organizations prevent data failures and optimize costs, ultimately enhancing the reliability of their data infrastructure.
Funding: $100M+
Hyperscience
City of New York, United States
Hyperscience is an intelligent document-processing platform that utilizes machine learning to automate the extraction and validation of data from various document types, achieving over 96% accuracy and 99% automation. The platform addresses the inefficiencies in manual document processing, enabling enterprises to significantly reduce turnaround times and operational costs.
Coda
Singapore
Coda Wallet is a self-custody cryptocurrency wallet that enables users to securely manage their digital assets and private keys without relying on third-party services. This solution addresses the risks of centralized exchanges by providing users full control over their crypto holdings and personal data.
Funding: $500M+
Sentra
Tel Aviv, Israel
Sentra provides an AI-powered Data Security Posture Management (DSPM) platform that automatically discovers, classifies, and monitors sensitive data across cloud environments, including SaaS, IaaS, and on-premises systems. It mitigates risks by enforcing least privilege access, detecting policy violations, and preventing data breaches through continuous monitoring and contextual threat analysis.
Funding: $50M+
Edge Impulse
San Jose, United States
Edge Impulse provides a platform for developing embedded machine learning models that run on various edge devices, including microcontrollers and gateways. This technology enables manufacturers to optimize sensor data processing, reduce bill of materials costs, and accelerate time to market for their products.
Funding: $50M+
Tecton
San Francisco, United States
Tecton provides an enterprise-ready feature store that automates the creation and management of data pipelines for machine learning applications, enabling data scientists to focus on feature engineering without the complexities of infrastructure. By delivering real-time, accurate data at scale, Tecton accelerates model deployment by up to 80% and enhances model performance through rapid feature experimentation.
Weights & Biases
San Francisco, United States
Weights & Biases provides a developer-first MLOps platform that enables machine learning teams to track, visualize, and optimize their experiments and models through tools like hyperparameter sweeps and automated workflows. The platform addresses the challenges of managing ML pipelines and data, facilitating collaboration and improving model performance across AI applications.
Funding: $200M+
Volumez
Boston, United States
Volumez offers a Data Infrastructure as a Service (DIaaS) platform that dynamically orchestrates compute, network, and storage resources across cloud environments to create optimized data infrastructures for various workloads. This solution addresses the challenges of performance inconsistency and resource inefficiency in data-intensive applications by delivering guaranteed high throughput, low latency, and maximized GPU utilization.
Funding: $50M+
Nozomi
Singapore
The startup provides a straightforward tool for collecting and organizing data from API endpoints, enabling users to efficiently manage their data flow. This solution addresses the challenge of data fragmentation by simplifying the integration and accessibility of diverse API data sources.
Funding: $100M+
Upstage AI
Yongin, South Korea
Upstage develops AI tools that automate repetitive tasks and enhance productivity through advanced document processing and key information extraction. Their technology provides decision support across various industries by enabling efficient data retrieval and analysis, reducing manual workload and improving operational efficiency.
Funding: $100M+
Tier IV
Tokyo, Japan
TIER IV develops autonomous driving technology for intelligent vehicles, utilizing advanced sensor fusion and machine learning algorithms to enhance navigation and safety. This technology addresses the need for efficient transportation solutions, improving mobility and quality of life in urban environments.
Funding: $200M+
Anomalo
Palo Alto, United States
Anomalo provides automated AI-driven data quality monitoring for enterprise data warehouses, utilizing unsupervised machine learning to detect anomalies and validate data integrity without requiring code. This solution addresses the issue of unreliable data by enabling rapid identification and resolution of data quality problems, ensuring accurate and trustworthy insights for business operations.
Funding: $100M+
TetraScience
Boston, United States
TetraScience provides a cloud-based platform that replatforms and engineers scientific data, enabling biopharmaceutical companies to automate lab data management and analytics. This approach addresses the inefficiencies of siloed data, resulting in a 10x increase in scientist productivity and a 60% reduction in time to market for drug discovery.
Benchling
San Francisco, United States
Benchling provides a cloud-based platform that digitizes laboratory workflows and automates data management for biotechnology research, enabling scientists to efficiently plan, record, and analyze experiments. By reducing time spent on manual data capture and enhancing collaboration, Benchling accelerates the development of biopharmaceuticals and other biotech products.
Funding: $100M+
Seekr
Seekr provides an end-to-end AI platform that utilizes patented technologies to enhance the accuracy of large language models (LLMs) while minimizing bias and errors. The platform simplifies AI development, enabling businesses to quickly build, validate, and deploy trusted AI applications tailored to their specific industry needs.
Funding: $50M+
Baseten
San Francisco, United States
Baseten provides a platform for deploying and serving machine learning models with optimized inference speed and autoscaling capabilities, enabling seamless transition from development to production. The solution addresses the complexities of model infrastructure management, allowing teams to focus on building and iterating on their AI applications without incurring excessive costs.
Funding: $50M+
Dixa
Miuros by Dixa provides AI-powered analytics and quality assurance tools for customer service teams, enabling them to transform data into actionable insights. The platform enhances agent performance and customer interactions by identifying coaching opportunities and streamlining the analysis of customer communications.
Funding: $100M+
Culture Biosciences
San Francisco, United States
Culture Biosciences develops cloud-connected bioreactors that enable real-time experiment design, monitoring, and data analysis for bioprocess development. Their technology allows companies to optimize cell culture processes and accelerate product development without the constraints of traditional lab setups.
Laurel
San Francisco, United States
The startup offers a timekeeping software platform that utilizes artificial intelligence to automatically generate timesheets by capturing and analyzing data related to specific matters. This technology enables legal and accounting professionals to gain insights into time allocation, resulting in more accurate and structured time reporting.
Funding: $50M+
Cyberhaven
Palo Alto, United States
Cyberhaven provides a data lineage technology that traces the flow of sensitive information across systems, enabling organizations to understand data movement and prevent unauthorized exfiltration. By combining data loss prevention, insider risk management, and cloud data security, Cyberhaven effectively mitigates insider threats and protects critical data in real-time.
HaulHub
Haverhill, United States
HaulHub Technologies provides a platform that integrates transportation logistics and digital ticketing for heavy construction companies, utilizing data aggregation from equipment and material suppliers. This system enhances operational efficiency by streamlining communication between contractors and project owners, addressing the challenges of data management and coordination in construction projects.
Funding: $100M+
Rewind
Ottawa, Canada
The startup offers an automatic data backup platform that quickly backs up and restores business data without requiring technical expertise. By facilitating daily updates and ensuring data integrity, the platform protects critical information stored in various applications.
Funding: $50M+
Unsupervised
Boulder, United States
Unsupervised is an automated analytics platform that employs AI-powered data agents to analyze complex datasets and generate actionable insights, answers, and predictions. By continuously learning from connected data sources, it significantly reduces the time spent on manual data preparation, enabling organizations to uncover hidden value and improve decision-making efficiency.
Funding: $50M+
Reveal
Reveal is a cloud-based platform that enables organizations to capture and analyze critical business data efficiently. By streamlining data collection and reporting processes, it enhances decision-making and operational effectiveness for enterprises.
Funding: $200M+
Ataccama
Toronto, Canada
Ataccama is an AI-powered enterprise platform that integrates data quality, master data management, and metadata management to enhance data governance. The platform enables organizations to maintain accurate and consistent data across systems, improving decision-making and operational efficiency.
Funding: $100M+
Great Expectations
Salt Lake City, United States
Great Expectations offers GX Cloud, an end-to-end data quality platform that utilizes an Expectation-based approach to testing, enabling organizations to establish verifiable assertions about their data. This solution enhances data integrity and collaboration by providing a unified framework for monitoring data quality across various business functions, ensuring reliable input for critical decision-making.
Funding: $50M+
Hugging Face
Paris, France
The startup offers a machine-learning community platform that facilitates collaboration on models, datasets, and applications, enabling users to create and discover machine-learning projects. By providing paid computing resources and enterprise systems, the platform enhances the efficiency of open-source development, allowing users to contribute to and advance the field of machine learning.
Funding: $200M+
Flatfile
Denver, United States
Flatfile provides a data onboarding platform that utilizes a JavaScript snippet to import, map, and normalize customer data from spreadsheets into software applications. This technology reduces the time and cost associated with manual data cleanup, ensuring high-quality, validated data for seamless integration into business systems.
Gretel
San Diego, United States
Gretel is a multimodal synthetic data platform that utilizes generative AI and privacy-enhancing technologies to create artificial datasets that mirror the statistical properties of real data. This enables developers to train and validate AI models while maintaining data privacy and accelerating access to high-quality data.
Funding: $50M+
Nightfall AI
San Francisco, United States
Nightfall is a cloud data protection platform that utilizes an AI-native detection engine to accurately identify and remediate sensitive data across various applications and environments. It addresses the risk of data leaks and compliance violations by providing real-time visibility and automated remediation for personally identifiable information (PII), personal health information (PHI), and other critical data types.
Funding: $50M+
Cyera
East New York, United States
Cyera is an AI-driven data security platform that provides enterprises with real-time visibility into their data landscape, identifying sensitive data, access points, and associated risks. This enables organizations to mitigate data security risks, ensure compliance, and enhance their incident response capabilities.
DevRev
Palo Alto, United States
The startup offers an artificial intelligence-native platform that integrates data analytics and machine learning to enhance product development and customer service. Its developer-centric customer relationship management software enables teams to efficiently build and support products, significantly increasing productivity for clients.