Top 50 Speech Recognition Startups
Discover the top 50 Speech Recognition startups. Browse funding data, key metrics, and company insights. Average funding: $44.5M.
Fano Labs
Fano Labs develops automatic speech recognition (ASR) technology that accurately transcribes multilingual and mixed-language conversations, achieving over 90% accuracy in enterprise environments. Their solutions transform interaction data from customer service channels into actionable insights, enhancing compliance, operational efficiency, and customer satisfaction.
David AI
David AI generates and labels proprietary audio datasets, including over 10,000 hours of speaker-separated, natural conversations at 24+ kHz, to enhance the training of advanced speech recognition models. This unique dataset addresses the need for high-quality, non-public audio data, enabling AI developers to improve model accuracy and performance.
Verbit.ai
The startup develops interactive transcription and captioning software that employs automated speech recognition technology to convert live and recorded audio and video into searchable text files. This solution enhances accessibility and usability of multimedia content for sectors such as education, legal, media, and enterprise, enabling organizations to extract actionable insights from their audio-visual materials.
Funding: $500M+
Rough estimate of the amount of funding raised
Toma
Toma provides a voice-based AI platform that enables businesses to integrate natural language processing and speech recognition into their applications. This solution streamlines customer interactions, reduces response times, and improves accessibility by enabling seamless voice-driven communication.
Funding: $300K+
SoundHound
SoundHound AI provides voice AI solutions that enable brands to integrate natural language processing and speech recognition into their products. This technology enhances user interaction by allowing seamless voice commands, improving customer engagement and operational efficiency.
Funding: $200M+
SpeakX
SpeakX is an AI-powered application that enhances English speaking skills through real-time speech recognition and personalized feedback. The app targets non-native speakers who struggle with pronunciation and fluency, providing measurable improvements in conversational confidence and clarity.
Gladia
Gladia.io provides a speech-to-text API that enables real-time and asynchronous transcription of audio data with less than 300 milliseconds latency, ensuring high accuracy across over 100 languages. This technology enhances productivity for contact centers, sales teams, and media platforms by delivering actionable insights and seamless integration into existing workflows.
Funding: $20M+
Amberscript
Amberscript provides automated transcription and subtitling services that convert audio and video content into text using advanced speech recognition technology. The platform ensures 100% accuracy through a combination of machine-generated transcripts and human quality checks, enabling businesses to make their audio content accessible and searchable.
Funding: $10M+
Seasalt.ai
The startup develops a cloud-based artificial intelligence platform that utilizes speech recognition, synthesis, and natural language understanding to enhance communication between brands and customers. Its technology enables businesses to capture and analyze voice and text interactions, providing actionable insights and improving operational efficiency.
Funding: $5M+
Speak
Speak is a language-learning mobile application that utilizes speech recognition technology to facilitate real-time conversational practice in English. By providing personalized feedback and a tailored curriculum, it helps users improve their pronunciation and fluency in everyday scenarios.
AssemblyAI
AssemblyAI provides a platform of APIs for high-accuracy speech-to-text transcription, enabling real-time audio processing with features like speaker diarization and language detection. The technology allows businesses to convert audio data into actionable insights, improving accessibility and enhancing data analysis capabilities.
Lucida AI
Lucida AI offers Lucy, an AI-powered speaking coach that utilizes advanced speech recognition and a closed Large Language Model to provide real-time, personalized feedback on grammar, vocabulary, fluency, and pronunciation. This platform addresses the challenge of non-fluent English communication in professional settings, enabling employees to enhance their speaking skills and improve overall business performance.
Funding: $500K+
voize
voize provides a voice-enabled documentation app for healthcare professionals, using custom AI and speech recognition to capture care reports, vital signs, and movement logs directly during patient interactions. This eliminates manual paperwork, integrates seamlessly with existing documentation systems, and saves each caregiver 20-30 minutes per shift, improving efficiency and data accuracy.
Deepgram
Deepgram provides a voice AI platform that offers APIs for speech-to-text, text-to-speech, and natural language understanding, enabling developers to integrate advanced voice capabilities into their applications. The technology addresses the need for accurate and efficient transcription and voice interaction, delivering real-time processing and support for over 30 languages at a significantly lower cost and faster speed than traditional solutions.
Funding: $100M+
Intelsense.ai
Intelsense AI provides AI-powered voice and language processing solutions, including automatic speech recognition and natural language understanding, to help enterprises accurately transcribe and analyze customer interactions in multiple languages. This technology enhances customer service by enabling businesses to understand and respond to client needs more effectively, while ensuring data privacy and confidentiality.
Funding: $100K+
Maqsam
Maqsam provides a cloud-based communication suite powered by proprietary Arabic AI, enabling accurate automatic speech recognition, call summaries, and sentiment analysis for enhanced customer interactions. The platform addresses inefficiencies in customer support and sales processes for SMBs and enterprises by automating tasks and offering precise analytics for informed decision-making.
Useful Sensors
Useful Sensors Inc. manufactures low-cost AI hardware modules, including real-time translation devices and high-performance speech-to-text software optimized for edge devices. These products enhance communication by providing accurate, instantaneous language translation and efficient speech recognition, addressing the challenges of effective interaction in diverse environments.
Funding: $5M+
Vatis Tech
Vatis Tech offers a speech recognition API that utilizes AI-driven speech-to-text technology to transcribe and translate audio in over 40 languages with up to 95% accuracy. This platform enables businesses to efficiently convert spoken content into text, facilitating faster data processing and enhancing accessibility across various industries.
Funding: $500K+
Coqui
Coqui provides open-source speech technology tools that enable developers to build, customize, and deploy voice applications across various platforms. This addresses the need for accessible and adaptable speech recognition and synthesis solutions, supporting diverse languages and use cases without proprietary restrictions.
Funding: $3M+
Speaksee Venue Accessibility
Speaksee develops a Microphone Kit that utilizes AI technology to convert speech to text in real-time, enabling deaf and hard-of-hearing individuals to participate in group conversations across various settings. This solution addresses the lack of accessible communication in environments such as meetings and classrooms, ensuring inclusivity and improved understanding.
Funding: $2M+
BabbleLabs
BabbleLabs enhances speech quality in human-machine communication through advanced audio processing algorithms that filter background noise and improve clarity. This technology addresses issues of poor audio fidelity in voice recognition systems, enabling more accurate and efficient interactions.
Funding: $10M+
The MAMA AI
MAMA AI develops solutions in conversational AI, speech recognition, and machine translation to facilitate natural interactions between humans and machines. Their technology addresses the challenge of effective communication in various languages and contexts, enhancing user experience across customer service and enterprise applications.
Yobe
Yobe develops AI-powered technology for voice data extraction and analytics, enabling precise interpretation of speech nuances such as intent, emotion, and identity in various acoustic environments. This technology addresses the limitations of traditional voice recognition systems, enhancing user experience and safety in applications like in-car voice controls.
Funding: $5M+
Fluent.ai
The startup develops a personalized voice user interface technology that enables offline, noise-robust speech recognition without converting speech to text, allowing it to understand audible speech in any language or accent. This technology provides businesses with secure and convenient voice interface solutions that perform reliably in challenging environments.
Funding: $5M+
Chimege systems
The startup develops an artificial intelligence platform for automatic speech recognition that converts speech and audio files into text. This technology enables efficient search and dictation capabilities for audio and video content, enhancing accessibility and production workflows in the media content market.
Funding: $3M+
Nuvo (Previously AI Communis)
The startup develops an automatic speech recognition platform that employs voice recognition and natural language processing powered by artificial intelligence. This technology enables enterprises to efficiently translate, subtitle, and edit audio content in a cloud environment, supporting sixteen Asian languages for enhanced global communication and productivity.
Funding: $2M+
Navana.ai
Navana.ai develops high-performance speech recognition APIs tailored for Indian languages, enabling voice-based interactions in e-commerce and financial services. Their technology addresses the challenge of language accessibility by providing accurate, customizable voice solutions that support over 11 regional languages and dialects.
Xelex AI
Xelex provides text and audio data-enrichment services that enhance the accuracy of automatic speech recognition (ASR) and natural language processing (NLP) models for machine learning applications. By delivering meticulously curated training data and rapid transcript correction, Xelex addresses the need for reliable and precise data in contact center solutions and healthcare AI development.
ActionPower
ActionPower provides AI-driven speech recognition and natural language processing services that enhance human-computer interaction. The technology enables users to efficiently transcribe and analyze spoken language, improving accessibility and communication in various applications.
Funding: $10M+
Nuvo (Previously AI Communis)
The startup specializes in Automatic Speech Recognition and Natural Language Processing technologies to convert spoken language into text and analyze its meaning. This enables businesses to enhance customer interactions and streamline data processing by accurately transcribing and understanding voice communications.
Funding: $1M+
Voicegain
Voicegain offers a Generative Voice AI platform that utilizes deep learning-based Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) APIs to transcribe and analyze voice interactions in contact centers. This technology enhances operational efficiency by providing accurate transcriptions, sentiment analysis, and actionable insights, enabling businesses to improve customer experience and streamline quality assurance processes.
MediNav
MediNav is medical dictation software that uses advanced speech recognition and natural language processing algorithms to learn and extract relevant medical information, significantly reducing documentation time for healthcare professionals. By automating patient documentation, MediNav allows doctors to spend more time with patients and improves operational efficiency in hospitals and clinics.
Funding: $500K+
Kanari AI
This startup develops speech recognition, text-to-speech, and natural language processing technologies specifically for Arabic and other languages, enhancing accessibility and inclusivity in communication. Their solutions provide high accuracy and customization for various industries, addressing the need for effective language processing in diverse applications such as media, healthcare, and education.
Funding: $300K+
Cadence
The startup provides real-time translation services for business meetings and live events using advanced speech recognition and natural language processing technologies. This offering enables effective communication across language barriers, enhancing collaboration and participation in multilingual environments.
Noota
Noota is a platform that records meetings and utilizes speech recognition technology to transcribe conversations and generate structured meeting summaries. By automating note-taking and integrating with CRM and ATS systems, it enhances productivity and ensures that critical information is captured and easily accessible.
Funding: $100K+
i2x
i2x utilizes real-time speech recognition and AI-driven analytics to transcribe and analyze phone conversations, providing immediate feedback and coaching to agents during calls. This technology enhances agent performance and onboarding efficiency, leading to measurable increases in conversion rates and customer satisfaction.
Inscripta
The startup develops an AI-powered speech recognition API that enhances medical transcription by integrating personnel and workflow management systems. This technology improves accuracy, language adaptability, and security, enabling healthcare enterprises to efficiently capture and analyze audio data from patient interactions.
Funding: $2M+
TalkTastic
TalkTastic utilizes multimodal AI to provide contextual speech recognition and personalized text rewrites across all macOS applications, enhancing dictation accuracy by analyzing the content displayed on the user's screen. This technology addresses the challenge of ineffective dictation tools that fail to understand context, enabling users to communicate more clearly and efficiently without the need for manual editing.
Augustus Intelligence
The startup develops secure artificial intelligence software for face and speech recognition, providing clients with reliable and scalable data-driven results. This technology enables organizations to implement AI solutions while ensuring data integrity and trustworthiness in their applications.
Funding: $20M+
AudioTelligence
AudioTelligence provides real-time audio processing technology that enhances automatic speech recognition (ASR) systems by improving accuracy in noisy environments. This technology addresses the challenge of misinterpretation in speech recognition, enabling clearer and more reliable communication in various applications.
Funding: $1M+
Tepy.ai
Tepy.ai provides a silent-speech recognition platform that converts lip movements into audible speech or text. This enables hands-free, non-invasive communication for individuals with voice loss or in sound-sensitive environments, allowing them to control applications and interact with digital devices.
sylby GmbH
Sylby combines AI speech recognition with linguistics to provide real-time feedback on pronunciation, specifically targeting challenging sounds for non-native speakers. The app offers personalized lessons and exercises, enabling language learners to improve their communication skills at their own pace, ultimately enhancing their confidence in speaking.
Babel
Babel Technology develops speech-language processing solutions powered by artificial intelligence to enhance communication accessibility for individuals with speech impairments. Their technology enables real-time speech recognition and language translation, facilitating improved interaction in various settings.
Say It Labs
The startup develops speech recognition-based video games that utilize artificial intelligence and speech science to assist individuals with speech disorders. By integrating therapeutic approaches into engaging gameplay, users can practice their speech and work towards specific therapy goals in a supportive environment.
Funding: $500K+
TalkMe
TalkMe is an AI-driven language practice platform that utilizes speech recognition and natural language processing to enhance pronunciation and grammar comprehension for learners. By providing personalized feedback and interactive speaking exercises, it helps users build confidence and deepen their understanding of language and culture.
Gowajee.ai
Gowajee provides a voice AI platform that utilizes proprietary Automatic Speech Recognition (ASR) and Generative Voice technologies to deliver real-time transcription and realistic voice generation for contact centers. Their solutions enhance compliance and customer verification processes while achieving industry-leading accuracy in Thai language applications.
Kardome
Kardome develops voice interaction technology that enhances speech recognition accuracy in noisy environments by utilizing deep learning and spatial hearing algorithms. This technology enables real-time, secure voice control for smart devices, addressing the challenges users face with traditional voice interfaces in complex soundscapes.
SalesNote
SalesNote is an AI-driven platform that utilizes advanced speech recognition technology to transcribe and analyze conversations in real-time, providing actionable insights and detailed reports for sales teams. By automating data capture and integrating with existing CRM systems, it enhances communication efficiency and ensures that critical information is never overlooked.
JUST: Access
The startup offers an AI-driven transcription tool that utilizes advanced speech recognition technology to generate highly accurate transcripts from video conference audio. This platform provides essential legal transcription services, including live event transcription and data insights, enabling businesses to efficiently convert spoken content into written form.
Funding: $300K+
superwhisper
superwhisper provides an offline, AI-powered voice-to-text transcription solution for macOS and iOS, using local models to ensure data privacy and security. It enables users to transcribe speech in over 100 languages, integrate with any application via the clipboard, and customize vocabulary, streamlining tasks like note-taking, email responses, and coding without the need for typing.