Company Background

Our client is a technology startup building advanced voice automation solutions for the quick-service restaurant industry. The company develops a privacy-conscious, high-performance drive-thru voice assistant that automates real-time customer interactions, improves order accuracy, and helps restaurant chains increase revenue and operational efficiency. Its product is designed for fast deployment in noisy, high-volume drive-thru environments and is already gaining market traction through cooperation with a major restaurant chain.

Project Description

The project is a next-generation voice automation engine for drive-thru order-taking in quick-service restaurants. It enables fully automated, real-time conversations between customers and the restaurant ordering system using modern speech recognition, natural language processing, and text-to-speech technologies.

The engineer will work on improving the core voice and audio capabilities of the platform, including Speech-to-Text, Text-to-Speech, noise cancellation, speech enhancement, and real-time audio pipelines. The main focus will be on reducing latency, improving recognition quality and speech clarity, and making the system robust in extremely noisy real-world environments such as drive-thrus.

Technologies

Speech-to-Text, Text-to-Speech
Audio Engineering / DSP
Noise Suppression, Voice Activity Detection, Signal Processing
PyTorch / TensorFlow
Real-Time Inference, Streaming Pipelines
GPU Optimization, Edge Inference
Production ML Systems

What You'll Do

Optimize low-latency, real-time Speech-to-Text pipelines for production drive-thru environments;
Improve Text-to-Speech naturalness, responsiveness, and overall conversational quality;
Design, tune, and improve noise suppression, echo cancellation, and speech enhancement systems;
Improve speech recognition accuracy and robustness under challenging acoustic conditions, including engine noise, weather, overlapping speech, poor microphone quality, and outdoor environments;
Build and scale audio processing infrastructure for production deployments;
Evaluate, benchmark, and compare speech models using real-world audio data and production scenarios;
Experiment with modern Speech AI technologies, models, and architectures to improve system performance;
Collaborate with LLM and conversational AI teams to improve end-to-end voice interaction quality.

Job Requirements

Advanced Python development skills;
Deep hands-on expertise with Speech-to-Text and Text-to-Speech systems;
Proven experience improving speech recognition quality in noisy or otherwise challenging acoustic environments;
Strong expertise in noise suppression, echo cancellation, voice activity detection, and speech enhancement;
Strong understanding of real-time and streaming audio architectures, including conversational voice pipelines and real-time inference;
Experience building low-latency, production-grade AI systems;
Experience with modern speech AI frameworks, models, and APIs;
Experience deploying and scaling AI services in cloud environments;
Ability to troubleshoot complex audio quality, latency, and reliability issues;
Product-oriented mindset with a focus on real-world performance, customer experience, and high ownership;
Ability to collaborate effectively with engineering, LLM, and conversational AI teams;
English level: B2 or higher.

What We Offer

The global benefits package includes:

Technical and non-technical training for professional and personal growth;
Internal conferences and meetups to learn from industry experts;
Support and mentorship from an experienced employee to help your professional growth and development;
Health insurance;
English courses;
Sports activities to promote a healthy lifestyle;
Flexible work options, including remote and hybrid opportunities;
Referral program for bringing in new talent;
Work anniversary program and additional vacation days.



Other open vacancies: AI/ML

AI Engineer (Life Sciences & Healthcare)
Countries (7): Bulgaria, Georgia, Lithuania, Mexico, Moldova, Poland, Ukraine

AI/ML Developer
Countries (5): Bulgaria, Georgia, Lithuania, Moldova, Poland