Company Background

Our client is a technology startup building advanced voice automation solutions for the quick-service restaurant industry. The company develops a privacy-conscious, high-performance drive-thru voice assistant that automates real-time customer interactions, improves order accuracy, and helps restaurant chains increase revenue and operational efficiency. Its product is designed for fast deployment in noisy, high-volume drive-thru environments and is already gaining market traction through cooperation with a major restaurant chain.

Project Description

The project is a next-generation voice automation engine for drive-thru order-taking in quick-service restaurants. It enables fully automated, real-time conversations between customers and the restaurant ordering system using modern speech recognition, natural language processing, and text-to-speech technologies.

The engineer will work on improving the core voice and audio capabilities of the platform, including Speech-to-Text, Text-to-Speech, noise cancellation, speech enhancement, and real-time audio pipelines. The main focus will be on reducing latency, improving recognition quality and speech clarity, and making the system robust in extremely noisy real-world environments such as drive-thrus.

Technologies

  • Speech-to-Text, Text-to-Speech
  • Audio Engineering / DSP
  • Noise Suppression, Voice Activity Detection, Signal Processing
  • PyTorch / TensorFlow
  • Real-Time Inference, Streaming Pipelines
  • GPU Optimization, Edge Inference
  • Production ML Systems

What You'll Do

  • Optimize low-latency, real-time Speech-to-Text pipelines for production drive-thru environments;
  • Improve Text-to-Speech naturalness, responsiveness, and overall conversational quality;
  • Design, tune, and improve noise suppression, echo cancellation, and speech enhancement systems;
  • Improve speech recognition accuracy and robustness under challenging acoustic conditions, including engine noise, weather, overlapping speech, poor microphone quality, and outdoor environments;
  • Build and scale audio processing infrastructure for production deployments;
  • Evaluate, benchmark, and compare speech models using real-world audio data and production scenarios;
  • Experiment with modern Speech AI technologies, models, and architectures to improve system performance;
  • Collaborate with LLM and conversational AI teams to improve end-to-end voice interaction quality.

Job Requirements

  • Advanced Python development skills;
  • Deep hands-on expertise with Speech-to-Text and Text-to-Speech systems;
  • Proven experience improving speech recognition quality in noisy or otherwise challenging acoustic environments;
  • Strong expertise in noise suppression, echo cancellation, voice activity detection, and speech enhancement;
  • Strong understanding of real-time and streaming audio architectures, including conversational voice pipelines and real-time inference;
  • Experience building low-latency, production-grade AI systems;
  • Experience with modern speech AI frameworks, models, and APIs;
  • Experience deploying and scaling AI services in cloud environments;
  • Ability to troubleshoot complex audio quality, latency, and reliability issues;
  • Product-oriented mindset with a focus on real-world performance, customer experience, and high ownership;
  • Ability to collaborate effectively with engineering, LLM, and conversational AI teams;
  • English level: B2 or higher.

What We Offer

The global benefits package includes:

  • Technical and non-technical training for professional and personal growth;
  • Internal conferences and meetups to learn from industry experts;
  • Support and mentorship from an experienced employee to help your professional growth and development;
  • Health insurance;
  • English courses;
  • Sports activities to promote a healthy lifestyle;
  • Flexible work options, including remote and hybrid opportunities;
  • Referral program for bringing in new talent;
  • Work anniversary program and additional vacation days.
