Job description

Lucida is teaching the world to speak.
Two billion people are trying to learn a language.
Almost all of them are stuck, not because they lack motivation, but because the only thing that actually works (talking to a human tutor) is too expensive, too inconvenient, or too embarrassing.
We're building the alternative: a voice-first AI tutor you can actually have a conversation with, anytime, in your pocket.
Real-time. Sub-second. Feels-like-a-person. Already serving a million learners.
We're well-funded, seed-stage, and we're hiring the engineer who'll build the backbone behind that product.
The role

You'll own a meaningful surface of our backend: the systems that turn audio, models, prompts, and user state into a working tutor at scale.

Day-to-day, you'll:

- Design and operate the real-time conversational pipeline: streaming services and WebSocket interfaces that keep latency budgets honest at the scale of a million users
- Build and harden the LLM orchestration layer: prompt design as code, structured outputs, streaming, retries, fallbacks, cost control across multiple providers
- Treat prompts as engineering artifacts: versioned, evaluated, regression-tested. Vibes are not a methodology.
- Take open-source models (LLM, ASR, TTS, avatar) from a paper or HF repo and put them on our GPUs: benchmark, optimize, serve, monitor
- Fine-tune and train our own models on top of open-source bases: curate datasets, run training jobs, evaluate against production criteria, and ship the result
- Design event-driven media flows: webhooks, post-session processing, recording and export pipelines
- Own third-party integrations end-to-end: contracts, retries, observability, the boring-important stuff
- Make architecture decisions with the founders, not after them

What we're looking for

- 5+ years writing production Python you're not embarrassed by: typed, tested, readable
- Deep fluency in asyncio and concurrent/streaming code
- Strong command of HTTP, WebSockets, and event-driven systems
- Hands-on experience integrating with LLM APIs in production: streaming, tool use, structured outputs, and the operational realities (rate limits, retries, cost control)
- A real sense of prompt engineering as engineering: you've shipped prompts that survived contact with users, iterated on them with data, and didn't just "feel good in the playground"
- A real fine-tuning / training track record: you've taken an open-source model, prepared the data, run the training, evaluated it honestly, and shipped the result to users. Not a notebook tutorial. A model that moved a metric.
- Experience deploying and serving your own models on GPUs: quantization, batching, KV-cache, latency/throughput tradeoffs
- A debugging instinct for distributed systems at scale: traces, profiling, backpressure, capacity planning
- Comfort with Postgres, Redis, and a queue/broker layer
- Pragmatism: you ship, you measure, you iterate. You don't over-engineer, and you don't under-test.

Nice to have

- Real-time media systems (WebRTC, SFU, streaming pipelines)
- Audio or speech model deployment and fine-tuning in production
- Distillation, synthetic data generation, or RLHF/DPO-style alignment work
- Multi-region or multi-cloud infrastructure
- Cost optimization at scale: token economics, GPU utilization, caching strategies
- Open-source contributions

This job post has been translated by AI and may contain minor differences or errors.
