We’re hiring a Full Stack AI Engineer to build AI-native products end to end: applications, agents, RAG/GraphRAG, NL2SQL, evals, and observability. You’ll turn LLM capabilities into reliable, user-facing features that are measurable, debuggable, and safe in production.
What you’ll do
- Build and own full-stack AI features across frontend, backend, and data layers for web applications.
- Design agentic workflows (single- and multi-agent / A2A) that can plan, route, call tools, and coordinate to complete complex tasks.
- Implement and refine RAG pipelines, including retrieval strategies, chunking, embeddings, reranking, and hybrid search across multiple data sources.
- Design and operate GraphRAG-style retrieval on top of knowledge graphs to support multi-hop reasoning and relationship-heavy use cases.
- Build NL2SQL / NL2DB capabilities that convert natural language into safe, validated queries against SQL databases, warehouses, or analytics systems.
- Define and manage tool interfaces and MCP-style capability layers so agents can call internal APIs, SaaS tools, and data services with proper contracts and permissions.
- Create evaluation pipelines for prompts, agents, RAG, GraphRAG, NL2SQL, and tool use, including regression tests, LLM-as-judge scoring, and human review loops.
- Instrument AI systems with traces, logs, metrics, and structured events so you can debug failures, track versions, and understand behavior across the entire request path.
- Build dashboards and alerts to monitor quality, latency, cost, and safety signals for AI features in production.
- Collaborate with product, design, data, and platform teams to move from prototype to production while adding guardrails, fallbacks, and human-in-the-loop flows where needed.
- Continuously experiment with new models, prompting techniques, and architectures, then distill what works into reusable patterns and libraries for the team.
Required qualifications
- Hands-on experience shipping LLM-based features (agents, RAG, tool calling, or NL2SQL) into production.
- Strong full-stack engineering experience with modern web stacks (e.g., TypeScript/React/Next.js plus Python/Node.js).
- Solid backend fundamentals: REST/GraphQL APIs, relational databases, caching, and cloud deployment (AWS/GCP/Azure with Docker/CI).
- Experience designing, measuring, and improving AI behavior using evals, metrics, and real user feedback.
- Ability to work closely with product teams, own projects end to end, and make pragmatic tradeoffs between quality, speed, and cost.
Tech Stack

| Area | Example tools |
| --- | --- |
| Frontend | React, Next.js, TypeScript, Tailwind |
| Backend | Python, FastAPI, Node.js, PostgreSQL, Redis |
| AI orchestration | LangChain, LangGraph, Semantic Kernel, custom agent frameworks |
| Retrieval | Pinecone, Weaviate, FAISS, Elasticsearch, hybrid search |
| Graph / GraphRAG | Neo4j, graph stores, entity linking, knowledge graph pipelines |
| Evals | LangSmith, DeepEval, custom benchmark suites, human review workflows |
| Observability | Langfuse, Arize, W&B, OpenTelemetry, custom dashboards |
| Infra | AWS, Docker, Kubernetes, GitHub Actions |