Principal AI Engineer

160 - 180 PLN/hour, B2B (net)

Senior · Full-time · B2B

#367727 · Added 18 days ago

Source: nofluffjobs.com

Tech Stack / Keywords

Python, AI, FastAPI, Celery, Redis, MongoDB, Storage, PostgreSQL, Audit, SRE, REST API, API, Design Patterns, Testing, pytest, ML, Airflow

Company and role

We are partnering with a US-based health-tech company on the takeover of a production AI-powered mobile coaching platform. The platform is built around a Python AI core (atlas-ai), which comprises a FastAPI chat surface with intent routing and tool-using agents, a LangGraph-based agent framework with multi-agent orchestration, a Celery + Redis task queue for asynchronous agent flows, MongoDB for fitness-plan storage, Redis for conversation state, and Postgres for LangGraph checkpoints. The product includes several agent personas (onboarding, chat, plan creation, in-workout smart adjust, plan smart adjust, and habit formation) and integrates directly with OpenAI for the LLM layer.

Requirements

7+ years of production Python at senior level or above
Deep LangGraph experience including state graphs, checkpoints, interrupts, multi-agent supervision, subgraphs
Strong LangChain ecosystem knowledge (chains, tools, memory, output parsers, callbacks)
Production FastAPI experience including streaming responses, dependency injection, middleware, async patterns
Celery + Redis broker in production with task ordering, retries, idempotency, priority queues, dead-letter handling
Concurrency in Python: asyncio (gather, structured concurrency, cancellation), threading boundaries, mixing sync and async code safely
Multi-datastore operations across MongoDB, Redis, and Postgres in a single service, including transaction boundaries
OpenAI API at scale including rate limits, retries with exponential backoff, fallback model routing, streaming, tool/function calling
Agent design patterns: ReAct, plan-and-execute, supervisor patterns, tool-use loops, multi-turn state, interrupt resumption
Prompt engineering with evaluation, A/B testing, version control of prompts, regression detection
Token cost optimisation: prompt caching, model tiering, context window trimming, summary memory
Production LLM observability: per-route token spend, prompt-level tracing, drift monitoring
Testing discipline: pytest (including pytest-asyncio), property-based testing, snapshot tests for prompts, eval-based tests for agents
Pydantic v2 fluency, type-hinted code throughout

Nice to have:

RAG production experience (vector stores: Pinecone, Qdrant, pgvector)
Production incident command for LLM-powered systems
ML engineering background (model serving, feature engineering)
Anthropic / Claude API experience in addition to OpenAI
Data pipeline experience (Airflow, Dagster, Prefect)
Domain knowledge in fitness / health / wearables

Responsibilities

First 90 days:

Audit atlas-ai: agent flows, LangGraph state machines, Celery topology, datastore usage, OpenAI integration patterns
Produce a written assessment of operational risk including failure modes, race conditions, retry semantics, idempotency, checkpoint integrity
Quantify token cost per agent flow and per user session
Identify highest-risk subsystems and propose stabilization plans
Build or harden an evaluation harness for agent flows including golden cases, regression suites, hallucination/safety tests
Lead knowledge-transfer sessions with the client's AI team

Ongoing responsibilities:

Set the technical direction for the AI core
Lead design for new agent flows and major changes to existing ones
Own the production health of the AI surface with platform/SRE support
Hire and mentor the AI squad (~10 engineers at full scale)
Represent the AI core in cross-team architecture conversations with the client

Benefits

100% remote work
B2B engagement
Rate up to PLN 180 per hour
Start in July

apreel
