Research Engineer (Agentic Behavior)
No salary information
Mid · Full-time
#358366 · Added yesterday
Source: nofluffjobs.com
Tech Stack / Keywords
Python, SQL, Kotlin
Company and position
JetBrains is a company focused on creating developer tools. The Kotlin AI Value Stream team works on AI agents that understand, generate, and improve Kotlin code across multiple platforms including Android, Kotlin Multiplatform, server-side, web, and desktop. The team builds evaluation infrastructure, error analysis tools, and post-training pipelines to measure and improve agent behavior on real Kotlin developer tasks.
Requirements
- Hands-on experience building evaluation or analysis pipelines for LLMs or AI coding agents in research or production.
- Strong Python engineering skills (at least three years), with the ability to write clean, maintainable code in data-heavy and ML-adjacent codebases.
- Experience with data analysis at scale: querying large datasets (SQL/Athena), building data pipelines, and performing statistical analysis.
- Ability to own projects end to end: identifying the problem, designing evals, running experiments, and shipping fixes.
- Product-aware mindset, with an understanding of the real failure modes users hit when working with agents.
- Familiarity with Kotlin or strong willingness to develop Kotlin expertise.
Nice to have:
- Experience with post-training LLMs: SFT, RLHF, DPO, GRPO.
- Experience with modern deep learning frameworks (PyTorch) and LLM training stacks (TRL, verl, Megatron).
- AI agent development experience including tool-using agents and multi-step coding workflows.
- Experience with evaluation frameworks and tools like Inspect AI, Promptfoo, LM-evaluation-harness.
- Experience with experiment tracking and observability tools such as Weights & Biases, MLflow, Langfuse.
- Knowledge of the Kotlin ecosystem: Android, Gradle, KMP, Spring, Ktor.
- Contribution to or maintenance of open-source projects, especially benchmarks or evaluation tools.
Responsibilities
- Build tools for agentic error analysis.
- Design and implement tooling to systematically capture, classify, and analyse errors that AI coding agents make when generating Kotlin code.
- Build observability pipelines over agentic traces from JetBrains IDEs and other coding agents.
- Design, implement, and maintain evaluation pipelines measuring Kotlin code generation quality across correctness, idiomaticity, build success, framework usage, and test coverage.
- Build simulation environments for coding agents on realistic Kotlin developer tasks.
- Own evaluation infrastructure including metrics, experiment tracking, automated regression checks, and reproducible benchmarking.
- Research methods for improving agent and model behavior on Kotlin.
- Experiment with post-training techniques (SFT, DPO, GRPO) to improve model handling of Kotlin-specific patterns.
- Investigate context engineering approaches such as CLAUDE.md/AGENTS.md files, compiler-as-verifier feedback loops, Kotlin LSP integration, and MCP-based tooling.
- Run experiments to measure impact including A/B comparisons and benchmark suites.
- Collaborate with model providers to translate Kotlin-specific findings into model improvements.
- Design and build open-source benchmarks measuring AI coding agent performance on Kotlin tasks.
- Create task datasets covering server-side development, multiplatform projects, build systems, Android, library development, and other areas, drawing on both mined real-world tasks and synthetic ones.
What we offer
- Sport subscription
- Training budget
- Private healthcare
- Lunch card
- Free coffee
- Free snacks
- Free beverages
- In-house training
- Modern office
- Flat structure
- Small teams
- International projects
- Startup atmosphere
- No dress code
- Free breakfast
- Language courses