JetBrains

Research Engineer (Agentic Behavior)

No salary information

Mid · Full-time

#358366 · Added 21 days ago · 1

Source: nofluffjobs.com

Tech Stack / Keywords

Python, SQL, Kotlin

Company and position

JetBrains is a company focused on creating developer tools. The Kotlin AI Value Stream team works on AI agents that understand, generate, and improve Kotlin code across multiple platforms including Android, Kotlin Multiplatform, server-side, web, and desktop. The team builds evaluation infrastructure, error analysis tools, and post-training pipelines to measure and improve agent behavior on real Kotlin developer tasks.

Requirements

Hands-on experience building evaluation or analysis pipelines for LLMs or AI coding agents in research or production.
Strong Python engineering skills (at least three years), with the ability to write clean, maintainable code in data-heavy and ML-adjacent codebases.
Experience with data analysis at scale: querying large datasets (SQL/Athena), building data pipelines, and performing statistical analysis.
Ability to own projects end to end: identifying problems, designing evals, running experiments, and shipping fixes.
Product-aware mindset with an understanding of real failure modes in agent usage.
Familiarity with Kotlin or strong willingness to develop Kotlin expertise.

Nice to have:

Experience with post-training LLMs: SFT, RLHF, DPO, GRPO.
Experience with modern deep learning frameworks (PyTorch) and LLM training stacks (TRL, verl, Megatron).
AI agent development experience including tool-using agents and multi-step coding workflows.
Experience with evaluation frameworks and tools like Inspect AI, Promptfoo, LM-evaluation-harness.
Experience with experiment tracking and observability tools such as Weights & Biases, MLflow, Langfuse.
Knowledge of the Kotlin ecosystem: Android, Gradle, KMP, Spring, Ktor.
Contribution to or maintenance of open-source projects, especially benchmarks or evaluation tools.

Responsibilities

Build tools for agentic error analysis.
Design and implement tooling to systematically capture, classify, and analyze errors that AI coding agents make when generating Kotlin code.
Build observability pipelines over agentic traces from JetBrains IDEs and other coding agents.
Design, implement, and maintain evaluation pipelines measuring Kotlin code generation quality across correctness, idiomaticity, build success, framework usage, and test coverage.
Build simulation environments for coding agents on realistic Kotlin developer tasks.
Own evaluation infrastructure including metrics, experiment tracking, automated regression checks, and reproducible benchmarking.
Research methods for improving agent and model behavior on Kotlin.
Experiment with post-training techniques (SFT, DPO, GRPO) to improve model handling of Kotlin-specific patterns.
Investigate context engineering approaches such as CLAUDE.md/AGENTS.md files, compiler-as-verifier feedback loops, Kotlin LSP integration, and MCP-based tooling.
Run experiments to measure impact including A/B comparisons and benchmark suites.
Collaborate with model providers to translate Kotlin-specific findings into model improvements.
Design and build open-source benchmarks measuring AI coding agent performance on Kotlin tasks.
Create task datasets covering server side, multiplatform projects, build systems, Android, library development, and others, including mined real-world and synthetic tasks.

Benefits

Sport subscription
Training budget
Private healthcare
Lunch card
Free coffee
Free snacks
Free beverages
In-house trainings
Modern office
Flat structure
Small teams
International projects
Startup atmosphere
No dress code
Free breakfast

Language courses
