AI Systems Engineer (Agents & Evaluation)
160-240 PLN/hour, B2B (net)
Senior · Full-time · B2B
#352193 · Added yesterday
Source: nofluffjobs.com
Tech Stack / Keywords
Python, Machine Learning, Reinforcement Learning, Claude Code, Codex
Company and position
We’re looking for AI/ML/Environment Engineers to work with a leading provider of AI evaluation and optimization solutions. Multinational companies rely on this provider to optimize AI agents and detect performance issues in large language models. The company’s mission is to enable safe, verifiable, and aligned AGI through rigorous, real-world agent evaluation.
Requirements
- 4+ years of experience in data engineering, simulation systems, or ML infrastructure.
- Strong command of Python and systems-level programming.
- Practical experience working with AI, including frameworks and tooling (LangChain, LangGraph, MCP servers) and prompt engineering.
- Deep understanding of ML concepts.
- Curiosity and conviction around building environments that steer AGI.
Nice to have:
- Knowledge of RL concepts: reward modeling, environment dynamics, verifiability, evaluation, and agent interaction loops.
- Familiarity with instrumentation, metrics, and data pipelines for RL evaluation.
- Knowledge of Codex or Claude Code.
- Experience integrating AI into existing systems.
Responsibilities
- Design and develop RL environments for large-scale agent evaluation and reinforcement learning workflows.
- Build task generation pipelines, dynamic datasets, and scripted simulations with controlled complexity and stochastic behavior.
- Implement verification systems and reward models to automatically assess agent trajectories and reasoning quality.
- Collaborate with infrastructure and systems teams to ensure environments are scalable, reproducible, and fully instrumented for telemetry and monitoring.
- Develop APIs and orchestration frameworks for executing, resetting, and evaluating agents across multiple environments.
- Work closely with research and customer-facing teams to transform open-ended requirements into measurable and testable solutions.
- Optimize environment performance, logging systems, and reward reproducibility across distributed architectures.
- Work 2 p.m. to 10 p.m. daily due to the client’s time zone.
Offer
- Sport subscription
- Private healthcare
Other information
Due to the client’s time zone, working hours are 2 p.m. to 10 p.m. daily.
Acaisoft