AI Systems Engineer (Agents & Evaluation)

160 - 240 PLN/hour · B2B (net)
Senior · Full-time · B2B
#352193 · Added yesterday
Source: nofluffjobs.com

Tech Stack / Keywords

Python · Machine Learning · Reinforcement Learning · Claude Code · Codex

Company and position

We’re looking for AI/ML/Environment Engineers to work with a leading provider of AI evaluation and optimization solutions, trusted by multinational companies to optimize AI agents and detect performance issues in large language models. The company’s mission is to enable safe, verifiable, and aligned AGI through rigorous, real-world agent evaluation.


Requirements

  • 4+ years of experience in data engineering, simulation systems, or ML infrastructure.
  • Strong command of Python and systems-level programming.
  • Practical experience working with AI, including frameworks (LangChain, LangGraph, mcp-server) and prompt engineering.
  • Deep understanding of ML concepts.
  • Curiosity and conviction around building environments that steer AGI.

Nice to have:

  • Knowledge of RL concepts: reward modeling, environment dynamics, verifiability, evaluation, and agent interaction loops.
  • Familiarity with instrumentation, metrics, and data pipelines for RL evaluation.
  • Knowledge of Codex or Claude Code.
  • Experience integrating AI into existing systems.

Responsibilities

  • Design and develop RL environments for large-scale agent evaluation and reinforcement learning workflows.
  • Build task generation pipelines, dynamic datasets, and scripted simulations with controlled complexity and stochastic behavior.
  • Implement verification systems and reward models to automatically assess agent trajectories and reasoning quality.
  • Collaborate with infrastructure and systems teams to ensure environments are scalable, reproducible, and fully instrumented for telemetry and monitoring.
  • Develop APIs and orchestration frameworks for executing, resetting, and evaluating agents across multiple environments.
  • Work closely with research and customer-facing teams to transform open-ended requirements into measurable and testable solutions.
  • Optimize environment performance, logging systems, and reward reproducibility across distributed architectures.
  • Work 2 p.m. - 10 p.m. daily due to the client’s time zone.

Offer

  • Sport subscription
  • Private healthcare

Other information

Due to the client’s time zone, work hours are 2 p.m. - 10 p.m. daily.

Acaisoft
