Senior Software Engineer, AI Inference Systems

292,500 - 507,000 PLN / month · Employment contract (gross)

375,000 - 650,000 PLN / month · Employment contract (gross)

Senior · Full-time · Employment contract

#341796 · Added two months ago · 20

Source: NVIDIA

🚫 Offer expired. This job offer is no longer active and recruitment has ended.

Tech Stack / Keywords

AI · Node · Cloud · Architecture · Python · Go · Rust · SOLID

Company and position

NVIDIA is advancing AI research and development to create technologies that enable anyone to harness the power of AI. The team consists of experts in AI, systems, and performance optimization, led by world-renowned experts in AI systems with multiple academic and industry research awards.

Requirements

Bachelor’s degree (or equivalent experience) in CS, CE, or SE with 7+ years of experience; or Master’s degree with 5+ years; or PhD with thesis and top-tier publications in ML Systems, GPU architecture, or high-performance computing.
Strong programming skills in Python and C/C++; experience with Go or Rust is a plus.
Solid CS fundamentals: algorithms & data structures, operating systems, computer architecture, parallel programming, distributed systems, deep learning theory.
Knowledge of and passion for performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and SGLang).
Familiarity with GPU programming and performance: CUDA, memory hierarchy, streams, NCCL; proficiency with profiling/debug tools (e.g., Nsight Systems/Compute).
Experience with containers and orchestration (Docker, Kubernetes, Slurm); familiarity with Linux namespaces and cgroups.
Excellent debugging, problem-solving, and communication skills.

Responsibilities

Contribute features to vLLM to empower the newest models with the latest NVIDIA GPU hardware features.
Profile and optimize the inference framework (vLLM) using methods like speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
Develop, optimize, and benchmark GPU kernels using techniques such as fusion, autotuning, and memory/layout optimization.
Build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity.
Define and build inference benchmarking methodologies and tools; contribute to the MLPerf Inference benchmarking suite.
Architect scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.
Conduct and publish original research to advance ML Systems; integrate research ideas and prototypes into NVIDIA’s software products.

Benefits

Competitive base salary range for Poland: 292,500 PLN - 507,000 PLN for Level 4, and 375,000 PLN - 650,000 PLN for Level 5.
Hybrid work mode (#LI-Hybrid).
