AI Engineer (RAG & On Prem LLMs)

17k - 23k PLN / month · Employment contract (UoP)
Mid · Full-time · Employment contract (Umowa o pracę)
#362365 · Added 13 days ago
Source: nofluffjobs.com

Tech Stack / Keywords

Python, Machine learning, AI, ML, RAG, LLAMA, Deep learning, DeepSeek, Mistral, PyTorch, Hugging Face, LangChain, Red Hat, OpenShift, NLP

Company and position

Diverse CG Sp. z o.o. Sp.k. is hiring an AI Engineer to work on enterprise use cases involving Retrieval Augmented Generation (RAG) pipelines and on-premises large language model (LLM) deployments in telco environments.


Requirements

  • At least 3 years of professional experience in ML/NLP roles, including 2+ years working with RAG systems
  • Proven experience deploying and operating LLM-based solutions in on-prem or hybrid environments
  • Hands-on experience with vLLM, LiteLLM, and open-source LLMs such as LLAMA 3.2, DeepSeek, or Mistral
  • Strong Python skills and experience with frameworks such as PyTorch, Hugging Face Transformers, and LangChain
  • Experience with vector databases (e.g., Neo4j)
  • Familiarity with Linux-based systems and Red Hat OpenShift
  • Strong problem-solving and analytical skills
  • Ability to clearly communicate complex AI concepts to non-technical stakeholders
  • Bachelor's, Master's, or PhD degree in Computer Science, Artificial Intelligence, or a related field
  • Knowledge of English (B2+/C1)

Responsibilities

  • Architect, implement, and optimize end-to-end Retrieval Augmented Generation (RAG) pipelines for enterprise use cases in on-premises environments
  • Design and integrate retrieval mechanisms (e.g., vector databases such as Neo4j) with generative models (e.g., LLAMA 3.2, Mistral)
  • Fine-tune and optimize retrieval and generation components to achieve high accuracy and low latency
  • Implement and customize inference servers using vLLM and LiteLLM for efficient and scalable LLM serving
  • Integrate open-source large language models with proprietary data sources and enterprise APIs
  • Design GPU-optimized, scalable on-prem infrastructure for model training and inference, ensuring security and data governance compliance
  • Collaborate with DevOps teams to containerize workflows using Docker and Kubernetes and automate MLOps pipelines
  • Apply performance optimization techniques such as quantization, pruning, and dynamic batching
  • Monitor system performance, troubleshoot bottlenecks, and ensure high availability
  • Work closely with data engineers and business stakeholders to translate business requirements into technical AI solutions in telco environments

Offer

  • Private medical care co-financing
  • Sports card (sport subscription)
  • Training and learning opportunities (training budget)
  • Life insurance co-financing (insurance)
  • Birthday day off (paid leave)