
Lead AI Platform

Posted 30+ days ago (2026/09/03)
Industry: Other Business Support Services

Job description

Integrant is looking for game changers to join our team as "Lead AI Platform".
The Lead AI Platform Engineer is responsible for bridging AI workloads with production-grade infrastructure, with a strong focus on the NVIDIA AI stack, enabling high-performance, scalable, and optimized AI systems.
This role focuses on model optimization, runtime efficiency, and GPU utilization, ensuring that AI workloads are production-ready, cost-efficient, and performant across enterprise environments.
Roles and Responsibilities:
- Translate AI/ML workloads into optimized infrastructure and deployment strategies
- Optimize model performance across GPU environments (latency, throughput, memory utilization)
- Design and implement inference and training pipelines using NVIDIA stack tools (TensorRT, Triton, NIM)
- Convert and optimize models across frameworks (PyTorch → ONNX → TensorRT); a sketch of this flow appears after the lists below
- Analyze and resolve performance bottlenecks using profiling tools (GPU, memory, network)
- Improve GPU utilization and scheduling efficiency across clusters
- Design scalable distributed training and inference architectures
- Work closely with customers to define AI infrastructure strategies and deployment models
- Support production deployments, including monitoring, rollback, and performance validation
- Conduct applied research to improve model efficiency and infrastructure utilization
- Mentor team members on AI infrastructure, optimization, and GPU systems
- Use experiment tracking tools (MLflow, W&B, Neptune) to log parameters, metrics, and artifacts for comparison (see the MLflow sketch below)
- Diagnose model degradation after deployment: concept drift, data pipeline changes, traffic pattern shifts
- Apply root cause analysis (RCA) to ML systems: isolating variables, reproducing issues

Benefits:
- Salary paid in USD
- Six-month career-advancing opportunities
- Supportive and friendly work environment
- Premium medical insurance (employee + family)
- English language development courses
- Interest-free loans paid over 2.5 years
- Technical development courses
- Planned overtime program (POP)
- Employment referral program
- Premium location in Maadi
- Social insurance

Requirements:
- 8+ years of experience in AI systems
- 8+ years of experience in ML systems, HPC, and AI infrastructure
- Strong proficiency in Python
- Strong experience with GPU-based AI workloads and performance optimization
- Deep understanding of model optimization techniques (quantization, pruning, batching)
- Hands-on experience with PyTorch, ONNX / ONNX Runtime, TensorRT / TensorRT-LLM, and Triton Inference Server
- Knowledge of CUDA, cuDNN, and GPU architecture fundamentals
- Experience with distributed systems (multi-GPU / multi-node)
- Familiarity with NCCL communication, NVLink / InfiniBand, and Kubernetes or Slurm for orchestration
- Experience deploying AI models into production environments
- Ability to analyze system bottlenecks (compute, memory, network)
- Experience with profiling tools (Nsight, TensorRT profiler, etc.)
- Knowledge of cost optimization strategies for GPU workloads
- Experience with experiment tracking tools (MLflow, W&B, Neptune) for logging parameters, metrics, and artifacts
- Ability to diagnose post-deployment model degradation: concept drift, data pipeline changes, traffic pattern shifts
- Ability to apply root cause analysis (RCA) to ML systems: isolating variables, reproducing issues

Nice to Have:
- Experience with NVIDIA NIM and the NGC ecosystem
- Exposure to Megatron-LM, NeMo, or large-scale LLM training/inference
- Experience with LLM optimization techniques (KV cache, batching strategies)
- Familiarity with MLOps practices and CI/CD for AI systems
- Experience in customer-facing architecture or consulting roles
- Familiarity with hybrid cloud / on-prem HPC environments
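
The PyTorch → ONNX → TensorRT conversion named in the responsibilities boils down to an export step followed by an engine build. The sketch below is illustrative only, not part of the posting: the model, input shape, opset version, and file names are assumptions, and the trtexec command is one common way to compile the exported ONNX file into a TensorRT engine.

```python
# Minimal sketch of the PyTorch -> ONNX -> TensorRT flow.
# Model choice, shapes, and file names are illustrative assumptions.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # placeholder model
dummy = torch.randn(1, 3, 224, 224)           # example input shape

# Export to ONNX with a dynamic batch axis so the engine can later be
# built with optimization profiles for varying batch sizes.
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=17,
)

# The ONNX file can then be compiled into a TensorRT engine, e.g. with
# the trtexec CLI that ships with TensorRT:
#   trtexec --onnx=model.onnx --saveEngine=model.plan --fp16
```

The --fp16 flag is one example of the precision/performance trade-offs (quantization, batching) this role is expected to weigh.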
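
Likewise, a minimal illustration of how the experiment tracking tools listed above log parameters, metrics, and artifacts for comparison, here using MLflow; the experiment name and all values are invented for the example.

```python
# Minimal sketch of experiment tracking with MLflow.
# Experiment name, parameters, and metric values are invented.
import mlflow

mlflow.set_experiment("gpu-inference-tuning")  # hypothetical experiment

with mlflow.start_run():
    # Record the configuration under test ...
    mlflow.log_param("precision", "fp16")
    mlflow.log_param("batch_size", 32)
    # ... and the measured results, so runs can be compared side by side.
    mlflow.log_metric("latency_ms", 4.2)
    mlflow.log_metric("throughput_qps", 7600)
    # Artifacts (e.g. a built engine or profiler report) can be attached:
    # mlflow.log_artifact("model.plan")
```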
