Current Openings

AI training and inference are very inefficient today, incurring huge costs with real impact on the environment. We live in the Jurassic era of AI efficiency, and Nod.AI believes we can do much better through a solid focus on computer science fundamentals and first-principles thinking. Our AI Compiler Engineers deliver the best front end for machine learning frameworks such as TensorFlow and PyTorch, and efficiently parallelize and distribute workloads onto the Nod Runtime for auto-tuned, high-performance execution on large clusters and SoCs. Our ML Systems Engineers deploy compiled AI models onto a wide range of hardware platforms, from supercomputers with thousands of devices to System-on-Chip designs with hundreds of processing elements (PEs).

We enable companies to efficiently deploy Machine Learning models without doing the heavy lifting of optimizing the frameworks and models for their hardware and clusters. 

Come work with a high-caliber team where everyone is an engineer first, and where we guide our decisions with data and celebrate our results.

Machine Learning Systems (MLSys) Engineer 

As an MLSys Engineer, you will be responsible for scaling out the Nod AI Runtime over multiple devices on a single host and across hundreds of machines in an HPC / datacenter environment.



  • Focus on high-efficiency deployments of machine learning models for training and inference on a wide range of hardware, from supercomputers to AI System-on-Chips.
  • You will need to understand the internals of PyTorch, TensorFlow and MXNet, and make sure the Nod Runtime keeps up with the latest developments and consistently outperforms those frameworks.
  • You will hold a lot of complexity in your head while solving fundamental computer science problems at scale from first principles.
  • You will adapt the Nod Runtime to emerging AI silicon and HPC clusters.


  • Experience with ML frameworks such as PyTorch, MXNet, TensorFlow, JAX, etc.
  • Experience implementing or optimizing linear algebra functions (convolutions, fully connected layers, etc.)
  • Proficiency in Python, C or C++
  • Proficiency with large-scale / exascale systems
  • Proficiency in C++ concurrency and parallelism standards
  • Programming experience with CUDA, OpenCL, oneAPI, DPC++, etc.
  • Familiarity with distributed communication collectives libraries such as NCCL, oneCCL, etc.



AI Compiler Engineer

As a Compiler Engineer, you will design and implement significant parts of the Nod Neural Compiler and Runtime. You will work on performance analysis, the design and implementation of new optimization passes, and the development of new backend targets for custom accelerators and FPGAs.


  • Analyze and design effective compiler optimizations.
  • Implement and/or enhance code generation targeting machine learning accelerators.
  • Code using a mixture of Python and C++.
  • Develop hardware-aware optimizations for emerging ML algorithms.
  • Contribute to the development of machine-learning libraries, intermediate representations, export formats and analysis tools.
  • Employ the scientific method to evaluate performance and to debug, diagnose and drive resolution of cross-disciplinary system issues.
  • Work with algorithm research teams to map CNN graphs to hardware implementations, model data-flows, create cost-benefit analysis and estimate cluster or silicon power and performance.


  • 2+ years of experience with an MS or PhD (preferred) in Computer Science, Electrical Engineering or equivalent field.
  • Experience in deep learning algorithms, frameworks and their intermediate representations, e.g., PyTorch/Glow, TensorFlow XLA, LLVM/MLIR, Apache TVM
  • Guru in software design and programming in C++ 
  • Bonus if you have experience working on a compiler toolchain codebase, such as Clang, LLVM/MLIR, GCC. 
  • Bonus if you have written compiler optimization passes, done polyhedral optimizations, etc.
  • Good understanding of language design, compiler optimizers, backend code generators.
  • Experience with compiler architecture, particularly dynamic language compilers or HPC compilers or ML compilers.
  • Experience in code generation targeting machine learning accelerators, GPUs and CPUs, e.g., RISC-V



GPU / CUDA Performance Engineer

We are looking for GPU kernel performance engineers with hands-on experience optimizing CUDA kernels. You will work on Nod.AI’s compiler codegen technologies to generate GPU kernels that can outperform hand-tuned kernels. Yes, you will be at the forefront of moving machine learning frameworks from hand-written GPU kernels to efficiently generated codegen kernels.


  • 4+ years of experience with an MS or PhD (preferred) in Computer Science, Electrical Engineering or equivalent field.
  • Experience in GPU-enabled deep learning algorithms, frameworks and their intermediate representations, e.g., PyTorch/Glow, TensorFlow XLA, LLVM/MLIR, Apache TVM
  • Expert in software design and programming in C++ 
  • GPU programming experience in CUDA, OpenACC, or OpenCL
  • Experience with GPU/CPU kernel benchmarking
