AI training and inference are very inefficient today, incurring huge costs with real impact on the environment. We live in the Jurassic era of AI efficiency, and Nod.ai believes we can do much better with a solid focus on computer science fundamentals and first-principles thinking. Our AI Compiler Engineers deliver the best front end for Machine Learning frameworks such as TensorFlow and PyTorch, and efficiently parallelize and distribute workloads onto the SHARK Runtime for auto-tuned, high-performance execution on large clusters and SoCs. Our ML Systems Engineers deploy compiled AI models onto a wide range of hardware platforms, from supercomputers with thousands of devices to System-on-Chip designs with hundreds of processing elements (PEs).

We enable companies to efficiently deploy Machine Learning models without doing the heavy lifting of optimizing the frameworks and models for their hardware and clusters. 

Come work with a high-caliber team where everyone is an engineer first, and where we guide our decisions with data and celebrate our results.

Current Openings

Machine Learning Systems (MLSys) Engineer 

As an MLSys Engineer, you will be responsible for scaling out the Nod AI Runtime across multiple devices on a single host and across hundreds of machines in an HPC / datacenter environment.

Responsibilities

  • Focus on high-efficiency deployments of Machine Learning models for training and inference on a wide range of hardware, from supercomputers to AI System-on-Chips.
  • You will need to understand the internals of PyTorch, TensorFlow and MXNet, and make sure the Nod Runtime keeps up with the latest developments and consistently outperforms those frameworks.
  • You will hold a lot of complexity in your head while solving fundamental computer science problems at scale from first principles.
  • You will adapt the Nod Runtime to emerging AI silicon and HPC clusters.

Qualifications

  • Experience with ML frameworks such as PyTorch, MXNet, TensorFlow, JAX etc.
  • Experience implementing or optimizing linear algebra operations (convolutions, fully connected layers, etc.)
  • Proficiency in Python, C or C++ 
  • Proficiency in large scale / exascale systems
  • Proficiency in C++ Concurrency and Parallelism standards
  • Programming experience with CUDA, OpenCL, oneAPI, DPC++ etc.
  • Familiarity with distributed communication collectives libraries such as NCCL, oneCCL etc.

ML Systems Engineer:  email stdin@nod.ai

A.I Compiler Engineer

As a Compiler Engineer, you will work on the design and implementation of significant parts of the Nod Neural Compiler and Runtime. You will carry out performance analysis, design and implement new optimization passes, and develop new backend targets for custom accelerators and FPGAs.

Responsibilities

  • Analyze and design effective compiler optimizations.
  • Implement and/or enhance code generation targeting machine learning accelerators.
  • Code using a mixture of Python and C++.
  • Develop hardware-aware optimizations for emerging ML algorithms.
  • Contribute to the development of machine-learning libraries, intermediate representations, export formats etc.
  • Employ scientific methods to evaluate performance and to debug, diagnose and drive resolution of cross-disciplinary system issues.
  • Work with algorithm research teams to map graphs to hardware implementations, model data flows, create cost-benefit analyses and estimate cluster or silicon power and performance.

Qualifications

  • 2+ years of experience with an MS or PhD (preferred) in Computer Science, Electrical Engineering or equivalent field.
  • Experience in deep learning algorithms, frameworks and their intermediate representations, e.g. PyTorch/Glow, TensorFlow XLA, LLVM/MLIR, Apache TVM
  • Guru in software design and programming in C++ 
  • Bonus if you have experience working on a compiler toolchain codebase, such as Clang, LLVM/MLIR or GCC.
  • Bonus if you have written compiler optimization passes, done polyhedral optimizations etc.
  • Good understanding of language design, compiler optimizers and backend code generators.
  • Experience with compiler architecture, particularly dynamic language compilers, HPC compilers or ML compilers.
  • Experience in code generation targeting machine learning accelerators, GPUs and CPUs (e.g. RISC-V)

A.I Compiler Engineer:  email stdin@nod.ai

GPU Performance Engineer (CUDA/ROCm/Metal)

Nod.ai is looking for GPU Kernel Performance Engineers with hands-on experience optimizing CUDA kernels. You will work on Nod.ai’s compiler codegen technologies to generate GPU kernels that can outperform hand-tuned kernels. You will be at the forefront of moving Machine Learning frameworks from handwritten GPU kernels to efficiently generated codegen kernels.

Qualifications

  • 4+ years of experience with an MS or PhD (preferred) in Computer Science, Electrical Engineering or equivalent field.
  • Experience in GPU-enabled deep learning algorithms, frameworks and their intermediate representations, e.g. PyTorch/Glow, TensorFlow XLA, LLVM/MLIR, Apache TVM
  • Expert in software design and programming in C++ 
  • GPU programming experience in CUDA, OpenACC, or OpenCL
  • Experience with GPU/CPU kernel benchmarking

GPU / CUDA Performance Engineer: https://www.linkedin.com/jobs/view/2698156193/ or email stdin@nod.ai

Cloud Engineer

Nod.ai is seeking a Cloud Engineer to help build, operate, and support Nod.ai’s multi-cloud infrastructure. You will automate the deployment of large-scale machine learning models for training and inference. In this role you will be the first point of contact for customers, so it is important that they have a pleasant and easy experience in which the technology gets out of the way and enables them to solve their problems.

Responsibilities

  • Be responsible for the Nod.ai SaaS infrastructure hosted on AWS, GCP and Azure
  • Operate services running in Kubernetes, managed cloud services, AWS SageMaker etc.
  • Build and manage systems across multiple cloud environments and on-premise deployments

Qualifications

  • Expert understanding of Linux systems and networking
  • Experience working with cloud providers such as AWS, GCP, or Azure
  • Command of a scripting language such as Python or Bash, as well as Git
  • Proficiency and experience with infrastructure-as-code / configuration management tools such as Terraform, SLURM etc.
  • Experience operating Kubernetes at scale

Cloud Engineer: email stdin@nod.ai