The Nod.ai team has been hard at work and is ready with a Summer release.

Release Notes:

Model Support:

  • Hundreds of new models and model variants added to the SHARK tank
  • Continuous integration tests each supported model variant on each supported hardware backend (Intel and AMD CPUs, NVIDIA A100, AMD MI100, and Apple Silicon CPUs and GPUs)
  • Upcoming models: OPT/GPT-3, DLRM, and V-Diffusion
  • torch-mlir enhancements for PyTorch source builds, custom op support, and more torchbench models
  • The nod.ai team is the largest contributor to torch-mlir

Deployment:

  • Added support for NVIDIA Triton Inference Server: SHARK models can now be deployed with it
  • Reduced the set of dependent packages so installation is easier and faster when the importer tools are not needed
  • Downloadable .mlir files from the SHARK tank in lieu of local importing
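As a sketch of the Triton deployment path above: a SHARK-compiled model can be placed in a standard Triton model repository alongside a `config.pbtxt` describing its interface. The model name, backend choice, tensor names, and shapes below are hypothetical placeholders, not the actual SHARK integration:

```
# Hypothetical Triton model repository layout:
#   models/
#     shark_model/
#       config.pbtxt
#       1/
#         model.py        (serving code, e.g. via Triton's Python backend)

name: "shark_model"          # placeholder model name
backend: "python"            # assumption: served through the Python backend
max_batch_size: 8

input [
  {
    name: "input__0"         # placeholder tensor name
    data_type: TYPE_FP16
    dims: [ 3, 224, 224 ]    # placeholder shape (e.g. an image model)
  }
]
output [
  {
    name: "output__0"        # placeholder tensor name
    data_type: TYPE_FP16
    dims: [ 1000 ]
  }
]
```

Once the repository is in place, `tritonserver --model-repository=models` serves the model over Triton's standard HTTP/gRPC endpoints; consult the SHARK repository for the exact integration details.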

Performance:

[Figure: cuda_nodai_triton_fp16.png — FP16 performance on CUDA, nod.ai SHARK vs. Triton]

Developer Tools:

Training and Finetuning:

SHARK Hardware Support:

  • Apple M1, M1 Max/Ultra, and M2 support now runs on CI
  • AMD MI100 (MFMA is WIP)
  • NVIDIA A100 CUDA and Vulkan on CI
  • Intel Level Zero (XMX/DPAS is WIP)

Download SHARK from https://github.com/nod-ai/SHARK
