Analysis of the Huggingface Infinity Inference Engine

We love Huggingface and use it a lot; it has made NLP models much easier to use. They recently released an enterprise product, an inference solution that packages all the optimized software for a hardware deployment in a Docker container. https://huggingface.co/infinity Performance of ML Systems is close […]


Generating code to outperform native MatMul libraries (Accelerate, BLIS, MKL) and measuring it with MMperf

GEMM operations dominate the computation in modern Machine Learning models. Silicon vendors typically provide hand-optimized GEMM libraries such as Apple’s Accelerate framework [1], AMD’s BLIS [2], and Intel’s MKL [3]. There are also open-source implementations such as OpenBLAS [4], BLIS [5], and RUY [6]. We will demonstrate the performance of Nod.ai’s compiler-generated code outperforming […]
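For readers unfamiliar with the operation these libraries optimize, here is a minimal plain-Python sketch (not from the post) of the GEMM computation itself, C[i][j] = Σₖ A[i][k]·B[k][j]; vendor libraries and compiler-generated code implement exactly this contraction, just with tiling, vectorization, and cache blocking:

```python
def gemm(A, B):
    """Naive O(n*m*k) matrix multiply: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    n, k = len(A), len(A[0])
    m = len(B[0])
    assert len(B) == k, "inner dimensions must match"
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

# Small worked example:
print(gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # → [[19, 22], [43, 50]]
```

The triple loop here is the baseline that tools like MMperf benchmark optimized kernels against.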


Nod.AI’s Neural Perception stack

Computer Vision and Neural Perception have been disrupted by Machine Learning. Nod has optimized state-of-the-art Computer Vision models deployed on very low-power devices. Check out our post on LinkedIn.
