Matrix multiplication forms the foundation of machine learning. In this write-up we survey bilinear matrix multiplication algorithms, the most common class of algorithms that outperform the naive O(n^3) implementation.
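To make the contrast concrete, here is a minimal sketch (not from the original post) of Strassen's algorithm, the classic bilinear algorithm: it replaces the 8 block products of the naive recursion with 7, giving roughly O(n^2.81) arithmetic. This version assumes square matrices whose size is a power of two and is illustrative only, not an optimized implementation.

```python
import numpy as np

def strassen(A, B):
    """Illustrative Strassen matrix multiply for square matrices
    whose dimension is a power of two (sketch, not optimized)."""
    n = A.shape[0]
    if n <= 2:  # small base case: fall back to the naive product
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Strassen's 7 bilinear products (the naive blocked recursion needs 8)
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    # Reassemble the output blocks from the 7 products
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])
```

In practice the recursion is cut off well above 2x2 blocks and the base case is handed to a tuned dense kernel, since Strassen's advantage only appears at fairly large sizes.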

Short Form / Easy Read:

Long Form:

Most of the time, however, the matrices in machine learning workloads are tall/skinny or of irregular sizes; see Benoit's matmul profile for what typical shapes look like. In follow-on posts we will cover the theory of fast matrix multiplication for these irregular sizes, and then survey current state-of-the-art implementations such as RUY and XNNPACK.

Feel free to leave a comment on any of the material above or to reach out to the author; we are happy to discuss any feedback.

Featured Image credit: J. Li, S. Ranka, S. Sahni. Strassen’s Matrix Multiplication on GPUs. 2011 IEEE 17th International Conference on Parallel and Distributed Systems
