Performance

The core aim of this library is to provide platform-native, friction-free performance. I'm confident that I can't do better than this - at least not without an absurd amount of effort.

In general, cross-platform performance is a hard problem. We sidestep it where possible by providing compile-time @inline shims to BLAS implementations.

|                | JVM                               | JS                                | Native        | Cross          |
|----------------|-----------------------------------|-----------------------------------|---------------|----------------|
| Data structure | Array[Double]                     | Float64Array                      | Array[Double] | NArray[Double] |
| Shims to       | https://github.com/luhenry/netlib | https://github.com/stdlib-js/blas | CBLAS         | Best available |

Consider browsing the vecxt API, and particularly the extensions object. You'll see that most definitions are @inline annotated - i.e. there is zero runtime overhead to calling this library. Also check out the benchmarks.
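
To illustrate the pattern (a minimal sketch with a hypothetical method name, not vecxt's actual API), an @inline-annotated extension method gets spliced into its call site, so the wrapper itself costs nothing:

```scala
object extensions:
  extension (v: Array[Double])
    // @inline asks the optimizer to inline this body at the call site,
    // so calling v.sumAll is as cheap as writing the while loop by hand.
    @inline def sumAll: Double =
      var acc = 0.0
      var i = 0
      while i < v.length do
        acc += v(i)
        i += 1
      acc
```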

JVM

On the JVM, firstly, we have the JVM. JVM good. Further, this library also targets the Project Panama Vector API (SIMD), which aims to provide hardware-accelerated performance. Each function has been benchmarked for performance against a plain while loop.
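
As a rough sketch of what a Vector API kernel looks like from Scala (hypothetical code, not vecxt's actual kernels; running it needs --add-modules jdk.incubator.vector):

```scala
import jdk.incubator.vector.DoubleVector

val species = DoubleVector.SPECIES_PREFERRED

// Element-wise a + b: SIMD lanes for the bulk of the array,
// then a scalar tail loop for the leftover elements.
def addSimd(a: Array[Double], b: Array[Double]): Array[Double] =
  val out = new Array[Double](a.length)
  val bound = species.loopBound(a.length)
  var i = 0
  while i < bound do
    val va = DoubleVector.fromArray(species, a, i)
    val vb = DoubleVector.fromArray(species, b, i)
    va.add(vb).intoArray(out, i)
    i += species.length()
  while i < a.length do
    out(i) = a(i) + b(i)
    i += 1
  out
```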

The BLAS shim uses that API to hit C-level performance for BLAS operations, and it's written by a Microsoft researcher. It's good.
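
Calling that shim directly looks roughly like this (a sketch assuming the dev.ludovic.netlib.blas package layout; getInstance() picks the fastest backend available on the running JVM):

```scala
import dev.ludovic.netlib.blas.BLAS

val blas = BLAS.getInstance()
val x = Array(1.0, 2.0, 3.0)
val y = Array(4.0, 5.0, 6.0)

// ddot: dot product, stride 1 through both arrays.
println(blas.ddot(x.length, x, 1, y, 1)) // 32.0
```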

JS

On Node, this shim ships with its own BLAS implementation, written as loop-unrolled routines over native arrays.
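
A Scala.js facade over one of those routines might look like the sketch below (a hypothetical binding, not vecxt's actual one; the signature follows @stdlib/blas/base/ddot's documented (N, x, strideX, y, strideY) form, and the import style assumes a CommonJS-aware bundler):

```scala
import scala.scalajs.js
import scala.scalajs.js.annotation.JSImport
import scala.scalajs.js.typedarray.Float64Array

object StdlibBlas:
  // Dot product over two Float64Arrays, loop-unrolled on the JS side.
  @js.native
  @JSImport("@stdlib/blas/base/ddot", JSImport.Default)
  def ddot(n: Int, x: Float64Array, strideX: Int, y: Float64Array, strideY: Int): Double = js.native
```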

TODO: Investigate WebAssembly?
