You may need to refresh the page. https://github.com/scala/scala3/issues/21637
Here is the comparison of the standard while loop with the vectorised version.
Conclusion
The case here is nuanced. The looped version is significantly faster for small array sizes.
It could be that the vectorised version is somehow inefficiently initiated. However, I'm targeting larger data sizes, so the vectorised version is left in; at those sizes it holds a throughput advantage of roughly 20%.
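
For readers who want to see the shape of the two variants being compared, here is a minimal sketch. The source does not say which operation was benchmarked, so a plain array sum is assumed purely for illustration. The names SumComparison, sumLoop and sumVectorised are hypothetical, and the vectorised variant assumes the JDK incubator Vector API, which needs the JVM flag --add-modules jdk.incubator.vector. It is not the benchmark code itself.

```scala
import jdk.incubator.vector.{DoubleVector, VectorOperators, VectorSpecies}

// Hypothetical illustration: summing an array is an assumed stand-in
// for whatever operation the benchmark actually measures.
object SumComparison:

  // Widest vector shape the current CPU supports for doubles.
  private val species: VectorSpecies[java.lang.Double] =
    DoubleVector.SPECIES_PREFERRED

  // Standard while loop: one element per iteration.
  def sumLoop(xs: Array[Double]): Double =
    var acc = 0.0
    var i = 0
    while i < xs.length do
      acc += xs(i)
      i += 1
    acc

  // Vectorised version: species.length() elements per iteration,
  // followed by a scalar loop over the leftover tail.
  def sumVectorised(xs: Array[Double]): Double =
    val upper = species.loopBound(xs.length)
    var acc = DoubleVector.zero(species)
    var i = 0
    while i < upper do
      acc = acc.add(DoubleVector.fromArray(species, xs, i))
      i += species.length()
    // Collapse the lanes into one scalar, then handle the tail.
    var total = acc.reduceLanes(VectorOperators.ADD)
    while i < xs.length do
      total += xs(i)
      i += 1
    total
```

In a sketch like this the vectorised variant does extra fixed work per call (creating the zero vector, the final lane reduction, the tail loop). That is one plausible reason a plain loop could win on small arrays, although it is not a confirmed explanation of the result above.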