MKL is slower on the first call (which is especially noticeable since the operation itself is not expensive) but is faster afterwards, at least if the dataset is large enough.
A quick benchmark on my machine (which is not useful for anything beyond demonstrating this):
Average over 100 repetitions per round, 4 rounds, len=10'000:
Managed: 0.0368ms, 0.0169ms, 0.0168ms, 0.0168ms
MKL: 0.0515ms, 0.0071ms, 0.0080ms, 0.0078ms
Average over 100 repetitions per round, 4 rounds, len=1'000'000:
Managed: 2.3922ms, 2.0536ms, 2.2833ms, 2.4194ms
MKL: 0.1852ms, 0.1513ms, 0.1555ms, 0.1680ms
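For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the timing scheme (several rounds, each reporting the average over a fixed number of calls, so that the first-round overhead stays visible). It is written in Python purely for illustration and is not the code that produced the numbers above; `bench_rounds`, `format_rounds` and the summation workload are hypothetical stand-ins for the actual operation.

```python
import time


def bench_rounds(fn, rounds=4, repeats=100):
    """Return the average time per call, in milliseconds, for each round."""
    averages_ms = []
    for _ in range(rounds):
        start = time.perf_counter()
        for _ in range(repeats):
            fn()
        elapsed = time.perf_counter() - start
        # Average over the calls in this round, converted to milliseconds.
        averages_ms.append(elapsed / repeats * 1000.0)
    return averages_ms


def format_rounds(label, averages_ms):
    """Format one result line in the same style as the figures above."""
    return label + ": " + ", ".join(f"{t:.4f}ms" for t in averages_ms)


if __name__ == "__main__":
    # Hypothetical stand-in workload: summing 10'000 floats.
    data = [float(i) for i in range(10_000)]
    print(format_rounds("Sum", bench_rounds(lambda: sum(data))))
```

Keeping the rounds separate rather than averaging everything together is what makes a one-time setup cost show up: it inflates only the first value in each line.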
Thanks,
Christoph