Is it really true that NOEL-V is able to reach up to 4.69 CoreMark/MHz?
After a series of tests, we’ve measured the best score we could with the GRLIB Release 2025.1-b4296. Ten iterations ran for almost 12 seconds at a 24 MHz clock, which gives the following CoreMark result:
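For reference, the figures above translate into a score like this (a quick sketch of the standard CoreMark arithmetic: iterations per second, normalized by the clock frequency):

```shell
# CoreMark score = iterations / total runtime; divide by the clock in MHz
# to get the per-MHz figure quoted by Gaisler.
awk 'BEGIN {
    iters = 10; secs = 12; clk_mhz = 24
    score = iters / secs                  # iterations per second
    printf "%.4f CoreMark/MHz\n", score / clk_mhz
}'
# prints 0.0347 CoreMark/MHz
```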
The benchmark was compiled using xPack GCC. Gaisler’s GCC was also tested and didn’t give any different execution times. Using any code optimisation or the suggested flags from this page only made the result worse, making the same 10 iterations run for a whole 145 seconds. Malloc wasn’t used. Printf wasn’t used. Messages were sent to the terminal using the APB UART. Execution time was measured with the APB Timer.
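For context, a minimal sketch of what such a bare-metal CoreMark build might look like with the xPack toolchain (the ISA string, linker script, and port file are assumptions here; in practice they come from your board support code):

```shell
# Hypothetical bare-metal build; core_portme.c and link.ld are placeholders
# for your own port and linker script.
riscv-none-elf-gcc -O2 -march=rv32imc_zicsr -mabi=ilp32 \
    -DITERATIONS=10 -DPERFORMANCE_RUN=1 \
    core_list_join.c core_main.c core_matrix.c core_state.c core_util.c \
    core_portme.c -T link.ld -o coremark.elf
```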
We can actually get significantly better than 4.69 CoreMark/MHz these days - should update those numbers.
Anyway, you say nothing about which NOEL-V configuration you are running. With an IP as configurable as NOEL-V, the specific configuration makes a lot of difference. I suppose it would be a good idea to publish numbers on the web for something smaller than the HPP64 as well.
RV64 vs RV32 should not really matter, but there are a lot of extensions that would help performance, as would using a newer compiler. The compiler options you mention do not even include an optimization level - I hope they are simply not shown there?
Aside from the above, your amazingly low numbers (and the fact that things get even worse when you try what we used) strongly suggest that you have small or no caches (or caches that are not working properly). CoreMark really needs to run from L1 cache, and that requires 16+16 kByte (I+D) caches, and then making sure the code actually fits - not letting the compiler unroll loops too much; O3-level optimization by itself may not be good.
I made some changes to the project. The first time, I forgot to mention that I’m using RV32 with the minimal configuration. Since then I have tweaked the `noelvcpu.vhd` file in the MIN config. Those changes include:
Increased the L1 caches from 8+8 to 64+64 kByte (I+D);
Set the `bhtentries` field to 128;
Included the following extensions: Zcb, Zba, Zbb, Zbc, Zbkb, Zbkc, Zbkx;
Made sure the code executes from internal memory, generated with `ahbram`.
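On the software side, a hedged sketch of the ISA string matching the extensions enabled above (the exact spelling, ordering, and availability depend on your GCC version; note that Zcb also requires Zca, which the C extension provides):

```shell
# Assumed ISA string for the extension set listed above; verify against
# your toolchain's documentation before relying on it.
MARCH=rv32imc_zicsr_zba_zbb_zbc_zbkb_zbkc_zbkx_zcb
echo "compile with: riscv-none-elf-gcc -O3 -march=$MARCH -mabi=ilp32 ..."
```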
Now, compiled with O3-level optimization and the extensions mentioned above, the algorithm runs at 0.526 CoreMark/MHz. It IS an improvement, but I’m still left wondering what else I can do to increase performance. Apart from the MMU, I don’t see anything else I could enable in the project, and I doubt it would be a fair trade-off between performance and LUT usage anyway.