LMBench on ARM Microprocessors
What is LMBench
LMBench is an open-source benchmark suite available under the GNU General Public License. You can download LMBench here: http://sourceforge.net/projects/lmbench/
Test setup
- AM37x rev C
- OMAP EVM Main Board Rev G
- Cortex-A8 clock speed of 600 MHz
- L3 clock rate of 200 MHz
- Linux version - 2.6.32 (PSP03.00.01.06)
- Compiler version - gcc version 4.3.3 (Sourcery G++ Lite 2009q1-203)
Memory Latency
The following results come from the LMBench test lat_mem_rd, run with a stride of 128 bytes. The test shows the latency of data reads from the L1 cache, the L2 cache, and external memory. Blocks small enough to fit entirely in L1 should show lower latency than larger blocks, which spill into L2 or external memory.
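The access pattern behind a strided read-latency test can be sketched as a chain of dependent loads. The sketch below is an illustration of that general technique, not LMBench's code: `build_chain`, `walk`, and the 4-byte slot size are hypothetical names and values chosen for the example. Python interpreter overhead dominates the runtime here, so the sketch shows only which locations are touched and in what order, not a usable timing.

```python
def build_chain(size_bytes, stride=128, slot_bytes=4):
    """Build a ring of indices: chain[i] holds the slot to read next.

    Only every (stride / slot_bytes)-th slot is part of the ring, so
    consecutive loads are `stride` bytes apart. slot_bytes=4 is an
    assumption standing in for a 32-bit pointer.
    """
    if size_bytes % stride or stride % slot_bytes:
        raise ValueError("size must be a multiple of stride, stride of slot size")
    n_slots = size_bytes // slot_bytes
    step = stride // slot_bytes
    chain = [0] * n_slots
    for i in range(0, n_slots, step):
        chain[i] = (i + step) % n_slots  # last element wraps back to slot 0
    return chain


def walk(chain, loads):
    """Follow the ring for `loads` steps; each load depends on the previous one."""
    p = 0
    for _ in range(loads):
        p = chain[p]
    return p


if __name__ == "__main__":
    chain = build_chain(512)            # 512-byte block, 128-byte stride
    print(walk(chain, 1))               # slot index reached after one load
    print(walk(chain, 512 // 128))      # one full lap returns to the start
```

Because every load needs the result of the one before it, the loads cannot overlap, which is why the time per load in such a test reflects latency rather than bandwidth.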
Memory Latency Results
Block transfer size | Latency (ns) | Clock cycles (at 600 MHz) |
512 Bytes | 5.059 | 3.0354 |
2K Bytes | 5.059 | 3.0354 |
4K Bytes | 5.059 | 3.0354 |
16K Bytes | 5.034 | 3.0204 |
32K Bytes | 5.028 | 3.0168 |
48K Bytes | 12.842 | 7.7052 |
64K Bytes | 15.239 | 9.1434 |
128K Bytes | 19.436 | 11.6616 |
192K Bytes | 20.362 | 12.2172 |
256K Bytes | 37.598 | 22.5588 |
384K Bytes | 113.683 | 68.2098 |
512K Bytes | 140.427 | 84.2562 |
768K Bytes | 155.576 | 93.3456 |
1M Bytes | 161.576 | 96.9456 |
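The clock-cycle column follows from the nanosecond column and the 600 MHz Cortex-A8 clock: cycles = ns × 600 / 1000. The sketch below reproduces that conversion and lists the two largest latency ratios between consecutive rows. The names `ns_to_cycles` and `largest_jumps` are hypothetical helpers introduced for this example.

```python
CPU_MHZ = 600  # Cortex-A8 clock from the test setup

# (block size, latency in ns), copied from the table above
LATENCY_NS = [
    ("512", 5.059), ("2K", 5.059), ("4K", 5.059), ("16K", 5.034),
    ("32K", 5.028), ("48K", 12.842), ("64K", 15.239), ("128K", 19.436),
    ("192K", 20.362), ("256K", 37.598), ("384K", 113.683),
    ("512K", 140.427), ("768K", 155.576), ("1M", 161.576),
]


def ns_to_cycles(ns, cpu_mhz=CPU_MHZ):
    """Convert a latency in nanoseconds to CPU clock cycles."""
    return ns * cpu_mhz / 1000.0


def largest_jumps(rows, count=2):
    """Return the `count` largest latency ratios between consecutive rows."""
    steps = [
        (rows[i][1] / rows[i - 1][1], rows[i - 1][0], rows[i][0])
        for i in range(1, len(rows))
    ]
    return sorted(steps, reverse=True)[:count]


if __name__ == "__main__":
    for size, ns in LATENCY_NS:
        print(f"{size:>5} {ns:8.3f} ns {ns_to_cycles(ns):8.4f} cycles")
    for ratio, before, after in largest_jumps(LATENCY_NS):
        print(f"{before} -> {after}: x{ratio:.2f}")
```

With the table's data, the two largest steps fall between 32K and 48K and between 256K and 384K. That would be consistent with the working set outgrowing a 32 KB L1 data cache and then a 256 KB L2 cache, but the test setup above does not list cache sizes, so treat that reading as an assumption.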
Read Memory Bandwidth
The following results come from the LMBench test bw_mem. This test uses the same setup as the previous test, but it is run at two Cortex-A8 clock speeds: 600 MHz and 1 GHz. All reads and writes in this test are performed by the Cortex-A8.
Memory Bandwidth Results
Operation | Clock speed - 600 MHz | Clock speed - 1 GHz |
(rd) read 1 MByte - 32-bit stride - 4 bytes at a time | 386.43 MB/s | 412.71 MB/s |
(rdwr) read/write 1 MByte - 32-bit stride - 4 bytes at a time | 274.8 MB/s | 310.61 MB/s |
(cp) copy 1 MByte - 32-bit stride - 4 bytes at a time | 328.03 MB/s | 345.01 MB/s |
(fcp) copy 1 MByte no stride - 4 bytes at a time | 279.77 MB/s | 329.95 MB/s |
(bcopy) copy 1 MByte - 1 byte at a time | 266.42 MB/s | 321.96 MB/s |
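One way to read the two columns is to compare the bandwidth gain for each operation against the clock-speed gain (1 GHz / 600 MHz, about 1.67x). The sketch below does that arithmetic on the table's values. `speedups` is a hypothetical helper name, and the interpretation in the note after the code is an inference from the numbers, not a statement from the original measurements.

```python
# operation -> (MB/s at 600 MHz, MB/s at 1 GHz), copied from the table above
BANDWIDTH_MB_S = {
    "rd": (386.43, 412.71),
    "rdwr": (274.8, 310.61),
    "cp": (328.03, 345.01),
    "fcp": (279.77, 329.95),
    "bcopy": (266.42, 321.96),
}

CLOCK_RATIO = 1000 / 600  # CPU clock increase between the two runs


def speedups(results):
    """Return the 1 GHz / 600 MHz bandwidth ratio for each operation."""
    return {op: fast / slow for op, (slow, fast) in results.items()}


if __name__ == "__main__":
    print(f"clock ratio: x{CLOCK_RATIO:.2f}")
    for op, ratio in speedups(BANDWIDTH_MB_S).items():
        print(f"{op:>6}: x{ratio:.3f}")
```

Every operation gains less than the clock ratio, which suggests these 1 MByte transfers are limited more by the memory system than by the Cortex-A8 core clock.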