Sitara Device Crypto Performance Comparison
Under Construction
Device Comparison
This page compares the cryptographic performance of different Sitara devices. Devices that have cryptographic accelerators available are tested with and without acceleration to show how much the acceleration improves performance over software-only cryptography.
Table of devices
Device | HW acceleration |
AM18x | none |
AM35x | AES, DES, 3DES, SHA, MD5 |
AM37x (BeagleBoard) | AES, DES, 3DES, SHA, MD5 |
AM335x (BeagleBone) | AES, SHA, MD5 |
Cryptography in the Sitara SDK
All Sitara SDKs include OpenSSL. OpenSSL is a pure software implementation of general cryptographic functions. The specific version of OpenSSL in each SDK may vary, but all versions are capable of general performance measurements. For Sitara devices with no HW acceleration, OpenSSL is therefore used to measure software-only crypto performance. The SDKs for devices with HW acceleration include an additional Linux driver for the crypto modules, as well as a driver for the Open Cryptographic Framework (OCF). The OCF driver is an open source general abstraction layer that lets user-level applications (like OpenSSL) access the available HW crypto acceleration modules. OCF includes a test application which can be used to measure crypto performance directly at the OCF level.
How the numbers are generated
Since OpenSSL is already included in the SDK, the simplest approach is to run its built-in speed test for individual algorithms. The examples below show the format of the command. Typing just "openssl speed" runs the speed test on every available algorithm, which is time consuming and usually unnecessary. Specifying the name of an algorithm runs the test for just that algorithm, and entering an invalid algorithm name causes OpenSSL to list the available algorithms. In the example below, the speed test is first executed with an invalid algorithm name to list the available algorithms, and then executed for aes-256-cbc. The results of that test are shown below.
root@am335x-evm:~# openssl speed sdkjfh
Error: bad option or value

Available values:
mdc2 md4 md5 hmac sha1 sha256 sha512 whirlpoolrmd160
idea-cbc seed-cbc rc2-cbc bf-cbc
des-cbc des-ede3
aes-128-cbc aes-192-cbc aes-256-cbc aes-128-ige aes-192-ige aes-256-ige
camellia-128-cbc camellia-192-cbc camellia-256-cbc
rc4
rsa512 rsa1024 rsa2048 rsa4096
dsa512 dsa1024 dsa2048
ecdsap160 ecdsap192 ecdsap224 ecdsap256 ecdsap384 ecdsap521
ecdsak163 ecdsak233 ecdsak283 ecdsak409 ecdsak571
ecdsab163 ecdsab233 ecdsab283 ecdsab409 ecdsab571 ecdsa
ecdhp160 ecdhp192 ecdhp224 ecdhp256 ecdhp384 ecdhp521
ecdhk163 ecdhk233 ecdhk283 ecdhk409 ecdhk571
ecdhb163 ecdhb233 ecdhb283 ecdhb409 ecdhb571 ecdh
idea seed rc2 des aes camellia rsa blowfish

Available options:
-engine e use engine e, possibly a hardware device.
-evp e use EVP e.
-decrypt time decryption instead of encryption (only EVP).
-mr produce machine readable output.
-multi n run n benchmarks in parallel.
Command exited with non-zero status 1

root@am335x-evm:~# time -v openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 1399402 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 376353 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 96197 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 24205 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 3029 aes-256 cbc's in 3.00s
OpenSSL 1.0.0d 8 Feb 2011
built on: Mon Mar 19 09:02:42 CDT 2012
options:bn(64,32) rc4(ptr,int) des(idx,risc1,2,long) aes(partial) idea(int) blowfish(idx)
compiler: arm-arago-linux-gnueabi-gcc -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -mthumb-interwork -mno-thumb --sysroot=/home/hudson/amsdk-nightly-build-05.04.01.00/cortex-A8/arago-tmp/sysroots/armv7a-arago-linux-gnueabi -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -fexpensive-optimizations -frename-registers -fomit-frame-pointer -O2 -ggdb2 -Wall -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc       7463.48k     8028.86k     8208.81k     8261.97k     8271.19k
Command being timed: "openssl speed aes-256-cbc"
User time (seconds): 15.01
System time (seconds): 0.02
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.05s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6608
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 445
Voluntary context switches: 11
Involuntary context switches: 315
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
root@am335x-evm:~#
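Each figure in the final table of the output above can be reproduced from the progress lines that precede it: operations completed, times the block size, divided by the reported seconds, then divided by 1000. The following is a minimal Python sketch of that arithmetic. The names `throughput_k`, `table`, and `LINE` are illustrative only, and the regular expression assumes the progress lines keep exactly the format shown in this run.

```python
import re

# Progress lines copied from the software-only run shown above.
PROGRESS = """\
Doing aes-256 cbc for 3s on 16 size blocks: 1399402 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 376353 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 96197 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 24205 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 3029 aes-256 cbc's in 3.00s
"""

# Assumed line format: block size, operation count, and reported seconds.
LINE = re.compile(
    r"Doing (?P<name>.+?) for \d+s on (?P<size>\d+) size blocks: "
    r"(?P<count>\d+) .+? in (?P<secs>[\d.]+)s"
)


def throughput_k(line):
    """Return (block size, 1000s of bytes per second) for one progress line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError("not a speed-test progress line: %r" % line)
    size = int(m.group("size"))
    count = int(m.group("count"))
    secs = float(m.group("secs"))
    return size, count * size / secs / 1000.0


def table(text):
    """Map block size to throughput for every non-empty line in text."""
    return dict(throughput_k(line) for line in text.splitlines() if line.strip())


for size, rate in sorted(table(PROGRESS).items()):
    print("%5d bytes: %.2fk" % (size, rate))
```

For the 16-byte line this is 1399402 x 16 / 3.00 / 1000, which is approximately 7463.48 and matches the first column of the table.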
The results above show the OpenSSL speed test for AES-256 in CBC mode without crypto acceleration. The command was prefixed with "time -v" to report additional metrics about how the operation used the CPU. Note that without crypto acceleration the speed test kept the CPU almost fully occupied (99%), nearly all of it in user time.
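The CPU percentage reported by "time -v" can be cross-checked against the other fields of the same report. The Python sketch below assumes the common definition, user plus system CPU time divided by elapsed time and truncated to a whole percent; the `time` implementation on the target may round differently. The function names are illustrative, and the inputs are the values from the run above.

```python
def cpu_percent(user_s, system_s, elapsed_s):
    """CPU share of a run as a whole percent (truncation is an assumption)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return int(100.0 * (user_s + system_s) / elapsed_s)


def user_share(user_s, system_s):
    """Fraction of the consumed CPU time that was spent in user space."""
    return user_s / (user_s + system_s)


# Values from the software-only "time -v" report above.
USER_S, SYSTEM_S, ELAPSED_S = 15.01, 0.02, 15.05

print("CPU: %d%%" % cpu_percent(USER_S, SYSTEM_S, ELAPSED_S))
print("user share of CPU time: %.1f%%" % (100 * user_share(USER_S, SYSTEM_S)))
```

Under that definition the sketch reproduces the reported 99%, with almost all of the CPU time spent in user space, which is what a pure software cipher would be expected to show.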
Now the test is executed again, this time with the additional "-evp" and "-engine cryptodev" parameters on the OpenSSL command, which give it access to the crypto accelerators.
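The two invocations on this page differ only in their arguments, which a small helper can make explicit. The following is a minimal Python sketch and not part of the SDK: `speed_command` and `run_speed` are hypothetical helper names, and only the `-evp` and `-engine` options from the option list printed earlier are used. Running it assumes an `openssl` binary on the target and, for the accelerated form, an engine named `cryptodev`.

```python
import subprocess


def speed_command(algorithm, engine=None):
    """Build the argument list for one `openssl speed` run.

    Hypothetical helper for illustration; not an OpenSSL or SDK API.
    """
    if not algorithm:
        raise ValueError("an algorithm name such as 'aes-256-cbc' is required")
    if engine is None:
        # Software-only form, as in the first run on this page.
        return ["openssl", "speed", algorithm]
    # Accelerated form: select the cipher through EVP and name the engine.
    return ["openssl", "speed", "-evp", algorithm, "-engine", engine]


def run_speed(algorithm, engine=None):
    """Run the benchmark and return its standard output as text."""
    result = subprocess.run(
        speed_command(algorithm, engine),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

On a target, `run_speed("aes-256-cbc")` and `run_speed("aes-256-cbc", engine="cryptodev")` would correspond to the software-only and accelerated runs respectively.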
root@am335x-evm:~# time -v openssl speed -evp aes-256-cbc -engine cryptodev
engine "cryptodev" set.
Doing aes-256-cbc for 3s on 16 size blocks: 137551 aes-256-cbc's in 0.12s
Doing aes-256-cbc for 3s on 64 size blocks: 102837 aes-256-cbc's in 0.07s
Doing aes-256-cbc for 3s on 256 size blocks: 52428 aes-256-cbc's in 0.06s
Doing aes-256-cbc for 3s on 1024 size blocks: 17712 aes-256-cbc's in 0.04s
Doing aes-256-cbc for 3s on 8192 size blocks: 2460 aes-256-cbc's in 0.01s
OpenSSL 1.0.0d 8 Feb 2011
built on: Mon Mar 19 09:02:42 CDT 2012
options:bn(64,32) rc4(ptr,int) des(idx,risc1,2,long) aes(partial) idea(int) blowfish(idx)
compiler: arm-arago-linux-gnueabi-gcc -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -mthumb-interwork -mno-thumb --sysroot=/home/hudson/amsdk-nightly-build-05.04.01.00/cortex-A8/arago-tmp/sysroots/armv7a-arago-linux-gnueabi -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -fexpensive-optimizations -frename-registers -fomit-frame-pointer -O2 -ggdb2 -Wall -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      18340.13k    94022.40k   223692.80k   453427.20k  2015232.00k
Command being timed: "openssl speed -evp aes-256-cbc -engine cryptodev"
User time (seconds): 0.32
System time (seconds): 12.40
Percent of CPU this job got: 84%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 15.05s
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 6592
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 444
Voluntary context switches: 11
Involuntary context switches: 313256
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
root@am335x-evm:~#
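The accelerated figures need some care when read. Each progress line reports far less time than the 3 s test interval (0.01 s to 0.12 s), while the "time -v" report shows most of the CPU time (12.40 s of 12.72 s) as system time. This suggests, as an assumption that this page does not confirm, that the speed test divides by user CPU time only. The Python sketch below recomputes the table as reported and, for comparison, with the nominal 3 s interval as the divisor. The second figure is an estimate under that assumption, not a measured result, and all names are illustrative.

```python
# (block size, operations, reported seconds) from the accelerated run above.
ACCELERATED = [
    (16, 137551, 0.12),
    (64, 102837, 0.07),
    (256, 52428, 0.06),
    (1024, 17712, 0.04),
    (8192, 2460, 0.01),
]

# Nominal duration of each test, taken from "Doing ... for 3s" in the output.
NOMINAL_S = 3.0


def rate_k(size, count, seconds):
    """Throughput in 1000s of bytes per second."""
    return size * count / seconds / 1000.0


def compare(rows, interval_s=NOMINAL_S):
    """Return (size, reported rate, rate over the nominal interval) tuples."""
    return [
        (size, rate_k(size, count, secs), rate_k(size, count, interval_s))
        for size, count, secs in rows
    ]


for size, reported, nominal in compare(ACCELERATED):
    print("%5d bytes: reported %11.2fk, over 3 s %8.2fk" % (size, reported, nominal))
```

The "reported" column reproduces the table in the output above. Under the stated assumption, the reported figures are better read as throughput per second of user CPU time than as wall-clock throughput, so they should not be compared directly with the software-only table without keeping that difference in mind.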