Linux for ARM
Processor designs from Arm Ltd. are used in a plethora of microprocessors and SoC (System on a Chip) components, which power a wide range of devices including smartphones, tablets, PDAs, network routers, NAS systems and set-top boxes. Some (non-exhaustive) lists of devices using the ARM architecture can be found in:
- List of products using ARM processors - Wikipedia
- [OpenWrt Wiki] Targets
- ARM Linux - Developer - Machines
The initial port of the Linux kernel to ARM began back in 1994, at that time targeting an Acorn A5000 (a machine that normally ran RISC OS), and grew from there to become part of the mainline Linux kernel.
The use of ARM cores in microprocessors, microcontrollers and SoC devices from many different vendors meant that supporting Linux on a device would often require a kernel built specifically for that device. This limited the availability of general purpose Linux distributions, while making vendor specific embedded Linux kernels common. Fortunately, since most user-space applications use the kernel abstractions to access devices, the same user-space can be used with any kernel built for the same flavor of the architecture (see ArmPorts - Debian Wiki).
Support for a selection of ARM based systems ('arm') appeared in the Debian GNU/Linux 2.2 ('potato') release in 2000. The current Debian GNU/Linux 11 (bullseye) supports ARM through the 'armel', 'armhf' and 'arm64' (a.k.a. 'aarch64') ports.
QEMU
The diversity of devices using the ARM processor means the QEMU system emulators for ARM provide a large number of emulated systems, with the QEMU 5.2.0 build I'm using listing 90 systems for the 64-bit system emulator (qemu-system-aarch64), and 84 for the 32-bit system emulator (qemu-system-arm). While most systems appear in the lists for both emulators (suggesting they could be implemented with 32-bit or 64-bit processor cores) a small set of systems are 64-bit only.
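The comparison between the two emulators can be reproduced by parsing the output of each emulator's '-machine help' listing. The following is a minimal sketch, not the method used to obtain the counts above: the function names are hypothetical, and the sample text is an abbreviated illustration of the listing layout (a header line followed by one machine per line, name first), not captured output.

```python
def parse_machine_help(text):
    """Return the set of machine names found in '-machine help' style output."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Supported machines are:" header.
        if not fields or line.startswith("Supported machines"):
            continue
        names.add(fields[0])
    return names


def only_in_64bit(help_aarch64, help_arm):
    """Machine names listed by the 64-bit emulator but not by the 32-bit one."""
    return sorted(parse_machine_help(help_aarch64) - parse_machine_help(help_arm))


# Abbreviated, illustrative sample text (descriptions shortened).
SAMPLE_ARM = """Supported machines are:
akita                Sharp PDA
virt                 QEMU ARM Virtual Machine
"""
SAMPLE_AARCH64 = SAMPLE_ARM + "sbsa-ref             SBSA reference machine\n"

print(only_in_64bit(SAMPLE_AARCH64, SAMPLE_ARM))
```

In real use the text could be captured with something like subprocess.run(["qemu-system-arm", "-machine", "help"], capture_output=True, text=True).stdout, and likewise for qemu-system-aarch64.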
While most of the available systems correspond to physical hardware, the "QEMU ARM Virtual Machine" system ('virt') is a virtual system based on the use of paravirtualized devices. This provides performance improvements, particularly for I/O, and is useful for software development and testing, for cases where specific hardware features are not required.
Emulation Command
Our 'virt' system was run with the QEMU command:
$ qemu-system-arm \
    -machine virt \
    -m 1024M \
    -drive if=none,file=hda_debian10_virt.qcow2,format=qcow2,id=hd \
    -device virtio-blk-device,drive=hd \
    -netdev 'user,guestfwd=:10.0.2.1:22-cmd:netcat 127.0.0.1 22,hostfwd=::2222-:22,id=mynet' \
    -device virtio-net-device,netdev=mynet \
    -kernel live-vmlinuz \
    -initrd live-initrd.img \
    -append 'root=/dev/vda2' \
    -no-reboot \
    -name 'Debian Linux 10 (buster) for armhf on QEMU (virt)'
By default the console is on the serial port (use ctrl-alt-2 to switch to the serial port, or the "View" menu if using the GUI). The storage and networking are specified as 'virtio' devices. The '-kernel' and '-initrd' parameters are used to boot the Linux kernel directly without having to worry about system firmware. The network forwarding rules provide host/guest ssh access.
System Information
So let's see what Linux has to say about the system...
uname & lsb_release
Operating system release and version information:
$ uname -a
Linux deb-virt 4.19.0-17-armmp-lpae #1 SMP Debian 4.19.194-1 (2021-06-10) armv7l GNU/Linux
So a "Linux" kernel, on a node named "deb-virt", kernel release "4.19.0-17-armmp-lpae" (Debian's ABI name for its 4.19 series kernel, here packaged from upstream 4.19.194 with Debian patches), version "#1 SMP Debian 4.19.194-1 (2021-06-10)", machine type "armv7l" for operating system "GNU/Linux".
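The breakdown of the 'uname -a' line into its fields can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, and it assumes the layout seen above, where the optional processor and hardware platform fields are not printed, so the last two tokens are the machine type and operating system.

```python
def parse_uname_a(line):
    """Split a 'uname -a' line into named fields.

    Assumes the optional processor and hardware-platform fields are
    absent, as in the output shown above.
    """
    parts = line.split()
    return {
        "sysname": parts[0],
        "nodename": parts[1],
        "release": parts[2],
        # The version string contains spaces, so take everything in between.
        "version": " ".join(parts[3:-2]),
        "machine": parts[-2],
        "os": parts[-1],
    }


UNAME_LINE = ("Linux deb-virt 4.19.0-17-armmp-lpae #1 SMP Debian "
              "4.19.194-1 (2021-06-10) armv7l GNU/Linux")

for key, value in parse_uname_a(UNAME_LINE).items():
    print(f"{key:9s} {value}")
```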
Distribution information from Linux Standard Base (LSB):
$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 10 (buster)
Release:        10
Codename:       buster
Looking at the Debian release information files:
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
$ cat /etc/debian_version
10.10
Confirmed as a Debian Linux 10 (buster) distribution.
/proc/cpuinfo & lscpu
Processor information:
$ lscpu
Architecture:        armv7l
Byte Order:          Little Endian
CPU(s):              1
On-line CPU(s) list: 0
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           1
Vendor ID:           ARM
Model:               1
Model name:          Cortex-A15
Stepping:            r2p1
BogoMIPS:            125.00
Flags:               half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
$ cat /proc/cpuinfo
processor       : 0
model name      : ARMv7 Processor rev 1 (v7l)
BogoMIPS        : 125.00
Features        : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0xc0f
CPU revision    : 1

Hardware        : Generic DT based system
Revision        : 0000
Serial          : 0000000000000000
While the processor is a Cortex-A15 (Wikipedia) it shows only a single core rather than the more usual 2 or 4 cores. This is due to the QEMU default for the number of CPUs being one, and can be overridden with the '-smp' option.
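The implementer, variant, part and revision values reported by '/proc/cpuinfo' are fields of the processor's Main ID Register (MIDR), which the kernel also prints at boot as "[412fc0f1]". As a small sketch (the function name is hypothetical), the register value can be unpacked with shifts and masks, which also reproduces the "r2p1" stepping reported by 'lscpu':

```python
def decode_midr(midr):
    """Unpack the fields of an ARM Main ID Register (MIDR) value."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # 0x41 is 'A' for Arm
        "variant": (midr >> 20) & 0xF,        # major revision (the 'r' number)
        "architecture": (midr >> 16) & 0xF,   # 0xF means "see the CPUID registers"
        "part": (midr >> 4) & 0xFFF,          # 0xC0F is the Cortex-A15
        "revision": midr & 0xF,               # minor revision (the 'p' number)
    }


fields = decode_midr(0x412FC0F1)
stepping = "r{variant}p{revision}".format(**fields)

print({name: hex(value) for name, value in fields.items()})
print(stepping)
```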
/proc/meminfo
Memory information:
$ cat /proc/meminfo
MemTotal:        1021640 kB
MemFree:          841148 kB
MemAvailable:     900876 kB
Buffers:           13044 kB
Cached:           110268 kB
SwapCached:            0 kB
Active:            81756 kB
Inactive:          54012 kB
Active(anon):      12524 kB
Inactive(anon):     1352 kB
Active(file):      69232 kB
Inactive(file):    52660 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:        262144 kB
HighFree:         134604 kB
LowTotal:         759496 kB
LowFree:          706544 kB
SwapTotal:        997372 kB
SwapFree:         997372 kB
Dirty:                 4 kB
Writeback:             0 kB
AnonPages:         12428 kB
Mapped:            13288 kB
Shmem:              1444 kB
Slab:              23380 kB
SReclaimable:      14820 kB
SUnreclaim:         8560 kB
KernelStack:         504 kB
PageTables:          712 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1508192 kB
Committed_AS:      80592 kB
VmallocTotal:     245760 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
Percpu:              128 kB
AnonHugePages:      2048 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:          16384 kB
CmaFree:           12260 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
So 1.0 GiB of RAM, with about 974 MiB (997372 kB) of swap.
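The "kB" figures in '/proc/meminfo' are units of 1024 bytes, so it helps to convert them before quoting sizes. A minimal sketch of that conversion, using three lines taken from the output above (the function names are hypothetical):

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo' style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            # The first field is the number; an optional 'kB' unit follows.
            values[key.strip()] = int(fields[0])
    return values


def kib_to_mib(kib):
    """Convert a meminfo 'kB' value (1024-byte units) to MiB."""
    return kib / 1024


SAMPLE = """MemTotal:        1021640 kB
SwapTotal:        997372 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(f"MemTotal:  {kib_to_mib(info['MemTotal']):.1f} MiB")
print(f"SwapTotal: {kib_to_mib(info['SwapTotal']):.1f} MiB")
```

On a live system the same function can be fed the contents of '/proc/meminfo'. The MemTotal figure comes out a little under the 1024 MiB given to QEMU because the kernel reserves some memory for itself at boot.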
lspci and lsusb
Details of installed PCI and USB devices:
# lspci -v
# lsusb
In this particular case there are no installed PCI or USB devices (the system uses virtio devices instead) so these commands don't report anything. However the 'virt' machine can have these installed if desired.
report-hw
Report hardware information using commands:
$ report-hw uname -a: Linux deb-virt 4.19.0-17-armmp-lpae #1 SMP Debian 4.19.194-1 (2021-06-10) armv7l GNU/Linux lsmod: Module Size Used by lsmod: evdev 24576 1 lsmod: ip_tables 24576 0 lsmod: x_tables 24576 1 ip_tables lsmod: autofs4 40960 2 lsmod: ext4 618496 2 lsmod: crc16 16384 1 ext4 lsmod: mbcache 16384 1 ext4 lsmod: jbd2 102400 1 ext4 lsmod: crc32c_generic 16384 3 lsmod: fscrypto 28672 1 ext4 lsmod: ecb 16384 0 lsmod: virtio_net 45056 0 lsmod: net_failover 20480 1 virtio_net lsmod: virtio_blk 20480 4 lsmod: failover 16384 1 net_failover lsmod: virtio_mmio 20480 0 lsmod: virtio_ring 24576 3 virtio_blk,virtio_net,virtio_mmio lsmod: virtio 16384 3 virtio_blk,virtio_net,virtio_mmio df: Filesystem 1K-blocks Used Available Use% Mounted on df: udev 491544 0 491544 0% /dev df: tmpfs 102164 1440 100724 2% /run df: /dev/vda2 6715744 978756 5376132 16% / df: tmpfs 510820 0 510820 0% /dev/shm df: tmpfs 5120 0 5120 0% /run/lock df: tmpfs 510820 0 510820 0% /sys/fs/cgroup df: /dev/vda1 482922 30205 427783 7% /boot df: tmpfs 102164 0 102164 0% /run/user/1000 free: total used free shared buff/cache available free: Mem: 1021640 44976 676632 1440 300032 894452 free: Swap: 997372 0 997372 /proc/cmdline: root=/dev/vda2 /proc/cpuinfo: processor : 0 /proc/cpuinfo: model name : ARMv7 Processor rev 1 (v7l) /proc/cpuinfo: BogoMIPS : 125.00 /proc/cpuinfo: Features : half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm /proc/cpuinfo: CPU implementer : 0x41 /proc/cpuinfo: CPU architecture: 7 /proc/cpuinfo: CPU variant : 0x2 /proc/cpuinfo: CPU part : 0xc0f /proc/cpuinfo: CPU revision : 1 /proc/cpuinfo: /proc/cpuinfo: Hardware : Generic DT based system /proc/cpuinfo: Revision : 0000 /proc/cpuinfo: Serial : 0000000000000000 /proc/iomem: 00000000-00000000 : pl011@9000000 /proc/iomem: 00000000-00000000 : pl011@9000000 /proc/iomem: 00000000-00000000 : pl031@9010000 /proc/iomem: 00000000-00000000 : rtc-pl031 /proc/iomem: 00000000-00000000 : pl061@9030000 
/proc/iomem: 00000000-00000000 : pl061@9030000 /proc/iomem: 00000000-00000000 : a003c00.virtio_mmio /proc/iomem: 00000000-00000000 : a003e00.virtio_mmio /proc/iomem: 00000000-00000000 : System RAM /proc/iomem: 00000000-00000000 : Kernel code /proc/iomem: 00000000-00000000 : Kernel data /proc/interrupts: CPU0 /proc/interrupts: 18: 114697 GIC-0 27 Level arch_timer /proc/interrupts: 50: 3127 GIC-0 78 Edge virtio0 /proc/interrupts: 51: 18988 GIC-0 79 Edge virtio1 /proc/interrupts: 53: 0 GIC-0 34 Level rtc-pl031 /proc/interrupts: 54: 0 GIC-0 33 Level uart-pl011 /proc/interrupts: 55: 0 9030000.pl061 3 Edge GPIO Key Poweroff /proc/interrupts: IPI0: 0 CPU wakeup interrupts /proc/interrupts: IPI1: 0 Timer broadcast interrupts /proc/interrupts: IPI2: 0 Rescheduling interrupts /proc/interrupts: IPI3: 0 Function call interrupts /proc/interrupts: IPI4: 0 CPU stop interrupts /proc/interrupts: IPI5: 0 IRQ work interrupts /proc/interrupts: IPI6: 0 completion interrupts /proc/interrupts: Err: 0 /proc/meminfo: MemTotal: 1021640 kB /proc/meminfo: MemFree: 676764 kB /proc/meminfo: MemAvailable: 894604 kB /proc/meminfo: Buffers: 23384 kB /proc/meminfo: Cached: 254244 kB /proc/meminfo: SwapCached: 0 kB /proc/meminfo: Active: 107632 kB /proc/meminfo: Inactive: 182748 kB /proc/meminfo: Active(anon): 12840 kB /proc/meminfo: Inactive(anon): 1348 kB /proc/meminfo: Active(file): 94792 kB /proc/meminfo: Inactive(file): 181400 kB /proc/meminfo: Unevictable: 0 kB /proc/meminfo: Mlocked: 0 kB /proc/meminfo: HighTotal: 262144 kB /proc/meminfo: HighFree: 90984 kB /proc/meminfo: LowTotal: 759496 kB /proc/meminfo: LowFree: 585780 kB /proc/meminfo: SwapTotal: 997372 kB /proc/meminfo: SwapFree: 997372 kB /proc/meminfo: Dirty: 4 kB /proc/meminfo: Writeback: 0 kB /proc/meminfo: AnonPages: 12756 kB /proc/meminfo: Mapped: 13164 kB /proc/meminfo: Shmem: 1440 kB /proc/meminfo: Slab: 32760 kB /proc/meminfo: SReclaimable: 22416 kB /proc/meminfo: SUnreclaim: 10344 kB /proc/meminfo: KernelStack: 504 kB 
/proc/meminfo: PageTables: 736 kB /proc/meminfo: NFS_Unstable: 0 kB /proc/meminfo: Bounce: 0 kB /proc/meminfo: WritebackTmp: 0 kB /proc/meminfo: CommitLimit: 1508192 kB /proc/meminfo: Committed_AS: 45448 kB /proc/meminfo: VmallocTotal: 245760 kB /proc/meminfo: VmallocUsed: 0 kB /proc/meminfo: VmallocChunk: 0 kB /proc/meminfo: Percpu: 152 kB /proc/meminfo: AnonHugePages: 2048 kB /proc/meminfo: ShmemHugePages: 0 kB /proc/meminfo: ShmemPmdMapped: 0 kB /proc/meminfo: CmaTotal: 16384 kB /proc/meminfo: CmaFree: 7032 kB /proc/meminfo: HugePages_Total: 0 /proc/meminfo: HugePages_Free: 0 /proc/meminfo: HugePages_Rsvd: 0 /proc/meminfo: HugePages_Surp: 0 /proc/meminfo: Hugepagesize: 2048 kB /proc/meminfo: Hugetlb: 0 kB /proc/bus/input/devices: I: Bus=0019 Vendor=0001 Product=0001 Version=0100 /proc/bus/input/devices: N: Name="gpio-keys" /proc/bus/input/devices: P: Phys=gpio-keys/input0 /proc/bus/input/devices: S: Sysfs=/devices/platform/gpio-keys/input/input0 /proc/bus/input/devices: U: Uniq= /proc/bus/input/devices: H: Handlers=kbd event0 /proc/bus/input/devices: B: PROP=0 /proc/bus/input/devices: B: EV=3 /proc/bus/input/devices: B: KEY=100000 0 0 0 /proc/bus/input/devices:
Since this machine is mostly based on virtio devices, the main hints about the devices are from the loaded kernel modules.
lshw
Report on system hardware:
# lshw
deb-virt
    description: ARMv7 Processor rev 1 (v7l)
    width: 32 bits
  *-core
       description: Motherboard
       physical id: 0
     *-cpu
          description: CPU
          product: cpu
          physical id: 0
          bus info: cpu@0
          capabilities: half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
     *-memory
          description: System memory
          physical id: 1
          size: 997MiB
     *-virtio0
          description: Ethernet interface
          physical id: 2
          bus info: virtio@0
          logical name: eth0
          serial: 52:54:00:12:34:56
          capabilities: ethernet physical
          configuration: autonegotiation=off broadcast=yes driver=virtio_net driverversion=1.0.0 ip=10.0.2.15 link=yes multicast=yes
     *-virtio1
          description: Virtual I/O device
          physical id: 3
          bus info: virtio@1
          logical name: /dev/vda
          size: 8GiB (8589MB)
          capabilities: partitioned partitioned:dos
          configuration: driver=virtio_blk logicalsectorsize=512 sectorsize=512 signature=baa19ca7
        *-volume:0
             description: Linux filesystem partition
             vendor: Linux
             physical id: 1
             bus info: virtio@1,1
             logical name: /dev/vda1
             logical name: /boot
             version: 1.0
             serial: e8a52b96-9cbb-473a-8e57-ca46f5beb36e
             size: 487MiB
             capacity: 487MiB
             capabilities: primary bootable extended_attributes large_files ext2 initialized
             configuration: filesystem=ext2 lastmountpoint=/ modified=2021-06-21 15:37:12 mount.fstype=ext2 mount.options=rw,relatime mounted=2021-06-21 15:37:10 state=mounted
        *-volume:1
             description: EXT4 volume
             vendor: Linux
             physical id: 2
             bus info: virtio@1,2
             logical name: /dev/vda2
             logical name: /
             version: 1.0
             serial: c431a7b7-8678-4326-9c99-9208aa9d221b
             size: 6728MiB
             capacity: 6728MiB
             capabilities: primary journaled extended_attributes large_files huge_files dir_nlink recover 64bit extents ext4 ext2 initialized
             configuration: created=2021-06-21 12:43:37 filesystem=ext4 lastmountpoint=/ modified=2021-06-21 15:37:43 mount.fstype=ext4 mount.options=rw,relatime,errors=remount-ro mounted=2021-06-21 15:37:54 state=mounted
        *-volume:2
             description: Extended partition
             physical id: 3
             bus info: virtio@1,3
             logical name: /dev/vda3
             size: 974MiB
             capacity: 974MiB
             capabilities: primary extended partitioned partitioned:extended
           *-logicalvolume
                description: Linux swap volume
                physical id: 5
                logical name: /dev/vda5
                version: 1
                serial: ea76383f-0df9-4da0-86a0-c454504fa5c5
                size: 974MiB
                capacity: 974MiB
                capabilities: nofs swap initialized
                configuration: filesystem=swap pagesize=4096
The approach taken by 'lshw' does a good job of picking up the virtio devices and providing a bit of information about them.
dmesg
System messages:
# dmesg [ 0.000000] Booting Linux on physical CPU 0x0 [ 0.000000] Linux version 4.19.0-17-armmp-lpae (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.194-1 (2021-06-10) [ 0.000000] CPU: ARMv7 Processor [412fc0f1] revision 1 (ARMv7), cr=30c5387d [ 0.000000] CPU: div instructions available: patching division code [ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache [ 0.000000] OF: fdt: Machine model: linux,dummy-virt [ 0.000000] Memory policy: Data cache writealloc [ 0.000000] efi: Getting EFI parameters from FDT: [ 0.000000] efi: UEFI not found. [ 0.000000] cma: Reserved 16 MiB at 0x000000007f000000 [ 0.000000] On node 0 totalpages: 262144 [ 0.000000] DMA zone: 1728 pages used for memmap [ 0.000000] DMA zone: 0 pages reserved [ 0.000000] DMA zone: 196608 pages, LIFO batch:63 [ 0.000000] HighMem zone: 65536 pages, LIFO batch:15 [ 0.000000] psci: probing for conduit method from DT. [ 0.000000] psci: PSCIv0.2 detected in firmware. [ 0.000000] psci: Using standard PSCI v0.2 function IDs [ 0.000000] psci: Trusted OS migration not required [ 0.000000] random: get_random_bytes called from start_kernel+0x9c/0x52c with crng_init=0 [ 0.000000] percpu: Embedded 17 pages/cpu s40460 r8192 d20980 u69632 [ 0.000000] pcpu-alloc: s40460 r8192 d20980 u69632 alloc=17*4096 [ 0.000000] pcpu-alloc: [0] 0 [ 0.000000] Built 1 zonelists, mobility grouping on. 
Total pages: 260416 [ 0.000000] Kernel command line: root=/dev/vda2 [ 0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) [ 0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) [ 0.000000] Memory: 983088K/1048576K available (10240K kernel code, 1136K rwdata, 2640K rodata, 2048K init, 319K bss, 49104K reserved, 16384K cma-reserved, 245760K highmem) [ 0.000000] Virtual kernel memory layout: vector : 0xffff0000 - 0xffff1000 ( 4 kB) fixmap : 0xffc00000 - 0xfff00000 (3072 kB) vmalloc : 0xf0800000 - 0xff800000 ( 240 MB) lowmem : 0xc0000000 - 0xf0000000 ( 768 MB) pkmap : 0xbfe00000 - 0xc0000000 ( 2 MB) modules : 0xbf000000 - 0xbfe00000 ( 14 MB) .text : 0x(ptrval) - 0x(ptrval) (12256 kB) .init : 0x(ptrval) - 0x(ptrval) (2048 kB) .data : 0x(ptrval) - 0x(ptrval) (1137 kB) .bss : 0x(ptrval) - 0x(ptrval) ( 320 kB) [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 [ 0.000000] ftrace: allocating 33878 entries in 100 pages [ 0.000000] rcu: Hierarchical RCU implementation. [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1. [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1 [ 0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16 [ 0.000000] GICv2m: range[mem 0x08020000-0x08020fff], SPI[80:143] [ 0.000000] arch_timer: cp15 timer(s) running at 62.50MHz (virt). [ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns [ 0.000310] sched_clock: 56 bits at 62MHz, resolution 16ns, wraps every 4398046511096ns [ 0.000610] Switching to timer-based delay loop, resolution 16ns [ 0.013952] Console: colour dummy device 80x30 [ 0.017228] console [tty0] enabled [ 0.020224] Calibrating delay loop (skipped), value calculated using timer frequency.. 
125.00 BogoMIPS (lpj=250000) [ 0.020617] pid_max: default: 32768 minimum: 301 [ 0.023237] Security Framework initialized [ 0.023513] Yama: disabled by default; enable with sysctl kernel.yama.* [ 0.029414] AppArmor: AppArmor initialized [ 0.031051] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes) [ 0.031226] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes) [ 0.063210] CPU: Testing write buffer coherency: ok [ 0.066335] CPU0: Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable [ 0.091535] /cpus/cpu@0 missing clock-frequency property [ 0.092076] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000 [ 0.105087] Setting up static identity map for 0x40400000 - 0x404000a0 [ 0.109599] rcu: Hierarchical SRCU implementation. [ 0.123175] EFI services will not be available. [ 0.127936] smp: Bringing up secondary CPUs ... [ 0.128125] smp: Brought up 1 node, 1 CPU [ 0.128280] SMP: Total of 1 processors activated (125.00 BogoMIPS). [ 0.128447] CPU: All CPU(s) started in SVC mode. [ 0.150130] devtmpfs: initialized [ 0.172709] VFP support v0.3: implementor 41 architecture 4 part 30 variant f rev 0 [ 0.215612] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns [ 0.217325] futex hash table entries: 256 (order: 2, 16384 bytes) [ 0.242761] pinctrl core: initialized pinctrl subsystem [ 0.264268] DMI not present or invalid. [ 0.282301] NET: Registered protocol family 16 [ 0.308507] DMA: preallocated 256 KiB pool for atomic coherent allocations [ 0.311179] audit: initializing netlink subsys (disabled) [ 0.326825] audit: type=2000 audit(0.256:1): state=initialized audit_enabled=0 res=1 [ 0.333292] No ATAGs? [ 0.347263] hw-breakpoint: found 5 (+1 reserved) breakpoint and 4 watchpoint registers. [ 0.347783] hw-breakpoint: maximum watchpoint size is 8 bytes. 
[ 0.354175] Serial: AMBA PL011 UART driver [ 0.402540] 9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 54, base_baud = 0) is a PL011 rev1 [ 0.409453] console [ttyAMA0] enabled [ 0.466939] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages [ 0.487892] vgaarb: loaded [ 0.492973] media: Linux media interface: v0.10 [ 0.493361] videodev: Linux video capture interface: v2.00 [ 0.493998] pps_core: LinuxPPS API ver. 1 registered [ 0.494137] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it> [ 0.494436] PTP clock support registered [ 0.524612] clocksource: Switched to clocksource arch_sys_counter [ 0.820729] VFS: Disk quotas dquot_6.6.0 [ 0.821736] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) [ 0.831274] AppArmor: AppArmor Filesystem Enabled [ 0.872644] NET: Registered protocol family 2 [ 0.887376] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 6144 bytes) [ 0.888210] TCP established hash table entries: 8192 (order: 3, 32768 bytes) [ 0.888637] TCP bind hash table entries: 8192 (order: 4, 65536 bytes) [ 0.889268] TCP: Hash tables configured (established 8192 bind 8192) [ 0.891389] UDP hash table entries: 512 (order: 2, 16384 bytes) [ 0.891850] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes) [ 0.895884] NET: Registered protocol family 1 [ 0.896943] NET: Registered protocol family 44 [ 0.897298] PCI: CLS 0 bytes, default 64 [ 0.907466] Unpacking initramfs... 
[ 4.169145] Freeing initrd memory: 20120K [ 4.169876] kvm [1]: HYP mode not available [ 4.180099] Initialise system trusted keyrings [ 4.184120] Key type blacklist registered [ 4.185554] workingset: timestamp_bits=14 max_order=18 bucket_order=4 [ 4.212610] zbud: loaded [ 12.669135] Key type asymmetric registered [ 12.669512] Asymmetric key parser 'x509' registered [ 12.669969] bounce: pool size: 64 pages [ 12.670435] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248) [ 12.672725] io scheduler noop registered [ 12.672931] io scheduler deadline registered [ 12.673869] io scheduler cfq registered (default) [ 12.674025] io scheduler mq-deadline registered [ 12.697603] pl061_gpio 9030000.pl061: PL061 GPIO chip @0x0000000009030000 registered [ 12.705054] pci-host-generic 4010000000.pcie: host bridge /pcie@10000000 ranges: [ 12.706226] pci-host-generic 4010000000.pcie: IO 0x3eff0000..0x3effffff -> 0x00000000 [ 12.707245] pci-host-generic 4010000000.pcie: MEM 0x10000000..0x3efeffff -> 0x10000000 [ 12.707451] pci-host-generic 4010000000.pcie: MEM 0x8000000000..0xffffffffff -> 0x8000000000 [ 12.715777] vmap allocation for size 1052672 failed: use vmalloc=<size> to increase size [ 12.716368] pci-host-generic 4010000000.pcie: ECAM ioremap failed [ 12.801081] pci-host-generic: probe of 4010000000.pcie failed with error -12 [ 12.821734] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled [ 12.829543] Serial: AMBA driver [ 12.848173] libphy: Fixed MDIO Bus: probed [ 12.869160] mousedev: PS/2 mouse device common for all mice [ 12.879635] rtc-pl031 9010000.pl031: rtc core: registered pl031 as rtc0 [ 12.891945] ledtrig-cpu: registered to indicate activity on CPUs [ 12.898750] NET: Registered protocol family 10 [ 13.493137] Segment Routing with IPv6 [ 13.494253] mip6: Mobile IPv6 [ 13.494558] NET: Registered protocol family 17 [ 13.495705] mpls_gso: MPLS GSO support [ 13.496845] ThumbEE CPU extension supported. 
[ 13.497077] Registering SWP/SWPB emulation handler [ 13.500863] registered taskstats version 1 [ 13.501062] Loading compiled-in X.509 certificates [ 14.257325] Loaded X.509 cert 'Debian Secure Boot CA: 6ccece7e4c6c0d1f6149f3dd27dfcc5cbb419ea1' [ 14.258391] Loaded X.509 cert 'Debian Secure Boot Signer 2021 - linux: 4b6ef5abca669825178e052c84667ccbc0531f8c' [ 14.260934] zswap: loaded using pool lzo/zbud [ 14.263395] AppArmor: AppArmor sha1 policy hashing enabled [ 14.276770] input: gpio-keys as /devices/platform/gpio-keys/input/input0 [ 14.280592] rtc-pl031 9010000.pl031: setting system clock to 2021-06-21 14:37:34 UTC (1624286254) [ 14.280826] sr_init: No PMIC hook to init smartreflex [ 14.293803] uart-pl011 9000000.pl011: no DMA platform data [ 14.467052] Freeing unused kernel memory: 2048K [ 14.478748] Run /init as init process [ 21.290454] virtio_blk virtio1: [vda] 16777216 512-byte logical blocks (8.59 GB/8.00 GiB) [ 21.326099] vda: vda1 vda2 vda3 < vda5 > [ 22.452198] PM: Image not found (code -22) [ 24.154099] random: fast init done [ 24.184340] EXT4-fs (vda2): mounted filesystem with ordered data mode. Opts: (null) [ 26.220623] systemd[1]: Inserted module 'autofs4' [ 26.356649] systemd[1]: systemd 241 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid) [ 26.360218] systemd[1]: Detected virtualization qemu. [ 26.363874] systemd[1]: Detected architecture arm. [ 26.441264] systemd[1]: Set hostname to <deb-virt>. [ 30.925919] random: systemd: uninitialized urandom read (16 bytes read) [ 30.978126] random: systemd: uninitialized urandom read (16 bytes read) [ 30.983044] systemd[1]: Started Forward Password Requests to Wall Directory Watch. [ 31.000374] random: systemd: uninitialized urandom read (16 bytes read) [ 31.013527] systemd[1]: Listening on Journal Socket (/dev/log). 
[ 31.024261] systemd[1]: Listening on Journal Socket. [ 31.121713] systemd[1]: Starting Load Kernel Modules... [ 31.203157] systemd[1]: Starting Set the console keyboard layout... [ 31.330925] systemd[1]: Starting Create list of required static device nodes for the current kernel... [ 31.369992] systemd[1]: Listening on Journal Audit Socket. [ 31.394703] systemd[1]: Started Dispatch Password Requests to Console Directory Watch. [ 31.409528] systemd[1]: Reached target Paths. [ 31.474747] systemd[1]: Created slice User and Session Slice. [ 31.494176] systemd[1]: Reached target Slices. [ 31.522608] systemd[1]: Listening on fsck to fsckd communication Socket. [ 31.581500] systemd[1]: Created slice system-getty.slice. [ 31.620261] systemd[1]: Created slice system-serial\x2dgetty.slice. [ 31.661822] systemd[1]: Listening on udev Kernel Socket. [ 31.714700] systemd[1]: Created slice system-systemd\x2dfsck.slice. [ 34.141708] EXT4-fs (vda2): re-mounted. Opts: errors=remount-ro [ 36.957671] systemd[1]: Started Journal Service. [ 38.783574] systemd-journald[157]: Received request to flush runtime journal from PID 1 [ 50.157370] Adding 997372k swap on /dev/vda5. Priority:-2 extents:1 across:997372k FS [ 51.882176] EXT4-fs (vda1): mounting ext2 file system using the ext4 subsystem [ 51.919406] EXT4-fs (vda1): mounted filesystem without journal. 
Opts: (null) [ 54.565974] audit: type=1400 audit(1624286294.780:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=216 comm="apparmor_parser" [ 54.568052] audit: type=1400 audit(1624286294.784:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=216 comm="apparmor_parser" [ 54.568337] audit: type=1400 audit(1624286294.784:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=216 comm="apparmor_parser" [ 54.752918] audit: type=1400 audit(1624286294.968:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=217 comm="apparmor_parser" [ 54.753422] audit: type=1400 audit(1624286294.972:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=217 comm="apparmor_parser" [ 63.633226] random: crng init done [ 63.633552] random: 7 urandom warning(s) missed due to ratelimiting
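Lots of detail in here. Points worth noting include the machine model ('linux,dummy-virt'), the PSCI v0.2 firmware interface, the PL011 serial console (ttyAMA0), the virtio block device (vda), and a failed probe of the generic PCIe host bridge (the ECAM ioremap fails after a vmap allocation error).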
Benchmark
Since this is a machine targeted at providing 32-bit ARM based virtual machines, how well does it perform?
BogoMips
The BogoMips pseudo-benchmark is used by the Linux kernel to calibrate a wait loop. The value obtained at boot is reported by '/proc/cpuinfo', 'lscpu' and 'dmesg' (see above).
Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=250000)
This BogoMips result is derived from a timer rather than using the delay loop calibration, and doesn't tell us anything about the processor performance. So an alternative benchmark is required to gauge performance.
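The figures in the boot messages show how the value falls out of the timer frequency. The kernel reports the architected timer running at 62.50 MHz and "lpj=250000" (loops per jiffy). The sketch below assumes a kernel tick rate (HZ) of 250, which is not stated in the output above but is consistent with it, since 62,500,000 / 250 = 250,000. The function names are hypothetical.

```python
TIMER_HZ = 62_500_000   # "arch_timer: cp15 timer(s) running at 62.50MHz"
KERNEL_HZ = 250         # assumption: kernel tick rate (CONFIG_HZ), not shown in the output


def lpj_from_timer(timer_hz, kernel_hz):
    """Loops per jiffy when the delay loop is driven by a timer."""
    return timer_hz // kernel_hz


def bogomips(lpj, kernel_hz):
    """BogoMIPS as printed by the kernel: lpj / (500000 / HZ)."""
    return lpj * kernel_hz / 500_000


lpj = lpj_from_timer(TIMER_HZ, KERNEL_HZ)
print(lpj)
print(f"{bogomips(lpj, KERNEL_HZ):.2f} BogoMIPS")
```

Under that assumption the result is simply twice the timer frequency in MHz, whatever the speed of the host or the emulated core.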
OpenSSL
The OpenSSL cryptographic library ships with a command-line tool ('openssl') that exposes the library's methods, and one of its sub-commands ('openssl speed') provides a speed test. Since I'm mostly interested in older systems I'm going to focus on the common RSA and MD5 methods.
$ openssl speed md5
Doing md5 for 3s on 16 size blocks: 1353924 md5's in 2.99s
Doing md5 for 3s on 64 size blocks: 1168093 md5's in 3.00s
Doing md5 for 3s on 256 size blocks: 794642 md5's in 3.00s
Doing md5 for 3s on 1024 size blocks: 356521 md5's in 2.99s
Doing md5 for 3s on 8192 size blocks: 59039 md5's in 3.00s
Doing md5 for 3s on 16384 size blocks: 29876 md5's in 3.00s
OpenSSL 1.1.1d 10 Sep 2019
built on: Mon Mar 22 23:08:47 2021 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-pp1hfQ/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type                  16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
md5                   7245.08k    24919.32k    67809.45k   122099.50k   161215.83k   163162.79k
$ openssl speed rsa
Doing 512 bits private rsa's for 10s: 7005 512 bits private RSA's in 10.00s
Doing 512 bits public rsa's for 10s: 64954 512 bits public RSA's in 9.99s
Doing 1024 bits private rsa's for 10s: 1443 1024 bits private RSA's in 10.00s
Doing 1024 bits public rsa's for 10s: 31001 1024 bits public RSA's in 9.99s
Doing 2048 bits private rsa's for 10s: 289 2048 bits private RSA's in 10.02s
Doing 2048 bits public rsa's for 10s: 11082 2048 bits public RSA's in 10.00s
Doing 3072 bits private rsa's for 10s: 103 3072 bits private RSA's in 10.02s
Doing 3072 bits public rsa's for 10s: 5493 3072 bits public RSA's in 10.00s
Doing 4096 bits private rsa's for 10s: 49 4096 bits private RSA's in 10.04s
Doing 4096 bits public rsa's for 10s: 3232 4096 bits public RSA's in 9.99s
Doing 7680 bits private rsa's for 10s: 9 7680 bits private RSA's in 10.64s
Doing 7680 bits public rsa's for 10s: 980 7680 bits public RSA's in 10.01s
Doing 15360 bits private rsa's for 10s: 2 15360 bits private RSA's in 17.50s
Doing 15360 bits public rsa's for 10s: 254 15360 bits public RSA's in 10.03s
OpenSSL 1.1.1d 10 Sep 2019
built on: Mon Mar 22 23:08:47 2021 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-pp1hfQ/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
                  sign    verify    sign/s verify/s
rsa   512 bits 0.001428s 0.000154s    700.5   6501.9
rsa  1024 bits 0.006930s 0.000322s    144.3   3103.2
rsa  2048 bits 0.034671s 0.000902s     28.8   1108.2
rsa  3072 bits 0.097282s 0.001820s     10.3    549.3
rsa  4096 bits 0.204898s 0.003091s      4.9    323.5
rsa  7680 bits 1.182222s 0.010214s      0.8     97.9
rsa 15360 bits 8.750000s 0.039488s      0.1     25.3
Extracting the relevant figures for comparisons (see OpenSSL Speed Results):
- OpenSSL speed MD5 8,192 bytes: 161,215.83k
- OpenSSL speed RSA 4,096 bits sign/s: 4.9
- OpenSSL speed RSA 4,096 bits verify/s: 323.5
What do these results tell us about the performance of the emulated system?
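These summary figures follow directly from the raw counts in the 'openssl speed' output: the MD5 figure is the number of blocks processed, times the block size, divided by the elapsed time (in units of 1000 bytes per second), and the RSA figures are operations divided by elapsed time. A short sketch of the arithmetic, with hypothetical function names:

```python
def throughput_k(count, block_bytes, seconds):
    """Throughput in 1000s of bytes per second, as reported by 'openssl speed'."""
    return count * block_bytes / seconds / 1000


def ops_per_second(count, seconds):
    """Operations per second for the RSA sign and verify tests."""
    return count / seconds


# Raw counts taken from the output above.
md5_8192 = throughput_k(59039, 8192, 3.00)    # 59039 md5's in 3.00s
rsa4096_sign = ops_per_second(49, 10.04)      # 49 private RSA's in 10.04s
rsa4096_verify = ops_per_second(3232, 9.99)   # 3232 public RSA's in 9.99s

print(f"md5 8192 bytes:    {md5_8192:.2f}k")
print(f"rsa 4096 sign/s:   {rsa4096_sign:.1f}")
print(f"rsa 4096 verify/s: {rsa4096_verify:.1f}")
```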
Thoughts
The use of paravirtualized devices mostly shows performance improvement for I/O, which makes a significant difference when building software. That makes the use of the 'virt' machine attractive for handling software compiles or other I/O intensive operations. There is also additional flexibility since the emulated machine is not constrained by the limitations of the physical hardware, making the use of large memory and SMP machines an option for development and testing.
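As an illustration of that flexibility, the memory size and CPU count are just the '-m' and '-smp' options on the QEMU command line. The sketch below builds the start of such a command with both made configurable. The helper name and its defaults are hypothetical, and the drive, network, kernel and initrd options from the emulation command earlier would still need to be appended.

```python
import shlex


def qemu_virt_args(memory_mib=1024, cpus=1, extra=()):
    """Build the leading arguments for a 32-bit 'virt' machine.

    'extra' takes any further options (drives, network, kernel and so on).
    """
    if memory_mib < 1 or cpus < 1:
        raise ValueError("memory_mib and cpus must be positive")
    args = [
        "qemu-system-arm",
        "-machine", "virt",
        "-m", f"{memory_mib}M",
        "-smp", str(cpus),
    ]
    args.extend(extra)
    return args


# Example: a 4-CPU guest with 2 GiB of RAM.
print(shlex.join(qemu_virt_args(memory_mib=2048, cpus=4)))
```

How much of this the guest can use depends on the kernel. The boot messages above show this Debian kernel was built with NR_CPUS=8, and the LPAE variant is the one intended for larger memory configurations.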
Further Sources
- ARM architecture - Wikipedia
- Documentation/Platforms/ARM - QEMU
- Installing Debian on QEMU’s 32-bit ARM “virt” board | translatedcode
- Debian -- ARM Ports
- ARM Linux - What is it?
Supplemental: OpenSSL 1.1.1d Results
Debian Linux 10 (buster) for ARM ('armhf') provides a build of OpenSSL 1.1.1d:
$ openssl version -a
OpenSSL 1.1.1d 10 Sep 2019
built on: Mon Mar 22 23:08:47 2021 UTC
platform: debian-armhf
options: bn(64,32) rc4(char) des(long) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-pp1hfQ/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
OPENSSLDIR: "/usr/lib/ssl"
ENGINESDIR: "/usr/lib/arm-linux-gnueabihf/engines-1.1"
Seeding source: os-specific
From the compile flags this build has been compiled with assembler implementations for various methods (the '-D*_ASM' flags).
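To see at a glance which assembler-related defines are present, the compiler line can be filtered. A small sketch in Python, using the '-D' flags copied from the output above (shortened to the relevant part); the function name `asm_defines` is only illustrative:

```python
import re

# -D flags copied from the 'compiler:' line printed by 'openssl version -a'.
COMPILER_FLAGS = (
    "-DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ "
    "-DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM "
    "-DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM "
    "-DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG"
)


def asm_defines(flags: str) -> list[str]:
    """Return the macro names of all -D flags whose name contains '_ASM'."""
    return [name for name in re.findall(r"-D(\w+)", flags) if "_ASM" in name]


asm = asm_defines(COMPILER_FLAGS)
print(len(asm), "assembler-related defines:")
for name in asm:
    print(" ", name)
```

Matching on '_ASM' anywhere in the name, rather than only at the end, also picks up the two bignum defines (OPENSSL_BN_ASM_MONT and OPENSSL_BN_ASM_GF2m).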
OpenSSL 1.1.1d speed
For reference a full run of the methods provided by OpenSSL on this QEMU system gives results (openssl speed):
OpenSSL 1.1.1d 10 Sep 2019
built on: Mon Mar 22 23:08:47 2021 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-pp1hfQ/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
md2 0.00 0.00 0.00 0.00 0.00 0.00
mdc2 0.00 0.00 0.00 0.00 0.00 0.00
md4 2022.36k 7899.95k 26851.93k 69816.32k 130902.70k 139597.14k
md5 7200.52k 25794.99k 68818.52k 123324.76k 163203.75k 163561.47k
hmac(md5) 2046.50k 7701.38k 26572.20k 72404.99k 144195.64k 153032.02k
sha1 5815.29k 16865.98k 38440.36k 62111.74k 74887.40k 74825.73k
rmd160 1582.09k 5660.25k 15072.04k 25933.85k 33785.03k 34295.70k
rc4 27083.19k 33698.10k 38254.99k 39255.50k 39921.62k 40099.70k
des cbc 6977.67k 7657.51k 7798.67k 7900.84k 7912.54k 7946.24k
des ede3 2524.98k 2712.38k 2740.74k 2755.21k 2766.17k 2681.51k
idea cbc 0.00 0.00 0.00 0.00 0.00 0.00
seed cbc 11746.65k 13988.74k 14682.45k 14954.15k 14647.30k 14734.68k
rc2 cbc 11376.46k 13449.37k 14163.31k 14397.78k 14327.81k 14394.91k
rc5-32/12 cbc 0.00 0.00 0.00 0.00 0.00 0.00
blowfish cbc 13669.49k 16554.71k 17353.05k 17363.29k 17679.93k 18219.67k
cast cbc 11018.32k 13716.52k 14944.85k 15305.39k 15305.39k 15400.96k
aes-128 cbc 14755.46k 18263.47k 19698.39k 20151.30k 20335.27k 20310.70k
aes-192 cbc 13592.21k 16617.82k 17430.35k 17790.29k 17866.23k 17934.73k
aes-256 cbc 12605.05k 14901.25k 15636.82k 15807.83k 15941.63k 15521.11k
camellia-128 cbc 12887.26k 15468.54k 16288.68k 16565.25k 16652.50k 16685.38k
camellia-192 cbc 10960.87k 12776.04k 13388.71k 13552.07k 13567.49k 13576.87k
camellia-256 cbc 10920.62k 12753.73k 13370.99k 13537.69k 13592.14k 13578.45k
sha256 4642.14k 13129.05k 28556.97k 40522.41k 46093.65k 46383.10k
sha512 3119.22k 12307.09k 23328.09k 37438.81k 45405.53k 46333.95k
whirlpool 486.99k 1011.37k 1632.77k 1944.58k 2069.85k 2068.41k
aes-128 ige 13847.30k 17011.39k 18064.95k 18294.44k 18330.97k 18317.31k
aes-192 ige 12449.25k 15320.59k 15992.66k 16213.33k 16121.86k 16231.08k
aes-256 ige 11396.25k 13600.58k 14330.20k 14554.11k 14472.53k 14543.53k
ghash 14472.61k 16817.75k 17603.13k 18055.85k 17853.10k 18093.40k
rand 249.23k 899.76k 2912.81k 6291.71k 9618.63k 9731.44k

sign verify sign/s verify/s
rsa 512 bits 0.001416s 0.000144s 706.1 6924.6
rsa 1024 bits 0.006638s 0.000318s 150.7 3146.5
rsa 2048 bits 0.034792s 0.000909s 28.7 1100.3
rsa 3072 bits 0.096990s 0.001887s 10.3 529.9
rsa 4096 bits 0.216170s 0.003241s 4.6 308.5
rsa 7680 bits 1.236667s 0.010707s 0.8 93.4
rsa 15360 bits 9.095000s 0.041577s 0.1 24.1

sign verify sign/s verify/s
dsa 512 bits 0.003025s 0.002209s 330.6 452.8
dsa 1024 bits 0.005071s 0.004170s 197.2 239.8
dsa 2048 bits 0.012364s 0.011050s 80.9 90.5

sign verify sign/s verify/s
160 bits ecdsa (secp160r1) 0.0112s 0.0100s 89.1 100.0
192 bits ecdsa (nistp192) 0.0149s 0.0124s 67.1 80.6
224 bits ecdsa (nistp224) 0.0194s 0.0162s 51.6 61.8
256 bits ecdsa (nistp256) 0.0015s 0.0041s 675.1 244.0
384 bits ecdsa (nistp384) 0.0567s 0.0426s 17.6 23.5
521 bits ecdsa (nistp521) 0.1235s 0.0878s 8.1 11.4
163 bits ecdsa (nistk163) 0.0101s 0.0200s 98.8 50.1
233 bits ecdsa (nistk233) 0.0175s 0.0352s 57.3 28.4
283 bits ecdsa (nistk283) 0.0304s 0.0600s 32.9 16.7
409 bits ecdsa (nistk409) 0.0651s 0.1283s 15.4 7.8
571 bits ecdsa (nistk571) 0.1434s 0.2844s 7.0 3.5
163 bits ecdsa (nistb163) 0.0107s 0.0210s 93.5 47.5
233 bits ecdsa (nistb233) 0.0189s 0.0378s 52.9 26.4
283 bits ecdsa (nistb283) 0.0329s 0.0649s 30.4 15.4
409 bits ecdsa (nistb409) 0.0730s 0.1446s 13.7 6.9
571 bits ecdsa (nistb571) 0.1627s 0.3219s 6.1 3.1
256 bits ecdsa (brainpoolP256r1) 0.0184s 0.0175s 54.3 57.1
256 bits ecdsa (brainpoolP256t1) 0.0184s 0.0164s 54.2 61.0
384 bits ecdsa (brainpoolP384r1) 0.0574s 0.0455s 17.4 22.0
384 bits ecdsa (brainpoolP384t1) 0.0567s 0.0429s 17.6 23.3
512 bits ecdsa (brainpoolP512r1) 0.0805s 0.0649s 12.4 15.4
512 bits ecdsa (brainpoolP512t1) 0.0806s 0.0585s 12.4 17.1

op op/s
160 bits ecdh (secp160r1) 0.0110s 91.0
192 bits ecdh (nistp192) 0.0143s 69.9
224 bits ecdh (nistp224) 0.0185s 54.0
256 bits ecdh (nistp256) 0.0029s 342.5
384 bits ecdh (nistp384) 0.0542s 18.4
521 bits ecdh (nistp521) 0.1187s 8.4
163 bits ecdh (nistk163) 0.0094s 106.0
233 bits ecdh (nistk233) 0.0170s 58.8
283 bits ecdh (nistk283) 0.0294s 34.0
409 bits ecdh (nistk409) 0.0635s 15.8
571 bits ecdh (nistk571) 0.1387s 7.2
163 bits ecdh (nistb163) 0.0101s 99.3
233 bits ecdh (nistb233) 0.0184s 54.5
283 bits ecdh (nistb283) 0.0315s 31.8
409 bits ecdh (nistb409) 0.0709s 14.1
571 bits ecdh (nistb571) 0.1608s 6.2
256 bits ecdh (brainpoolP256r1) 0.0180s 55.5
256 bits ecdh (brainpoolP256t1) 0.0180s 55.4
384 bits ecdh (brainpoolP384r1) 0.0558s 17.9
384 bits ecdh (brainpoolP384t1) 0.0548s 18.2
512 bits ecdh (brainpoolP512r1) 0.0776s 12.9
512 bits ecdh (brainpoolP512t1) 0.0769s 13.0
253 bits ecdh (X25519) 0.0049s 204.1
448 bits ecdh (X448) 0.0171s 58.4

sign verify sign/s verify/s
253 bits EdDSA (Ed25519) 0.0018s 0.0056s 545.5 177.1
456 bits EdDSA (Ed448) 0.0079s 0.0183s 126.8 54.5
This version of OpenSSL supports accessing the Linux kernel cryptography implementations (see Linux Kernel Crypto API) via the 'afalg' engine:
$ openssl engine afalg -c
(afalg) AFALG engine support
[AES-128-CBC, AES-192-CBC, AES-256-CBC]
The engine only supports AES-CBC methods. First, a baseline run with the engine specified on the command line:
$ openssl speed -engine afalg aes-256-cbc
engine "afalg" set.
Doing aes-256 cbc for 3s on 16 size blocks: 2231421 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 640874 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 256 size blocks: 176351 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 45237 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 5053 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 16384 size blocks: 2670 aes-256 cbc's in 3.00s
OpenSSL 1.1.1d 10 Sep 2019
built on: Mon Mar 22 23:08:47 2021 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-pp1hfQ/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256 cbc 11900.91k 13717.70k 15048.62k 15440.90k 13798.06k 14581.76k
In this version of OpenSSL the engine specific method is only called when the '-evp' option is used:
$ openssl speed -engine afalg -evp aes-256-cbc
engine "afalg" set.
Doing aes-256-cbc for 3s on 16 size blocks: 8234 aes-256-cbc's in 0.34s
Doing aes-256-cbc for 3s on 64 size blocks: 9833 aes-256-cbc's in 0.34s
Doing aes-256-cbc for 3s on 256 size blocks: 9103 aes-256-cbc's in 0.36s
Doing aes-256-cbc for 3s on 1024 size blocks: 7141 aes-256-cbc's in 0.26s
Doing aes-256-cbc for 3s on 8192 size blocks: 2147 aes-256-cbc's in 0.08s
Doing aes-256-cbc for 3s on 16384 size blocks: 1171 aes-256-cbc's in 0.04s
OpenSSL 1.1.1d 10 Sep 2019
built on: Mon Mar 22 23:08:47 2021 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -Wa,--noexecstack -g -O2 -fdebug-prefix-map=/build/openssl-pp1hfQ/openssl-1.1.1d=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 387.48k 1850.92k 6473.24k 28124.55k 219852.80k 479641.60k
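Here the reported throughput rises from 13.8 MB/s to 219.9 MB/s for 8 KB blocks. These figures should be read with care: the '-evp' run records only 0.08s against a 3s interval, because 'openssl speed' measures user CPU time by default and time spent in the kernel is not counted. Running with the '-elapsed' option reports wall-clock throughput instead.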
The kernel method implementation can use hardware acceleration (if available) and processor specific features, which often significantly improves performance. Typically this manifests at larger block sizes, with the smaller block sizes seeing poor performance when the kernel methods are used.
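Each throughput figure is the number of blocks processed, multiplied by the block size and divided by the measured time. The sketch below redoes that arithmetic in Python for the 8,192-byte rows of the two aes-256-cbc runs, with counts and times copied from the output above. The last value is a hypothetical: it assumes the '-evp' run lasted about the nominal 3 s of wall-clock time, which the output itself does not confirm. The function name `throughput_k` is made up for illustration.

```python
def throughput_k(blocks: int, block_size: int, seconds: float) -> float:
    """Throughput in 1000s of bytes per second, the unit 'openssl speed' prints."""
    return blocks * block_size / seconds / 1000


BLOCK = 8192

# Counts and times copied from the two aes-256-cbc runs above.
baseline = throughput_k(5053, BLOCK, 3.00)  # without -evp
evp_cpu = throughput_k(2147, BLOCK, 0.08)   # with -evp, using the measured 0.08s

# Assumption: the -evp run took roughly the nominal 3 s of wall-clock time.
evp_nominal = throughput_k(2147, BLOCK, 3.00)

print(f"baseline:              {baseline:10.2f}k")
print(f"-evp (measured time):  {evp_cpu:10.2f}k")
print(f"-evp (nominal 3 s):    {evp_nominal:10.2f}k")
```

The first two values match the table rows in the output. The third is only an estimate under the stated assumption; a run with '-elapsed' would be needed to measure it properly.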