TurboSFV - Blog
TurboSFV
2024-12-07 12:13:18
TurboSFV v10.30 - SHA-1 with SSE support (SHA1RNDS4)
Notes on TurboSFV v10.30:
For the calculation of SHA-1 checksums, TurboSFV now offers a computation based on SSE instructions (Streaming SIMD Extensions). Modern CPUs (starting in 2016) implement special instructions for a faster computation, the Intel SHA extensions.
Looking at the SHA-1 algorithm, the input message is divided into blocks of 64 bytes, and each block is consumed in 80 computation rounds. Each round consists of a few basic CPU instructions, but taken together they are quite calculation-intensive. The output produced by these rounds is kept in five state variables, each 32 bits wide. Together with the next block, they serve as the input for the next set of rounds, until no further block is available. At the end, the 160-bit SHA-1 hash value is formed from the five state variables.
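The block structure described above can be sketched in plain Python. This is a minimal illustration for reading purposes, not TurboSFV's implementation; the function name sha1_sketch is an assumption, and the result is cross-checked against Python's hashlib.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def sha1_sketch(message: bytes) -> bytes:
    # The five 32-bit state variables, initialised as the SHA-1 standard defines.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

    # Padding: a 0x80 byte, zeros up to 56 mod 64, then the bit length (64-bit).
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)

    for offset in range(0, len(padded), 64):  # one 64-byte block at a time
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for t in range(16, 80):  # expand 16 input words to 80 round words
            w.append(rol(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))

        a, b, c, d, e = h
        for t in range(80):  # the 80 computation rounds
            if t < 20:
                f, k = d ^ (b & (c ^ d)), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            a, b, c, d, e = (rol(a, 5) + f + e + k + w[t]) & MASK, a, rol(b, 30), c, d

        # The round output is added to the state and carried into the next block.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]

    return struct.pack(">5I", *h)  # 5 x 32 bits = 160-bit hash value


# Cross-check against the reference implementation.
assert sha1_sketch(b"abc") == hashlib.sha1(b"abc").digest()
```

The sketch holds the whole message in memory for simplicity; a file checksum tool would read and process the input in chunks instead.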
The SSE-based SHA-1 extensions help to consume the input (SHA1MSG1 and SHA1MSG2) and to accelerate the computation of the output. As these instructions use XMM registers, each 128 bits wide, they can work with four of the five state variables at a time. In addition, one instruction (SHA1RNDS4) is designed so that four rounds are processed at once. Another instruction (SHA1NEXTE) supports the calculation of the fifth state variable. All of this results in an overall calculation that can be around five times faster.
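To make the grouping of four rounds concrete, here is a small software model in Python. It only mirrors the idea that the 80 rounds of a block fall into 20 steps of four rounds, with a selector from 0 to 3 choosing the round function and constant (SHA1RNDS4 takes such a selector as an immediate operand). The names four_rounds and compress_grouped are assumptions, and the real instruction keeps its operands in XMM registers in a different layout.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF


def rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def round_function(phase, b, c, d):
    """Return (f, K) for one of the four 20-round phases of SHA-1."""
    if phase == 0:
        return d ^ (b & (c ^ d)), 0x5A827999
    if phase == 2:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, (0x6ED9EBA1 if phase == 1 else 0xCA62C1D6)


def four_rounds(state, w4, phase):
    """Software stand-in for one four-round step (not the real register layout)."""
    a, b, c, d, e = state
    for w in w4:
        f, k = round_function(phase, b, c, d)
        a, b, c, d, e = (rol(a, 5) + f + e + k + w) & MASK, a, rol(b, 30), c, d
    return a, b, c, d, e


def compress_grouped(state, block):
    """Process one 64-byte block as 20 steps of four rounds each."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rol(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    s, steps = tuple(state), 0
    for group in range(20):  # 20 x 4 = 80 rounds
        s = four_rounds(s, w[4 * group:4 * group + 4], group // 5)
        steps += 1
    return tuple((x + y) & MASK for x, y in zip(state, s)), steps


# Single-block example: "abc" padded by hand to 64 bytes (bit length 24).
INITIAL = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
block = b"abc" + b"\x80" + b"\x00" * 52 + struct.pack(">Q", 24)
state, steps = compress_grouped(INITIAL, block)
digest = struct.pack(">5I", *state)
print(steps, digest == hashlib.sha1(b"abc").digest())
```

The model gains no speed in Python, of course; it only shows that grouping the rounds leaves the result unchanged, which is why a hardware step covering four rounds can replace four separate round computations.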
As a precondition, the CPU in use must support these SSE-based instructions. TurboSFV checks the CPU against a list of required CPU features before offering the fast, SSE-based calculation mode. This check includes a test for some basic SSE instructions that are used for operations with XMM registers and were introduced as SSE enhancements over time, up to SSE4.1. If any of the required features isn't offered by the processor, TurboSFV uses the x86 / x86-64 (x64) legacy instructions. Otherwise, the SSE calculation mode can optionally be used.
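The fallback logic can be sketched as follows. This is only an illustration of the decision, not TurboSFV's actual check: the exact feature list is an assumption (the text only says "up to SSE4.1" plus the SHA extensions), the flag names follow the spelling Linux uses in /proc/cpuinfo, and read_cpu_flags and select_mode are hypothetical helper names.

```python
# Assumed feature list, spelled the way Linux reports flags in /proc/cpuinfo.
REQUIRED_FLAGS = {"sse2", "ssse3", "sse4_1", "sha_ni"}


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU flags as a set, or an empty set if they cannot be read."""
    try:
        with open(path, encoding="ascii", errors="replace") as fh:
            for line in fh:
                if line.startswith("flags") and ":" in line:
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or file not readable: treat as "no features known"
    return set()


def select_mode(cpu_flags, prefer_fast=True):
    """Pick 'sse' only if every required feature is present and the user wants it."""
    missing = REQUIRED_FLAGS - set(cpu_flags)
    if missing or not prefer_fast:
        return "legacy"
    return "sse"


print(select_mode(read_cpu_flags()))
```

A single missing feature is enough to fall back to the legacy path, and the prefer_fast switch models the fact that the fast mode stays optional even on capable CPUs.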
On the other hand, the storage medium must be able to deliver the message input as fast as needed: even an SSE-based SHA-1 calculation can't be faster than the rate at which the input bytes arrive.
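As a rough model of this limit, the overall rate is bounded by the slower of the two stages. The sketch below assumes that reading and hashing overlap perfectly; the function names and all figures in the usage lines are hypothetical values chosen for illustration, not measurements.

```python
def effective_rate(storage_mb_s, hash_mb_s):
    """Overall throughput is limited by the slower stage (idealised model)."""
    return min(storage_mb_s, hash_mb_s)


def speedup_seen_by_user(storage_mb_s, legacy_mb_s, fast_mb_s):
    """Ratio of overall throughput in fast mode to overall throughput in legacy mode."""
    return effective_rate(storage_mb_s, fast_mb_s) / effective_rate(storage_mb_s, legacy_mb_s)


# Hypothetical figures: the fast mode hashes five times faster than the legacy mode.
print(speedup_seen_by_user(storage_mb_s=500, legacy_mb_s=400, fast_mb_s=2000))
print(speedup_seen_by_user(storage_mb_s=5000, legacy_mb_s=400, fast_mb_s=2000))
```

With the slower storage figure, most of the gain is hidden behind the medium; only when the storage outpaces both modes does the full factor show up.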
A final hint: Both modes, the new fast SSE-based calculation mode and the one based on legacy CPU instructions, produce the same hash values (at least here); anything else would be a disaster. However, TurboSFV still offers both modes on modern CPUs, so the user can experience the speed difference.
Feel free to add a comment regarding this new version here.
TurboSFV Cologne, Germany