
As a form of practice, and as I'll be needing SHA1 for potential future work, I've decided to create my own implementations across my three main languages. The second implementation I'll be showcasing is written in Ada, with a ninety-one line package specification and a one hundred and eighty-two line package body. This is the largest Ada program I've written so far and was good practice.

This program is licensed under the GNU Affero General Public License version three.

Firstly, I will explain the preferred method of using this package. It is expected that the package will perhaps only be WITHed and not USEd, and the naming convention was chosen to support such usage. It is clearest to use SHA1.Hash as opposed to simply Hash, and this is preferable to embedding SHA1 in the name of the subprogram; even if the package is USEd, it's advisable to continue to refer to at least SHA1.Hash in this way.

Secondly, the types defined will be explained. The most important types of this package are Status, Digest, Word, Word_Array, and Word_Block; these correspond to the intermediate SHA1 state, the final result, the basic unit of the algorithm, an array thereof, and an array subtype corresponding to the block in terms of which the algorithm is defined. There are also Bit, Octet, and two array types of each, and those are only used as return types of conversion functions. The number types are modular, and I've considered using the Interfaces.Unsigned_n types in their place. I've also considered a subtype of Integer which could be used to limit message sizes to the 2**64 limit, but this seems in practice unreasonable and unnecessary, and so the subprograms will merely not account for message sizes greater than this. The Status and Digest types are internally the same; I grew fond of the notion of differing types, which I believe reduces the likelihood of some errors, and so Status is used for intermediate calculation, with Digest being used as a final result and unable to be converted back.
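As a rough illustration of the Status and Digest split and of the conversion functions, here is a minimal sketch. It is written in Python rather than Ada purely so that it is easy to run, and it is not the package's code: the lower-case names, the use of NewType, and the lower-case hexadecimal output are all assumptions made for the example. Only the five initial words are taken from the SHA1 standard.

```python
from typing import List, NewType

# Two distinct names for the same underlying shape (five 32-bit words).
# Assumption: this only imitates the Status/Digest distinction; Python's
# NewType is checked by static type checkers, not at run time.
Status = NewType("Status", tuple)
Digest = NewType("Digest", tuple)

# The standard SHA1 initial state, playing the role of Initial_Status.
INITIAL_STATUS = Status(
    (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
)


def to_digest(state: Status) -> Digest:
    """One-way conversion; no function from Digest back to Status is offered."""
    return Digest(tuple(state))


def to_octets(datum: Digest) -> bytes:
    """Twenty octets, each word written most significant octet first."""
    return b"".join(word.to_bytes(4, "big") for word in datum)


def to_bits(datum: Digest) -> List[int]:
    """One hundred and sixty bits, most significant bit of each word first."""
    return [(word >> (31 - i)) & 1 for word in datum for i in range(32)]


def to_string(datum: Digest) -> str:
    """Forty hexadecimal digits (lower case is an assumption of this sketch)."""
    return "".join(f"{word:08x}" for word in datum)
```

A static type checker would reject passing a Digest where a Status is expected, which is a weaker guarantee than the one the Ada type system gives at compile time.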

Thirdly, the subprograms remain. The primary subprogram of the library is Hash, which is given several overloadings; I dislike explicit initialization where it's avoidable, and so this library has been designed to provide options. The rightfully simplest overloading of Hash accepts a Word_Array and returns a Digest; it's intended to be used in every instance where the data is in-memory and has a bit-length which is a multiple of thirty-two, with that being the only caveat. There are two others, to be used where the data to hash is not all in-memory or is not all available at once; the choice of which to use relates to the constant Initial_Status, which is a pleasant interface to the beginning state of the algorithm. If the first hash is distinguished from the others, then the function variety may be appropriate, which returns a new Status to use further; when all iterations may be treated identically, or for other reasons, the procedure should perhaps be used, which modifies its current Status parameter. No variety of Hash modifies its data argument.
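The three varieties of Hash described above can be sketched as follows. This is again Python rather than Ada, and it is a sketch of the standard SHA1 algorithm arranged in the same three shapes, not a transcription of the package: every name here (hash_block, hash_block_in_place, hash_words, and the helpers) is hypothetical, and hashlib appears only as a trusted implementation to compare against.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
INITIAL_STATUS = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotate_left(word, count):
    return ((word << count) | (word >> (32 - count))) & MASK


def hash_block(state, data):
    """Function variety: takes a state and one 16-word block, returns a new state."""
    w = list(data)  # copy, so the data argument is never modified
    for t in range(16, 80):
        w.append(rotate_left(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rotate_left(a, 5) + f + e + k + w[t]) & MASK
        a, b, c, d, e = temp, a, rotate_left(b, 30), c, d
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))


def hash_block_in_place(state, data):
    """Procedure variety: updates a mutable (list) state in place."""
    state[:] = hash_block(tuple(state), data)


def hash_words(data):
    """Simplest variety: whole message in memory, bit-length a multiple of 32."""
    bit_length = 32 * len(data)
    padded = list(data) + [0x80000000]       # the single 1 bit, then zeros
    while len(padded) % 16 != 14:
        padded.append(0)
    padded += [(bit_length >> 32) & MASK, bit_length & MASK]
    state = INITIAL_STATUS
    for i in range(0, len(padded), 16):
        state = hash_block(state, padded[i:i + 16])
    return state


def to_string(datum):
    return "".join(f"{word:08x}" for word in datum)


def words_from_bytes(message):
    """Helper for the comparison: big-endian 32-bit words from whole octets."""
    return struct.unpack(">%dI" % (len(message) // 4), message)


def reference_string(message):
    """Trusted implementation to compare against."""
    return hashlib.sha1(message).hexdigest()
```

For any message whose length is a multiple of four octets, to_string(hash_words(words_from_bytes(message))) can be compared directly with reference_string(message).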

The Pad procedure is exposed solely to support hashing with those latter two overloadings, or for such purposes as hashing messages with different bit-lengths. The best and least error-prone mechanism for this requires two Word_Blocks to be passed in, along with the bit-length and a Boolean; that Boolean will be true where padding has resulted in overflow, meaning the Auxiliary block must also be hashed to obtain the correct Digest. It would be possible to design another procedure which didn't require an extra block, but this would necessarily be more error-prone; the cost of the extra block isn't that great, and so this is the mechanism decided upon. The remaining functions exist to convert from Status to Digest and from Digest to a variety of final representations, and I expect To_String to be the most common of these.
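The two-block padding scheme can be sketched as below, once more in Python for ease of running and not as the package's Ada. The sketch makes several assumptions: that the bit-length given is that of the whole message, that the block holds the final partial block with its unused bits already cleared, and that the overflow flag is returned rather than passed as a parameter. The name block is hypothetical.

```python
MASK = 0xFFFFFFFF


def pad(block, auxiliary, bit_length):
    """Pad the final block of a message of bit_length bits, in place.

    block and auxiliary are lists of sixteen 32-bit words. Returns True when
    the length field did not fit in block, in which case auxiliary must be
    hashed after block to obtain the correct result.
    """
    auxiliary[:] = [0] * 16
    used = bit_length % 512                  # message bits present in block
    word, bit = divmod(used, 32)
    block[word] |= 0x80000000 >> bit         # the single 1 bit after the message
    overflow = used >= 448                   # no room left for the 64-bit length
    target = auxiliary if overflow else block
    target[14] = (bit_length >> 32) & MASK   # high word of the length
    target[15] = bit_length & MASK           # low word of the length
    return overflow
```

With a 24-bit message only the first block is needed; with a 448-bit message the 1 bit lands in the fifteenth word, the length moves to the auxiliary block, and the function returns True.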

The naming scheme is entirely consistent, with the exception of Pad, for clarity. All references to a Word_Array or Word_Block are named Data, to a Status are named State, and to a Digest are named Datum. Pad uses alternate names for its Word_Block parameters to ease comprehension.

In programming this, I sought an idealized library. This is not the idealized SHA1 library for any purpose; however, I can believe it is an idealized library for its approach. I considered a Pad procedure which would check the remainder of the Word_Block it was padding and raise an error if the block wasn't cleared, as a reliability check, but this would've been cumbersome to use properly and would also have introduced the lone exception of the library, and so the idea was discarded. An alternate form of the Hash function resulting in a Digest could exist, but has the possibility of a bit-length error when it varies wildly and so was discarded; a form could exist which merely specifies how many of the last thirty-two bits to disregard, however. This library deals in words due to the repetition of code otherwise involved and the desire to remain rather simple; it also meshes nicely with the other implementations I'm writing, which will ultimately all use different bit units as the integral base. Those conversion functions could return definite subtypes of their results, but this isn't done now.

In sum, the type system won't be able to prevent flaws involving message lengths greater than 2**64 or misuse of the Bit_Length parameter of the Pad procedure, and so care should be taken when interacting with these concerns of the library.

Know I've carefully reviewed this library, but have not yet exhaustively tested it. I still need to provide an example program for the purpose of easily comparing this with a SHA1 implementation known to be correct or which is otherwise trusted.

Here is the package specification, the package body, and the documentation. Here is the license.