The count we passed to memcmp() in mpi_eq() and mpi_eq_abs() was actually the number of significant words in the MPI, rather than the number of bytes we wanted to compare. Multiply by 4 to get the correct value.
To make the intent of the code more apparent, introduce a private MPI_MSW() macro which evaluates to the number of significant words (or 1-based index of the most significant word). This also comes in handy in mpi_{add,sub,mul}_abs().
Add a couple of test cases which not only demonstrate the bug we fixed here but also demonstrate why we must compare whole words: on a big-endian machine, we would be comparing the unused upper bytes of the first and only word instead of the lower bytes which actually hold a value...
In my eagerness to eliminate a branch which is taken once per 2^38 bytes of keystream, I forgot that the state words are in host order. Thus, the counter increment code worked fine on little-endian machines, but not on big-endian ones. Switch to a simpler (branchful) solution.
The current version invokes undefined behavior when the count is negative, zero, or equal to or greater than the width of the operand. The new version masks the count to avoid these situations. Although branchless, it is relatively inefficient if the compiler does not recognize it and translate it to a rol or ror instruction. Empirical tests show that both clang and gcc get it right for constant counts, and recent versions of clang (but not gcc) get it right for variable counts as well. Note that our current code base has no instances of rolN / rorN with a variable count.
We failed to clear the negative flag when handling trivial cases, so if one of the terms was 0 and the other was negative, the result would be an exact copy of the non-zero term instead of its absolute value.
- Use the new vector byte-order conversion functions where appropriate.
- Use memset_s() instead of memset() where appropriate.
- Use consistent names and types for function arguments.
- Reindent, rename and reorganize to conform to Cryb style and idiom.
SHA224 and SHA256 were left mostly unchanged. MD2 and MD4 were completely rewritten as the previous versions (taken from XySSL) seem to have been copied from RSAREF.
This breaks the ABI as some context structures have grown or shrunk and some function arguments have been changed from int to size_t.