24 Feb 2014
Right when the whole industry appears to respond to the NSA spying by reinforcing their encryption defense, we learn about a bug in Apple's TLS implementation. There are many comments on the web about the genesis of this bug, such as for example this one from "Imperial Violet," which provides details about the coding mistake that caused the error. I like this bug, because among other things it exposes the need for good coding guidelines.
Of course, bugs happen. Having worked for some time on Windows, I know that. Critical procedures like the negotiation of TLS encryption keys ought to be tested. For me, one of the basic rule of system programming is, "if it was not tested, it probably does not work." In that case, good engineering requires a Unit Test that checks the various error conditions that the procedure is supposed to detect. One wonders why the Apple processes do not call for that. But this bug was particularly silly, a duplicated line in the source code that caused a critical test to be ignored. According to the Imperial Violet web page, which quotes Apple's published source code, the offending source code looks like that:
static OSStatus
SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
uint8_t *signature, UInt16 signatureLen)
{
OSStatus err;
...
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
goto fail;
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
goto fail;
goto fail;
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
goto fail;
...
fail:
SSLFreeBuffer(&signedHashes);
SSLFreeBuffer(&hashCtx);
return err;
}
You can see the duplicated "goto fail" statement that causes subsequent tests to be ignored. This kind of error is normally caught by the compiler, but only if you call the compiler with a sufficiently restrictive option. Windows developers are required to using "warning level 4," which would have produced a compiling error because the "SSLHashSHA1.final" test is unreachable. Windows automated code checks would also have caught the error. Apparently, Apple does not require this kind of precautions.
But apart from restating the rules about "writing secure code," this bug also reminds me of the importance of good coding standards. Writing code is not just about encoding an algorithm, it is also about communicating with other developers, like for example the folks who will have to update that code in a couple years, or your buddies who will review the code before you check it in. We require code reviews before any check-in at Microsoft, I assume they do the same at Apple, but obviously the reviewers did not find the bug, maybe because they were fooled by the indentations. This particular bug is actually a kind of optical illusion, because the duplicate line have the same indentation as the original line. This is where coding standards kick in.
Coding standards or coding guidelines often specify things like when to use upper case and lower case in the name of variables, what kind of indentation to use, what kind of comments, etc. This is a bit arbitrary, different teams will often have different guidelines, but having common guidelines makes code easier to review, and that is a big advantage. One such standard is to never use the "if () instruction;" construct, but to always use brackets for the conditional branch. For example, instead of writing:
if ( cashAvailable < price)
refuseTheSale();
We would require that the developer writes:
if ( cashAvailable < price)
{
refuseTheSale();
}
Apart from making the program more readable, it also avoids future maintenance errors, such as writing:
if ( cashAvailable < price)
AddToRefusedSalesStatistics(cashAvailable, price);
refuseTheSale();
Instead of:
if ( cashAvailable < price)
{
AddToRefusedSalesStatistics(cashAvailable, price);
refuseTheSale();
}
That is, this coding guideline reduces the risk of confusion between "instruction" and "block," a classic pitfall in "if" statements. Applying it would have made the error quite obvious, and also harmless. The code would have looked like this:
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
{
goto fail;
}
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
{
goto fail;
goto fail;
}
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
{
goto fail;
}
...
The extra curly brackets add a bit to the line count, but they definitely make the code much easier to review, and the errors much easier to detect.
The other point that strikes me in that code is the use of the "goto" statement. I am probably in a minority here, but I have a lot of sympathy for Dijkstra's statement, "Go To Statement Considered Harmful." The code above could be easily written without using the goto statement:
static OSStatus
SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
uint8_t *signature, UInt16 signatureLen)
{
OSStatus err;
...
if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) == 0)
{
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) == 0)
{
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
{
...
}
}
}
SSLFreeBuffer(&signedHashes);
SSLFreeBuffer(&hashCtx);
return err;
}
The code is easy to read, it will not terminate too early, but it requires extra indentations and that can be painful. If the "standard path" code represented by the ellipses is long, we end up with having to count curly brackets over many lines of code, which is error prone. That's the main reason why many of my colleagues prefer the "goto fail" pattern. On the other hand, if that was really a problem, it is always possible to factor the "common case" as a separate procedure. In any case, if you want to use goto statements, if you think that you know better than Edsger Dijkstra, you have to be extra careful!