Monday 10 February 2014

CTF Olympic - Emdee - An alternate approach

The "Emdee" challenge featured at the Olympic CTF required you to find a secret and salt that was hashed together. To be able to achieve it, you had access to a web API that was hashing for you a message of your choice with the salt and a timestamp. The normal approach to do it was to find the salt by using backspace characters for the message. This would reduce the length of the salt, allowed you to bruteforce smaller part of the salt and progressively get it. However, what I will present is how to get the secret without knowing the salt. One of the reason I find it interesting to talk about it is because it's a good introduction to hash extensions and most people seemed surprise that it was even possible.

First, let's look at how the message are hashed and how MD5 works at a high-level.

MD5


MD5 uses a Merkle-Damgard construction which can be described with the following image that's taken directly from Wikipedia.


In the case of MD5, the IV is static, the function "f" can found in any implementation of MD5 and the padding is constructed by adding 0x80 + (0x00)*(N) + (length of what is hashed). Here N is adjusted so that the length of the last block is exactly 64 bytes long.

The message


The message you have to find is given by MD5(salt+ secret).
The output of the API is given by MD5(salt+ input + timestamp) and the timestamp used is returned to you.

Doing the hash extension


So with that in mind, we now know that the message was evaluated this way:
hashed message = f(PADDING(salt + secret), IV)

And we also know that the output of the API is the following:
output = f(PADDING(salt + input + timestamp), IV)

For the message, it is likely that the length is below 64 bytes, so we will assume it always fit on a single block. For the output however, if the content gets bigger than 64 bytes it will then be hashed the following way:

txt = PADDING(salt + input + timestamp)
output = f(substring(txt, 64, 128), f(substring(txt, 0, 64), IV))

Where it starts to get interesting is that we can make the first 64 bytes equals to PADDING(salt + secret) and this will happen:

txt = PADDING(salt + input + timestamp)
output = f(substring(txt, 64, 128), f(PADDING(SALT + secret), IV))
output = f(timestamp + padding, hashed message)

And this means that we can verify if the secret we have sent to the server is correct if the following holds:

f(timestamp given by the API + padding, hashed message) = output of the API

To have the hash extension work properly, you need to know the length of the salt otherwise the padding will be incorrect. Since it's unknown for the problem we will have to test for both the secret and the length of salt.

Conclusion


When all wrapped up together it gives the following :
https://gist.github.com/HoLyVieR/8920691

This allowed you to find the length of the salt and the secret message. In the case of the challenge the length was 40 and the secret was "cow".