A simple method to identify improper use of the == operator in PHP applications from a black box perspective, in order to exploit type juggling
This blog post provides some insight into a problem that can often be found in PHP applications, more specifically when comparing strings using the equal (==) operator. There are already several posts on this topic; however, I will be focussing on how this can be exploited from a black box perspective against any web app, in a way that is suitable for a common penetration test. First I will analyse the root cause of this problem in order to better understand how it works and how we can get the most out of it.
In 2011, an interesting thread on the official PHP bug tracking system referred to some weird behaviour when comparing strings with numbers. The thread was not specifically written from a security standpoint; however, one of its comments is shown below:
php > var_dump('0xff' == '255');
bool(true)
In fact, this specific example is not a bug but the result of a documented behaviour known as 'type juggling', which PHP applies when using loose comparison operators such as ==. Essentially, for certain comparison operators (==, !=, <>), PHP first tries to work out the type of the operands based on several factors [3, 4], and only then does the comparison take place. These conversions may change the expected result, with important security implications, as this privilege escalation demonstrates [5, 6], or this insecure password validation reported on Full Disclosure.
Gynvael has a great blog post on this, 'PHP equal operator ==', which covers it extensively for different data types. There you can find a great comparison reference table and several examples, such as the following:
"1.00000000000000001" == "0.1e1" → bool(true)
"+1" == "0.1e1" → bool(true)
"1e0" == "0.1e1" → bool(true)
"-0e10" == "0" → bool(true)
"1000" == "0x3e8" → bool(true)
"1234" == " 1234" → bool(true)
As you can see, these numeric strings are compared as actual numbers when using ==, which is particularly interesting from a security perspective. In particular, a string that looks like a number in scientific notation will be evaluated by PHP as that number. Such a string could actually be the output of a hashing algorithm (usually represented in hexadecimal): if it reads as 0 multiplied by any power of ten (for example 0e1234), it will always match 0 in a loose comparison. For a given hashing algorithm, the corresponding passwords become interchangeable: any of them is accepted in place of another, since their hashes, once converted into numbers, all represent the same value even though the strings differ.
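To make the mechanism concrete, here is a minimal Python sketch of the rule described above. It is a simplified model and not PHP's actual implementation: the helper names `is_numeric_string`, `loose_equals` and `md5_hex` are made up for illustration, hex strings and the rest of PHP's comparison rules are deliberately ignored, and the two inputs are the widely circulated MD5 magic strings `240610708` and `QNKCDZO` rather than values taken from this post.

```python
import hashlib
import re

# Simplified idea of a PHP "numeric string": optional leading whitespace,
# optional sign, decimal digits, optional exponent. Assumption: this ignores
# hex strings and every other corner of PHP's real rules.
NUMERIC = re.compile(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def is_numeric_string(value: str) -> bool:
    return NUMERIC.fullmatch(value) is not None


def loose_equals(left: str, right: str) -> bool:
    """Rough model of == on two strings: numeric strings compare as numbers."""
    if is_numeric_string(left) and is_numeric_string(right):
        return float(left) == float(right)
    return left == right


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode()).hexdigest()


first, second = md5_hex("240610708"), md5_hex("QNKCDZO")
print(first, second)                # two different digest strings
print(first == second)              # strict string comparison
print(loose_equals(first, second))  # comparison after "type juggling"
```

Both digests start with `0e` followed only by decimal digits, so the model reads each of them as zero times a power of ten, which is why the last comparison succeeds while the strict one does not.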
Finally, this problem became more notorious with a post from Ed Skoudis and, later on, with Robert Hansen, who wrote 'Magic Hashes', where he compiled a table of values that, for different hashing algorithms, generate an output matching the format ^0+e\d*$, i.e. 0 in scientific notation, among others.
It's relatively trivial to spot these bugs from a static analysis point of view, but what can we do from a black box perspective? For any account within an application, if we can find a pair of these interchangeable passwords for the most popular hashing algorithms (SHA1, MD5, ...) and both are accepted for that same account, then in all likelihood we have identified improper use of a loose comparison on password hashes. In a typical penetration testing engagement, where you are provided with at least one account, you can set your password to one of these and then try to log in with the other. Obviously this depends on which hashing algorithm is in use, so you need two of these passwords per hashing algorithm, and it assumes that no salt is in use.
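The test procedure above can be sketched as a small Python helper. This is only an outline under stated assumptions: `set_password` and `try_login` are hypothetical callables that you would implement against the target application (for instance as HTTP requests), `probe` is a made-up name, and the default pairs are the ones quoted in the PHP session further down in this post.

```python
from typing import Callable, Dict, List, Tuple

# Candidate pairs per hashing algorithm, as quoted later in this post.
PAIRS: Dict[str, Tuple[str, str]] = {
    "md5": ("c!C123449477", "d!D206687225"),
    "sha1": ("aA1537368460!", "fF3560631665!"),
}


def probe(
    set_password: Callable[[str], bool],  # hypothetical: change our own password
    try_login: Callable[[str], bool],     # hypothetical: attempt a login
    pairs: Dict[str, Tuple[str, str]] = PAIRS,
) -> List[str]:
    """Return the algorithms for which both passwords of a pair are accepted."""
    suspects = []
    for algorithm, (first, second) in pairs.items():
        if not set_password(first):
            continue  # e.g. rejected by the password policy
        # Sanity check with the real password, then try the "other" one.
        if try_login(first) and try_login(second):
            suspects.append(algorithm)
    return suspects
```

A non-empty result suggests unsalted hashes of that algorithm compared with ==. An empty result proves nothing on its own, since a salt or a different algorithm would also explain it.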
Now, before bringing the heat to find those password pairs, there is one more consideration to keep in mind: password requirements. It would be a shame to work on finding a couple of these passwords and then fail to prove our vulnerability just because they do not meet the password complexity requirements. So let's make sure we look for passwords with length > 8, letters in mixed case, numbers and at least one special character, as follows:
I did not go to great lengths to optimise performance: it is just a carefully written Python script running on all available cores of my AMD FX8350 using the PyPy interpreter. I used hashlib with OpenSSL's implementation of the hashing functions and, to make sure I didn't get bitten by the Python GIL, I just spawned independent processes, each one of them looking at a different key space. For that, I used an ultra-sophisticated technique, which is to just have a different prefix for each password (which in turn is derived from a user-supplied letter), as you can see above.
Within about an hour I got not 2 but 4 hashes for SHA1 and, surprisingly, it took me a bit longer to get 4 of those passwords for MD5. I'm sure that if you run this on your machine it won't take long until you uncover many more, but we really only need 2 per hashing algorithm.
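The search described above can be sketched in a few lines of Python. This is not the author's script, only an illustration under assumptions: the function names are hypothetical, the candidate layout (lowercase letter, `!`, uppercase letter, nine digits) is inferred from the example passwords shown in this post, and the pattern requires at least one digit after the `e` so that the digest reads as a number.

```python
import hashlib
import re
from itertools import count, islice
from typing import Iterator, List, Tuple

# A digest that reads as 0 in scientific notation: zeros, 'e', then digits only.
MAGIC = re.compile(r"0+e\d+")


def is_magic(hex_digest: str) -> bool:
    return MAGIC.fullmatch(hex_digest) is not None


def candidates(letter: str) -> Iterator[str]:
    """Yield passwords with mixed case, a special character and digits."""
    prefix = f"{letter.lower()}!{letter.upper()}"  # one key space per letter
    for number in count():
        yield f"{prefix}{number:09d}"


def search(letter: str, algorithm: str = "md5",
           max_tries: int = 1_000_000) -> List[Tuple[str, str]]:
    """Hash up to max_tries candidates and keep those with a magic digest."""
    found = []
    for password in islice(candidates(letter), max_tries):
        digest = hashlib.new(algorithm, password.encode()).hexdigest()
        if is_magic(digest):
            found.append((password, digest))
    return found
```

To use every core, one option is to start one process per letter (for example with `multiprocessing.Pool.map` over a list of letters), so that each worker explores its own prefix. Expect to need far more than the default `max_tries` before anything turns up.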
Here you have the colliding passwords:
To prove it, you can pick any two of them and compare:
php > var_dump(md5('c!C123449477') == md5('d!D206687225'));
bool(true)
php > var_dump(sha1('aA1537368460!') == sha1('fF3560631665!'));
bool(true)
If the above doesn't work and you are feeling lucky, you could try concatenating your password with the username, hoping that it is used as a salt, and look for these numerical hashes. You can do it with a very simple change to the script provided here.
PHP provides a solution for this: if you want to compare hashes you should use either the password_verify() or the hash_equals() function. These avoid the loose comparison and close this vector. Note that hash_equals() may also be used for strings other than hashes and should be preferred over === because it offers protection against timing attacks.
Whilst trivial to execute, the approach described here may provide valuable knowledge that one can get from a black box perspective. If the passwords provided above are found to be interchangeable for a given application, several conclusions can be drawn (that can go much beyond the PHP datatype juggling issue itself), such as:
There is much more that can be done to explore this further. An attacker could include those passwords in a wordlist and launch a horizontal password brute force attack against all users. Also, if the application has an insecure password recovery mechanism, attackers may be able to trigger an arbitrary number of password resets against a targeted account and keep trying until successful. This is still quite unlikely to work, at least without flooding that account with a "few" million password reset emails. Note, however, that this problem may not be exclusive to password validation: it could happen in any security-sensitive comparison, including session validation (sending a session ID == 0 multiple times against custom session validation in PHP could work!).
It would also be interesting to get a couple of these passwords for SHA-2, but a rough estimate tells me this is not going to fly, not with the approach described above. SHA-2 hashes are longer, so the odds of finding a string whose hash matches our desired output format are too slim. For this reason it is also less likely to become exploitable in a realistic scenario, but it would still be a nice exercise for someone with a well-tuned GPU rig.
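One way to put rough numbers on that estimate is the sketch below. It assumes every hex character of a digest is uniformly random and independent, and it counts digests made of one or more zeros, an `e`, and then decimal digits only; `magic_probability` is a made-up helper name and the figures are an approximation, not a measurement.

```python
from fractions import Fraction


def magic_probability(hex_len: int) -> Fraction:
    # Chance that a random digest of hex_len hex characters consists of
    # one or more '0', then 'e', then only decimal digits (at least one).
    total = Fraction(0)
    for zeros in range(1, hex_len - 1):
        digits = hex_len - zeros - 1  # characters left after the 'e'
        # `zeros` specific characters, one 'e', then 10 allowed values out of 16.
        total += Fraction(1, 16) ** (zeros + 1) * Fraction(10, 16) ** digits
    return total


for name, hex_len in (("MD5", 32), ("SHA-1", 40), ("SHA-256", 64)):
    p = magic_probability(hex_len)
    print(f"{name}: about 1 in {float(1 / p):.2e} candidates")
```

Each extra hex character multiplies the odds by 10/16, so the gap between a 40-character SHA-1 digest and a 64-character SHA-256 digest grows to several orders of magnitude.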
Check here [PY].
 Magic Hashes