Stripe's web security CTF's level 0 and level 3 had SQL injection solutions described below.
app.get('/*', function(req, res) {
var namespace = req.param('namespace');
if (namespace) {
var query = 'SELECT * FROM secrets WHERE key LIKE ? || ".%"';
db.all(query, namespace, function(err, secrets) {
There's no input validation on the namespace parameter and it is injected into the SQL query with no encoding applied. This means you can use the '%' character as the namespace which is the wildcard character matching all secrets.
Code review red flag was using strings to query the database. Additional levels made this harder to exploit by using an API with objects to construct a query rather than strings and by running a query that only returned a single row, only ran a single command, and didn't just dump out the results of the query to the caller.
@app.route('/login', methods=['POST'])
def login():
username = flask.request.form.get('username')
password = flask.request.form.get('password')
if not username:
return "Must provide username\n"
if not password:
return "Must provide password\n"
conn = sqlite3.connect(os.path.join(data_dir, 'users.db'))
cursor = conn.cursor()
query = """SELECT id, password_hash, salt FROM users
WHERE username = '{0}' LIMIT 1""".format(username)
cursor.execute(query)
res = cursor.fetchone()
if not res:
return "There's no such user {0}!\n".format(username)
user_id, password_hash, salt = res
calculated_hash = hashlib.sha256(password + salt)
if calculated_hash.hexdigest() != password_hash:
return "That's not the password for {0}!\n".format(username)
There's little input validation on username before it is used to constrcut a SQL query. There's no encoding applied when constructing the SQL query string which is used to, given a username, produce the hashed password and the associated salt. Accordingly one can make username a part of a SQL query command which ensures the original select returns nothing and provide a new SELECT via a UNION that returns some literal values for the hash and salt. For instance the following in blue is the query template and the red is the username injected SQL code:
SELECT id, password_hash, salt FROM users WHERE username = 'doesntexist' UNION SELECT id, ('5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8') AS password_hash, ('word') AS salt FROM users WHERE username = 'bob' LIMIT 1
In the above I've supplied my own salt and hash such that my salt (word) plus my password (pass) hashed produce the hash I provided above. Accordingly, by
providing the above long and interesting looking username and password as 'pass' I can login as any user.
Code review red flag is again using strings to query the database. Although this level was made more difficult by using an API that returns only a single row and by using the execute method which only runs one command. I was forced to (as a SQL noob) learn the syntax of SELECT in order to figure out UNION and how to return my own literal values.
I was the 546th person to complete Stripe's web security CTF and again had a ton of fun applying my theoretical knowledge of web security issues to the (semi-)real world. As I went through the levels I thought about what red flags jumped out at me (or should have) that I could apply to future code reviews:
Level | Issue | Code Review Red Flags |
---|---|---|
0 | Simple SQL injection | No encoding when constructing SQL command strings. Constructing SQL command strings instead of SQL API |
1 | extract($_GET); | No input validation. |
2 | Arbitrary PHP execution | No input validation. Allow file uploads. File permissions modification. |
3 | Advanced SQL injection | Constructing SQL command strings instead of SQL API. |
4 | HTML injection, XSS and CSRF | No encoding when constructing HTML. No CSRF counter measures. Passwords stored in plain text. Password displayed on site. |
5 | Pingback server doesn't need to opt-in | n/a - By design protocol issue. |
6 | Script injection and XSS | No encoding while constructing script. Deny list (of dangerous characters). Passwords stored in plain text. Password displayed on site. |
7 | Length extension attack | Custom crypto code. Constructing SQL command string instead of SQL API. |
8 | Side channel attack | Password handling code. Timing attack mitigation too clever. |
More about each level in the future.
jQuery plugin that blindly removes lines with errors and recompiles until it works
HTTP Content Coding Token | gzip | deflate | compress |
---|---|---|---|
An encoding format produced by the file compression program "gzip" (GNU zip) | The "zlib" format as described in RFC 1950. | The encoding format produced by the common UNIX file compression program "compress". | |
Data Format | GZIP file format | ZLIB Compressed Data Format | The compress program's file format |
Compression Method | Deflate compression method | LZW | |
Deflate consists of LZ77 and Huffman coding |
Compress doesn't seem to be supported by popular current browsers, possibly due to its past with patents.
Deflate isn't done correctly all the time. Some servers would send the deflate data format instead of the zlib data format and at least some versions of Internet Explorer expect deflate data format instead of zlib data format.
As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).
Getting into the more subtle levels of URI percent-encoding ignorance, folks try to apply their knowledge of percent-encoding to URIs as a whole producing the concepts escaped URIs and unescaped URIs. However there are no such things - URIs themselves aren't percent-encoded or decoded but rather contain characters that are percent-encoded or decoded. Applying percent-encoding or decoding to a URI as a whole produces a new and non-equivalent URI.
Instead of lingering on the incorrect concepts we'll just cover the correct ones: there's raw unencoded data, non-normal form URIs and normal form URIs. For example:
In the above (A) is not an 'encoded URI' but rather a non-normal form URI. The characters of 'the' and 'path' are percent-encoded but as unreserved characters specific in the RFC should not be encoded. In the normal form of the URI (B) the characters are decoded. But (B) is not a 'decoded URI' -- it still has an encoded '?' in it because that's a reserved character which by the RFC holds different meaning when appearing decoded versus encoded. Specifically in this case, it appears encoded which means it is data -- a literal '?' that appears as part of the path segment. This is as opposed to the decoded '?' that appears in the URI which is not part of the path but rather the delimiter to the query.
Usually when developers talk about decoding the URI what they really want is the raw data from the URI. The raw decoded data is (C) above. The only thing to note beyond what's covered already is that to obtain the decoded data one must parse the URI before percent decoding all percent-encoded octets.
Of course the exception here is when a URI is the raw data. In this case you must percent-encode the URI to have it appear in another URI. More on percent-encoding while constructing URIs later.
As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping).
Worse than the lame blog comments hating on percent-encoding is the shipping code which can do actual damage. In one very large project I won't name, I've fixed code that decodes all percent-encoded octets in a URI in order to get rid of pesky percents before calling ShellExecute. An unnamed developer with similar intent but clearly much craftier did the same thing in a loop until the string's length stopped changing. As it turns out percent-encoding serves a purpose and can't just be removed arbitrarily.
Percent-encoding exists so that one can represent data in a URI that would otherwise not be allowed or would be interpretted as a delimiter instead of data. For example, the space character (U+0020) is not allowed in a URI and so must be percent-encoded in order to appear in a URI:
http://example.com/the%20path/
http://example.com/the path/
For an additional example, the question mark delimits the path from the query. If one wanted the question mark to appear as part of the path rather than delimit the path from the query, it must be percent-encoded:
http://example.com/foo%3Fbar
http://example.com/foo?bar
/foo
" from the query "bar
". And in the first, the querstion mark is percent-encoded and so
the path is "/foo%3Fbar
".
As a professional URI aficionado I deal with various levels of ignorance on URI percent-encoding (aka URI encoding, or URL escaping). The basest ignorance is with respect to the mere existence of percent-encoding. Percents in URIs are special: they always represent the start of a percent-encoded octet. That is to say, a percent is always followed by two hex digits that represents a value between 0 and 255 and doesn't show up in a URI otherwise.
The IPv6 textual syntax for scoped addresses uses the '%' to delimit the zone ID from the rest of the address. When it came time to define how to represent scoped IPv6 addresses in URIs there were two camps: Folks who wanted to use the IPv6 format as is in the URI, and those who wanted to encode or replace the '%' with a different character. The resulting thread was more lively than what shows up on the IETF URI discussion mailing list. Ultimately we went with a percent-encoded '%' which means the percent maintains its special status and singular purpose.
In short: excessive use of promises leads to a ton of short lived objects and resulting poorer pref.
“The syntax for allowed Top-Level Domain (TLD) labels in the Domain Name System (DNS) is not clearly applicable to the encoding of Internationalised Domain Names (IDNs) as TLDs. This document provides a concise specification of TLD label syntax based on existing syntax documentation, extended minimally to accommodate IDNs.” Still irritated about arbitrary TLDs.
From the document: ‘Appendix B. Implementation Report: The encoding defined in this document currently is used for two different HTTP header fields: “Content-Disposition”, defined in [RFC6266], and “Link”, defined in [RFC5988]. As the encoding is a profile/clarification of the one defined in [RFC2231] in 1997, many user agents already supported it for use in “Content-Disposition” when [RFC5987] got published.
Since the publication of [RFC5987], two more popular desktop user agents have added support for this encoding; see http://purl.org/
NET/http/content-disposition-tests#encoding-2231-char for details. At this time, only one major
desktop user agent (Safari) does not support it.
Note that the implementation in Internet Explorer 9 does not support the ISO-8859-1 encoding; this document revision acknowledges that UTF-8 is sufficient for expressing all code points, and removes the requirement to support ISO-8859-1.’
Yay for UTF-8!
I've just updated Encode-O-Matic with a Guess Input Encoding feature. When you start Encode-O-Matic or when you use the 'Guess Input Encoding' menu item from the 'Tools' menu, Encode-O-Matic will try out various combinations of encodings and guess at which set seem to apply to your input. For instance given the following text, Encode-O-Matic will correctly guess that it is percent encoded, base64 encoded, deflate compressed text:
S%2BWqUEhLLMoFUulFpXnZQLogMa%2BkmCuPqxzILk%2FMyeHK4QIA
It should work fairly well for simple things but I did pick 'Guess' for the name of the feature to intentionally lower
expectations. It doesn't currently apply to character encodings but that may be something to consider in the future.I've just put up an update for Encode-O-Matic with the following improvements: