Drupal Services SQL injection - don't trust abstractions

Jun 27th, 2020

Drupal doesn’t have many SQL injection vulnerabilities anymore, at least not since the original Drupalgeddon was released into the wild. So what makes Drupal so safe? Abstractions of course! The database abstraction layer or “DB layer” is used throughout core and contrib to make all sorts of database calls in a way that’s easy to understand and relatively secure. On top of that, now-a-days most code only needs to use the Entity API, which is another huge abstraction on top of the DB layer.

Here’s an example of the kinds of queries the DB layer produces (using “drush php” with Drupal 9 and SQLite):

>>> (string) \Drupal::database()->select("node")->fields("node", ["nid"])->condition("nid", 1, "!=")->orderBy("title", "ASC") => """ SELECT "node"."nid" AS "nid"\n FROM\n {node} "node"\n WHERE "nid" != :db_condition_placeholder_0\n ORDER BY "title" ASC """

That “:db_condition_placeholder_0” part is the key - all conditions use placeholders for comparison values, which prevent the majority of SQL injections.

This is great, but a downside of abstracting users this far away from raw SQL is they assume that all values passed to the DB layer are safely escaped before use. As it turns out, the operator used by conditions (“!=” above) is not sanitized, and in some cases can lead to SQL injection.

This isn’t a bug in core - it’s intentional because it allows users to use subqueries in conditions, which is common for complex queries. An abstraction around subqueries would be awkward, so it makes sense that raw SQL is allowed here. It’s not a bug, but are users aware of this behavior, and are they escaping user provided SQL conditions?

WIth these thoughts rumbling in my head, I knew that any time I saw an user-provided operator in code, that was a target for SQL injection. I happened to already be researching the Drupal 7 branch of the Services module, and noticed that the API for searching nodes (/node.json) allowed for some pretty complex queries:

http:///[domain]/node.json?fields=title&parameters[title]=bar

This query will return the titles of all nodes if their title is “bar”. After browsing the codebase again I saw that you can also provide an operator with a query parameter like options[parameters_op][title], which we now know could contain almost any valid SQL string.

Now I needed to write an SQL injection payload. An important step to SQL injection is knowing or guessing what the real SQL query is, then thinking of ways you can modify that query to perform a valid injection. I don’t have my notes for what the original query was exactly, but let’s assume it’s something like this:

SELECT "node"."title" AS "title" FROM {node} "node" WHERE ("title" = “bar”)

Our point of injection is “=” - we can write any string there to do something malicious. Here’s the payload I thought up to show this bug:

IS NOT NULL) UNION ALL SELECT SID FROM SESSIONS JOIN USERS ON USERS.UID = SESSIONS.UID WHERE USERS.NAME = LOWER("ADMIN") AND ("FOO" <>

This payload closes the conditional statement early, then does a union select for the admin user’s session ID. Note that the payload is uppercase because this API forced all operators to be uppercased. Here’s what the final query looks like:

SELECT "node"."title" AS "title" FROM {node} "node" WHERE ("title" IS NOT NULL) UNION ALL SELECT SID FROM SESSIONS JOIN USERS ON USERS.UID = SESSIONS.UID WHERE USERS.NAME = LOWER("ADMIN") AND ("FOO" <> = “bar”)

Due to the way the query is formed, the first and last conditions will always return true, so you’re guaranteed a result if there’s a user with the username “admin”.

When I reported this the maintainers and some security team members were surprised by the bug. This really seems like a core problem, but it’s not easily addressable in core without some major changes. Coming up with a fix in Services was tricky - how do you ensure that a SQL operator is “safe”? Eventually we decided to form a list of allowed operators, and compare user input to that list. Not the most elegant change, but feels like a good solution compared to a new and likely flawed regular expression. Here’s the SA for this release https://www.drupal.org/sa-contrib-2019-026

To me the lesson here is that we trust Drupal’s abstractions too much. It’s not clear what user input is automatically escaped, and what needs to be manually handled. I wonder how this can be made more clear to developers, but for now please read documentation and never blindly trust core!