SQL Injection

SQL injection is a technique used to take advantage of non-validated input vulnerabilities to pass SQL commands through a Web application for execution by a back end database. Attackers take advantage of the fact that programmers often chain together SQL commands with user-provided parameters, and can therefore embed SQL commands inside these parameters. The result is that the attacker can execute arbitrary SQL queries and/or commands on the back end database server through the Web application.

Detailed description

Databases are fundamental components of Web applications. Databases enable Web applications to store data, preferences and content elements. Using SQL, Web applications interact with databases to dynamically build customized data views for each user. A common example is a Web application that manages products. In one of the Web application's dynamic pages (such as ASP), users are able to enter a product identifier and view the product name and description.

The request sent to the database to retrieve the product's name and description is implemented by the following back end SQL statement.

SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = {ProductNumber}

Typically, Web applications use string queries, where the string contains both the query itself and its parameters. The string is built using server-side script languages such as ASP, JSP and CGI, and is then sent to the database server as a single SQL statement. The following example demonstrates an ASP code that generates a SQL query.

sql_query= "SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = " & Request.QueryString("ProductID")

The call Request.QueryString("ProductID") extracts the value of the Web form variable ProductID so that it can be appended as the SELECT condition.

When a user enters the following URL:

http://www.mydomain.com/products/products.asp?productid=123

The corresponding SQL query is executed:

SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = 123

An attacker may abuse the fact that the ProductID parameter is passed to the database without sufficient validation. The attacker can manipulate the parameter's value to build malicious SQL statements. For example, setting the value "123 OR 1=1" to the ProductID variable results in the following URL:

http://www.mydomain.com/products/products.asp?productid=123 or 1=1

The corresponding SQL Statement is:

SELECT ProductName, Product Description From Products WHERE ProductNumber = 123 OR 1=1

This condition would always be true and all ProductName and ProductDescription pairs are returned. The attacker can manipulate the application even further by inserting malicious commands. For example, an attacker can request the following destructive URL:

http://www.mydomain.com/products/products.asp?productid=123;DROP TABLE Products

In this example the semicolon is used to pass the database server multiple statements in a single execution. The second statement is "DROP TABLE Products" which causes SQL Server to delete the entire Products table. For such cases, the attacker would need to (a) access an SQL service vulnerable to the “;” meta-character usage (stacked commands) and (b) know that a table Products exists but as demonstrated below, it is not hard to access such information.

Not all SQL based back end could be exploited via the “;” meta-character. An attacker may however use an alternate method. A form of this attack use SQL injection to retrieve data from other tables. This can be done using the SQL UNION SELECT statement. The UNION SELECT statement allows the chaining of two separate SQL SELECT queries that have nothing in common. For example, consider the following SQL query:

SELECT ProductName, ProductDescription FROM Products WHERE ProductID = 123 UNION SELECT Username, Password FROM Users;

The result of this query is a table with two columns, containing the results of the first and second queries, respectively. Assuming the SQL account accessing the table Products also has access to the table Users, an attacker may use this type of SQL injection by requesting the following URL:

http://www.mydomain.com/products/products.asp?productid=123 UNION SELECT Username, Password FROM Users

SQL weaknesses exploits

Typically, such injection is NOT far fetched. In fact, two characteristics (and perhaps shortcomings) of SQL servers in general simplify such attack:

SQL allows accessing other tables in other databases shall the account running the query be allowed read access to those databases. Especially, accessing user,password FROM mysql.user would allow listing all user,password keys from a MySQL server.
since double dash "--" tells many SQL servers ignore the rest of the query, the hacker can push this one step further and basic protections against injection would fail.

Let us analyze one of our previous examples. The case:

SELECT ProductName, Product Description From Products WHERE ProductNumber = 123 OR 1=1

could be protected against listing all records by an additional SQL statement such as

SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = {ProductNumber} LIMIT 1

This would a-priori block listing all products. However, a request using a double hyphen would form in this case the SQL statement

SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = 123 or 1=1 -- LIMIT 1

ignoring the LIMIT 1. This is also trivially possible using strings, even though they are protected by quotes. For example:

SELECT ProductName, ProductDescription FROM Products WHERE ProductName = '{ProductName}'

could be modified by passing

http://www.mydomain.com/products/products.asp?productName=Paint' or 1=1 --

then becoming

SELECT ProductName, ProductDescription FROM Products WHERE ProductName = 'Paint' or 1=1 --'

and the protection through quotes, somewhat helping, is now rendered obsolete by the use of the double hyphen trick. Depending on the actual SQL query, you may have to try some of these possibilities:

' or 1=1--
" or 1=1--
or 1=1--
' or 'a'='a
" or "a"="a
') or ('a'='a

one of them will likely break an un-protected CGI. The same principle applies for the example using UNION SELECT Username, Password FROM USERS which could be inserted after strings or numbers.

The security model used by many Web applications assumes that an SQL query is a trusted command. This enables attackers to exploit SQL queries to circumvent access controls, authentication and authorization checks. In some instances, SQL queries may allow access to host operating system level commands. This can be done using stored procedures. Stored procedures are SQL procedures usually bundled with the database server. For example, the extended stored procedure xp_cmdshell executes operating system commands in the context of a Microsoft SQL Server. Using the same example, the attacker can set the value of ProductID to be "123;EXEC master..xp_cmdshell dir--", which returns the list of files in the current directory of the SQL Server process and so on. Now that the principles are understood, let's go through possible remediations.

Prevention and remediations

There are a few principles to follow to avoid injection. We will separate them in two categories, the best practices and the strong checks to avoid injections.

The sole viable solution is access of the SQL back end in (Web) applications from non-privileged accounts having no access to any other tables than what it needs to (only) read. Database administartor should deploy a startegy where a read/write account conceptual architecture and separation is a focus. Often, Web application only need read access to tables and restricting accounts in confined privileges would prevent DROP TABLE attacks.

On the user's side, never connect to the database as a superuser or as the database owner. Use always customized users with very limited privileges.

The best practices

This section contains general best practice guidances you should always apply. However, while following the guidance below will likely protect CGIs having numbers as input against a hacker attack, beware of CGI using strings: they can hardly be protected via the mechanism and best practices described in this section.

Using APIs such as DBI or JDBC instead of forming SQL statement yourself is a first step to good CGI's scripting. For example, a perl DBI interface would look like

$dbh->prepare(“SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = ?”);
$dbh->execute($productNumber);

where $productNumber is recovered from your CGI, would be by far preferred over an approach which would execute the literal SQL statement as showed in previous sections. In general, prepared statements are much safer than hand-made SQL statement as many API internally (but minimally) protects against injection by escaping arguments.

In fact, the Java based API for example protects against type spoofing by explicitely requiring a type specification. For example, using

String sqlString = "SELECT ...";

would be a monumental mistake as Java also supports prepared statements the same way than the perl API do. The proper form would then be to use

PreparedStatement dbh = con.prepareStatement("SELECT ProductName, ProductDescription FROM Products WHERE ProductNumber = ?");

and use the appropriate setXXX method for the type you are considering. For example, public void setString(int parameterIndex, String x) would set parameter parameterIndex to the value x.

The second principle, and if the API does not provide such functionalities, is to always protect your SQL statements with a type check, transformation or a type cast.

If you are expecting an integer, you should use casting or transformation to int(). An injection attempt using a combination of integer and strings would result in a null value for productNumber in our example (you could check for a null value and rake special action)
If you are expecting a string as argument but composed of a single word, you should not recover multiple values separated by spaces. A simple check such as a perl statement
$productname = (split(" ",$ProductName))[0]; would isolate the first argument ; your are still subject to 'a'='a' type injection but protected yourself against UNION or stack injection.
Most langages are C-like inspired. You can make extensive use of sprintf(). This works very well for a range of types and especially when arguments are of numeric representation.

Third, and if you are using scripts (<script></script>) instead of CGI, never expose the SQL statements you are using nor add debugging lines in CGI as comments. A source inspection should never allow guessing the SQL statements in use. Furthermore, exposed code in scripts or generated HTML could be easily parsed, the SQL statement auto-analyzed and help crafting an injection URL.

The forth rule relies on how your form the SQL statements - obfuscate with with additional statements preventing simple attacks. In other words, you could use hidden conditions which will make unlikely the use of an injection using “;” meta-character method. For example, the use of additional statements such as LIMIT 1 would protect against the injection of TRUE via 1=1, the addition of conditional selection for a WHERE clause such AND ProductPrice > 0 would likely prevent most “;” command stack attack.

Similarly, prepared statements taking advantage of complex statements such as INSERT ... (keys1,key2,key3) VALUES(?,?,?). While not full proof, it would be hard(er) to break as it a hacker would need to either have the knowledge of the exact number of arguments you are using or attempts multiple argument combinations.

For strings, a first line of defense and prevention measures is to escape all special characters and especially test characters such as semicolumn ;, the infamous -- double hyphen, single quotes '.

Our fifth recommendation relates to langage specifics methods to check or transform arguments or reshape (prepare) the SQL and include:

If you can call the MySQL API function mysql_escape_string do so. Its purpose is to escape all single quote and backslash characters from untrusted strings.
For DBI, use the equivalent of $dbh->quote($string) along with the use of prepared statements as in the example at the begginiing of this section
PHP also provides from the simplest ones found in Variable Functions and in Character Type Functions (e.g. is_numeric(), ctype_digit() respectively). You can also silently change parameter's type using settype(). Quote each non numeric user supplied value that is passed to the database with the database-specific string escape function (e.g. mysql_real_escape_string(), sql_escape_string(), etc.). If a database-specific string escape mechanism is not available, the addslashes() and str_replace() functions may be useful.
In Java, your best bet is the use of the PreparedStatement class with strong type checking. Java provides a wider range of type convertion, regular expression and replacement strings you could use as detection of an injection attempts

In the next section, we will discuss perl-like regular expression for detecting injections applicable in perl, Java, and PHP.

Strong checks to prevent injections

Detection at the source is one way to prevent injection. A reliable and portable technique is the use of regular expression. An important point to keep in mind while choosing your regular expression(s) for detecting SQL Injection attacks is that an attacker can inject SQL into input taken from a form, as well as through the fields of a cookie. Your input validation logic should consider each and every type of input that originates from the user as suspect.

Also if you discover too many alerts coming in from a signature that looks out for a single-quote or a semi-colon, it just might be that one or more of these characters are valid inputs in cookies created by your Web application. Therefore, you will need to evaluate each of these signatures for your particular Web application.

As mentioned earlier, a trivial regular expression to detect SQL injection attacks is to watch out for SQL specific meta-characters such as the single-quote (') or the double-dash (--).

In order to detect these characters and their hex equivalents, the following regular expression may be used:

Regex for detection of SQL meta-characters

/(\%27)|(\')|(\-\-)|(\%23)|(#)/ix

We first detect either the hex equivalent of the single-quote, the single-quote itself or the presence of the double-dash. These are SQL characters for MS SQL Server and Oracle, which denote the beginning of a comment, and everything that follows is ignored. Additionally, if you're using MySQL, you need to check for presence of the '#' or its hex-equivalent %23. Note that we do not need to check for the hex-equivalent of the double-dash, because it is not an HTML meta-character and will not be encoded by the browser. Also, if an attacker tries to manually modify the double-dash to its hex value of %2D, the SQL Injection attack fails.

The above signature could be additionally expanded to detect the occurrence of the semi-colon as well. However, the semi-colon has a tendency to occur as part of normal HTTP traffic so you have to use this form of regular expression with care to avoid false positives.

Modified regex for detection of SQL meta-characters

/((\%3D)|(=))[^\n]*((\%27)|(\')|(\-\-)|(\%3B)|(;))/i

This signature first looks out for the = sign or its hex equivalent %3D. It then allows for zero or more non-newline characters, and then it checks for the single-quote, the double-dash or the semi-colon.

A typical SQL injection attempt of course revolves around the use of the single quote to manipulate the original query so that it always results in a true value. Most of the examples that discuss this attack use the string 1' or '1'='1. However, detection of this string can be easily evaded by supplying a value such as 1' or 2 > 1--. Thus the only part that is constant in this is the initial alphanumeric value, followed by a single-quote, and then followed by the word 'or'.

The Boolean logic that comes after this may be varied to an extent where a generic pattern is either very complex or does not cover all the variants. Thus these attacks can be detected to a fair degree of accuracy by using the next regular expression below.

Regex for typical SQL Injection attack

/\w*((\%27)|(\'))((\%6F)|o|(\%4F))((\%72)|r|(\%52))/ix

\w* - zero or more alphanumeric or underscore characters
(\%27)|\' - the ubiquitous single-quote or its hex equivalent
(\%6F)|o|(\%4F))((\%72)|r|(\%52) - the word 'or' with various combinations of its upper and lower case hex equivalents.

The use of the UNION SQL query is also common in SQL Injection attacks against a variety of databases. If the earlier regular expression that just detects the single-quote or other SQL meta characters results in too many false positives, you could further modify the query to specifically check for the single-quote and the keyword UNION. This can also be further extended to other SQL keywords such as SELECT, INSERT, UPDATE, DELETE, etc...

Regex for detecting SQL Injection with the UNION keyword

/((\%27)|(\'))union/ix

(\%27)|(\') - the single-quote and its hex equivalent
union - the keyword union

Similar expressions can be written for other SQL queries such as SELECT, INSERT, UPDATE, DROP, and so on.

Regex for detecting SQL Injection attacks on a MS SQL Server

/exec(\s|\+)+(s|x)p\w+/ix

If, by this stage, the attacker has discovered that the Web application is vulnerable to SQL injection, he will try to exploit it. If he realizes that the back-end database is on an MS SQL server, he will typically try to execute one of the many dangerous stored and extended stored procedures. These procedures start with the letters 'sp' or 'xp' respectively. Typically, he would try to execute the 'xp_cmdshell' extended procedure, which allows the execution of Windows shell commands through the SQL Server. The access rights with which these commands will be executed are those of the account with which SQL Server is running -- usually Local System. Alternatively, he may also try and modify the registry using procedures such as xp_regread, xp_regwrite, etc...

In the above regexp, we have

exec - the keyword required to run the stored or extended procedure
(\s|\+)+ - one or more whitespaces or their HTTP encoded equivalents
(s|x)p - the letters 'sp' or 'xp' to identify stored or extended procedures respectively
\w+ - one or more alphanumeric or underscore characters to complete the name of the procedure

References and further readings

CIAC Tech06-001: Protecting Against SQL Injection Attacks
SQL Injection http://www.spidynamics.com/papers/SQLInjectionWhitePaper.pdf
Perl-compatible regular expressions (pcre) http://www.pcre.org
Advanced SQL Injection http://www.nextgenss.com/papers/advanced_sql_injection.pdf
SQL Injection Walkthrough http://www.securiteam.com/securityreviews/5DP0N1P76E.html
Detection of SQL Injection and Cross-site Scripting Attacks http://www.securityfocus.com/infocus/1768
java.sql Interface PreparedStatement http://java.sun.com/j2se/1.3/docs/api/java/sql/PreparedStatement.html