
Responsible SQL: How to Authenticate Users

James Socol

Nov 9, 2008 • 4 min read

Most SQL-injection articles set a horrible example for young programmers.

Here is a very typical “bad example” of why you need to escape user data before it goes into SQL queries:

(ed. The symbol « is a line break that’s not in the real code.)

    $username = $_POST['username']; // username=admin
    $password = $_POST['password']; // password=' OR 1=1; -- '
    $user = $db->query("SELECT * FROM users WHERE «
        username='$username' AND «
        password='$password' LIMIT 1;");

The point, of course, is that you must sanitize your user input, or else this person would run this query:

    $user = $db->query("SELECT * FROM users WHERE «
        username='admin' AND «
        password='' OR 1=1; -- ' LIMIT 1;");

Which grants the sneaky user all your admin privileges. Other versions have nefarious users dropping your users or articles tables.

The problem is: this is the wrong way to authenticate users. These examples are written for beginners to understand the importance of sanitizing input, but they also provide a model to those beginners for how user authentication works. And it’s a very bad model.


The only upside to authenticating this way is that you don’t expose any information on failure: if I’m trying to hijack someone’s account, I can’t tell the difference between an invalid username and a valid username with a bad password. That’s good, but there are good reasons not to do this at the database level.

The “correct” way is not much more complex. Basically:

1. Look up the record with the username only.
2. Get the (hashed) password out of the database.
3. Hash the submitted password.
4. Compare the two hashes.

This is really not very hard to implement. In PHP:

    /**
     * Check a password against the database
     * @param string $username The username to check
     * @param string $password The (supposed) password
     * @return int 0=success, 1=bad username, 2=bad password
     */
    function check_password($username, $password) {
        $db = new mysqli(); // we need to talk to the DB

        // the real_escape_string() function is much better
        // than addslashes() for escaping MySQL database input
        $_username = $db->real_escape_string($username);

        // I try to make my SQL queries as easy to read
        // as possible. (Not always very easy.)
        $result = $db->query("SELECT password "
            ."FROM users "
            ."WHERE username = '{$_username}' "
            ."LIMIT 1;");

        // we're assuming the query ran correctly

        // if we can't return a row, then there's no user with
        // that name
        if (!($user = $result->fetch_assoc())) {
            return 1; // return code for bad username
        }

        // now, assuming the password was hashed with crypt()
        // (strict comparison, so the two strings must match exactly)
        if ($user['password'] !== crypt($password, $user['password'])) {
            return 2; // return code for bad password
        }

        return 0; // return code for success
    }

What’s going on here? Basically, we’re looking up the user by the username. If we don’t find a user, we return an error code. If we do find a user, we hash the password they supplied, using the salt from the stored hash, and check it against the hashed password we already have. If they don’t match, we return a different error code. If they do, the user is allowed to log in.

There are two key differences between this method and the method so often espoused by tutorial writers:

1. This method stores a hashed password instead of plain text.
2. This method differentiates between bad usernames and bad passwords.

#1 should be obvious. Never store a plain-text password. It’s extremely dangerous: if someone ever gets a look at the table, they can just read the users’ passwords, which may well be the same as their bank passwords (no, they shouldn’t be, but they probably are). And it’s unnecessary. Every server-side language implements the MD5 hash, which is weak but better than nothing. Better options (like PHP’s crypt()) can use stronger salted algorithms like Blowfish, or at least MD5 with a random salt.
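To make that concrete, here is a minimal sketch of hashing and checking a password with crypt() and a random Blowfish salt. The helper names hash_new_password() and password_matches() are mine, not part of PHP, the cost of 10 is an arbitrary choice, and the sketch assumes PHP 7 or later for random_bytes().

```php
<?php
// Sketch only: both function names below are hypothetical helpers.

/**
 * Hash a new password with crypt() and a random Blowfish salt.
 */
function hash_new_password(string $password): string
{
    // "$2y$" selects Blowfish and "10" is the cost. The salt needs 22
    // characters from ./0-9A-Za-z; hex digits are a subset of that.
    $salt = '$2y$10$' . bin2hex(random_bytes(11));
    return crypt($password, $salt);
}

/**
 * Check a submitted password against a hash made by hash_new_password().
 */
function password_matches(string $password, string $stored_hash): bool
{
    // crypt() reads the algorithm, cost and salt back out of the stored
    // hash, so hashing the candidate with it must reproduce the same string.
    // hash_equals() compares the two without leaking timing information.
    return hash_equals($stored_hash, crypt($password, $stored_hash));
}
```

The stored string carries its own salt, so the users table needs only one column for it. On newer PHP versions, password_hash() and password_verify() wrap the same idea and are probably the easier choice.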

But wait, #2, I said it was better not to distinguish between a bad username and a bad password, right? Well… yes, to the end user. In either case, I should display a message like “Bad username or password” to the person who tried to log in.

Internally, however, I want to know what happened. Is someone targeting known users, or just trying random combinations? How did they find real usernames? Where should I be improving security?
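One way to keep that split between what I know and what the user sees is sketched below. It assumes a check that returns 0, 1 or 2 as described above; describe_auth_result() is a hypothetical helper and the log wording is made up for illustration.

```php
<?php
// Hypothetical helper, not part of any library: maps an authentication
// return code to a detailed log line and a vague message for the user.
function describe_auth_result(int $code, string $username): array
{
    // Escape control characters so a crafted username can't forge log lines.
    $safe = addcslashes($username, "\0..\37");
    $vague = 'Bad username or password.';

    switch ($code) {
        case 0:
            return ['log' => "login ok: {$safe}", 'user' => ''];
        case 1:
            return ['log' => "unknown username: {$safe}", 'user' => $vague];
        case 2:
            return ['log' => "bad password for: {$safe}", 'user' => $vague];
        default:
            return ['log' => "unexpected auth code {$code}: {$safe}", 'user' => $vague];
    }
}
```

The 'log' entry would go to something like error_log(), and only the 'user' entry would ever be shown on the login form.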

You’re also minimizing the number of user-submitted strings that get sent to the database. There are fewer opportunities for you to accidentally allow an injection attack. If you have a policy on username syntax, you can keep yourself even safer by not talking to the database if the username is bad:

(I’ve omitted logging or real error-handling here. In a live version, I would probably wrap most of this in a [try](http://us2.php.net/manual/en/language.exceptions.php) block, throw one of three types of exceptions, and do some logging in the catch block.)

    // Usernames must start with a letter, and contain
    // only letters, numbers, underscores and dots, but
    // must not end with a dot or underscore.
    // The pattern is anchored (^ ... $ with the D modifier) so the
    // whole string has to match, not just part of it.
    $user_regex = '/^[a-zA-Z][a-zA-Z0-9_.]*[a-zA-Z0-9]$/D';

    if (preg_match($user_regex, $username)) {
        // the username matches our allowed syntax
        $auth = check_password($username, $password);

        if ($auth === 0) {
            // the do_login() function is an exercise
            // to the reader (assumed here to end the request,
            // e.g. by redirecting)
            do_login($username);
        }
    }

    // the username was bad, or the username/password
    // was wrong
    // die() is an overly simplistic choice, here.
    die("Bad username or password.");

Obviously we still escape the username, to make damn sure, but this gives us another place to get information. Did someone actually enter `'; DROP TABLE users; --` into our login form, or did they just mistype their password?
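A quick way to sanity-check a username policy like this is to run it against a few inputs. The sketch below assumes the pattern is anchored at both ends so the whole string has to match; is_valid_username() is a hypothetical name and the sample usernames are invented.

```php
<?php
// Hypothetical helper wrapping the username policy described above.
function is_valid_username(string $username): bool
{
    // ^ and $ anchor the match to the whole string; the D modifier stops
    // $ from also matching just before a trailing newline.
    $user_regex = '/^[a-zA-Z][a-zA-Z0-9_.]*[a-zA-Z0-9]$/D';
    return preg_match($user_regex, $username) === 1;
}

// Print each sample next to the verdict the policy gives it.
$samples = ['admin', 'a.b_c1', 'bob.', '9lives', "'; DROP TABLE users; --"];
foreach ($samples as $name) {
    echo str_pad($name, 26), is_valid_username($name) ? 'ok' : 'rejected', "\n";
}
```

As written, the policy needs at least two characters, so a one-letter username is rejected too. Anything that fails here never reaches the database, and is worth a log entry of its own.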

I’m going to end with a request: if you’re about to write a tutorial for beginners, please be aware of what you’re modeling in your examples. If you’re doing something you would never do, for the sake of simplicity or because it’s not the focus of the tutorial, point that out. Link to another tutorial or at least mention that it’s a bad way to do something.

Don’t send a quiet message that wrong is OK.
