SQL Injection attacks on Python code

rob101 · (This post was last modified: Oct-11-2022, 03:20 PM by rob101.)

I've been researching sql injection attacks (partly because I want to understand the attack, but mostly because it's cropped up in a few recent threads) and from what I can gather, the danger (for want of a better noun) seems to be from user inputs and as such a function to satanize said input would mitigate the danger and do away with all of this messing about with place holders in the cursor.execute() command, no?

What am I missing, if anything?

**deanhystad** · Oct-11-2022, 03:57 PM

The "messing about" with placeholders is how you call the built in sanitizer function. It also does nice things like letting you to use SQL keywords in strings that you save in your database. Imagine a product database where the product description could not use all, any, as, from, in, ...

rob101 · Oct-11-2022, 04:12 PM

(Oct-11-2022, 03:57 PM)deanhystad Wrote: Imagine a product database where the product description could not use all, any, as, from, in, ...

Thanks for the reply.

Is it not the case that strings being returned from a database (as would be the case for a product description) are not the issue. If it's a search term that's being sent to the database, then where's the harm? It's easy to exclude * or ' or '''' in fact anything that's not a-z, A-Z or 0-9.

ibreeden · (This post was last modified: Oct-11-2022, 07:43 PM by ibreeden.)

This is the most famous joke about SQL injection.
It appears the school has an application to insert student data in the table "Students". Then this application appears to contain some code like:

insert_statement = f"insert into Students (id, class, surname, name) 
    values ({seqno}, '{classinput}', '{surnameinput}', '{nameinput}');"

This would produce a correct insert statement looking like this:

insert into Students (id, class, surname, name) 
    values (12345, '1A', 'Xkcd', 'Robert');

But when the data of this specific student is inserted, the following statement will be generated:

insert into Students (id, class, surname, name) 
    values (12346, '1A', 'Xkcd', 'Robert'); DROP TABLE Students; -- ');

Most databases interpret the semicolon as "execute the previous" and then continue with the next. Also "--" is an introducer for a line comment (it is in Oracle) and this prevents the spurious "');" to produce an error. So the complete table is dropped without an error.

For all these troubles is a very simple solution. The application should instead contain some code like this:

insert_statement = "insert into Students (id, class, surname, name) 
    values (?, ?, ?, ?);"
...
db.execute(insert_statement, (seqno, classinput, surnameinput, nameinput))

Now "name" will only be interpreted as a value and will not execute any "injected SQL". This solves the problem. I agree with Deanhystad that this is the most simple and elegant way to make your application safe.

woooee · (This post was last modified: Oct-11-2022, 05:51 PM by woooee.)

Use named placeholders https://blog.devgenius.io/til-python-sql...d10338cb6b
https://bobby-tables.com/python

rob101 · Oct-11-2022, 06:20 PM

Again, my thanks, to you both for the replies.

My point is that if any user input is sanitized, before it's allowed any near the rest of the script, then it will mitigate any injection attacks.

As a simple example (which could be coded in different and maybe better ways) maybe this could do the job:

def check_input(user_input):
    chars = ['_', '@','.', ' '] # add any allowable corner case characters
    invalid = []
    for char in range(48,58):
        chars.append(chr(char))
    for char in range(97,123):
        chars.append(chr(char))

    for char in user_input:
        if char not in chars:
            invalid.append(char)
    if invalid:
        return False, invalid
    else:
        return user_input

user_input = ''
while not user_input:
    user_input = check_input(input("User name:> ").lower())
    if user_input[0] == False:
        print("Invalid characters:")
        print(*user_input[1])
        print(f"Invalid user name. Please try again.\n")
        user_input = ''

print(f"User name {user_input} valid")

So, only if the user input gets past that routine, would it be allowed to be processed by a database routine.

The focus here should not be on the code that I've posted, but my original point: the danger comes from the user input; sanitize that and you mitigate any injection attack.

jefsummers · Oct-11-2022, 07:11 PM

I think you have it. The danger is indeed from the user input, like many attacks (buffer overflow for example).
@ibreeden the picture did not come through but I am pretty sure you were referring to this famous XKCD cartoon. That rascal, little Bobby Tables...

https://xkcd.com/327/

rob101 · Oct-11-2022, 08:21 PM

(Oct-11-2022, 07:11 PM)jefsummers Wrote: I think you have it. The danger is indeed from the user input

Thank you.

As for the 'cartoon' (which is very cool) it does underline my point.

So, moving forward, maybe what we should be advising coders that post SQL based questions, is that the user input should be sanitized (which I think is very good general advise, regardless) or use place holders; no need to do both: doing both would be a 'belt and braces' approach, but somewhat overkill.

bowlofred · Oct-11-2022, 10:34 PM

Sanitization is hard. Anything short of perfect and you have a vulnerability (and/or you are risking mangling incoming data). What do you do when '*' or similar characters are valid input? Also it's difficult to prove after the fact that the job is sufficient. Placeholders/parameters are much simpler.

Not to say you shouldn't sanitize (that might well be necessary for all sorts of vulnerabilities beyond simple SQL injection). But relying on it when there are better alternatives isn't a good idea.

Don't write your own encryption. Don't rely on sanitization to avoid SQL injection. A (non python oriented) take on this is at https://kevinsmith.io/sanitize-your-inputs/.

See also https://cheatsheetseries.owasp.org/cheat...Sheet.html

Quote:Defense Option 1: Prepared Statements (with Parameterized Queries)

The use of prepared statements with variable binding (aka parameterized queries) is how all developers should first be taught how to write database queries. They are simple to write, and easier to understand than dynamic queries. Parameterized queries force the developer to first define all the SQL code, and then pass in each parameter to the query later. This coding style allows the database to distinguish between code and data, regardless of what user input is supplied.

Prepared statements ensure that an attacker is not able to change the intent of a query, even if SQL commands are inserted by an attacker.

rob101 · (This post was last modified: Oct-12-2022, 06:41 AM by rob101.)

Thanks for your input.

The links you provide and the comments you make are interesting.

I'm not trying to say that there's a 'one size fits all' for user input checks. What I am saying is that even basic user input checks are not so hard to do. As an example, for a user name:

user_name=''
while not user_name:
    user_name = input("User name:> ").lower().strip()
    if ' ' in user_name:
        print("Error: no space allowed.\n")
        user_name = ''

print(f"\nUser name {user_name} valid.")

... would make the Bobby Tables attack impossible as well as the likes of john' OR 'a'='a';--, which is another attack that I've seen mentioned (I forget where).

Kevin Smith puts it very succinctly with: "... using prepared statements is the only way to reliably protect your database from SQL injection attacks without the risk of mangling incoming data." and I see the same sentiment quoted from the Cheat Sheet. I think that roughly translates to: one should never allow ones Python code to directly use any user input in a cursor.execute() command. If we do that, why do we need the place holders?

I'm not simply playing devil's advocate here; I want to understand the issue, as a matter of learning and passing on good advice, rather than iterating advice that I've been given and don't fully understand.

Possibly Related Threads…
Thread		Author	Replies	Views	Last Post
	Better Understanding of Security and injection attacks SQLite	KevinBrown	1	2,176	Apr-09-2019, 09:45 PM Last Post: Legomancer
	Sql Injection using python	sumandas89	2	3,428	Jan-15-2018, 03:04 PM Last Post: metulburr

SQL Injection attacks on Python code

User Panel Messages

Announcements