The problem
We need to defend against a 'WAITFOR DELAY' sql injection attack in our java application.
Background
[This is long. Skip to 'Solution?' section below if you're in a rush ]
Our application mostly uses prepared statements and callable statements (stored procedures) in accessing the database.
In a few places we dynamically build-and-execute queries for selection. In this paradigm we use a criteria object to build the query depending on the user-input criteria. For example, if the user specified values for first_name and last_name, the result querying always looks something like this:
SELECT first_name,last_name FROM MEMBER WHERE first_name ='joe' AND last_name='frazier'
(In this example the user would have specified "joe" and "frazier" as his/her input values. If the user had more or less critieria we would have longer or shorter queries. We have found that this approach is easier than using prepared statements and quicker/more performant than stored procedures).
The attack
A vulnerability audit reported an sql injection failure. The attacker injected the value 'frazier WAITFOR DELAY '00:00:20' for the 'last_name' parameter, resulting in this sql:
SELECT first_name,last_name FROM MEMBER WHERE first_name ='joe' AND last_name='frazier' WAITFOR DELAY '00:00:20'
The result: the query executes successfully, but takes 20 seconds to execute. An attacker could tie up all your database connections in the db pool and effectively shut down your site.
Some observations about this 'WAITFOR DELAY' attack
I had thought that because we used Statement executeQuery(String) we would be safe from sql injection. executeQuery(String) will not execute DML or DDL (deletes or drops). And executeQuery(String) chokes on semi-colons, thus the 'Bobby Tables' paradigm will fail (i.e. user enters 'frazier; DROP TABLE member' for a parameter. See. http://xkcd.com/327/)
The 'WAITFOR' attack differs in one important respect: WAITFOR modifies the existing 'SELECT' command, and is not a separate command.
The attack only works on the 'last parameter' in the resulting query. i.e. 'WAITFOR' must occur at the very end of the sql statement
Solution, Cheap Hack, or Both?
The most obvious solution entails simply tacking "AND 1=1" onto the where clause.
The resulting sql fails immediately and foils the attacker:
SELECT first_name,last_name FROM MEMBER WHERE first_name ='joe' AND last_name='frazier' WAITFOR DELAY '00:00:20' AND 1=1
The Questions
Is this a viable solution for the WAITFOR attack?
Does it defend against other similar vulnerabilities?
I think the best option would entail using prepared statements. More work, but less vulnerable.
The correct way to handle SQL injection is to use parameterized queries. Everything else is just pissing in the wind. It might work one time, even twice, but eventually you'll get hit by that warm feeling that says "you screwed up, badly!"
Whatever you do, except parameterized queries, is going to be sub-optimal, and it will be up to you to ensure your solution doesn't have other holes that you need to patch.
Parameterized queries, on the other hand, works out of the box, and prevents all of these attacks.
SQL injection is SQL injection - there's nothing special about a WAITFOR DELAY.
There is absolutely no excuse for not using prepared statements for such a simple query in this day and age.
(Edit: Okay, not "absolutely" - but there's almost never an excuse)
I think you suggested the solution yourself: Parameterized Queries.
How did you find that your dynamically built query is quicker than using a stored procedure? In general it is often the opposite.
To answer all your questions:
Is this a viable solution for the WAITFOR attack?
No. Just add -- to the attack string and it will ignore your fix.
Does it defend against other similar vulnerabilities?
No. See above.
I think the best option would entail using prepared statements. More work, but less vulnerable.
Yes. You don't fix SQL injection yourself. You use what's already existing and you use it right, that is, by parametrizing any dynamic part of your query.
Another lesser solution is to escape any string that is going to get inserted in your query, however, you will forget one one day and you only need one to get attacked.
Everyone else has nailed this (parameterize!) but just to touch on a couple of points here:
Is this a viable solution for the WAITFOR attack?
Does it defend against other similar vulnerabilities?
No, it doesn't. The WAITFOR trick is most likely just being used for 'sniffing' for the vulnerability; once they've found the vulnerable page, there's a lot more they can do without DDL or (the non-SELECT parts of) DML. For example, think about if they passed the following as last_name
' UNION ALL SELECT username, password FROM adminusers WHERE 'A'='A
Even after you add the AND 1 = 1, you're still hosed. In most databases, there's a lot of malicious things you can do with just SELECT access...
How about follow xkcd and sanitize the input. You could check for reserved words in general and for WAITFOR in particular.
Related
Is it possible to capture the query text of the entire RPC SELECT statement from within the view which it calls? I am developing an enterprise database in SQL Server 2014 which must support a commercial application that submits a complex SELECT statement in which the outer WHERE clause is critical. This statement also includes subqueries against the same object (in my case, a view) which do NOT include that condition. This creates a huge problem because the calling application is joining those subqueries to the filtered result on another field and thus producing ID duplication that throws errors elsewhere.
The calling application assumes it is querying a table (not a view) and it can only be configured to use a simple WHERE clause. My task is to implement a more sophisticated security model behind this rather naive application. I can't re-write the offending query but I had hoped to retrieve the missing information from the cached query plan. Here's a super-simplified psuedo-view of my proposed solution:
CREATE VIEW schema.important_data AS
WITH a AS (SELECT special_function() AS condition),
b AS (SELECT c.criteria
FROM a, lookup_table AS c
WHERE a.condition IS NULL OR c.criteria = a.condition)
SELECT d.field_1, d.field_2, d.filter_field
FROM b, underlying_table AS d
WHERE d.filter_field = b.criteria;
My "special function" reads the query plan for the RPC, extracts the WHERE condition and preemptively filters the view so it always returns only what it should. Unfortunately, query plans don't seem to be cached until after they are executed. If the RPC is made several times, my solution works perfectly -- but only on the second and subsequent calls.
I am currently using dm_exec_cached_plans, dm_exec_query_stats and dm_exec_sql_text to obtain the full text of the RPC by searching on spid, plan creation time and specific words. It would be so much better if I could somehow get the currently executing plan. I believe dm_exec_requests does that but it only returns the current statement -- which is just the definition of my function.
Extended Events look promising but unfamiliar and there is a lot to digest. I haven't found any guidance, either, on whether they are appropriate for this particular challenge, whether a notification can be obtained in real time or how to ensure that the Event Session is always running. I am pursuing this investigation now and would appreciate any suggestions or advice. Is there anything else I can do?
This turns out to be an untenable idea and I have devised a less elegant work-around by restructuring the view itself. There is a performance penalty but the downstream error is avoided. My fundamental problem is the way the client application generates its SQL statements and there is nothing I do about that -- so, users will just have to accept whatever limitations may result.
I have a simple SELECT statement with a couple columns referenced in the WHERE clause. Normally I do these simple ones in the VB code (setup a Command object, set Command Type to text, set Command Text to the Select statement). However I'm seeing timeout problems. We've optimized just about everything we can with our tables, etc.
I'm wondering if there'd be a big performance hit just because I'm doing the query this way, versus creating a simple stored procedure with a couple params. I'm thinking maybe the inline code forces SQL to do extra work compiling, creating query plan, etc. which wouldn't occur if I used a stored procedure.
An example of the actual SQL being run:
SELECT TOP 1 * FROM MyTable WHERE Field1 = #Field1 ORDER BY ID DESC
A well formed "inline" or "ad-hoc" SQL query - if properly used with parameters - is just as good as a stored procedure.
But this is absolutely crucial: you must use properly parametrized queries! If you don't - if you concatenate together your SQL for each request - then you don't benefit from these points...
Just like with a stored procedure, upon first executing, a query execution plan must be found - and then that execution plan is cached in the plan cache - just like with a stored procedure.
That query plan is reused over and over again, if you call your inline parametrized SQL statement multiple times - and the "inline" SQL query plan is subject to the same cache eviction policies as the execution plan of a stored procedure.
Just from that point of view - if you really use properly parametrized queries - there's no performance benefit for a stored procedure.
Stored procedures have other benefits (like being a "security boundary" etc.), but just raw performance isn't one of their major plus points.
It is true that the db has to do the extra work you mention, but that should not result in a big performance hit (unless you are running the query very, very frequently..)
Use sql profiler to see what is actually getting sent to the server. Use activity monitor to see if there are other queries blocking yours.
Your query couldn't be simpler. Is Field1 indexed? As others have said, there is no performance hit associated with "ad-hoc" queries.
For where to put your queries, this is one of the oldest debates in tech. I would argue that your requests "belong" to your application. They will be versionned with your app, tested with your app and should disappear when your app disappears. Putting them anywhere other than in your app is walking into a world of pain. But for goodness sake, use .sql files, compiled as embedded resources.
Select statement which is part of form clause of any
another statement is called as inline query.
Cannot take parameters.
Not a database object
Procedure:
Can take paramters
Database object
can be used globally if same action needs to be performed.
Does cfquery becomes a prepared statement as long as there's 1 cfqueryparam? Or are there other conditions?
What happen when the ORDER BY clause or FROM clause is dynamic? Would every unique combination becomes a prepared statement?
And what happen when we're doing cfloop with INSERT, with every value cfqueryparam'ed, and invoke the cfquery with different number of iterations?
Any potential problems with too many prepared statements?
How does DB handle prepared statement? Will they be converted into something similar to store procedure?
Under what circumstances should we Not use prepared statement?
Thank you!
I can answer some parts of your question:
a query will become a preparedStatement as long as there is one <queryparam. I have in the past added a
where 1 = <cfqueryparam value="1" to queries which didn't have any dynamic parameters, in order to get them run as preparedStatements
Most DBs handle preparedStarements similarly to Stored Procedures, just held temporarily, rather than long-term, however the details are likely to be DB-specific.
Assuming you are using the drivers supplied with ColdFusion, if you turn on the 'Log Activity' checkbox in the advanced panel of the DataSource setup, then you'll get very detailed information about how CF is interacting with he DB and when it is creating a new preparedStatement and when it is re-using them. I'd recommend trying this out for yourself, as so many factors are involved (DB setup, Driver, CF version etc). If you do use the DB logging, re-start CF before running your test code, so you can see it creating the prepared statements, otherwise you'll just see it re-using statements by ID, without seeing what those statements are.
In addition, if you are asking about execution plans then there is more involved than just the number PreparedStatement's generated. It is a huge topic and very database dependent. I do not have a DBA's grasp on it, but I can answer a few of the questions about MS SQL.
What happen when the ORDER BY clause or FROM clause is dynamic? Would
every unique combination becomes a prepared statement?
The base sql is different. So you will end up with separate execution plans for each unique ORDER BY clause.
And what happen when we're doing cfloop with INSERT, with every value
cfqueryparam'ed, and invoke the cfquery with different number of
iterations?
MS SQL should reuse the same plan for all iterations because only the parameters change.
The sys.dm_exec_cached_plans view is very useful for seeing what plans are cached and how often they are reused.
SELECT p.usecounts, p.cacheobjtype, p.objtype, t.text
FROM sys.dm_exec_cached_plans p
CROSS APPLY sys.dm_exec_sql_text( p.plan_handle) t
ORDER BY p.usecounts DESC
To clear the cache first, use DBCC FLUSHPROCINDB. Obviously do not use it on a production server.
DECLARE #ID int
SET #ID = DB_ID(N'YourTestDatabaseName')
DBCC FLUSHPROCINDB( #ID )
Does anyone know of a way to verify the correctness of the queries in all stored procedures in a database?
I'm thinking of the scenario where if you modify something in a code file, simply doing a rebuild would show you compilation errors that point you to places where you need to fix things. In a database scenario, say if you modify a table and remove a column which is used in a stored procedure you won't know anything about this problem until the first time that procedure would run.
What you describe is what unit testing is for. Stored procedures and functions often require parameters to be set, and if the stored procedure or function encapsulates dynamic SQL--there's a chance that a [corner] case is missed.
Also, all you mention is checking for basic errors--nothing about validating the data returned. For example - I can change the precision on a numeric column...
This also gets into the basic testing that should occur for the immediate issue, and regression testing to ensure there aren't unforeseen issues.
You could create all of your objects with SCHEMABINDING, which would prevent you from changing any underlying tables without dropping and recreating the views and procedures built on top of them.
Depending on your development process, this could be pretty cumbersome. I offer it as a solution though, because if you want to ensure the correctness of all procedures in the db, this would do it.
I found this example on MSDN (SQL Server 2012). I guess it can be used in some scenarios:
USE AdventureWorks2012;
GO
SELECT p.name, r.*
FROM sys.procedures AS p
CROSS APPLY sys.dm_exec_describe_first_result_set_for_object(p.object_id, 0) AS r;
Source: sys.dm_exec_describe_first_result_set_for_object
I'm building my own clone of http://statoverflow.com/sandbox (using the free controls provided to 10K users from Telerik). I have a proof of concept available I can use locally, but before I open it up to others I need to lock it down some more. Currently I run everything through a stored procedure that looks something like this:
CREATE PROCEDURE WebQuery
#QueryText nvarchar(1000)
AS
BEGIN
-- no writes, so no need to lock on select
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
-- throttles
SET ROWCOUNT 500
SET QUERY_GOVERNOR_COST_LIMIT 500
exec (#QueryText)
END
I need to do two things yet:
Replace QUERY_GOVERNOR_COST_LIMIT with an actual rather than estimated timeout, so no query runs longer than say 2 minutes.
Right now nothing stops users from just putting their own 'SET ROWCOUNT 50000;' in front of their query text to override my restriction, so I need to somehow limit the queries to a single statement or (preferrably) disallow the SET commands inside the exec function.
Any ideas?
You really plan to allow users to run arbitrary Ad-Hoc SQL? Only then can a user place in a SET to override your restrictions. If that's the case, you're best bet is to do some basic parsing using lexx/yacc or flex/bison (or your favorite CLR language tree parser) and detect invalid SET statements. Are you going to allow SET #variable=value though, which syntactically is a SET...
If you impersonate low privileged users via EXECUTE AS make sure you create an irreversible impersonation context, so the user does not simply execute REVERT and regain all the privileges :) You also must really understand the implications of database impersonation, make sure you read Extending Database Impersonation by Using EXECUTE AS.
Another thing to consider is deffering execution of requests to a queue. Since queue readers can be calibrated via MAX_QUEUE_READERS, you get a very cheap throttling. See Asynchronous procedure execution for a related article how to use queues to execute batches. This mechanism is different from resource governance, but I've seen it used to more effect that the governor itself.
Throwing this out there:
The EXEC statement appears to support impersonation. See http://msdn.microsoft.com/en-us/library/ms188332.aspx. Perhaps you can impersonate a limited user. I am looking into the availability of limitations that may prevent SET statements and the like.
On a very basic level, how about blocking any statement that doesn't start with SELECT? Or will other query starts be supported, like CTE's or DECLARE statements? 1000 chars isn't too much room to play with, but i'm not too clear what this is in the first place.
UPDATED
Ok, how about prefixing whatever they submit with SELECT TOP 500 FROM (
and appending a ). If they try to do multiple statements it'll throw an error you can catch. And to prevent denial of service, replace their starting SELECT with another SELECT TOP 500.
Doesn't help if they've appended an ORDER BY to something returning a million rows, though.