Can a prepared statement still produce a database error even though it prevents SQL injection? - prepared-statement

If we use a secure method for executing queries, such as a prepared statement or parameterized query, to prevent SQL injection attacks, is there any guarantee that no database error will occur while executing it?
For example, could sending an invalid parameter for a given type when inserting a record cause an error instead of the default value being used? Can you give an example?

One example where parameterised queries don't prevent database errors is when a user provides input too long to fit in the database type, such as a 100,000-word essay in a username field limited to 50 characters.
Also, parameterised queries won't protect against duplicate primary key errors if, say, a username requested on a registration form is already taken.
Parameterised queries only ensure the values aren't interpreted as SQL; they won't prevent any other form of error (such as the SQL statement failing because the database is out of disk space).
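To make this concrete, here is a minimal T-SQL sketch (the Member table and the column sizes are invented for the example) in which every statement is fully parameterised, yet two of them still fail at run time:

-- Hypothetical table: username limited to 50 characters.
CREATE TABLE Member (username varchar(50) NOT NULL PRIMARY KEY);

INSERT INTO Member (username) VALUES ('joe');

-- Parameterised, so injection is impossible, yet this fails at run time
-- with "String or binary data would be truncated":
EXEC sp_executesql
    N'INSERT INTO Member (username) VALUES (@u);',
    N'@u varchar(200)',
    @u = REPLICATE('x', 200);

-- Also parameterised, yet this fails with a PRIMARY KEY violation:
EXEC sp_executesql
    N'INSERT INTO Member (username) VALUES (@u);',
    N'@u varchar(50)',
    @u = 'joe';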

This depends on the database you're using, but yes, in general: the conversions that happen depending on types and such are often implicit, so you could very well pass in strings that fail at run time to convert to a proper value.
As far as SQL injection attacks go, as long as you don't expect parameters to contain SQL which you then invoke dynamically, you're safe.

No, there is no guarantee that some runtime error (think of a trigger, if nothing else) won't go off.

Related

Why don't prepared statements allow field and table names as parameters?

DBMSes only allow values as parameters for prepared statements. However, table, column, and field names are not allowed with prepared statements. For example:
String sql = "Select * from TABLE1 order by ?";
PreparedStatement st = conn.prepareStatement(sql);
st.setString(1, "column_name_1");
Such a statement is not allowed. What is the reason for DBMSes not implementing field names in prepared statements?
There are basically two reasons that I am aware of:
Although details vary per database system, conceptually, when a statement is prepared, it is compiled and checked for correctness (do all tables and columns exist). The server generates a plan which describes which tables to access, which fields to retrieve, indexes to use, etc.
This means that at prepare time the database system must know which tables and fields it needs to access, so parameterization of tables and fields is not possible. And even if it were technically possible, it would be inefficient, because statement compilation would need to be deferred until execution, essentially throwing away one of the primary reasons for using prepared statements: reusable query plans for better performance.
And consider this: if table names and field names are allowed to be parameterized, why not function names, query fragments, etc.?
Not allowing parameterization of objects prevents ambiguity. For example, in a query with where column1 = ?, if you set the parameter to peter, would that be a column name or a string value? That is hard to decide, and preventing that ambiguity would make the API harder to use, while the use case for even allowing such parameterization is almost non-existent (and in my experience, the need for such parameterization almost always stems from bad database design).
Allowing parameterization of objects is almost equivalent to just dynamically generating the query and executing it (see also point 1), so why not forgo the additional complexity and disallow parameterization of objects instead?
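When an identifier genuinely must vary, the usual pattern is to validate it in your own code and build the SQL dynamically, keeping the values as real parameters. A sketch of that pattern in T-SQL (the table, column names, and error number are illustrative; THROW requires SQL Server 2012+):

DECLARE @sortColumn sysname = N'column_name_1';  -- value supplied by the application

-- Validate against an explicit whitelist; never trust the identifier itself.
IF @sortColumn NOT IN (N'column_name_1', N'column_name_2')
    THROW 50000, 'Invalid sort column.', 1;

-- QUOTENAME guards against bracket-escaping tricks in the identifier;
-- the rest of the statement contains no user input at all.
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM TABLE1 ORDER BY ' + QUOTENAME(@sortColumn) + N';';

EXEC sp_executesql @sql;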

Should local variables for constants be avoided in stored procedures?

When I write SQL I try to make it as readable as possible. Among other things I often declare "constants" instead of using "magic numbers".
i.e. instead of
WHERE [Order].OrderType = 3
I do
DECLARE @OrderType_Cash AS int = 3;
...
WHERE [Order].OrderType = @OrderType_Cash
This works fine and I have not noticed any performance issues for the size of queries and data I normally work with.
Recently I read an article about parameter sniffing and workarounds (https://blogs.msdn.microsoft.com/turgays/2013/09/10/parameter-sniffing-problem-and-possible-workarounds/). In the article, one of the workarounds presented is "use local variables".
Workaround: use a local variable. This workaround is very similar to the previous one (OPTION (OPTIMIZE FOR (@VARIABLE UNKNOWN))): when you assign the parameters to local variables, SQL Server uses statistics densities instead of statistics histograms, so it estimates the same number of records for all parameters. The disadvantage is that some queries will use suboptimal plans, because the densities are not as precise as the statistics histogram.
This makes me a bit worried, since my interpretation is that I might get a suboptimal plan in my stored procedures just because I use a local variable instead of a "magic number".
I was also under the impression that SQL Server automatically converts "magic numbers" into variables in order to reuse plans.
Can someone clear this up for me?
Is there a difference between using a "magic number" and a local variable?
If yes, is it only in stored procedures or does it also apply to ad-hoc queries and dynamic SQL?
Is it a bad habit to use local variables like I do?
As documented in the article Statistics Used by the Query Optimizer in Microsoft SQL Server 2005:
If you use a local variable in a query predicate instead of a parameter or literal, the optimizer resorts to a reduced-quality estimate, or a guess for selectivity of the predicate. Use parameters or literals in the query instead of local variables.
Regarding your questions...
I was also under the impression that SQL Server automatically converts "magic numbers" into variables in order to reuse plans.
No, never. It can auto-parameterise ad-hoc queries, but parameters behave differently from variables and can be sniffed. By default it will only do this in very limited circumstances where it is "safe" and unlikely to introduce parameter sniffing issues.
Is there a difference between using a "magic number" and a local variable?
Yes. The statement is generally compiled before the variable value is even assigned. And even if the statement were subject to deferred compilation (or happened to be recompiled after the assignment), the values of variables are still never sniffed, except if you use OPTION (RECOMPILE). If you use the literal inline, SQL Server can look up that literal value in the histogram and potentially get much more accurate estimates, rather than resorting to guesses. Accurate row estimates are important in getting the correct overall plan shape (e.g. join type or access method selection) as well as getting an appropriate memory grant for your query.
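A small illustration of the difference, assuming the [Order] table from the question:

DECLARE @OrderType_Cash int = 3;

-- Variable: the optimizer guesses from the density vector (average selectivity).
SELECT * FROM [Order] WHERE OrderType = @OrderType_Cash;

-- Literal: the optimizer reads the histogram step for the value 3.
SELECT * FROM [Order] WHERE OrderType = 3;

-- OPTION (RECOMPILE): the variable's current value is sniffed,
-- at the cost of a fresh compilation on every execution.
SELECT * FROM [Order] WHERE OrderType = @OrderType_Cash OPTION (RECOMPILE);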
The book "SQL Server 2005 practical troubleshooting" has this to say on the issue.
In SQL Server 2005, statement level compilation allows for compilation of an individual statement in a stored procedure to be deferred until just before the first execution of the query. By then the local variable's value would be known. Theoretically SQL Server could take advantage of this to sniff local variable values in the same way that it sniffs parameters. However, because it was common to use local variables to defeat parameter sniffing in SQL Server 7.0 and SQL Server 2000, sniffing of local variables was not enabled in SQL Server 2005. It may be enabled in a future SQL Server release, though.
(NB: this has not in fact been enabled in any version to date.)
If yes, is it only in stored procedures or does it also apply to ad-hoc queries and dynamic SQL?
This applies to every use of variables. Parameters can be sniffed, though, so if you were to have a variable in the outer scope passed as a parameter in the inner scope, that would allow the variable's value to be sniffed.
Is it a bad habit to use local variables like I do?
If the plan is going to be sensitive to the exact variable value, then yes. There are certain places where it will be entirely innocuous, however.
The disadvantage of option (recompile) as a fix is that it recompiles the statement every time. This is unnecessary when the only reason for doing so is to get it to sniff a variable whose value is constant. The disadvantage of option (optimize for) with a specific literal value is that if the value changes you need to update all those references too.
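For comparison, the OPTION (OPTIMIZE FOR) variant looks like this; note that the literal now lives in two places that must be kept in sync:

DECLARE @OrderType_Cash int = 3;

SELECT *
FROM [Order]
WHERE OrderType = @OrderType_Cash
OPTION (OPTIMIZE FOR (@OrderType_Cash = 3));  -- must match the assignment above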
Another approach would be to create a view of Constants.
CREATE VIEW MyConstants
AS
SELECT 3 AS OrderTypeCash, 4 AS OrderTypeCard
Then, instead of using a variable at all for those, reference that instead.
WHERE [Order].OrderType = (SELECT OrderTypeCash FROM MyConstants)
This will allow the value to be resolved at compile time and only need to be updated in one place.
Alternatively, if you use SSDT and database projects, you could use a sqlcmd variable that is defined once and then replace all your T-SQL variable references with it. The code deployed to the server will still have "magic numbers", but in your source code it is a single sqlcmd variable. (NB: for this pattern you might need to create a stub procedure in the project and use the post-deployment script to actually alter it with the desired definition, performing the sqlcmd substitutions.)
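In sqlcmd mode, that pattern looks roughly like this (the variable name is illustrative):

:setvar OrderType_Cash 3

SELECT *
FROM [Order]
WHERE OrderType = $(OrderType_Cash);  -- reaches the server as: WHERE OrderType = 3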

Parse all stored procedures in a database

Does anyone know of a way to verify the correctness of the queries in all stored procedures in a database?
I'm thinking of the scenario where, if you modify something in a code file, simply doing a rebuild shows you compilation errors that point you to the places you need to fix. In a database scenario, if you modify a table and remove a column that is used in a stored procedure, you won't know anything about the problem until the first time that procedure runs.
What you describe is what unit testing is for. Stored procedures and functions often require parameters to be set, and if the stored procedure or function encapsulates dynamic SQL, there's a chance that a corner case is missed.
Also, all you mention is checking for basic errors, nothing about validating the data returned. For example, I can change the precision on a numeric column...
This also gets into the basic testing that should occur for the immediate issue, and regression testing to ensure there aren't unforeseen issues.
You could create all of your objects with SCHEMABINDING, which would prevent you from changing any underlying tables without dropping and recreating the views and procedures built on top of them.
Depending on your development process, this could be pretty cumbersome. I offer it as a solution though, because if you want to ensure the correctness of all procedures in the db, this would do it.
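A short sketch of the idea, using a hypothetical dbo.Member table:

CREATE TABLE dbo.Member (MemberId int PRIMARY KEY, username varchar(50));
GO
CREATE VIEW dbo.MemberNames
WITH SCHEMABINDING           -- binds the view to the schema of the referenced table
AS
SELECT username FROM dbo.Member;
GO
-- This now fails, because a schema-bound object depends on the column:
ALTER TABLE dbo.Member DROP COLUMN username;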
I found this example on MSDN (SQL Server 2012). I guess it can be used in some scenarios:
USE AdventureWorks2012;
GO
SELECT p.name, r.*
FROM sys.procedures AS p
CROSS APPLY sys.dm_exec_describe_first_result_set_for_object(p.object_id, 0) AS r;
Source: sys.dm_exec_describe_first_result_set_for_object

Defending against a 'WAITFOR DELAY' SQL injection attack?

The problem
We need to defend against a 'WAITFOR DELAY' SQL injection attack in our Java application.
Background
[This is long. Skip to the 'Solution, Cheap Hack, or Both?' section below if you're in a rush.]
Our application mostly uses prepared statements and callable statements (stored procedures) in accessing the database.
In a few places we dynamically build and execute queries for selection. In this paradigm we use a criteria object to build the query depending on the user-input criteria. For example, if the user specified values for first_name and last_name, the resulting query always looks something like this:
SELECT first_name,last_name FROM MEMBER WHERE first_name ='joe' AND last_name='frazier'
(In this example the user would have specified "joe" and "frazier" as his/her input values. If the user had more or fewer criteria, we would have longer or shorter queries. We have found that this approach is easier than using prepared statements and quicker/more performant than stored procedures.)
The attack
A vulnerability audit reported an SQL injection failure. The attacker injected the value frazier' WAITFOR DELAY '00:00:20 for the last_name parameter, resulting in this SQL:
SELECT first_name,last_name FROM MEMBER WHERE first_name ='joe' AND last_name='frazier' WAITFOR DELAY '00:00:20'
The result: the query executes successfully, but takes 20 seconds to execute. An attacker could tie up all your database connections in the db pool and effectively shut down your site.
Some observations about this 'WAITFOR DELAY' attack
I had thought that because we used Statement.executeQuery(String) we would be safe from SQL injection. executeQuery(String) will not execute DML or DDL (deletes or drops), and executeQuery(String) chokes on semicolons, so the 'Bobby Tables' paradigm will fail (i.e. a user enters 'frazier; DROP TABLE member' for a parameter; see http://xkcd.com/327/).
The 'WAITFOR' attack differs in one important respect: WAITFOR modifies the existing 'SELECT' command, and is not a separate command.
The attack only works on the 'last parameter' in the resulting query, i.e. 'WAITFOR' must occur at the very end of the SQL statement.
Solution, Cheap Hack, or Both?
The most obvious solution entails simply tacking "AND 1=1" onto the where clause.
The resulting sql fails immediately and foils the attacker:
SELECT first_name,last_name FROM MEMBER WHERE first_name ='joe' AND last_name='frazier' WAITFOR DELAY '00:00:20' AND 1=1
The Questions
Is this a viable solution for the WAITFOR attack?
Does it defend against other similar vulnerabilities?
I think the best option would entail using prepared statements. More work, but less vulnerable.
The correct way to handle SQL injection is to use parameterized queries. Everything else is just pissing in the wind. It might work one time, even twice, but eventually you'll get hit by that warm feeling that says "you screwed up, badly!"
Whatever you do, except parameterized queries, is going to be sub-optimal, and it will be up to you to ensure your solution doesn't have other holes that you need to patch.
Parameterized queries, on the other hand, work out of the box and prevent all of these attacks.
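Applied to the query from the question, a parameterised version might look like this (shown as T-SQL via sp_executesql; a JDBC PreparedStatement with ? placeholders achieves the same thing):

-- The attack string is now just data: the query returns no rows
-- immediately instead of waiting 20 seconds.
EXEC sp_executesql
    N'SELECT first_name, last_name
      FROM MEMBER
      WHERE first_name = @first AND last_name = @last;',
    N'@first varchar(50), @last varchar(50)',
    @first = 'joe',
    @last  = 'frazier'' WAITFOR DELAY ''00:00:20';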
SQL injection is SQL injection - there's nothing special about a WAITFOR DELAY.
There is absolutely no excuse for not using prepared statements for such a simple query in this day and age.
(Edit: Okay, not "absolutely" - but there's almost never an excuse)
I think you suggested the solution yourself: Parameterized Queries.
How did you find that your dynamically built query is quicker than using a stored procedure? In general it is often the opposite.
To answer all your questions:
Is this a viable solution for the WAITFOR attack?
No. Just add -- to the attack string and it will ignore your fix.
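That is, with a comment sequence appended to the injected value, the generated SQL becomes the following, and the guard is simply commented out:

SELECT first_name, last_name FROM MEMBER
WHERE first_name = 'joe'
  AND last_name = 'frazier' WAITFOR DELAY '00:00:20' --' AND 1=1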
Does it defend against other similar vulnerabilities?
No. See above.
I think the best option would entail using prepared statements. More work, but less vulnerable.
Yes. You don't fix SQL injection yourself. You use what already exists, and you use it right: by parameterizing any dynamic part of your query.
Another, lesser solution is to escape any string that is going to be inserted into your query; however, you will forget one some day, and it only takes one to get attacked.
Everyone else has nailed this (parameterize!) but just to touch on a couple of points here:
Is this a viable solution for the WAITFOR attack?
Does it defend against other similar vulnerabilities?
No, it doesn't. The WAITFOR trick is most likely just being used to 'sniff' for the vulnerability; once they've found the vulnerable page, there's a lot more they can do without DDL or (the non-SELECT parts of) DML. For example, think about what happens if they pass the following as last_name:
' UNION ALL SELECT username, password FROM adminusers WHERE 'A'='A
Even after you add the AND 1 = 1, you're still hosed. In most databases, there are a lot of malicious things you can do with just SELECT access...
How about following xkcd and sanitizing the input? You could check for reserved words in general, and for WAITFOR in particular.

What is the best way of determining whether our own stored procedure has executed successfully or not?

I know some ways to determine whether our own stored procedure has executed successfully or not (using an output parameter, putting a select such as SELECT 1 at the end of the stored procedure if it has executed without any error, ...).
So which one is better, and why?
Using RAISERROR in case of an error in the procedure integrates better with most clients than using fake output parameters. They simply call the procedure, and the RAISERROR translates into an exception in the client application; exceptions are hard for application code to ignore, so they have to be caught and dealt with.
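A minimal sketch of that approach (the procedure, table, and message are hypothetical):

CREATE PROCEDURE dbo.DebitAccount
    @AccountId int,
    @Amount money
AS
BEGIN
    IF @Amount <= 0
    BEGIN
        -- Severity 16 surfaces as an exception (e.g. SQLException) in the client:
        RAISERROR('Amount must be positive.', 16, 1);
        RETURN;
    END

    UPDATE dbo.Account
    SET Balance = Balance - @Amount
    WHERE AccountId = @AccountId;
END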
Having a print statement that clearly states whether the SP has been created or not would be more readable.
e.g.
CREATE PROCEDURE CustOrdersDetail @OrderID int
AS
...
...
...
GO
IF OBJECT_ID('dbo.CustOrdersDetail') IS NOT NULL
PRINT '<<< CREATED PROCEDURE dbo.CustOrdersDetail >>>'
ELSE
PRINT '<<< FAILED CREATING PROCEDURE dbo.CustOrdersDetail >>>'
GO
An SP is very much like a method/subroutine/procedure, and they all have a task to complete. The task could be as simple as computing and returning a result, or could be just a simple manipulation of a record in a table. Depending on the task, you could return an output value indicating whether the task was a success or a failure, or return the actual results.
If you need a common T-SQL solution for your entire project/database, you can use an output parameter for all procedures. But RAISERROR is the way to handle errors in your client code, not T-SQL.
Why not use different return values, which can then be handled in code? Introducing an extra output parameter or an extra select is unnecessary.
If the only thing you need to know is whether there is a problem, a successful execution is a good enough signal. Have a look at the discussions of XACT_ABORT and TRY...CATCH here and here.
If you want to know the specific error, a return code is the right way to pass this information to the caller.
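A sketch of the return-code convention (0 for success is the usual convention; the non-zero codes are application-defined):

CREATE PROCEDURE dbo.usp_DoTask
    @Amount money
AS
BEGIN
    IF @Amount <= 0
        RETURN 1;   -- application-defined code: invalid input

    -- ... the actual work would go here ...

    RETURN 0;       -- success
END
GO
DECLARE @rc int;
EXEC @rc = dbo.usp_DoTask @Amount = -5;
IF @rc <> 0
    PRINT 'Procedure reported failure code ' + CAST(@rc AS varchar(10));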
In the majority of production scenarios I tend to deploy a custom error reporting component within the database tier as part of the solution. Nothing fancy, just a handful of log tables and a few stored procedures that manage the error logging process.
All stored procedure code that is executed on a production server is then encapsulated using the TRY...CATCH block feature available in SQL Server 2005 and above.
This means that in the unlikely event that a given stored procedure were to fail, the details of the error that occurred and the stored procedure that generated it are recorded to a log table. A simple stored procedure call is made from within the CATCH block in order to record the relevant details.
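Sketched out, the pattern looks something like this (the log table and procedure names are illustrative; THROW requires SQL Server 2012+, so use RAISERROR to re-raise on 2005/2008):

CREATE TABLE dbo.ErrorLog (
    LogId         int IDENTITY PRIMARY KEY,
    ErrorNumber   int,
    ErrorMessage  nvarchar(4000),
    ProcedureName sysname NULL,
    LoggedAt      datetime2 NOT NULL DEFAULT SYSDATETIME()
);
GO
CREATE PROCEDURE dbo.usp_LogError
    @ErrorNumber int, @ErrorMessage nvarchar(4000), @ProcedureName sysname = NULL
AS
INSERT INTO dbo.ErrorLog (ErrorNumber, ErrorMessage, ProcedureName)
VALUES (@ErrorNumber, @ErrorMessage, @ProcedureName);
GO
-- Every production procedure wraps its body like this:
CREATE PROCEDURE dbo.usp_DoWork
AS
BEGIN TRY
    SELECT 1 / 0;   -- placeholder that forces an error for the demo
END TRY
BEGIN CATCH
    DECLARE @Num  int            = ERROR_NUMBER(),
            @Msg  nvarchar(4000) = ERROR_MESSAGE(),
            @Proc sysname        = ERROR_PROCEDURE();
    EXEC dbo.usp_LogError @Num, @Msg, @Proc;
    THROW;          -- re-raise so the caller still sees the failure
END CATCH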
The foundations for this implementation are explained in Books Online here.
Should you wish, you can easily extend this implementation further, for example by incorporating email notification to a DBA, or even sending an SMS alert depending on the severity of the error.
An implementation of this sort ensures that if your stored procedure did not report failure then it was of course successful.
Once you have a simple and robust framework in place, it is then straightforward to duplicate and rollout your base implementation to other production servers/application platforms.
Nothing special here, just simple error logging and reporting that works.
If on the other hand you also need to record the successful execution of stored procedures then again, a similar solution can be devised that incorporates log table/s.
I think this question is screaming out for a blog post…
