Avoid SQL Injection when using Dynamic SQL Code - sql-server

I am working on a security remediation of an existing Java web application. The application has some dynamic SQL code executed through JDBC, but this is not accepted by the static code analysis (SCA) tool we use, so I am looking for a way to remediate the issue. Basically, I have validated all the input passed to the code which constructs the query, so there is no possibility of SQL injection. But the SCA tool still does not approve of this validation, so I want to know if there is any way I can avoid the dynamic query logic. Prepared statements cannot be used as-is, because the query is dynamically constructed based on conditions.
I know stored procedures can help, but I understand they have their own issues, and the team is also not experienced with stored procedures, so I am looking for a better way to address this issue. Also, since we are using SQL Server, I didn't find any encoding function in the ESAPI toolkit to sanitize the query parameters; it supports Oracle and MySQL only.
I want to know if using a framework like MyBatis to offload the SQL construction from Java code into XML files would resolve the issue. Please let me know if there is any other, better way.

You can generate SQL dynamically and use prepared statements.
Here is the idea of how this can be done.
Now you have code like this:
StringBuilder whereClause = new StringBuilder();
if (name != null) {
    whereClause.append(String.format("name = '%s'", name));
}
// other similar conditions
String sql = "select * from table" + (whereClause.length() != 0 ? " where " + whereClause.toString() : "");
Statement stmt = connection.createStatement();
ResultSet rs = stmt.executeQuery(sql);
// use rs to fetch data
And you need to change this to something like
StringBuilder whereClause = new StringBuilder();
List<Object> parameters = new ArrayList<>();
if (name != null) {
    whereClause.append("name = ?");
    parameters.add(name);
}
// other similar conditions
String sql = "select * from table" + (whereClause.length() != 0 ? " where " + whereClause.toString() : "");
PreparedStatement stmt = connection.prepareStatement(sql);
for (int i = 0; i < parameters.size(); ++i) {
    setParameterValue(stmt, i + 1, parameters.get(i));
}
ResultSet rs = stmt.executeQuery();
// use rs to fetch data
setParameterValue should look like this:
void setParameterValue(PreparedStatement ps, int index, Object value) throws SQLException {
    if (value instanceof String) {
        ps.setString(index, (String) value);
    } else if (value instanceof Integer) {
        ps.setInt(index, (Integer) value);
    } // and more boilerplate like this for all the types you need
}
With MyBatis you can avoid writing such boilerplate code to generate dynamic SQL, which makes this much easier. But I don't know how your SCA tool treats MyBatis-generated SQL.
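For instance, here is a minimal sketch using MyBatis' org.apache.ibatis.jdbc.SQL builder (the users table and columns are assumptions, not your schema). The #{...} placeholders become JDBC "?" parameters, so no user input is ever concatenated into the statement text:

import org.apache.ibatis.jdbc.SQL;

public class UserSqlProvider {
    // Builds the SELECT dynamically; #{name} / #{age} are MyBatis
    // placeholders bound as JDBC parameters, never concatenated text.
    public String selectUsers(final String name, final Integer age) {
        return new SQL() {{
            SELECT("*");
            FROM("users");
            if (name != null) {
                WHERE("name = #{name}");
            }
            if (age != null) {
                WHERE("age = #{age}");
            }
        }}.toString();
    }
}

Such a provider is typically wired to a mapper method with @SelectProvider; MyBatis XML mappers offer the same dynamic behaviour with <if>/<where> tags. Either way, no input ends up in the SQL string itself.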

I've found this question while trying to solve a similar problem myself.
First, we may factor the SQL code out of the Java files and store it in text files under the resources folder. Then, from Java code, use a classloader method to read the SQL as an InputStream and convert it to a String. Storing SQL code in separate files makes it amenable to static code analysis.
Second, we can use named parameters in the SQL in some form that is easily recognizable via regular expressions, e.g. the ${namedParam} syntax familiar from various expression languages. Then we can write a helper method that takes this parametrised SQL and a Map<String, Object> of query parameters, where the keys correspond to the SQL parameter names. This helper method would produce a PreparedStatement with the parameters set. Using named parameters makes the SQL more readable and saves us some debugging.
Third, we can use SQL comments to mark parts of the SQL code as dependent on the presence of some parameter, and have the previously described helper method include in the resulting statement only the parts for which entries exist in the parameter map. E.g.: /*${namedParam}[*/ some sql code /*]${namedParam}*/. This is an unobtrusive way to insert conditions into our dynamic SQL.
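To make the idea concrete, here is a minimal sketch of such a helper, assuming exactly the ${param} and /*${param}[*/ ... /*]${param}*/ conventions described above (the class and method names are mine, for illustration):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class NamedParamSql {

    // /*${param}[*/ ... /*]${param}*/ marks a block kept only when
    // "param" is present in the parameter map.
    private static final Pattern BLOCK = Pattern.compile(
            "/\\*\\$\\{(\\w+)\\}\\[\\*/(.*?)/\\*\\]\\$\\{\\1\\}\\*/", Pattern.DOTALL);
    // ${param} marks a single bind variable.
    private static final Pattern PARAM = Pattern.compile("\\$\\{(\\w+)\\}");

    public static PreparedStatement prepare(Connection conn, String sql,
            Map<String, Object> params) throws SQLException {
        // 1. Keep or drop the optional blocks.
        Matcher block = BLOCK.matcher(sql);
        StringBuffer kept = new StringBuffer();
        while (block.find()) {
            block.appendReplacement(kept,
                    params.containsKey(block.group(1))
                            ? Matcher.quoteReplacement(block.group(2)) : "");
        }
        block.appendTail(kept);

        // 2. Replace each ${param} with "?" and remember the bind order.
        List<Object> values = new ArrayList<>();
        Matcher param = PARAM.matcher(kept.toString());
        StringBuffer jdbcSql = new StringBuffer();
        while (param.find()) {
            values.add(params.get(param.group(1)));
            param.appendReplacement(jdbcSql, "?");
        }
        param.appendTail(jdbcSql);

        // 3. Prepare and bind; setObject lets the driver map Java types.
        PreparedStatement ps = conn.prepareStatement(jdbcSql.toString());
        for (int i = 0; i < values.size(); i++) {
            ps.setObject(i + 1, values.get(i));
        }
        return ps;
    }
}

With this, a query like select * from t where 1=1 /*${name}[*/ and name = ${name} /*]${name}*/ includes its condition only when "name" is present in the map.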
Following the DRY principle, we could also try to employ some existing expression-language engine, but that would add one more dependency and processing expense.
I will post the solution here once I get working code.

Why do we always prefer using parameters in SQL statements? [duplicate]

I am very new to working with databases. Now I can write SELECT, UPDATE, DELETE, and INSERT commands. But I have seen many forums where we prefer to write:
SELECT empSalary from employee where salary = @salary
...instead of:
SELECT empSalary from employee where salary = txtSalary.Text
Why do we always prefer to use parameters and how would I use them?
I wanted to know the use and benefits of the first method. I have even heard of SQL injection but I don't fully understand it. I don't even know if SQL injection is related to my question.
Using parameters helps prevent SQL Injection attacks when the database is used in conjunction with a program interface such as a desktop program or web site.
In your example, a user can directly run SQL code on your database by crafting statements in txtSalary.
For example, if they were to write 0 OR 1=1, the executed SQL would be
SELECT empSalary from employee where salary = 0 or 1=1
whereby all empSalaries would be returned.
Further, a user could perform far worse commands against your database, including deleting it, if they wrote 0; Drop Table employee:
SELECT empSalary from employee where salary = 0; Drop Table employee
The table employee would then be deleted.
In your case, it looks like you're using .NET. Using parameters is as easy as:
string sql = "SELECT empSalary from employee where salary = @salary";
using (SqlConnection connection = new SqlConnection(/* connection info */))
using (SqlCommand command = new SqlCommand(sql, connection))
{
    connection.Open();
    var salaryParam = new SqlParameter("salary", SqlDbType.Money);
    salaryParam.Value = txtSalary.Text;
    command.Parameters.Add(salaryParam);
    var results = command.ExecuteReader();
}
Dim sql As String = "SELECT empSalary from employee where salary = @salary"
Using connection As New SqlConnection("connectionString")
    Using command As New SqlCommand(sql, connection)
        connection.Open()
        Dim salaryParam = New SqlParameter("salary", SqlDbType.Money)
        salaryParam.Value = txtSalary.Text
        command.Parameters.Add(salaryParam)
        Dim results = command.ExecuteReader()
    End Using
End Using
Edit 2016-4-25:
As per George Stocker's comment, I changed the sample code to not use AddWithValue. Also, it is generally recommended that you wrap IDisposables in using statements.
You are right, this is related to SQL injection, which is a vulnerability that allows a malicious user to execute arbitrary statements against your database. The old-time favorite XKCD comic (the "Little Bobby Tables" strip) illustrates the concept.
In your example, if you just use:
var query = "SELECT empSalary from employee where salary = " + txtSalary.Text;
// and proceed to execute this query
You are open to SQL injection. For example, say someone enters the following into txtSalary:
1; UPDATE employee SET salary = 9999999 WHERE empID = 10; --
1; DROP TABLE employee; --
// etc.
When you execute this query, it will perform a SELECT and an UPDATE or DROP, or whatever they wanted. The -- at the end simply comments out the rest of your query, which would be useful in the attack if you were concatenating anything after txtSalary.Text.
The correct way is to use parameterized queries, eg (C#):
SqlCommand query = new SqlCommand("SELECT empSalary FROM employee WHERE salary = @sal;");
query.Parameters.AddWithValue("@sal", txtSalary.Text);
With that, you can safely execute the query.
For reference on how to avoid SQL injection in several other languages, check bobby-tables.com, a website maintained by a SO user.
In addition to the other answers, it's worth adding that parameters not only help prevent SQL injection but can also improve query performance. SQL Server caches parameterized query plans and reuses them on repeated executions. If you do not parameterize your query, SQL Server compiles a new plan on each execution (with some exclusions) whenever the query text differs.
More information about query plan caching
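To see the client-side half of this in code, here is a sketch in JDBC/Java (matching the question that opened this thread; the employee table is hypothetical): one PreparedStatement re-executed with different values keeps the statement text identical, so the server can reuse a single cached plan.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class PlanReuseExample {
    // Hypothetical table/columns; one statement text, many executions,
    // so SQL Server can compile one plan and reuse it.
    static void printSalaries(Connection connection) throws SQLException {
        String sql = "SELECT empSalary FROM employee WHERE salary = ?";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            for (int salary : new int[] { 40000, 55000, 70000 }) {
                ps.setInt(1, salary);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getBigDecimal("empSalary"));
                    }
                }
            }
        }
    }
}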
Two years after my first go, I'm recidivating...
Why do we prefer parameters? SQL injection is obviously a big reason, but could it be that we're secretly longing to get back to SQL as a language? SQL in string literals is already a weird cultural practice, but at least you can copy and paste your request into Management Studio. SQL dynamically constructed with host-language conditionals and control structures, when SQL has its own conditionals and control structures, is just level 0 barbarism. You have to run your app in debug mode, or with a trace, to see what SQL it generates.
Don't stop with just parameters. Go all the way and use QueryFirst (disclaimer: which I wrote). Your SQL lives in a .sql file. You edit it in the fabulous TSQL editor window, with syntax validation and Intellisense for your tables and columns. You can assign test data in the special comments section and click "play" to run your query right there in the window. Creating a parameter is as easy as putting "@myParam" in your SQL. Then, each time you save, QueryFirst generates the C# wrapper for your query. Your parameters pop up, strongly typed, as arguments to the Execute() methods. Your results are returned in an IEnumerable or List of strongly typed POCOs, the types generated from the actual schema returned by your query. If your query doesn't run, your app won't compile. If your db schema changes and your query runs but some columns disappear, the compile error points to the line in your code that tries to access the missing data. And there are numerous other advantages. Why would you want to access data any other way?
In SQL, a word prefixed with the @ sign is a variable. We use such a variable to hold a value and reuse it in several places in the same SQL script; it is scoped to that single script, so you can declare variables of the same type and name in many scripts. We use these variables a lot in stored procedures, because stored procedures are pre-compiled queries and we can pass values into these variables from scripts, desktop applications, and websites. For further information, read Declare Local Variable, SQL Stored Procedure and SQL injections.
Also read Protect from SQL injection; it will guide you in how you can protect your database.
Hope this helps you to understand; if you have any questions, leave me a comment.
Old post but wanted to ensure newcomers are aware of Stored procedures.
My 10¢ worth here is that if you are able to write your SQL statement as a stored procedure, that in my view is the optimum approach. I ALWAYS use stored procs and never loop through records in my main code. For example: SQL Table > SQL Stored Procedures > IIS/.NET > Class.
When you use stored procedures, you can restrict the user to EXECUTE permission only, thus reducing security risks.
Your stored procedure is inherently parameterised, and you can specify input and output parameters.
The stored procedure (if it returns data via SELECT statement) can be accessed and read in the exact same way as you would a regular SELECT statement in your code.
It also runs faster as it is compiled on the SQL Server.
Did I also mention you can do multiple steps, e.g. update a table, check values on another DB server, and then, once finally finished, return data to the client, all on the same server, with no interaction with the client? So this is MUCH faster than coding this logic in your code.
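For completeness, here is what the call site can look like. This is a minimal JDBC sketch (Java, matching the question that opened this thread); the procedure name, parameter, and column are hypothetical:

import java.math.BigDecimal;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;

class ProcExample {
    // Hypothetical procedure and column names, for illustration only.
    // The account needs only EXECUTE permission on the procedure.
    static void printEmployees(Connection connection) throws SQLException {
        try (CallableStatement cs =
                connection.prepareCall("{call dbo.GetEmployeesBySalary(?)}")) {
            cs.setBigDecimal(1, new BigDecimal("50000"));
            try (ResultSet rs = cs.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("empName"));
                }
            }
        }
    }
}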
Other answers cover why parameters are important, but there is a downside! In .net, there are several methods for creating parameters (Add, AddWithValue), but they all require you to worry, needlessly, about the parameter name, and they all reduce the readability of the SQL in the code. Right when you're trying to meditate on the SQL, you need to hunt around above or below to see what value has been used in the parameter.
I humbly claim my little SqlBuilder class is the most elegant way to write parameterized queries. Your code will look like this...
C#
var bldr = new SqlBuilder( myCommand );
bldr.Append("SELECT * FROM CUSTOMERS WHERE ID = ").Value(myId);
//or
bldr.Append("SELECT * FROM CUSTOMERS WHERE NAME LIKE ").FuzzyValue(myName);
myCommand.CommandText = bldr.ToString();
Your code will be shorter and much more readable. You don't even need extra lines, and, when you're reading back, you don't need to hunt around for the value of parameters. The class you need is here...
using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;

public class SqlBuilder
{
    private StringBuilder _rq;
    private SqlCommand _cmd;
    private int _seq;

    public SqlBuilder(SqlCommand cmd)
    {
        _rq = new StringBuilder();
        _cmd = cmd;
        _seq = 0;
    }

    public SqlBuilder Append(String str)
    {
        _rq.Append(str);
        return this;
    }

    public SqlBuilder Value(Object value)
    {
        string paramName = "@SqlBuilderParam" + _seq++;
        _rq.Append(paramName);
        _cmd.Parameters.AddWithValue(paramName, value);
        return this;
    }

    public SqlBuilder FuzzyValue(Object value)
    {
        string paramName = "@SqlBuilderParam" + _seq++;
        _rq.Append("'%' + " + paramName + " + '%'");
        _cmd.Parameters.AddWithValue(paramName, value);
        return this;
    }

    public override string ToString()
    {
        return _rq.ToString();
    }
}

How can you create a table (or other object) that always returns the value passed to its WHERE-clause, like a mirror

There is a legacy application that uses a table to translate job names to filenames. This legacy application queries it as follows:
SELECT filename FROM aJobTable WHERE jobname = 'myJobName'
But in reality those jobnames always match the filenames (e.g. 'myJobName.job' is the jobname but also the filename). That makes this table appear unnecessary. But unfortunately, we cannot change the code of this program, and the program just needs to select it from a table.
That's actually a bit annoying, because we do need to keep this database in sync: if a jobname is not in the table, then it cannot be used. So, as our only way out, right now we have some VBScripts to synchronize this table, adding a record for each possible filename. As a result, the table is just 2 columns with identical values. We want to get rid of this.
So, we have been dreaming about some hack that queries the data with the jobname, but just always returns the jobname again, like a copy/mirror query. Then we don't actually have to populate a table at all.
"Exploits"
The following can be configured in this legacy application. My hunch is that these may open the door for some tricks/hacks.
use of either MS Access or SQL Server (we prefer sql server)
The name of the table (e.g. aJobTable)
The name of the filename column (e.g. filename)
The name of the jobname column (e.g. jobname)
Here is what I came up with:
If I create a table-valued function mirror(a) then I get pretty close to what I want. Then I could use it like
SELECT filename FROM mirror('MyJobName.job')
But that's just not good enough; it would be if I could force it to be called like
SELECT filename FROM mirror WHERE param1 = 'MyJobName.job'
Unfortunately, I don't think it's possible to call functions like that.
So, I was wondering if perhaps somebody else knows how to get it working.
So my question is: "How can you create a table (or other object) that always returns the value passed to its WHERE-clause, like a mirror."
It's kinda hard to answer without knowing the code the application uses, but if we assume it only takes strings and concatenates them without any checks whatsoever, I would assume code like this (translated to C#):
var sql = "SELECT "+ field +" FROM "+ table +" WHERE "+ conditionColumn +" = '"+ searchValue +"'";
As this is an open door for SQL injection, and given the fact that SQL Server allows two ways of creating an alias - value AS alias and alias = value -
you can take advantage of that and try to generate an SQL statement like this:
SELECT field /* FROM table WHERE conditionColumn */ = 'searchValue'
So field should be "field /* ",
conditionColumn should be "conditionColumn */",
and the table name doesn't matter - you could leave it an empty string.
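To make the substitution concrete, here is a small runnable sketch (in Java, though the pseudo-code above is C#) that prints the statement the legacy application would assemble from those configured values; the variable names mirror the assumed concatenation and are illustrative only:

public class MirrorSqlDemo {
    public static void main(String[] args) {
        // The legacy app's assumed concatenation, fed with the
        // configured "exploit" values:
        String field = "field /*";                      // filename column setting
        String table = "";                              // commented out anyway
        String conditionColumn = "conditionColumn */";  // closes the comment
        String searchValue = "MyJobName.job";

        String sql = "SELECT " + field + " FROM " + table
                + " WHERE " + conditionColumn + " = '" + searchValue + "'";
        System.out.println(sql);
        // Prints: SELECT field /* FROM  WHERE conditionColumn */ = 'MyJobName.job'
        // SQL Server sees: SELECT field = 'MyJobName.job'
        // -> one row, column "field", value 'MyJobName.job' (the mirror).
    }
}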

Recommended solutions to load a huge CSV file to SQL Server 2008 R2 [duplicate]

What is your recommended way to import .csv files into Microsoft SQL Server 2008 R2?
I'd like something fast, as I have a directory with a lot of .csv files (>500MB spread across 500 .csv files).
I'm using SQL Server 2008 R2 on Win 7 x64.
Update: Solution
Here's how I solved the problem in the end:
I abandoned trying to use LINQ to Entities to do the job. It works - but it doesn't support bulk insert, so it's about 20x slower. Maybe the next version of LINQ to Entities will support this.
Took the advice given on this thread, used bulk insert.
I created a T-SQL stored procedure that uses bulk insert. Data goes into a staging table, is normalized then copied into the target tables.
I mapped the stored procedure into C# using the LINQ to Entities framework (there is a video on www.learnvisualstudio.net showing how to do this).
I wrote all the code to cycle through the files, etc., in C#.
This method eliminates the biggest bottleneck, which is reading tons of data off the drive and inserting it into the database.
The reason why this method is extremely quick at reading .csv files? Microsoft SQL Server gets to import the files directly from the hard drive straight into the database, using its own highly optimized routines. Most of the other C# based solutions require much more code, and some (like LINQ to Entities) end up having to pipe the data slowly into the database via the C#-to-SQL-server link.
Yes, I know it'd be nicer to have 100% pure C# code to do the job, but in the end:
(a) For this particular problem, using T-SQL requires much less code compared to C#, about 1/10th, especially for the logic to denormalize the data from the staging table. This is simpler and more maintainable.
(b) Using T-SQL means you can take advantage of the native bulk insert procedures, which speeds things up from 20-minute wait to a 30-second pause.
Using BULK INSERT in a T-SQL script seems to be a good solution.
http://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
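If you drive the load from Java (as in the question at the top of this thread), BULK INSERT can simply be issued as a statement over JDBC. A minimal sketch; the path, table, and options are hypothetical, and note that the file path is resolved on the SQL Server machine by the service account, not on the client:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class BulkInsertExample {
    // Sketch only: staging table, file path, and options are assumptions.
    static void loadCsv(Connection connection) throws SQLException {
        String sql = "BULK INSERT dbo.StagingTable "
                + "FROM 'C:\\data\\input.csv' "
                + "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 2)";
        try (Statement stmt = connection.createStatement()) {
            stmt.execute(sql);
        }
    }
}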
You can get the list of files in your directory with xp_cmdshell and the dir command (with a bit of cleanup). In the past, I tried to do something like this with sp_OAMethod and VBScript functions and had to use the dir method because I had trouble getting the list of files with the FSO object.
http://www.sqlusa.com/bestpractices2008/list-files-in-directory/
If you have to do anything with the data in the files other than insert it, then I would recommend using SSIS. It can not only insert and/or update, it can also clean the data for you.
The first officially supported way of importing large text files is the command-line tool called "bcp" (Bulk Copy Utility), very useful for huge amounts of binary data.
Please check out this link: http://msdn.microsoft.com/en-us/library/ms162802.aspx
However, in SQL Server 2008 I presume that the BULK INSERT command would be your choice number one, in the first place because it is part of the standard command set. If for any reason you have to maintain vertical compatibility, I'd stick to the bcp utility, which is available for SQL Server 2000 too.
HTH :)
EDITED LATER: Googling around, I recalled that SQL Server 2000 had the BULK INSERT command too... however, there was obviously some reason I stuck with bcp.exe, and I cannot recall why... perhaps because of some limits, I guess.
I should recommend this:
using System;
using System.Data;
using Microsoft.VisualBasic.FileIO;

namespace ReadDataFromCSVFile
{
    static class Program
    {
        static void Main()
        {
            string csv_file_path = @"C:\Users\Administrator\Desktop\test.csv";
            DataTable csvData = GetDataTableFromCSVFile(csv_file_path);
            Console.WriteLine("Rows count:" + csvData.Rows.Count);
            Console.ReadLine();
        }

        private static DataTable GetDataTableFromCSVFile(string csv_file_path)
        {
            DataTable csvData = new DataTable();
            try
            {
                using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
                {
                    csvReader.SetDelimiters(new string[] { "," });
                    csvReader.HasFieldsEnclosedInQuotes = true;
                    string[] colFields = csvReader.ReadFields();
                    foreach (string column in colFields)
                    {
                        DataColumn datecolumn = new DataColumn(column);
                        datecolumn.AllowDBNull = true;
                        csvData.Columns.Add(datecolumn);
                    }
                    while (!csvReader.EndOfData)
                    {
                        string[] fieldData = csvReader.ReadFields();
                        // Making empty value as null
                        for (int i = 0; i < fieldData.Length; i++)
                        {
                            if (fieldData[i] == "")
                            {
                                fieldData[i] = null;
                            }
                        }
                        csvData.Rows.Add(fieldData);
                    }
                }
            }
            catch (Exception ex)
            {
                // Swallowing the exception returns an empty table; log ex in real code.
            }
            return csvData;
        }
    }
}

// Copy the DataTable to SQL Server using SqlBulkCopy
// (requires using System.Data.SqlClient)
static void InsertDataIntoSQLServerUsingSQLBulkCopy(DataTable csvData)
{
    using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=yourDB;Integrated Security=SSPI;"))
    {
        dbConnection.Open();
        using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
        {
            s.DestinationTableName = "Your table name";
            foreach (var column in csvData.Columns)
                s.ColumnMappings.Add(column.ToString(), column.ToString());
            s.WriteToServer(csvData);
        }
    }
}
If the structure of all your CSVs is the same, I recommend you use Integration Services (SSIS) to loop over them and insert all of them into the same table.
I understand this is not exactly your question. But if you get into a situation where you use a straight INSERT, use TABLOCK and insert multiple rows per statement. It depends on the row size, but I usually go for 600-800 rows at a time. If it is a load into an empty table, then sometimes dropping the indexes and recreating them after it is loaded is faster, as is sorting the data on the clustered index before it is loaded. Use IGNORE_CONSTRAINTS and IGNORE_TRIGGERS if you can. Put the database in single-user mode if you can.
USE AdventureWorks2008R2;
GO
INSERT INTO Production.UnitMeasure with (tablock)
VALUES (N'FT2', N'Square Feet ', '20080923'), (N'Y', N'Yards', '20080923'), (N'Y3', N'Cubic Yards', '20080923');
GO

SQL Server: Parameter is not replaced

I've got a fully functional (secure) session to a SQL Server database (version 10.50.4000). It is stored in a public variable:
SqlConnection conn = new SqlConnection();
I only want to run SELECT queries. For anything else, the user account got no rights.
The queries are built with only one user entry, which is inserted into a simple Text Box.
Unfortunately I must not tell you the original command text. So I make it simple for you:
void print_users(string filtervalue)
{
    SqlCommand cmd = new SqlCommand("SELECT users FROM groups WHERE group_name LIKE '%@fv%'", this.conn);
    cmd.Parameters.Add("@fv", SqlDbType.NVarChar);
    cmd.Parameters["@fv"].Value = filtervalue;
    SqlDataReader rdr = cmd.ExecuteReader();
    while (rdr.Read())
    {
        // Do something with the answer from the DB
    }
}
But this does not do the trick. I also tried AddWithValue, but I had no luck.
When setting a breakpoint on the line where @fv should be replaced, I can step through the code line by line. I can see that the command where @fv should be replaced is processed with no error. But @fv is not replaced (or at least I cannot see the replacement in the debug console).
What am I doing wrong?
EDIT:
thank you for your replies. Leaving out the single quotes ( ' ) did the trick.
And I also learned that this is not a string replacement. Thank you.
Just one word: the connection is not left open all the time. It's closed immediately when it's no longer needed, and re-established when needed again - I just forgot to write that into my sample code.
Again: Thank you for your help!
You can't see it being replaced in your debug session; the substitution occurs inside SQL Server itself...
The client (your code) sends both the SQL string and the value of the parameter as separate things to the server. There the SQL engine 'replaces' the parameter with its value while executing the query.
You should also put the wildcards inside your parameter value, not inside the query:
cmd = new SqlCommand("SELECT users FROM groups WHERE group_name LIKE @fv", this.conn);
cmd.Parameters.Add("@fv", SqlDbType.NVarChar);
cmd.Parameters["@fv"].Value = "%" + filtervalue + "%";
The parameter is not working because it is inside a string literal. You want to build the string like this:
cmd = new SqlCommand("SELECT users FROM groups WHERE group_name LIKE '%' + @fv + '%'", this.conn);
While we're at it, keeping a global connection like that is bad. It can cause strange side effects, especially in web apps. Instead, keep a global connection string, and then use that string to create a new connection on each request.
Also, "replace" is the wrong word here. Sql parameters are never replaced, even when they work. That's the whole point. There is no string replacement into your query at any point, ever. It's more like you declared an #fv variable at the server level in a stored procedure, and assigned your data directly to that variable. In this way, there is no possibility for a vulnerability in parameter replacement code, because the data portion of your query remains separate throughout the execution process. In same way, don't think in terms of "sanitizing" a parameter for a query; instead, think in terms of quarantining the data.

How to get data from database according to string length without using any string function

I have to get the records from a table field where the length of the record/data/string is greater than 8 characters. I cannot use any string function, as the query has to be usable on MySQL, MSSQL and Oracle.
I don't want to do the below EXAMPLE:
List<String> names = new ArrayList<String>();
String st = "select name from table";
rs = executeSQL(st);
while (rs != null && rs.next())
{
    names.add(rs.getString(1));
}
for (String name : names)
{
    if (name.length() > 8)
        result.add(name);
}
Any idea other than the one coded above? I want a query that can get the required result directly, instead of processing the retrieved data afterwards.
Thank you for any help / clue.
JDBC drivers may implement JDBC escapes for the functions listed in Appendix D (Scalar Functions) of the JDBC specification. A driver should convert the scalar functions it supports to the appropriate function on the database side. The list of supported functions can be queried using DatabaseMetaData.getStringFunctions().
To use this in a query you would then use either CHAR_LENGTH(string) or LENGTH(string), like:
SELECT * FROM table WHERE {fn CHAR_LENGTH(field)} > 8
You can replace CHAR_LENGTH with LENGTH. The driver (if it supports this function) will then convert it to the appropriate function in the underlying database.
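In JDBC code that could look like the following sketch; the table and column names are hypothetical, and whether the escape works depends on your driver, as the spec excerpt below explains:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class EscapeSyntaxExample {
    // Hypothetical table/column; {fn CHAR_LENGTH(...)} is translated by the
    // JDBC driver into the database's own length function.
    static void printLongNames(Connection connection) throws SQLException {
        String sql = "SELECT name FROM names_table WHERE {fn CHAR_LENGTH(name)} > ?";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setInt(1, 8);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("name"));
                }
            }
        }
    }
}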
From section 13.4.1 Scalar Functions of the JDBC 4.1 specification:
Appendix D "Scalar Functions" provides a list of the scalar functions
a driver is expected to support. A driver is required to implement
these functions only if the data source supports them, however.
The escape syntax for scalar functions must only be used to invoke the
scalar functions defined in Appendix D "Scalar Functions". The escape
syntax is not intended to be used to invoke user-defined or vendor
specific scalar functions.
I think you may be better off leveraging the power of the database and implementing a factory for your SQL statements (or perhaps for objects encapsulating your SQL functionality).
That way you can configure your factory with the name/type of the database, and it'll give you the appropriate SQL statements for that database. It gives you a clean means of parameterising this info, whilst allowing you to leverage the functionality of your databases and not having to replicate the database functionality in a suboptimal fashion in your code.
e.g.
DatabaseStatementFactory fac = DatabaseStatementFactory.forDatabase(NAME_OF_DATABASE);
String statement = fac.getLongNames();
// then use this statement. It'll be configured for each db type
// (note: "for" alone is a reserved word in Java, hence "forDatabase")
It's probably wise to encapsulate further and use something like:
DatabaseStatementFactory fac = DatabaseStatementFactory.forDatabase(NAME_OF_DATABASE);
List<String> names = fac.getLongNames();
so that you're not making assumptions about a common schema, means of querying, etc.
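A minimal sketch of such a factory (the database keys, table name, and per-database length functions are assumptions for illustration; it returns the first, String-based flavour described above):

import java.util.HashMap;
import java.util.Map;

public class DatabaseStatementFactory {
    // Per-database SQL kept in one place; extend the map as needed.
    private static final Map<String, String> LONG_NAMES_SQL = new HashMap<>();
    static {
        LONG_NAMES_SQL.put("mysql",  "SELECT name FROM names_table WHERE CHAR_LENGTH(name) > 8");
        LONG_NAMES_SQL.put("mssql",  "SELECT name FROM names_table WHERE LEN(name) > 8");
        LONG_NAMES_SQL.put("oracle", "SELECT name FROM names_table WHERE LENGTH(name) > 8");
    }

    private final String database;

    private DatabaseStatementFactory(String database) {
        this.database = database;
    }

    public static DatabaseStatementFactory forDatabase(String database) {
        return new DatabaseStatementFactory(database.toLowerCase());
    }

    public String getLongNames() {
        String sql = LONG_NAMES_SQL.get(database);
        if (sql == null) {
            throw new IllegalArgumentException("Unsupported database: " + database);
        }
        return sql;
    }
}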
Another solution that I found is:
Select name from table where name like '_________%';
Each underscore matches exactly one character, so nine underscores followed by % match only names whose length is greater than 8.
