I've got some free-response text fields and I'm not sure how to scrub them to prevent SQL injection. Any ideas?
Create a parameterized query instead of concatenating the user's input into the query.
Here is how to do this in classic ASP:
http://blog.binarybooyah.com/blog/post/Classic-ASP-data-access-using-parameterized-SQL.aspx
It's also important to note that the only way to be 100% safe from SQL injection is to parameterize any SQL statement that uses user input, even once it's in the database. Example: say you take user input via a parameterized query or stored procedure. You will be safe on the insert; however, you need to make sure that anything down the road that uses that input also uses a parameter. Directly concatenating user input is a bad idea anywhere, including inside the database.
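For illustration, here is a minimal sketch of a parameterized insert in C#/ADO.NET (classic ASP does the same thing with an ADODB.Command and its Parameters collection); the Comments table, its columns, and the method arguments are assumptions made for this example:

using System.Data.SqlClient;

static void SaveComment(string connectionString, int userId, string freeResponseText)
{
    // The user's text is sent as a bound parameter, never spliced into the SQL string.
    const string sql = "INSERT INTO Comments (UserId, CommentText) VALUES (@userId, @commentText)";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.AddWithValue("@userId", userId);
        command.Parameters.AddWithValue("@commentText", freeResponseText);

        connection.Open();
        command.ExecuteNonQuery();
    }
}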
Call a stored procedure.
EDIT: Just to clarify: building dynamic SQL in a stored procedure can of course be just as dangerous as doing it in the app, but binding user inputs into the query will protect you against SQL injection, as described here (an Oracle-specific discussion, but the principle applies elsewhere):
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:23863706595353
It is not dynamic sql that is the issue (all sql is dynamic in Oracle actually -- even static sql in pro*c/plsql!). It is "the construction" of this sql that is the problem. If a user gives you inputs - they should be BOUND into the query -- not concatenated. The second you concatenate user input into your SQL -- it is as if you gave them the ability to pass you code and you execute that code. Plain and simple.
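In C#/ADO.NET terms, calling a stored procedure with the input bound looks roughly like the sketch below; the procedure name usp_SaveResponse and its parameters are hypothetical, chosen only to match the free-response scenario in the question:

using System.Data;
using System.Data.SqlClient;

static void SaveResponse(string connectionString, int userId, string responseText)
{
    // The free-response text is BOUND as a parameter; the same rule applies to any
    // dynamic SQL the procedure itself might build internally.
    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("usp_SaveResponse", connection))
    {
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.AddWithValue("@UserId", userId);
        command.Parameters.AddWithValue("@ResponseText", responseText);

        connection.Open();
        command.ExecuteNonQuery();
    }
}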
I am trying to execute (call) a SQL Server stored procedure from Infa Developer. I created a mapping (new mapping from SQL Query). I am trying to pass it runtime variables from the previous mapping task in order to log these to a SQL Server table (the stored procedure does an INSERT). It generated the following T-SQL query:
?RETURN_VALUE? = call usp_TempTestInsertINFARunTimeParams (?Workflow_Name?, ?Instance_Id?, ?StartTime?, ?EndTime?, ?SourceRows?, ?TargetRows?)
However, it does not validate; the validation log states 'the mapping must have a source' and '... must have a target'. I have a feeling I'm doing this completely wrong. And: this is not PowerCenter (no sessions, as far as I can tell).
Any help is appreciated! Thanks
Now with the comments I can confirm and answer your question:
Yes, Source and Target transformations in Informatica are mandatory elements of the mapping. It will not be a valid mapping without them. Let me try to explain a bit more.
The whole concept of an ETL tool is to Extract data from the Source, do all the needed Transformations outside the database, and Load the data to the required Target. It is possible - and quite often necessary - to invoke Stored Procedures before or after the data load, and sometimes even to use existing Stored Procedures as part of the data load. However, from an ETL perspective, this is an additional feature. An ETL tool - here Informatica being a perfect example - is not meant to be a tool for invoking SPs. This reminds me of a question any T-SQL developer asks with his first PL/SQL query: what in the world is this DUAL? Why do I need 'from dual' if I just want to do some calculation like SELECT 123*456? That is the theory.
Now in the real world it happens quite often that you NEED to invoke a stored procedure, and that it is the ONLY thing you need to do. Then you do use the DUAL ;) Which in the PowerCenter world means you use DUAL as the Source (or actually any table you know exists in the source system), put 1=2 in the Source Filter property (or put a Filter Transformation in the mapping with FALSE as the condition), and link just one port with the target. Next, you put the Stored Procedure call as the Pre- or Post-SQL property on your source or target - depending on where you actually want to run it.
Odd? Well - the odd part is that you want to use the ETL tool as a trigger, not as an ETL tool ;)
I'm working with a new REST backend talking to a SQL Server. Our REST api allows for the caller to pass in the columns/fields they want returned (?fields=id,name,phone).
The idea seems very normal. The issue I'm bumping up against is resistance to dynamically generating the SQL statement. Any arguments passed in would be passed to the database using a parameterized query, so I'm not concerned about SQL injection.
The basic idea would be to "inject" the column-names passed in, into a SQL that looks like:
SELECT <column-names>
FROM myTable
ORDER BY <column-name-to-sort-by>
LIMIT 1000
We sanitize all column names and verify their existence in the table, to prevent SQL injection issues. Most of our programmers are used to having all SQL in static files, and loading them from disk and passing them on to the database. The idea of code creating SQL makes them very nervous.
I guess I'm curious if others actually do this? If so, how do you do this? If not, how do you manage "dynamic columns and dynamic sort-by" requests passed in?
I think a lot of people do it, especially when it comes to reporting features. There are actually two things one should do to stay on the safe side:
Parameterize all WHERE clause values
Use the user input only to pick the correct column/table names; don't put the user's values into the SQL statement at all
To elaborate on item #2: I would have a dictionary where the Key is a possible user input and the Value is the corresponding column/table name. You can store this dictionary wherever you want: config file, database, hard-coded, etc. So when you process user input you just check the dictionary: if the Key exists, you use the Value to add a column name to your query. This way you only use the user input to pick the required column names, but never use the actual values in your SQL statement. Besides, you might not want to expose all columns. With a predefined dictionary you can easily control the list of columns available to a user.
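A minimal sketch of the dictionary approach in C# (the myTable/ORDER BY shape and the id/name/phone fields come from the question above; the real column names are assumptions):

using System;
using System.Collections.Generic;
using System.Linq;

// Whitelist: Key = value allowed in the ?fields= parameter, Value = real column name.
// Only entries listed here can ever reach the SQL text.
static readonly Dictionary<string, string> AllowedColumns =
    new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
        { "id",    "Id" },
        { "name",  "FullName" },
        { "phone", "PhoneNumber" }
    };

static string BuildSelect(IEnumerable<string> requestedFields, string sortBy)
{
    // Map each requested field through the dictionary; unknown keys are simply dropped.
    var columns = requestedFields
        .Where(f => AllowedColumns.ContainsKey(f))
        .Select(f => AllowedColumns[f])
        .ToList();

    if (columns.Count == 0 || !AllowedColumns.ContainsKey(sortBy))
        throw new ArgumentException("No valid columns requested.");

    // Only whitelisted names end up in the statement; the user's text itself never does.
    return "SELECT " + string.Join(", ", columns) +
           " FROM myTable ORDER BY " + AllowedColumns[sortBy];
}

Any WHERE clause values would still be passed as ordinary parameters, per item #1.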
Hope it helps!
I've done something similar to what Maksym suggests. In my case, the keys were pulled directly from the database system tables (after scrubbing the user request a bit for syntactic hacks and permissions).
The following query takes care of some minor injection issues through the natural way SQL handles the LIKE condition. This doesn't go as far as handling permissions on each field (as some fields are forbidden based on the log-in) but it provides a very basic way to retrieve these fields dynamically.
CREATE PROC get_allowed_column_names
    @input VARCHAR(MAX)
AS BEGIN
    -- Return only those columns of the Categories table whose names appear in @input.
    SELECT
        columns.name AS allowed_column_name
    FROM
        syscolumns AS columns,
        sysobjects AS tables
    WHERE
        columns.id = tables.id AND
        tables.name = 'Categories' AND
        @input LIKE '%' + columns.name + '%'
END
GO
-- The following only returns "Picture"
EXEC get_allowed_column_names 'Category_,Cat%,Picture'
GO
-- The following returns both "CategoryID" and "Picture"
EXEC get_allowed_column_names 'CategoryID, Picture'
GO
This might seem like a silly question, but I'm surprised that I didn't find a clear answer to this already:
Is it possible to use SQL Server parameters for writing a query with dynamic column names (and table names), or does the input just need to be sanitized very carefully?
The situation is that tables and their column names (and the number of columns) are generated dynamically, so there is no way to know them beforehand and write the query manually. Since the tables and columns aren't known I can't use an ORM, so I'm resorting to manual queries. Usually I'd use parameters to fill in values to prevent SQL injection; however, I'm pretty sure that this cannot be done the same way when specifying the table name and/or column names. I want to create generic queries for insert, update, upsert, and select, but I obviously don't want to open myself up to potential injection. Are there any best practices on how to accomplish this safely?
Just as an FYI - I did see this answer, but since there's no way for me to know the column/table names beforehand, a case statement probably won't work for this situation.
Environment: SQL Server 2014 via ADO.NET (.NET 4.5 / C#)
There is no mechanism for passing table or column references to procedures. You just pass them as strings and then use dynamic SQL to build your queries. You do have to take precautions to ensure that your string parameters are valid.
One way to do this would be to validate that all table and column reference strings have valid names in sys.tables and sys.columns before building your T-SQL queries. Then you can be sure that they can be used safely.
You can also use parameters for the literal values in your dynamic SQL when using the sp_executesql procedure. You can't use them to validate your table and column names, but they do prevent SQL injection through your other parameters.
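Putting the two ideas together, a rough C#/ADO.NET sketch could look like the following (the dbo.Widgets table and its columns are purely illustrative): the identifier is checked against sys.columns with an ordinary parameterized query, then bracket-quoted into the dynamic statement, while the search value itself stays a bound parameter.

using System;
using System.Data.SqlClient;

// Check the requested identifier against the catalog with an ordinary parameterized query.
// The connection is assumed to be open already.
static bool ColumnExists(SqlConnection connection, string tableName, string columnName)
{
    const string sql =
        "SELECT COUNT(*) FROM sys.columns " +
        "WHERE object_id = OBJECT_ID(@table) AND name = @column";

    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.AddWithValue("@table", tableName);
        command.Parameters.AddWithValue("@column", columnName);
        return (int)command.ExecuteScalar() > 0;
    }
}

// Build the dynamic statement only from a validated, bracket-quoted column name;
// the search value itself still travels as a parameter.
static SqlDataReader SearchWidgets(SqlConnection connection, string requestedColumn, string searchValue)
{
    if (!ColumnExists(connection, "dbo.Widgets", requestedColumn))
        throw new ArgumentException("Unknown column: " + requestedColumn);

    string quoted = "[" + requestedColumn.Replace("]", "]]") + "]";
    string sql = "SELECT Id, " + quoted + " FROM dbo.Widgets WHERE " + quoted + " = @value";

    var command = new SqlCommand(sql, connection);
    command.Parameters.AddWithValue("@value", searchValue);
    return command.ExecuteReader();
}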
A coworker and I were browsing SO when we came across a question about SQL Injection, and it got us wondering: how do parametrized queries work internally? Does the API you are using (assuming it supports parametrized queries) perform concatenation, combining the query with the parameters? Or do the parameters make it to the SQL engine separately from the query, and no concatenation is performed at all?
Google hasn't been very helpful, but maybe we haven't searched for the right thing.
The parameters make it to the SQL engine separately from the query. The execution plan is calculated (or reused) for the parameterized query, and then the query is executed by the SQL engine with the parameters.
Parameters make it to the SQL server intact, individually "packaged" with metadata indicating their type, whether they are Input or Output, etc. As Alex Reitbort points out, this is because parameterized statements are a server-level concept, not merely a convenient way of invoking commands from various connection layers.
I doubt that SQL Server builds a complete query string from the given parameterized query with the parameter list concatenated in.
It most likely parses the given parameterized command string, splitting it into an internal data structure based on reserved words and symbols (SELECT, FROM, ",", "+", etc.). Within that data structure there are properties/places for values like table names, literals, etc. It is here that it copies (verbatim) each passed-in parameter from the list into the proper section of that structure.
So your @UserName value of: 'x';delete from users --
never needs to be escaped; it is just used as the literal value it really is.
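To make that concrete, here is a small C#/ADO.NET sketch (the users table and the connection string are assumptions): the hostile-looking value below is shipped to the server as a separate, typed value and is only ever compared as a string literal.

using System.Data.SqlClient;

static int CountMatches(string connectionString)
{
    // Hostile-looking input, passed through untouched.
    string userName = "'x';delete from users --";

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand(
        "SELECT COUNT(*) FROM users WHERE username = @UserName", connection))
    {
        // The value travels alongside the statement as a typed parameter,
        // so the server only ever treats it as a string to compare against.
        command.Parameters.AddWithValue("@UserName", userName);

        connection.Open();
        return (int)command.ExecuteScalar();   // 0 unless a user literally has that name
    }
}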
Parameters are passed along with the query (not within the query), and are automatically escaped by the API as they are sent in accordance with the underlying database communications protocol.
For example, you might have
Query: <<<<select * from users where username = :username>>>>
Param: <<<<:username text<<<<' or '1' = '1>>>>>>>>
That's not the exact encoding any database protocol actually uses, but you get the idea.
Currently I am designing a database schema where one table will contain details about all students of a university.
I am thinking about how to build the search query for administrators who will search for students. (Some properties are Age, Location, Name, Surname, etc.; approx. 20 properties in 1 table.)
My idea is to create the SQL query dynamically from the code side. Is that the best way, or are there better approaches?
Should I use a stored procedure?
Are there any other ways?
Feel free to share.
I am going to assume you have a front end that collects user input, executes a query and returns a result. I would say you HAVE to create the query dynamically from the code side; at the very least you will need to pass in the variables the user selected to query by. I would probably create a method that takes in the key/value search data and uses that to execute the query. Because it is only one table, there is probably no need for a view or stored procedure. I think a simple SELECT statement including your search criteria will work fine.
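A rough sketch of such a method in C#/ADO.NET (the Students table, the column names, and the shape of the criteria dictionary are assumptions): each selected criterion adds one condition, and the user's values are always bound as parameters rather than concatenated.

using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

// Whitelist of searchable properties (key as sent by the front end -> real column name).
static readonly Dictionary<string, string> SearchableColumns =
    new Dictionary<string, string>
    {
        { "age", "Age" }, { "location", "Location" },
        { "name", "Name" }, { "surname", "Surname" }
    };

static SqlCommand BuildSearch(SqlConnection connection, IDictionary<string, object> criteria)
{
    var command = new SqlCommand { Connection = connection };
    var conditions = new List<string>();
    int i = 0;

    foreach (var pair in criteria.Where(p => SearchableColumns.ContainsKey(p.Key)))
    {
        string paramName = "@p" + i++;
        conditions.Add(SearchableColumns[pair.Key] + " = " + paramName);
        command.Parameters.AddWithValue(paramName, pair.Value);   // value is bound, not concatenated
    }

    command.CommandText = "SELECT * FROM Students" +
        (conditions.Count > 0 ? " WHERE " + string.Join(" AND ", conditions) : "");
    return command;
}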
I would suggest you use LINQ to SQL; it allows you to write such queries purely in C# code without any SQL procedures, and LINQ to SQL takes care of security and prevents SQL injection.
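A small sketch of what that could look like (the Student entity mapping and property names are assumptions); LINQ to SQL translates the lambda into a parameterized query, so the search value is never concatenated into the SQL text:

using System.Collections.Generic;
using System.Data.Linq;
using System.Data.Linq.Mapping;
using System.Linq;

[Table(Name = "Students")]
public class Student
{
    [Column(IsPrimaryKey = true)] public int Id { get; set; }
    [Column] public string Name { get; set; }
    [Column] public string Surname { get; set; }
    [Column] public int Age { get; set; }
    [Column] public string Location { get; set; }
}

static List<Student> FindBySurname(string connectionString, string surnameFromInput)
{
    using (var db = new DataContext(connectionString))
    {
        // The filter value becomes a SQL parameter automatically.
        return db.GetTable<Student>()
                 .Where(s => s.Surname == surnameFromInput)
                 .ToList();
    }
}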
p.s.
Do not ever compose SQL from concatenated strings like SQL = "select * from table where " + "param1=" + param1 ... :)