I execute SQL scripts to change the database schema. It looks something like this:
using (var command = connection.CreateCommand())
{
command.CommandText = script;
command.ExecuteNonQuery();
}
Additionally, the commands are executed within a transaction.
The script looks like this:
Alter Table [TableName]
ADD [NewColumn] bigint NULL
Update [TableName]
SET [NewColumn] = (SELECT somevalue FROM anothertable)
I get an error because NewColumn does not exist. It seems the whole script is parsed and validated before any of it is executed.
When I execute the whole thing in Management Studio, I can put GO between the statements and then it works. When I put GO into the script, ADO.NET complains (Incorrect syntax near 'GO').
I could split the script into separate scripts and execute them in separate commands, but that would be hard to handle. I could split it on every GO, parsing the script myself. I just think that there should be a better solution and that I've misunderstood something. How should scripts like this be executed?
My implementation, based on John Saunders' answer, in case anyone is interested:
List<string> lines = new List<string>();
while (!textStreamReader.EndOfStream)
{
    string line = textStreamReader.ReadLine();
    bool isSeparator = line.Trim().Equals("go", StringComparison.OrdinalIgnoreCase);

    // Accumulate lines until a GO separator (or the end of the file).
    if (!isSeparator)
    {
        lines.Add(line);
    }

    // Execute the current batch on a GO line or at end of file. (An earlier
    // version executed on EndOfStream before adding the line just read,
    // which dropped the last line of the final batch.)
    if (isSeparator || textStreamReader.EndOfStream)
    {
        ExecuteCommand(string.Join(Environment.NewLine, lines.ToArray()));
        lines.Clear();
    }
}
Not using one of the umpteen ORM libraries to do it? Good :-)
To be completely safe when running scripts that make structural changes, use SMO rather than SqlClient, and make sure MARS is not turned on via the connection string (SMO will normally complain if it is anyway). Look for the ServerConnection class and its ExecuteNonQuery method - a different DLL, of course :-)
The difference is that the SMO DLL passes the script as-is to SQL Server, so it's a genuine equivalent of running it in SSMS or via the isql command line. Slicing on GOs tends to grow into ever more elaborate scanning every time you encounter another glitch (GO can sit in the middle of a multi-line comment, there can be multiple USE statements, a script can even drop the very DB that SqlClient connected to - oops :-). I just killed one such thing in the codebase I inherited (after more complex scripts conflicted with MARS - MARS is good for production code but not for admin stuff).
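For reference, a minimal sketch of the SMO route (untested; it assumes references to the SMO assemblies, and the connection string is a placeholder):
using System.Data.SqlClient;
using Microsoft.SqlServer.Management.Common;

// ServerConnection.ExecuteNonQuery understands GO batch separators, so the
// script can be run unmodified, much as in SSMS.
using (var sqlConnection = new SqlConnection("Data Source=.;Integrated Security=SSPI"))
{
    var serverConnection = new ServerConnection(sqlConnection);
    serverConnection.ExecuteNonQuery(script);
}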
You have to run each batch separately. In particular, to run a script that may contain multiple batches ("GO" keywords), you have to split the script on the "GO" keywords.
Not Tested:
string script = File.ReadAllText("script.sql");
string[] batches = script.Split(new [] {"GO"+Environment.NewLine}, StringSplitOptions.None);
foreach (string batch in batches)
{
// run ExecuteNonQuery on the batch
}
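If the simple split proves too brittle (it misses a trailing GO with no newline after it, and lower-case go), a line-anchored, case-insensitive regex is a common refinement - though still not aware of comments or string literals, as another answer points out. Also untested:
using System.Text.RegularExpressions;

// Split on lines that consist solely of GO (any casing, optional whitespace).
string[] batches = Regex.Split(
    script, @"^\s*GO\s*$", RegexOptions.Multiline | RegexOptions.IgnoreCase);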
Related
I am very new to working with databases. I can now write SELECT, UPDATE, DELETE, and INSERT commands, but I have seen in many forums that it's preferred to write:
SELECT empSalary from employee where salary = @salary
...instead of:
SELECT empSalary from employee where salary = txtSalary.Text
Why do we always prefer to use parameters and how would I use them?
I wanted to know the use and benefits of the first method. I have even heard of SQL injection but I don't fully understand it. I don't even know if SQL injection is related to my question.
Using parameters helps prevent SQL Injection attacks when the database is used in conjunction with a program interface such as a desktop program or web site.
In your example, a user can directly run SQL code on your database by crafting statements in txtSalary.
For example, if they were to write 0 OR 1=1, the executed SQL would be
SELECT empSalary from employee where salary = 0 or 1=1
whereby all empSalaries would be returned.
Further, a user could perform far worse commands against your database, including deleting it. If they wrote 0; Drop Table employee:
SELECT empSalary from employee where salary = 0; Drop Table employee
The table employee would then be deleted.
In your case, it looks like you're using .NET. Using parameters is as easy as:
string sql = "SELECT empSalary from employee where salary = @salary";

using (SqlConnection connection = new SqlConnection(/* connection info */))
using (SqlCommand command = new SqlCommand(sql, connection))
{
    var salaryParam = new SqlParameter("@salary", SqlDbType.Money);
    salaryParam.Value = txtSalary.Text;

    command.Parameters.Add(salaryParam);
    connection.Open();
    var results = command.ExecuteReader();
}
Dim sql As String = "SELECT empSalary from employee where salary = @salary"

Using connection As New SqlConnection("connectionString")
    Using command As New SqlCommand(sql, connection)
        Dim salaryParam = New SqlParameter("@salary", SqlDbType.Money)
        salaryParam.Value = txtSalary.Text

        command.Parameters.Add(salaryParam)
        connection.Open()
        Dim results = command.ExecuteReader()
    End Using
End Using
Edit 2016-4-25:
As per George Stocker's comment, I changed the sample code to not use AddWithValue. Also, it is generally recommended that you wrap IDisposables in using statements.
You are right, this is related to SQL injection, which is a vulnerability that allows a malicious user to execute arbitrary statements against your database. This old-time favorite XKCD comic illustrates the concept:
In your example, if you just use:
var query = "SELECT empSalary from employee where salary = " + txtSalary.Text;
// and proceed to execute this query
You are open to SQL injection. For example, say someone enters into txtSalary:
1; UPDATE employee SET salary = 9999999 WHERE empID = 10; --
1; DROP TABLE employee; --
// etc.
When you execute this query, it will perform a SELECT and an UPDATE or DROP, or whatever they wanted. The -- at the end simply comments out the rest of your query, which would be useful in the attack if you were concatenating anything after txtSalary.Text.
The correct way is to use parameterized queries, eg (C#):
SqlCommand query = new SqlCommand(
    "SELECT empSalary FROM employee WHERE salary = @sal;");
query.Parameters.AddWithValue("@sal", txtSalary.Text);
With that, you can safely execute the query.
For reference on how to avoid SQL injection in several other languages, check bobby-tables.com, a website maintained by a SO user.
In addition to the other answers, it's worth adding that parameters not only help prevent SQL injection, they can also improve query performance. SQL Server caches parameterized query plans and reuses them on repeated executions. Without parameters, SQL Server compiles a new plan for each execution (with some exclusions) whenever the query text differs.
More information about query plan caching
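To sketch what that reuse looks like in practice (table and column names follow the question; the connection is assumed to be open):
// The command text is identical on every execution, so SQL Server can compile
// the plan once and reuse it; only the parameter value changes per call.
using (var cmd = new SqlCommand(
    "SELECT empSalary FROM employee WHERE salary = @salary", connection))
{
    cmd.Parameters.Add("@salary", SqlDbType.Money);

    foreach (decimal salary in new[] { 100m, 200m, 300m })
    {
        cmd.Parameters["@salary"].Value = salary;
        using (var reader = cmd.ExecuteReader())
        {
            // consume results
        }
    }
}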
Two years after my first go, I'm recidivating...
Why do we prefer parameters? SQL injection is obviously a big reason, but could it be that we're secretly longing to get back to SQL as a language? SQL in string literals is already a weird cultural practice, but at least you can copy and paste your request into Management Studio. SQL dynamically constructed with host-language conditionals and control structures, when SQL has conditionals and control structures, is just level 0 barbarism. You have to run your app in debug, or with a trace, to see what SQL it generates.
Don't stop with just parameters. Go all the way and use QueryFirst (disclaimer: which I wrote). Your SQL lives in a .sql file. You edit it in the fabulous TSQL editor window, with syntax validation and Intellisense for your tables and columns. You can assign test data in the special comments section and click "play" to run your query right there in the window. Creating a parameter is as easy as putting "#myParam" in your SQL. Then, each time you save, QueryFirst generates the C# wrapper for your query. Your parameters pop up, strongly typed, as arguments to the Execute() methods. Your results are returned in an IEnumerable or List of strongly typed POCOs, the types generated from the actual schema returned by your query. If your query doesn't run, your app won't compile. If your db schema changes and your query runs but some columns disappear, the compile error points to the line in your code that tries to access the missing data. And there are numerous other advantages. Why would you want to access data any other way?
In SQL, a word prefixed with the @ sign is a variable. You set a value in it and can use it in any number of places in the same script; its scope is restricted to that single script, so you can declare variables of the same name and type in many scripts. We use variables a lot in stored procedures, because stored procedures are pre-compiled queries and we can pass values into these variables from a script, a desktop application, or a website. For further information, read Declare Local Variable, Sql Stored Procedure and sql injections.
Also read Protect from sql injection; it will guide you in how you can protect your database.
Hope it helps you understand; if you have any questions, leave a comment.
Old post, but I wanted to ensure newcomers are aware of stored procedures.
My 10¢ worth here is that if you are able to write your SQL statement as a stored procedure, that, in my view, is the optimum approach. I ALWAYS use stored procs and never loop through records in my main code. For example: SQL Table > SQL Stored Procedures > IIS/Dot.NET > Class.
When you use stored procedures, you can restrict the user to EXECUTE permission only, thus reducing security risks.
Your stored procedure is inherently parameterised, and you can specify input and output parameters.
The stored procedure (if it returns data via SELECT statement) can be accessed and read in the exact same way as you would a regular SELECT statement in your code.
It also runs faster as it is compiled on the SQL Server.
Did I also mention you can do multiple steps, e.g. update a table, check values on another DB server, and then, once finished, return data to the client, all on the same server, with no interaction with the client? So this is MUCH faster than coding this logic in your code.
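For illustration, calling such a procedure from .NET might look like this (the procedure and parameter names are made up for the example):
// CommandType.StoredProcedure makes ADO.NET issue an RPC call to the proc;
// the parameter is strongly typed and never concatenated into SQL text.
using (var cmd = new SqlCommand("dbo.GetEmployeesBySalary", connection))
{
    cmd.CommandType = CommandType.StoredProcedure;
    cmd.Parameters.Add("@salary", SqlDbType.Money).Value = 50000m;

    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            // read rows exactly as you would from a plain SELECT
        }
    }
}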
Other answers cover why parameters are important, but there is a downside! In .NET, there are several methods for creating parameters (Add, AddWithValue), but they all require you to worry, needlessly, about the parameter name, and they all reduce the readability of the SQL in the code. Right when you're trying to meditate on the SQL, you need to hunt around above or below to see what value has been used in the parameter.
I humbly claim my little SqlBuilder class is the most elegant way to write parameterized queries. Your code will look like this...
C#
var bldr = new SqlBuilder( myCommand );
bldr.Append("SELECT * FROM CUSTOMERS WHERE ID = ").Value(myId);
//or
bldr.Append("SELECT * FROM CUSTOMERS WHERE NAME LIKE ").FuzzyValue(myName);
myCommand.CommandText = bldr.ToString();
Your code will be shorter and much more readable. You don't even need extra lines, and, when you're reading back, you don't need to hunt around for the value of parameters. The class you need is here...
using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;

public class SqlBuilder
{
    private StringBuilder _rq;
    private SqlCommand _cmd;
    private int _seq;

    public SqlBuilder(SqlCommand cmd)
    {
        _rq = new StringBuilder();
        _cmd = cmd;
        _seq = 0;
    }

    public SqlBuilder Append(String str)
    {
        _rq.Append(str);
        return this;
    }

    public SqlBuilder Value(Object value)
    {
        string paramName = "@SqlBuilderParam" + _seq++;
        _rq.Append(paramName);
        _cmd.Parameters.AddWithValue(paramName, value);
        return this;
    }

    public SqlBuilder FuzzyValue(Object value)
    {
        string paramName = "@SqlBuilderParam" + _seq++;
        _rq.Append("'%' + " + paramName + " + '%'");
        _cmd.Parameters.AddWithValue(paramName, value);
        return this;
    }

    public override string ToString()
    {
        return _rq.ToString();
    }
}
This happened to me today.
My MVC.NET application had been running fine for a few months. Today it threw an error when executing this part of the code (this is the simplified version):
var cmd = db.Database.Connection.CreateCommand();
cmd.CommandText = $"mySchema.myStoredProcedureName {param1}";
db.Database.CommandTimeout = 0;
db.Database.Connection.Open();
var reader = cmd.ExecuteReader();
where db is an EF6 DbContext.
The timeout occurred on the last line.
I tried the using syntax - no success.
I also tried the following, in case the connection was not open:
while (db.Database.Connection.State != ConnectionState.Open)
{
    db.Database.Connection.Open();
}
No success.
The stored procedure returns its result in 2 seconds in SSMS.
Finally, I created an identical stored procedure with another name.
Then it worked.
My question:
Did MSSQL blacklist my stored procedure?
I don't think it was blacklisted. Is it possible that your indexes were in need of a rebuild? In other words, the renaming may not really have fixed the problem; some other sort of SQL Server maintenance behind the scenes did.
My educated guess is that the server provider did something that affected you, if you did not change any code.
What is your recommended way to import .csv files into Microsoft SQL Server 2008 R2?
I'd like something fast, as I have a directory with a lot of .csv files (>500MB spread across 500 .csv files).
I'm using SQL Server 2008 R2 on Win 7 x64.
Update: Solution
Here's how I solved the problem in the end:
I abandoned trying to use LINQ to Entities to do the job. It works, but it doesn't support bulk insert, so it's about 20x slower. Maybe the next version of LINQ to Entities will support this.
I took the advice given in this thread and used bulk insert.
I created a T-SQL stored procedure that uses bulk insert. Data goes into a staging table, is normalized, then copied into the target tables.
I mapped the stored procedure into C# using the LINQ to Entities framework (there is a video on www.learnvisualstudio.net showing how to do this).
I wrote all the code to cycle through files, etc., in C#.
This method eliminates the biggest bottleneck, which is reading tons of data off the drive and inserting it into the database.
The reason why this method is extremely quick at reading .csv files? Microsoft SQL Server gets to import the files directly from the hard drive straight into the database, using its own highly optimized routines. Most of the other C# based solutions require much more code, and some (like LINQ to Entities) end up having to pipe the data slowly into the database via the C#-to-SQL-server link.
Yes, I know it'd be nicer to have 100% pure C# code to do the job, but in the end:
(a) For this particular problem, using T-SQL requires much less code compared to C#, about 1/10th, especially for the logic to denormalize the data from the staging table. This is simpler and more maintainable.
(b) Using T-SQL means you can take advantage of the native bulk insert procedures, which speeds things up from a 20-minute wait to a 30-second pause.
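For reference, the core of such a staging-table load might look something like this (table name, file path, and options are placeholders; note that BULK INSERT resolves the path on the server, and the real procedure would also do the normalization step described above):
// BULK INSERT lets SQL Server read the file straight off the disk itself,
// which is what removes the C#-to-SQL-Server bottleneck mentioned above.
using (var cmd = connection.CreateCommand())
{
    cmd.CommandText = @"BULK INSERT dbo.StagingTable
                        FROM 'C:\data\file001.csv'
                        WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2)";
    cmd.CommandTimeout = 0; // large files can take a while
    cmd.ExecuteNonQuery();
}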
Using BULK INSERT in a T-SQL script seems to be a good solution.
http://blog.sqlauthority.com/2008/02/06/sql-server-import-csv-file-into-sql-server-using-bulk-insert-load-comma-delimited-file-into-sql-server/
You can get the list of files in your directory with xp_cmdshell and the dir command (with a bit of cleanup). In the past, I tried to do something like this with sp_OAMethod and VBScript functions and had to use the dir method because I had trouble getting the list of files with the FSO object.
http://www.sqlusa.com/bestpractices2008/list-files-in-directory/
If you have to do anything with the data in the files other than insert it, then I would recommend using SSIS. It can not only insert and/or update, it can also clean the data for you.
The first officially supported way of importing large text files is the command-line tool called "bcp" (Bulk Copy Utility), which is very useful for huge amounts of binary data.
Please check out this link: http://msdn.microsoft.com/en-us/library/ms162802.aspx
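For example, a character-mode import of a single file might look like this (server, database, table, and file path are placeholders):
bcp YourDb.dbo.StagingTable in "C:\data\file001.csv" -S yourServer -T -c -t,
Here -T uses a trusted (Windows) connection, -c selects character format, and -t, makes the comma the field terminator.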
However, in SQL Server 2008 I presume that the BULK INSERT command would be your choice number one, because for a start it became part of the standard command set. If for any reason you have to maintain backward compatibility, I'd stick to the bcp utility, which is available for SQL Server 2000 too.
HTH :)
EDITED LATER: Googling around, I recalled that SQL Server 2000 had the BULK INSERT command too; however, there was obviously some reason I stuck with bcp.exe, and I cannot recall why... perhaps because of some limitations, I guess.
I'd recommend this:
using System;
using System.Data;
using Microsoft.VisualBasic.FileIO;

namespace ReadDataFromCSVFile
{
    static class Program
    {
        static void Main()
        {
            string csv_file_path = @"C:\Users\Administrator\Desktop\test.csv";
            DataTable csvData = GetDataTableFromCSVFile(csv_file_path);
            Console.WriteLine("Rows count:" + csvData.Rows.Count);
            Console.ReadLine();
        }

        private static DataTable GetDataTableFromCSVFile(string csv_file_path)
        {
            DataTable csvData = new DataTable();
            try
            {
                using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
                {
                    csvReader.SetDelimiters(new string[] { "," });
                    csvReader.HasFieldsEnclosedInQuotes = true;

                    // The first row supplies the column names.
                    string[] colFields = csvReader.ReadFields();
                    foreach (string column in colFields)
                    {
                        DataColumn datecolumn = new DataColumn(column);
                        datecolumn.AllowDBNull = true;
                        csvData.Columns.Add(datecolumn);
                    }

                    while (!csvReader.EndOfData)
                    {
                        string[] fieldData = csvReader.ReadFields();
                        // Making empty value as null
                        for (int i = 0; i < fieldData.Length; i++)
                        {
                            if (fieldData[i] == "")
                            {
                                fieldData[i] = null;
                            }
                        }
                        csvData.Rows.Add(fieldData);
                    }
                }
            }
            catch (Exception ex)
            {
                // Swallowing the exception returns an empty table; consider logging ex instead.
            }
            return csvData;
        }
    }
}
//Copy the DataTable to SQL Server using SqlBulkCopy
static void InsertDataIntoSQLServerUsingSQLBulkCopy(DataTable csvFileData)
{
    using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=yourDB;Integrated Security=SSPI;"))
    {
        dbConnection.Open();
        using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
        {
            s.DestinationTableName = "Your table name";
            foreach (DataColumn column in csvFileData.Columns)
                s.ColumnMappings.Add(column.ColumnName, column.ColumnName);
            s.WriteToServer(csvFileData);
        }
    }
}
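Putting the two routines together for the original many-files scenario might look like this (the directory path is a placeholder):
using System.IO;

foreach (string file in Directory.GetFiles(@"C:\data", "*.csv"))
{
    DataTable dt = GetDataTableFromCSVFile(file);
    InsertDataIntoSQLServerUsingSQLBulkCopy(dt);
}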
If the structure of all your CSVs is the same, I recommend you use Integration Services (SSIS) to loop over them and insert all of them into the same table.
I understand this is not exactly your question. But if you get into a situation where you use a straight insert, use tablock and insert multiple rows per statement. It depends on the row size, but I usually go for 600-800 rows at a time. If it is a load into an empty table, it is sometimes faster to drop the indexes and re-create them after the load. Sorting the data on the clustered index before it is loaded helps too. Use IGNORE_CONSTRAINTS and IGNORE_TRIGGERS if you can. Put the database in single-user mode if you can.
USE AdventureWorks2008R2;
GO
INSERT INTO Production.UnitMeasure with (tablock)
VALUES (N'FT2', N'Square Feet ', '20080923'), (N'Y', N'Yards', '20080923'), (N'Y3', N'Cubic Yards', '20080923');
GO
I'm playing with the camel-SQL component in order to run a quartz-scheduled SQL report. My SQL is a lengthy query and I prefer to format it such that it spans many lines (92 lines to be exact) even though it could be formatted into a single reaaaallly long line of text.
My preference is to place this script into an external file and then run it. Alternatively, I could put it into a properties file (not really my preference), but I tried that and even after adding a backslash '\' at the end of each line, it still causes an exception for some reason. Ignoring that issue for the time being, how might I run this script using camel SQL with the script residing in an external file? Seems like it ought to be easy, but I'm not sure how to do it. Thanks for any pointers.
You may use the jdbc component, which accesses databases through JDBC; SQL queries and operations are sent in the message body.
Example:
from("direct:start")
.to("jdbc:myDataSource?useHeadersAsParameters=true")
.log("result = ${body}");
Tested with:
final ProducerTemplate template = context.createProducerTemplate();
template.sendBody("direct:start", "select p.ID, p.PROJECT from projects p");
Instead of passing a static string as the body, you may read your SQL statement from a file:
final String sql = FileUtils.readFileToString(new File("src/main/resources/sql/select.sql"));
template.sendBody("direct:start", sql);
I've got a fully functional (secure) session to a SQL Server database (version 10.50.4000). It is stored in a public variable:
SqlConnection conn = new SqlConnection();
I only want to run SELECT queries. For anything else, the user account has no rights.
The queries are built with only one user entry, which is inserted into a simple Text Box.
Unfortunately I must not tell you the original command text, so I'll make it simple for you:
void print_users(string filtervalue)
{
    SqlCommand cmd = new SqlCommand(
        "SELECT users From groups WHERE group_name LIKE '%@fv%'", this.conn);
    cmd.Parameters.Add("@fv", SqlDbType.NVarChar);
    cmd.Parameters["@fv"].Value = filtervalue;

    var rdr = cmd.ExecuteReader();
    while (rdr.Read())
    {
        //Do something with the answer from the DB
    }
}
But this does not do the trick. I also tried AddWithValue, but had no luck.
Setting a breakpoint on the line where @fv should be replaced, I can step through the code line by line. I can see that the command where @fv should be replaced is processed with no error, but @fv is not replaced (or at least I cannot see the replacement in the debug console).
What am I doing wrong?
EDIT:
thank you for your replies. Leaving out the single quotes (') did the trick.
And I also learned that this is not a string replacement. Thank you.
Just one note: the connection is not left open all the time. It's closed immediately when it's no longer needed, and re-established when needed again - I just forgot to write that into my sample code.
Again: Thank you for your help!
You can't see it being replaced in your debug session; the replacement occurs in the SQL Server code itself.
The client (your code) sends both the SQL string and the value of the parameter as separate things to the server. There the SQL engine 'replaces' the parameter with its value while executing it.
You should also put the 'wildcards' inside your parameter value, not inside the query.
cmd = new SqlCommand("SELECT users From groups WHERE group_name LIKE @fv", this.conn);
cmd.Parameters.Add("@fv", SqlDbType.NVarChar);
cmd.Parameters["@fv"].Value = "%" + filtervalue + "%";
The parameter is not working because it is inside a string literal. You want to build the string like this:
cmd = new SqlCommand("SELECT users From groups WHERE group_name LIKE '%' + @fv + '%'");
While we're at it, keeping a global connection like that is bad. It can cause strange side effects, especially in web apps. Instead, keep a global connection string, and then use that string to create a new connection on each request.
Also, "replace" is the wrong word here. Sql parameters are never replaced, even when they work. That's the whole point. There is no string replacement into your query at any point, ever. It's more like you declared an #fv variable at the server level in a stored procedure, and assigned your data directly to that variable. In this way, there is no possibility for a vulnerability in parameter replacement code, because the data portion of your query remains separate throughout the execution process. In same way, don't think in terms of "sanitizing" a parameter for a query; instead, think in terms of quarantining the data.