RODBC::sqlSave - problems creating/appending to a table - sql-server

Related to several other questions on the RODBC package, I'm having problems using RODBC::sqlSave to write to a table on a SQL Server database. I'm using MS SQL Server 2008 and 64-bit R on a Windows RDP.
The solution in the 3rd link (questions) does work [sqlSave(ch, df)]. But in this case, it writes to the wrong database: my default DB is "C2G" but I want to write to "BI_Sandbox". And it doesn't allow for options such as rownames, etc. So there still seems to be a problem in the package.
Obviously, a possible solution would be to point my ODBC connection at the specified database, but it seems there should be a better method. And this wouldn't solve the problem of unusable parameters in the sqlSave command, such as rownames, varTypes, etc. (A sketch of that connection-string workaround follows, for anyone who just needs something that works.)
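A minimal sketch of that workaround, assuming the same server shown in the DSN below: open a second connection directly against BI_Sandbox with odbcDriverConnect, so sqlSave can use a plain one-part table name and its rownames/varTypes arguments behave as documented. The driver string and the varchar(255) override for duns are assumptions, not tested against this exact setup.
library(RODBC)
ch2 <- odbcDriverConnect(paste0(
  "driver={SQL Server Native Client 11.0};",
  "server=DC01-WIN-SQLEDW\\BISQL01,29537;",  # server taken from the DSN below
  "database=BI_Sandbox;",                    # connect straight to the target DB
  "trusted_connection=yes"))
sqlSave(ch2, zinq_scores, tablename = "table1",
        rownames = FALSE,
        varTypes = c(duns = "varchar(255)"))  # hypothetical column-type override
odbcClose(ch2)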
I have the following ODBC System DSN connection:
Microsoft SQL Server Native Client Version 11.00.3000
Data Source Name: c2g
Data Source Description: c2g
Server: DC01-WIN-SQLEDW\BISQL01,29537
Use Integrated Security: Yes
Database: C2G
Language: (Default)
Data Encryption: No
Trust Server Certificate: No
Multiple Active Result Sets(MARS): No
Mirror Server:
Translate Character Data: Yes
Log Long Running Queries: No
Log Driver Statistics: No
Use Regional Settings: No
Use ANSI Quoted Identifiers: Yes
Use ANSI Null, Paddings and Warnings: Yes
R code:
R> ch <- odbcConnect("c2g")
R> sqlSave(ch, zinq_scores, tablename = "[bi_sandbox].[dbo].[table1]",
append= FALSE, rownames= FALSE, colnames= FALSE)
Error in sqlColumns(channel, tablename) :
‘[bi_sandbox].[dbo].[table1]’: table not found on channel
# after error, try again:
R> sqlDrop(ch, "[bi_sandbox].[dbo].[table1]", errors = FALSE)
R> sqlSave(ch, zinq_scores, tablename = "[bi_sandbox].[dbo].[table1]",
append= FALSE, rownames= FALSE, colnames= FALSE)
Error in sqlSave(ch, zinq_scores, tablename = "[bi_sandbox].[dbo].[table1]", :
42S01 2714 [Microsoft][SQL Server Native Client 11.0][SQL Server]There is already an object named 'table1' in the database.
[RODBC] ERROR: Could not SQLExecDirect 'CREATE TABLE [bi_sandbox].[dbo].[table1] ("credibility_review" float, "creditbuilder" float, "no_product" float, "duns" varchar(255), "pos_credrev" varchar(5), "pos_credbuild" varchar(5))'
In the past, I've gotten around this by running a supremely inefficient row-by-row INSERT INTO via sqlQuery. But when I tried that this time, no data was written, even though the sqlQuery statements completed without any error or warning message.
temp <-"INSERT INTO [bi_sandbox].[dbo].[table1]
+ (credibility_review, creditbuilder, no_product, duns, pos_credrev, pos_credbuild) VALUES ("
>
> for(i in 1:nrow(zinq_scores)) {
+ sqlQuery(ch, paste(temp, "'", zinq_scores[i, 1], "'",",", " ",
+ "'", zinq_scores[i, 2], "'", ",",
+ "'", zinq_scores[i, 3], "'", ",",
+ "'", zinq_scores[i, 4], "'", ",",
+ "'", zinq_scores[i, 5], "'", ",",
+ "'", zinq_scores[i, 6], "'", ")"))
+ }
> str(sqlQuery(ch, "select * from [bi_sandbox].[dbo].[table1]"))
'data.frame': 0 obs. of 6 variables:
$ credibility_review: chr
$ creditbuilder : chr
$ no_product : chr
$ duns : chr
$ pos_credrev : chr
$ pos_credbuild : chr
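For reference, one way to cut this to a single round trip is a multi-row INSERT. A minimal sketch, assuming every value can be safely single-quoted (embedded quotes in the data would break it) and that the frame stays under SQL Server's 1000-row limit for a VALUES list:
vals <- apply(zinq_scores, 1, function(r)
  paste0("('", paste(r, collapse = "','"), "')"))  # one "('v1',...,'v6')" tuple per row
sqlQuery(ch, paste(
  "INSERT INTO [bi_sandbox].[dbo].[table1]",
  "(credibility_review, creditbuilder, no_product, duns, pos_credrev, pos_credbuild)",
  "VALUES", paste(vals, collapse = ", ")))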
Any help would be greatly appreciated.
Also, if there is any missing detail, please let me know and I'll edit the question.

My apologies up front. This is not exactly a "simple example." It's pretty trivial, but there are a lot of parts. And by the end, you'll probably think I'm crazy for doing it this way.
Starting in SQL Server Management Studio
First, I've created a database on SQL Server called mtcars with default schema dbo. I've also added myself as a user. Under my own user name, I am the database owner, so I can do anything I want to the database, but from R, I will connect using a generic account that only has EXECUTE privileges.
The predefined table in the database that we are going to write to is called mtcars. (So the full path to the table is mtcars.dbo.mtcars; it's lazy, I know). The code to define the table is
USE [mtcars]
GO
/****** Object: Table [dbo].[mtcars] Script Date: 2/22/2016 11:56:53 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[mtcars](
[OID] [int] IDENTITY(1,1) NOT NULL,
[mpg] [numeric](18, 0) NULL,
[cyl] [numeric](18, 0) NULL,
[disp] [numeric](18, 0) NULL,
[hp] [numeric](18, 0) NULL
) ON [PRIMARY]
GO
Stored Procedures
I'm going to use two stored procedures. The first is an "UPSERT" procedure, that will first try to update a row in a table. If that fails, it will insert the row into the table.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE dbo.sample_procedure
@OID int = 0,
@mpg numeric(18,0) = 0,
@cyl numeric(18,0) = 0,
@disp numeric(18,0) = 0,
@hp numeric(18,0) = 0
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- TRANSACTION code borrowed from
-- http://stackoverflow.com/a/21209131/1017276
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
UPDATE dbo.mtcars
SET mpg = @mpg,
cyl = @cyl,
disp = @disp,
hp = @hp
WHERE OID = @OID;
IF @@ROWCOUNT = 0
BEGIN
INSERT dbo.mtcars (mpg, cyl, disp, hp)
VALUES (@mpg, @cyl, @disp, @hp)
END
COMMIT TRANSACTION;
END
GO
The other stored procedure I will use is just the equivalent of RODBC::sqlFetch. As far as I can tell, sqlFetch depends on SQL injection, and I'm not allowed to use it. So, to stay on the safe side of our data security policies, I write little procedures like this. (Data security is pretty tight here; you may or may not need this.)
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE dbo.get_mtcars
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
SELECT * FROM dbo.mtcars
END
GO
Now, from R
I have a utility function I use to help me manage inputting data into the stored procedures. sqlSave would do a lot of this automatically, so I'm kind of reinventing the wheel. The gist of the utility function is to determine if the value I'm pushing to the database needs to be nested in quotes or not.
#* Utility function. This does a couple helpful things like
#* Convert NA and NULL into a SQL NULL
#* wrap character strings and dates in single quotes
sqlNullString <- function(value, numeric=FALSE)
{
if (is.null(value)) value <- "NULL"
if (is.na(value)) value <- "NULL"
if (inherits(value, "Date")) value <- format(x = value, format = "%Y-%m-%d")
if (value == "NULL") return(value)
else if (numeric) return(value)
else return(paste0("'", value, "'"))
}
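A few example calls (hypothetical inputs) showing what the helper returns:
sqlNullString(NA)                     # "NULL"  (unquoted SQL NULL)
sqlNullString("abc")                  # "'abc'" (wrapped in single quotes)
sqlNullString(3.5, numeric = TRUE)    # 3.5     (returned as-is, no quotes)
sqlNullString(as.Date("2016-02-22"))  # "'2016-02-22'" (formatted, then quoted)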
This next step isn't strictly necessary, but I'm going to do it just so that my R table is similar to my SQL table. This is organizational strategy on my part.
mtcars$OID <- NA
Now let's establish our connection:
server <- "[server_name]"
uid <- "[generic_user_name]"
pwd <- "[password]"
library(RODBC)
channel <- odbcDriverConnect(paste0("driver=SQL Server;",
"server=", server, ";",
"database=mtcars;",
"uid=", uid, ";",
"pwd=", pwd))
Now this next part is pure laziness. I'm going to use a for loop to push each row of the data frame to the SQL table one at a time. As noted in the original question, this is kind of inefficient. I'm sure I could write a stored procedure to accept several vectors of data, compile them into a temporary table, and do the UPSERT in SQL, but I don't work with large data sets when I'm doing this, so it hasn't yet been worth it to me to write such a procedure. Instead, I prefer to stick with code that is a little easier to reason about given my limited SQL skills.
Here, we're just going to push the first 5 rows of mtcars
#* Insert the first 5 rows into the SQL Table
for (i in 1:5)
{
sqlQuery(channel = channel,
query = paste0("EXECUTE dbo.sample_procedure ",
"#OID = ", sqlNullString(mtcars$OID[i]), ", ",
"#mpg = ", mtcars$mpg[i], ", ",
"#cyl = ", mtcars$cyl[i], ", ",
"#disp = ", mtcars$disp[i], ", ",
"#hp = ", mtcars$hp[i]))
}
And now we'll take a look at the table from SQL
sqlQuery(channel = channel,
query = "EXECUTE dbo.get_mtcars")
This next line is just to match up the OIDs in R and SQL for illustration purposes. Normally, I would do this manually.
mtcars$OID[1:5] <- 1:5
This next for loop will UPSERT all 32 rows. We already have 5, we're UPSERTing 32, and the SQL table at the end should have 32 if we've done it correctly. (That is, SQL will recognize the 5 rows that already exist)
#* Update/Insert (UPSERT) the entire table
for (i in 1:nrow(mtcars))
{
sqlQuery(channel = channel,
query = paste0("EXECUTE dbo.sample_procedure ",
"#OID = ", sqlNullString(mtcars$OID[i]), ", ",
"#mpg = ", mtcars$mpg[i], ", ",
"#cyl = ", mtcars$cyl[i], ", ",
"#disp = ", mtcars$disp[i], ", ",
"#hp = ", mtcars$hp[i]))
}
#* Notice that the first 5 rows were unchanged (though they would have changed
#* if we had changed the data...the point being that the stored procedure
#* correctly identified that these records already existed)
sqlQuery(channel = channel,
query = "EXECUTE dbo.get_mtcars")
Recap
The stored procedure approach has a major disadvantage in that it is blatantly reinventing the wheel. It also requires that you learn SQL. SQL is pretty easy to learn for simple tasks, but some of the code I've written for more complex tasks is pretty difficult to interpret. Some of my procedures have taken me the better part of a day to get right. (once they are done, however, they work incredibly well)
The other big disadvantage to the stored procedure is, I've noticed, it does require a little bit more code work and organization. I'd say it's probably been about 10% more code work and documentation than if I were just using SQL Injection.
The chief advantages of the stored procedures approach are
you have massive flexibility for what you want to do
You can store your SQL code into the database and not pollute your R code with potentially huge strings of SQL code
Avoiding SQL injection (again, this is a data security thing, and may not be an issue depending on your employer's policies. I'm strictly forbidden from using SQL injection, so stored procedures are my only option)
It should also be noted that I've not yet explored using Table-Valued parameters in my stored procedures, which might simplify things for me a bit.

In the past, I've gotten around this by running a supremely inefficient row-by-row INSERT INTO via sqlQuery. But when I tried that this time, no data was written, even though the sqlQuery statements completed without any error or warning message.
I faced this yesterday: in my case the issue was the schema. The table was actually created, but in my user's own schema.
So the first time you can create it, and after that you get this error (that the object already exists).
After investigating, I found that some packages do not work correctly with schemas.
In the end I used the "insert by line" solution. The solution is available here and here.

Unable to pass empty string into non-null database field

I'm stumped on something which should be very straightforward. I have a SQL Server database, and I'm trying to update a non-nullable varchar or nvarchar field with an empty string. I know it's possible, because an empty string '' is not the same thing as NULL. However, using a TADOQuery, it is not allowing me to do this.
I'm trying to update an existing record like so:
ADOQuery1.Edit;
ADOQuery1['NonNullFieldName']:= '';
//or
ADOQuery1.FieldByName('NonNullFieldName').AsString:= '';
ADOQuery1.Post; //<-- Exception raised while posting
If there is anything in the string, even just a single space, it saves just fine, as expected. But, if it is an empty string, it fails:
Non-nullable column cannot be updated to Null.
But it's not null. It's an empty string, which should work just fine. I swear I've passed empty strings many, many times in the past.
Why am I getting this error, and what should I do to resolve it?
Additional details:
Database: Microsoft SQL Server 2014 Express
Language: Delphi 10 Seattle Update 1
Database drivers: SQLOLEDB.1
Field being updated: nvarchar(MAX) NOT NULL
I can reproduce your reported problem with SS2014, the OLEDB driver and Seattle using the code below, along with the difference in behaviour between a table created with MAX as the column size and one created with a specific number (4096 in my case). I thought I would post this as an alternative answer because it not only shows how to investigate this difference systematically but also identifies why the difference arises (and hence how to avoid it in future).
Please refer to and execute the code below as written, i.e. with the UseMAX define active.
Turning on "Use Debug DCUs" in the the project options before executing the code, immediately
reveals that the described exception occurs in Data.Win.ADODB at line 4920
Recordset.Fields[TField(FModifiedFields[I]).FieldNo-1].Value := Data
of TCustomADODataSet.InternalPost and the Debug evaluation window reveals that
Data at this point is Null.
Next, notice that
update jdtest set NonNullFieldName = ''
executes in an SSMS2014 Query window without complaint (Command(s) completed successfully.), so it seems that the
fact that Data is Null at line 4920 is what is causing the problem and the next question is "Why?"
Well, the first thing to notice is that the form's caption is displaying ftMemo
Next, comment out the UseMAX define, recompile and execute. Result: no exception, and notice that the form's caption is now displaying ftString.
And that's the reason: Using a specific number for the column size means that
the table metadata retrieved by the RTL causes the client-side Field to be created
as a TStringField, whose value you can set by a string assignment statement.
OTOH, when you specify MAX, the resulting client-side Field is of type ftMemo,
which is one of Delphi's BLOB types and when you assign
string values to an ftMemo field, you are at the mercy of code in Data.DB.pas, which does all the reading (and writing) to the record buffer using a TBlobStream. The problem with that, as far as I can see after a lot of experiments and tracing through the code, is that the way a TMemoField uses a BlobStream fails to properly distinguish between updating the field contents to '' and setting the field's value to Null (as in System.Variants).
In short, whenever you try to set a TMemoField's value to an empty string, what actually happens is that the field's state is set to Null, and this is what causes the exception in the q. AFAICS, this is unavoidable, so no work-around is obvious, to me at any rate.
I have not investigated whether the choice between ftMemo and ftString is made by the Delphi RTL code or the MDAC(Ado) layer it sits upon: I would expect it is actually determined by the RecordSet TAdoQuery uses.
QED. Notice that this systematic approach to debugging has revealed the
problem & cause with very little effort and zero trial and error, which was
what I was trying to suggest in my comments on the q.
Another point is that this problem could be tracked down entirely without resorting to server-side tools, including the SSMS profiler. There wasn't any need to use the profiler to inspect what the client was sending to the server
because there was no reason to suppose that the error returned by the server
was incorrect. That confirms what I said about starting investigation at the client side.
Also, using a table created on the fly using IfDefed Sql enabled the problem effectively to be isolated in a single step by simple observation of two runs of the app.
Code
uses [...] TypInfo;
[...]
implementation[...]
const
// The following consts are to create the table and insert a single row
//
// The difference between them is that scSqlSetUp1 specifies
// the size of the NonNullFieldName to 'MAX' whereas scSqlSetUp2 specifies a size of 4096
scSqlSetUp1 =
'CREATE TABLE [dbo].[JDTest]('#13#10
+ ' [ID] [int] NOT NULL primary key,'#13#10
+ ' [NonNullFieldName] VarChar(MAX) NOT NULL'#13#10
+ ') ON [PRIMARY]'#13#10
+ ';'#13#10
+ 'Insert JDTest (ID, [NonNullFieldName]) values (1, ''a'')'#13#10
+ ';'#13#10
+ 'SET ANSI_PADDING OFF'#13#10
+ ';';
scSqlSetUp2 =
'CREATE TABLE [dbo].[JDTest]('#13#10
+ ' [ID] [int] NOT NULL primary key,'#13#10
+ ' [NonNullFieldName] VarChar(4096) NOT NULL'#13#10
+ ') ON [PRIMARY]'#13#10
+ ';'#13#10
+ 'Insert JDTest (ID, [NonNullFieldName]) values (1, ''a'')'#13#10
+ ';'#13#10
+ 'SET ANSI_PADDING OFF'#13#10
+ ';';
scSqlDropTable = 'drop table [dbo].[jdtest]';
procedure TForm1.Test1;
var
AField : TField;
S : String;
begin
// Following creates the table. The define determines the size of the NonNullFieldName
{$define UseMAX}
{$ifdef UseMAX}
S := scSqlSetUp1;
{$else}
S := scSqlSetUp2;
{$endif}
ADOConnection1.Execute(S);
try
ADOQuery1.Open;
try
ADOQuery1.Edit;
// Get explicit reference to the NonNullFieldName
// field to make working with it and investigating it easier
AField := ADOQuery1.FieldByName('NonNullFieldName');
// The following, which requires the `TypInfo` unit in the `USES` list is to find out which exact type
// AField is. Answer: ftMemo, or ftString, depending on UseMAX.
// Of course, we could get this info by inspection in the IDE
// by creating persistent fields
S := GetEnumName(TypeInfo(TFieldType), Ord(AField.DataType));
Caption := S; // Displays `ftMemo` or `ftString`, of course
AField.AsString:= '';
ADOQuery1.Post; //<-- Exception raised while posting
finally
ADOQuery1.Close;
end;
finally
// Tidy up
ADOConnection1.Execute(scSqlDropTable);
end;
end;
procedure TForm1.Button1Click(Sender: TObject);
begin
Test1;
end;
The problem occurs when using MAX in the data type. Both varchar(MAX) and nvarchar(MAX) exhibit this behavior. When MAX is removed and replaced with a large number, such as 5000, empty strings are allowed.

R RODBCext and Parameterizing IN statement?

I've been working to parameterize a SQL statement that uses the IN operator in the WHERE clause. I'm using the RODBCext library for parameterizing, but it seems to lack expansion of a list.
I was hoping to write code such as
sqlExecute("SELECT * FROM table WHERE name IN (?)", c("paul","ringo","john", "george")
I'm using the following code but wondered if there's an easier way.
library(RODBC)
library(RODBCext)
# Search inputs
names <- c("paul", "ringo", "john", "george")
# Build SQL statement
qmarks <- replicate(length(names), "?")
stringmarks <- paste(qmarks, collapse = ",")
sql <- paste("SELECT * FROM tableA WHERE name IN (", stringmarks, ")")
# expand to Columns - seems to be the magic step required
bindnames <- rbind(names)
# Execute SQL statement
dbhandle <- RODBC::odbcDriverConnect(connectionString)
result <- RODBCext::sqlExecute(dbhandle, sql, bindnames, fetch = TRUE)
RODBC::odbcClose(dbhandle)
It works but feel I'm using R to expand the strings in the wrong way (bit new to R - so many ways to do the same thing wrong). Somebody will probably say "that creates factors - never do that" :-)
I found this article which suggest I'm on the right track but it doesn't discuss having to expand the "?" and turn the list into columns of a data.frame
R RODBC putting list of numbers into an IN() statement
Thank you.
UPDATE: As Benjamin shows below, the sqlExecute function can handle a list() of inputs. However, upon inspection of the resulting SQL I discovered that it uses cursors to roll up the results. This significantly increases the CPU and I/O over the sample code I show above.
While the library can indeed solve this for you, for large results it may be too expensive. There are two answers and it depends upon your needs.
Since your only parameter in the query is the collection for IN, you could get away with
sqlExecute(dbhandle,
"SELECT * FROM table WHERE name IN (?)",
list(c("paul","ringo","john", "george")),
fetch = TRUE)
sqlExecute will bind the values in the list to the question mark. Here, it will actually repeat the query four times, once for each value in the vector. It may seem kind of silly to do it this way, but when trying to pass strings, it's a lot safer in many ways to let the binding take care of setting up the appropriate quote structure rather than trying to paste it in yourself. You will generate fewer errors this way and avoid a lot of database security concerns.
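To make the "repeats the query four times" point concrete, the single call above behaves roughly like the hand-rolled loop below (a sketch of the observable behaviour, not of RODBCext's internals):
results <- lapply(c("paul", "ringo", "john", "george"), function(nm)
  RODBCext::sqlExecute(dbhandle,
                       "SELECT * FROM table WHERE name = ?",
                       data.frame(name = nm),  # one parameter set per execution
                       fetch = TRUE))
do.call(rbind, results)  # stack the four single-name result sets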
What if you declare a table variable in a character object and then concatenate it with the query?
library(RODBC)
library(RODBCext)
# Search inputs
names <- c("paul", "ringo", "john", "george")
# Build SQL statement
sql_top <- paste0("SET NOCOUNT ON \r\n DECLARE @LST_NAMES TABLE (ID NVARCHAR(20)) \r\n INSERT INTO @LST_NAMES VALUES ('", paste(names, collapse = "'), ('"), "')")
sql_body <- paste("SELECT * FROM tableA WHERE name IN (SELECT id FROM @LST_NAMES)")
sql <- paste0(sql_top, "\r\n", sql_body)
# Execute SQL statement
dbhandle <- RODBC::odbcDriverConnect(connectionString)
result <- RODBCext::sqlExecute(dbhandle, sql, fetch = TRUE)  # no bind parameters; the values are embedded in the SQL
RODBC::odbcClose(dbhandle)
The query will be (the SET NOCOUNT ON is important in order to retrieve the results):
SET NOCOUNT ON
DECLARE @LST_NAMES TABLE (ID NVARCHAR(20))
INSERT INTO @LST_NAMES VALUES ('paul'), ('ringo'), ('john'), ('george')
SELECT * FROM tableA WHERE name IN (SELECT id FROM #LST_NAMES)

Executing a mix of single-row NHibernate Save's and HQL deletes with subselect in a single database call

I'm using NHibernate 3.2.0.4000, SQL Server 2012, C# .NET 4.0 MVC.
I have an NHibernate session in which a class A object is Save'd and an HQL delete statement is executed against a class B object.
The DEBUG level logging showed...
DEBUG NHibernate.SQL delete from MyDb.MySchema.ClassB ...
DEBUG NHibernate.AdoNet.AbstractBatcher ExecuteNonQuery took 5 ms
DEBUG NHibernate.Transaction.AdoTransaction Start Commit
DEBUG NHibernate.AdoNet.AbstractBatcher Adding to batch:INSERT INTO MyDb.MySchema.ClassA ...
DEBUG NHibernate.AdoNet.AbstractBatcher ExecuteBatch for 1 statements took 6 ms
The SQL Profiler shows the 2 statements as being part of 2 separate remote procedure calls RPC:Completed events.
The above log reference to a batch with 1 statement and the profiler data seem to indicate that the 2 statements are being executed in two separate database calls. How can these 2 statements be combined into a single database call, just like is done when using Future and FutureValue with select statements?
Some extra details that might help are:
//The NHibernate Insert code
ClassADao.Save(new ClassA());
//The HQL delete code
session.CreateQuery
(
"delete from Solution.Project.Classes.ClassB as classB " +
"where exists ( " +
"from Solution.Project.Classes.ClassC as classC " +
"where classC.classD_FK.Id = :classD_FK_Id_One " +
"and classC = classB.classC_FK " +
"and classB.classD_FK.Id = :classD_FK_Id_Two " +
") "
)
.SetInt32("classD_FK_Id_One", Id1)
.SetInt32("classD_FK_Id_Two", Id2)
.ExecuteUpdate();
//The DEBUG level logged translation of the above HQL delete statement
delete from MyDb.MySchema.CLASSB
where exists (
select CLASSC1_.id
from MyDb.MySchema.CLASSC CLASSC1_
where CLASSC1_.classd_id = @p0
and classc_id = MyDb.MySchema.CLASSB.classc_id
and MyDb.MySchema.CLASSB.classd_id = @p1
);
//and the fact that I'm using an adonet.batch_size value of 5.
Thanks in advance for all attempts to help.
adonet.batch_size doesn't produce single sp_executesql commands. That wouldn't be possible, because the sp_executesql calls are generated by NHibernate, while the ADO.NET batching happens behind the scenes.
In fact, you cannot see the ADO.NET batches as such in the SQL Profiler. A batch produces a single server call containing multiple commands, but each command still shows up separately in the Profiler.

FreeTDS / SQL Server UPDATE Query Hangs Indefinitely

I'm trying to run the following UPDATE query from a python script (note I've removed the database info):
print 'Connecting to db for update query...'
db = pyodbc.connect('DRIVER={FreeTDS};SERVER=<removed>;DATABASE=<removed>;UID=<removed>;PWD=<removed>')
cursor = db.cursor()
print ' Executing SQL queries...'
for i in range(len(data)):
    sql = '''
    UPDATE product.sanction
    SET action_summary = '{action_summary}'
    WHERE sanction_id = {sanction_id};
    '''.format(sanction_id=data[i][0], action_summary=data[i][1])
    cursor.execute(sql)
cursor.close()
db.commit()
db.close()
However, it hangs indefinitely, no error.
I'm new to pyodbc, but it should be set up correctly considering I'm having no problems performing SELECT queries. I did have to call CAST in the SELECT queries (I cast sanction_id AS INT [an int identity on the database] and action_summary AS TEXT [an nvarchar on the database]) to properly populate data, so perhaps the problem lies somewhere there, but I don't know where to start debugging. Converting the text to NVARCHAR didn't do anything either.
Here's an example of one of the rows in data:
(2861357, 'Exclusion Program: NonProcurement; Excluding Agency: HHS; CT Code: Z; Exclusion Type: Prohibition/Restriction; SAM Number: S4MR3Q9FL;')
I was unable to find my issue, but I ended up using QuerySets rather than running an UPDATE query.

SQL Server: Concatenating WHERE Clauses. Seeking Appropriate Pattern

I want to take a poorly designed SQL statement that's embedded in C# code and rewrite it as a stored procedure (presumably), and am looking for an appropriate means to address the following pattern:
sql = "SELECT <whatever> FROM <table> WHERE 1=1";
if ( someCondition.HasValue )
{
sql += " AND <some-field> = " + someCondition.Value;
}
This is a simplification. The actual statement is quite long and contains several such conditions, some of which include INNER JOIN's to other tables if the condition is present. This last part is key, otherwise I'd probably be able to solve all of them with:
WHERE <some-condition-value> IS NULL OR <some-field> = <some-condition-value>
I can think of a few possible approaches. I'm looking for the correct approach.
Edit:
I don't want to perform concatenation in C#. I consider this a serious compromise to security.
If I understand the question properly, the idea is to replace a whole section of code in C# in charge of producing, "long hand", a specific SQL statement corresponding to a list of search criteria, by a single call to a stored-procedure which would, SQL-side, use a generic template of the query aimed at handling all allowed combinations of search criteria in a uniform fashion.
In addition to the difficulty of mapping expressions evaluated on the application-side (eg. someCondition.HasValue) to expressions evaluated on the SQL-side (eg "some-condition-value"), the solution you envision may be logically/functionally equivalent to a "hand-crafted" SQL statement, but slower and more demanding of SQL resources.
Essentially, the C# code encapsulates specific knowledge about the "physical" layout of the database and its schema. It uses this info to figure out when a particular JOIN may be required or when a particular application-level search criteria value translates to, say, a SQL "LIKE" rather than an "=" predicate. It may also encapsulate business rules such as "when the ZIP code is supplied, search by that rather than by State".
You are right to attempt and decouple the data model (the way the application sees the data) from the data schema (the way it is declared and stored in SQL), but the proper mapping needs to be done somehow, somewhere.
Doing this at the level of the application, with all the expressive power of C# as opposed to say T-SQL, is not necessarily a bad thing, provided it is done
- in a module that is independent of other features of the application
and, where practical,
- it is somewhat data/configuration-driven, so as to allow small changes in the data model (say the addition of a search criterion) to be implemented by changing a configuration file, rather than plugging this in somewhere in the middle of a long series of C# conditional statements.
start with this WHERE clause:
WHERE 1=1
then append all conditions as:
AND <some-field> = " + someCondition.Value;
the optimizer will toss out the 1=1 condition and you don't have to worry about too many ANDs
EDIT based on OP's comment about not wanting to concatenate strings:
here is a very comprehensive article on how to handle this topic:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
it covers all the issues and methods of trying to write queries with multiple optional search conditions
here is the table of contents:
Introduction
The Case Study: Searching Orders
The Northgale Database
Dynamic SQL
Introduction
Using sp_executesql
Using the CLR
Using EXEC()
When Caching Is Not Really What You Want
Static SQL
Introduction
x = @x OR @x IS NULL
Using IF statements
Umachandar's Bag of Tricks
Using Temp Tables
x = @x AND @x IS NOT NULL
Handling Complex Conditions
Hybrid Solutions – Using both Static and Dynamic SQL
Using Views
Using Inline Table Functions
Conclusion
Feedback and Acknowledgements
Revision History
Well you can start with
StringBuilder sb = new StringBuilder();
sb.Append("SELECT <whatever> FROM <table> WHERE 1 = 1 ");
if ( someCondition.HasValue )
{
sb.Append(" AND <some-field> = " + someCondition.Value);
}
// And so on
This will save you the trouble of deciding between the first WHERE and the subsequent ANDs.
[Edit]
You can also try this
Create an SP with all required parameters for the table, and write the query like this.
DECLARE @sqlStatement NVARCHAR(MAX)
SET @sqlStatement = 'SELECT fields1, fields2 FROM TableA WHERE 1 = 1 '
IF (@param1 IS NOT NULL) SET @sqlStatement = @sqlStatement + ' AND Column1 = ' + @param1
IF (@param2 IS NOT NULL) SET @sqlStatement = @sqlStatement + ' AND Column2 = ' + @param2
-- and so on
EXEC sp_executesql @sqlStatement
Also you can try similar SP but with:
SELECT fields1, fields2 FROM TableA WHERE 1 = 1
AND ( ( @param1 IS NULL ) OR ( Column1 = @param1 ) )
AND ( ( @param2 IS NULL ) OR ( Column2 = @param2 ) )
this is definitely injection proof!
