We have an old database product; the master deployment has always done ALTER Database [databasename] SET CONCAT_NULL_YIELDS_NULL OFF. Since this setting is going away, we want to come up with a testing plan for getting rid of it and tracking down what code depends on it.
With this switch off, the documentation says that SELECT 'abc' + NULL FROM sometable results in abc rather than NULL.
I can easily observe this behavior at the connection level; however the application never sets it at the connection level. That's not the question here.
How can I observe by any SQL statement the effect of turning off CONCAT_NULL_YIELDS_NULL only at the database level using .NET System.Data.SqlClient as the database access client?
We audited the entire codebase and removed all instances of SET commands except for transaction isolation level as these hose things pretty hard. While I tried EXEC to see if it would bypass the driver settings and found it didn't, I'm reasonably certain there's some sufficiently exotic SQL construction that could succeed.
I think I have finally understood what you actually want: to change the connection option without changing it. No, there are no ways to achieve this, otherwise I would have heard about them, and Rutzky would have mentioned them in his answer I referenced in my original response.
There is a strict hierarchy in the way the options can be set:
1. Instance-wide settings for all databases, using sp_configure 'user options'. Lowest level defaults.
2. Database-specific options which can be set via ALTER DATABASE. Take precedence over instance defaults.
3. Explicit SET statements executed on the connection. Override everything.
The advice? Change the application's connection string, if you can, so that it uses another driver that doesn't set or touch this option. Assuming, of course, that it doesn't break anything, but this can be tested relatively quickly.
Also, you may try to change the instance defaults, and see if the driver somehow favours them over #2. It is possible that it can be too smart for its own good.
P.S. For the sake of completeness, the only cases of interdependent options that I know of are listed below:
Shortcut settings, for example ANSI_DEFAULTS. Setting it actually affects several options related to ANSI compatibility;
SET LANGUAGE. When you change the connection language, it also overrides the DATEFIRST and DATEFORMAT options.
P.P.S. I decided to leave my previous answer intact, as it still contains some info which might be useful for people who would understand your question the way I initially did.
Even if your application doesn't specify this option explicitly, different ODBC and OLEDB drivers have different defaults for connection options. Depending on the driver, this particular option can either be set implicitly, or inherited from database / server settings.
To see the actual state of this option for connections established by your app, you can set up a Profiler trace, which will show you the effective connection settings.
With regard to overriding the driver's behaviour in setting this option, there is a rather thorough answer to a similar question on DBA StackExchange. There, you can find several possible approaches and see which one suits you best.
On the SQL side, one possible solution is to replace old-style + string concatenations with the newer concat() function, which has been available since SQL Server 2012. This function skips NULL arguments during concatenation, so the result always looks as if concat_null_yields_null is set to off. Depending on the number of SQL code modules in your system, it might be quite an undertaking, especially as it's difficult to distinguish between numerical additions and string concatenations with the naked eye.
I would recommend shopping around for code analysis / refactoring tools that might lend a hand with this. Not sure if SSDT has this capability, but I would start with it, since it's free (and it might make sense to try out Visual Studio 2019, which was released 2 days ago - there might be something there). Other than that, well... RedGate, Idera, ApexSQL, you name it. Someone might have done this already.
This can be achieved with this query
IF (SELECT ''+NULL) IS NULL
BEGIN
SET CONCAT_NULL_YIELDS_NULL OFF
END
ELSE
BEGIN
SET CONCAT_NULL_YIELDS_NULL ON
END
With help from this Microsoft link, I am aware of many tools related to SSIS diagnostics:
Event Handlers (in particular, "OnError")
Error Outputs
Operations Reports
SSISDB Views
Logging
Debug Dump Files
I just want to know what is the basic, "go to" approach for (non-production) diagnostics setup with SSIS. I am a developer who WILL have access to the QA and UAT servers where I will be performing diagnostics.
In my first attempt to find the source of an error, I used SSMS to view operational reports. All I saw was this:
I followed the instructions shown above, but all it did was lead me in a circle. The overview allows me to see the details, but the details show the above message and ask me to go back to the overview. In short, there is zero error information beyond telling me which task failed within the SSIS package.
I simply want to get to a point where I can at last see SOMETHING about the error(s).
If the answer is that I first need to configure an OnError event in my package, then my next question is: what would the basic, "go to" designer-flow look like for that OnError event?
FYI, this question is similar to "best way of logging in SSIS"
I also noticed an overall strategy for success with SSIS in this answer. The author says:
Instrument your code - have it make log entries, possibly recording diagnostics such as check totals or counts. Without this, troubleshooting is next to impossible. Also, assertion checking is a good way to think of error handling for this (does row count in a equal row count in b, is A:B relationship really 1:1).
Sounds good! But I'd like to have a more concrete example...particularly for feeding me the basics of what specific errors were generated.
I'm trying to avoid learning ALL the SSIS diagnostic approaches, just for the purpose of picking one good "all around" approach.
Update
Per Nick.McDermaid's suggestion, I ran this against the SSISDB database:
SELECT * FROM [SSISDB].[catalog].[executions] ORDER BY start_time DESC
This shows me the packages that I manually executed. The timestamps correctly reflect when I ran them. If anything is unusual(?) it is that the reference_id, reference_type and environment_name columns are null. All the other columns have values.
Update #2
I discovered half the answer I'm looking for. The reason no error information is available, is because by default the SSIS package execution logging level is "none". I had to change the logging level.
Nick.McDermaid gave me the rest of the answer by explaining that I don't need to dive into OnError tooling or SSIS logging provider tooling.
I'm not sure what the issue with your reports is, but in answer to the question "Which SSIS diagnostics should I learn", I suggest the vanilla ones out of the box.
In other words use built in SSIS logging (which does not require any additional code) to log failures. Then use the built in reports (once you get them working) to check those logs.
Vanilla functionality requires no maintenance. Custom functionality (i.e. filling your packages up with OnError events) requires a lot more maintenance.
You may find situations where you need to learn some of the SSISDB tricks to troubleshoot but in the first instance, try to get everything you can out of the vanilla reports.
If you need to maintain an existing system on SQL Server 2012 or later, then all of this logging is built in. Manual OnError additions are not guaranteed to be present in the packages you inherit.
The only other thing to be aware of is that script tasks never yield informative errors. I actually suggest you avoid the use of script tasks in SSIS. I feel that if you have to use a script task, you might be using the wrong tool.
Adding to the excellent answer of @Nick.McDermaid.
I use SSIS Catalog error reporting. In most cases it is sufficient, and it offers the following for error analysis:
Usually the first or second error message contains meaningful information about the error. The later ones only say that some error occurred in the data flow.
If you look at the first or second error message in the Error Messages section of the All Messages report, you will see an Error Context hyperlink. Clicking it shows you the environment, connection managers and some variables at the moment the package crashed.
Good error analysis is more an approach and practice than a mere tool selection. Here are my recommendations:
SSIS likes to report an error code instead of a meaningful explanation. So, the Integration Services Error and Message Reference is your friend.
In the error context dump (see above), SSIS includes those variables whose IncludeInDebugDump property is set to true.
I build the SQL commands for a SQL Task or Data Flow source in variables. With IncludeInDebugDump set on these variables, the error context then shows the SQL command that was being executed when the error occurred.
Structure your variables well. If a variable is used only by one task, declare it on that task. Otherwise a mess of dumped variables will do you more harm than good.
My database has had several successive maintainers over the years and any naming guidelines that may have once been in place have been ignored.
I'd like to rename the stored procedures to a consistent format. Obviously I can rename them from within SQL Server Management Studio, but this will not then update the calls made in the website code behind (C#/ASP.NET).
Is there anything I can do to ensure all calls get updated to the new names, short of searching for every single old procedure name in the code? Does Visual Studio have the ability to refactor such stored procedure names?
NB I do not believe my question to be a duplicate of this question as the latter is solely about renaming within the database.
You could make the change in stages:
Copy the stored procedures to new stored procedures under their new names.
Alter the old stored procedures to call the new ones.
Add logging to the old stored procedures when you've changed all the code in the website.
After a while when you're not seeing any calls to the old stored procedures and you're happy you've found all the calls in the web site, you can remove the old stored procedures and logging.
You can move the 'guts' of the SPROC to a new SPROC meeting your new naming conventions, and then leave the original sproc as a shell / wrapper which delegates to the new SPROC.
You can also add an 'audit' table to track when the old wrapper SPROC is called - this way you will know that there are no dependencies on the old SPROC, and the old SPROC can be safely dropped (also, make sure that it isn't just 'your app' using the DB - e.g. cross database joins or other apps)
This has a small performance penalty, and won't really buy you that much (other than being able to 'find' your new SPROCs easier)
You will need to handle this in at least two areas, the application and the database. There could be other areas as well, and you have to be careful not to overlook them.
The Application
A Nice Practice for Future Projects
It helps to abstract your sprocs out. In our apps, we wrap all of our sprocs in a giant class, so I can make calls like this:
Dim SomeData as DataTable = Sprocs.sproc_GetSomeData(5)
That way, the code end is nice and encapsulated. I can go into Sprocs.sproc_GetSomeData and tweak the sproc name in just one place, and of course I can right click on the method and do a symbolic rename to fix the method call solution-wide.
Without the Abstraction
Without that abstraction, you can just do Find In Files (Ctrl+Shift+F) for the sproc name and then, if the results look right, open the files up and Find/Replace all the occurrences.
The Sql Server
Don't Trust View Dependencies
On the SQL server end, theoretically in MSSMS 2008 you can right click on a sproc and select View Dependencies.
That should show you a list of all the places where the sproc is used in the database, however my confidence in this feature is very low. It might be better in SQL 2008, but in previous versions it definitely had problems.
View Dependencies hurt me, and it will take time for that to heal. :)
Wrap It!
You end up having to keep the old sproc around for a while. This is the major reason why renaming sprocs is such a project - it can take a month to finally be done with it.
First replace its contents with some simple TSQL that calls the new sproc with the same parameters, and write some logging so that once some time goes by, you can tell if the old sproc is actually unused.
Finally, when you're sure the old sproc is unused, delete it.
Other Areas?
There could be a lot of other areas as well. Reporting Services springs to mind. SSIS packages. Using the technique of keeping the old sproc around and re-routing to the new one (mentioned above) will help you know if you missed anything, however it won't tell you what you missed. This can lead to much pain!
Good luck!
Short of testing every path in your application to ensure that any calls to the database and the relevant stored procedures have been updated... no.
Use global search and replace (but review each suggested replacement) to try to avoid missing any instances. If your app is well structured then there really should only be 1 place each stored proc is called.
As far as changing your application, I have all my stored procs as settings in the web.config file, so all the names are in one place and can be changed at any time to match changes to the database.
When the application needs to call a stored proc, the name is determined from web.config.
This makes it easier to manage all the potential calls which the application could make to the database services layer.
It will be a bit of a tedious search through your source code and other database objects I'm afraid.
Don't forget SSIS Packages, SQL Agent Jobs, Reporting Services rdl as well as your main application code.
You could use a regular expression like spProc1|spProc2 to search in the source code for all object names at the same time if you have a tool that supports searching through files using regular expressions (I have used RegexBuddy for this in the past)
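As a rough sketch of that regular-expression search, the script below walks a source tree and reports every line that mentions one of the old names. The procedure names, the file extensions and the function names are placeholders I made up for illustration; adjust them to your own project.

```python
import re
from pathlib import Path

# Hypothetical old procedure names; replace with the real list.
OLD_PROC_NAMES = ["spProc1", "spProc2"]


def build_pattern(names):
    """Build one case-insensitive alternation matching any name as a whole word."""
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def find_references(root, names,
                    suffixes=(".cs", ".aspx", ".sql", ".config", ".rdl", ".dtsx")):
    """Yield (path, line_number, matched_text, line) for every hit under root."""
    pattern = build_pattern(names)
    for path in sorted(Path(root).rglob("*")):
        # Only look at regular files with one of the assumed extensions.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in pattern.finditer(line):
                yield path, line_number, match.group(0), line.strip()
```

Calling `list(find_references("path/to/solution", OLD_PROC_NAMES))` gives you a list you can review by hand. The word boundaries stop spProc1 from matching inside spProc10, but names assembled at runtime by string concatenation will still slip through.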
If you want to just cover the possibility you might have missed the odd one you could leave all the previous stored procedures behind for a month and just have them log a custom SQL trace event with APP_NAME(), SUSER_NAME() and any other info you find helpful then have it call the renamed version. Then set up a trace monitoring this event.
If you use a connection to DB, stored procedures etc, you should create a service class to delegate these methods.
This way when something in your database, SP etc changes, you only have to update your service class, and everything is protected from breaking.
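A minimal sketch of such a service class, in Python rather than the C#/VB used elsewhere in this thread. The class name, the procedure name and the `execute` callable are all hypothetical; `execute` stands in for whatever thin wrapper you have around your database driver.

```python
class Sprocs:
    """The single place in the application that knows physical procedure names."""

    # Hypothetical mapping: logical operation -> stored procedure name.
    PROCEDURE_NAMES = {
        "get_some_data": "dbo.GetSomeData",
    }

    def __init__(self, execute):
        # `execute` is any callable taking (procedure_name, params)
        # and returning the resulting rows.
        self._execute = execute

    def get_some_data(self, record_id):
        """Callers use this method and never see the procedure name."""
        return self._execute(self.PROCEDURE_NAMES["get_some_data"], (record_id,))
```

When a procedure is renamed in the database, only the entry in PROCEDURE_NAMES changes. The same mapping could just as well be loaded from a configuration file, as another answer here suggests.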
There are tools for VS that can manage changing a name, such as the built-in refactoring support and ReSharper.
I did this and I relied heavily on global search in my source code for stored procedure names, and on SQL Digger to find SQL procs that called other SQL procs.
http://www.sqldigger.com/
SQL Server (as of SQL 2000) poorly understands its own dependencies, so one is left searching the text of the scripts to find dependencies, which could be other stored procs or substrings of dynamic SQL.
I would obtain a list of references to a procedure by using the following, because SSMS dependencies don't pick up dynamic SQL references or references outside the database.
SELECT OBJECT_NAME(m.object_id), m.*
FROM SYS.SQL_MODULES m
WHERE m.definition LIKE N'%my_sproc_name%'
The SQL needs to be run in every database where there could be references.
syscomments and INFORMATION_SCHEMA.routines have nvarchar(4000) columns. So if "mySprocName" is used at position 3998, it won't be found. syscomments does have multiple lines but ROUTINES truncates. Should you disagree, take it up with gbn.
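To see why the 4000-character limit matters, here is a small simulation in plain Python with no database involved. The definition text is made up; the only things taken from the text above are the 4000-character column width and the name starting at position 3998.

```python
NAME = "mySprocName"
CHUNK = 4000  # width of the nvarchar(4000) columns

# Made-up module definition in which the name starts at 1-based position 3998.
definition = "x" * 3997 + NAME + " -- rest of the procedure"

# A ROUTINES-style search only sees the first 4000 characters.
truncated = definition[:CHUNK]

# A syscomments-style search sees every 4000-character row, but one row at a time.
rows = [definition[i:i + CHUNK] for i in range(0, len(definition), CHUNK)]

print(NAME in definition)               # True: the full text contains the name
print(NAME in truncated)                # False: the name is cut off after "myS"
print(any(NAME in row for row in rows)) # False: the name straddles two rows
```

Searching sys.sql_modules avoids both problems because its definition column holds the whole text in one value.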
Based on that list of dependencies, I'd create new stored procedures, starting with the foundation stored procedures - those with the fewest dependencies. But I'd take care not to prefix the names of the new stored procedures with "sp_".
Verify the foundation procedures work identically to existing ones
Move to the next level of stored procedures - repeat steps 1-3 as needed till the highest level procedure has been processed.
Test the application's switchover to the new procedure - don't wait until all the procedures are updated to test interaction with the application code. This doesn't need to be done for every stored procedure, but waiting to do this wholesale isn't a great approach either.
Developing in parallel has its risks too:
Any changes to existing code need to also be applied to the new code. If possible, work in areas where development is frozen or use a bug fix as an opportunity to migrate to new code rather than apply the patch in two places (while also minimizing downtime for transition).
Use a utility like FileSeek to search the contents of each and every file in your project folder. Don't trust the Windows search - it's slow and user-unfriendly.
So if you had a stored procedure named OldSprocOne and want to rename it to SP_NewONe, search for all occurrences of OldSprocOne, then search for all occurrences of SP_NewONe to check that the new name isn't already being used somewhere else and won't cause problems. Then rename each and every occurrence in the code.
This can be very time consuming and repetitive for larger systems.
I would be more concerned about ignoring the names of the procedures and replacing your legacy DAL with Enterprise Library Data Access Block 5
Database Accessors in Enterprise Library 5 DAAB - Database.ExecuteSprocAccessor
Having code that is like
public Contact FetchById(int id)
{
    return _database.ExecuteSprocAccessor<Contact>
        ("FetchContactById", id).SingleOrDefault();
}
will have at least a billion times more value than having stored procs with consistent names, especially if the current code passes around DataTables or DataSets ::shudders::
I'm all in favor of refactoring any sort of code.
What you really need here is a method for slowly and incrementally renaming your stored procs.
I certainly would not do a global find and replace.
Rather, as you identify small pieces of functionality and understand the relationships between the procs, you can re-factor in small pieces.
Fundamental to this process, though, is source-code control of your database.
If you do not manage changes to your database the same as normal code, you will be in serious trouble.
Have a look at DBSourceTools. http://dbsourcetools.codeplex.com
It's specifically designed to help developers get their databases under source code control.
You need a repeatable method of restoring your database to a specific state - prior to refactoring.
Then re-apply your refactored changes in a controlled way.
Once you have embraced this mindset, this mammoth and error-prone task will become simple.
This is assuming that you use SQL Server 2005 or above. An option that I have used before is to rename the old database object and create a SQL Server synonym with the old name. This allows you to update your objects to whatever convention you choose and replace the references in code, SSIS packages, etc. as you come across them. Then you can concentrate on updating the references in your code gradually over however many maintenance releases you choose (as opposed to breaking them all at once). Once you feel that you've found all references, you can remove the synonym as the code goes to QA.
I know a little about SQL injections and URL decode, but can someone who's more of an expert than me on this matter take a look at the following string and tell me what exactly it's trying to do?
Some kid from Beijing a couple weeks ago tried a number of injections like the one below.
%27%20and%20char(124)%2Buser%2Bchar(124)=0%20and%20%27%27=%27
It's making a guess about the sort of SQL statement that the form data is being substituted into, and assuming that it will be poorly sanitised at some step along the road. Consider a program talking to an SQL server (C-ish code purely for example):
fprintf(sql_connection, "SELECT foo,bar FROM users WHERE user='%s';", username);
However, with the above string, the SQL server sees:
SELECT foo,bar FROM users WHERE user='' and char(124)+user+char(124)=0 and ''='';
Whoops! That wasn't what you intended. What happens next depends on the database back-end and whether or not you've got verbose error reporting turned on.
It's quite common for lazy web developers to enable verbose error reporting unconditionally for all clients and to not turn it off. (Moral: only enable detailed error reporting for a very tight trusted network, if at all.) Such an error report typically contains some useful information about the structure of the database which the attacker can use to figure out where to go next.
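The substitution described above can be reproduced without any database. This is only an illustration of how the statement text ends up looking; the function name is made up, and the query template is the one from the C-ish example.

```python
# The decoded form of the string the attacker submitted.
ATTACK_VALUE = "' and char(124)+user+char(124)=0 and ''='"


def build_query_unsafely(user_value):
    """Deliberately vulnerable: pastes the raw value into the SQL text."""
    return "SELECT foo,bar FROM users WHERE user='%s';" % user_value


query = build_query_unsafely(ATTACK_VALUE)
print(query)
# SELECT foo,bar FROM users WHERE user='' and char(124)+user+char(124)=0 and ''='';
```

The leading quote closes the string the developer opened, and the trailing `''='` pairs up with the developer's closing quote, so the statement stays syntactically valid. On SQL Server, user returns the current database user name, so comparing the pipe-wrapped name with 0 would most likely raise a conversion error, and with verbose error reporting that message would show the name to the attacker.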
Now consider the username '; DESCRIBE TABLE users; SELECT 1 FROM users WHERE 'a'='. And so it goes on... There are a few different strategies here depending on exactly how the data comes out. SQL injection toolkits exist which can automate this process and attempt to automatically dump out the entire contents of a database via an unsecured web interface. Rafal Los's blog post contains a little more technical insight.
You're not limited to the theft of data, either; if you can insert arbitrary SQL, well, the obligatory xkcd reference illustrates it better than I can.
You'll find detailed info here:
http://blogs.technet.com/b/neilcar/archive/2008/03/15/anatomy-of-a-sql-injection-incident-part-2-meat.aspx
These lines are double-encoded -- the first set of encoded characters, which would be translated by IIS, are denoted by %XX. For example, %20 is a space. The second set aren't meant to be translated until they get to the SQL Server and they use the char(xxx) function in SQL.
' and char(124)+user+char(124)=0 and ''='
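If you want to check the decoding yourself, the first layer can be undone with the Python standard library. The second layer is only imitated here with a text replacement, as an illustration of what SQL Server would evaluate; it is not something the web server does.

```python
from urllib.parse import unquote

RAW = "%27%20and%20char(124)%2Buser%2Bchar(124)=0%20and%20%27%27=%27"

# First layer: percent-encoding, undone by the web server.
decoded = unquote(RAW)
print(decoded)   # ' and char(124)+user+char(124)=0 and ''='

# Second layer: char(124) is evaluated by the database; code 124 is the pipe.
pipe = chr(124)
as_evaluated = decoded.replace("char(124)", "'" + pipe + "'")
print(as_evaluated)   # ' and '|'+user+'|'=0 and ''='
```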
That's strange. However, make sure you escape strings so there will be no SQL injections.
Other people have covered what's going on, so I'm going to take a moment to get on my high horse and strongly suggest that, if you aren't already (and I suspect not, from a comment below), you use parameterized queries. They make you immune to SQL injection through the parameter values, because the parameters and the query text are transmitted completely separately. There are also potential performance benefits, yadda yadda, etc.
But seriously, do it.
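For anyone who hasn't seen one, here is a minimal sketch of the difference, using Python's built-in sqlite3 module as a stand-in database. The table, column names and data are made up; with SQL Server you would use your driver's own parameter syntax, but the principle is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_name TEXT, foo TEXT, bar TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'f1', 'b1')")


def find_user_unsafely(connection, user_value):
    """Vulnerable: the value becomes part of the SQL text."""
    sql = "SELECT foo, bar FROM users WHERE user_name = '%s'" % user_value
    return connection.execute(sql).fetchall()


def find_user(connection, user_value):
    """Parameterized: the ? placeholder keeps the value out of the SQL text."""
    sql = "SELECT foo, bar FROM users WHERE user_name = ?"
    return connection.execute(sql, (user_value,)).fetchall()


hostile = "' OR ''='"
print(find_user_unsafely(conn, hostile))  # [('f1', 'b1')] - every row leaks
print(find_user(conn, hostile))           # [] - treated as an odd user name
print(find_user(conn, "alice"))           # [('f1', 'b1')]
```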
I am receiving a message from a commercial program stating that the "LogMessage" stored procedure is not found. There does not appear to be a stored procedure called LogMessage in the associated MS SQL Server 2000 database. What can I do to track down the missing procedure, other than calling the company?
The reason you couldn't find it is because it's not there. Unless you have the original proc, you're going to have to call the company.
Granted, you could take a stab at creating the proc, yourself. But why bother when somebody already has the original proc?
Is this a fresh install of the commercial product? If so, this is completely their responsibility.
Occasionally you will also get this message if you do not have permissions to access the stored procedure. Log in as 'sa' or equivalent to verify that the proc is indeed missing.
LogMessage seems pretty self-explanatory. You could probably take a stab at creating one yourself just to see what happens, if you can't easily get the real thing.
Create a new table called LoggedMessages and just insert to the table when the proc is called. Then see what pops in.
Kind of hacky, but given that it's a logging mechanism, which is tangential to the main features of the app, you could give it a try.
Well, if the company is out of business, unhelpful, etc., or you're in a hurry, then attach a SQL trace, look at what kind of parameters are being passed and how many, and create a stored procedure with that name and signature. It may take some experimentation to get the signature right, depending on the data access API being used. The body of the stored procedure would be empty. Since this is just logging, presumably this will let the rest of the app run, but logging would be off.
Make sure that the rest of the schema is there. Obviously if the entire schema is missing, then this trick won't work.
If you have a maintenance agreement with the vendor, go holler at them; that's what maintenance and support is for.