Basic Approach to Diagnostic Logging in SSIS

Basic Approach to Diagnostic Logging in SSIS - sql-server

Given help from this microsoft link, I am aware of many tools related to SSIS diagnostics:
Event Handlers (in particular, "OnError")
Error Outputs
Operations Reports
SSISDB Views
Logging
Debug Dump Files
I just want to know what is the basic, "go to" approach for (non-production) diagnostics setup with SSIS. I am a developer who WILL have access to the QA and UAT servers where I will be performing diagnostics.
In my first attempt to find the source of an error, I used SSMS to view operational reports. All I saw was this:
I followed the instructions shown above, but all it did was lead me in a circle. The overview allows me to see the details, but the details show the above message and ask me to go back to the overview. In short, there is zero error information beyond telling me which task failed within the SSIS package.
I simply want to get to a point where I can at last see SOMETHING about the error(s).
If the answer is that I first need to configure an OnError event in my package, then my next question is: what would the basic, "go to" designer-flow look like for that OnError event?
FYI, this question is similar to "best way of logging in SSIS"
I also noticed an overall strategy for success with SSIS in this answer. The author says:
Instrument your code - have it make log entries, possibly recording diagnostics such as check totals or counts. Without this, troubleshooting is next to impossible. Also, assertion checking is a good way to think of error handling for this (does row count in a equal row count in b, is A:B relationship really 1:1).
Sounds good! But I'd like to have a more concrete example...particularly for feeding me the basics of what specific errors were generated.
I'm trying to avoid learning ALL the SSIS diagnostic approaches, just for the purpose of picking one good "all around" approach.
Update
Per Nick.McDermaid suggestion, in the SSISDB DB I run this:
SELECT * FROM [SSISDB].[catalog].[executions] ORDER BY start_time DESC
This shows to me the packages that I manually executed. The timestamps correctly reflect when I ran them. If anything is unusual(?) it is that the reference_id, reference_type and environment_name columns are null. All the other columns have values.
Update #2
I discovered half the answer I'm looking for. The reason no error information is available, is because by default the SSIS package execution logging level is "none". I had to change the logging level.
Nick.McDermaid gave me the rest of the answering by explaining that I don't need to dive into OnError tooling or SSIS logging provider tooling.

I'm not sure what the issue with your reports are but in answer to the question "Which SSIS diagnostics should I learn", I suggest the vanilla ones out of the box.
In other words use built in SSIS logging (which does not require any additional code) to log failures. Then use the built in reports (once you get them working) to check those logs.
vanilla functionality requires no maintenance. Custom functionality (i.e. filling your packages up with OnError events) requires a lot more maintenance.
You may find situations where you need to learn some of the SSISDB tricks to troubleshoot but in the first instance, try to get everything you can out of the vanilla reports.
If you need to maintain an SQL 2012 or after existing system, then all of this logging is built in. Manual OnError additions are not guaranteed to be built in
The only other thing to be aware of is that script tasks never yield informative errors. I actually suggest you avoid the use of script tasks in SSIS. I feel that if you have to use a script task, you might be using the wrong tool

Adding to the excellent answer of #Nick.McDermaid.
I use SSIS Catalog error reporting. In most cases, it is sufficient and have the following functionality for error analysis. Emphasis is on the following:
Usually the first or second error message contains meaningful information on error. The latter is some error occurred in the dataflow....
If you look at the first/second error message at All Messages report at Error Messages section, you will see Error Context hyperlink. Invoking it will show you environment, connection managers and some variables at the moment of package crash.
Good error analysis is more an approach and practice than a mere tool selection. Here are my recommendations:
SSIS likes to report error code instead of meaningful explanation. So, Integration Services Error and Message Reference is your friend.
SSIS includes in error context (see above) dump those variables which have Include in ErrorDump property set to true.
I build SQL commands for SQL Task or DataFlow Source in variables. This allows to display SQL command executed at error in error context, when you set incude in Dump property on these variables.
Structure your variables well. If some variable is used only at some task - declare it on this task. Otherwise a mess of dumped variables will hurt you more than do any good.

Related

Play framework evolution script error line number

I'm using evolution scripts, via Slick, using the Play Framework to update the schema for a Microsoft SQL Server database.
This largely works great, except that when something goes wrong, I just get a terse error message, with no indication which line of the script caused the error, and—for large scripts—that makes identifying the error time-consuming and challenging.
For example, consider this error:
An evolution has not been applied properly. Please check the problem and resolve it manually before marking it as resolved. -
We got the following error: , while trying to run this SQL script:
That's it.
Is there a way to get the full error message (through a log file, a configuration setting, etc.) that includes the line number and context, etc. with the evolution failure?

It took me a while, but I finally figured this out.
I edited the conf/logback.xml file to include the following:
<logger name="play.api.db.evolutions" level="DEBUG" />
Each statement in the Ups script is then written to logs/application.log as it is executed, together with its result, including detailed error messages.

What can I do with generated error logs?

I'm currently working on a web application which generates daily error (and non error) logs.
The current system outputs a log per task to a text file, and outputs critical errors as well as "start" and "finish" type messages to an email account.
The current workflow is as follows: scour the email box for errors, then go and find the .txt file to look at the associated errors and find the cause.
There are around 30 txt files split across about 5 servers.
This system was set up before me, but I'm looking for any advice on how to deal with the situation.
I have control of the script forming the error logs so can do pretty much anything - but I'm lost where to start: I'd considered some kind of web facing dashboard tool, maybe output the files to RSS or something?
Are there any external or internal tools I should be using?

Of course you may use the SQL Server Reporting Services or review this comparison table, there are some packages which may support SQL Server but they may be overwhelming for your task.

It's not really clear what your problem is or what you want to do, but if I understand correctly, your biggest problem is that some messages are logged to a log file but others are sent by email. Therefore, there is no single location that has all error messages in it and that makes analysis and troubleshooting difficult.
The best solution would be to use a logging framework that supports multiple logging destinations (file, DB, email) and severities. That would allow you to specify a configuration like "all errors are logged to a text file and critical ones are also sent by email", so you can ensure that you have everything in one place for general analysis but critical errors are also handled with priority.
You didn't mention what programming language you use, but assuming it's .NET-based then log4net and Enterprise Library are two common frameworks and there are many questions about them here on SO. Googling should give you a good idea of the pros and cons for your situation. If you're using a different language then you can look for the equivalent package: log4j (Java), logging (Python) etc.

Linq-To-Sql and MARS woes - A severe error occurred on the current command. The results, if any, should be discarded

We have built a website based on the design of the Kigg project on CodePlex:
http://kigg.codeplex.com/releases/view/28200
Basically, the code uses the repository pattern, with a repository implementation based on Linq-To-Sql. Full source code can be found at the link above.
The site has been running for some time now and just about a year ago we started to get errors like:
There is already an open DataReader associated with this Command which must be closed first.
ExecuteNonQuery requires an open and available Connection. The connection's current state is closed.
These are the closest error examples I can find based on my memory. These errors started to occur when the site traffic started to pick up. After banging my head against the wall, I figured out assumed that the problem is inherit within Linq-To-Sql and how we are using the same connection to call multiple commands in a single web request.
Evenually, I discovered MARS (Multiple Active Result Sets) and added that to the data context's connection string and like magic, all of my errors went away.
Now, fast forward about 1 year and the site traffic has increased tremendously. Every week or so, I will get an error in SQL Server that reads:
A severe error occurred on the current command. The results, if any, should be discarded
Immediately after this error, I receive hundreds to thousands of InvalidCastException errors in the error logs. Basically, this error shows up for each and every call to the Linq-To-Sql data context. Only after I restart the web server do these errors clear up.
I read a post on the Micosoft Support site that descrived my problem (minus the InvalidCastException errors) and stating the solution is that if I'm going to use MARS that I should also use Asncronous Processing=True. I tried this, but it did not solve my problem either.
Not really sure where to go from here. Hopefully someone here has seen and solved this problem before.

I have the same issue. Once the errors start, I have to restart the IIS Application Pool to fix.
I have not been able to reproduce the bug in dev despite trying many different scenarios involving multi-threading, leaving connections open, etc etc.
One possible lead I do have is that amongst the errors in the server Event Log is an OutOfMemoryException for the Application Pool. Perhaps this is the underlying cause of the spurious SQL Datareader errors (a memory leak elsewhere). Although again I haven't been able to reproduce this in dev.
Obviously if you are using a 64 bit OS then this is probably not the cause in your case.

So after much refactoring and re-architecting, we figured out that problem all along is MARS (Multiple Active Result Sets) itself. Not sure why or what happens exactly but MARS somehow gets result sets mixed up and doesn't recover until the web app is restarted.
We removed MARS and the errors stopped.
If I remember correctly, we added MARS to solve the problem where a connection/command was already closed using LinqToSql and we tried to access an object graph that hadn't been loaded. Without MARS, we'd get an error. But when we added MARS, it seemed to not care about it. This is really a great example of us not really understanding what the heck we were doing and we learned some valuable (and expensive) lessons from this.
Hope this helps others who have experienced this.
Thanks to all how have contributed their comments and answers.

I understand you figured out the solution..
Following is not a direct solution to the problem; but it is good for others to take a look at
What does "A severe error occurred on the current command. The results, if any, should be discarded." SQL Azure error mean?
http://social.msdn.microsoft.com/Forums/en-US/bbe589f8-e0eb-402e-b374-dbc74a089afc/severe-error-in-current-command-during-datareaderread

is there a way to find out which statement in TSQL stored procedure crashed?

this article http://www.novicksoftware.com/tipsandtricks/tips-erorr-handling-in-a-stored-procedure.htm makes this claim about SQL Server development: "errors must be checked after every sql statement of interest". The fairly vague online descriptions of the sql debugger for TSQL neither refute nor support this claim.
So, it it really the case that an "easy to debug" stored procedure is one that has 50% of its code dedicated to detecting errors? Or are there better ways to do it, that could come closer to the sort of ease of finding crashes that we are familiar with in stack trace based debugging in modern programming environments?
Is this an area that calls for creating clever new tools to fill the gaps left by imperfection of the existing programming environment or one that calls for me to get with the program and learn some well known state-of-the-art way of getting this stuff done?
ETA: got it on the try-catch, thanks. In fact, to expand on that, here Recording SQL Server call stack when reporting errors is a discussion of how to emulate stack trace since apparently stack trace is not yet supported by SQL Server. Well, at least that's how it can be emulated in your own codebase written after reviewing the article - a legacy codebase without all this stuff would be harder to deal with.

You can use:
ERROR_LINE()
to get information about the line where problem occurred. To use this, you would have to use "try ... catch" for error handling. This information is available in "catch" block.
Documentation : http://msdn.microsoft.com/en-us/library/ms175976.aspx
Examples : http://blog.sqlauthority.com/2007/04/11/sql-server-2005-explanation-of-trycatch-and-error-handling/

SQL injection attempt on my server

I know a little about SQL injections and URL decode, but can someone who's more of an expert than me on this matter take a look at the following string and tell me what exactly it's trying to do?
Some kid from Beijing a couple weeks ago tried a number of injections like the one below.
%27%20and%20char(124)%2Buser%2Bchar(124)=0%20and%20%27%27=%27

It's making a guess about the sort of SQL statement that the form data is being substituted into, and assuming that it will be poorly sanitised at some step along the road. Consider a program talking to an SQL server (Cish code purely for example):
fprintf(sql_connection, "SELECT foo,bar FROM users WHERE user='%s';");
However, with the above string, the SQL server sees:
SELECT foo,bar FROM users WHERE user='' and char(124)+user+char(124)=0 and ''='';
Whoops! That wasn't what you intended. What happens next depends on the database back-end and whether or not you've got verbose error reporting turned on.
It's quite common for lazy web developers to enable verbose error reporting unconditionally for all clients and to not turn it off. (Moral: only enable detailed error reporting for a very tight trusted network, if at all.) Such an error report typically contains some useful information about the structure of the database which the attacker can use to figure out where to go next.
Now consider the username '; DESCRIBE TABLE users; SELECT 1 FROM users WHERE 'a'='. And so it goes on... There are a few different strategies here depending on exactly how the data comes out. SQL injection toolkits exist which can automate this process and attempt to automatically dump out the entire contents of a database via an unsecured web interface. Rafal Los's blog post contains a little more technical insight.
You're not limited to the theft of data, either; if you can insert arbitrary SQL, well, the obligatory xkcd reference illustrates it better than I can.

You'll find detailed info here:
http://blogs.technet.com/b/neilcar/archive/2008/03/15/anatomy-of-a-sql-injection-incident-part-2-meat.aspx
These lines are double-encoded -- the
first set of encoded characters, which
would be translated by IIS, are
denoted by %XX. For example, %20 is a
space. The second set aren't meant to
be translated until they get to the
SQL Server and they use the char(xxx)
function in SQL.

' and char(124)+user+char(124)=0 and ''='
that's strange..however, make sure you escape strings so there will be no sql injections

Other people have covered what's going on, so I'm going to take a moment to get on my high-horse and strongly suggest that if you're not already (I suspect not from a comment below) that you use parameterized queries. They literally make you immune to SQL injection because they cause parameters and the query to be transmitted completely separately. There's also potential performance benefits, yadda yadda, etc.
But seriously, do it.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight