What does a database log keep track of? - sql-server

I'm quite new to SQL Server and was wondering what the difference is between the SQL Server log and a custom log (in my case, using log4net). I guess there's more choice on what to log using log4net, but what things are automatically logged by the database? For example, if a user signs up to my site, would I have to manually log that transaction, or would that be recorded in the database's log automatically? I'm currently starting a project and would like to figure out exactly what I should bother logging.

Apples and Oranges.
Log4net and other custom 'logging' is just a way to capture events an application is reporting. 'Log' in this context refers to whatever store is used by this infrastructure to persist information about these events.
The database log, on the other hand, is something completely different. In order to maintain consistency and atomicity, databases use a so-called write-ahead logging (WAL) protocol. In WAL, all changes are first durably written to a journal, or log, before being applied to the data. This allows recovery to replay the log (the journal) and get the data back into a consistent state by rolling back any uncommitted work.
Database logs have absolutely nothing to do with your application code. Any database update will be automatically logged by the engine, simply because this is how any data is updated in a database. You cannot modify that, nor do you have any access to what's written in the log (strictly speaking you can look into the log, but you won't find any useful information for your application).
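To make the write-ahead idea concrete, here is a toy sketch in Python. It is only an illustration of the principle (log first, apply second, recover from the log); the class and function names are made up for this example, and it says nothing about how SQL Server actually lays out its log.

```python
# Toy write-ahead log: an illustration only, not SQL Server's real log format.
class ToyDatabase:
    def __init__(self):
        self.log = []    # the journal; assumed to survive a crash in this toy model
        self.data = {}   # the data itself; may contain uncommitted changes

    def begin(self, txid):
        self.log.append(("BEGIN", txid))

    def write(self, txid, key, value):
        # The change is recorded in the log before it is applied to the data.
        self.log.append(("WRITE", txid, key, value))
        self.data[key] = value

    def commit(self, txid):
        self.log.append(("COMMIT", txid))


def recover(log):
    """Rebuild the data from the log, keeping only committed transactions."""
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    data = {}
    for record in log:
        if record[0] == "WRITE" and record[1] in committed:
            _, _, key, value = record
            data[key] = value
    return data


db = ToyDatabase()
db.begin(1)
db.write(1, "alice", "signed up")
db.commit(1)
db.begin(2)
db.write(2, "bob", "signed up")   # "crash" happens here, before the commit

recovered = recover(db.log)
print(recovered)
```

Transaction 2 never committed, so recovery drops its write even though the change had already reached the data. Note that nothing in the log says "a user signed up"; it only records which values changed.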

The SQL Server log handles transaction logging for rolling back or committing data. It is usually only dealt with by someone who knows what they are doing, for example when restoring backups or shipping the logs for use as backups.
Log4net and other logging frameworks handle in-code logging of exceptions, warnings, or debug-level info that you would like to output for your own information. The output can be sent to a table in a database, a command window, a flat file or a web service. Common logging scenarios are catching unhandled exceptions at the application level to help track down bugs, or writing out the stack trace in any try/catch statements.

It keeps track of the transactions so it can roll them back or replay them in case of a crash. That is considerably more involved than simple logging.

The two are almost completely unrelated.
A database log is used to roll back transactions, recover from crashes, etc. All good things to ensure database consistency. It has updates/inserts/deletes in it--not really anything about intent or what your app is trying to do unless it directly affects data in the database.
The application log on the other hand (with Log4Net) can be extremely useful when building and debugging your application. It is driven by you and should contain information that traces what your app is doing. This is something that can safely be turned off or reduced (by toggling the log level) when you no longer need it.

The SQL Server log file is actually used for maintaining its own stability, but it's not terribly useful for normal developers. It's not what you think (and what I thought), a list of SQL statements that have been run. It's a proprietary format designed to help SQL Server recover from a crash or roll back transactions.
If you need to track what's going on in the system, the SQL transaction log won't be helpful, and it would be very difficult to get that information back out. Instead, I would suggest adding triggers on your tables that write information off to another table, or adding some code in your data layer that saves off a log of what's going on. It could be as simple as wrapping the SQL command object with your own implementation, which saves SQL statements off to log4net in addition to whatever normal code it was executing.
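As a rough sketch of the "wrap the command object" idea, here it is in Python, with the standard `logging` and `sqlite3` modules standing in for log4net and SQL Server. The names `LoggingConnection`, `ListHandler` and the `sql_audit` logger are hypothetical; in .NET you would wrap your own command or data-access class in the same spirit.

```python
import logging
import sqlite3

logger = logging.getLogger("sql_audit")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Keeps log messages in memory so the example can show what was logged."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


class LoggingConnection:
    """Thin wrapper that logs every statement before passing it on."""

    def __init__(self, conn):
        self._conn = conn

    def execute(self, sql, params=()):
        logger.info("SQL: %s | params: %r", sql, params)
        return self._conn.execute(sql, params)

    def commit(self):
        self._conn.commit()


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.INFO)

db = LoggingConnection(sqlite3.connect(":memory:"))
db.execute("CREATE TABLE users (name TEXT)")
db.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
db.commit()

for message in handler.messages:
    print(message)
```

Because the application only ever talks to the wrapper, every statement ends up in the application log without touching the database's own transaction log.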

It is the mechanism by which the RDBMS can assure atomicity and consistency; see ACID.


How can I turn logging off / on while database is running? Postgres

I'm trying to upgrade a cluster. The problem is that the versions are very far apart, both for the cluster and the extensions. I managed to dump and restore, modify tables, modify the code, etc.
But now I somehow have to do replication too, since we can't afford downtime. Existing replication systems cannot be used due to the massive difference between pretty much everything. My idea is to have all the queries logged; then an application written by me will take these, modify them to comply with the new constraints, and execute them on the new databases (assuming the queries are somewhat compatible in terms of side effects).
I finished writing the application, but I need now to turn on logging on the old databases so I can replicate after I dump/reload (the database will be in backup mode so I can operate on the committed data; the rest of the data will be in the logs)
After setting all the required configs, it boils down to logging_collector which requires restart (unlike others where pg_reload_conf() is enough)
Is there some way to turn on logging without restarting?
Yes, you can run
ALTER SYSTEM SET log_statement = 'all';
SELECT pg_reload_conf();
to start logging all statements. To turn it off again, run
ALTER SYSTEM RESET log_statement;
SELECT pg_reload_conf();
But I doubt that that will work as you intend. First, statements like COPY ... TO will be logged, but the data won't, so you cannot capture that. There are some other cases, like the fastpath API, that won't be logged (this also affects large objects). Take a look at pgreplay, which does something like what you intend, and look at its limitations.
I recommend that you use a trigger-based replication solution like Slony-I.

Managing high-volume writes to SQL Server database

I have a web service that is used to manage files on a filesystem that are also tracked in a Microsoft SQL Server database. We have a .NET system service that watches for files that are added using the FileSystemWatcher class. When a file-added callback comes from FileSystemWatcher, metadata about the file is added to our database, and it works fairly well.
I've now come to a bit of a scalability problem. I'm adding large quantities of files to the filesystem in rapid succession, and this ends up hammering the database with file adds which results in locking up my web front-end.
I have yet to work on database scalability issues, so I'm trying to come up with mitigation tactics. I was thinking of perhaps caching file adds and only writing them off to the database every five minutes or so, but I'm not sure how practical that is. This is data that needs to find its way into our database at some point anyway, and so it's going to have to get hammered at some point. Maybe I could limit the number of file db entries written per second to a certain amount, but then I risk having that amount be less than the rate at which files are added. How can I best tackle this?
Have you thought about using something like SQL Server Service Broker? That way you could push through tons of entries in a burst and it would level out the inserts into your database.
Basically you'd be pushing messages onto a queue which would then be consumed by a receiver stored procedure that would perform the insert for you. You could limit the maximum number of receivers executing to help with the responsiveness issues in your web interface.
There's a nice intro paper here. Although it's for 2005, not much has changed between 2005 and the newer versions of SQL Server.
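The load-levelling effect of a queue with a capped number of receivers can be sketched in a few lines of Python. This is only an in-process analogy using `queue` and `threading`; it is not Service Broker, and `drain_with_limited_receivers` and `fake_insert` are names invented for the illustration.

```python
import queue
import threading


def drain_with_limited_receivers(messages, handle, max_receivers=2):
    """Queue a burst of messages and process them with at most max_receivers workers."""
    pending = queue.Queue()
    for message in messages:
        pending.put(message)          # producers only enqueue; this is cheap

    def receiver():
        while True:
            try:
                message = pending.get_nowait()
            except queue.Empty:
                return                # queue drained, receiver exits
            handle(message)           # the expensive part, e.g. the insert

    workers = [threading.Thread(target=receiver) for _ in range(max_receivers)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


inserted = []
lock = threading.Lock()


def fake_insert(path):
    # Stand-in for the database insert performed by a receiver.
    with lock:
        inserted.append(path)


burst = [f"file_{i}.txt" for i in range(100)]
drain_with_limited_receivers(burst, fake_insert, max_receivers=2)
print(len(inserted))
```

However large the burst, no more than `max_receivers` inserts run at once, which is what keeps the front end responsive.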
You have a performance problem and you should approach it with a performance investigation methodology like Waits and Queues. Once you identify the actual problem, we can discuss solutions.
This is just a guess but, assuming the notification 'update metadata' code is a straightforward insert, the likely problem is that you're generating one transaction per notification. This results in commit flush waits; see Diagnosing Transaction Log Performance. Batch commit (aggregate multiple notifications before committing) is the canonical solution.
The first option is to use caching to handle the high-volume data, or to use clusters for analyzing high-volume data. Please click here for more information.
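A minimal sketch of batch commit, written in Python against an in-memory SQLite database purely so it is runnable; the `files` table, the `insert_batched` helper and the batch size of 500 are assumptions for the example. The same shape applies to a .NET service writing to SQL Server: collect notifications, then commit them together.

```python
import sqlite3


def insert_batched(conn, paths, batch_size=500):
    """Insert file paths in batches, committing once per batch instead of once per row."""
    commits = 0
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        # "with conn" wraps the batch in one transaction: commit on success,
        # rollback if an exception is raised.
        with conn:
            conn.executemany(
                "INSERT INTO files (path) VALUES (?)",
                [(path,) for path in batch],
            )
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT NOT NULL)")

pending = [f"/data/file_{i}.txt" for i in range(1200)]  # notifications collected so far
commits = insert_batched(conn, pending)
row_count = conn.execute("SELECT COUNT(*) FROM files").fetchone()[0]
print(commits, row_count)
```

Here 1200 rows cost 3 commits rather than 1200, so the engine waits on the log flush far less often.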

Monitor account activity for SQL Server 2005

I'm looking for the easiest way to view what users are logging into my database. We have some old user accounts that might not be getting used anymore. Instead of just turning them off and seeing who complains, I thought there might be some way to monitor who logs in and runs some type of query over the next month or so. What would be the easiest way to monitor and track this kind of activity?
I would like to do this for all databases on the server.
To see who's connecting, you can use Logon Triggers, which allow you to log access. Running a trace for a month or two to audit login events may simply not work if you fail over, restart SQL Server, etc.
However, to see what someone is doing after connecting, you'll really have to use Profiler, like Mitch said.
Run a Profiler trace with the Audit Login event selected, or just select the Standard Trace Template (and perhaps limit the trace size).
See Using SQL Server Profiler
The easiest way to do this would be with a third-party tool that's custom-written to do the work for you. Otherwise you have to fuss with (not SQL Profiler but) traces, regularly loading resulting data, and processing it, and for my money, that just is not an "easy" thing to do.
Not much help. The reason I'm posting is that just because someone (or something) hasn't logged in for a day, a week, or a month, does not mean that the account has gone derelict--I would only consider it an indication. I would recommend that once you've identified it as potentially derelict, then you disable it and see what happens. Give that a month, a quarter, or even a year (depends on your system) before actually deleting it.
(Of course, tracking that information over a month/quarter/year is yet more fuss and bother. Ideally, all accounts get created with deactivation/deletion rules, and their users/owners are informed of the rules under which they get to access the system. This probably won't help you now, but keep it in mind for the next system you design.)

Sharing transactional space between two connections

There's an app that starts a transaction on SQL Server 2008 and moves some data around. Then, while the transaction is still not committed, the app prints out some labels. It is very important that the transaction is not committed until printing succeeded; if a printing error occurs, everything is rolled back.
Now, the printing engine a) has grown quite huge and complex, and b) is eventually required from lots of places. It has therefore been decided to separate the engine and make it a service.
Yes, it is possible to pass all data required for printing from the client app to that server so that the server only prints and is not concerned about databases. However, that would mean leaving piles of code and label templates in each application that requires printing; effectively, very little separation would then occur. On the contrary, it would be extremely efficient (and easier for me to write and maintain) to just pass the IDs of what is required to the service, which would then go to the database and get the data. All formats and layouts will be centralized and apps will only ask for 5 delivery notes from print job 12345.
Now, this is not going to happen as the transaction is still not committed at the moment of printing. The service would not be able to read the data, and using READ UNCOMMITTED is not quite an option.
I was going to use the good old sp_bindsession to join the two sessions, app's and service's, but then it is suddenly deprecated and to be removed from future releases. The help suggests I use MARS or distributed transactions instead, but I can't see how they would help.
Any advice?
My gut feeling is that attempting to share a transaction between two processes in this way is not a good idea.
My approach would be either to pass all data to the service, or to investigate alternatives to keeping the transaction open for the duration of the printing - would a simpler mechanism (such as an IsPrinted flag for each record) not suffice?
Failing that, the easiest way I can see of doing this would be to have the printing service pass all of its SQL requests back through to the originating process so that they can be executed in the context of the original transaction.
Only sp_getbindtoken/sp_bindsession can do what you ask, and it is deprecated and will be removed.
In theory you should use short transactions, represent the 'printing' state as a committed state, and have compensating actions if the print fails. Also, if the printing engine is exposed as a service, it should be autonomous and receive as a message all data it needs to print (like label templates). I understand this is easy for me to say but may be a major undertaking on the product.
For the moment I think your best bet is to use the session binding tokens. Although I have to call out that leaving transactions open for the duration of physical operations (printing) is a very bad practice.
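The short-transaction approach described in this answer can be sketched as follows. This is a Python/SQLite illustration under assumed names: the `print_jobs` table, its `status` values and the `move_and_print` helper are all hypothetical, and a real compensating action would also have to undo the data movement, not just flip a status.

```python
import sqlite3


def move_and_print(conn, job_id, print_labels):
    """Commit a 'PRINTING' state, print outside any transaction, then finish or compensate."""
    # Short transaction 1: do the data work and record the job as PRINTING.
    with conn:
        conn.execute("UPDATE print_jobs SET status = 'PRINTING' WHERE id = ?", (job_id,))
    try:
        print_labels(job_id)  # slow physical work; no transaction is open here
    except Exception:
        # Compensating action in its own short transaction.
        with conn:
            conn.execute("UPDATE print_jobs SET status = 'FAILED' WHERE id = ?", (job_id,))
        return False
    # Short transaction 2: mark the job as done.
    with conn:
        conn.execute("UPDATE print_jobs SET status = 'PRINTED' WHERE id = ?", (job_id,))
    return True


def working_printer(job_id):
    pass  # pretend the labels printed fine


def failing_printer(job_id):
    raise RuntimeError("printer jam")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE print_jobs (id INTEGER PRIMARY KEY, status TEXT)")
with conn:
    conn.executemany("INSERT INTO print_jobs VALUES (?, ?)", [(1, "NEW"), (2, "NEW")])

first_ok = move_and_print(conn, 1, working_printer)
second_ok = move_and_print(conn, 2, failing_printer)
statuses = dict(conn.execute("SELECT id, status FROM print_jobs"))
print(statuses)
```

Because the 'PRINTING' state is committed, a separate print service could read the rows for the job on its own connection without sharing the caller's transaction.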

Separate production database for logging

I've got an application in test that's logging events for a departmental Windows forms application to a SQL Server database. The application is nearly ready for production. The logging database is completely separate from the application database.
My question is, do I really need to create a production version of the logging database, or can I just log production and test events to the same database? The obvious answer is yes, of course I need a separate database. You never want test and production environments to mix. But in this case, the data being written to the database isn't really production data, exactly, it's just logging details that we use to troubleshoot problems. It has no business value, and nothing significant would be lost if the data were to be inadvertently dropped or the database were to be temporarily unavailable. And having it all in a test environment would make it much simpler for me to manage.
So on a pros and cons basis using a single logging database for all environments seems like a better solution. But it just doesn't feel right. Can anybody give me any specific reasons why this is a bad idea?
Your logging may not work if your dev server is down but prod isn't. I can guarantee that is exactly when something critical you need logged on prod will happen. In our case prod and dev are not in the same physical location, which would mean sending logging data across our network, causing pipeline bottlenecks and cranky network guys.
Plus what if you decide to change the logging process? While you are doing new development, the entire prod process might break.
And there will be times when someone might read the logs and panic at some error, forgetting that it happened on dev. Or worse, someone might see a bad error that they thought was happening on dev that was really happening on prod.
I would say:
Use a standard methodology for logging (a single DLL or similar) and actually house the logging DB in production.
That way, your logging database can be considered a "Logging Server" and ALL apps (Dev, Staging, Test, and Prod) can log to it, since you are using a vetted library.
Of course, you have to still watch out that you don't flood the server...
I don't see anything wrong with keeping it on the dev box, unless your app will fail if it's unable to properly log, or unless the information being logged is more valuable than you indicate.
On the upside, keeping your logging db on your dev server will help take the load for handling this data off of the production server - a definite plus where performance is concerned.
Keeping logs in a database is a bad idea in the first place. What happens when you can't connect to the DB for some reason or another? I suggest you use log4net and implement RollingFileAppenders. They write log entries to a file and when the file hits a certain limit, log4net starts writing to a new file. If you have questions getting set up, feel free to ask. I would be glad to help!
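To show the rolling-file behaviour in runnable form, here is a sketch using Python's standard `logging.handlers.RotatingFileHandler`, which works on the same principle. It is an analogy rather than log4net configuration; the temporary directory, the `app_events` logger name and the tiny 1024-byte limit are assumptions chosen so the rollover is visible.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()                 # throwaway directory for the example
log_path = os.path.join(log_dir, "app.log")

# Roll over when the file reaches about 1 KB and keep at most 3 old files.
handler = logging.handlers.RotatingFileHandler(log_path, maxBytes=1024, backupCount=3)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

app_log = logging.getLogger("app_events")    # hypothetical logger name
app_log.setLevel(logging.INFO)
app_log.addHandler(handler)

for i in range(200):
    app_log.info("event number %d", i)
handler.close()

log_files = sorted(os.listdir(log_dir))
print(log_files)
```

The current file plus a bounded number of older files are kept, so logging keeps working without a database connection and the disk usage stays capped.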
