Separate production database for logging

I've got an application in test that logs events for a departmental Windows Forms application to a SQL Server database. The application is nearly ready for production. The logging database is completely separate from the application database.
My question is: do I really need to create a production version of the logging database, or can I just log production and test events to the same database? The obvious answer is yes, of course I need a separate database. You never want test and production environments to mix. But in this case, the data being written to the database isn't really production data, exactly; it's just logging details that we use to troubleshoot problems. It has no business value, and nothing significant would be lost if the data were inadvertently dropped or the database were temporarily unavailable. And having it all in a test environment would make it much simpler for me to manage.
So on a pros-and-cons basis, using a single logging database for all environments seems like the better solution. But it just doesn't feel right. Can anybody give me any specific reasons why this is a bad idea?

Your logging may not work if your dev server is down but prod isn't. You can practically guarantee that will happen exactly when something critical on prod needs to be logged. In our case, prod and dev are not in the same physical location, which would mean sending logging data across our network, causing pipeline bottlenecks and cranky network guys.
Plus what if you decide to change the logging process? While you are doing new development, the entire prod process might break.
And there will be times when someone reads the logs and panics at some error, forgetting that it happened on dev. Or worse, someone might dismiss a bad error as a dev problem when it was really happening on prod.

I would say:
Use a standard methodology for logging (a single DLL or similar) and actually house the logging DB in production.
That way, your logging database can be considered a "Logging Server" and ALL apps (Dev, Staging, Test, and Prod) can log to it, since you are using a vetted library.
Of course, you still have to watch out that you don't flood the server...
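A minimal sketch of what such a vetted library might look like, assuming a hypothetical LogEntries table plus config entries named LoggingDb and EnvironmentName (all of these names are illustrative, not a prescribed schema):

using System;
using System.Configuration;
using System.Data.SqlClient;

public static class CentralLogger
{
    // Connection string and environment tag come from each app's own config,
    // so the same DLL works unchanged in Dev, Staging, Test, and Prod.
    private static readonly string ConnString =
        ConfigurationManager.ConnectionStrings["LoggingDb"].ConnectionString;
    private static readonly string EnvName =
        ConfigurationManager.AppSettings["EnvironmentName"]; // e.g. "Prod"

    public static void Log(string appName, string level, string message)
    {
        try
        {
            using (var conn = new SqlConnection(ConnString))
            using (var cmd = new SqlCommand(
                "INSERT INTO LogEntries (LoggedAt, Environment, AppName, Level, Message) " +
                "VALUES (@ts, @env, @app, @level, @msg)", conn))
            {
                cmd.Parameters.AddWithValue("@ts", DateTime.UtcNow);
                cmd.Parameters.AddWithValue("@env", EnvName);
                cmd.Parameters.AddWithValue("@app", appName);
                cmd.Parameters.AddWithValue("@level", level);
                cmd.Parameters.AddWithValue("@msg", message);
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
        catch (SqlException)
        {
            // Never let logging take the application down; swallow the error
            // or fall back to a local file here.
        }
    }
}

Because the environment tag lives in each app's config file, the same DLL can be deployed everywhere and the central logs stay filterable per environment.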

I don't see anything wrong with keeping it on the dev box, unless your app will fail if it's unable to properly log, or unless the information being logged is more valuable than you indicate.
On the upside, keeping your logging db on your dev server will help take the load for handling this data off of the production server - a definite plus where performance is concerned.

Keeping logs in a database is a bad idea in the first place. What happens when you can't connect to the DB for one reason or another? I suggest you use log4net and implement RollingFileAppenders. They write log entries to a file, and when the file hits a certain size limit, log4net starts writing to a new file. If you have questions getting set up, feel free to ask. I would be glad to help!
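For instance, a RollingFileAppender can be set up programmatically in a few lines (log4net also supports XML configuration); the file path, size limit, and backup count below are just placeholder values:

using log4net;
using log4net.Appender;
using log4net.Config;
using log4net.Layout;

public static class LogSetup
{
    public static void Configure()
    {
        var layout = new PatternLayout("%date [%thread] %-5level %logger - %message%newline");
        layout.ActivateOptions();

        var appender = new RollingFileAppender
        {
            File = "logs\\app.log",   // placeholder path
            AppendToFile = true,
            RollingStyle = RollingFileAppender.RollingMode.Size,
            MaximumFileSize = "10MB", // roll to a new file at this size
            MaxSizeRollBackups = 10,  // keep at most 10 rolled files
            StaticLogFileName = true,
            Layout = layout
        };
        appender.ActivateOptions();

        BasicConfigurator.Configure(appender);
    }
}

// Usage:
// LogSetup.Configure();
// ILog log = LogManager.GetLogger(typeof(Program));
// log.Info("Application started");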

Related

server, client debugging

I'm writing a multiprocess server in C and I'm just wondering what the best tools are to debug and test my programs? Specifically, what is being sent to the client and vice versa.
Thank you for your help.
Every process should write a log. It's not exactly a debugging tool like gdb, but it is very useful.
Every log entry should contain a precise timestamp, the process ID, and socket data. You can write the log to a file (or files), to a database, or maybe to a log server. Logging to a database (e.g. SQLite) is useful because it's easy to filter the log for a specific time range, a specific client connection, etc. It's also easy to merge the logs of different processes (SQLite: ATTACH DATABASE). On Linux, I would consider using syslog.
Specify different logging levels. Detailed logging helps you debug your code in the development phase. Basic logging will help you track down the rare errors that emerge in the long term. Make sure you can turn logging on and off and set logging levels easily, without shutting down your server.

Postgres Database Error Invalid Page Header

I'm using the Django ORM to access a PostgreSQL database, and on rare occasions Django will throw a DatabaseError like django.db.utils.DatabaseError: invalid page header in block 299560 of relation base/83966/84778.
I've researched this, and it seems to be due to the database getting corrupted somehow. This is immensely frustrating, because I've always shut down the database cleanly when rebooting, and every check I can run on my disk drive says there's nothing wrong with the disk itself. Therefore, I can only conclude that PostgreSQL is not actually ACID compliant and is corrupting my data in rare instances.
The only fix I've been able to find is to drop and recreate my database. Obviously, this isn't really a fix, since I'm losing all my data. Is there any other way to resolve this, or should I switch to a more reliable database like MySQL?
I'm running Postgresql-8.4.8 on Ubuntu 10.04.
Most of the time when you see this, you either have bad memory or a bad drive. The difference between PostgreSQL and MySQL is that PostgreSQL sees it and complains, as it should, while MySQL often just keeps on going. I think the db that stops when the machine corrupts the data store is the more reliable db, because it lets you know right up front that there are issues with your system.
BTW, PostgreSQL can survive an emergency shutdown (pull the plug out the back of the machine) just fine as long as the hard drives aren't lying about fsync.
Try memtest86 to see if your memory's ok, and do something like
sudo dd if=/dev/sdc1 of=/dev/null
to see if you get any errors. Is there anything in your dmesg or message logs about drive read/write errors?

What does a database log keep track of?

I'm quite new to SQL Server and was wondering what the difference is between the SQL Server log and a custom log (in my case, using log4net)? I guess there's more choice in what to log using log4net, but what things are automatically logged by the database? For example, if a user signs up to my site, would I have to manually log that transaction, or would it be recorded in the database's log automatically? I'm currently starting a project and would like to figure out exactly what I should bother logging.
Thanks
Apples and Oranges.
Log4net and other custom 'logging' is just a way to capture events an application is reporting. 'Log' in this context refers to whatever store that infrastructure uses to persist information about these events.
The database log, on the other hand, is something completely different. In order to maintain consistency and atomicity, databases use a so-called Write-Ahead Logging (WAL) protocol. With WAL, all changes are first durably written into a journal, or log, before being applied to the data. This allows recovery to replay the log (the journal) and bring the data back into a consistent state, rolling back any uncommitted work.
Database logs have absolutely nothing to do with your application code. Any database update will be automatically logged by the engine, simply because this is how any data is updated in a database. You cannot modify that, nor do you have any access to what's written in the log (strictly speaking, you can look into the log, but you won't find any useful information for your application).
The SQL log handles transaction logging for rolling back or committing data. It is usually only dealt with by someone who knows what they are doing, when restoring backups or shipping the logs for use in backups.
Log4net and other logging frameworks handle in-code logging of exceptions, warnings, or debug-level info that you would like to output for your own use. The output can be sent to a table in a database, a command window, a flat file, or a web service. Common logging scenarios are catching unhandled exceptions at the application level to help track down bugs, or writing out the stack trace in try/catch statements.
It keeps track of the transactions so it can roll them back, or replay them in case of a crash. That's quite a bit more involved than simple logging.
The two are almost completely unrelated.
A database log is used to rollback transactions, recover from crashes, etc. All good things to ensure database consistency. It has updates/inserts/deletes in it--not really anything about intent or what your app is trying to do unless it directly affects data in the database.
The application log on the other hand (with Log4Net) can be extremely useful when building and debugging your application. It is driven by you and should contain information that traces what your app is doing. This is something that can safely be turned off or reduced (by toggling the log level) when you no longer need it.
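For instance, with log4net the threshold can be changed while the app is running; a minimal sketch, assuming the default hierarchy repository:

using log4net;
using log4net.Core;
using log4net.Repository.Hierarchy;

public static class LogLevelSwitch
{
    // Raise or lower the global threshold at runtime, e.g. from an admin screen.
    public static void Set(Level level)
    {
        var hierarchy = (Hierarchy)LogManager.GetRepository();
        hierarchy.Root.Level = level;
    }
}

// Usage: LogLevelSwitch.Set(Level.Warn); // quiet things down once the app is stable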
The SQL Server log file is actually used for maintaining its own stability, but it's not terribly useful for normal developers. It's not what you might think (and what I thought): a list of SQL statements that have been run. It's a proprietary format designed to help SQL Server recover from a crash or roll back transactions.
If you need to track what's going on in the system, the SQL transaction log won't be helpful, and it would be very difficult to get that information back out of it. Instead, I would suggest adding triggers on your tables that write information off to another table, or adding some code in your data layer that saves off a log of what's going on. It could be as simple as wrapping the SQL command object with your own implementation, which saves SQL statements off to log4net in addition to executing them as normal.
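A minimal sketch of that last idea, using a hypothetical LoggingCommand decorator around IDbCommand (only two of the interface's methods are shown):

using System.Data;
using log4net;

// Hypothetical decorator: log every statement before delegating to the real command.
public class LoggingCommand
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(LoggingCommand));
    private readonly IDbCommand _inner;

    public LoggingCommand(IDbCommand inner)
    {
        _inner = inner;
    }

    public int ExecuteNonQuery()
    {
        Log.Info("Executing: " + _inner.CommandText);
        return _inner.ExecuteNonQuery();
    }

    public IDataReader ExecuteReader()
    {
        Log.Info("Executing: " + _inner.CommandText);
        return _inner.ExecuteReader();
    }
}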
It is the mechanism by which the RDBMS can assure atomicity and consistency; see ACID.

Handling database queries that fail due to a server failover

In an environment with a SQL Server failover cluster or mirror, how do you prefer to handle errors? It seems like there are two options:
Fail the entire current client request, and let the user retry
Catch the error in your DAL, and retry there
Each approach has its pros and cons. Most shops I've worked with do #1, but many of them also don't follow strict transactional boundaries, and seem to me to be leaving themselves open for trouble in the event of failure. Even so, I'm having trouble talking them into #2, which should also result in a better user experience (one catch is the potentially long delay while the failover happens).
Any arguments one way or the other would be appreciated. If you use the second approach, do you have a standard wrapper that helps simplify implementation? Either way, how do you structure your code to avoid issues such as those related to the lack of idempotency in the command that failed?
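For illustration, a wrapper for option #2 might look like the following sketch; the attempt count and delay are placeholder values, and it assumes the wrapped operation is idempotent or fully transactional:

using System;
using System.Data.SqlClient;
using System.Threading;

public static class Retry
{
    // Run a DAL operation, retrying on SqlException a bounded number of times
    // so a failover window doesn't fail the whole client request.
    public static T Execute<T>(Func<T> operation, int maxAttempts = 3, int delayMs = 5000)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (SqlException) when (attempt < maxAttempts)
            {
                // Give the failover time to complete before trying again;
                // on the final attempt the exception propagates to the caller.
                Thread.Sleep(delayMs);
            }
        }
    }
}

// Usage (only safe if the operation can be repeated harmlessly):
// var customer = Retry.Execute(() => dal.GetCustomerById(id));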
Number 2 could become an infinite loop. What if it's network related, or the local PC needs rebooting, or whatever?
Number 1 is annoying to users, of course.
If you only allow access via a web site, then you'll never see the error anyway unless the failover happens mid-call. For us, this is unlikely and we have failed over without end users realising.
In real life you may not have a nice clean DAL on a web server. You may have an Excel sheet connecting (most financials do) or WinForms apps where the connection is kept open, so you only have the one option.
Failover should only take a few seconds anyway. If the DB recovery takes longer than that, you have bigger issues. And if it happens often enough that you have to think about handling it, well...
In summary, it will happen so rarely that you'll want to know about it, so number 1 would be better. IMHO.

Rich database frontend - how to correctly handle low quality networks?

I have very limited experience of database programming, and my applications that access databases have been simple ones :). Until now :(. I need to create a medium-size desktop application (it's called a rich client?) that will use a database on the network to share data between multiple users. Most probably I will use C# and MSSQL/MySQL/SQLite.
I have performed a few drive tests and discovered that on low-quality networks, database access is not so smooth. On one company's LAN there is a lot of data transferred over the network and the servers are under constant load, so it's a common situation that a simple INSERT or SELECT SQL query will take 1-2 minutes or even fail with a timeout / network error.
Are there any best practices for handling such situations? Of course I can split my app into a GUI thread and a DB thread so network problems don't lead to a frozen GUI. But what to do about the frequent network errors? Displaying them to the user too often would not be very good :(. I'm thinking about automatically creating a local copy of the database on each computer my app runs on: update the local database first, then synchronize it in the background, with simple retries on network errors. This would allow the app to function even if the network has serious lag or problems.
Any hints and buzzwords I can look into? Maybe there are best practices already out there that I don't know about :)
Sorry, this is probably not the answer you are looking for, but you mention that a simple insert/update could take 1-2 minutes or even fail with a timeout / network error.
To me this sounds like there may be another problem rather than the network itself. If you're working on a corporate network, there would have to be insane levels of traffic for this sort of behavior. I would do everything in your power to look at improving the network before proceeding. Can you post the result of a ping to the db box?
If you're going to architect your application around this type of network, it will significantly alter the end product and may even result in a poor-quality product for other clients.
Depending upon the nature of the application, maybe look at implementing an async persistence queue and caching data on startup, or even embedding a copy of the db in your application.
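A sketch of that async persistence queue idea (all names here are hypothetical): a background thread drains an in-memory queue and re-queues work that fails with a network error, so the GUI never blocks on the database:

using System;
using System.Collections.Concurrent;
using System.Data.SqlClient;
using System.Threading;

// Hypothetical write-behind queue: the GUI enqueues work and stays responsive;
// a background thread retries on network errors instead of surfacing each one.
public class PersistenceQueue
{
    private readonly BlockingCollection<Action> _pending = new BlockingCollection<Action>();

    public PersistenceQueue()
    {
        new Thread(ProcessLoop) { IsBackground = true }.Start();
    }

    public void Enqueue(Action dbWrite)
    {
        _pending.Add(dbWrite);
    }

    private void ProcessLoop()
    {
        foreach (var dbWrite in _pending.GetConsumingEnumerable())
        {
            try
            {
                dbWrite();
            }
            catch (SqlException)
            {
                Thread.Sleep(10000);   // placeholder back-off
                _pending.Add(dbWrite); // re-queue and try again later
            }
        }
    }
}

// Usage: queue.Enqueue(() => dal.InsertOrder(order));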
Even though async behaviour, queues, caching, and copying the database to each local instance will help mask the symptoms, the problem will still remain. If the network really is that bad, then I'd address it with their IT department or the project manager, and build some performance requirements for their side of things into the contract.
