I'm using the Django ORM to access a PostgreSQL database, and on rare occasions Django will throw a DatabaseError like django.db.utils.DatabaseError: invalid page header in block 299560 of relation base/83966/84778.
I've researched this, and it seems to be due to the database getting corrupted somehow. This is immensely frustrating, because I've always shut down the database cleanly when rebooting, and every check I can run on my disk drive says there's nothing wrong with the disk itself. Therefore, I can only conclude that PostgreSQL is not actually ACID compliant and is corrupting my data in rare instances.
The only fix I've been able to find is to drop and recreate my database. Obviously, this isn't really a fix, since I'm losing all my data. Is there any other way to resolve this, or should I switch to a more reliable database like MySQL?
I'm running PostgreSQL 8.4.8 on Ubuntu 10.04.
Most of the time when you see this, you either have bad memory or a bad drive. The difference between PostgreSQL and MySQL is that PostgreSQL sees it and complains, as it should, while MySQL often just keeps on going without stopping. I think the DB that stops when the machine corrupts the data store is the more reliable DB, because it lets you know right up front that there are issues with your system.
BTW, PostgreSQL can survive an emergency shutdown (pull the plug out the back of the machine) just fine as long as the hard drives aren't lying about fsync.
Try memtest86 to see if your memory's ok, and do something like
sudo dd if=/dev/sdc1 of=/dev/null
to see if you get any errors. Is there anything in your dmesg or message logs about drive read/write errors?
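If you want to pin down which table or index the error refers to before deciding anything, the last number in base/83966/84778 is the relation's relfilenode, and pg_class can map it back to a name. A minimal sketch, assuming the psycopg2 driver (the connection string is a placeholder):

import psycopg2

conn = psycopg2.connect("dbname=mydb")  # placeholder connection string
cur = conn.cursor()
# Map the relfilenode from the error message back to a table or index name.
cur.execute("SELECT relname, relkind FROM pg_class WHERE relfilenode = %s", (84778,))
print(cur.fetchall())
cur.close()
conn.close()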
I have set up Mirth on an Ubuntu server using a PostgreSQL database (on the same server). The problem is that Mirth messages take up all the storage after some hours and Mirth crashes. I want Mirth to be continuously running on my server.
I've enabled message pruning, but that only deletes the message data and does not free storage, although the 'Remove all messages' option in the Mirth launcher UI does free the storage. I've also tried to free the storage by truncating the tables; that works, but it causes errors after which no further messages can be received, and WAL segments still get allocated.
After you prune the messages and then vacuum the tables, space should be freed up for internal reuse, but will probably not be handed back to the OS (for use in different files, or to show up in df). You can use pg_freespacemap or maybe pgstattuple to check if space is available for internal reuse, or pg_stat_activity or pg_stat_progress_vacuum to see if vacuuming is currently underway. Vacuuming should happen automatically after a large fraction of a table is deleted, unless you have gone out of your way to prevent it from happening.
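If you want to drive this by hand, a rough sketch with psycopg2 (the table name is a placeholder for whichever Mirth table is bloated, and pgstattuple must be installed with CREATE EXTENSION pgstattuple first):

import psycopg2

conn = psycopg2.connect("dbname=mirthdb")  # placeholder connection string
conn.autocommit = True  # VACUUM cannot run inside a transaction block
cur = conn.cursor()
cur.execute("VACUUM VERBOSE d_message")  # placeholder table name
# See how much of the table is now free space available for internal reuse.
cur.execute("SELECT * FROM pgstattuple('d_message')")
print(cur.fetchone())
cur.close()
conn.close()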
I'm quite new to SQL Server and was wondering what the difference is between the SQL Server log and a custom log (in my case, using log4net)? I guess there's more choice about what to log using log4net, but what things are automatically logged by the database? For example, if a user signs up to my site, would I have to manually log that transaction, or would that be recorded in the database's log automatically? I'm currently starting a project and would like to figure out exactly what I should bother logging.
Thanks
Apples and Oranges.
Log4net and other custom 'logging' is just a way to capture events an application is reporting. 'Log' in this context refers to whatever store is used by this infrastructure to persist information about these events.
The database log, on the other hand, is something completely different. In order to maintain consistency and atomicity, databases use a so-called Write-Ahead Log (WAL) protocol. In WAL, all changes are first durably written into a journal, or log, before being applied to the data. This allows recovery to replay the log (the journal) and get the data back into a consistent state, rolling back any uncommitted work.
Database logs have absolutely nothing to do with your application code. Any database update will be automatically logged by the engine, simply because this is how any data is updated in a database. You cannot modify that, nor do you have any access to what's written in the log (strictly speaking you can look into the log, but you won't find any useful information for your application).
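To make the principle concrete, here is a toy sketch of the WAL idea in Python - not how any real engine implements it, just the ordering that matters: the change is made durable in the journal before the data file is touched, so a crash in between is recoverable by replaying the journal.

import json, os

def apply_update(data_path, journal_path, key, value):
    # 1. Append the intended change to the journal and force it to disk.
    with open(journal_path, "a") as journal:
        journal.write(json.dumps({"key": key, "value": value}) + "\n")
        journal.flush()
        os.fsync(journal.fileno())  # durable before the data is touched
    # 2. Only now apply the change to the actual data file.
    data = {}
    if os.path.exists(data_path):
        with open(data_path) as f:
            data = json.load(f)
    data[key] = value
    with open(data_path, "w") as f:
        json.dump(data, f)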
The SQL log handles transaction logging for rolling back or committing data. It is usually only dealt with by someone who knows what they are doing: restoring backups or shipping the logs to use for backups.
Log4net and other logging frameworks handle in-code logging of exceptions, warnings, or debug-level info that you would like to output for your own use. The output can be sent to a table in a database, a command window, a flat file, or a web service. Common logging scenarios are catching unhandled exceptions at the application level to help track down bugs, or writing out the stack trace in try/catch statements.
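log4net is the .NET side of this; for comparison, the same catch-and-log-the-stack-trace pattern in Python's standard logging module looks roughly like this (risky_operation is a placeholder for your own code):

import logging

logging.basicConfig(filename="app.log", level=logging.DEBUG)
log = logging.getLogger(__name__)

try:
    risky_operation()  # placeholder for your own code
except Exception:
    log.exception("risky_operation failed")  # logs the message plus the full stack trace
    raise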
It keeps track of the transactions so it can roll them back or replay them in case of a crash. Quite a bit more involved than simple logging.
The two are almost completely unrelated.
A database log is used to roll back transactions, recover from crashes, etc. All good things to ensure database consistency. It has updates/inserts/deletes in it - not really anything about intent or what your app is trying to do unless it directly affects data in the database.
The application log on the other hand (with Log4Net) can be extremely useful when building and debugging your application. It is driven by you and should contain information that traces what your app is doing. This is something that can safely be turned off or reduced (by toggling the log level) when you no longer need it.
The SQL Server log file is actually used for maintaining its own stability, but it's not terribly useful for normal developers. It's not what you might think (and what I thought), a list of SQL statements that have been run. It's a proprietary format designed to help SQL Server recover from a crash or roll back transactions.
If you need to track what's going on in the system, the SQL transaction log won't be helpful, and it would be very difficult to get that information back out. Instead, I would suggest adding triggers on your tables that write information off to another table, or adding some code in your data layer that saves off a log of what's going on. It could be as simple as wrapping the SQL command object with your own implementation that saves SQL statements off to log4net in addition to executing the normal code.
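The wrapper idea is language-agnostic; sketched against Python's DB-API instead of SqlCommand, just to show the shape of it:

import logging

log = logging.getLogger("sql")

class LoggingCursor:
    # Thin wrapper that logs every statement before delegating to the real cursor.
    def __init__(self, cursor):
        self._cursor = cursor

    def execute(self, sql, params=None):
        log.info("SQL: %s params=%r", sql, params)
        return self._cursor.execute(sql, params or ())

    def __getattr__(self, name):
        # Delegate everything else (fetchall, close, ...) to the real cursor.
        return getattr(self._cursor, name)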
It is the mechanism by which the RDBMS can assure atomicity and consistency; see ACID.
I've got an application in test that's logging events for a departmental Windows forms application to a SQL Server database. The application is nearly ready for production. The logging database is completely separate from the application database.
My question is, do I really need to create a production version of the logging database, or can I just log production and test events to the same database? The obvious answer is yes, of course I need a separate database. You never want test and production environments to mix. But in this case, the data being written to the database isn't really production data, exactly, it's just logging details that we use to troubleshoot problems. It has no business value, and nothing significant would be lost if the data were to be inadvertently dropped or the database were to be temporarily unavailable. And having it all in a test environment would make it much simpler for me to manage.
So on a pros and cons basis using a single logging database for all environments seems like a better solution. But it just doesn't feel right. Can anybody give me any specific reasons why this is a bad idea?
Your logging may not work if your dev server is down but prod isn't. You can guarantee that will happen just when something critical you need logged on prod happens. In our case, prod and dev are not in the same physical location, which would mean sending logging data across our network, causing pipeline bottlenecks and cranky network guys.
Plus what if you decide to change the logging process? While you are doing new development, the entire prod process might break.
And there will be times when someone might read the logs, panic at some error, and forget that it happened on dev. Or worse, someone might see a bad error that they thought was happening on dev but was really happening on prod.
I would say:
Use a standard methodology for logging (a single DLL or similar) and actually house the logging DB in production.
That way, your logging database can be considered a "Logging Server" and ALL apps (Dev, Staging, Test, and Prod) can log to it, since you are using a vetted library.
Of course, you have to still watch out that you don't flood the server...
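One way to wire that up, sketched with Python's standard library (logserver.example.com and port 9020 are placeholders; log4net's UDP/remoting appenders play the same role on .NET):

import logging, logging.handlers

# Point every environment at the same central collector.
# SocketHandler ships pickled LogRecords; a listener must run on the collector.
handler = logging.handlers.SocketHandler("logserver.example.com", 9020)
log = logging.getLogger("myapp")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.info("user signed up")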
I don't see anything wrong with keeping it on the dev box, unless your app will fail if it's unable to properly log, or unless the information being logged is more valuable than you indicate.
On the upside, keeping your logging db on your dev server will help take the load for handling this data off of the production server - a definite plus where performance is concerned.
Keeping logs in a database is a bad idea in the first place. What happens when you can't connect to the DB for some reason or another? I suggest you use log4net and implement RollingFileAppenders. They write log entries to a file, and when the file hits a certain limit, log4net starts writing to a new file. If you have questions getting set up, feel free to ask. I would be glad to help!
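For comparison, the Python standard library equivalent of a RollingFileAppender is RotatingFileHandler - a minimal sketch:

import logging
from logging.handlers import RotatingFileHandler

# Roll over to app.log.1, app.log.2, ... at ~5 MB; keep 10 old files.
handler = RotatingFileHandler("app.log", maxBytes=5 * 1024 * 1024, backupCount=10)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
log = logging.getLogger("myapp")
log.addHandler(handler)
log.warning("something worth keeping")  # still works when the DB is down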
Just started getting a bunch of errors in our C# .NET app that seemed to be happening for no reason. Things like System.IndexOutOfRangeException on a SqlDataReader object, for an index that should be returned and has been returning fine for a while now.
Anyway, I looked at the Task Manager and saw that sqlservr.exe was running at around 1,500,000 K Mem Usage. I am by no means a DBA, but that large a memory usage looked wrong to me on a Windows Server 2003 R2 Enterprise box with an Intel Xeon 3.33GHz and 4GB of RAM. So I restarted the SQL Server instance. After the restart, everything went back to normal. Errors suddenly stopped occurring. So does this large main-memory usage eventually cause errors?
Additionally, I did a quick Google for high memory usage in MSSQL. I found that, if left at its default settings, SQL Server can grow to be that large. I also found a link to MS about How to adjust memory usage by using configuration options in SQL Server.
The question now is: how much main memory should SQL Server be limited to?
I'd certainly be very surprised if it's the database itself; SQL Server is an extremely solid product - far better than anything in Office or Windows itself - and can generally be relied on absolutely and completely.
1.5GB is nothing for an RDBMS - they will all just keep filling up their available buffers with cached data. Reads from memory are typically 1000x or more faster than disk access, so using every scrap of memory available to it is optimal design. In fact, if you look at any RDBMS design theory, you'll see that the algorithms used to decide what to throw away from core are given considerable prominence, as this makes a major impact on performance.
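For a feel of what those eviction algorithms do, here is a toy LRU buffer pool in Python - real engines use more refined policies (clock sweep, LRU-K, and so on), but the principle is the same:

from collections import OrderedDict

class LRUBufferPool:
    # Toy buffer cache: evicts the least recently used page when full.
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def get(self, page_id, read_from_disk):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)     # mark as recently used
        else:
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)  # evict the coldest page
            self.pages[page_id] = read_from_disk(page_id)
        return self.pages[page_id]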
Most dedicated DB servers will be running with 4GB of memory (assuming 32-bit), with 90% dedicated to SQL Server, so you are certainly not looking at any sort of edge condition here.
Your most likely problem by far is a coding error or structural issue (such as locking).
I do have one caveat though. Very (very, very - like twice in 10 years) occasionally I have seen SQL Server return page tear errors due to corruption in its database files, both times caused by an underlying intermittent hardware failure. As luck would have it on both occasions these were in pages holding the indexes and by dropping the index, repairing the database, backing up and restoring to a new disk I was able to recover without falling back to backups. I am uncertain as to how a page tear error would feed through to the C# API, but conceivably if you have a disk error which only manifests itself after core is full (i.e. it's somewhere on some swap space) then an index out of bounds error does seem like the sort of manifestation I would expect as a call could be returning junk - hence falling outside an array range.
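If you suspect that kind of corruption, DBCC CHECKDB is the usual first check; a sketch driven from Python via the pyodbc driver (the DSN is a placeholder):

import pyodbc

conn = pyodbc.connect("DSN=mydb", autocommit=True)  # placeholder DSN
cur = conn.cursor()
try:
    # NO_INFOMSGS suppresses routine output so only problems surface.
    cur.execute("DBCC CHECKDB WITH NO_INFOMSGS")
    print("CHECKDB raised no errors")
except pyodbc.Error as exc:
    print("possible corruption:", exc)
finally:
    conn.close()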
There are a lot of different factors that can come into play as to what limit to set. Typically you want to limit it in a manner that will prevent it from using up too much of the ram on the system.
If the box is a dedicated SQL box, it isn't uncommon to set it to use 90% or so of the RAM on the box....
However, if it is a shared box that has other purposes, there might be other considerations.
how much main memory should MSSQL be limited to?
As much as you can give it, while ensuring that other system services can function properly. Yes, it's a vague answer, but on a dedicated DB box, MSSQL will be quite happy with 90% of the RAM or such. By design it will take as much RAM as it can.
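For reference, the cap is set with sp_configure; a sketch via the pyodbc driver (the DSN and the 3600 MB figure - roughly 90% of a 4 GB box - are placeholders, and the change needs sysadmin rights):

import pyodbc

conn = pyodbc.connect("DSN=mydb", autocommit=True)  # placeholder DSN
cur = conn.cursor()
# 'max server memory (MB)' is an advanced option, so expose it first.
cur.execute("EXEC sp_configure 'show advanced options', 1")
cur.execute("RECONFIGURE")
cur.execute("EXEC sp_configure 'max server memory (MB)', 3600")
cur.execute("RECONFIGURE")
conn.close()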
1.5GB of 4.0GB is hardly taxing... One of our servers typically runs at 1.6GB of 2.5GB with no problems. I think I'd be more concerned if it wasn't using that much.
I don't mean to sound harsh, but I wouldn't be so quick to blame SQL Server for application errors. From my experience, every time I've tried to pass the buck on to SQL Server, it's bitten me in the ass. It's usually sysadmins or rogue queries that have brought our server to its knees.
There were several times where the solution to a slow-running query was to restart the server instead of inspecting the query, which was almost always at fault. I know I personally rewrote about a dozen queries where the cost was well above 100.
This really sounds like a case of "'select' is broken" so I'm curious if you could find any improvements in your code.
SQL Server needs the RAM that it is taking. If it was using 1.5 gigs, it's using that for the data cache, procedure cache, etc. It's generally better left alone - if you set a cap too low, you'll end up hurting performance. If it's using 1.5 gigs on a 4-gig web box, I wouldn't call that abnormal at all.
Your errors could very likely have been caused by locking - I'd have a hard time saying that the SQL memory usage you described in the question was causing the errors you were getting.
I have a client application that connects to a MySQL 4 database server using the stock libraries on SuSE SLES 9. However, at times, when processing a particular result set from the server, iterating through the results does not allow me to process all the results that are in the database.
This issue happens sometimes, mostly when servers have had several days of uptime. I would suspect that a reboot solves the problem.
Is there any way that not releasing MySQL result sets over time gives rise to a memory leak that displays itself in this strange behavior? Must all result sets always be freed? However, the same table and the same program behave as they should on another computer.
Could corruption of the result set occur because of implementation issues in either the application or the mysql client library?
Anything is possible, however I'd be inclined to go with app-level issues by default. Any problem that smells like it could be memory related is a prime candidate for a heap corruption bug if you're coding in C/C++, and that could cause result set problems. Also, I'm curious about how long you're holding this result set open for -- is it possible that the rows that you're "missing" might have been inserted between the time the query ran and when you're retrieving the value from the result set?
Finally, releasing a result set on the server happens automatically when you close the database connection, so unless you're holding a single connection open for days, that is unlikely to be the problem, absent a bug in MySQL.
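For what it's worth, the pattern that avoids lingering server-side results, sketched with the Python MySQLdb driver (credentials and the query are placeholders; in the C API the equivalent cleanup call is mysql_free_result):

import MySQLdb

conn = MySQLdb.connect(host="dbhost", user="app", passwd="secret", db="mydb")
try:
    cur = conn.cursor()
    cur.execute("SELECT id, payload FROM events")  # placeholder query
    for row in cur.fetchall():  # consume the whole result set in one go
        print(row)
    cur.close()  # frees the result set explicitly
finally:
    conn.close()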
You may want to think about upgrading to MySQL 5.
It's usually good to have the latest version.