Oracle XE or Postgres? - database

Should I use Postgres or Oracle XE for a web application, and why?
They are both available for free and for use commercially.
Why should I use one database over the other?

Well, with Oracle XE, if you hit the limits, you have to either buy Oracle or migrate your entire application to a different database. With PostgreSQL, there are no limits, and the software is completely free and open-source.

A few things to be aware of if you choose Oracle XE:
It will only use one CPU if you have a multiprocessor server
It will only use up to a maximum of 1 gigabyte of memory
It has a database size limit of 4 gigabytes for user data
Only available for 32-bit Windows and 32-bit Linux
If those limitations aren't an issue for you and you like the Oracle approach, then give it a shot; otherwise consider an open-source server like Postgres or MySQL, which have none of the aforementioned limitations.
If you do choose XE and then later find your requirements have changed, the next version up of Oracle is 900 USD for 5 seats, plus an additional 180 USD per seat. As far as I know, this is in fact a bit cheaper than MS SQL Server.
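To make the upgrade price concrete, here is a rough sketch of the arithmetic. It assumes the quoted figures mean 900 USD covers the first 5 seats and each seat beyond that costs 180 USD; the function name and that reading are assumptions, and current prices will differ.

```python
def oracle_upgrade_cost(seats, base_price=900, base_seats=5, extra_seat_price=180):
    """Estimate the license cost in USD from the figures quoted in this answer.

    Assumption: the base price covers `base_seats` seats and every further
    seat costs `extra_seat_price`.
    """
    extra_seats = max(0, seats - base_seats)
    return base_price + extra_seats * extra_seat_price


# A 20-seat installation: 900 + 15 * 180
print(oracle_upgrade_cost(20))
```

For a 20-user shop that works out to 3600 USD under this reading, which is the number to weigh against the zero license cost of Postgres or MySQL.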
There are some good reasons to choose Oracle, particularly if you're a Java developer e.g. you can write stored procedures in Java, and I think there's native support for Java web services. Ultimately however you need to weigh up the cost with the requirements of your application. MySQL and Postgres will allow you to scale your application without any cost (other than hardware obviously).

You haven't provided much information, but unless you're already an Oracle expert, I see no reason to choose Oracle XE over PostgreSQL. PostgreSQL will always be free and is far more capable and more scalable.
And you can choose to run PostgreSQL on Windows, Mac OS X or Linux. I think Oracle XE is limited to Windows and Linux.

When choosing a db vendor, be careful to match the demands of your app against the strengths of the vendors' db product. And watch out for their weaknesses.
For example, if you know your writes will be as frequent as your reads and both will occur simultaneously, then you'll want to know how each vendor you consider handles concurrency. Vendors that rely on elaborate lock managers with complex lock escalation schemes are likely to bring you grief if you expect heavy load on your app. You'll spend more time trying to work around the DB's lock manager than actually solving your problems.
That's one example. Every DB has its strengths and weaknesses to consider. Do your research, find a site that compares vendors, and make a choice that balances your needs against that. If you can get an eval copy, all the better to run some proof-of-concept tests against. Write scripts that pummel the db in some way similar to what you expect your app to produce and go from there. While you're at it, get the query plans for the SQL in your scripts from each vendor and see what you can learn from that about how each vendor's optimizer works.
There's more that can be said, but hopefully you get the gist.
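A minimal sketch of such a "pummel the db" script, under stated assumptions: it uses Python's built-in sqlite3 module purely as a stand-in for whichever vendor you are evaluating, the `events` table and `run_mixed_workload` name are made up for illustration, and it runs on a single connection, so it measures raw read/write throughput rather than lock contention. A real test would swap in the vendor's driver and run several of these in parallel.

```python
import random
import sqlite3
import time


def run_mixed_workload(conn, operations=10_000, write_ratio=0.5, seed=42):
    """Issue a random mix of inserts and point lookups and time the run."""
    rng = random.Random(seed)
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    writes = reads = 0
    start = time.perf_counter()
    for _ in range(operations):
        if rng.random() < write_ratio:
            cur.execute(
                "INSERT INTO events (payload) VALUES (?)", (str(rng.random()),)
            )
            writes += 1
        else:
            # Look up a random row among those inserted so far.
            target = rng.randint(1, max(1, writes))
            cur.execute("SELECT payload FROM events WHERE id = ?", (target,)).fetchone()
            reads += 1
    conn.commit()
    elapsed = time.perf_counter() - start
    return {"writes": writes, "reads": reads, "seconds": elapsed}


conn = sqlite3.connect(":memory:")
stats = run_mixed_workload(conn, operations=2_000)
print(stats)
```

Adjust `write_ratio` to match the read/write balance you expect from your app, then compare the timings across candidates on the same hardware.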

What is your platform of choice? If it's Windows, I'd go with Oracle or MySQL as I have heard only bad things about running PostgreSQL on Windows machines.
Also, Oracle has a more GUI-oriented approach to configuration, so if you're not a Linux hacker, you may like it better.
Postgres, on the other hand, has a far superior SQL console, and its console tools in general are more developed and easier to use. Oracle has more tools, but the decent ones (like PL/SQL navigator) are not free and the free ones simply suck.
Remember that choosing a database is a long-term decision so consider the possible changes in requirements for your application as well. If you are not prepared to eventually spend money on Oracle, better go with Postgres -- safe option. And good enough for most projects.

Related

Migrating from SQL Server to firebird: pro and cons

I am considering the migration for 4 reasons:
1) SQL Server installation is a nightmare, especially for single-user software. (I typically have 3-20 users, but sometimes I sell my software to single users: it is incredible to have trouble installing the DB when installing the application means copying an exe.) (Note: my largest installation is 100 users, but there is no upper limit.) The software installs in 10 seconds, SQL Server in 1 hour. Firebird installation is much easier.
2) SQL Server runs on Windows Server only
3) My customers all have the Express edition
4) I am not using any advanced features. I am now starting to use FILESTREAM, but the main reason for this is that the Express edition has a 4/10 GB database size limit
So these are all Pros of moving to Firebird.
What are the cons?
I could also plan to support both platforms, but I fear this will backfire.
MSSQL Server is faster and better optimized for large databases and complex queries, especially if administered properly, while Firebird allows you to run without any administration and just forget about it. Although this penalty affects only a very small percentage of the people using it, before a complete migration I suggest you first migrate just the data and then test the speed of your most complex query on both systems. If the speed satisfies you, then you are good to go.
I don't see any, besides the need to thoroughly test all of your existing code for compatibility issues.
Firebird is wonderful for server installations or single user installations.
It has an embedded version that is suitable for single user scenarios and you do not have to install anything.
It uses the same database file for both the server and the embedded database, so you can easily go from single-user to multi-user and vice versa.
I have embedded Firebird 2.5 in my freeware software. It's great, and there have never been connection problems. I used multiple processes to run long insert and read operations simultaneously, and it all went as expected. I am waiting for Firebird 3.0. I recommend Firebird when you don't want to rely on commercial database software.
If there is only one user you can use Sqlite which is even easier to manage than Firebird.

What database to use for big data storage and manipulation?

I have to decide which database server to use for my next project, but the simple decision to use MySQL, as in almost all the projects I have done, is harder now because I expect a very large number of records.
The database will store a user list, some other irrelevant tables, and finally some user-collected data. Say I have 6000 users responding to a quiz about each other. Simple math shows that if each user completes the quiz about everyone else (and in my project it is 99% sure that will happen), I'll end up with 35.99 million records (users exclude themselves, so in this particular situation the count is 6000*5999). Unfortunately 6000 may be a small number, as the real one is growing day by day.
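The record count in this question grows with the square of the number of users, which is worth seeing in numbers. A small sketch (the function name is made up for illustration):

```python
def quiz_records(users):
    """Rows produced when every user answers about every other user."""
    return users * (users - 1)


for n in (6000, 12000, 24000):
    print(n, quiz_records(n))
```

Doubling the user base roughly quadruples the row count, so the growth rate of the user list matters more than today's 6000.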
What to choose? MySQL and maybe if things go well and the project grows to expand it in a cluster? PostgreSQL, MSSQL? Oracle?
I've read about all of them, and each one has its pros and cons, but I still don't know what to choose. The advantage of MySQL and PostgreSQL is, of course, the starting price of $0, which is pretty nice in a typical self-funded startup.
Any opinions, pieces of advice? If you encountered this situation in your experience as developers, I'd love to hear from you.
These days, free isn't something that differentiates between databases any more. Both Oracle and SQL Server have free versions, but they limit resources: a 4 GB database, RAM, and single-CPU utilization. Millions of records is not a concern in itself; what matters is which datatypes you're using.
I saw the OP's comment about not liking MS software. That's your prerogative, but the free versions of either Oracle or SQL Server do benefit from a seamless transition to the larger editions of the respective database.
Personally, my choice would be either Oracle or SQL Server because of what are, IMHO, real feature considerations: hierarchical query support, subquery factoring/CTEs, packages (long before I get concerned with functions/procedures), full-text searching, XML support, etc.
MySQL will handle 35 million records no problem. Worry about scalability when you get there. You can easily add RAID hard disks backing your database tables, and if you really start getting big you can get a Compellent SAN that will scream. Don't worry about the DB engine as much as the underlying hardware. MySQL rocks for us with millions of records.
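To illustrate why datatypes matter more than row count against a 4 GB cap, here is a back-of-the-envelope sketch. The two row layouts and their byte sizes are hypothetical, the cap is taken as 4 GiB, and the estimate ignores index, page, and per-row overhead, so treat the result as a lower bound only.

```python
ROWS = 6000 * 5999              # row count from the question
LIMIT_BYTES = 4 * 1024 ** 3     # assumed reading of the 4 GB cap

# Hypothetical layouts: two 4-byte user ids plus the answer payload.
COMPACT_ROW = 4 + 4 + 1         # answer stored as a 1-byte score
WIDE_ROW = 4 + 4 + 200          # answer stored as ~200 bytes of text


def raw_size_bytes(rows, bytes_per_row):
    """Raw data size, ignoring all storage overhead."""
    return rows * bytes_per_row


def fits(rows, bytes_per_row, limit=LIMIT_BYTES):
    return raw_size_bytes(rows, bytes_per_row) <= limit


print("compact:", raw_size_bytes(ROWS, COMPACT_ROW), fits(ROWS, COMPACT_ROW))
print("wide:   ", raw_size_bytes(ROWS, WIDE_ROW), fits(ROWS, WIDE_ROW))
```

With the compact layout the same 36 million rows sit well under the cap, while the wide layout exceeds it before any overhead is counted.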
I've had no problems handling tables as large as 36,000,000 rows on MySQL and Oracle.
Just be sure that you index the proper columns, run EXPLAINs for your queries, and maintain proper design principles.
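As a rough illustration of checking a query plan before and after adding an index, the sketch below uses Python's built-in sqlite3 module as a stand-in. SQLite spells the command `EXPLAIN QUERY PLAN`, whereas MySQL uses `EXPLAIN`; the `answers` table, its columns, and the index name are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answers ("
    "id INTEGER PRIMARY KEY, rater_id INTEGER, subject_id INTEGER, score INTEGER)"
)
# Every user rates every other user, as in the question (100 users here).
conn.executemany(
    "INSERT INTO answers (rater_id, subject_id, score) VALUES (?, ?, ?)",
    [(r, s, (r + s) % 5) for r in range(100) for s in range(100) if r != s],
)


def plan(sql, params=()):
    """Return the human-readable detail column of the query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


query = "SELECT * FROM answers WHERE subject_id = ?"
before = plan(query, (7,))
conn.execute("CREATE INDEX idx_answers_subject ON answers (subject_id)")
after = plan(query, (7,))

print("before:", before)
print("after: ", after)
```

The plan text differs between SQLite versions, but the first run should report a full table scan and the second should mention `idx_answers_subject`.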
Most of the truly large scale web properties use a distributed key-value store. That said, 35 million is large, but not that large. With most modern databases, your main two scaling worries should be throughput and what happens when no single box can contain your entire database anymore. And both of these problems can be solved to some degree for any database you choose to use. (Caching, replication, sharding, etc.)
Use MySQL until you can't anymore. At that point, you ought to be rolling in dough anyways and you now have a very desirable problem.
Use MySQL as it's free and you have experience with it.
Besides, in my opinion how you design the tables matters more than which database you use.
35 million records can be easily handled by MS SQL Server (assuming proper database design, indices, etc.). You can start with the free SQL Server Express edition and later, if you need, you can upgrade to the full version which supports clustering, etc.
SQL Server Express does have some limitations - single CPU, 1 GB memory, max 4 GB database size and a few other things. I'm not sure how quickly these limitations will become a problem but you can always move to the full version when you run into them.
MySQL(i) & PostgreSQL
$0 cost
large community
many tutorials
well documented
MSSQL
You can get "money" from MS if you promote that you are using MSSQL (secret information from some companies I worked for)
MS tools work very well
Complete tool set from C# IDE over .NET lib to Windows Server 2003
Oracle
Professional and commercial provider
Used by many large companies (I also heard about Blizzard (World of Warcraft) using Oracle)
- expensive
The final decision depends on the very special requirements of your project.
Make yourself a quick list of the things that ARE IMPORTANT for your project (e.g. fast queries) and look up which database's strengths best match your requirements.
Everything is about design. SQL databases are like cars: you just have to know which component goes where.
Make a clear design and you won't struggle with any of them.
Maybe you can test Firebird
Blog post about big Firebird database here
MySQL licence is here (not always free).
Postgresql and Firebird are free.
First of all, don't think about performance. Premature optimization being the root of all evil and all that. You can always throw more hardware and/or tuning at it later.
All of the mentioned should perform nicely if tuned/maintained correctly. I'd focus on manageability and familiarity. IMHO open source databases excel at manageability (perhaps not the best GUIs, but the CLI has been my home for a long long time).
And if the database becomes the bottleneck, why limit yourself to those choices? How about a key-value distributed database? Or perhaps serialize data directly to disk? Storing data outside of a RDBMS, while often frowned upon, might be the correct path. Or simply use the common route of denormalization.
Always remember not to optimize prematurely.
As far as opinions go (since you specifically asked for it) I favor open source databases, specifically PostgreSQL. It's rock solid, fast and very well-featured. And even with (relatively) large datasets it has performed superbly on mediocre hardware (some tuning involved, of course, but you can't skip that step no matter which db you end up choosing).

When is it time to change database backends?

Is there a general rule of thumb to follow when storing web application data to know what database backend should be used? Are the number of hits per day, the number of rows of data, or other metrics something I should consider when choosing?
My initial idea is that the order for this would look something like the following (but not necessarily, which is why I'm asking the question).
Flat Files
BDB
SQLite
MySQL
PostgreSQL
SQL Server
Oracle
It's not quite that easy. The only general rule of thumb is that you should look for another solution when the current one can't keep up anymore. That could include using different software (not necessarily in any globally fixed order), hardware or architecture.
You will probably get a lot more benefit out of caching data using something like memcached than switching to another random storage backend.
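A minimal sketch of the caching idea, assuming a cache-aside pattern: a plain dict stands in for memcached and sqlite3 stands in for the real backend. The `CachedUserStore` class, the `users` table, and the `db_hits` counter are all hypothetical names used only for illustration.

```python
import sqlite3


class CachedUserStore:
    """Cache-aside lookups: check the cache first, fall back to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}    # stand-in for a memcached client
        self.db_hits = 0   # counts how often the database is actually queried

    def get_name(self, user_id):
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_hits += 1
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        name = row[0] if row else None
        self.cache[user_id] = name
        return name

    def rename(self, user_id, name):
        self.conn.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
        self.conn.commit()
        # Invalidate so the next read fetches the fresh value.
        self.cache.pop(user_id, None)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.commit()

store = CachedUserStore(conn)
for _ in range(3):
    store.get_name(1)
print(store.db_hits)
```

Three reads cost one database query here; the trade-off is that every write path has to remember to invalidate or update the cached entry.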
If you think you are going to ever need one of the heavyweights (SqlServer, Oracle), you should start with one of those at the beginning. Data migrations are extremely difficult. In the long run it will cost you less to just start at the top and stay there.
I think you're being overly specific in your rankings. You can pretty much start with flat files and the like for very small data sets, go up to something like DBM for slightly bigger ones that don't require SQL-like syntax, and go to some kind of SQL database after that.
But who wants to do all that rewriting? If the application will benefit from access to joins, stored procedures, triggers, foreign key validation, and the like--just use a SQL database regardless of the dataset size.
Which one should depend more on the client's existing installations and what DBA skills are available than on the amount of data you're holding.
In other words, the size of your database is far from the only consideration, and maybe not the most important one.
There is no blanket answer to this, but ALMOST always, using flat files is not a good idea. You have to parse through them (I suppose) and they do not scale well. Starting with a proper database, like Oracle or SQL Server (or MySQL or Postgres if you are looking for free options), is a good idea. For very little overhead, you will save yourself a lot of effort and headache later on. They also allow you to structure your data in a non-stupid fashion, leaving you free to think of WHAT you will do with the data rather than HOW you will be getting it in/out.
It really depends on your data, and how you intend to use it. At one of my previous positions, we used Postgres due to the native geo-location and timezone extensions which existed because it allowed us to manage our data using polygonal datatypes. For us, we needed to do that, and we also wanted to use stored procedures, views and the like.
Now, another place I worked at used MySQL simply because the data was normalized, standard row by row data.
SQL Server, for a long time, had a 4 GB database limit (see SQL Server 2000), but despite that limitation it remains a very stable platform for small to medium applications for which the old data is purged.
Now, from working with Oracle and SQL Server 05/08, all I can tell you is that if you want the cream of the crop for stability, scalability and flexibility, then these two are your best bet. For enterprise applications, I strongly recommend them (merely because that's what we use where I work now).
Other things to consider:
Language integration (ASP.NET session storage, role management, etc.)
Query types (Select, Update, Delete) [although this is more of a schema design issue than a DBMS issue]
Data storage requirements
Your application's utilization of the database is the most critical factor. Mainly, which queries are used most often (SELECT, INSERT or UPDATE)?
For example, SQLite is geared toward smaller applications, but for a "web" application you might want a bigger one like MySQL or SQL Server.
The way you write scripts and your web application platforms also matters. If you're developing on a Microsoft platform, then SQL Server is a better alternative.
Typically, I go with what is commonly accepted by whichever framework I am using. So, if I'm doing .NET => SQL Server, Python (via Django or Pylons) => MySQL or SQLite.
I almost never use flat files though.
There is more to choosing an RDBMS solution than just "back end horsepower". The ability to have commitment control, for example, so you can roll back a failed transaction, is one reason.
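As an illustration of commitment control, here is a sketch using Python's built-in sqlite3 module as a stand-in for any transactional engine. The `accounts` table and `transfer` function are made up for the example: the second statement of a transfer violates a CHECK constraint, and the whole transaction is rolled back so the first statement's change does not stick.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money atomically; returns False if the transfer was rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


failed = transfer(conn, "bob", "alice", 80)   # bob only has 50
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(failed, balances)
```

With flat files you would have to build this undo logic yourself, which is the point the answer is making.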
Unless you are in the megatransaction rate application, most database engines would be adequate - so it becomes a question of how much you want to pay for the software, whether it runs on the hardware and operating system environment you want, and what expertise you have in managing that software.
That progression sounds painful. If you're going to include MS products (especially the for-pay SQL Server) in there anywhere, you may as well use the whole stack, since you only have to pay for the last of these:
SQL Server Compact -> SQL Server Express -> SQL Server Enterprise (clustered).
If you target your app at SQL Server Compact initially, all your SQL code is guaranteed to scale up to the next version without modification. If you get bigger than SQL Server Enterprise, then congratulations. That's what they call a good problem to have.
Also: go back and check the SO podcasts. I believe they talked about this briefly.
This question depends on your situation really.
If you have control over the server you're deploying to and you can install whatever services you need, then the time to install a MySql or MSSQL Express server and code against an existing database framework VERSUS coding against flat file structure is not worth the effort of considering.
What about FireBird? Where would that fit into that list?
And let's not forget the requirements that the "customer" of your solution must also have in place. If you're writing a commercial application for small companies, then Oracle might not be a good choice. But if you're writing a customized solution for a large enterprise that must share data among multiple campuses and has a good-sized IT department, then the decision of Oracle vs SQL Server would come down to what the customer most likely already has deployed.
Data migration nowadays isn't that bad since we have those great tools from Embarcadero, so I would instead let the customer's needs drive the decision.
If you have the option, SQL Server is a good choice from the word go, predominantly because you have access to solid procedures and functions and the database backup facilities are totally reliable. Wrapping up as much of your logic as you can inside the database itself (rather than in whatever language you are using) helps security and performance. Indeed, there's a good argument to be made for always using procedures for insert/update logic, as parameterized procedure calls protect you against injection attacks.
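The injection protection comes from passing values as parameters instead of splicing them into the SQL text, and that holds for parameterized queries as well as procedure calls. A minimal sketch using Python's built-in sqlite3 module as a stand-in (SQLite has no stored procedures; the `users` table and both function names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_unsafe(name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    sql = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(sql)]


def find_safe(name):
    # The input is bound as a value and is never parsed as SQL.
    return [
        row[0]
        for row in conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    ]


malicious = "nobody' OR '1'='1"
print(sorted(find_unsafe(malicious)))
print(find_safe(malicious))
```

Note that a stored procedure which builds dynamic SQL by concatenating its arguments is just as exposed as `find_unsafe`, so the procedure alone is not the safeguard.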
If I have the choice the only time I'd consider MySQL in preference is with a large, fairly simple, database predominantly used for read access. This isn't to decry MySQL which has improved markedly of late and I happily use if I don't have the choice, but for more complex systems with update/insert activity MSSQL is generally the superior option.
I think your list is subjective but I will play your game.
Flat Files
BDB
SQLite
MySQL
PostgreSQL
SQL Server
Oracle
Teradata

What are the advantages of VistaDB

I have seen references to VistaDB over the years, and with tools like SQLite, Firebird, MS SQL et al. I have never had a reason to consider it.
What are the benefits of paying for VistaDB vs using another technology? Things I have thought of:
1. Compact Framework Support. SQLite+MSSQL support the CF.
2. Need migration path to a 'more robust' system. Firebird+MSSQL.
3. Need more advanced features such as triggers. Firebird+MSSQL
The VistaDB client runtime is free. The runtime will never "expire at 3am" as you put it. Only the developer tools are licensed in that manner. You need 1 license per developer, simple. We even offer a really inexpensive Lite version with no Visual Studio tools.
Some other benefits
100% managed code - there are no interop or other unmanaged calls in the engine. This is a big deal to some, and others couldn't care less.
No registry access required - Most other in proc databases require registry access to look for parent controls, or permissions. VistaDB only does what you tell it to do, and will even run in Medium Trust.
XCopy deployment for runtime and your database (single file). You can xcopy your application, the runtime, and your database and run. Nothing to install or configure on the machine, no special privileges needed (we can run in Medium Trust or higher).
Isolated storage - You can put your entire database into Isolated Storage and run it from there directly. This makes it very easy to build secure click once applications that write databases in a domain friendly way for corporate environments. There is no need to store the user data on a shared drive or worry about permission mapping.
CLR Triggers / CLR Procs - You can write CLR Code and use them as Triggers or Stored Procs. We have just recently introduced changes to make it even easier to maintain a single CLR Assembly that can run in both VistaDB and SQL Server 2005/2008.
T-SQL Procs - VistaDB T-SQL Procs are compatible with SQL Server 2005/2008. Any procedure that works in our engine will run in SQL Server. That does not mean anything that runs there will port to us. We are a subset of the functionality in SQL Server. But we are also the only way to run T-SQL Procs without SQL Server (SQL CE can't do it).
I personally think one of the biggest features is the ability to upsize to SQL Server later. All of the VistaDB types, syntax, and CLR Procs, T-SQL procs, etc all will run on SQL Server. (You can't take everything from SQL Server down to VistaDB though, it is a subset)
32/64 bit Deployment - VistaDB is a single assembly deployment that runs both 32 and 64 bit without changes. SQL CE requires two different runtimes depending upon the OS, and cannot run under IIS at all. Access has no 64 bit runtime, and the most recent 32 bit runtime can only be deployed through MSI. The 32 bit version of Windows has the runtime, the 64 bit version does not.
Relational Integrity - VistaDB also actually enforces your constraints and foreign keys. You can specify cascading update and delete operations. The person who commented that we are like SQLite is wrong in this regard. They parse constraints, but do not enforce them.
EDIT: They do have support for FK's now in SQLite. But they are not compiled in by default, and do not use the same syntax as SQL Server.
Medium Trust - The ability to run on a medium trust web server is another feature that many will not care about, but it is a big deal. Many third party controls can't even run in Medium Trust. We can run the complete engine within Medium Trust because of our commitment to 100% managed code and least permission required.
- Full disclosure - I am the owner of VistaDB so I may be biased. :)
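On the SQLite foreign key point in the answer above: in current SQLite builds, enforcement is normally a per-connection setting that is off unless you turn it on with `PRAGMA foreign_keys = ON`. The sketch below shows the difference with Python's built-in sqlite3 module; the `parent` and `child` tables are made up, and it assumes a standard build with foreign key support compiled in.

```python
import sqlite3


def try_orphan_insert(enforce):
    """Insert a child row that points at a non-existent parent."""
    conn = sqlite3.connect(":memory:")
    # Set the mode explicitly so the result does not depend on build defaults.
    conn.execute("PRAGMA foreign_keys = " + ("ON" if enforce else "OFF"))
    conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY, "
        "parent_id INTEGER REFERENCES parent(id) ON DELETE CASCADE)"
    )
    try:
        conn.execute("INSERT INTO child (parent_id) VALUES (999)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


print(try_orphan_insert(enforce=False))
print(try_orphan_insert(enforce=True))
```

The constraint is parsed in both cases, but the orphan row is only refused when enforcement is switched on for that connection.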
Well, the main thing is that it is pure managed code - for what that is worth; it works not only on your typical Windows machines running .NET, but works wherever you run the Compact Framework and even works on Mono. Here are some noteworthy bullet points from their homepage:
Small < 1 MB footprint truly embedded ZeroClick
Microsoft SQL Server 2005 compatible data types and T-SQL syntax
None of the SQL CE limits
Single user, multi user local or using shared network.
Partially trusted shared hosting is no problem.
Royalty-free distribution - single CPU deployment of SQL Server costs more than a site license of VistaDB!
One thing worth noting is that Rob Howard's company, Telligent, uses it as the default database for their new CMS software, "Graffiti."
I have played with it here and there but have yet to build anything against it.
For me the most interesting feature of VistaDB is that it can run in a Medium Trust environment, which makes it a perfect solution for creating small to medium .NET websites that can be deployed to a server by copying and pasting (x-copy deployment).
And almost all Windows shared hosting providers (like GoDaddy) won't let you run your websites in Full Trust mode. They also won't install any third-party binaries into the GAC for you, like System.Data.SQLite.dll if you wish to use SQLite, for example.
I hadn't seen VistaDB before, it does look pretty cool.
Update: Received a comment from someone from VistaDB - their update model is only for getting new versions. Your old ones won't stop working if your license expires, which is good to know.
Keeping the original post here as IMHO the warning about expiring software licenses is still worth thinking about, even though VistaDB itself is fine.
It definitely seems 'more featureful' than SQLite, but I don't see anything there to justify the cost. The site seems to indicate that you can buy one license for $279, but it implies this is just a 1 year subscription. Would you have to then pay another $279 next year to stop your site falling over?
If so, remember to factor into the 'cost' how much inconvenience it's going to be when you get a call at 3am (murphy's law, it's always 3am) from your panicking customers because their VistaDB license has expired :-(
I've had this experience personally with some expiring software, and it's never good. You can send your customers emails and messages and flash their entire screen blinking red saying "YOU NEED TO GET A NEW LICENSE BEFORE NEXT WEEK" and they'll still never do it, and you'll still get the pain at 3am when it does expire.

Oracle XE or SQL Server Express

I'm starting a new project here (Windows Forms). What's the best option today for a small (free as in beer) DBMS?
I've used SQL Server Express on past projects, but time and time again I hear people saying that the product from Oracle is faster and more powerful.
It will be used in a small company (around 20 users) and will not reach the 4 GB limit any time soon :)
I don't want to start a flame war on my first post, so please point me to some link showing a good (and actual) comparison between the 2 products, if possible.
PS: I've heard about IBM DB2 Express too, but I couldn't find any information about it. (Marketing material from IBM doesn't count :) )
I would go for the SQL Server Express solution, unless you absolutely have to use a feature in Oracle that SQL Server does not have and you have no usable workaround.
Example of Oracle's strengths:
Analytical Functions in Oracle ROCK!
PL/SQL is better than T-SQL.
If you're going to scale up the system to 1,000's of users all updating the same small dataset
You scale up to multi-TB databases
You need to scale to large numbers of CPUs in your server (over 8)
You need instant failover (RAC)
You really cannot afford to lose a transaction
Maybe you can tell, I'm a big Oracle fan! But I think that Oracle Express is a commercial reaction to SQL Server Express and I don't think Oracle really deep deep down likes it.
You know with SQL Server that there is an upgrade path (SQL Server 2008 is soon) plus service packs.
SQL Express is also more "install and forget" than Oracle.
and it will integrate better with your IDE (if you're using .NET)
In terms of speed, both are going to be lightning quick with such a small dataset size.
It would be hard to argue either way given the needs you outlined, that either would shine over the other.
What I will say is this:
You say you are already familiar with SSExpress, so that is a good reason to stick with it
IMHO the tools with SSExpress are superior and easier to use than the Oracle equivalent
That said, I have much more experience with SS than Oracle so YMMV.
Sorry, no link, but one piece of advice. Because we support Oracle and SQL Server, I know that getting fixes for the 'normal' Oracle database is not something I would call fun. You have to pay for it, and if you have no tool which updates your Oracle system for you, it's a pain in the a.., if you ask me. Check out how Oracle XE is supported with updates/fixes. I don't know; I only use the 'normal' Oracle (Developer) database.
I think it's great to rethink things every once in a while and that it's very smart to consider alternative products when you are at a junction to do so.
If you are comfortable optimizing systems and have DBA-level skills, I'd consider PostgreSQL. I do not consider myself a DBA, have middling database skills, and find SQL Server Express extremely easy to use. Also, I've had products exceed the limits of SQL Server Express, and the transition to SQL Server Standard/Enterprise is seamless.
I realize that this doesn't matter at a technical level, but Larry Ellison buys jets and prostitutes with his profit. Bill Gates is solving problems of immense importance to humanity with his. All things being equal, I always prefer to give my money to Bill Gates.
Is this any use:
https://web.archive.org/web/1/http://downloads.techrepublic%2ecom%2ecom/5138-9592-6028761.html
NB Registration is required
Both of KiwiBastard's points are very good and I completely agree with him.
If you really want a free alternative that is similar to MS SQL and supports growth should you need it, you could have a look at MySQL or PostgreSQL. SQLite also seems a good choice.
Surely you can afford an old Linux server if you work in a company with 20 employees.
100% SQL Express, easier to install and maintain than Oracle.
IMHO the major problem with SQL Server has, for a long time, been the lack of multi-version read consistency. Fortunately this has been corrected since SQL Server 2005 with the snapshot isolation level.
If you're looking for a good RDBMS for a small project requiring minimal knowledge for maintenance, SQL Server Express Edition is a good pick. The SQL Server Express Edition UI is much easier to understand than RMAN or the "easier"-to-use backup scripts included with Oracle Database XE, which require taking your database offline.
Oracle Database XE is on my *** list. They recently released an ODBC driver for Linux that wasn't compiled properly (ld returns missing symbols for required ODBC functions) to be at all usable (10.2.0.4). With this kind of lack of attention to any reasonable amount of QA even for a 'free' product I would think twice about going down that road.
For DB2 Express-C see:
"DB2 Express-C™ is the free version of one of the most advanced
database management systems in the world. Why pay when you can have
all you need for free? DB2 Express-C is free to develop, deploy and
distribute.
It is a fast, secure, reliable, and amazingly scalable dataserver,
ideal for most startups and small/medium sized businesses. DB2
Express-C 9.7 is available on Linux, Unix, Windows, and now Mac OS X
as well! It also enables developers to easily handle XML through the
native storage technology called pureXML™. Whether you develop in
Java, .Net, Ruby, Python, Perl or pretty much any other programming
language out there, DB2 can be your technological advantage."
