I work with a platform that have a messaging system that uses SQL server tables as queues.
That system was based on this: Using Tables as queues
ATM we are facing some scalability issues, since this distributive schema is mainly based on SQL Locks and disk operations in order to guarantee the durability/coherence of data.
In order to solve the disk based I/O bottleneck and to improve the bad distributive logic, we are thinking in changing disk based SQL tables to In memory SQL (Hekaton) available at SQL 2014 & 2016.
I've read some stuff about Hekaton already, but I'm still not sure if this is the best approach, or if is possible to implement those queues into In memory and if this is the best approach.
Most of those queues are implementing pessimistic concurrency, and Hekaton uses no locks system only optimistic concurrency (based on multi-versioning). Is it "always" (I know this is a bad word) possible to change the pessimistic concurrency into an optimistic one? For example on the above queues.
Is Hekaton made for many inserts/deletes (enqueue/dequeue), order rows (FIFO queues), and a lot of variations of table sizes (workload variations on the server will increase/decrease the queues size)? It will be possible to update properly the Statistics for the query performance of native store procedures?
I feel like native compiled SQL store procedures will improve a lot the performance, but I'm not sure if this kind of implementation (correlated FIFO queues) are good to be used on Hekaton, since I'm not finding any examples of "In memory queues" implementations using Hekaton.
You can implement in Hekaton what you described - as you mentioned, the app will have to take care of retrying in case of the transaction being aborted due to concurrency on the same row. Having said that, you also have to consider that SQL 2014 does not support large binary objects, you will need to use SQL 2016 or workaround it as we do it for ASP.NET Session state:
http://blogs.msdn.com/b/kenkilty/archive/2014/07/03/asp-net-session-state-using-sql-sever-in-memory.aspx
Hekaton is designed for OLTP, that means, lots of inserts, updates, deletes.
Plan ahead the memory requirements:
https://msdn.microsoft.com/en-us/library/dn133186.aspx
I'm designing a system where questions like "what was the sum of all records matching certain criteria, at a certain time" are important.
Multi-value concurrency control (MVCC) seems like the way to go here, since it keeps an audit trail forever.
However, it would be nice if the data storage could handle this for me, rather than cobbling it together out of other database features. Now, Oracle and CouchDB other database engines use MVCC, but only behind the scenes. They use it to resolve conflicts or to decide what to do when a long-running query encounters recently updated data. But I don't know of any systems that allow you to explicitly say "in the system state of 17:00 January 20 2009, what does this query return". So are there such systems out there? Ideally, open source?
Take a look at flashback queries with Oracle.
As far as I know, MVCC doesn't guarantee the duration of earlier versions of rows. Its target is concurrency control for transactions; when all transactions for row 'x' have been committed, there's no real need for MVCC to keep earlier versions of that row.
You're thinking of Temporal Databases. Snodgrass's old book Developing Time-Oriented Database Applications in SQL is worth skimming, especially now that it's available as a PDF. If I had to do a temporal database from scratch again, I'd also read Temporal Data & the Relational Model, and anything else dealing with 6NF. (Where '6NF' <> 'DKNF'.)
Has open source ever created a single file database that has better performance when handling large sets of sql queries that aren't delivered in formal SQL transaction sets? I work with a .NET server that does some heavy replication of thousands of rows of data from another server and it does so it a 1-by-1 fashion without formal SQL transactions. So, therefore I cannot use SQLite or FirebirdDB or JavaDB because they all don't automatically batch the transactions and therefore the performance is dismal. Each insert waits for the success of the previous one, etc. So, I am forced to use a heavier database like SQLServer, MySQL, Postgres, or Oracle.
Does anyone know of a flat file database (that has a JDBC connect driver) that would support auto batching transactions and solve my problem?
The main think I dont like about the heavier databases is the lack of the ability to see inside the database with a one-mouse-click operation, like you can with SQLLite.
I tried creating a SQLite database and
then set PRAGMA read_uncommitted=TRUE;
and it didn't result in any
performance improvement.
I think that Firebird can work for this.
Firebird have good dotnet provider and many solution for replication
May be you can read this article for Firebird transaction
Try hypersonic DB - http://hsqldb.org/doc/guide/ch02.html#N104FC
If you want your transactions to be durable (i.e. survive a power failure) then the database will HAVE to write to the disc after each transaction (this is usually a log of some sort).
If your transactions are very small this will result in a huge number of writes, and very poor performance even on your battery backed raid controller or SSD, but worse performance on consumer-grade hardware.
The only way of avoiding this is to somehow disable the flush at txn commit (which of course breaks durability). I have no idea which ones support this, but it should be easy to find out.
Given a .NET environment with Windows CE, can you persist thousands of records per second in a local database (SQL Server 2008 - standard or CE).
What are the performance issues with persisting realtime instrument data in a database versus a log file?
SQL Server 2008 standard is more than capable of those insertion rates PROVIDED you have hardware capable of supporting it.
The question you really need to be asking is do I require the ability to search the captured data quickly?
This SO answer might be of interest: What does database query and insert speed depend on?
The number (and width) of indexes on a table will obviously have an impact on insertion rate.
If you are considering open-source, then MySQL is often cited as being able to handle high volumes.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
MS SQL Server and Oracle, which one is better in terms of scalability?
For example, if the data size reach 500 TB etc.
Both Oracle and SQL Server are shared-disk databases so they are constrained by disk bandwidth for queries that table scan over large volumes of data. Products such as Teradata, Netezza or DB/2 Parallel Edition are 'shared nothing' architectures where the database stores horizontal partitions on the individual nodes. This type of architecture gives the best parallel query performance as the local disks on each node are not constrained through a central bottleneck on a SAN.
Shared disk systems (such as Oracle Real Application Clusters or Clustered SQL Server installations still require a shared SAN, which has constrained bandwidth for streaming. On a VLDB this can seriously restrict the table-scanning performance that is possible to achieve. Most data warehouse queries run table or range scans across large blocks of data. If the query will hit more than a few percent of rows a single table scan is often the optimal query plan.
Multiple local direct-attach disk arrays on nodes gives more disk bandwidth.
Having said that I am aware of an Oracle DW shop (a major european telco) that has an oracle based data warehouse that loads 600 GB per day, so the shared disk architecture does not appear to impose unsurmountable limitations.
Between MS-SQL and Oracle there are some differences. IMHO Oracle has better VLDB support than SQL server for the following reasons:
Oracle has native support for bitmap indexes, which are an index structure suitable for high speed data warehouse queries. They essentially do a CPU for I/O tradeoff as they are run-length encoded and use relatively little space. On the other hand, Microsoft claim that Index Intersection is not appreciably slower.
Oracle has better table partitioning facilities than SQL Server. IIRC The table partitioning in SQL Server 2005 can only be done on a single column.
Oracle can be run on somewhat larger hardware than SQL Server, although one can run SQL server on some quite respectably large systems.
Oracle has more mature support for Materialized views and Query rewrite to optimise relational queries. SQL2005 does have some query rewrite capability but it is poorly documented and I haven't seen it used in a production system. However, Microsoft will suggest that you use Analysis Services, which does actually support shared nothing configurations.
Unless you have truly biblical data volumes and are choosing between Oracle and a shared nothing architecture such as Teradata you will probably see little practical difference between Oracle and SQL Server. Particularly since the introduction of SQL2005 the partitioning facilities in SQL Server are viewed as good enough and there are plenty of examples of multi-terabyte systems that have been successfully implemented on it.
When you are talking 500TB, that is (a) big and (b) specialized.
I'd be going to a consultancy firm with appropriate specialists to look at the existing skill sets, integration with existing technology stacks, expected usage, backup/recovery/DR requirements....
In short, it's not the sort of project I'd be heading into based on opinions from stackoverflow. No offence intended, but there's simply too many factors to take into account, a lot of which would be business confidential.
Whether Oracle or MSSQL will scale / perform better is question #15. The data model is the first make-it or break-it item regardless of if you're running Oracle, MSSQL, Informix or anything else. Data model structure, what kind of applicaiton, how it accesses the db etc, which platform your developers know well enough to target for a large system etc are the first questions you should ask yourself.
I've worked as a DBA on Oracle (although some years back) and I use MSSQL extensively now, although not as a formal DBA. My advice would be that in the vast majority of cases both will meet everything you can throw at them and your performance issues will be much more dependent upon database design and deployment than the underlying characteristics of the products, which in both cases are absolutely and utterly solid (MSSQL is the best product that MS makes in many peoples opinion so don't let the usual perception of MS blind you on that).
Myself I would tend towards MSSQL unless your system is going to be very large and truly enterprise level (massive numbers of users, multiple 9's uptime etc.) simply because in my experience Oracle tends to require a higher level of DBA knowledge and maintenance than MSSQL to get the best out of it. Oracle also tends to be more expensive, both for initial deployment and in the cost to hire DBAs for it. OTOH if you are looking at an enterprise system then Oracle would have the edge, not least because if you can afford it their support is second to none.
I have to agree with those who said deisgn was more important.
I've worked with superfast and super slow databases of many different flavors (the absolute worst being an Oracle database, but it wasn't Oracle's fault). Design of the database and how you decide to index it and partition it and query it have far more to do with the scalability than whether the product is from MSSQL Server or Oracle.
I think you may more easily find more Oracle dbas with terrabyte database experience (running a large database is a specialty just like knowing a particular flavor of SQL) but that could depend on your local area.
oracle people will tell you oracle is better, sql server peopele will tell you sql server is better.
i say they scale pretty much the same. use what you know better. you have databases out there that are that size on oracle as well as sql server
When you get to OBSCENE database sizes (where over 1TB is really big enough, and 500TB is frigging massive), then operational support must come very high up on the list of requirements. With that much data, you don't mess about with penny pinching system specifications.
How are you going to backup that size of system? Upgrade the OS and patch the database? Scalability and reliability a concern?
I have experience of both Oracle and MS SQL, and for the really really big systems (users, data or importance) then Oracle is better designed for operational support and data management.
Every tried to backup and restore a 1TB+ SQL Server database split over multiple databases on multiple instances with transaction log files being spat out everywhere by each database and trying to keep it all in sync? Good luck with that.
With Oracle, you have ONE database (so I disagree with the "shared nothing" approach is better) with ONE set of REDO logs(1) and one set of archive logs(2) and you can just add extra hardware nodes without changing (i.e. repartitioning) you application and data.
(1) Redo logs are, of course, mirrored.
(2) Archive logs are, of course, stored in multiple locations.
It would also depend on what is your application meant for. If it uses only Inserts with very few updates, then I think MSSQL would be more scalable and better in terms of performance. However if one has lots of updates, then Oracle would scaleup better
I very much doubt that you are going to get an objective answer to that particular question, until you come across anyone that has implemented the same database (schema, data, etc.) on both platforms.
However given the fact that you can find millions of happy users of both databases, I dare say it's not too much of a stretch to say either will scale just fine (I've seen a snappy Sql 2005 implementation of 300 TB that seemed pretty responsive)
Oracle like a high-quality manual film camera, which needs the best photographer to take the best picture while MS SQL like an automatic digital camera. In old days, of course, all professional photographers will use film camera, now think about how many professional photographers use automatic digital camera.