I'm writing a database-style thing in C (i.e. it will store and operate on about 500,000 records). I'm going to be running it in a memory-constrained environment (VPS) so I don't want memory usage to balloon. I'm not going to be handling huge amounts of data - perhaps up to 200MB in total, but I want the memory footprint to remain in the region of 30MB (pulling these numbers out of the air).
My instinct is to do my own page handling (real databases do this), but I have received advice saying that I should just allocate it all and allow the OS to do the VM paging for me. My numbers will never rise above this order of magnitude. Which is the best choice in this case?
Assuming the second choice, at what point would it be sensible for a program to do its own paging? Obviously RDBMSes that can handle gigabytes must do this, but there must be a point along the scale at which the question is worth asking.
Thanks!
Use malloc until it's running. Then and only then, start profiling. If you run into the same performance issues as the proprietary and mainstream "real databases", you will naturally begin to perform cache/page/alignment optimizations. These things can easily be slotted in after you have a working database, and are orthogonal to having a working database.
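As a rough sketch of that starting point (the record layout here is hypothetical): one flat malloc/calloc for the whole working set, letting the OS worry about paging until profiling says otherwise.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fixed-size record; ~500,000 of these is roughly 200MB. */
typedef struct {
    long id;
    char payload[400];
} record_t;

int main(void)
{
    size_t n = 500000;
    record_t *records = calloc(n, sizeof *records);   /* one flat allocation */
    if (records == NULL) {
        fprintf(stderr, "out of memory\n");
        return 1;
    }

    records[0].id = 1;   /* ... operate on records in place ... */

    free(records);
    return 0;
}
```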
The database management systems that perform their own paging also benefit from the investment of huge research efforts to make sure their paging algorithms function well under varying system and load conditions. Unless you have a similar set of resources at your disposal I'd recommend against taking that approach.
The OS paging system you have at your disposal has already benefited from the tuning efforts of many people.
There are, however, some things you can do to tune your OS to benefit database type access (large sequential I/O operations) vs. the typical desktop tuning (mix of seq. and random I/O).
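That tuning lives at the OS level (readahead, swappiness, I/O scheduler choice); as a related application-side hint, and purely as a sketch, you can also tell the kernel that a file will be scanned sequentially:

```c
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);   /* "data.bin" is a hypothetical data file */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Tell the kernel to expect large sequential reads on this file;
     * on Linux this typically increases readahead for the descriptor. */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));

    /* ... stream through the file here ... */

    close(fd);
    return 0;
}
```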
In short, if you are a one man team or a small team, you probably should make use of existing tools rather than trying to roll your own in that particular area.
Ok, so I've been doing a bit of research into NoSQL databases, and they seem to be the right option for what I need. The problem, however, is that a lot of these databases, if not most of them, read from and write to RAM rather than disk. That's great when you have plenty of server resources or don't expect massive data blocks - but I think I should prepare for the worst.
What I expect to receive from these data sources is anywhere from 25KB to 150KB per query - yes, up to 150KB for a single key's value. The average user will produce anywhere from 500 to 5000 of these keys, and the count can grow without bound (though it will probably stop somewhere in that 5000 range). Doing some quick math (most of the data will be at the higher end of the 25-150KB range, so call it 100KB on average, and most users will probably produce 2000-3000 queries): 100KB * 3000 is 300MB per user! An insane amount of data once you have a decent userbase. Ultimately I'll probably throw away most of the data in each query so it's no more than 1KB or so, but even that will far surpass most RAM capacities.
So I think what I'm looking for is a solution that will store data to disk and cache objects in RAM... but I'm open to all solutions! Let me know what you guys think. I would love to keep this thing running fast...
Edit:
Wording it slightly differently so as to be useful to a passerby:
If one is looking to maximize performance but handle large data loads in a NoSQL database, what would be the recommended NoSQL database? I would think it would be one which stores data to disk, but that can compromise performance significantly. Is there a "best of both worlds" solution out there? It is important to note, I assume, that these records would not be modified once they were submitted, only read (and maybe not even that often).
I've been looking into Redis for such a task, because it looks very clean to manage - however, it runs entirely in RAM, so it requires small data blocks or multiple servers running multiple instances at once, which is something I don't have access to.
First of all, I think that when you say most of the ones you've seen store data in RAM, you're referring to in-memory key/value data stores like Redis or Memcached.
But there's more than that. Before closing the discussion on in-memory NoSQL options, I should say that you are right: memory fills up quite easily, and judging from your requirements you would need tons of it. So in-memory options should be discarded (not that they're not useful, just not in this specific situation).
My proposal is MongoDB. It does what you need: it stores data on disk and caches as much of it as it can in memory.
However, you need some powerful data storage options (SSD is what you should think about) so it can handle your data throughput needs. I've tested Mongo, but with far less data.
I was looking at collections of over 1 million elements, with value sizes ranging from 5KB to 50KB.
I was mostly interested in read speeds, but I should also mention the write speeds I tested, which are impressive: one million 20KB inserts in a few minutes (on a small server - quad core, 8GB of RAM, VMware VM).
Getting back to read speeds, I was looking for semi-concurrent queries that would give me under 50msec read times for around 100 concurrent users.
With some help from the MongoDB team I managed to get close to those times, but then I got into something else and had to drop my research (temporarily - I hope to resume it soon). There are far more things to look into, such as speeds for aggregates, map/reduce, etc.
I can say that query times on the server were super fast and all the overhead was added by BSON serialization/deserialization and transport over the network.
So, for you Mongo would be appropriate, but you have to back it up with some good hardware.
You should really install it and test it in your specific situation and draw your conclusions from your own tests.
If you're going to do it and your client is .NET, then you should use their official driver. Otherwise, there are plenty others listed here: http://www.mongodb.org/display/DOCS/Drivers.
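If your client happens to be C instead, a minimal sketch with the current C driver (libmongoc) might look like the following; the local mongod, database name, collection name, and field names are all assumptions for illustration.

```c
#include <mongoc/mongoc.h>   /* older libmongoc versions install this as <mongoc.h> */
#include <stdio.h>

int main(void)
{
    mongoc_init();
    mongoc_client_t *client =
        mongoc_client_new("mongodb://localhost:27017");          /* assumed local mongod */
    mongoc_collection_t *coll =
        mongoc_client_get_collection(client, "mydb", "blobs");   /* hypothetical names */

    /* Insert one document. */
    bson_error_t error;
    bson_t *doc = BCON_NEW("user_id", BCON_INT64(42),
                           "payload", BCON_UTF8("...up to ~150KB of data..."));
    if (!mongoc_collection_insert_one(coll, doc, NULL, NULL, &error))
        fprintf(stderr, "insert failed: %s\n", error.message);
    bson_destroy(doc);

    /* Query it back; Mongo decides what stays cached in memory. */
    bson_t *filter = BCON_NEW("user_id", BCON_INT64(42));
    mongoc_cursor_t *cursor =
        mongoc_collection_find_with_opts(coll, filter, NULL, NULL);
    const bson_t *found;
    while (mongoc_cursor_next(cursor, &found)) {
        char *json = bson_as_json(found, NULL);
        puts(json);
        bson_free(json);
    }

    mongoc_cursor_destroy(cursor);
    bson_destroy(filter);
    mongoc_collection_destroy(coll);
    mongoc_client_destroy(client);
    mongoc_cleanup();
    return 0;
}
```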
A good intro to Mongo features and how to use them can be found here: http://www.mongodb.org/display/DOCS/Developer+Zone. Granted, their documentation is not as good as RavenDB's (another NoSQL solution I've tested, but not nearly as fast), but you can get good support here or on Google Groups.
My question is why databases are not used with drawing, 3D modelling, 3D design, game engine, architecture, and similar software to save the current state of the images or whatever else is on the screen or part of a project.
One obvious answer is speed: retrieving or saving the millions of triangles or points that form the geometry would be very slow, since there would be hundreds or thousands of queries per second - but is that really the cause? The apparent advantage of using a database is that a design saved to a common location can be shared live over a network, so more than one person can work on it at a time or give live feedback while something is being designed, especially with time-based updates (say, every 5 or 10 seconds), which is not as good as live synchronization but should be quick enough. What is the basic problem with using databases in this type of software that has kept them from being used this way, and why haven't new algorithms or techniques been developed or studied to make this practical? Why hasn't this idea been researched much?
The short answer is essentially speed. Writing information to a disk drive is an order of magnitude slower than writing it to RAM, and network access is in turn an order of magnitude slower than reading or writing a hard disk. Live sharing apps like the one you describe are indeed possible, but they wouldn't necessarily require what you would call a "database", nor would using a database be such a great idea. The reason more don't exist is that they are actually fiendishly difficult to program. Programming is difficult enough even when thinking in a straight line, with a single narrative. Programming something like that requires you to accurately visualise multiple parallel streams of activity acting on the same data simultaneously, without breaking anything. That is genuinely hard.
Your obvious answer is correct; I'm not an expert in that particular field, but even from a distance you can see that's (probably) the main reason.
This doesn't negate the possibility that such systems will use a database at some point.
"The apparent advantage of using a database is that a design saved to a common location can be shared live over a network..."
True, but just because the information is being passed around doesn't mean that you have to store it in a database in order to be able to do that. Once something is in memory you can do anything with it - the issue with not persisting stuff is that you will lose data if the server fails / stops / etc.
I'll be interested to see if anyone has a more enlightened answer.
Interesting discussion... I have a bias toward avoiding adding too much "structure" (the indexes, tables, etc. of a DBMS solution) to spatial problems, beyond whatever cannot be readily recreated from a much smaller subset of the data. If the original poster looked at the problem in terms of what is truly needed to develop a solution, he might not need to query a DBMS nearly as often; the DBMS in question might relate to a 3D space but in fact hold only a small subset of the total values in that space, much as digital displays do not concern themselves with every change in pixel (or voxel) value, only with those that have changed. Since most 3D problems would have a much lower "refresh" rate than video displays, the hits on the DBMS would be infrequent enough to possibly make a 3D DB look "live".
Has anyone had any experience with MonetDB? Currently, I have a MySQL database that is growing too large, and queries are getting too slow. According to the column-oriented paradigm, insertions will be slower (which I don't mind at all), but data retrieval becomes very fast. Do I stand a chance of getting better data retrieval performance just by switching to MonetDB? Is MonetDB mature enough?
You have a chance of improving the performance of your application. The gain is, however, largely dependent on your workload, the size of your database and your hardware. MonetDB is developed/tuned under two main assumptions:
Your workload is analytical, i.e., you have lots of (grouped) aggregations and the like.
Even more important: your hot dataset (the data that you actually work with) fits into the main memory of your system. MonetDB does not have its own buffer manager but relies on the OS to handle disk I/O. Since the OS (especially Windows, but Linux too) is sometimes very dumb about disk swapping, that may become a problem (especially for joins that run out of memory).
As for the maturity, there are probably more opinions on that than people inhabiting this planet. Personally, I find it mature enough but I am a member of the development team and, thus, biased. But MonetDB is a research project so if you have an interesting application we'd love to hear about it and see if we can help.
The answer of course depends on your payload, but my experience so far would seem to indicate that just about everything is faster in MonetDB than anything I've seen in MySQL. The exception would be joins, which not only seem slow but seem completely inept at pipelining, so you end up needing gobs of memory to process large ones. That said, my experience with joins in MySQL hasn't exactly been stellar either, so I'm guessing your expectations may be low. If you really want good join performance, I'd probably recommend SQL Server or the like; for the other queries you mention in the follow-up comments, MonetDB should be awesome.
For instance, given a table with about 2 million rows, I was able to range on one column (wherein about 800K rows fell in the range) and order by another column, and the limited result was processed and returned in 25ms. Performance of that type of query does seem to degrade with scale, but that should give you a taste of what you might expect at that scale.
I should caution that the optimistic concurrency model might throw off those that have only been exposed to pessimistic concurrency (most people). I'd research it before wondering why some of your commits fail under concurrent load.
I am trying to create a key/value database with 300,000,000 key/value pairs of 8 bytes each (both for the key and the value). The requirement is to have a very fast key/value mechanism which can query about 500,000 entries per second.
I tried BDB, Tokyo DB, Kyoto DB, and LevelDB, and they all perform very badly with databases of that size (their performance is nowhere near the rates they benchmark at 1,000,000 entries).
I cannot store my database in memory because of hardware limitations (32 bit software), so memcached is out of the question.
I cannot use external server software either (only an embedded database module), and there is no need for multi-user support at all. Of course, server software cannot sustain 500,000 queries per second from a single endpoint anyway, so that rules out Redis, Tokyo Tyrant, etc.
David Segleau here, Product Manager for Berkeley DB.
The most common problem with BDB performance is that people don't configure the cache size, leaving it at the default, which is pretty small. The second most common problem is that people write application-behavior emulators that do random look-ups (even though their real application is not completely random), which forces them to read data from outside the cache. The random I/O then takes them down a path of conclusions about performance that are based on the simulated workload rather than the actual application behavior.
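For example, setting an explicit cache size is one call before DB->open. A minimal sketch with the BDB C API follows; the file name and the 1GB figure are placeholders.

```c
#include <db.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    DB *dbp;
    int ret = db_create(&dbp, NULL, 0);
    if (ret != 0) { fprintf(stderr, "db_create: %s\n", db_strerror(ret)); return 1; }

    /* Give the cache 1GB (gbytes, bytes, ncache) instead of the tiny default. */
    dbp->set_cachesize(dbp, 1, 0, 1);

    ret = dbp->open(dbp, NULL, "kv.db", NULL, DB_HASH, DB_CREATE, 0664);
    if (ret != 0) { fprintf(stderr, "open: %s\n", db_strerror(ret)); return 1; }

    /* 8-byte key, 8-byte value, as in the question. */
    unsigned long long k = 42, v = 1234;
    DBT key, data;
    memset(&key, 0, sizeof key);   key.data = &k;  key.size = sizeof k;
    memset(&data, 0, sizeof data); data.data = &v; data.size = sizeof v;
    dbp->put(dbp, NULL, &key, &data, 0);

    memset(&data, 0, sizeof data);
    if (dbp->get(dbp, NULL, &key, &data, 0) == 0)
        printf("value = %llu\n", *(unsigned long long *)data.data);

    dbp->close(dbp, 0);
    return 0;
}
```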
From your description, I'm not sure if you're running into these common problems or something else entirely. In any case, our experience is that Berkeley DB tends to perform and scale very well. We'd be happy to help you identify any bottlenecks and improve your BDB application throughput. The best place to get help in this regard would be the BDB forums at: http://forums.oracle.com/forums/forum.jspa?forumID=271. When you post to the forum it would be useful to show the critical query segments of your application code and the db_stat output showing the performance of the database environment.
It's likely that you will want to use BDB HA/Replication in order to load balance the queries across multiple servers. 500K queries/second is probably going to require a larger multi-core server or a series of smaller replicated servers. We've frequently seen BDB applications do 100-200K queries/second on commodity hardware, but 500K queries per second on 300M records in a 32-bit application is likely going to require some careful tuning. I'd suggest focusing on optimizing the performance of the queries in the BDB application running on a single node, and then using HA to distribute that load across multiple systems in order to scale your query/second throughput.
I hope that helps.
Good luck with your application.
Regards,
Dave
I found a good benchmark comparison web page that basically compares 5 renowned databases:
LevelDB
Kyoto TreeDB
SQLite3
MDB
BerkeleyDB
You should check it out before making your choice: http://symas.com/mdb/microbench/.
P.S. - I know you've already tested some of them, but you should also consider that your configuration in those tests may not have been optimal; the benchmark results suggest these engines can do much better.
Try ZooLib.
It provides a database with a C++ API that was originally written for Knowledge Forum, a high-performance multimedia database for educational institutions. It could handle 3,000 simultaneous Mac and Windows clients (also written in ZooLib - it's a cross-platform application framework), all of them streaming audio and video and working with graphically rich documents created by the teachers and students.
It has two low-level APIs for actually writing your bytes to disk. One is very fast but is not fault-tolerant. The other is fault-tolerant but not as fast.
I'm one of ZooLib's developers, but I don't have much experience with ZooLib's database component. There is also no documentation - you'd have to read the source to figure out how it works. That's my own damn fault, as I took on the job of writing ZooLib's manual over ten years ago, but barely started it.
ZooLib's primary developer, Andy Green, is a great guy and always happy to answer questions. What I suggest you do is subscribe to ZooLib's developer list at SourceForge and then ask on the list how to use the database. Most likely Andy will answer you himself, but maybe one of our other developers will.
ZooLib is Open Source under the MIT License, and is really high-quality, mature code. It has been under continuous development since 1990 or so, and was placed in Open Source in 2000.
Don't be concerned that we haven't released a tarball since 2003. We probably should, as this leads lots of potential users to think it's been abandoned, but it is very actively used and maintained. Just get the source from Subversion.
Andy is a self-employed consultant. If you don't have time but you do have a budget, he would do a very good job of writing custom, maintainable top-quality C++ code to suit your needs.
I would too, if it were any part of ZooLib other than the database, which as I said I am unfamiliar with. I've done a lot of my own consulting work with ZooLib's UI framework.
300 M * 8 bytes = 2.4GB. That will probably fit into memory (if the OS does not restrict the address space to 31 bits)
Since you'll also need to handle collisions (either by a probing/rehashing scheme or by chaining), memory gets even tighter: for linear probing you probably need more than 400M slots, and chaining increases the size of an item to 12 bytes (bit fiddling might win you back a few bits). That would increase the total footprint to circa 3.6GB.
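To make that arithmetic concrete, here is a rough sketch of the two in-memory layouts being compared (the 4+4 byte split of the 8 bytes between key and value is just illustrative):

```c
#include <stdint.h>

/* Open addressing (linear probing): one flat array of slots.
 * > 400M slots * 8 bytes is roughly 3.2GB before any bookkeeping. */
typedef struct {
    uint32_t key;    /* illustrative 4-byte key   */
    uint32_t value;  /* illustrative 4-byte value */
} slot_t;

/* Chaining: each item carries a 4-byte "next" index, so
 * sizeof(item_t) == 12 and 300M * 12 bytes is circa 3.6GB. */
typedef struct {
    uint32_t key;
    uint32_t value;
    uint32_t next;   /* index of the next item in the bucket's chain */
} item_t;
```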
In any case you will need a specially crafted kernel that restricts its own "reserved" address space to a few hundred MB. Not impossible, but a major operation. Escaping to a disk-based solution would be too slow in all cases. (PAE could save you, but it is tricky.)
IMHO your best choice would be to migrate to a 64-bit platform.
500,000 entries per second without holding the working set in memory? Wow.
In the general case this is not possible with HDDs, and it is difficult even with SSDs.
Do you have any locality in the access pattern that might make the task a bit easier? What kind of queries do you have?
We use Redis. Written in C, it's only slightly more complicated than memcached by design. We've never tried to use that many rows, but for us latency is very important; it handles those latencies well and lets us persist the data to disk.
Here is a benchmark blog entry comparing Redis and memcached.
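For reference, talking to Redis from C is only a few lines with the hiredis client; this is a sketch assuming a local redis-server, and the key name is made up:

```c
#include <hiredis/hiredis.h>
#include <stdio.h>

int main(void)
{
    /* Assumes a redis-server listening on localhost. */
    redisContext *c = redisConnect("127.0.0.1", 6379);
    if (c == NULL || c->err) {
        fprintf(stderr, "connect failed: %s\n", c ? c->errstr : "alloc error");
        return 1;
    }

    /* Hypothetical key; value could be a binary blob as well. */
    redisReply *r = redisCommand(c, "SET user:42 %s", "some payload");
    freeReplyObject(r);

    r = redisCommand(c, "GET user:42");
    if (r && r->type == REDIS_REPLY_STRING)
        printf("value = %s\n", r->str);
    freeReplyObject(r);

    redisFree(c);
    return 0;
}
```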
Berkeley DB could do it for you.
I achieved 50,000 inserts per second about 8 years ago, with a final database of 70 billion records.
I need to choose a database management system (DBMS) that uses the least amount of main memory since we are severely constrained. Since a DBMS will use more and more memory to hold the index in main memory, how exactly do I tell which DBMS has the smallest memory footprint?
Right now I just have a memory monitor program open while I perform a series of queries we'll call X. Then I run the same set of queries X on a different DBMS and see how much memory is used in its lifetime and compare with the other memory footprints.
Is this a not-dumb way of going about it? Is there a better way?
Thanks,
Jbu
Just use SQLite. In a single process. With C++, preferably.
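For example, a minimal sketch against the SQLite C API (which is also what you would call from C++); the database file, table, and cache_size value are placeholders, and PRAGMA cache_size caps the page cache so the footprint stays predictable:

```c
#include <sqlite3.h>
#include <stdio.h>

int main(void)
{
    sqlite3 *db;
    if (sqlite3_open("app.db", &db) != SQLITE_OK) {   /* placeholder file name */
        fprintf(stderr, "open: %s\n", sqlite3_errmsg(db));
        return 1;
    }

    /* Cap SQLite's page cache (negative value = size in KB). */
    sqlite3_exec(db, "PRAGMA cache_size = -8000;", NULL, NULL, NULL);
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS records(id INTEGER PRIMARY KEY, value TEXT);",
                 NULL, NULL, NULL);

    sqlite3_stmt *stmt;
    sqlite3_prepare_v2(db, "SELECT id, value FROM records WHERE id < ?", -1, &stmt, NULL);
    sqlite3_bind_int(stmt, 1, 100);

    /* Rows come back one at a time; nothing forces the whole result set into memory. */
    while (sqlite3_step(stmt) == SQLITE_ROW)
        printf("%d -> %s\n", sqlite3_column_int(stmt, 0),
               (const char *)sqlite3_column_text(stmt, 1));

    sqlite3_finalize(stmt);
    sqlite3_close(db);
    return 0;
}
```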
What you can do in the application is manage how you fetch data. If you fetch all rows from a given query, it may try to build a Collection in your application, which can consume memory very quickly if you're not careful. This is probably the most likely cause of memory exhaustion.
To solve this, open a cursor to a query and fetch the rows one by one, discarding the row objects as you iterate through the result set. That way you only store one row at a time, and you can predict the "high-water mark" more easily.
Depending on the JDBC driver (i.e. the brand of database you're connecting to), it may be tricky to convince the JDBC driver not to do a fetchall. For instance, some drivers fetch the whole result set to allow you to scroll through it backwards as well as forwards. Even though JDBC is a standard interface, configuring it to do row-at-a-time instead of fetchall may involve proprietary options.
On the database server side, you should be able to manage the amount of memory it allocates to index cache and so on, but the specific things you can configure are different in each brand of database. There's no shortcut for educating yourself about how to tune each server.
Ultimately, this kind of optimization is probably answering the wrong question.
Most likely the answers you gather through this sort of testing are going to be misleading, because the DBMS will react differently under "live" circumstances than during your testing. Furthermore, you're locking yourself into a particular architecture. It's difficult to change DBMS down the road, once you've got code written against it. You'd be far better served finding which DBMS will fill your needs and simplify your development process, and then making sure you're optimizing your SQL queries and indices to fit the needs of your application.