Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
I am looking for a non-SQL database.
My requirements are as follows:
Should be able to store more than 10 billion records.
Should consume at most 1 GB of memory.
A user request should take less than 10 ms (including processing time).
Java-based would be great (I need to access it from Java, and I may need to modify the database code at some point).
The database will hold e-commerce search records such as number of searches, sales, product buckets, product filters, and many more. The database is currently a flat file, and I show only some specific data to users. I configure in advance which data is shown, and users can then send HTTP requests to view data according to that configuration. I want to make things more dynamic so that people can view data without prior configuration.
In other words, I want to build a fast analyzer that can show users whatever they request.
The best place to find names of non-relational databases is the NoSQL site. Their home page has a pretty comprehensive list, split into various categories: Wide Column Store, Key-value Pair, Object, XML, etc.
You don't really give enough information about your requirements, but it sounds like kdb+ meets all of the requirements that you've stated, provided you are willing to get to grips with the rather exotic (and very powerful) Q language.
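As a rough back-of-envelope check on the stated requirements, the sketch below divides the memory budget by the record count. The record count and memory limit come from the question; the 100-byte record size is purely a hypothetical figure for illustration.

```python
# Figures taken from the question's stated requirements.
RECORDS = 10_000_000_000        # "more than 10 billion records"
MEMORY_BYTES = 1 * 1024**3      # "1 GB of memory", read here as 1 GiB


def bytes_per_record(memory_bytes: int, records: int) -> float:
    """RAM available per record if every record had to be memory-resident."""
    return memory_bytes / records


def min_storage_bytes(records: int, record_size: int) -> int:
    """Raw storage needed for the records, ignoring indexes and overhead."""
    return records * record_size


if __name__ == "__main__":
    per_record = bytes_per_record(MEMORY_BYTES, RECORDS)
    print(f"{per_record:.3f} bytes of RAM per record")

    # ASSUMPTION: 100 bytes per record is a hypothetical size, not from the question.
    total = min_storage_bytes(RECORDS, 100)
    print(f"{total / 1000**4:.1f} TB on disk at 100 bytes per record")
```

With roughly a tenth of a byte of RAM per record, the data cannot be held in memory, so whichever database is chosen would have to keep the records on disk and use memory only for indexes or caches.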
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 months ago.
I'm developing a web backend with two modules. One handles a relatively small amount of data that doesn't change often. The other handles real-time data that's constantly being dumped into the database and never gets changed or deleted. I'm not sure whether to have separate databases for each module or just one.
The data between the modules is interconnected quite a bit, so it's a lot more convenient to have it in a single database.
But if anything fails, I need the first database to be available for reads as soon as possible, while the second one can wait.
Also I'm not sure how much performance impact the constantly growing large database would have on the first one.
I'd like to make dumps of the data available to the public, and I don't want users downloading gigabytes that they don't need.
And if I decide to use a single one, how easy is it to separate them later? I use Postgres, btw.
Sounds like you have a website with its content being the first DB, and some kind of analytics being the second DB.
It makes sense to separate those physically (as in on different servers), especially if one of them is required to be available as much as possible. Separating mission-critical parts from something less important is good design. Also, a smaller DB means shorter recovery times from a backup, should the need arise.
For the data that is interconnected, if you need remote lookup from one DB into another, Foreign Data Wrappers may help.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Firebase says the Spark plan allows only 100k simultaneous users. It also says this is per database. What does that mean? How can I store data in multiple databases and connect them to each other? It also states a limit of 1 GB of stored data. Approximately how much is that? Say one user's data has 10 children: how many users' data can be stored in that space? Someone please help me out, as Google isn't very clear about it.
I'm going to assume you're talking about Realtime Databases and not Cloud Firestore.
The Firebase Spark "Free" plan includes 100 simultaneous users, not 100k. (100k+ users are supported with the Flame and Blaze plans.)
You can store 1GB worth of data in the Realtime Database, with 100GB worth of downloads per month. This plan supports only one database per project, so connecting multiple databases isn't possible.
It's hard to determine how much "storage" that would take up, due to varying factors. But, a good rule of thumb is that most JSON data doesn't take up a lot of space so you should be good.
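For a rough feel of the numbers, one could serialize a sample record and divide the storage limit by its size. This is only a sketch under stated assumptions: the record shape and field values are hypothetical, "1GB" is read as 10^9 bytes, and the way Firebase actually accounts for stored data may differ from the raw JSON length.

```python
import json

# ASSUMPTION: the plan's "1GB" is read as 10**9 bytes.
STORAGE_LIMIT_BYTES = 10**9


def make_sample_user(n_children: int = 10) -> dict:
    """Build a hypothetical user record with n_children short string fields."""
    return {f"field{i}": f"value-{i:04d}" for i in range(n_children)}


def json_size_bytes(record: dict) -> int:
    """Size of the record as compact UTF-8 encoded JSON."""
    return len(json.dumps(record, separators=(",", ":")).encode("utf-8"))


def users_that_fit(limit_bytes: int, bytes_per_user: int) -> int:
    """How many records of the given size fit under the limit."""
    return limit_bytes // bytes_per_user


if __name__ == "__main__":
    size = json_size_bytes(make_sample_user())
    print(f"sample record: {size} bytes")
    print(f"approx. users in 1GB: {users_that_fit(STORAGE_LIMIT_BYTES, size):,}")
```

Swapping in a record that resembles the real data (longer strings, nested children) gives a more realistic estimate than the placeholder values used here.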
To clarify, "simultaneous users" is the number of users that can be connected to a single database at the same time, via any interface or platform.
There's great documentation on the features and pricing of Firebase here, and I would also recommend reading some of their documentation on the Realtime Database.
I hope this helps, if you need any more help please let me know.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I am building a social network app and I am very concerned about the following. What happens in MongoDB when a lot of users (let's assume millions) try to modify the same document at the same time? Will there be any mismatches, ignored queries, or any other kind of unexpected behaviour?
Practical Example:
2 collections: 'posts' and 'likes'
posts will have fields id | name | info | numberOfLikes
likes will have fields id | post | fromUser
Assume millions of users like a post: for each like, a like object is added to the 'likes' collection and the business logic automatically increments numberOfLikes for the post. I wonder whether there could be a conflict when tons of users try to modify that post's like count at the same time.
Databases have mechanisms in place to prevent this kind of situation. You can 'lock' on various logical structures, so you can be assured your data is intact - regardless of your transaction count.
See more below:
http://docs.mongodb.org/manual/faq/concurrency/
In MongoDB, operations are atomic at the document level.
See http://docs.mongodb.org/manual/core/data-modeling-introduction/#atomicity-of-write-operations
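Because a single-document update is atomic, the like counter from the question can be incremented with the `$inc` update operator instead of a read-modify-write cycle in application code. The following is a minimal sketch, assuming PyMongo-style collection objects; the function name `like_post` is hypothetical, and it uses MongoDB's `_id` field where the question writes `id`.

```python
def like_post(posts, likes, post_id, user_id) -> bool:
    """Record a like and bump the post's counter.

    `posts` and `likes` are assumed to be PyMongo-style collection objects
    exposing insert_one() and update_one().
    """
    # One document per like, using the field names from the question.
    likes.insert_one({"post": post_id, "fromUser": user_id})

    # $inc is applied by the server as a single atomic operation on the
    # matched document, so concurrent likes do not overwrite each other.
    result = posts.update_one(
        {"_id": post_id},
        {"$inc": {"numberOfLikes": 1}},
    )
    return result.modified_count == 1
```

Note that the insert and the increment touch two different documents, so this sketch does not make the pair atomic as a whole; only each individual write is.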
A couple of things.
You're saying you're building a social app and expect millions of likes and "tons" of them at the same time. Of course it's good to consider performance and scaling at the start of a project, but you're not going to build the next Facebook right now.
Furthermore, you seem to want to use MongoDB as the primary database for this app, and you seem to want to use it as a relational database. Read the article with the somewhat biased title, Why You Should Never Use MongoDB.
I'd suggest backing your site with a relational database (which may also be better for queries like "What posts did this user like" and "Did this user already like this post") and denormalizing that into MongoDB at regular intervals.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 5 years ago.
We have a service that currently runs on top of a MySQL database and uses JBoss to run the application. The rate of database growth is accelerating and I am looking to change the setup to improve scaling. The issue is not a large number of rows nor (yet) a particularly high volume of queries, but rather the large number of BLOBs stored in the db. In particular, the time it takes to create or restore a backup (we use mysqldump and Percona Xtrabackup) is a concern, as is the fact that we will need to scale horizontally to keep expanding the disk space in the future. At the moment the db size is around 500GB.
The kind of arrangement that I figure would work well for our future needs is a hybrid database that uses both MySQL and some key-value database. The latter would only store the BLOBs. The meta data as well as data for user management and business logic of the application would remain in the MySQL db and benefit from structured tables and full consistency. The application itself would handle the issue of consistency between the databases.
The question is which database to use? There are lots of NoSQL databases to choose from. Here are some points on what qualities I am looking for:
Distributed over multiple nodes, which are flexible to add or remove.
Redundancy of storage, with the database automatically making sure each value object is stored on at least two different nodes.
Value objects' size could range from a few dozen bytes to around 100MB.
The database is accessed from a Java EJB application on top of JBoss as well as from a program written in C++ that processes the data in the db. Some sort of connector for each would be needed.
No need for structure for the data. A single string or even just a large integer would suffice for the key, pure byte array for the value.
No updates for the value objects are needed, only inserts and deletes. If a particular object is made obsolete by a new object that fulfills the same role, the old object is deleted and a new object with a new key is inserted.
Having looked around a bit, Riak sounds good except for its problems with storing large value objects. Which database would you recommend?
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
I have around 10 tables containing millions of rows. I now want to archive 40% of the data because of size and performance problems.
What would be the best way to archive the old data while the web application keeps running? In the near future I may also need to show the old data alongside the existing data.
Thanks in advance.
There is no single solution for every case. It depends heavily on your data structure and application requirements. The most common cases seem to be as follows:
If your application can't be redesigned and instant access to all your data is required, you need a more powerful hardware/software solution.
If your application can't be redesigned but some of your data can be considered obsolete because it is requested relatively rarely, you can split the data and configure two applications to access the different parts.
If your application can't be redesigned but some of your data can be considered non-critical and can be reduced (consolidated, packed, etc.), you can transform that data while keeping the full data in another place for special requests.
If it's possible to redesign your application, there are many ways to solve the problem. In general you will implement some kind of archive subsystem, and in general that is a complex problem, especially if not only your data but also your data structure changes over time.
If it's possible to redesign your application, you can optimize your data structure using new supporting tables, indexes, and other database objects and algorithms.
Create an archive database, and if possible maintain a separate archive server: this data won't be needed often but still has to be kept for future purposes, so moving it reduces the load and disk usage on the live server.
Move the tables' data to that location. Later you can retrieve it in a number of ways:
Changing the application's path to point at the archive
or updating the live table from the archive table
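The move-then-retrieve idea above can be sketched as follows. This is a minimal illustration, not a production procedure: it uses Python's built-in `sqlite3` as a stand-in for the real database, and the table names (`orders`, `orders_archive`), the view `orders_all`, and the `created_at` column are all hypothetical.

```python
import sqlite3


def setup_demo() -> sqlite3.Connection:
    """Create a hypothetical live table, an archive table, and a combined view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, total REAL)")
    conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created_at TEXT, total REAL)")
    # The view lets the application read live and archived rows together.
    conn.execute(
        "CREATE VIEW orders_all AS "
        "SELECT * FROM orders UNION ALL SELECT * FROM orders_archive"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2019-01-01", 10.0), (2, "2020-06-01", 20.0), (3, "2023-03-01", 30.0)],
    )
    conn.commit()
    return conn


def archive_old_rows(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move rows older than cutoff (ISO date string) into the archive table."""
    # "with conn" commits on success and rolls back on an exception,
    # so the copy and the delete succeed or fail together.
    with conn:
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE created_at < ?",
            (cutoff,),
        )
        cur = conn.execute("DELETE FROM orders WHERE created_at < ?", (cutoff,))
    return cur.rowcount


if __name__ == "__main__":
    conn = setup_demo()
    moved = archive_old_rows(conn, "2021-01-01")
    print("rows archived:", moved)
    print("rows visible via orders_all:",
          conn.execute("SELECT COUNT(*) FROM orders_all").fetchone()[0])
```

On large tables the copy and delete would normally be run in smaller batches to avoid long locks, and with a separate archive server the combined view would need some cross-server mechanism instead of a local `UNION ALL`.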