Which database support scalability and availability? [closed] - database

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
Improve this question
What database software can do these?
Scalability via data partitioning, such as consistent hash.
Redundancy for fail over. Both memory cache and disk storage together. Key-value data. Value is document type such as JSON.
Prefers A and P in CAP theory.
I heard that MemcacheD can do these all, but I am not sure.
Here's details:
Data storage volume is, <30KB JSON document for each key. Keys shall be >100,000,000.
Data is accessed >10K times for a second.
Persistence is needed for every key-value data.
No need for transaction.
Development environment is C#, but other languages are ok if the protocol spec is known.
Map reduce is not needed.

This is too short a spec description to choose a database. There are tons of other contraints to consider (data storage volume, data transfer volume, persistence requirement, needs for transactions, development environnment, map reduce, etc.).
That being said:
Memcachedor Redis are memory database which means that you cannot store more than what your computer memory can hold. This is less true now that distributed capabilities have been added to redis.
Document database (such as MongoDB or Microsoft Document Db) support everything. And you can add memcached or redis in front. That's how most people use them.
I would like to add that any SQL can now deal with JSON. So that works too. With a cache up front if needed.
Some link of interest for JSON oriented database. But once again. That's too short a spec to choose a database.

Related

When should one use Redis as a Primary Database and Elastic Search [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 2 years ago.
Improve this question
I have following scenario, others may have different. How should we decide between Redis as persistent primary database and Elastic Search.
In a micro-service, database has lots of read requests, in comparison to write request. Also my data will have only 8-10 columns or keys in terms of JSON (Simple data structure).
If my database hardly gets write request in respect to read request, why should we not use Redis as persistent Database. I went through Redis Office document and found why should we use it as persistent database [Goodbye Cache: Redis as a Primary Database]
But still not convinced fully to use it as a Primary Database
The answer would depend on your application and what it does internally. But assuming you don't need particularly complicated queries to get the data (no complex filtering, for example) and you can fit all your information in memory, I see Redis as a completely valid alternative to a traditional database.
If you want the strongest possible guarantees Redis can offer, you'd want to enable both RDB and AOF persistence options (read https://redis.io/topics/persistence).
The big advantage of a set-up like this is you can trust Redis to improve the throughput of the application, and maintain a very good level of performance over time, even with a growing dataset.

Which key-value database to use for BLOB storage? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
Improve this question
We have a service that currently runs on top of a MySQL database and uses JBoss to run the application. The rate of the database growth is accelerating and I am looking to change the setup to improve scaling. The issue is not in a large number of rows nor (yet) a particularly high volume of queries but rather in the large number of BLOBs stored in the db. Particularly the time it takes to create or restore a backup (we use mysqldump and Percona Xtrabackup ) is a concern as well as the fact that we will need to scale horizontally to keep expanding the disk space in the future. At the moment the db size is around 500GB.
The kind of arrangement that I figure would work well for our future needs is a hybrid database that uses both MySQL and some key-value database. The latter would only store the BLOBs. The meta data as well as data for user management and business logic of the application would remain in the MySQL db and benefit from structured tables and full consistency. The application itself would handle the issue of consistency between the databases.
The question is which database to use? There are lots of NoSQL databases to choose from. Here are some points on what qualities I am looking for:
Distributed over multiple nodes, which are flexible to add or remove.
Redundancy of storage, with the database automatically making sure each value object is stored on at least two different nodes.
Value objects' size could range from a few dozen bytes to around 100MB.
The database is accessed from a java EJB application on top of JBoss as well as a program written in C++ that processes the data in the db. Some sort of connector for each would be needed.
No need for structure for the data. A single string or even just a large integer would suffice for the key, pure byte array for the value.
No updates for the value objects are needed, only inserts and deletes. If a particular object is made obsolete by a new object that fulfills the same role, the old object is deleted and a new object with a new key is inserted.
Having looked around a bit, Riak sounds good except for its problems with storing large value objects. Which database would you recommend?

Embedded document database for Node.js [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Improve this question
What is the best embedded NoSQL database for Node.js?
My node.js application is too tiny to use a big database like mongodb which need extra configurations.
I tried EJDB, but it need too much disk space (about 1.5MB for each record).
I also search the web, but google gave me so many choices, which made me so confused.
Here are some requirements:
fast and lightweight
no need too much config
be able to store large amount of data (about hundred thousand records, total size in 1GB)
better package for Node.js
The most popular embedded Node.js database on GitHub is NeDB
Check this one: https://www.npmjs.com/package/ultradb
* no need for configuration, just point database file
* written for node in C, uses native node API - NAPI
* ideal for storing logs, large number of documents
* not ideal if you need to query over documents properties - you'll have to handle all kind of indexing yourself
* works as part of node, no separate processes
* multicore processors friendly - scales with node processes, multiple node instances/forks can work simultaneously on same database
* about 300kB
Maybe redis with storing documents as hashes?
http://redis.io/commands#hash
It works perfectly on Rapsberry Pi with about 100 hits per minute.
Also there is a good ORM for it - http://jugglingdb.co/

Looking for a disk-based redis-like database [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
Improve this question
I am currently using Redis for my app, and its features are really excellent for my application (lists, sets, sorted sets etc.).
My application relies heavily on sorted sets, lists, sets. And their related functions (push to a list, get list, union of sets etc. The only problem I am facing right now is that my data is large, and most of my data does not need to be in memory, and I want to store them on disk.
**I need an on-disk database with redis data structures **
I read about Cassandra but I am not sure if it supports sorted sets, sets, lists. Or at least if it does, I could not find methods to manipulate them the way Redis does.
Thanks.
https://github.com/yinqiwen/ardb
another REDIS protocol replacement
with LMDB, RocksDB and LevelDB disk-based backend
nice benchmarks
There are numerous on-disk databases with Redis-like datastructures or even trying to be drop-in protocol-compatible replacements for Redis.
There are excellent recommendations in "Is there something like Redis DB, but not limited with RAM size?" - pity the community considers such questions to be off-topic.
In particular, SSDB is an actively-maintained Redis-like on-disk database (but not directly compatible), and Ardb is an actively-maintained drop-in replacement for Redis that stores the data on disk. Disclaimer: I have not used either of them (yet).
try Edis - Erlang implementation of Redis based on leveldb http://inaka.github.io/edis/
I am encouraging you to learn Cassandra, while it has some things similar to key/value and sets, it is very different from Redis.
We currently moving one project from Redis (we use sadd / spop) to TokyoCabinet / KyotoCabinet via Memcached protocol. For the moment things looks good and very soon I will publish the lib on github - will be available here:
https://github.com/nmmmnu
and project will be called Simple Message Queue. It will support sadd / spop / sismember only. Also in Python you will be able to use new object instead of Redis object, but only for these three commands.
Hope this helps.
Update 2014.07:
Here is the project I am speaking about.
https://github.com/nmmmnu/MessageQueue
It implements both Redis and Memcached protocols. For back-end it uses memory ndb/mdb or berkeley db.

Is there a database with git-like qualities? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
Improve this question
I'm looking for a database where multiple users can contribute and commit new data; other users can then pull that data into their own database repository, all in a git-like manner. A transcriptional database, if you like; does such a thing exist?
My current thinking is to dump the database to a single file as SQL, but that could well get unwieldy once it is of any size. Another option is to dump the database and use the filesystem, but again it gets unwieldy once of any size.
There's Irmin: https://github.com/mirage/irmin
Currently it's only offered as an OCaml API, but there's future plans for a GraphQL API and a Cap'n'Proto one.
Despite the complex API and the still scarce documentation, it allows you to plug any backend (In-Memory, Unix Filesystem, Git In-Memory and Git On-Disk). Therefore, it runs even on Unikernels and Browsers.
It also offers a bidirectional model where changes on the Git local repository are reflected upon Application State and vice-versa. With the complex API, you can operate on any Git-level:
Append-only Blob storage.
Transactional/compound Tree layer.
Commit layer featuring chain of changes and metadata.
Branch/Ref/Tag layer (only-local, but offers also remotes) for mutability.
The immutable store is often associated/regarded for the blobs + trees + commits on documentation.
Due the Content-addressable inherited Git-feature, Irmin allows deduplication and thus, reduced memory-consumption. Some functionally persistent data structures fit perfectly on this database, and the 3-way merge is a novel approach to handle merge conflicts on a CRDT-style.
Answer from: How can I put a database under version control?
I have been looking for the same feature for Postgres (or SQL databases in general) for a while, but I found no tools to be suitable (simple and intuitive) enough. This is probably due to the binary nature of how data is stored. Klonio sounds ideal but looks dead. Noms DB looks interesting (and alive). Also take a look at Irmin (OCaml-based with Git-properties).
Though this doesn't answer the question in that it would work with Postgres, check out the Flur.ee database. It has a "time-travel" feature that allows you to query the data from an arbitrary point in time. I'm guessing it should be able to work with a "branching" model.
This database was recently being developed for blockchain-purposes. Due to the nature of blockchains, the data needs to be recorded in increments, which is exactly how git works. They are targeting an open-source release in Q2 2019.
Because each Fluree database is a blockchain, it stores the entire history of every transaction performed. This is part of how a blockchain ensures that information is immutable and secure.
It's not SQL, but CouchDB supports replicating the database and pushing/pulling changes between users in a way similar to what you describe.
Some more information in the chapter on replication in the O'Reilly CouchDB book.

Resources