Looking for a disk-based Redis-like database [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I am currently using Redis for my app, and its features are excellent for my application (lists, sets, sorted sets, etc.).
My application relies heavily on sorted sets, lists, and sets, and their related operations (push to a list, get a list, union of sets, etc.). The only problem I am facing right now is that my data is large, most of it does not need to be in memory, and I want to store it on disk.
**I need an on-disk database with Redis data structures.**
I read about Cassandra, but I am not sure whether it supports sorted sets, sets, and lists; if it does, I could not find methods to manipulate them the way Redis does.
Thanks.
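For reference, the operations described above map onto standard Redis commands; a minimal sketch using the redis-py client (key names and values are purely illustrative):

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # standard Redis endpoint

# Lists: push and read back
r.rpush("recent:posts", "post:1", "post:2")
print(r.lrange("recent:posts", 0, -1))

# Sets and set union
r.sadd("tags:a", "redis", "disk")
r.sadd("tags:b", "disk", "lmdb")
print(r.sunion("tags:a", "tags:b"))

# Sorted sets: add with scores, read back ordered by score
r.zadd("leaderboard", {"alice": 120, "bob": 95})
print(r.zrange("leaderboard", 0, -1, withscores=True))
```

The question is essentially asking for a store that accepts exactly these commands but keeps the data on disk instead of in RAM.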

https://github.com/yinqiwen/ardb
Another Redis-protocol replacement, with LMDB, RocksDB and LevelDB disk-based backends, and nice benchmarks.

There are numerous on-disk databases with Redis-like data structures, some of which even try to be drop-in, protocol-compatible replacements for Redis.
There are excellent recommendations in "Is there something like Redis DB, but not limited with RAM size?"; it's a pity the community considers such questions off-topic.
In particular, SSDB is an actively-maintained Redis-like on-disk database (but not directly compatible), and Ardb is an actively-maintained drop-in replacement for Redis that stores the data on disk. Disclaimer: I have not used either of them (yet).
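Because Ardb speaks the Redis protocol, existing client code should keep working unchanged and only the endpoint changes; a minimal sketch with redis-py (the port shown is an assumption, check your Ardb configuration):

```python
import redis

# Same client, same commands; just point it at the Ardb server instead of Redis.
ardb = redis.Redis(host="localhost", port=16379)  # port is illustrative
ardb.zadd("leaderboard", {"alice": 120})
print(ardb.zrange("leaderboard", 0, -1, withscores=True))
```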

Try Edis, an Erlang implementation of Redis based on LevelDB: http://inaka.github.io/edis/

I encourage you to learn Cassandra, but note that while it has some things similar to key/value stores and sets, it is very different from Redis.
We are currently moving one project from Redis (we use sadd/spop) to TokyoCabinet/KyotoCabinet via the Memcached protocol. For the moment things look good, and very soon I will publish the lib on GitHub; it will be available here:
https://github.com/nmmmnu
The project will be called Simple Message Queue. It will support sadd/spop/sismember only. Also, in Python you will be able to use the new object instead of the Redis object, but only for these three commands.
Hope this helps.
Update 2014.07:
Here is the project I am speaking about.
https://github.com/nmmmnu/MessageQueue
It implements both the Redis and Memcached protocols. For the back end it uses in-memory ndb/mdb or Berkeley DB.
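Since the project speaks the Redis protocol and supports only sadd/spop/sismember, a standard Redis client restricted to those three commands should work against it; a minimal sketch with redis-py (host and port are assumptions, check the server's configuration):

```python
import redis

mq = redis.Redis(host="localhost", port=6379)  # point this at the MessageQueue server

mq.sadd("jobs", "job:1", "job:2")       # enqueue
print(mq.sismember("jobs", "job:1"))    # membership test
print(mq.spop("jobs"))                  # pop a random member
```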

Related

Is there any high performance POSIX-like filesystem without a single point of failure? [closed]

We have a web service that needs a somewhat POSIX-compatible shared filesystem for the application servers (multiple redundant systems running in parallel behind redundant load balancers). We are currently running GlusterFS as the shared filesystem for the application servers, but I'm not happy with its performance. Compared to the actual raw performance of the storage servers running GlusterFS, it starts to look more sensible to run DRBD and a single NFS server, with all the other GlusterFS servers (currently 3) waiting in a hot-standby role.
Our workload is highly read-oriented and usually deals with small files, and I'd be happy to use an "eventually consistent" system as long as a client can request a sync for a single file if needed (that is, the client is prepared to wait until the file has been successfully stored in the backend storage). I'd even accept a system where such a "sync" requires querying the state of the file via some means other than POSIX fdatasync(). File metadata such as modification times is not important; only the filename and the contents matter.
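For concreteness, the per-file durability requirement above amounts to the usual write-then-fsync-then-rename pattern; a minimal local-filesystem sketch in Python (the paths and helper name are illustrative):

```python
import os

def durable_write(path: str, data: bytes) -> None:
    """Write data to path so that it survives a crash once this returns."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())      # flush file contents to stable storage
    os.replace(tmp, path)         # atomic rename over the old file
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)          # make the rename itself durable
    finally:
        os.close(dir_fd)
```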
I'm currently aware of possible candidates and the problems each one currently has:
GlusterFS: overall performance is pretty poor in practice, and performance goes down as new servers/bricks are added.
Ceph: highly complex to configure/administer, and as far as I know POSIX compatibility sacrifices a lot of performance.
MooseFS: partially obfuscated open source (huge dumps of internally written code published infrequently, with the patch history intentionally lost); the documentation leaves a lot to be desired.
SeaweedFS: pretty simple design and supposedly high performance, but the future of the project is unclear because pretty much all the code is written and maintained by Chris Lu; what happens if he no longer writes any code? It is also unclear whether the "Filer" component can run without a single point of failure.
I know that the CAP theorem prevents ever having a truly consistent and always available system. Is there any good distributed filesystem where writes must be durable, but read performance is really good and the system has no single point of failure?
I am Chris Lu, working on SeaweedFS. There are plans to commercialize it (by adding more advanced features).
The filer does not have a single point of failure; you can have multiple filer instances. The filer store can be any key-value store. If you need no SPOF, you can use Cassandra, Redis Cluster, CockroachDB, TiDB, or etcd, or you can add your own key-value store option, which is pretty easy.

Which databases support scalability and availability? [closed]

What database software can do these?
Scalability via data partitioning, such as consistent hashing.
Redundancy for failover. Both a memory cache and disk storage together. Key-value data, where the value is a document type such as JSON.
Prefers A and P in the CAP theorem.
I heard that Memcached can do all of these, but I am not sure.
Here are the details:
Data volume: a JSON document of <30 KB per key, with >100,000,000 keys.
Data is accessed >10K times per second.
Persistence is needed for every key-value pair.
No need for transactions.
The development environment is C#, but other languages are OK if the protocol spec is known.
Map reduce is not needed.
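Back-of-the-envelope arithmetic from the numbers above (a rough sketch, not a sizing recommendation):

```python
keys = 100_000_000           # >100,000,000 keys
doc_size = 30 * 1024         # <30 KB per JSON document
reads_per_sec = 10_000       # >10K accesses per second

total_bytes = keys * doc_size               # ~3 TB of raw data
read_bandwidth = reads_per_sec * doc_size   # ~300 MB/s served

print(total_bytes / 1e12, "TB of raw data")
print(read_bandwidth / 1e6, "MB/s of read traffic")
```

At roughly 3 TB of raw data, a purely in-memory store on a single machine is impractical, which is why the answer below suggests a disk-backed store with a cache in front.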
This is too short a spec to choose a database from. There are tons of other constraints to consider (data storage volume, data transfer volume, persistence requirements, need for transactions, development environment, map reduce, etc.).
That being said:
Memcached and Redis are in-memory databases, which means you cannot store more than your machine's memory can hold. This is less true now that distributed capabilities have been added to Redis.
Document databases (such as MongoDB or Microsoft DocumentDB) support everything listed, and you can add Memcached or Redis in front; that's how most people use them.
I would add that any SQL database can now deal with JSON, so that works too, with a cache in front if needed.
Some links of interest for JSON-oriented databases. But once again, that's too short a spec to choose a database.
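A minimal sketch of the "cache in front of a document database" pattern mentioned above, using redis-py and pymongo (names, TTL and connection details are illustrative):

```python
import json
import redis
from pymongo import MongoClient

cache = redis.Redis()                          # in-memory cache
docs = MongoClient()["appdb"]["documents"]     # persistent document store

def get_document(key: str) -> dict:
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)                         # cache hit
    doc = docs.find_one({"_id": key}) or {}               # cache miss: go to the database
    cache.setex(key, 300, json.dumps(doc, default=str))   # cache for 5 minutes
    return doc
```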

Embedded document database for Node.js [closed]

What is the best embedded NoSQL database for Node.js?
My Node.js application is too tiny to use a big database like MongoDB, which needs extra configuration.
I tried EJDB, but it needs too much disk space (about 1.5 MB for each record).
I also searched the web, but Google gave me so many choices that I got confused.
Here are some requirements:
fast and lightweight
no need for much configuration
able to store a large amount of data (about a hundred thousand records, roughly 1 GB in total)
a good package for Node.js
The most popular embedded Node.js database on GitHub is NeDB
Check this one: https://www.npmjs.com/package/ultradb
* no need for configuration; just point it at a database file
* written for Node in C, using the native Node API (N-API)
* ideal for storing logs and large numbers of documents
* not ideal if you need to query over document properties; you'll have to handle any indexing yourself
* works as part of Node, with no separate processes
* multicore-friendly: scales with Node processes; multiple Node instances/forks can work simultaneously on the same database
* about 300 kB
Maybe Redis, storing documents as hashes?
http://redis.io/commands#hash
It works perfectly on a Raspberry Pi with about 100 hits per minute.
There is also a good ORM for it: http://jugglingdb.co/
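For illustration, storing a document as a Redis hash looks like this with redis-py 3.5+ (the same HSET/HGETALL commands are available from Node clients as well; key and fields are illustrative):

```python
import redis

r = redis.Redis()

# One document per hash key; each document property becomes a hash field.
r.hset("user:42", mapping={"name": "Ada", "email": "ada@example.com"})
print(r.hgetall("user:42"))   # {b'name': b'Ada', b'email': b'ada@example.com'}
```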

Is there a database with git-like qualities? [closed]

I'm looking for a database where multiple users can contribute and commit new data; other users can then pull that data into their own database repository, all in a git-like manner. A transcriptional database, if you like; does such a thing exist?
My current thinking is to dump the database to a single SQL file, but that could well get unwieldy once it is of any size. Another option is to dump the database and use the filesystem, but again that gets unwieldy once it is of any size.
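A minimal sketch of the dump-and-version idea, assuming a Postgres database with pg_dump and git available on the PATH (the database name and file name are illustrative):

```python
import subprocess

# Dump the database to a single SQL file, then record it as a git commit.
subprocess.run(["pg_dump", "--no-owner", "-f", "db_dump.sql", "mydb"], check=True)
subprocess.run(["git", "add", "db_dump.sql"], check=True)
subprocess.run(["git", "commit", "-m", "Snapshot of mydb"], check=True)
```

As noted above, the dump (and therefore each commit) grows with the database, which is why a store with native git-like versioning is attractive.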
There's Irmin: https://github.com/mirage/irmin
Currently it is only offered as an OCaml API, but there are future plans for a GraphQL API and a Cap'n Proto one.
Despite the complex API and the still scarce documentation, it allows you to plug in any backend (in-memory, Unix filesystem, Git in-memory and Git on-disk). It therefore runs even in unikernels and browsers.
It also offers a bidirectional model where changes in the local Git repository are reflected in the application state and vice versa. With the API you can operate at any Git level:
Append-only blob storage.
Transactional/compound tree layer.
Commit layer featuring the chain of changes and metadata.
Branch/ref/tag layer (local-only, but remotes are offered as well) for mutability.
In the documentation, the immutable store usually refers to the blobs + trees + commits.
Thanks to the content-addressable storage inherited from Git, Irmin allows deduplication and thus reduced memory consumption. Some functional persistent data structures fit perfectly in this database, and the 3-way merge is a novel approach to handling merge conflicts in a CRDT style.
Answer from: How can I put a database under version control?
I have been looking for the same feature for Postgres (or SQL databases in general) for a while, but I found no tools that were suitable (simple and intuitive) enough. This is probably due to the binary nature of how the data is stored. Klonio sounds ideal but looks dead. Noms DB looks interesting (and alive). Also take a look at Irmin (OCaml-based, with Git properties).
Though this doesn't answer the question in that it wouldn't work with Postgres, check out the Flur.ee database. It has a "time-travel" feature that allows you to query the data from an arbitrary point in time. I'm guessing it should be able to work with a "branching" model.
This database was recently developed for blockchain purposes. Due to the nature of blockchains, the data needs to be recorded in increments, which is exactly how git works. They are targeting an open-source release in Q2 2019.
Because each Fluree database is a blockchain, it stores the entire history of every transaction performed. This is part of how a blockchain ensures that information is immutable and secure.
It's not SQL, but CouchDB supports replicating the database and pushing/pulling changes between users in a way similar to what you describe.
Some more information in the chapter on replication in the O'Reilly CouchDB book.
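For illustration, pulling another user's changes is a single call to CouchDB's _replicate endpoint; a minimal sketch using the requests library (URLs and database names are illustrative, and authentication is omitted):

```python
import requests

# Pull changes from a remote user's database into the local copy.
requests.post(
    "http://localhost:5984/_replicate",
    json={
        "source": "http://other-user.example.com:5984/shared_db",
        "target": "shared_db",
        "create_target": True,
    },
).raise_for_status()
```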

Is there a lightweight, embeddable, key/value database? (something like diet couchdb) [closed]

I was wondering if there was a lightweight, embeddable, key/value database out there.
Something like a lightweight CouchDB (RESTful, key/value, etc.) where you just send it a key and it responds with the appropriate values.
Thanks!
On the Related Projects page of the CouchDB wiki, under "Alternatives", they mention some similar projects:
FeatherDB*: a CouchDB clone in Java.
StrokeDB*: a CouchDB-like database written in Ruby to make embedding into Ruby apps easier.
MongoDB: a high-performance, open source, schema-free, document-oriented database.
And of course Tokyo Cabinet, which has already been mentioned.
There is also Neo4j, which is a "graph database" for Java.
Of course, part of the power of CouchDB and some of the others is not just being able to store key/value pairs, but the high capacity, the replication and, in particular, the views, which are basically the way of running queries over your documents.
If you just need a simple key/value datastore that you can embed into your program and that doesn't have to hold gigs of data, the venerable GDBM might suit your needs.
It's a little hard to answer without knowing a bit more about your needs (programming language, concurrency requirements, data volumes and such).
* The website does not appear to be working at the time of this writing.
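For example, from Python the standard-library dbm module (which uses GDBM where available) exposes a dict-like embedded key/value store; a minimal sketch with an illustrative file name:

```python
import dbm

# Opens (or creates) a local database file; keys and values are bytes.
with dbm.open("settings.db", "c") as db:
    db[b"theme"] = b"dark"
    print(db[b"theme"])        # b'dark'
    print(b"missing" in db)    # False
```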
Would TinyCDB be suitable?
http://www.corpit.ru/mjt/tinycdb.html
Introduction
TinyCDB is a very fast and simple package for creating and reading constant databases, a data structure introduced by Dan J. Bernstein in his cdb package. It may be used to speed up searches in a sequence of (key, value) pairs with a very big number of records. An example use is indexing a big list of users, where a search would otherwise require a linear read of a large /etc/passwd file, among many other tasks. Its usage/API is similar to the ones found in BerkeleyDB, gdbm and the traditional *nix dbm/ndbm libraries, and it is compatible to a great extent with the cdb-0.75 package by Dan Bernstein.
CDB is a constant database, that is, it cannot be updated at runtime, only rebuilt. Rebuilding is an atomic operation and is very fast, much faster than in many other similar packages. Once created, a CDB may be queried, and a query takes very little time to complete.
A simple, embeddable key/value database? That's pretty much BDB (Berkeley DB).
The OS filesystem is a lightweight key/value database. Keys are filenames and values are data in the files.
The word "embeddable" has an odd meaning if it's to be RESTful, so I don't really understand your requirements; but if all you need is storage and retrieval, why not use the FS?
Check out Perst -- it's licensed GPLv2 and/or proprietary depending on your needs. I've never used it but I hear that it's good. It's an application-embedded key-value store database under active development with ports to a number of popular frameworks and languages.
For what platform? Tokyo Cabinet is a lightweight, embeddable, associative database engine for a variety of scripting environments (Java, Ruby, Perl, Lua, et al.)
