What is the best embedded NoSQL database for Node.js?
My Node.js application is too small to justify a big database like MongoDB, which needs extra configuration.
I tried EJDB, but it needs too much disk space (about 1.5 MB per record).
I also searched the web, but Google gave me so many choices that I was left confused.
Here are some requirements:
fast and lightweight
minimal configuration required
able to store a large amount of data (about a hundred thousand records, roughly 1 GB in total)
preferably with a good package for Node.js
The most popular embedded Node.js database on GitHub is NeDB.
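For illustration, here is a minimal NeDB sketch (the file name and document fields are made up). It keeps everything in a single file and exposes a MongoDB-like query API:

```js
// Assumes `npm install nedb`; 'app.db' and the fields are hypothetical.
const Datastore = require('nedb');

// One file on disk, loaded automatically - no server, no configuration.
const db = new Datastore({ filename: 'app.db', autoload: true });

db.insert({ user: 'alice', visits: 3 }, (err) => {
  if (err) throw err;
  // MongoDB-like query syntax.
  db.find({ visits: { $gte: 1 } }, (err, docs) => {
    if (err) throw err;
    console.log(docs);
  });
});
```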
Check this one: https://www.npmjs.com/package/ultradb
* no configuration needed, just point it at a database file
* written for Node in C, using the native Node API (N-API)
* ideal for storing logs or large numbers of documents
* not ideal if you need to query on document properties - you'll have to handle any indexing yourself
* runs as part of Node, with no separate processes
* friendly to multicore processors - it scales with Node processes, and multiple Node instances/forks can work simultaneously on the same database
* about 300 kB
Maybe Redis, storing the documents as hashes?
http://redis.io/commands#hash
It works perfectly on a Raspberry Pi at about 100 hits per minute.
Also, there is a good ORM for it: http://jugglingdb.co/
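A minimal sketch of the hash approach, assuming the node-redis v4 client (the client choice, key, and fields are all made up):

```js
// Assumes `npm install redis` (v4) and a local Redis on the default port.
const { createClient } = require('redis');

async function main() {
  const client = createClient();
  await client.connect();

  // Store a document as a hash, one field per property.
  await client.hSet('doc:1', { title: 'hello', body: 'world' });

  // Read the whole document back.
  const doc = await client.hGetAll('doc:1');
  console.log(doc); // { title: 'hello', body: 'world' }

  await client.quit();
}

main().catch(console.error);
```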
We have a web service that needs a somewhat POSIX-compatible shared filesystem for the application servers (multiple redundant systems running in parallel behind redundant load balancers). We're currently running GlusterFS as the shared filesystem for the application servers, but I'm not happy with its performance. Compared to the actual raw performance of the storage servers running GlusterFS, it starts to look more sensible to run DRBD and a single NFS server, with all the other GlusterFS servers (currently 3) waiting in a hot-standby role.
Our workload is highly read-oriented and usually deals with small files, and I'd be happy to use an "eventually consistent" system as long as a client can request a sync for a single file when needed (that is, the client is prepared to wait until the file has been successfully stored in the backend storage). I'd even accept a system where such a "sync" requires querying the state of the file in some way other than POSIX fdatasync(). File metadata such as modification times is not important; only the filename and the contents matter.
I'm currently aware of these possible candidates and the problems each one has:
GlusterFS: overall performance is pretty poor in practice, and it degrades further as new servers/bricks are added.
Ceph: highly complex to configure and administer, and as far as I know its POSIX compatibility sacrifices a lot of performance.
MooseFS: partially obfuscated open source (huge dumps of internally written code published infrequently, with the patch history intentionally lost), and the documentation leaves a lot to be desired.
SeaweedFS: pretty simple design and supposedly high performance, but the future of the project is unclear because pretty much all the code is written and maintained by Chris Lu - what happens if he stops writing code? It is also unclear whether the "Filer" component can run without a single point of failure.
I know that the CAP theorem prevents ever having a truly consistent and always-available system. Is there a good distributed filesystem where writes must be durable, read performance is really good, and the system has no single point of failure?
I am Chris Lu, working on SeaweedFS. There are plans to commercialize it (by adding more advanced features).
The filer does not have a single point of failure; you can have multiple filer instances. The filer store can be any key-value store. If you need no SPOF, you can use Cassandra, a Redis cluster, CockroachDB, TiDB, or etcd. Or you can add your own key-value store option, which is pretty easy.
What database software can do all of these?
Scalability via data partitioning, such as consistent hashing.
Redundancy for failover.
Both a memory cache and disk storage, together.
Key-value data, where the value is a document type such as JSON.
Prefers A and P in the sense of the CAP theorem.
I heard that memcached can do all of these, but I am not sure.
Here are the details:
Each value is a JSON document of less than 30 KB; there will be more than 100,000,000 keys.
Data is accessed more than 10K times per second.
Persistence is needed for every key-value pair.
No need for transactions.
The development environment is C#, but other languages are fine if the protocol spec is known.
Map-reduce is not needed.
This is too short a spec to choose a database from. There are tons of other constraints to consider (data storage volume, data transfer volume, persistence requirements, the need for transactions, the development environment, map-reduce, etc.).
That being said:
Memcached or Redis are in-memory databases, which means you cannot store more than what your machine's memory can hold. This is less true now that distributed capabilities have been added to Redis.
Document databases (such as MongoDB or Microsoft DocumentDB) support everything listed. And you can add memcached or Redis in front; that's how most people use them.
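As a sketch of that "cache in front" pattern, here is a minimal read-through example assuming the official mongodb and node-redis v4 packages (the database, collection, key, and TTL are made up):

```js
// Assumes `npm install mongodb redis` and local MongoDB/Redis instances.
const { MongoClient } = require('mongodb');
const { createClient } = require('redis');

async function main() {
  const mongo = new MongoClient('mongodb://localhost:27017');
  await mongo.connect();
  const docs = mongo.db('app').collection('docs'); // hypothetical names

  const cache = createClient();
  await cache.connect();

  // Read-through: try the cache first, fall back to the document store.
  async function getDoc(key) {
    const cached = await cache.get(key);
    if (cached) return JSON.parse(cached);
    const doc = await docs.findOne({ _id: key });
    if (doc) await cache.set(key, JSON.stringify(doc), { EX: 300 }); // 5-minute TTL
    return doc;
  }

  console.log(await getDoc('user:42'));
  await cache.quit();
  await mongo.close();
}

main().catch(console.error);
```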
I would like to add that any SQL database can now deal with JSON, so that works too - with a cache in front if needed.
There are some links of interest for JSON-oriented databases as well. But once again: that's too short a spec to choose a database.
We have a service that currently runs on top of a MySQL database and uses JBoss to run the application. The database's growth is accelerating, and I am looking to change the setup to improve scaling. The issue is not a large number of rows, nor (yet) a particularly high volume of queries, but rather the large number of BLOBs stored in the db. In particular, the time it takes to create or restore a backup (we use mysqldump and Percona XtraBackup) is a concern, as is the fact that we will need to scale horizontally to keep expanding disk space in the future. At the moment the db size is around 500 GB.
The kind of arrangement that I figure would work well for our future needs is a hybrid database that uses both MySQL and some key-value database. The latter would only store the BLOBs. The metadata, as well as the data for user management and the business logic of the application, would remain in the MySQL db and benefit from structured tables and full consistency. The application itself would handle consistency between the two databases.
The question is which database to use. There are lots of NoSQL databases to choose from. Here are the qualities I am looking for:
Distributed over multiple nodes, which can be added or removed flexibly.
Redundant storage, with the database automatically making sure each value object is stored on at least two different nodes.
Value objects' sizes could range from a few dozen bytes to around 100 MB.
The database is accessed from a Java EJB application on top of JBoss, as well as from a program written in C++ that processes the data in the db. Some sort of connector would be needed for each.
No structure is needed for the data: a single string or even just a large integer would suffice for the key, and a pure byte array for the value.
No updates to the value objects are needed, only inserts and deletes. If a particular object is made obsolete by a new object that fulfills the same role, the old object is deleted and a new object with a new key is inserted (a small sketch of this pattern follows below).
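A minimal sketch of that insert/delete-only pattern, written against a hypothetical key-value client kv exposing just put and del (any store with those two operations would fit; the content-hash keying is likewise just an assumption):

```js
// The `kv` client and its put/del methods are hypothetical placeholders.
const crypto = require('crypto');

async function replaceBlob(kv, oldKey, bytes) {
  // New content always gets a new key; a content hash keeps keys unique.
  const newKey = crypto.createHash('sha256').update(bytes).digest('hex');
  await kv.put(newKey, bytes);      // insert the replacement first...
  if (oldKey) await kv.del(oldKey); // ...then drop the obsolete object
  return newKey;
}
```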
Having looked around a bit, Riak sounds good except for its problems with storing large value objects. Which database would you recommend?
I am currently using Redis for my app, and its features are really excellent for my application (lists, sets, sorted sets, etc.).
My application relies heavily on sorted sets, lists, and sets, and their related functions (push to a list, get a list, union of sets, etc.). The only problem I am facing right now is that my data is large, most of it does not need to be in memory, and I want to store it on disk.
**I need an on-disk database with Redis data structures.**
I read about Cassandra, but I am not sure whether it supports sorted sets, sets, and lists; if it does, I could not find methods to manipulate them the way Redis does.
Thanks.
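For reference, the operations in question look like this with the node-redis v4 client (the client choice and all key names are assumptions; any Redis client exposes the same commands):

```js
// Assumes `npm install redis` (v4) and a local Redis instance.
const { createClient } = require('redis');

async function main() {
  const client = createClient();
  await client.connect();

  // Lists: push and read back.
  await client.lPush('events', 'a');
  console.log(await client.lRange('events', 0, -1));

  // Sets: add members, take a union.
  await client.sAdd('s1', ['a', 'b']);
  await client.sAdd('s2', ['b', 'c']);
  console.log(await client.sUnion(['s1', 's2'])); // ['a','b','c'], unordered

  // Sorted sets: scored members, ranked reads.
  await client.zAdd('scores', [{ score: 10, value: 'alice' }]);
  console.log(await client.zRange('scores', 0, -1));

  await client.quit();
}

main().catch(console.error);
```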
https://github.com/yinqiwen/ardb
Another Redis-protocol replacement, with LMDB, RocksDB, and LevelDB disk-based backends, and nice benchmarks.
There are numerous on-disk databases with Redis-like data structures, some even trying to be drop-in protocol-compatible replacements for Redis.
There are excellent recommendations in "Is there something like Redis DB, but not limited with RAM size?" - pity the community considers such questions to be off-topic.
In particular, SSDB is an actively-maintained Redis-like on-disk database (but not directly compatible), and Ardb is an actively-maintained drop-in replacement for Redis that stores the data on disk. Disclaimer: I have not used either of them (yet).
Try Edis, an Erlang implementation of Redis built on LevelDB: http://inaka.github.io/edis/
I encourage you to learn Cassandra; while it has some features similar to key-value stores and sets, it is very different from Redis.
We are currently moving one project from Redis (we use sadd/spop) to TokyoCabinet/KyotoCabinet via the Memcached protocol. For the moment things look good, and very soon I will publish the lib on GitHub - it will be available here:
https://github.com/nmmmnu
The project will be called Simple Message Queue. It will support sadd/spop/sismember only. Also, in Python you will be able to use the new object in place of the Redis object, but only for these three commands.
Hope this helps.
Update 2014.07:
Here is the project I was speaking about:
https://github.com/nmmmnu/MessageQueue
It implements both the Redis and Memcached protocols. For the back end it uses in-memory ndb/mdb or Berkeley DB.
I have over 10 million small static HTML files (10 KB each), and I am a user of bluehost.com. It has a limit of 50,000 files, and it sent an email warning me that if I didn't delete the files within 30 days, it would disable my account.
So I am looking for a free service to host my files. I considered Google App Engine, but it has an even stricter limitation: no more than 1,000 files (each no larger than 1 MB). It also seems that I could upload the files to code.google.com, which provides a free project hosting service.
Any good suggestions? I prefer a free one or a cheap one, and it should have a programming interface to upload and download files. Thank you in advance.
I would consider converting all the files into a database, coding a small server-side script to retrieve the data, and then using some rewrite rules to redirect visitors to the script.
Most web hosts nowadays offer some sort of server-side language and a database of some sort. Many also allow you to use .htaccess files for your rewrite rules.
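One way to sketch that "files in a database" idea, assuming Node.js with the express and better-sqlite3 packages (the pages.db file and pages table are hypothetical; any server-side language and database would do):

```js
// Assumes `npm install express better-sqlite3` and a pre-built pages.db
// with a table: pages(path TEXT PRIMARY KEY, html TEXT).
const express = require('express');
const Database = require('better-sqlite3');

const db = new Database('pages.db', { readonly: true });
const getPage = db.prepare('SELECT html FROM pages WHERE path = ?');

const app = express();
app.get('*', (req, res) => {
  const row = getPage.get(req.path); // look the page up by its URL path
  if (!row) return res.status(404).send('Not found');
  res.type('html').send(row.html);
});

app.listen(8080);
```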
10 million files made by hand? If they were made by a program, try moving that program into a dynamic web language like PHP.
Use zip and a frontend that unpacks the files (if needed).
10 million files means generated content. Don't store it - just create it on demand.
[edit]
Then you don't need to store the pages at all. There are data structures that let you reconstruct the original page while remaining quickly searchable and using little extra storage space.
You can certainly do this on a Linode (cheapest plan $20/month). But I think that might be overkill. Amazon S3 charges by the gigabyte, so you wouldn't pay very much for that. As far as straight web hosting goes, you've got me - I don't know of a provider that will let you do that to their poor directories.
Do you need access to all the files all the time?
You could archive them into zip files, as text tends to compress quite well; maybe that would give you the required space savings.