Persisting User Preferences - Relational or Document-Oriented Database

I am looking into persisting user preferences past session expiration for an application, and I am curious whether, based on people's previous experience, a relational database (e.g. Oracle, MySQL) or a document-oriented database (e.g. MongoDB, Redis) is better suited to this task. To clarify what I mean by user preferences: my web application would store fairly detailed information on a per-user basis, including but not limited to window size and position, grid column width and order, and various widget states (collapsed/expanded panels). All persistence in my application is currently handled by a relational database, but I have a feeling that user preferences may lend themselves better to a document-oriented database, because this data may be hard to represent in a strictly structured way and a semi-structured approach may be better.

If you are already using a relational database for your application, it makes little sense to move just the user preferences into a document-oriented database; it would only increase complexity. If you are starting a new app, it's worth considering.
For an existing application, you may consider semi-structured storage within your existing database, such as PostgreSQL's hstore.
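A minimal sketch of that semi-structured approach, assuming SQLite with a JSON text column as a stand-in for hstore. The table name, column names, and sample preferences are hypothetical and not taken from the question:

```python
import json
import sqlite3

# Hypothetical schema: one row per user, preferences stored as a JSON text blob.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_preferences ("
    " user_id INTEGER PRIMARY KEY,"
    " prefs TEXT NOT NULL)"
)


def save_prefs(conn, user_id, prefs):
    """Insert or overwrite the whole preference document for one user."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO user_preferences (user_id, prefs) VALUES (?, ?)",
            (user_id, json.dumps(prefs)),
        )


def load_prefs(conn, user_id):
    """Return the stored preferences, or an empty dict for an unknown user."""
    row = conn.execute(
        "SELECT prefs FROM user_preferences WHERE user_id = ?", (user_id,)
    ).fetchone()
    return json.loads(row[0]) if row else {}


save_prefs(conn, 42, {
    "window": {"width": 1280, "height": 720, "x": 40, "y": 60},
    "grid": {"column_order": ["name", "date"], "column_widths": {"name": 200}},
    "panels": {"sidebar": "collapsed"},
})
```

The rest of the schema stays relational; only the loosely structured part lives in the blob, so no second database is needed.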

The question being asked is about suitability, not the practicality of installing a new database.
Which database is better suited to non-relational data like user preferences?
Clearly the answer is a non-relational database. Document-oriented NoSQL databases are well suited to storing this kind of data.
The OP mentioned widget preferences and the like, which are most likely JSON documents/objects. This is another reason why MongoDB or another JSON document-oriented database is more suitable.
There is also a fear of "installing a new database", which comes from painful experience with older relational databases, pain that these NoSQL databases should not cause. But all of this is beside the "suitability" question. Many factors besides the extra dependency go into the "practicality" decision.

Related

Relational or NoSQL database for application

I will be working on an application that manages intracompany documentation. It should work like this: a user can upload a document with mandatory fields such as a description, the date from which the document will be accessible to "readers", the group of people who should approve the document, and the group of people who should read the document after successful approval. There is also a history of documents, the decision each user made (agreed or disagreed), and some basic management of users, groups, documents, roles, etc. The application is for a single company and should run on its local network.
Should I use a relational database with an ORM or a NoSQL database, and why? What would be the benefits of a relational or a NoSQL database for the application described above?
Thanks

If you have a strictly defined schema (seems to be the case) and predictable traffic (which is also very likely in a corporate environment) and want ACID transactions and data recovery guarantees which have been tested and polished for many years (you surely do) then RDBMS is your choice. It doesn't matter what is used on the application side, ORM, plain JDBC or whatever.
One slippery point might be the document storage, however, provided that documents are not huge, relational databases (e.g. PostgreSQL) will do the job just fine.
This assumes that you do not expect hundreds of thousands of requests per second and thus don't need any sharding. Even if you do expect such a load, an RDBMS may be okay.

How to choose between relational database and non relational database?

I have to build a website that stores a lot of data. I searched the Internet to decide whether to use a relational or a non-relational database, but I can't find a good answer. Some websites say that if you have a lot of data you should choose a non-relational database, but I don't think this is a good strategy. Facebook, for example, uses a relational database (MySQL) even though a huge amount of data is stored in Facebook's database. Other websites say that if your data can be organized in tables, you can choose a relational database. However, according to those websites, non-relational databases perform better than relational ones. My data can be organized in tables, but I don't want to lose performance.
I need to store a huge amount of data and access it as fast as possible. So how can I decide between a relational and a non-relational database?

It's about more than performance. Google can find lots of sources to help you choose, like this one:
http://www.informationweek.com/big-data/big-data-analytics/nosql-newsql-or-rdbms-how-to-choose/a/d-id/1297861
I would be surprised if you could measure a performance difference in your app and lay it at the feet of the database. Relational databases are quite performant. I doubt that your requirements are that special. Your code, latency, or other factors are likely to be bigger problems.
I would isolate the database in your code behind an interface and prototype both using real data and use cases. No one can answer here but you; better to do it with data than to guess.
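One way to read "isolate the database behind an interface" is sketched below. The `RecordStore` interface and both backends are hypothetical: SQLite stands in for the relational side, and a plain dictionary stands in for a non-relational store. The point is only that the use case runs unchanged against either one:

```python
import sqlite3
from typing import Dict, Optional, Protocol


class RecordStore(Protocol):
    """Hypothetical interface; application code depends only on this."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class SqliteStore:
    """Relational backend: one table with a primary-key lookup."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE records (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key: str, value: str) -> None:
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
                (key, value),
            )

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM records WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


class DictStore:
    """Stand-in for a non-relational backend (a plain in-memory mapping)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def run_use_case(store: RecordStore) -> Optional[str]:
    """The same workload runs unchanged against either backend."""
    store.put("user:1", "alice")
    store.put("user:1", "alice-updated")
    return store.get("user:1")
```

With real backends behind the same interface, the prototype can time `run_use_case` on real data for each candidate instead of guessing.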

Understand nosql database design

I am interested in how NoSQL databases are designed. Specifically, the features I really like are support for range queries, automatic backup, partitioning (support for horizontal scaling), etc. Is there a book or online document to read if one is interested in building a NoSQL database? I know it is important to decide which use cases the database has to support, but at the initial stage, what are the basic structures one has to be aware of?

Schema-free/flexible ACID database for a SaaS application?

I am looking at rewriting a VB based on-premise (locally installed) application (invoicing+inventory) as a web based Clojure application for small enterprise customers. I am intending this to be offered as a SaaS application for customers in similar trade.
I was looking at database options, and my first choice was an RDBMS: PostgreSQL or MySQL. I might scale up to 400 users in the first year, with typically 20-40 page views per day per user, mostly for transactions rather than static views. Each view will involve fetching and updating data. ACID compliance is necessary (or so I think). So the transaction volume is not huge.
It would have been a no-brainer to pick either of these based on my preference, but for one requirement, which I believe is typical of a SaaS app: the schema will keep changing as I add more customers/users and as each customer's business requirements change (I will be offering only limited flexibility to start with). As I am not a DB expert, based on what I can think of and have read, I can handle that in a number of ways:
1. Have a traditional RDBMS schema design in MySQL/PostgreSQL with a single DB hosting multiple tenants, and add enough "free-floating" columns to each table to allow for future changes as I add more customers or make changes for an existing customer. The downside might be having to propagate changes to the DB every time a small change is made to the schema. I remember reading that in PostgreSQL schema updates can be done in real time without locking, but I am not sure how painful or how practical that is in this use case, especially as schema changes might also require new or minor SQL changes.
2. Have an RDBMS, but design the database schema in a flexible manner: close to entity-attribute-value, or just as a key-value store (Workday and FriendFeed, for example).
3. Have the entire thing in memory as objects and store them in log files periodically (e.g., edval, lmax).
4. Go for a NoSQL DB like MongoDB or Redis. But based on what I can gather, they are not suitable for this use case and not fully ACID compliant.
5. Go for a NewSQL DB like VoltDB or JustoneDb (cloud based), which retain SQL and ACID-compliant behaviour and are "new-gen" RDBMSs.
6. I looked at Neo4j (a graph DB), but I am not sure it would fit this use case.
In my use case, more than scalability or distributed computing, I am looking for a better way to achieve "flexibility in schema + ACID + some reasonable performance". Most of the articles I could find on the net discuss schema flexibility as something that leads to performance (in the case of NoSQL DBs) and scalability, while leaving out the ACID/transactions side.
Is this an "either/or" case of schema flexibility vs. ACID transactions, or is there a better way out?
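For illustration, the entity-attribute-value option mentioned above can keep ACID behaviour, because the flexible attributes are ordinary rows written in the same transaction as the fixed columns. This is only a sketch under assumptions: SQLite stands in for PostgreSQL/MySQL, and the `invoice` and `invoice_attr` tables and their columns are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (
    id        INTEGER PRIMARY KEY,
    tenant_id INTEGER NOT NULL,
    total     REAL    NOT NULL
);
-- Hypothetical side table holding tenant-specific attributes as rows.
CREATE TABLE invoice_attr (
    invoice_id INTEGER NOT NULL REFERENCES invoice(id),
    name       TEXT    NOT NULL,
    value      TEXT,
    PRIMARY KEY (invoice_id, name)
);
""")


def create_invoice(conn, tenant_id, total, extra):
    """Insert the fixed columns and the flexible attributes atomically."""
    with conn:  # one transaction: all rows commit, or none do
        cur = conn.execute(
            "INSERT INTO invoice (tenant_id, total) VALUES (?, ?)",
            (tenant_id, total),
        )
        invoice_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO invoice_attr (invoice_id, name, value) VALUES (?, ?, ?)",
            [(invoice_id, name, value) for name, value in extra.items()],
        )
    return invoice_id


def invoice_attrs(conn, invoice_id):
    """Return the flexible attributes of one invoice as a dict."""
    rows = conn.execute(
        "SELECT name, value FROM invoice_attr WHERE invoice_id = ?",
        (invoice_id,),
    )
    return dict(rows.fetchall())


inv = create_invoice(
    conn, tenant_id=1, total=99.5,
    extra={"po_number": "A-17", "region": "north"},
)
```

A new customer-specific field then needs no schema change, at the cost of weaker typing and more awkward queries over the attribute rows.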

I think Tarantool can help you. It has transactions, Lua, MsgPack, etc. Also see that video.

What is couchdb, for what and how should I use it?

I hear a lot about couchdb, but after reading some documents about it, I still don't get why to use it and how.
Could you clarify this mystery for me?

It's a non-relational database, open-source, distributed (incremental, bidirectional replication), schema-free. A CouchDB database is a collection of documents; each document is a bunch of string "keys" and corresponding "values" (which can be numbers, strings, lists, dates, ...). You can have indices, queries, views.
If a relational DB feels confining to you (you find schemas too rigid, can't spread the DB engine work around a very large number of servers, etc.), CouchDB is worth considering (it's one of the most interesting of the many non-relational DBs that are emerging these days).
But if all of your work happily fits in a relational database, that's what you probably want to continue using for production work (even though "playing around" with some non-relational DB is still well worth your time, just for personal growth and edification, that's quite different from transferring huge production systems over from a relational DB!-).

It sounds like you should be reading Why CouchDB

To quote from wikipedia
It is not a relational database management system. Instead of storing data in rows and columns, the database manages a collection of JSON documents. The documents in a collection need not share a schema, but retain query abilities via views.
CouchDB provides a different model for data storage than a traditional relational database in that it does not represent data as rows within tables, instead it stores data as "documents" in JSON format.
This difference in data storage model is what differentiates CouchDB from products like MySQL and SQL Server.
In terms of programmatic access, CouchDB exposes a REST API, which you can access by sending HTTP requests from your code.
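As a rough sketch of what those HTTP requests look like, the snippet below builds a PUT and a GET for a single document using only the Python standard library. It assumes a CouchDB server on its default port 5984; the database name `user_prefs` and the document contents are made up for the example:

```python
import json
import urllib.request

# Assumptions: a local CouchDB on its default port; the database name is made up.
BASE_URL = "http://localhost:5984"
DB = "user_prefs"


def build_put_document(doc_id, doc):
    """PUT /{db}/{doc_id} with a JSON body creates a document.

    Updating an existing document also requires its current `_rev` in the body.
    """
    return urllib.request.Request(
        f"{BASE_URL}/{DB}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_get_document(doc_id):
    """GET /{db}/{doc_id} returns the stored JSON document."""
    return urllib.request.Request(f"{BASE_URL}/{DB}/{doc_id}", method="GET")


def send(request):
    """Send a request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


put_req = build_put_document("user-42", {"panels": {"sidebar": "collapsed"}})
get_req = build_get_document("user-42")
```

Nothing is sent until `send(put_req)` is called, so the request objects can be inspected without a server running.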
I hope this has been somewhat helpful, though I acknowledge it may not be, given my minimal familiarity with the product.

I'm far from an expert (all I've done is play around with it some...), but here's how I'm thinking of using it:
Usually when I'm designing an app I've got a bunch of app servers behind a load balancer. Often times, I've got sticky sessions so that each user will go back to the same app server during that session. What I'm thinking of doing is have a couchdb instance tied to each app server.
That way you can use that local couchdb to access user preferences, product data...whatever data you've got that doesn't have to be perfectly up to date.
So... now you've got data on these local CouchDBs. CouchDB allows replication, so at a fixed interval (every X seconds?), merge the data back into its peers to keep them up to date.
As a whole, you shouldn't have to worry about conflicts, because each app server has its own CouchDB and users are attached to an app server, and you've got eventual consistency because you've got replication.
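A sketch of how that periodic merge could be triggered, assuming CouchDB's `_replicate` endpoint (a POST with a JSON body naming a source and a target database). The host names and database name are hypothetical, and the code only builds the requests; a scheduler would send them every X seconds:

```python
import json
import urllib.request
from itertools import permutations

# Hypothetical peers: one CouchDB instance per app server.
PEERS = ["http://app1:5984", "http://app2:5984", "http://app3:5984"]
DB = "user_prefs"


def replication_requests(peers, db):
    """Build one POST /_replicate request per ordered (source, target) pair."""
    requests = []
    for source, target in permutations(peers, 2):
        body = {"source": f"{source}/{db}", "target": f"{target}/{db}"}
        requests.append(
            urllib.request.Request(
                f"{source}/_replicate",
                data=json.dumps(body).encode("utf-8"),
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return requests


jobs = replication_requests(PEERS, DB)
```

Each instance pushes its changes to every other peer, which is where the eventual consistency described above comes from.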
Does that answer your question?

A good example is when you have to deal with people data in a website or application. If you set out to design the data so that each individual's information is kept separate, that makes a good case for CouchDB, which stores data in documents rather than relational tables. In a production deployment, my users may end up adding ad hoc data for about 10% of the people and some other odd details for another selected 5%. In a relational context, this could add up to loads of redundancy, but not in CouchDB.
And it's not just about the fact that CouchDB is non-relational: if you're too focused on that, you're missing the point. CouchDB is plugged into the web: all you need to get started is HTTP for creating data and making queries (GET/PUT/POST/DELETE...), and it's RESTful, plus it's portable and great for peer-to-peer sharing. It can also serve web applications, termed 'CouchApps', where CouchDB holds the images, CSS, and markup as data stored under special documents called design documents.
Check out this collection of videos introducing non-relational databases, the one on CouchDB should give you a better idea.