what means the use of a Triplestore - database

the use of a triplestore means that we are going to use a database that have a table with 3 coluns and 7 indexes?
I mean using a triple store always is relationed with that relational model?

From http://en.wikipedia.org/wiki/Triplestore:
A triplestore is a purpose-built database for the storage and retrieval of Resource Description Framework (RDF) metadata.
It looks like that the high-performance triplestores use a non-relational model:
Some triplestores have been built as database engines from scratch, while others have been built on top of existing commercial relational database engines (i.e. SQL-based).4 Like the early development of OLAP databases, this intermediate approach allowed large and powerful database engines to be constructed for little programming effort in the initial phases of triplestore development. Long-term though it seems likely that native triplestores will have the advantage for performance. A difficulty with implementing triplestores over SQL is that although "triples" may thus be "stored", implementing efficient querying of a graph-based RDF model (i.e. mapping from SPARQL) onto SQL queries is difficult.5

Not necessarily. There are triplestores that rely on other RDBMS systems as backends. Examples of this case are: Jena/SDB, 3store or Virtuoso.
Others implement their own native persistent model customized to respond better to the RDF data model, like 4store, Jena/TDB or BigData. These tend to scale better.

Conceptually, yes - a triple is a binary relation with a subject, predicate, and object (e.g. <JohnSmith--marriedTo->JillSmith>.
Higher arity relations are not possible in a triple store as they are in a normal RDBMS (though you can fake them via the use of multiple triples).
The implementation varies though, as previous answers state.
Most triple stores actually store quads, so they can group triples into subsets ("Named Graphs" in RDF-speak).
The indexes are of course optional, but usually present in some form - again often modified to accommodate quads.

Related

Difference between RDBMS and ORDBMS

It happened to me when I was reading about PostgreSQL on its wiki page where it refers to itself as an ORDBMS. I always knew about Microsoft SQL Server which is an RDBM system. Can someone help me understand the main differences between Relational Database Management System(RDBMS) and Object Relational Database Management System (ORDBMS) and in what scenarios should I use one of them?
Also, my key question is related to the fact that in the world of Microsoft SQL Server we often use a layer of Entity Framework (EF) to do the object relational mapping on the application side. So, in ORDBMS world are all the responsibilities of an ORM already fulfilled by the database itself in entirety or there could be use cases or scenarios where I would end up using an ORM like Entity Framework on top of ORDBMS as well? Do people really even use ORMs on top of an ORDBMS system?
Taken from http://www.aspfree.com/c/a/database/introduction-to-rdbms-oodbms-and-ordbms/:
RDBMS
The main elements of RDBMS are based on Ted Codd’s 13 rules for a relational system, the concept of relational integrity, and normalization. The three fundamentals of a relational database are that all information must be held in the form of a table, where all data are described using data values. The second fundamental is that each value found in the table columns does not repeat. The final fundamental is the use of Standard Query Language (SQL).
Benefits of RDBMS are that the system is simple, flexible, and productive. Because the tables are simple, data is easier to understand and communicate with others. RDBMS are flexible because users do not have to use predefined keys to input information. Also, RDBMS are more productive because SQL is easier to learn. This allows users to spend more time inputting instead of learning. More importantly, RDBMS’s biggest advantage is the ease with which users can create and access data and extend it if needed. After the original database is created, new data categories can be added without the existing application being changed.
There are limitations to the relational database management system. First, relational databases do not have enough storage area to handle data such as images, digital and audio/video. The system was originally created to handle the integration of media, traditional fielded data, and templates. Another limitation of the relational database is its inadequacy to operate with languages outside of SQL. After its original development, languages such as C++ and JavaScript were formed. However, relational databases do not work efficiently with these languages. A third limitation is the requirement that information must be in tables where relationships between entities are defined by values.
ORDMS
Object-Relational database (ORDBMS) is the third type of database common today. ORDBMS are systems that “attempt to extend relational database systems with the functionality necessary to support a broader class of applications and, in many ways, provide a bridge between the relational and object-oriented paradigms.”
ORDBMS was created to handle new types of data such as audio, video, and image files that relational databases were not equipped to handle. In addition, its development was the result of increased usage of object-oriented programming languages, and a large mismatch between these and the DBMS software.
One advantage of ORDBMS is that it allows organizations to continue using their existing systems, without having to make major changes. A second advantage is that it allows users and programmers to start using object-oriented systems in parallel.
There are challenges in implementing an ORDBMS. The first is storage and access methods. The second is query processing, and the third is query optimization.
Most database players don't support it, or don't support it exclusively. It is complex, and not used broadly. Even if "data" is OO in nature, the databases existed decades ago, and they cannot take ORDBMS (or OODBMS). Learning curve also imposes problems.
ORDBMS/OODBMS are like virtual registry view you see in Registry Editor. Contents are tree-styled objects. But internally they might be stored as flat/hierarchical or in relational manner. You really don't care - the APIs provide you the view of registry information.
Similarly, even if major players don't support (and won't support) OO nature of database, they may provide some extensions. Or, you may have to craft your own framework for OO data. A movie database, having actors and directors can be represented using relations (tables). Actors, directors, shooting-locations would also be classes/objects, and can easily be represented using tables, and referential integrity imposed by database/DB designer.
You, as a developer would make this relational nature of data to an object-oriented form having Movie as a class, referencing actors/directors (1:1 or 1:N). I am not aware how/what EE facilitates this, but it would be doing mapping this way only.
Object-Relational Databases
Object oriented technology on top of relational technology and in the relational context.
Objects are stored in tables of objects rather than in tables of rows.
Support of major object oriented features: complex types, inheritance, aggregation, methods
Advantage: Extension of a well-known technology
Disadvantages: Mixture of both technologies may result in difficult to understand schemas
Has Performance problems
Object-relational systems include features such as complex object extensibility, encapsulation, inheritance, and better interfaces to OO languages.
ORDBMSs allow developers to embed new classes of data objects into the relational data model abstraction (and on top of SQL).
Following diagram shows how data can be accessed.

Database alternatives?

I was wondering the trade-offs for using databases and what the other options were? Also, what problems are not well suited for databases?
I'm concerned with Relational Databases.
The concept of database is very broad. I will make some simplifications in what I present here.
For some tasks, the most common database is the relational database. It's a database based on the relational model. The relational model assumes that you describe your data in rows, belonging to tables where each table has a given and fixed number of columns. You submit data on a "per row" basis, meaning that you have to provide a row in a single shot containing the data relative to all columns of your table. Every submitted row normally gets an identifier which is unique at the table level, sometimes at the database level. You can create relationships between entities in the relational database, for example by saying that a given cell in your table must refer to another table's row, so to preserve the so called "referential integrity".
This model works fine, but it's not the only one out there. In some cases, data are better organized as a tree. The filesystem is a hierarchical database. starts at a root, and everything goes under this root, in a tree like structure. Another model is the key/value pair. Sleepycat BDB is basically a store of key/value entities.
LDAP is another database which has two advantages: stores rather generic data, it's distributed by design, and it's optimized for reading.
Graph databases and triplestores allow you to store a graph and perform isomorphism search. This is typically needed if you have a very generic dataset that can encompass a broad level of description of your entities, so broad that is basically unknown. This is in clear opposition to the relational model, where you create your tables with a very precise set of columns, and you know what each column is going to contain.
Some relational column-based databases exist as well. Instead of submitting data by row, you submit them by whole column.
So, to answer your question: a database is a method to store data. Technically, even a text file is a database, although not a particularly nice one. The choice of the model behind your database is mostly relative to what is the typical needs of your application.
Setting the answer as CW as I am probably saying something strictly not correct. Feel free to edit.
This is a rather broad question, but databases are well suited for managing relational data. Alternatives would almost always imply to design your own data storage and retrieval engine, which for most standard/small applications is not worth the effort.
A typical scenario that is not well suited for a database is the storage of large amounts of data which are organized as a relatively small amount of logical files, in this case a simple filesystem-like system can be enough.
Don't forget to take a look at NOSQL databases. It's pretty new technology and well suited for stuff that doesn't fit/scale in a relational database.
Use a database if you have data to store and query.
Technically, most things are suited for databases. Computers are made to process data and databases are made to store them.
The only thing to consider is cost. Cost of deployment, cost of maintenance, time investment, but it will usually be worth it.
If you only need to store very simple data, flat files would be an alternative (text files).
Note: you used the generic term 'database', but there are many many different types and implementations of these.
For search applications, full-text search engines (some of which are integrated to traditional DBMSes, but some of which are not), can be a good alternative, allowing both more features (various linguistic awareness, ability to have semi-structured data, ranking...) as well as better performance.
Also, I've seen applications where configuration data is stored in the database, and while this makes sense in some cases, using plain text files (or YAML, XML and such) and loading the underlying objects during initialization, may be preferable, due to the self-contained nature of such alternative, and to the ease of modifying and replicating such files.
A flat log file, can be a good alternative to logging to DBMS, depending on usage of course.
This said, in the last 10 years or so, the DBMS Systems, in general, have added many features, to help them handle different forms of data and different search capabilities (ex: FullText search a fore mentioned, XML, Smart storage/handling of BLOBs, powerful user-defined functions, etc.) which render them more versatile, and hence a fairly ubiquitous service. Their strength remain mainly with relational data however.

What is NoSQL, how does it work, and what benefits does it provide? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I've been hearing things about NoSQL and that it may eventually become the replacement for SQL DB storage methods due to the fact that DB interaction is often a bottle neck for speed on the web.
So I just have a few questions:
What exactly is it?
How does it work?
Why would it be better than using a SQL Database? And how much better is it?
Is the technology too new to start implementing yet or is it worth taking a look into?
There is no such thing as NoSQL!
NoSQL is a buzzword.
For decades, when people were talking about databases, they meant relational databases. And when people were talking about relational databases, they meant those you control with Edgar F. Codd's Structured Query Language. Storing data in some other way? Madness! Anything else is just flatfiles.
But in the past few years, people started to question this dogma. People wondered if tables with rows and columns are really the only way to represent data. People started thinking and coding, and came up with many new concepts how data could be organized. And they started to create new database systems designed for these new ways of working with data.
The philosophies of all these databases were different. But one thing all these databases had in common, was that the Structured Query Language was no longer a good fit for using them. So each database replaced SQL with their own query languages. And so the term NoSQL was born, as a label for all database technologies which defy the classic relational database model.
So what do NoSQL databases have in common?
Actually, not much.
You often hear phrases like:
NoSQL is scalable!
NoSQL is for BigData!
NoSQL violates ACID!
NoSQL is a glorified key/value store!
Is that true? Well, some of these statements might be true for some databases commonly called NoSQL, but every single one is also false for at least one other. Actually, the only thing NoSQL databases have in common, is that they are databases which do not use SQL. That's it. The only thing that defines them is what sets them apart from each other.
So what sets NoSQL databases apart?
So we made clear that all those databases commonly referred to as NoSQL are too different to evaluate them together. Each of them needs to be evaluated separately to decide if they are a good fit to solve a specific problem. But where do we begin? Thankfully, NoSQL databases can be grouped into certain categories, which are suitable for different use-cases:
Document-oriented
Examples: MongoDB, CouchDB
Strengths: Heterogenous data, working object-oriented, agile development
Their advantage is that they do not require a consistent data structure. They are useful when your requirements and thus your database layout changes constantly, or when you are dealing with datasets which belong together but still look very differently. When you have a lot of tables with two columns called "key" and "value", then these might be worth looking into.
Graph databases
Examples: Neo4j, GiraffeDB.
Strengths: Data Mining
While most NoSQL databases abandon the concept of managing data relations, these databases embrace it even more than those so-called relational databases.
Their focus is at defining data by its relation to other data. When you have a lot of tables with primary keys which are the primary keys of two other tables (and maybe some data describing the relation between them), then these might be something for you.
Key-Value Stores
Examples: Redis, Cassandra, MemcacheDB
Strengths: Fast lookup of values by known keys
They are very simplistic, but that makes them fast and easy to use. When you have no need for stored procedures, constraints, triggers and all those advanced database features and you just want fast storage and retrieval of your data, then those are for you.
Unfortunately they assume that you know exactly what you are looking for. You need the profile of User157641? No problem, will only take microseconds. But what when you want the names of all users who are aged between 16 and 24, have "waffles" as their favorite food and logged in in the last 24 hours? Tough luck. When you don't have a definite and unique key for a specific result, you can't get it out of your K-V store that easily.
Is SQL obsolete?
Some NoSQL proponents claim that their favorite NoSQL database is the new way of doing things, and SQL is a thing of the past.
Are they right?
No, of course they aren't. While there are problems SQL isn't suitable for, it still got its strengths. Lots of data models are simply best represented as a collection of tables which reference each other. Especially because most database programmers were trained for decades to think of data in a relational way, and trying to press this mindset onto a new technology which wasn't made for it rarely ends well.
NoSQL databases aren't a replacement for SQL - they are an alternative.
Most software ecosystems around the different NoSQL databases aren't as mature yet. While there are advances, you still haven't got supplemental tools which are as mature and powerful as those available for popular SQL databases.
Also, there is much more know-how for SQL around. Generations of computer scientists have spent decades of their careers into research focusing on relational databases, and it shows: The literature written about SQL databases and relational data modelling, both practical and theoretical, could fill multiple libraries full of books. How to build a relational database for your data is a topic so well-researched it's hard to find a corner case where there isn't a generally accepted by-the-book best practice.
Most NoSQL databases, on the other hand, are still in their infancy. We are still figuring out the best way to use them.
What exactly is it?
On one hand, a specific system, but it has also become a generic word for a variety of new data storage backends that do not follow the relational DB model.
How does it work?
Each of the systems labelled with the generic name works differently, but the basic idea is to offer better scalability and performance by using DB models that don't support all the functionality of a generic RDBMS, but still enough functionality to be useful. In a way it's like MySQL, which at one time lacked support for transactions but, exactly because of that, managed to outperform other DB systems. If you could write your app in a way that didn't require transactions, it was great.
Why would it be better than using a SQL Database? And how much better is it?
It would be better when your site needs to scale so massively that the best RDBMS running on the best hardware you can afford and optimized as much as possible simply can't keep up with the load. How much better it is depends on the specific use case (lots of update activity combined with lots of joins is very hard on "traditional" RDBMSs) - could well be a factor of 1000 in extreme cases.
Is the technology too new to start implementing yet or is it worth taking a look into?
Depends mainly on what you're trying to achieve. It's certainly mature enough to use. But few applications really need to scale that massively. For most, a traditional RDBMS is sufficient. However, with internet usage becoming more ubiquitous all the time, it's quite likely that applications that do will become more common (though probably not dominant).
Since someone said that my previous post was off-topic, I'll try to compensate :-) NoSQL is not, and never was, intended to be a replacement for more mainstream SQL databases, but a couple of words are in order to get things in the right perspective.
At the very heart of the NoSQL philosophy lies the consideration that, possibly for commercial and portability reasons, SQL engines tend to disregard the tremendous power of the UNIX operating system and its derivatives.
With a filesystem-based database, you can take immediate advantage of the ever-increasing capabilities and power of the underlying operating system, which have been steadily increasing for many years now in accordance with Moore's law. With this approach, many operating-system commands become automatically also "database operators" (think of "ls" "sort", "find" and the other countless UNIX shell utilities).
With this in mind, and a bit of creativity, you can indeed devise a filesystem-based database that is able to overcome the limitations of many common SQL engines, at least for specific usage patterns, which is the whole point behind NoSQL's philosophy, the way I see it.
I run hundreds of web sites and they all use NoSQL to a greater or lesser extent. In fact, they do not host huge amounts of data, but even if some of them did I could probably think of a creative use of NoSQL and the filesystem to overcome any bottlenecks. Something that would likely be more difficult with traditional SQL "jails". I urge you to google for "unix", "manis" and "shaffer" to understand what I mean.
If I recall correctly, it refers to types of databases that don't necessarily follow the relational form. Document databases come to mind, databases without a specific structure, and which don't use SQL as a specific query language.
It's generally better suited to web applications that rely on performance of the database, and don't need more advanced features of Relation Database Engines. For example, a Key->Value store providing a simple query by id interface might be 10-100x faster than the corresponding SQL server implementation, with a lower developer maintenance cost.
One example is this paper for an OLTP Tuple Store, which sacrificed transactions for single threaded processing (no concurrency problem because no concurrency allowed), and kept all data in memory; achieving 10-100x better performance as compared to a similar RDBMS driven system. Basically, it's moving away from the 'One Size Fits All' view of SQL and database systems.
In practice, NoSQL is a database system which supports fast access to large binary objects (docs, jpgs etc) using a key based access strategy. This is a departure from the traditional SQL access which is only good enough for alphanumeric values. Not only the internal storage and access strategy but also the syntax and limitations on the display format restricts the traditional SQL. BLOB implementations of traditional relational databases too suffer from these restrictions.
Behind the scene it is an indirect admission of the failure of the SQL model to support any form of OLTP or support for new dataformats. "Support" means not just store but full access capabilities - programmatic and querywise using the standard model.
Relational enthusiasts were quick to modify the defnition of NoSQL from Not-SQL to Not-Only-SQL to keep SQL still in the picture! This is not good especially when we see that most Java programs today resort to ORM mapping of the underlying relational model. A new concept must have a clearcut definition. Else it will end up like SOA.
The basis of the NoSQL systems lies in the random key - value pair. But this is not new. Traditional database systems like IMS and IDMS did support hashed ramdom keys (without making use of any index) and they still do. In fact IDMS already has a keyword NONSQL where they support SQL access to their older network database which they termed as NONSQL.
It's like Jacuzzi: both a brand and a generic name. It's not just a specific technology, but rather a specific type of technology, in this case referring to large-scale (often sparse) "databases" like Google's BigTable or CouchDB.
NoSQL the actual program appears to be a relational database implemented in awk using flat files on the backend. Though they profess, "NoSQL essentially has no arbitrary limits, and can work where other products can't. For example there is no limit on data field size, the number of columns, or file size" , I don't think it is the large scale database of the future.
As Joel says, massively scalable databases like BigTable or HBase, are much more interesting. GQL is the query language associated with BigTable and App Engine. It's largely SQL tweaked to avoid features Google considers bottle-necks (like joins). However, I haven't heard this referred to as "NoSQL" before.
NoSQL is a database system which doesn't use string based SQL queries to fetch data.
Instead you build queries using an API they will provide, for example Amazon DynamoDB is a good example of a NoSQL database.
NoSQL databases are better for large applications where scalability is important.
Does NoSQL mean non-relational database?
Yes, NoSQL is different from RDBMS and OLAP. It uses looser consistency models than traditional relational databases.
Consistency models are used in distributed systems like distributed shared memory systems or distributed data store.
How it works internally?
NoSQL database systems are often highly optimized for retrieval and appending operations and often offer little functionality beyond record storage (e.g. key-value stores). The reduced run-time flexibility compared to full SQL systems is compensated by marked gains in scalability and performance for certain data models.
It can work on Structured and Unstructured Data. It uses Collections instead of Tables
How do you query such "database"?
Watch SQL vs NoSQL: Battle of the Backends; it explains it all.

What exactly is NoSQL?

What exactly is NoSQL? Is it database systems that only work with {key:value} pairs?
As far as I know MemCache is one of such database systems, am I right?
What other popular NoSQL databases are there and where exactly are they useful?
Thanks, Boda Cydo.
I'm not agree with the answers I'm seeing, although it's true that NoSQL solutions tends to break the ACID rules, not all are created from that approach.
I think first you should define what is a SQL Solution and then you can put the "Not Only" in front of it, that will be more accurate definition of what is a NoSQL solution.
With this approach in mind:
SQL databases are a way to group all the data stores that are accessible using Structured Query Language as the main (and most of the time only) way to communicate with them, this means it requires that the database support the structures that are common to those systems like "tables", "columns", "rows", "relationships", etc.
Now, put the "Not Only" in front of the last sentence and you will get a definition of what means "NoSQL". NoSQL groups all the stores created as an attempt to solve problems which cannot fit into the table/column/rows structures or even in SQL Statements, in most of the cases these databases will not support relationships, they're abandoning the well known structures just because the problems have changed since their conception.
If you have a text file, and you create an API to store/retrieve/organize this information, then you have a NoSQL database in your hands.
All of these means that there are several solutions to store the information in a way that traditional SQL systems will not allow to achieve better performance, flexibility, etc etc. Every NoSQL provider tries to solve a different problem and that's why you wont be able to compare two different solutions, for example:
djondb is a document store created to be used as
NoSQL enterprise solution supporting transactions, consistency, etc.
but sacrifice performance of its counterparts.
MongoDB is a document store (similar to
djondb) which accomplish great performance but trades some of the
ACID properties to achieve this.
CouchDB is another document store which
solves the queries slightly different providing views to retrieve the
information without doing a full query every time.
...
As you may have noticed I only talked about the document stores, that's because I wanted to show you that 3 different document stores implementations have different approach, therefore you should keep in mind the golden rule of NoSQL stores "Use the right tool for the right job".
I'm the creator of djondb and I've been doing a lot of research even before trying to start my own NoSQL implementation, but this is a field where the concepts will keep changing the way we see the information storage.
From wikipedia:
NoSQL is an umbrella term for a loosely defined class of non-relational data stores that break with a long history of relational databases and ACID guarantees. Data stores that fall under this term may not require fixed table schemas, and usually avoid join operations. The term was first popularised in early 2009.
The motivation for such an architecture was high scalability, to support sites such as Facebook, advertising.com, etc...
To quickly get a handle on NoSQL systems, see this blog post I wrote: Visual Guide to NoSQL Systems. Essentially, NoSQL systems sacrifice either consistency or availability in favor of tolerance to network partitions.
What is NoSQL ?
NoSQL is the acronym for Not Only SQL. The basic qualities of NoSQL databases are schemaless, distributed and horizontally scalable on commodity hardware. The NoSQL databases offers variety of functions to solve various problems with variety of data types, where “blob” used to be the only data type in RDBMS to store unstructured data.
1 Dynamic Schema
NoSQL databases allows schema to be flexible. New columns can be added anytime. Rows may or may not have values for those columns and no strict enforcement of data types for columns. This flexibility is handy for developers, especially when they expect frequent changes during the course of product life cycle.
2 Variety of Data
NoSQL databases support any type of data. It supports structured, semi-structured and unstructured data to be stored. Its supports logs, images files, videos, graphs, jpegs, JSON, XML to be stored and operated as it is without any pre-processing. So it reduces the need for ETL (Extract – Transform – Load).
3 High Availability Cluster
NoSQL databases support distributed storage using commodity hardware. It also supports high availability by horizontal scalability. This features enables NoSQL databases get the benefit of elastic nature of the Cloud infrastructure services.
4 Open Source
NoSQL databases are open source software. The usage of software is free and most of them are free to use in commercial products. The open sources codebase can be modified to solve the business needs. There are minor variations in the open source software licenses, users must be aware of license agreements.
5 NoSQL – Not Only SQL
NoSQL databases not only depend SQL to retrieve data. They provide rich API interfaces to perform DML and CRUD operations. These are APIs are move developer friendly and supported in variety of programming languages.
Take a look at these:
http://en.wikipedia.org/wiki/Nosql#List_of_NoSQL_open_source_projects
and this:
http://www.mongodb.org/display/DOCS/Comparing+Mongo+DB+and+Couch+DB
I used something called the Raima Data Manager more than a dozen years ago, that qualifies as NoSQL. It calls itself a "Set Oriented Database" Its not based on tables, and there is no query "language", just an C API for asking for subsets.
It's fast and easier to work with in C/C++ and SQL, there's no building up strings to pass to a query interpreter and the data comes back as an enumerable object rather than as an array. variable sized records are normal and don't waste space. I never saw the source code, but there were some hints at the interface that internally, the code used pointers a lot.
I'm not sure that the product I used is even sold anymore, but the company is still around.
MongoDB looks interesting, SourceForge is now using it.
I listened to a podcast with a team member. The idea with NoSQL isn't so much to replace SQL as it is to provide a solution for problems that aren't solved well with traditional RDBMS. As mentioned elsewhere, they are faster and scale better at the cost of reliability and atomicity (different solutions to different degrees). You wouldn't want to use one for a financial system, but a document based system would work great.
Here is a comprehensive list of NoSQL Databases: http://nosql-database.org/.
I'm glad that you have had success with RDM John! I work at Raima so it's great to hear feedback. For those looking for more information, here are a couple of resources:
Video Overview of RDM's General Architecture
Free Evaluation Download of RDM

Database system that is not relational

What are the other types of database systems out there. I've recently came across couchDB that handles data in a non relational way. It got me thinking about what other models are other people is using.
So, I want to know what other types of data model is out there. (I'm not looking for any specifics, just want to look at how other people are handling data storage, my interest are purely academic)
The ones I already know are:
RDBMS (mysql,postgres etc..)
Document based approach (couchDB, lotus notes)
Key/value pair (BerkeleyDB)
db4o
Quote from the "about" page:
db4o is the open source object database that enables Java and .NET developers to store and retrieve any application object with only one line of code, eliminating the need to predefine or maintain a separate, rigid data model.
Older non-relational databases:
Network Database
Hierarchical Database
Both mostly went out of style when relational became feasible.
Column-oriented databases are also a bit of a different animal. Many of them do support standard relational database SQL though. These are generally used for data warehouse type applications.
Semantic Web is also a non-relational data storage paradigm. There are no relations, all metadata is stored in the same way as data, and every entity has potentially its own unique set of attributes. Open-source projects that implement RDF, a Semantic Web standard, include Jena and Sesame.
Isn't Amazon's SimpleDB non-relational?
db4o, as mentioned by Eric, is an Object-Oriented database management system (OODBMS).
There's object-based databases(Gemstore, for example). Google's Big-Table and Amason's Simple Storage I am not sure how you would categorize, but both are map-reduce based.
A non-relational document oriented database we have been looking at is Apache CouchDB.
Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.
Our interest was in providing a distributed access user preferences store that would be immune to shape changes to which we could serialize preference objects from Java and access those just as easily with Javascript from a XULRunner based client application.
I'd like to detail more on Bill Karwin's answer about semantic web and triplestores, since it's what I am working on at the moment, and I have something to say on it.
The idea behind a triplestore is to store a graph-based database, whose datamodel roots in RDF. With RDF, you describe nodes and associations among nodes (in other words, edges). Data is organized in triples :
start node ----relation----> end node
(in RDF speech: subject --predicate--> object). With this very simple data model, any data network can be represented by adding more and more triples, provided you give a meaning to nodes and relations.
RDF is very general, and it's a graph-based data model well suited for search criteria looking for all triples with a particular combination of subject, predicate, or object, in any combination. Eventually, through a query language called SPARQL, you can also perform more complex queries, an operation that boils down to a graph isomorphism search onto the graph, both in terms of topology and in terms of node-edge meaning (we'll see this in a moment). SPARQL allows you only SELECT (and similar) queries. No DELETE, no INSERT, no UPDATE. The information you query (e.g. specific nodes you are interested in) are mapped into a table, which is what you get as a result of your query.
Now, topology in itself does not mean a lot. For this, a Schema language has been invented. Actually, more than one, and calling them schema languages is, in some cases, very limitative. The most famous and used today are RDF-Schema, OWL (Lite and Full), and they predate from the obsolete DAML+OIL. The point of these languages is, boiling down stuff, to give a meaning to nodes (by granting them a type, also described as a triple) and to relationships (edges). Also, you can define the "range" and "domain" of these relationships, or said differently what type is the start node and what type is the end node: you can say for example, that the property "numberOfWheels" can be applied only to connect a node of type Vehicle to a non-zero integer value.
ns:MyFiat --rdf:type--> ns:Vehicle
ns:MyFiat --ns:numberOfWheels-> 4
Now, you can use these ontologies in two directions: validation and inference. Validation is not that fancy today, but I've seen instances of use. Inference is what is cool today, because it allows reasoning. Inference basically takes a RDF graph containing a set of triples, takes an ontology, mixes them into a triplestore database which contains an "inference engine" and like magic the inference engine invents triples according to your ontological description. Example: suppose you just store this information in the database
ns:MyFiat --ns:numberOfWheels--> 4
and nothing else. No type is specified about this node, but the inference engine will add automatically a triple saying that
ns:MyFiat --rdf:type--> ns:Vehicle
because you said in your ontology that only objects of type Vehicle can be described by a property numberOfWheels.
Conversely, you can use the inference engine to validate your data against the ontology so to refuse not compliant data (sort of like XML-Schema for XML). In this case, you will need both triples to have your data successfully accepted by the triplestore.
Additional characteristics of triplestores are Formulas and Context-aware storage. Formulas are statements (as usual, triples subject predicate object) that describe something hypothetical. I never used Formulas, so I won't go into more details of something I don't know. Context awareness are basically subgraphs: the problem with storing triples is that you don't have anything to say where these triples come from. Suppose you have two dealers that describe the same price of a component. One says that the price is 5.99 and the other 4.99. If you just store both triples into a database, now you don't know anything about who stated each information. There are two ways to solve this problem.
One is reification. Reification means that you store additional triples to describe another triple. It's wasteful, and makes life hell because you have to reify every and each triple you store. The alternative is context-awareness. Having a context-aware storage It's like being able to box a bunch of triples into a container with a label on it (the context identifier). You now can use this identifier as subject for additional statements, hence describing a bunch of triples in a single action.
4. Navigational. Includes Tree/Hierarchy and Graph/Network.
File systems, the semantic web, XML, Object databases, CODASYL, and many others all fit into this category.
Those 4 are pretty much it.
There is also what is referred to as an "inverted index" or "inverted list" database. Software AG's Adabas product would be an example. As with hierachical, these databases continue to be used in large corporate or university environments because of legacy considerations or due to a performance advantage in certain situations (typically high-end transactional applications).
There are BASE systems (Basically Available, Soft State, Eventually consistent) and they work well with simple data models holding vast volumes of data. Google's BigTable, Dojo's Persevere, Amazon's Dynamo, Facebook's Cassandra are some examples.
See LINK
The illuminate Correlation Database is a new revolutionary non-relational database. The Correlation Database Management Dystem (CDBMS) is data model independent and designed to efficiently handle unplanned, ad hoc queries in an analytical system environment. Unlike relational database management systems or column-oriented databases, a correlation database uses a value-based storage (VBS) architecture in which each unique data value is stored only once and an auto-generated indexing system maintains the context for all values (data is 100% indexed). Queries are performed using natural language instead of SQL (NoSQL).
Learn more at: www.datainnovationsgroup.com

Resources