How is it possible to build a database index on top of a key/value store?

I was reading about LevelDB and found out that:
Upcoming versions of the Chrome browser include an implementation of the IndexedDB HTML5 API that is built on top of LevelDB
IndexedDB is also a simple key/value store that has the ability to index data.
My question is: how is it possible to build an index on top of a key/value store? I know that an index, at its lowest level, is an n-ary tree, and I understand the way data is indexed in a database. But how can a key/value store like LevelDB be used for creating a database index?

The vital feature is not that it supports custom comparators but that it supports ordered iteration through keys, and thus searches on partial keys. You can emulate fields in keys just by using conventions for separating string values. The many scripting layers that sit on top of LevelDB use that approach.
The dictionary view of a Key-Value store is that you can only tell if a key is present or not by exact match. It is not really feasible to use just such a KV store as a basis for a database index.
As soon as you can iterate through keys starting from a partial match, you have enough to provide the searching and sorting operations for an index.
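As a minimal sketch of that idea with the LevelDB C++ API (the "user:<id>:<field>" key convention and the database path here are made up for illustration): ordered iteration lets you Seek to a prefix and read every key that starts with it, which is exactly the partial-key search described above.

#include <cassert>
#include <iostream>
#include <string>
#include "leveldb/db.h"

int main() {
  leveldb::DB* db;
  leveldb::Options options;
  options.create_if_missing = true;
  // Path and key layout are illustrative only.
  leveldb::Status s = leveldb::DB::Open(options, "/tmp/example_db", &db);
  assert(s.ok());

  // Emulate "fields" by a separator convention inside the key.
  db->Put(leveldb::WriteOptions(), "user:42:email", "a@example.com");
  db->Put(leveldb::WriteOptions(), "user:42:name", "Alice");
  db->Put(leveldb::WriteOptions(), "user:43:name", "Bob");

  // Partial-key search: iterate all keys that share the prefix "user:42:".
  const std::string prefix = "user:42:";
  leveldb::Iterator* it = db->NewIterator(leveldb::ReadOptions());
  for (it->Seek(prefix);
       it->Valid() && it->key().ToString().compare(0, prefix.size(), prefix) == 0;
       it->Next()) {
    std::cout << it->key().ToString() << " = " << it->value().ToString() << "\n";
  }
  delete it;
  delete db;
}

Because keys are stored sorted, the loop stops as soon as it leaves the prefix range, so the scan touches only the matching keys rather than the whole store.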

Just a couple of things: LevelDB supports sorting of data using a custom comparator. From the page you linked to:
According to the project site the key features are:
Keys and values are arbitrary byte arrays.
Data is stored sorted by key.
Callers can provide a custom comparison function to override the sort order.
....
So LevelDB can contain data that can be sorted/indexed based on one sort order.
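For illustration, here is a sketch of plugging a custom comparator into LevelDB; the class name, key format, and path are invented, and the ordering chosen (decimal strings compared as integers, so "9" sorts before "10") is just an example.

#include <cstdint>
#include <string>
#include "leveldb/comparator.h"
#include "leveldb/db.h"
#include "leveldb/slice.h"

// Illustrative comparator: treats keys as decimal integers, so "9" < "10".
class NumericComparator : public leveldb::Comparator {
 public:
  int Compare(const leveldb::Slice& a, const leveldb::Slice& b) const override {
    uint64_t x = std::stoull(a.ToString());
    uint64_t y = std::stoull(b.ToString());
    if (x < y) return -1;
    if (x > y) return +1;
    return 0;
  }
  // The name is persisted with the DB; changing the ordering needs a new name.
  const char* Name() const override { return "NumericComparator.v1"; }
  // These two are only optimizations; doing nothing is correct, just less compact.
  void FindShortestSeparator(std::string*, const leveldb::Slice&) const override {}
  void FindShortSuccessor(std::string*) const override {}
};

int main() {
  NumericComparator cmp;  // must outlive the DB handle
  leveldb::Options options;
  options.create_if_missing = true;
  options.comparator = &cmp;
  leveldb::DB* db;
  leveldb::DB::Open(options, "/tmp/numeric_db", &db);  // path is illustrative
  db->Put(leveldb::WriteOptions(), "9", "nine");
  db->Put(leveldb::WriteOptions(), "10", "ten");  // iterates after "9", not before
  delete db;
}

The comparator defines the one physical sort order of the store, which is why additional sort orders need their own index structures, as noted below.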
If you needed several indexable fields, you could just add your own B-tree that works on top of LevelDB. I would imagine that this is the type of approach the Chrome browser takes, but I'm just guessing.
You can always look through the Chrome source.

Related

What is the most optimal way to store a trie for typeahead suggestions in distributed systems?

I have been reading a bit about tries, and how they are a good structure for typeahead designs. Aside from the trie, you usually also have a key/value pair for nodes and pre-computed top-n suggestions to improve response times.
Usually, from what I've gathered, it is ideal to keep them in memory for fast searches, such as what was suggested in this question: Scrabble word finder: building a trie, storing a trie, using a trie?. However, what if your Trie is too big and you have to shard it somehow? (e.g. perhaps a big e-commerce website).
The key/value pair for pre-computed suggestions can obviously be implemented in a key/value store (either kept in memory, like memcached/redis, or in a database, and horizontally scaled as needed), but what is the best way to store a trie if it can't fit in memory? Should it be done at all, or should distributed systems each hold part of the trie in memory, while also replicating it so that it is not lost?
Alternatively, a search service (e.g. Solr or Elasticsearch) could be used to produce search suggestions/auto-complete, but I'm not sure whether the performance is up to par for this particular use-case. The advantage of the Trie is that you can pre-compute top-N suggestions based on its structure, leaving the search service to handle actual search on the website.
I know there are off-the-shelf solutions for this, but I'm mostly interested in learning how to re-invent the wheel on this one, or at least catch a glimpse of the best practices if one wants to broach this topic.
What are your thoughts?
Edit: I also saw this post: https://medium.com/@prefixyteam/how-we-built-prefixy-a-scalable-prefix-search-service-for-powering-autocomplete-c20f98e2eff1, which basically covers the use of Redis as the primary data store for a skip list, with MongoDB for LRU prefixes. Seems an OK approach, but I would still want to learn if there are other viable/better approaches.
I was reading 'System Design Interview: An Insider's Guide' by Alex Xu and he briefly covers this topic.
Trie DB. Trie DB is the persistent storage. Two options are available to store the data:
Document store: Since a new trie is built weekly, we can periodically take a snapshot of it, serialize it, and store the serialized data in the database. Document stores like MongoDB [4] are good fits for serialized data.
Key-value store: A trie can be represented in a hash table form [4] by applying the following logic:
• Every prefix in the trie is mapped to a key in a hash table.
• Data on each trie node is mapped to a value in a hash table.
Figure 13-10 shows the mapping between the trie and hash table.
The numbers on each prefix node represent the frequency of searches for that specific prefix/word.
He then suggests possibly scaling the storage by sharding the data by first letter (or groups of letters, like a-m, n-z). But this would distribute the data unevenly, since there are more words that start with 'a' than with 'z'.
So he recommends using some type of shard manager that keeps track of query frequency and assigns shards based on that. If there are twice as many queries for 's' as for 'z' and 'x' combined, two shards can be used: one for 's', and another for 'z' and 'x'.
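As a toy sketch of the hash-table mapping the book describes, here the key-value store is stood in for by an in-memory std::map, and the prefixes, words, frequencies, and top-k lists are all invented for illustration.

#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the key-value store: prefix -> pre-computed top suggestions.
using TopK = std::vector<std::pair<std::string, int>>;  // (word, frequency)

int main() {
  std::map<std::string, TopK> kv;

  // Offline build step (e.g. weekly, from the full trie):
  // every prefix gets its own key, with the top suggestions as the value.
  kv["be"]  = {{"best", 35}, {"bet", 29}, {"beer", 15}};
  kv["bee"] = {{"beer", 15}, {"been", 12}};

  // Serving a typeahead request is then a single point lookup by prefix.
  const std::string typed = "be";
  auto it = kv.find(typed);
  if (it != kv.end()) {
    for (const auto& [word, freq] : it->second) {
      std::cout << word << " (" << freq << ")\n";
    }
  }
}

The point of the layout is that the expensive work (ranking by frequency) happens at build time, so each keystroke costs only one key lookup, which is what makes sharding by key feasible.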

Reverse Indexing and Data modeling in Key-Value store

I am new to key-value stores. My objective is to use an embedded key-value store to keep the persistent data model. The data model comprises a few related tables if designed with a conventional RDBMS. I was reading a Medium article on modeling a table for a key-value store. Although the article uses LevelDB with Java, I am planning to use RocksDB or FASTER with C++ for my work.
It uses a scheme where one key is used for every attribute of each row, like the following example.
$table_name:$primary_key_value:$attribute_name = $value
The above is fine for point lookups when user code knows exactly which key to get. But there are scenarios like searching for users with the same email address, searching for users above a certain age, or searching for users of one specific gender. In those search scenarios the article performs a linear scan through all keys: each iteration checks the pattern of the key and applies the business logic (checking the value for a match) once a key with a matching pattern is found, as sketched below.
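To make the cost concrete, here is a rough sketch of that scan with the RocksDB C++ API; the key layout ("users:<pk>:email"), the attribute, and the value being searched for are illustrative, not taken from the article.

#include <iostream>
#include <string>
#include "rocksdb/db.h"

// Returns true for keys shaped like "users:<pk>:email" (layout is illustrative).
static bool IsUserEmailKey(const std::string& key) {
  return key.rfind("users:", 0) == 0 &&
         key.size() > 6 && key.compare(key.size() - 6, 6, ":email") == 0;
}

int main() {
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;
  rocksdb::DB::Open(options, "/tmp/users_db", &db);  // path is illustrative

  // Worst case: the whole store is traversed to answer one query.
  rocksdb::Iterator* it = db->NewIterator(rocksdb::ReadOptions());
  for (it->SeekToFirst(); it->Valid(); it->Next()) {
    const std::string key = it->key().ToString();
    if (IsUserEmailKey(key) && it->value().ToString() == "alice@example.com") {
      std::cout << "match: " << key << "\n";
    }
  }
  delete it;
  delete db;
}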
It seems that such searching is inefficient and, in the worst case, needs to traverse the entire store. To solve that, a reverse lookup table is required. My questions are:
How to model the reverse lookup table? Is it some sort of reinvention of the wheel? Is there any alternative way?
One solution that readily comes to mind is to have a separate store for each indexable property, like the following.
$table_name:$attribute_name:$value_1 = $primary_key_value
With this approach the immediate question is
How to handle collisions in this reverse lookup table? Because multiple $primary_keys may be associated with the same value.
As an immediate solution, instead of storing a single value, an array of multiple primary keys can be stored as shown below.
$table_name:$attribute_name:$value_1 = [$primary_key_value_1, ... , $primary_key_value_N]
But such modeling requires user code to parse the array from a string and serialize it back to a string after each manipulation (assuming the underlying key-value store is not aware of array values).
Is it efficient to store multiple keys as an array value? Or is there some vendor-provided efficient way?
Assuming the stringified-array design works, there has to be such an index for each indexable property. So this gives fine-grained control over what to index and what not to index. The next design decision is where these indexes will be stored:
Should the indexes be stored in a separate store/file, or in the same store/file the actual data belongs to? Should there be a different store for each property?
For this question I don't have a clue, because both approaches require more or less the same amount of I/O. However, a single large data file keeps more things on disk and fewer things in memory (so more I/O), whereas with multiple files more things stay in memory, so fewer page faults. This assumption could be totally wrong depending on the architecture of the specific key-value store. At the same time, having too many files turns into the problem of managing a complicated file structure. Also, maintaining indexes requires transactions for insert, update and delete operations. Having multiple files results in a single update in multiple trees, whereas having a single file results in multiple updates in a single tree.
Are transactions, more specifically transactions involving multiple stores/files, supported?
Besides the indexes, there is some meta information about the table that also needs to be kept along with the table data. To generate a new primary key (auto-incremented), prior knowledge of the last row number or the last primary key generated is required, because something like COUNT(*) won't work. Additionally, as not all keys are indexed, the meta information may include which properties are indexed and which are not.
How to store the meta information of each table?
Again, the same set of questions appears for the meta table: e.g. should the meta be a separate store/file? Additionally, as not all properties are indexed, we may even decide to store each row as a JSON-encoded value in the data store and keep that alongside the index stores. The underlying key-value store will treat that JSON as a plain string value, like the following.
$table_name:data:$primary_key_value = {$attr_1_name: $attr_1_value, ..., $attr_N_name: $attr_N_value}
...
$table_name:index:$attribute_name = [$primary1, ..., $primaryN]
However, reverse lookups are still possible through the indexes pointing back to the primary key.
Are there any drawbacks of using JSON-encoded values instead of storing all properties as separate keys?
So far I could not find any drawbacks to this method, other than forcing the user to use JSON encoding, and some heap allocation for JSON encoding/decoding.
The problems mentioned above are not specific to any particular application. They are generic enough to apply to any development that uses a key-value store. So it is essential to know whether there is any reinvention of the wheel.
Is there any de facto standard solution to all the problems mentioned in the question? Do the solutions differ from the ones stated in the question?
How to model the reverse lookup table? Is it some sort of reinvention of the wheel? Is there any alternative way?
All the ways you describe are valid ways to create an index.
It is not reinventing the wheel in the case of RocksDB, because RocksDB does not support indices.
It really depends on the data, in general you will need to copy the index value and the primary key into another space to create the index.
How to handle collisions in this reverse lookup table? Because multiple $primary_keys may be associated with the same value.
You can serialize the pks using JSON (or something else). The problem with that approach is when the list of pks grows very large (which might or might not be an issue).
Is it efficient to store multiple keys as an array value? Or is there some vendor-provided efficient way?
With RocksDB, you have nothing that will make it "easier".
You did not mention the following approach:
$table_name:$attribute_name:$value_1:$primary_key_value_1 = ""
$table_name:$attribute_name:$value_1:$primary_key_value_2 = ""
...
$table_name:$attribute_name:$value_1:$primary_key_value_n = ""
where the value is empty and the indexed pk is part of the key.
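A sketch of that layout with the RocksDB C++ API (the table name, attribute, email value, and path are made up): the index entry is written in the same atomic batch as the row, and a reverse lookup becomes an ordered prefix scan over "users:email:<value>:".

#include <iostream>
#include <string>
#include "rocksdb/db.h"
#include "rocksdb/write_batch.h"

int main() {
  rocksdb::DB* db;
  rocksdb::Options options;
  options.create_if_missing = true;
  rocksdb::DB::Open(options, "/tmp/users_db", &db);  // path is illustrative

  // Write the row and its index entry atomically in one batch.
  rocksdb::WriteBatch batch;
  batch.Put("users:42:email", "alice@example.com");        // primary data
  batch.Put("users:email:alice@example.com:42", "");       // index: pk in key, empty value
  db->Write(rocksdb::WriteOptions(), &batch);

  // Reverse lookup: scan every index key carrying this email as its prefix.
  const std::string prefix = "users:email:alice@example.com:";
  rocksdb::Iterator* it = db->NewIterator(rocksdb::ReadOptions());
  for (it->Seek(prefix);
       it->Valid() && it->key().ToString().compare(0, prefix.size(), prefix) == 0;
       it->Next()) {
    // The primary key is the suffix after the prefix.
    std::cout << "pk = " << it->key().ToString().substr(prefix.size()) << "\n";
  }
  delete it;
  delete db;
}

Because each (value, pk) pair is its own key, collisions never need a list value, and adding or removing one pk is a single Put or Delete on one index key.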
Should the indexes be stored in a separate store/file, or in the same store/file the actual data belongs to? Should there be a different store for each property?
It depends on the key-value store. With RocksDB, if you need transactions, you must stick to one database file.
Are transactions, more specifically transactions involving multiple stores/files, supported?
Only Oracle Berkeley DB and WiredTiger support that feature.
How to store the meta information of each table?
Metadata can be in the database or in the code.
Are there any drawbacks of using JSON-encoded values instead of storing all properties as separate keys?
Yeah, as I said above, if you encode all pks into a single value, it might lead to problems downstream when the number of pks is large. For instance, you need to read the whole list to do pagination.
Is there any de facto standard solution to all the problems mentioned in the question? Do the solutions differ from the ones stated in the question?
To summarize:
With RocksDB, use a single database file.
In the index, encode the primary key inside the key and leave the value empty, so you can paginate.
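One possible way to read that pagination point, sketched under the same illustrative key layout: resume the scan by seeking just past the last index key returned in the previous page, rather than decoding a growing list value.

#include <string>
#include <vector>
#include "rocksdb/db.h"

// Fetch the next `page_size` primary keys for an index prefix, resuming after
// `last_seen_key` (empty string means "start from the beginning").
// The key layout ("users:email:<value>:<pk>") is illustrative.
std::vector<std::string> NextPage(rocksdb::DB* db, const std::string& prefix,
                                  const std::string& last_seen_key,
                                  size_t page_size) {
  std::vector<std::string> pks;
  rocksdb::Iterator* it = db->NewIterator(rocksdb::ReadOptions());
  if (last_seen_key.empty()) {
    it->Seek(prefix);
  } else {
    it->Seek(last_seen_key);
    if (it->Valid() && it->key().ToString() == last_seen_key) it->Next();
  }
  for (; it->Valid() && pks.size() < page_size; it->Next()) {
    const std::string key = it->key().ToString();
    if (key.compare(0, prefix.size(), prefix) != 0) break;  // left the index range
    pks.push_back(key.substr(prefix.size()));               // suffix is the pk
  }
  delete it;
  return pks;
}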

Is Couchbase an ordered key-value store?

Are documents in Couchbase stored in key order? In other words, would they allow efficient queries for retrieving all documents with keys falling in a certain range? In particular, I need to know if this is true for Couchbase Lite.
Query efficiency is correlated with the construction of the views that are added to the server.
Couchbase/Couchbase Lite only stores the indexes specified and generated by the programmer in these views. As Couchbase rebalances, it moves documents between nodes, so it seems impractical that key order could be guaranteed or consistent.
(Few databases/datastores guarantee document or row ordering on disk, as indexes provide this functionality more cheaply.)
Couchbase document retrieval is performed via map/reduce queries in views:
A view creates an index on the data according to the defined format and structure. The view consists of specific fields and information extracted from the objects in Couchbase. Views create indexes on your information that enables search and select operations on the data.
source: views intro
A view is created by iterating over every single document within the Couchbase bucket and outputting the specified information. The resulting index is stored for future use and updated with new data stored when the view is accessed. The process is incremental and therefore has a low ongoing impact on performance. Creating a new view on an existing large dataset may take a long time to build but updates to the data are quick.
source: Views Basics
and finally, the section on Translating SQL to map/reduce may be helpful:
In general, for each WHERE clause you need to include the corresponding field in the key of the generated view, and then use the key, keys or startkey / endkey combinations to indicate the data you want to select.
In conclusion, Couchbase views constantly update their indexes to ensure optimal query performance. Couchbase Lite queries work similarly, but the mechanics differ slightly from the server's:
View indexes are updated on demand when queried. So after a document changes, the next query made to a view will cause that view's map function to be called on the doc's new contents, updating the view index. (But remember that you shouldn't write any code that makes assumptions about when map functions are called.)
How to improve your view indexing: The main thing you have control over is the performance of your map function, both how long it takes to run and how many objects it allocates. Try profiling your app while the view is indexing and see if a lot of time is spent in the map function; if so, optimize it. See if you can short-circuit the map function and give up early if the document isn't a type that will produce any rows. Also see if you could emit less data. (If you're emitting the entire document as a value, don't.)
from Couchbase Lite - View

Storing Inverted Index

I know that an inverted index is a good way to index words, but what I'm confused about is how search engines actually store them. For example, if the word "google" appears in documents 2, 4, 6, 8 with different frequencies, where should we store that? Would a database table with a one-to-many relation do any good for storing them?
It is highly unlikely that full-fledged SQL-like databases are used for this purpose. First, it is called an inverted index because it is just an index: each entry is just a reference. Non-relational databases and key-value stores have also come up as a favourite topic in relation to web technology for this reason.
You only ever have one way of accessing the data (by query word). That is why it's called an index.
Each entry is a list/array/vector of references to documents, so each element of that list is very small. The only other information, besides the document ID, would be a tf-idf score for each element.
How to use it:
If you have a single query word ("google"), then you look up in the inverted index which documents this word turns up in (2, 4, 6, 8 in your example). If you have tf-idf scores, you can sort the results to report the best-matching document first. You then look up which documents the document IDs 2, 4, 6, 8 refer to, and report their URL as well as a snippet, etc. URLs, snippets, etc. are probably best stored in another table or key-value store.
If you have multiple query words ("google" and "altavista"), you look into the II for both query words and get two lists of document IDs (2, 4, 6, 8 and 3, 7, 8, 11, 19). You take the intersection of both lists, which in this case is (8): the list of documents in which both query words occur.
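A small sketch of that intersection step, using the doc IDs from the example: with postings kept sorted, the AND of two query terms is a linear merge.

#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
  // Sorted postings lists from the example: "google" and "altavista".
  std::vector<int> google    = {2, 4, 6, 8};
  std::vector<int> altavista = {3, 7, 8, 11, 19};

  // Documents containing both query words (here: {8}).
  std::vector<int> both;
  std::set_intersection(google.begin(), google.end(),
                        altavista.begin(), altavista.end(),
                        std::back_inserter(both));

  for (int doc_id : both) std::cout << doc_id << "\n";
}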
It's a fair bet that each of the major search engines has its own technology for handling inverted indexes. It's also a moderately good bet that they're not based on standard relational database technology.
In the specific case of Google, it is a reasonable guess that the current technology used is derived from the BigTable technology described in 2006 by Fay Chang et al in Bigtable: A Distributed Storage System for Structured Data. There's little doubt that the system has evolved since then, though.
Traditionally, an inverted index is written directly to a file and stored on disk somewhere. If you want to do boolean retrieval querying (either a document contains all the words in the query or not), postings might look like the following, stored contiguously in a file.
Term_ID_1:Frequency_N:Doc_ID_1,Doc_ID_2,Doc_ID_N.Term_ID_2:Frequency_N:Doc_ID_1,Doc_ID_2,Doc_ID_N.Term_ID_N:Frequency_N:Doc_ID_1,Doc_ID_2,Doc_ID_N
The term ID is the ID of a term, the frequency is the number of docs the term appears in (in other words, how long the postings list is), and the doc ID is a document that contained the term.
Along with the index, you need to know where everything is in the file, so mappings also have to be stored somewhere, in another file. For instance, given a term_id, the map needs to return the file position that contains that term's postings, and then it is possible to seek to that position. Since the frequency is recorded in the postings, you know how many doc_ids to read from the file. In addition, there will need to be mappings from the IDs to the actual term/doc names.
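A rough sketch of that lookup path; the file name, the binary record layout (a doc count followed by that many doc IDs), and the in-memory offset map are all assumptions for illustration, not a description of any particular engine.

#include <cstdint>
#include <fstream>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative on-disk record: [uint32 doc_count][doc_count x uint32 doc_id].
// `offsets` would itself be loaded from a small companion file.
std::vector<uint32_t> ReadPostings(
    const std::string& postings_path,
    const std::unordered_map<uint32_t, std::streamoff>& offsets,
    uint32_t term_id) {
  std::vector<uint32_t> doc_ids;
  auto it = offsets.find(term_id);
  if (it == offsets.end()) return doc_ids;  // unknown term

  std::ifstream in(postings_path, std::ios::binary);
  in.seekg(it->second);  // jump straight to this term's postings

  uint32_t doc_count = 0;
  in.read(reinterpret_cast<char*>(&doc_count), sizeof(doc_count));

  doc_ids.resize(doc_count);
  in.read(reinterpret_cast<char*>(doc_ids.data()), doc_count * sizeof(uint32_t));
  return doc_ids;
}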
If you have a small use case, you may be able to pull this off with SQL by using blobs for the postings list and handling the intersection yourself when querying.
Another strategy for a very small use case is to use a term document matrix.
Possible Solution
One possible solution would be to use a positional index. It's basically an inverted index, but we augment it by adding more information. You can read more about it at Stanford NLP.
Example
Say a word "hello" appeared in docs 1 and 3, in positions (3,5,6,200) and (9,10) respectively.
Basic inverted index (note there's no way to find word frequencies nor their positions)
"hello" => [1,3]
Positional index (note we not only have frequencies for each doc, we also know exactly where the term appeared in the doc)
"hello" => [1:<3,5,6,200> , 3:<9,10>]
Heads Up
Will your index take up a lot more space now? You bet!
That's why it's a good idea to compress the index. There are multiple options to compress the postings list using gap encoding, and even more options to compress the dictionary, using general string compression algorithms.
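A compact sketch of gap encoding combined with variable-byte compression (the textbook scheme; not tied to any particular engine): store the difference between consecutive doc IDs, then emit each gap in 7-bit chunks with the high bit marking the final byte.

#include <cstdint>
#include <vector>

// Variable-byte encode one integer: 7 payload bits per byte,
// high bit set on the final byte.
static void VByteEncode(uint32_t n, std::vector<uint8_t>* out) {
  std::vector<uint8_t> bytes;
  while (true) {
    bytes.insert(bytes.begin(), static_cast<uint8_t>(n % 128));
    if (n < 128) break;
    n /= 128;
  }
  bytes.back() |= 0x80;  // mark the last byte of this number
  out->insert(out->end(), bytes.begin(), bytes.end());
}

// Compress a sorted postings list: encode gaps between doc IDs, not the IDs.
std::vector<uint8_t> CompressPostings(const std::vector<uint32_t>& doc_ids) {
  std::vector<uint8_t> out;
  uint32_t prev = 0;
  for (uint32_t id : doc_ids) {
    VByteEncode(id - prev, &out);  // small gaps -> mostly single bytes
    prev = id;
  }
  return out;
}

Decoding reverses the process: accumulate 7-bit chunks until a byte with the high bit set, then add the decoded gap to the running doc ID.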
Related Readings
Index compression
Postings file compression
Dictionary compression

Store any hash in GDBM and can I search in it?

Reading about GDBM in this book, they only give simple examples of the data structures that can be stored, e.g.
$dbm{'key'} = "value";
Background
I would like to save many small text files in the database for local use only, and use nested hashes and arrays to represent the file paths. It doesn't have to be GDBM, but it seems to be the only key/value database library for Perl.
Question
Can I store any hash in GDBM, no matter how many nested hashes and arrays it contains?
Does GDBM offer any search features, or am I left to implement my own in Perl?
DBM databases don't support arrays at all. They are essentially the same as a Perl hash, except that the value of an item can only be a simple string and may not be a number or a reference. The keys and values for each data item in a DBM database are simple byte sequences. That is, the API represents them by a char pointer and an int size.
Within that constraint you can use the database however you like, but remember that, unlike SQL databases, every key must be unique.
You could emulate nested hashes by using the data fetched by one access as a key for the next access but, bearing in mind the requirement for unique keys, that's far from ideal.
Alternatively, the value fetched could be the name of another DBM database which you could go on to query further.
A final option is to concatenate all the keys into a single value, so that
$dbm{aa}{bb}{cc}
would actually be implemented as something like
$dbm{aa_bb_cc}
Actually, you can store hashes of hashes, or lists of lists, in Perl. You use the MLDBM module from CPAN, along with the DBM of your choice.
Check out this online PDF book and go to chapter 13: https://www.perl.org/books/beginning-perl/
The complex part is figuring out how to access the various levels of references. To search you would have to run through the keys and parse the values.
