How data is stored physically in Bigtable

Let's assume a table named test:
                   cf:a        cf:b     yy:a     kk:cat
"com.cnn.news"     zubrava10   sobaka   foobar
"ch.main.users"    -           -        -        purrpurr
The first cell ("zubrava") has 10 versions (10 timestamps): "zubrava1", "zubrava2", ...
How will the data of this table be stored on disk?
I mean, is the primary index always
("row", "column_family:column", timestamp)?
So will 10 versions of the same row for 10 timestamps be stored together? How is the entire table stored?
Is a scan for all values of a given column as fast as in column-oriented models?
SELECT cf:a from test

So will 10 versions of the same row for 10 timestamps be stored together? How is the entire table stored?
Bigtable is a row-oriented database, so all data for a single row are stored together, organized by column family, and then by column. Data is stored in reversed-timestamp order, which means it's easy and fast to ask for the latest value, but hard to ask for the oldest value.
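The ordering described above can be pictured with a small sketch. It is only a model, assuming cells are sorted by row key, then column family, then column, then descending timestamp; the `Cell` tuple and the `storage_order` helper are hypothetical names used for illustration and are not part of any Bigtable client library.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def storage_order(cells):
    """Sort cells by row key, family, qualifier, then newest timestamp first."""
    return sorted(cells, key=lambda c: (c.row, c.family, c.qualifier, -c.timestamp))


# Ten versions of the first cell from the question, plus the other cells.
cells = [Cell("com.cnn.news", "cf", "a", ts, f"zubrava{ts}") for ts in range(1, 11)]
cells += [
    Cell("com.cnn.news", "cf", "b", 1, "sobaka"),
    Cell("com.cnn.news", "yy", "a", 1, "foobar"),
    Cell("ch.main.users", "kk", "cat", 1, "purrpurr"),
]

ordered = storage_order(cells)
for c in ordered:
    print(c.row, f"{c.family}:{c.qualifier}", c.timestamp, c.value)
```

In this model the ten versions of cf:a sit next to each other, with "zubrava10" (the newest) first, which is why reading the latest value is cheap and reaching the oldest one means skipping past the rest.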
Is a scan for all values of a given column as fast as in column-oriented models?
SELECT cf:a from test
No. A column-oriented storage model stores all the data for a single column together, across all rows. Thus, a full-table scan of one column in a column-oriented system (such as Google BigQuery) is faster than in a row-oriented storage system, but a row-oriented system provides row-based mutations, including atomic row-level mutations, that a column-oriented storage system typically cannot.
On top of this, Bigtable provides a sorted order of all row keys in lexicographic order; column-oriented storage systems typically make no such guarantees.
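The trade-off can be illustrated with a toy model of the two layouts. This is a sketch with made-up data and hypothetical helper names, assuming nothing about how either system is actually implemented; it only shows which data each access pattern has to touch.

```python
# The same three-row table in two layouts (toy data, names are illustrative).
rows = [
    {"key": "r1", "cf:a": "x1", "cf:b": "y1"},
    {"key": "r2", "cf:a": "x2", "cf:b": "y2"},
    {"key": "r3", "cf:a": "x3", "cf:b": "y3"},
]

# Column-oriented: one contiguous list per column.
columns = {
    "key": [r["key"] for r in rows],
    "cf:a": [r["cf:a"] for r in rows],
    "cf:b": [r["cf:b"] for r in rows],
}


def scan_column_row_store(rows, name):
    """Visits every row record and picks one field out of each."""
    return [r[name] for r in rows if name in r]


def scan_column_column_store(columns, name):
    """Reads one contiguous list and touches no other column."""
    return list(columns[name])


def read_row_row_store(rows, key):
    """A whole row is one record, so it can be read or replaced in one step."""
    return next(r for r in rows if r["key"] == key)


print(scan_column_row_store(rows, "cf:a"))
print(scan_column_column_store(columns, "cf:a"))
print(read_row_row_store(rows, "r2"))
```

Both scans return the same values; the difference is that the row-store version has to walk past every other field of every row to get them.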

How does columnfamily from BigTable in GCP relate to columns in a relational database

I am trying to migrate a table that is currently in a relational database to BigTable.
Let's assume that the table currently has the following structure:
Table: Messages
Columns:
Message_id
Message_text
Message_timestamp
How can I create a similar table in BigTable?
From what I can see in the documentation, BigTable uses ColumnFamily. Is ColumnFamily the equivalent of a column in a relational database?

BigTable is different from a relational database system in many ways.
Regarding database structures, BigTable should be considered a wide-column, NoSQL database.
Basically, every record is represented by a row and for this row you have the ability to provide an arbitrary number of name-value pairs.
This row has the following characteristics.
Row keys
Every row is uniquely identified by a row key. It is similar to a primary key in a relational database. Rows are stored in lexicographic order of this key, and it is the only information that will be indexed in a table.
In the construction of this key you can choose a single field or combine several, separated by # or any other delimiter.
The construction of this key is the most important aspect to take into account when designing your tables. You must think about how you will query the information. Among others, keep several things in mind (always remember the lexicographic order):
Define prefixes by concatenating fields that allow you to fetch information efficiently. BigTable allows you to scan rows whose keys start with a certain prefix.
Relatedly, model your key in a way that allows you to store common information (think, for example, of all the messages that come from a certain origin) together, so it can be fetched more efficiently.
At the same time, define keys in a way that maximizes dispersion and balances load across the different nodes in your BigTable cluster.
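To see why prefixes matter, here is a minimal sketch of a prefix scan over lexicographically sorted keys. It models the idea with a plain Python list and `bisect`; the sample keys and the `prefix_scan` helper are assumptions for illustration and do not use the BigTable client API.

```python
import bisect


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix from a lexicographically sorted list."""
    start = bisect.bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


keys = sorted([
    "MT43#1330516800#1242635",
    "MT43#1330518600#1242700",
    "MT44#1330516800#1242636",
    "AB10#1330516800#1242637",
])

print(prefix_scan(keys, "MT43#"))
```

Because the keys are sorted, all rows sharing a prefix are contiguous, so the scan jumps to the first match and stops at the first non-match instead of reading the whole table.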
Column families
The information associated with a row is organized in column families. They have no direct counterpart in a relational database.
A column family allows you to group several related fields (columns).
You need to define the column families beforehand.
Columns
A column will store the actual values. It is similar in a certain sense to a column in a relational database.
You can have different columns for different rows. BigTable stores the information sparsely: if you do not provide a value for a column in a row, it consumes no space.
BigTable is a three-dimensional database: for every cell, in addition to the actual value, a timestamp is stored as well.
In your use case, you can model your table like this (assuming, for example, that you are able to identify the origin of the message as well, and that it is valuable information):
Row key = message_origin#message_timestamp#message_id (message_timestamp truncated to the half hour, hour...; see note 1)
Column family = message_details
Columns = message_text, message_timestamp
This will generate row keys like the following, assuming for example that the message was sent from a device with id MT43:
MT43#1330516800#1242635
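A key like this could be assembled as in the following sketch. The helper names `truncate_to_bucket` and `make_row_key` are hypothetical, and the half-hour bucket (1800 seconds) is just one of the truncation choices mentioned above.

```python
def truncate_to_bucket(unix_seconds, bucket_seconds=1800):
    """Round a Unix timestamp down to the start of its bucket (default: half hour)."""
    return unix_seconds - (unix_seconds % bucket_seconds)


def make_row_key(origin, unix_seconds, message_id, bucket_seconds=1800):
    """Join origin, truncated timestamp and message id with '#'."""
    bucket = truncate_to_bucket(unix_seconds, bucket_seconds)
    return f"{origin}#{bucket}#{message_id}"


print(make_row_key("MT43", 1330517000, 1242635))  # MT43#1330516800#1242635
```

Note that decimal timestamps only sort correctly as strings while they have the same number of digits; zero-padding them to a fixed width is a common way to keep lexicographic and numeric order aligned.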
Please, as @norbjd suggested, see the relevant documentation for an in-depth explanation of these concepts.
One important difference from a relational database to note: BigTable only offers atomic single-row transactions, and only when using single-cluster routing.
1 See, for instance: How to round unix timestamp up and down to nearest half hour?

LIFO / sorted database design pattern

I want to store data (as an archive) in two separate lists: one is to be a sort of LIFO stack where new data just gets pushed on top, and the other is sorted by a temporally independent value. Data may be retrieved at a later point in time, but I'm generally only interested in the topmost N values. Both lists can get very long but contain very simple values (document ids with priority). Is there a database to implement this pattern efficiently? I hear HBase does sorted storage; would it be useful for this kind of application?
At least the LIFO storage could be implemented as a plain file. Is this wise?
Or is this concern about retrieval speed premature optimization, i.e. are there commands in SQL with which I can retrieve the first N by time of insertion or sorted by a value? Or should I shard / paginate?

Rows or "tuples" if you like, are specifically not ordered in a relational database. It is considered an implementation detail. Of course, we often need to impose an order of the rows anyway, but we have to do it when we query the data, not when we store it.
I have no knowledge of HBase, but I noticed it was free, so if you can consider MySQL an alternative, here is one way to do what you want.
Create an InnoDB table with an auto-incrementing primary key. InnoDB tables are clustered on the primary key, meaning that the rows are stored sorted by the key. Since you use an auto-incrementing key, newer rows will always have higher values, and rows added in sequence will be stored "near" each other. Those properties make for fast retrieval of the X newest or oldest rows, since they will likely be co-located on the same data pages (which reduces I/O).
It would be something like this:
-- placeholder columns; backticks guard names that may clash with keywords
create table mytab(
 id int not null auto_increment
,the int
,rest varchar(255)
,`of` char(1)
,your tinyint
,`columns` varchar(255)
,primary key(id)
)Engine=InnoDB;
To get the 10 latest rows added, you would query it like:
select *
from mytab
order by id desc
limit 10;
Note that even if you are deleting the rows, the ID will keep on increasing. So if the MAX(id) is 5000, it doesn't mean you have 5000 rows.
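The same pattern can be tried out without a MySQL server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so it is not InnoDB and says nothing about its page layout; it only demonstrates the "newest N by id" query and the fact that ids keep growing after deletes. The `payload` column and the `newest` helper are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT makes SQLite never reuse ids, similar to the behaviour described above.
conn.execute(
    "create table mytab (id integer primary key autoincrement, payload text)"
)
conn.executemany(
    "insert into mytab (payload) values (?)",
    [(f"doc-{n}",) for n in range(1, 21)],
)

# Deleting rows does not reset the counter.
conn.execute("delete from mytab where id > 15")
conn.execute("insert into mytab (payload) values ('doc-21')")


def newest(conn, n):
    """Return the n most recently inserted rows, newest first."""
    return conn.execute(
        "select id, payload from mytab order by id desc limit ?", (n,)
    ).fetchall()


print(newest(conn, 3))
row_count, max_id = conn.execute("select count(*), max(id) from mytab").fetchone()
print(row_count, max_id)
```

Here the table ends up with 16 rows but a maximum id of 21, which is the gap the note above warns about.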

When should I use Oracle's Index Organized Table? Or, when shouldn't I?

Index Organized Tables (IOTs) are tables stored in an index structure. Whereas a table stored in a heap is unorganized, data in an IOT is stored and sorted by primary key (the data is the index). IOTs behave just like “regular” tables, and you use the same SQL to access them.
Every table in a proper relational database is supposed to have a primary key... If every table in my database has a primary key, should I always use an index organized table?
I'm guessing the answer is no, so when is an index organized table not the best choice?

Basically an index-organized table is an index without a table. There is a table object which we can find in USER_TABLES but it is just a reference to the underlying index. The index structure matches the table's projection. So if you have a table whose columns consist of the primary key and at most one other column then you have a possible candidate for INDEX ORGANIZED.
The main use case for index organized table is a table which is almost always accessed by its primary key and we always want to retrieve all its columns. In practice, index organized tables are most likely to be reference data, code look-up affairs. Application tables are almost always heap organized.
The syntax allows an IOT to have more than one non-key column. Sometimes this is correct. But it is also an indication that maybe we need to reconsider our design decisions. Certainly if we find ourselves contemplating the need for additional indexes on the non-primary key columns then we're probably better off with a regular heap table. So, as most tables probably need additional indexes most tables are not suitable for IOTs.
Coming back to this answer I see a couple of other responses in this thread propose intersection tables as suitable candidates for IOTs. This seems reasonable, because it is common for intersection tables to have a projection which matches the candidate key: STUDENTS_CLASSES could have a projection of just (STUDENT_ID, CLASS_ID).
I don't think this is cast-iron. Intersection tables often have a technical key (i.e. STUDENT_CLASS_ID). They may also have non-key columns (metadata columns like START_DATE, END_DATE are common). Also, there is no prevailing access path: we want to find all the students who take a class as often as we want to find all the classes a student is taking, so we need an indexing strategy which supports both equally well. I'm not saying intersection tables are not a use case for IOTs, just that they are not automatically so.

I'd consider them for very narrow tables (such as the join tables used to resolve many-to-many relationships). If (virtually) all the columns in the table are going to be in an index anyway, then why shouldn't you use an IOT?
Small tables can be good candidates for IOTs as discussed by Richard Foote here

I consider the following kinds of tables excellent candidates for IOTs:
"small" "lookup" type tables (e.g. queried frequently, updated infrequently, fits in a relatively small number of blocks)
any table that you already are going to have an index that covers all the columns anyway (i.e. may as well save the space used by the table if the index duplicates 100% of the data)

From the Oracle Concepts guide:
Index-organized tables are useful when related pieces of data must be stored together or data must be physically stored in a specific order. This type of table is often used for information retrieval, spatial (see "Overview of Oracle Spatial"), and OLAP applications (see "OLAP").
This question from AskTom may also be of some interest, especially where someone gives a scenario and then asks whether an IOT would perform better than a heap-organised table. Tom's response is:
we can hypothesize all day long, but until you measure it, you'll never know for sure.

An index-organized table is generally a good choice if you only access data from that table by the key, the whole key, and nothing but the key.
Further, there are many limitations about what other database features can and cannot be used with index-organized tables -- I recall that in at least one version one could not use logical standby databases with index-organized tables. An index-organized table is not a good choice if it prevents you from using other functionality.

All an IOT really saves is the logical read(s) on the table segment, and as you might have spent two or three or more on the IOT/index this is not always a great saving except for small data sets.
Another feature to consider for speeding up lookups, particularly on larger tables, is a single-table hash cluster. When correctly created, they are more efficient for large data sets than an IOT because they require only one logical read to find the data, whereas an IOT is still an index that needs multiple logical I/Os to locate the leaf node.

I can't comment on IOTs per se; however, if I'm reading this right, they're the same as a 'clustered index' in SQL Server. Typically you should think about not using such an index if your primary key (or the value(s) you're indexing, if it's not a primary key) is likely to be distributed fairly randomly, as these inserts can result in many page splits (expensive).
Indexes such as identity columns (sequences in Oracle?) and dates 'around the current date' tend to make for good candidates for such indexes.

An Index-Organized Table--in contrast to an ordinary table--has its own way of structuring, storing, and indexing data.
Index organized tables (IOT) are indexes which actually hold the data which is being indexed, unlike the indexes which are stored somewhere else and have links to actual data.

Table clusters in SQLServer

In Oracle, a table cluster is a group of tables that share common columns and store related data in the same blocks. When tables are clustered, a single data block can contain rows from multiple tables. For example, a block can store rows from both the employees and departments tables rather than from only a single table:
http://download.oracle.com/docs/cd/E11882_01/server.112/e10713/tablecls.htm#i25478
Can this be done in SQLServer?

On the one hand, this sounds very much like views. Data is stored in the table, and the views provide access to only those columns within the table specified by the view's definition. (Thus, your "common columns".)
On the other hand, this sounds like how the database engine stores data on the hard drive. In SQL Server, this is done via 8 KB pages. Assuming two completely separate table definitions, there is no way to store data from two such distinct tables in the same page. (If an Oracle block is more along the lines of OS files, then that turns into SQL Server Files and File Groups, at which point the answer is "yes"... but I suspect this is not what blocks are about.)

Not based on what I am reading here. In SQL Server, each table's pages are independent of other tables' pages.
On the other hand, each table can have a choice of clustered index, which can influence performance greatly. In addition, I believe partitions will influence the execution plan, and if both tables have similar partition functions this might boost performance, but the normal objective of partitioning is not performance.
Typically, optimization of JOINs involves index strategies (in my experience, preferably with covering non-clustered indexes).

Database optimization: Hashing all the values

Typically, the databases are designed as below to allow multiple types for an entity.
Entity Name
Type
Additional info
In a bank database, for example, the entity name could be something like an account number and the type could be savings, current, etc.
Mostly, the type will be some kind of string. There could be additional information associated with an entity type.
Normally queries will be posed like this.
Find account numbers of this particular type?
Find account numbers of type X, having balance greater than 1 million?
To answer these queries, the query analyzer will scan the index if an index is associated with the particular column. Otherwise, it will do a full scan of all the rows.
I am thinking about the optimization below.
Why don't we store a hash or integral value of each column's data in the actual table, such that the ordering property is maintained and comparison is easy?
It has the following advantages.
1. The table size will be a lot smaller because we will be storing small values for each column's data.
2. We can construct a clustered B+ tree index on the hash values for each column to retrieve the corresponding rows matching, greater than, or smaller than some value.
3. The corresponding values can be easily retrieved by having the B+ tree index in main memory and retrieving the corresponding values.
4. Infrequent values will never need to be retrieved.
I am still having more optimizations in my mind. I will post those based on the feedback to this question.
I am not sure if this is already implemented in database, this is just a thought.
Thank you for reading this.
-- Bala
Update:
I am not trying to emulate what the database does. Normally indexes are created by the database administrator. I am trying to propose a physical schema with indexes on all the fields in the database, so that the database table size is reduced and it's easy to answer a few queries.
Updates:(Joe's answer)
How does adding indexes to every field reduce the size of the database? You still have to store all of the true values in addition to the hash; we don't just want to query for existence but want to return the actual data.
In a typical table, all the physical data will be stored. But now, by generating a hash value for each column's data, I am only storing the hash value in the actual table. I agree that it's not reducing the size of the database, but it's reducing the size of the table. It will be useful when you don't need to return all the column values.
Most RDBMSes answer most queries efficiently now (especially with key indices in place). I'm having a hard time formulating scenarios where your database would be more efficient and save space.
There can be only one clustered index on a table, and all other indexes have to be nonclustered indexes. With my approach I will have a clustered index on all the values of the database. It will improve query performance.
Putting indexes within the physical data -- that doesn't really make sense. The key to indexes' performance is that each index is stored in sorted order. How do you propose doing that across any possible field if they are only stored once in their physical layout? Ultimately, the actual rows have to be sorted by something (in SQL Server, for example, this is the clustered index)?
The basic idea is that instead of creating a separate table for each column for efficient access, we are doing it at the physical level.
Now the table will look like this.
Row1 - OrderedHash(Column1),OrderedHash(Column2),OrderedHash(Column3)

Google for "hash index". For example, in SQL Server such an index is created and queried using the CHECKSUM function.
This is mainly useful when you need to index a column which contains long values, e.g. varchars which are on average more than 100 characters or something like that.

How does adding indexes to every field reduce the size of the database? You still have to store all of the true values in addition to the hash; we don't just want to query for existence but want to return the actual data.
Most RDBMSes answer most queries efficiently now (especially with key indices in place). I'm having a hard time formulating scenarios where your database would be more efficient and save space.
Putting indexes within the physical data -- that doesn't really make sense. The key to indexes' performance is that each index is stored in sorted order. How do you propose doing that across any possible field if they are only stored once in their physical layout? Ultimately, the actual rows have to be sorted by something (in SQL Server, for example, this is the clustered index)?

I don't think your approach is very helpful.
Hash values only help with equality/inequality comparisons, not with less than/greater than comparisons, which pretty much every ordinary database index supports.
Even for (in)equality, hash functions do not offer a 100% guarantee of giving you the right answer, as hash collisions can happen, so you will still have to fetch and compare the original value - boom, you just lost what you wanted to save.
You can have the rows in a table ordered only one way at a time. So if you have an application where you have to order rows differently in different queries (e.g. query A needs a list of customers ordered by their name, query B needs a list of customers ordered by their sales volume), one of those queries will have to access the table out of order.
If you don't want the database to have to work around columns you do not use in a query, then use indexes with extra data columns - if your query is ordered according to that index, and your query only uses columns that are in the index (columns the index is based on plus columns you have explicitly added to the index), the DBMS will not read the original table.
Etc.
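The first two points above can be checked with a toy example. The hash below is deliberately tiny (sum of character codes modulo 10) so that the effects are easy to see; it is an assumption made for illustration and not a hash any database uses.

```python
def toy_hash(text):
    """Deliberately tiny hash: sum of character codes modulo 10 (illustration only)."""
    return sum(ord(ch) for ch in text) % 10


values = ["a", "b", "c", "d"]

# Range query on the real values: everything greater than "b".
by_value = [v for v in values if v > "b"]

# The same query attempted on the hashes gives a different answer.
by_hash = [v for v in values if toy_hash(v) > toy_hash("b")]

# Equality lookups by hash can return false positives, so the original
# value still has to be fetched and compared.
stored = ["a", "ad", "bc"]
candidates = [v for v in stored if toy_hash(v) == toy_hash("a")]
confirmed = [v for v in candidates if v == "a"]

print(by_value, by_hash, candidates, confirmed)
```

The range query on hashes loses "d" because its hash wraps around to 0, and the equality lookup returns three candidates for "a", only one of which survives the comparison against the original value.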
