
Versions in an HBase column family

 
Rajesh So
Ranch Hand
Posts: 149
Hi,

Cells in HBase have a concept of versions. By default, a cell can have up to 3 versions. There are methods to get, put and scan the column.

My questions are

HOW does my application benefit from versions?

In an RDBMS there is only one version per cell, whereas HBase offers multiple versions. WHY does HBase have multiple versions per cell?

Thanks,
Rajesh
 
Karthik Shiraly
Bartender
Posts: 1210
The version concept comes from Google's BigTable paper, which was the basis for implementing HBase.

Google's search spider visits websites repeatedly. Since a website may change between visits, BigTable stores multiple versions of its contents, and perhaps of the relationships between sites. That makes it easy to run a query like "get latest contents of <url>" or "get latest 2 versions of <url> and diff them".

It's like version control for data.
If a cell value can change but you need the history of changes later on - perhaps for auditing or diff'ing - use versions.

Whether it's useful to your application depends on what your application does.

For example, an editable wiki can store multiple versions of a wiki article in the same row and column. In an RDBMS, that would require multiple rows, each with a different value in a timestamp column.
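To make the wiki example concrete, here is a minimal in-memory sketch of a single versioned cell. It is a toy model in plain Java, not the HBase client API: the class name `VersionedCell`, its methods and the timestamps are all made up for illustration, and dropping old versions immediately on write is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Toy model of one cell that keeps several timestamped versions.
// Hypothetical class for illustration only; not part of HBase.
class VersionedCell {
    private final int maxVersions;
    // Sorted newest timestamp first.
    private final TreeMap<Long, String> versions =
            new TreeMap<>(Collections.reverseOrder());

    VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    // Store a value under a timestamp, then drop the oldest versions
    // beyond the configured limit (simplification: done eagerly here).
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry = oldest timestamp
        }
    }

    // Latest value, or null if nothing has been written yet.
    String latest() {
        return versions.isEmpty() ? null : versions.firstEntry().getValue();
    }

    // Up to n most recent values, newest first.
    List<String> latest(int n) {
        List<String> out = new ArrayList<>();
        for (String value : versions.values()) {
            if (out.size() == n) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell article = new VersionedCell(3);
        article.put(100L, "draft");
        article.put(200L, "first edit");
        article.put(300L, "second edit");
        article.put(400L, "third edit"); // the limit of 3 pushes out "draft"

        System.out.println(article.latest());   // newest version
        System.out.println(article.latest(2));  // two newest, ready to diff
    }
}
```

The point of the sketch is that the history lives inside one cell, keyed by timestamp, so "latest" and "latest 2" are reads on the same row and column rather than queries over several rows.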
 
abhi k tripathi
Greenhorn
Posts: 8
Hi Rajesh,

You are right, HBase does support versions in a column family, although I am not sure how many versions it supports.

In my opinion, version support is one of the key benefits of HBase.
In an RDBMS, you can keep a backup of the database for cases like failure or rollback. That consumes a lot of space, and you have to load the whole backup in order to check a single change in a column value.
With HBase, you can do this with a single call on the Get object. For example:
- to return more than one version, see Get.setMaxVersions()

You can also check the values as of a given time:
- to return versions other than the latest, see Get.setTimeRange()
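As a rough illustration of what a time-range read means, the sketch below filters the versions of one cell by timestamp using a plain `TreeMap`. It does not call HBase; the class `TimeRangeRead`, the method `read` and the sample data are hypothetical, and treating the range as half-open (minimum inclusive, maximum exclusive) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of HBase.
class TimeRangeRead {

    // Returns the values whose timestamp t satisfies minStamp <= t < maxStamp,
    // newest first.
    static List<String> read(TreeMap<Long, String> versions,
                             long minStamp, long maxStamp) {
        return new ArrayList<>(
                versions.subMap(minStamp, true, maxStamp, false)
                        .descendingMap()
                        .values());
    }

    public static void main(String[] args) {
        // Timestamp -> value for a single cell (made-up data).
        TreeMap<Long, String> price = new TreeMap<>();
        price.put(100L, "9.99");
        price.put(200L, "10.49");
        price.put(300L, "10.99");

        // Versions written at or after 100 and before 300.
        System.out.println(read(price, 100L, 300L));
    }
}
```

Reading "the value as of some time" then amounts to choosing a range that ends at that time and taking the newest entry in it.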

The HBase book has a section on versions, with examples:
http://hbase.apache.org/0.94/book/versions.html

HBase is often used for analytics nowadays. Tracking how the value of a single field changes over time is an important part of analytics, and versions make that easy in HBase.
If you search on Google, you will find many more scenarios where version support is used.
 