
Which Big Data technologies does Hadoop comprise?

 
Ranch Foreman
Posts: 1791
There are several big data technologies, such as MapReduce, Apache Spark, Hive, Cassandra, and HBase. Which of these technologies come under Hadoop? Why does HBase come under Hadoop when Cassandra does not?

thanks
 
Ranch Hand
Posts: 42
HBase is built on HDFS, so technically it sits on top of existing Hadoop technologies. Cassandra does not!
 
Ranch Hand
Posts: 57
Hello
Cassandra is a write-intensive database. Its write performance is higher than that of most other NoSQL databases. Cassandra follows a peer-to-peer architecture, as opposed to the master-slave architecture of MongoDB and most RDBMSs. That means you can write to any peer and Cassandra will take care of data synchronization, which is why it's faster.

Having said that, Cassandra has some shortcomings when it comes to querying data, so data modeling is the most important part of using Cassandra well. To enable fast reads and writes, Cassandra allows you to query only by its primary key. The partition key segregates data into partitions, so Cassandra can determine from the partition key which partition holds your data. The clustering key keeps the rows within each partition in sorted order. I am not aware whether you can do custom sorting on an arbitrary field in Cassandra. Of course, you can create secondary indexes on fields other than the primary key in order to query by them, but the moment you do that you degrade performance drastically.

All this makes data modeling quite a challenge in Cassandra. Often you model the data for a certain requirement, and when a new requirement comes along later you need to change the data model again. Cassandra also has a steeper learning curve than MongoDB.
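To make the partition key / clustering key idea concrete, here is a rough Python sketch. It is only a toy in-memory model, not how Cassandra is implemented and not its driver API: the class `ToyTable`, its methods, and the sensor data are all made up for illustration. It shows why a lookup by partition key is cheap, why rows come back sorted by clustering key, and why filtering on a non-key field has to visit every partition.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """Toy model of a table keyed by (partition key, clustering key).

    Hypothetical illustration only; this is not Cassandra's storage engine.
    """

    def __init__(self):
        # partition key -> list of (clustering key, value), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        rows = self._partitions[partition_key]
        keys = [ck for ck, _ in rows]
        pos = bisect.bisect_left(keys, clustering_key)
        if pos < len(rows) and rows[pos][0] == clustering_key:
            # Same primary key: overwrite the existing row.
            rows[pos] = (clustering_key, value)
        else:
            # New row: insert at the position that keeps the list sorted.
            rows.insert(pos, (clustering_key, value))

    def query(self, partition_key, start=None, end=None):
        """Return rows of ONE partition in clustering order.

        start and end are optional inclusive bounds on the clustering key.
        """
        rows = self._partitions.get(partition_key, [])
        return [
            (ck, v)
            for ck, v in rows
            if (start is None or ck >= start) and (end is None or ck <= end)
        ]

    def scan_by_value(self, predicate):
        """Filter on a non-key field: every partition has to be visited."""
        return [
            (pk, ck, v)
            for pk, rows in self._partitions.items()
            for ck, v in rows
            if predicate(v)
        ]


table = ToyTable()
table.insert("sensor-1", "2020-01-03", 21.5)
table.insert("sensor-1", "2020-01-01", 19.0)
table.insert("sensor-2", "2020-01-02", 30.2)
table.insert("sensor-1", "2020-01-02", 20.1)

# Rows for one partition come back sorted by clustering key,
# regardless of insertion order.
print(table.query("sensor-1"))

# A query on a non-key field cannot use the partition key at all.
print(table.scan_by_value(lambda v: v > 25))
```

In this sketch `query` touches a single dictionary entry, while `scan_by_value` walks all of them. That is roughly the trade-off described above: the data model has to be designed around the queries you expect to run.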
The best tool
Apache Spark
Often used as a framework on which to build analytics tools, Spark is an open-source processing engine built for speed, ease of use, and sophisticated analytics.

Spark has a huge amount of backing, with over 750 contributors from over 200 organizations working to develop and advance it.

Companies such as Hortonworks and IBM have been busy integrating Spark capabilities into their big data platforms, and it could be set to become the default analytics power for Hadoop.


I hope this helps.
 
Monica Shiralkar
Ranch Foreman
Posts: 1791

Ishan Shah wrote: and it could be set to become the default analytics power for Hadoop



Thanks
However, Cassandra and Spark are not part of the Hadoop ecosystem.

What do you mean by "(Spark) can become default analytics power for Hadoop"?
 