Difficulty ?

 
Ranch Hand
Posts: 531
2
Hi,
I don't actually work with Big Data but have a feeling that it is something that will be coming my way at some point.
How difficult is it to learn, and become competent in, Hadoop compared with other NoSQL technologies?

Thanks,
Paul
 
Author
Posts: 21
Hi,

There is definitely a learning curve with Hadoop, and it is probably steeper than for most NoSQL systems, which are closer to the real-time systems we are all accustomed to working with (such as relational databases). The additional learning time mostly goes into installation, management, and understanding MapReduce as a framework and programming model.

Having said that, I would argue that it's worthwhile understanding the Hadoop fundamentals. Even if you don't end up using the technology, it will help you understand the MapReduce concepts, which some NoSQL solutions also use internally. Hadoop's emphasis on data locality is also a valuable lesson that we should all be aware of as general good practice in distributed system design, and it helps reinforce our own architectural and design decisions.
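
To give a feel for the MapReduce programming model, here is a minimal word-count sketch in plain Java. It does not use the Hadoop API at all; the class and method names (WordCountSketch, map, shuffle, reduce, wordCount) are made up for illustration, and everything runs in a single JVM. It only shows the three conceptual phases: map emits key/value pairs, shuffle groups them by key, and reduce collapses each group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a single-process imitation of the MapReduce phases,
// not Hadoop's Mapper/Reducer API.
public class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    public static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single count.
    public static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: runs map over every line, then shuffle, then reduce per key.
    public static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String line : lines) {
            emitted.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(emitted).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("big data is big", "data locality")));
    }
}
```

In a real cluster the map and reduce steps run on many machines and the shuffle moves data between them, which is where most of the operational learning curve comes from.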

Thanks,
Alex
 
paul nisset
Ranch Hand
Posts: 531
2
Thanks Alex.
 
Ranch Hand
Posts: 119

Alex, can you please elaborate on what you mean by "Hadoop's emphasis on data locality is also a valuable lesson"?

Regards,
Mohamed
 
Alex Holmes
Author
Posts: 21
Mohamed,

In distributed computing it is strongly preferable to read data from local disk rather than over the network. This is known as data locality, and it is one of the key aspects of Hadoop. When MapReduce pushes work to the slave nodes, it does so in a way that favors reads from local disk over reads from the network.
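
As a rough illustration of the idea (not Hadoop's actual scheduler), the sketch below picks a node for a task that reads one block. The class name LocalityAwareAssigner, the method pickNode, and the node names are all hypothetical. It assumes we know which hosts hold a replica of the block and which nodes currently have a free task slot, and it prefers a node that already has the data on its own disk.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative only: a toy locality-aware task assignment, not Hadoop code.
public class LocalityAwareAssigner {

    // Returns a node to run the task on, or empty if no node is free.
    // replicaHosts: nodes that store a copy of the input block on local disk.
    // freeNodes:    nodes that can accept a task right now, in preference order.
    public static Optional<String> pickNode(Set<String> replicaHosts, List<String> freeNodes) {
        // First choice: a free node that already holds the block (local disk read).
        for (String node : freeNodes) {
            if (replicaHosts.contains(node)) {
                return Optional.of(node);
            }
        }
        // Fallback: any free node, which will have to read the block over the network.
        return freeNodes.stream().findFirst();
    }

    public static void main(String[] args) {
        Set<String> replicas = Set.of("node2", "node3");
        System.out.println(pickNode(replicas, List.of("node1", "node3"))); // local read on node3
        System.out.println(pickNode(replicas, List.of("node1", "node4"))); // remote read from node1
    }
}
```

The point of the design is simply that the computation is moved to where the data already lives whenever possible, and the network is used only as a fallback.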

Thanks,
Alex
 
Mohamed El-Refaey
Ranch Hand
Posts: 119

Thanks, Alex, for the clarification! Much appreciated.
 