When working on AWS, is HDFS useless since you already have S3?

 
Monica Shiralkar
Ranch Foreman
When working with AWS, we can use S3 (Simple Storage Service). Is HDFS useless in this case because S3 can be used instead, or is there anything HDFS provides that S3 cannot? Thanks.
 
Tim Moores
Saloon Keeper
In engineering, trade-offs between A and B are rarely black-and-white. A quick search for "hadoop hdfs vs s3" finds https://www.xplenty.com/blog/storing-apache-hadoop-data-cloud-hdfs-vs-s3/, which should make for interesting reading.
 
Monica Shiralkar
Ranch Foreman
Yes, that's right. Thanks. What I meant is that anyone already using AWS automatically has access to S3. So one would install Hadoop to get HDFS only if one needs something that S3 cannot provide.
 
Monica Shiralkar
Ranch Foreman
One reason I can see to still use HDFS while working on AWS is that one may be using some component of the Hadoop ecosystem, such as Hive, which would internally require HDFS.
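
As background for that point: Hadoop-ecosystem tools usually address storage through path URIs, and the URI scheme names the filesystem, for example `hdfs://` for HDFS and `s3a://` for S3 through Hadoop's S3A connector. Below is a minimal Python sketch of that idea using only the standard library. The helper `storage_backend`, the `BACKENDS` table, and the host, bucket, and path names are all hypothetical and only illustrate the scheme lookup; this is not how Hive or Hadoop resolve filesystems internally.

```python
from urllib.parse import urlparse

# Hypothetical lookup table: URI scheme -> storage backend label.
BACKENDS = {
    "hdfs": "HDFS",
    "s3a": "Amazon S3 (Hadoop S3A connector)",
}


def storage_backend(path: str) -> str:
    """Return a label for the storage backend implied by the URI scheme."""
    scheme = urlparse(path).scheme
    return BACKENDS.get(scheme, "unknown")


# Example paths; the host and bucket names are made up for illustration.
print(storage_backend("hdfs://namenode:8020/warehouse/sales"))
print(storage_backend("s3a://example-bucket/warehouse/sales"))
```

Whether a given component can work against an `s3a://` path instead of an `hdfs://` one depends on that component and its configuration, so it is worth checking its documentation before assuming HDFS is mandatory.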
 
Monica Shiralkar
Ranch Foreman

Tim Moores wrote: https://www.xplenty.com/blog/storing-apache-hadoop-data-cloud-hdfs-vs-s3/, which should make for interesting reading.



Thanks. I read that although S3 has advantages over HDFS in most areas, such as elasticity, the one area where HDFS seems to have an advantage is performance.