How to test Hadoop job performance

 
Alberto Fanini
Greenhorn
Posts: 1
Hi all,
I've implemented a frequent itemset MapReduce algorithm based on SON for Apache Hadoop. Now I need to test its performance, i.e. study how its execution time varies across different datasets, and compare different versions of the algorithm in order to choose the best one.

So I ran several jobs on a 6-machine cluster and noticed that the execution time varies significantly even with the same dataset and the same algorithm version. I have come to the conclusion that in this type of environment the execution time is unpredictable because the requested data may or may not be available on the machine where the computation runs.
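
To make "varies significantly" concrete, here is a minimal sketch of how the spread across repeated runs of one dataset and one algorithm version could be summarized. The function name `summarize_runs` and the timings are hypothetical placeholders, not measurements from the cluster.

```python
import statistics


def summarize_runs(seconds):
    """Summarize repeated wall-clock times (seconds) of one job configuration."""
    if len(seconds) < 2:
        raise ValueError("need at least two runs to estimate spread")
    mean = statistics.mean(seconds)
    stdev = statistics.stdev(seconds)  # sample standard deviation
    return {
        "runs": len(seconds),
        "median": statistics.median(seconds),
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean,  # coefficient of variation: spread relative to the mean
    }


# Placeholder timings, made up for illustration only (not real measurements).
example = [100.0, 120.0, 110.0, 130.0, 90.0]
summary = summarize_runs(example)
print(summary)
```

A high coefficient of variation for a fixed dataset and version would suggest that the gap between two versions has to be larger than this noise before it means anything.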

How can I run this type of test in a reliable way?

Thank you
 