
Hadoop Single Node Setup

 
Lopez Mirinda
Greenhorn
Posts: 11
My boss has asked me to set up single-node Hadoop on Windows, and said that we would be using it to build a proof of concept (POC) on how to use Big Data in this setup.
So, when I say single-node Hadoop, is that the same as a single-node Hadoop cluster?

Also, I have been browsing and found the following links.
http://www.cs.brandeis.edu/~rshaull/cs147a-fall-2008/hadoop-windows/
This link talks about using Cygwin, Java and any stable version of Hadoop. This seems rather simple and easy.

http://developer.yahoo.com/hadoop/tutorial/module3.html#vm
This one is more confusing: it says we need VMware Player, a copy of the Hadoop virtual machine, and so on.

Which one should I follow for my POC?
 
chris webster
Bartender
Posts: 2407
Linux Oracle Postgres Database Python Scala
I haven't used Hadoop, but I understand that it relies on Unix shell commands. So if you are on a Windows machine, you need either a Unix shell for Windows (which is what Cygwin provides) or a Linux virtual machine running inside Windows (which is what VMware provides). Alternatively, you can install Linux to dual-boot on your current machine before installing Hadoop, or find a machine that already has Linux on it and install Hadoop there instead.

I don't use Cygwin, so I don't know how you would install/run Hadoop with Cygwin, but I'm guessing you would open a Cygwin shell and then use Unix-style commands as indicated in the Hadoop instructions.

VirtualBox is an easy, free alternative to VMware: you can install VirtualBox on a Windows PC, then install a Linux VM to run inside VirtualBox, e.g. here is a quick guide to installing Ubuntu Linux on VirtualBox. Once you've got Linux running inside VirtualBox, I guess you would log into your Linux VM and then follow the instructions for installing/running Hadoop on Linux.

If you're using Ubuntu, remember you may need to use "sudo" for running some commands that require extra permissions, e.g. "sudo apt-get install ssh". Also, running a VM inside Windows takes extra memory, so make sure you have enough RAM to run both operating systems at the same time. If you dual-boot Linux instead, this is not such a problem, as you are running either Windows or Linux but not both at the same time.
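
Whichever route you pick (Cygwin shell or Linux VM), it may help to confirm that the command-line tools mentioned in this thread are actually visible on your PATH before starting a Hadoop guide. The following is only a minimal sketch: the tool list (`java`, `ssh`) is an assumption based on what the guides above mention, and `find_tools` is a hypothetical helper name, not part of Hadoop.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "java" and "ssh" are assumed prerequisites; adjust to match your guide.
    for tool, path in find_tools(["java", "ssh"]).items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

Run it from the same shell you plan to use for Hadoop (the Cygwin terminal or the Linux VM), since each shell can have a different PATH.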

FWIW, it may be slightly more work to install Linux (as a VM or dual boot) on your Windows PC, but once you've done this it's often much easier to get/install/use the Linux versions of many open source tools than the equivalent Windows versions.
 
Lopez Mirinda
Greenhorn
Posts: 11
Thanks for the detailed info, Chris.

I tried using Cygwin for Hadoop, but as far as I can see there are quite a few issues in setting it up that way.
So I decided to try the VMware route instead, and was partly successful. Though this is a good way to learn MapReduce, I would definitely still like to try installing Hadoop using Cygwin as well.
 
Pablo Abbate
Ranch Hand
Posts: 30
Java Spring Ubuntu
Although this is for Linux, it could be useful.

Cloudera Hadoop
 
MadhaviLatha Polanki
Greenhorn
Posts: 4
Hi Lopez, could you please let me know whether you have installed Hadoop using Cygwin successfully? I got a few errors while formatting the name node in Hadoop. Could you please help me resolve the issue? Thanks in advance.
 
Lopez Mirinda
Greenhorn
Posts: 11
Madhavi,
please post the error and the Hadoop version you used. That would be more helpful for debugging.
 
MadhaviLatha Polanki
Greenhorn
Posts: 4
Hi Lopez, I will provide you with a screenshot of the error soon. Thank you.
 
MadhaviLatha Polanki
Greenhorn
Posts: 4
Hi Lopez. Sometimes I get an error when starting the SSHD service. Sometimes I get an error when connecting to localhost, and the error message for this is "Connection refused". Now, while extracting the Hadoop jar file to the folder "C:\cygwin\user\local", I got an error saying I do not have permission to do this. Could you please help me understand what the reason might be? I have attached a screenshot of this error.
[Attachment: hadooperror.png]
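
A "Connection refused" message on localhost usually means nothing is accepting connections on the port being tried, which would fit with the SSHD service not having started. One way to check this independently of Hadoop is a small TCP probe like the sketch below. It assumes SSH is on its default port 22 (a customised sshd configuration may use another), and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 22 is the default SSH port; this is an assumption about your setup.
    if port_is_open("localhost", 22):
        print("Something is accepting connections on localhost:22")
    else:
        print("Nothing is accepting connections on localhost:22; is sshd running?")
```

If the probe fails, the problem is most likely with the SSH service itself rather than with Hadoop, so it is worth getting a plain `ssh localhost` working first.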
 
Lopez Mirinda
Greenhorn
Posts: 11
Try running the Cygwin terminal as administrator (right-click the terminal shortcut icon and choose "Run as administrator"), and then try to untar the Hadoop distribution from the terminal.
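
If extracting from the terminal still fails, it can be useful to separate "the target folder is not writable" from "the archive itself is the problem". The sketch below does that in Python under some assumptions: the distribution is a tar archive (plain or compressed), it comes from a source you trust, and `extract_archive` is a hypothetical helper name. Note that `os.access` is only a rough hint on Windows, so a passing check does not guarantee that the extraction will succeed.

```python
import os
import tarfile


def extract_archive(archive_path, target_dir):
    """Extract a tar archive into target_dir after a basic write-access check."""
    if not os.path.isdir(target_dir):
        raise NotADirectoryError(f"{target_dir} is not an existing directory")
    if not os.access(target_dir, os.W_OK):
        raise PermissionError(f"no write permission on {target_dir}")
    # tarfile.open detects gzip/bzip2 compression automatically in read mode.
    with tarfile.open(archive_path) as archive:
        archive.extractall(target_dir)
    # Return the top-level entries now present in the target directory.
    return sorted(os.listdir(target_dir))
```

For example, `extract_archive("hadoop.tar.gz", "/usr/local")` would either list what was extracted or raise an error naming the folder that is not writable; both the file name and the target path here are placeholders, not values taken from this thread.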
 
MadhaviLatha Polanki
Greenhorn
Posts: 4
The problem is that I am already running Cygwin as an administrator user, but I am still encountering these problems.
 