Wednesday, December 15, 2010

Set Up Hadoop with Clonezilla

This is a log of how I installed Hadoop on Ubuntu, using Clonezilla to quickly install it on many computers.





1. Install Ubuntu on one computer.

If you want to clone the system to computers with different hardware, do the install on the computer that has the smallest disk, so that the image fits on all of the others.  For example, mine is 40GB.

I partitioned the disk as follows:

128MB as FAT32, for booting usage.
2GB   as SWAP
15GB  as root
Rest  as /home (logical partition)

I made /home a logical partition so that I can resize it later, since a new computer may have more disk space.
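As a rough check of the layout above, the space left for /home can be worked out from the disk size. This is only a sketch: the function name home_size_mb is made up for illustration, and it assumes 1GB = 1024MB, while a real partitioning tool may round to different boundaries.

```shell
# Sketch: space left for /home, in MB, after boot, swap and root.
# Assumption: 1GB = 1024MB; the partitioner may round differently.
home_size_mb() {
    disk_gb=$1
    boot_mb=128                 # FAT32 boot partition
    swap_mb=$((2 * 1024))       # 2GB swap
    root_mb=$((15 * 1024))      # 15GB root
    echo $(( disk_gb * 1024 - boot_mb - swap_mb - root_mb ))
}

home_size_mb 40    # my 40GB disk: prints 23424
home_size_mb 80    # a larger target disk: prints 64384
```

On a bigger disk only the last number grows, which is why /home is the partition to resize after cloning.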

2. Install ssh, Java, and rsync with the following commands:

sudo su;
echo "deb http://archive.canonical.com/ubuntu lucid partner" >> /etc/apt/sources.list;
sudo apt-get update; sudo apt-get install sun-java6-jdk sun-java6-plugin ssh openssh-server rsync;
exit;
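Before saving the image, it may be worth confirming that the packages really ended up on the PATH. A minimal sketch, where missing_tools is a hypothetical helper name and the list of commands is my guess at what the packages above provide:

```shell
# Print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed command names for the packages installed in this step.
missing=$(missing_tools java ssh rsync)
if [ -n "$missing" ]; then
    echo "Missing:" $missing
else
    echo "All tools found"
fi
```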

3. Disable IPv6 with the following commands:

sudo su;
echo "# disable IPv6" >> /etc/modprobe.d/blacklist;
echo "blacklist ipv6" >> /etc/modprobe.d/blacklist;
exit;


Use cat /etc/modprobe.d/blacklist; to check that the lines were added correctly.
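Note that running echo with >> a second time appends the same lines again. A small sketch of a helper that only appends a line when it is not already in the file; append_once is a hypothetical name, and the demo writes to a temporary file rather than to /etc/modprobe.d/blacklist:

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demo on a temporary file; on the real system the target would be
# /etc/modprobe.d/blacklist and the calls would run as root.
demo=$(mktemp)
append_once "# disable IPv6" "$demo"
append_once "blacklist ipv6" "$demo"
append_once "blacklist ipv6" "$demo"   # repeated call adds nothing
cat "$demo"
rm -f "$demo"
```

The same helper would also fit the sources.list line in step 2, which has the same duplication problem if the command is run twice.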


4. Download Hadoop


Go to the download page and download the latest stable version.  I used the following:

wget http://ftp.twaren.net/Unix/Web/apache//hadoop/core/stable/hadoop-0.20.2.tar.gz;

Extract the file.

tar xzf hadoop-0.20.2.tar.gz;
mv hadoop-0.20.2 hadoop;
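The mv above depends on the directory name inside the tarball, which changes with the version. A sketch of how to check that name before extracting; top_level is a hypothetical helper, and the demo builds a tiny throwaway archive (demo-1.0, an invented name) so that nothing has to be downloaded:

```shell
# Print the distinct top-level names inside a gzip-compressed tarball.
top_level() {
    tar tzf "$1" | cut -d/ -f1 | sort -u
}

# Demo archive; on the real system the argument would be hadoop-0.20.2.tar.gz.
work=$(mktemp -d)
mkdir -p "$work/demo-1.0/conf"
echo "placeholder" > "$work/demo-1.0/conf/env.sh"
tar czf "$work/demo-1.0.tar.gz" -C "$work" demo-1.0
rm -rf "$work/demo-1.0"

top_level "$work/demo-1.0.tar.gz"            # the directory tar will create
tar xzf "$work/demo-1.0.tar.gz" -C "$work"   # extract
mv "$work/demo-1.0" "$work/demo"             # rename to a version-free name
ls "$work/demo/conf"
rm -rf "$work"
```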


Then, edit the configuration.

5. Edit config files

5.1 Add JAVA_HOME

vi ~/hadoop/conf/hadoop-env.sh;

Change

# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.6-sun

To

# The java implementation to use.  Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
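The same change can probably be made without opening an editor. A minimal sketch using sed, where set_java_home is a hypothetical helper name; it assumes the file still contains the commented-out export line shown above and that the path contains no | or & characters. The demo runs on a temporary copy of those two lines, not on the real hadoop-env.sh:

```shell
# Replace the commented-out JAVA_HOME line with an active export.
set_java_home() {
    file=$1
    java_home=$2
    sed "s|^# *export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Demo on a temporary file holding the original two lines.
demo=$(mktemp)
printf '%s\n' \
    '# The java implementation to use.  Required.' \
    '# export JAVA_HOME=/usr/lib/j2sdk1.6-sun' > "$demo"
set_java_home "$demo" /usr/lib/jvm/java-6-sun
grep JAVA_HOME "$demo"    # prints: export JAVA_HOME=/usr/lib/jvm/java-6-sun
rm -f "$demo"
```

On the real system the first argument would be ~/hadoop/conf/hadoop-env.sh.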


6. Use Clonezilla

Yes, use Clonezilla to save an image and copy it to the other computers.  If the image is of a clean Ubuntu install, saving it should take less than ten minutes.  Copying it to another computer takes less than five minutes.