This section covers the steps for setting up the prerequisites for all bigdata-integration examples.
This section gives step-by-step procedures for installing 'Hadoop 1.2.1' on RHEL 6 and configuring a single-node setup.
$ uname -a
Linux 2.6.32-431.20.3.el6.x86_64 #1 SMP Fri Jun 6 18:30:54 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux
$ java -version
java version "1.7.0_60"
Java(TM) SE Runtime Environment (build 1.7.0_60-b19)
Java HotSpot(TM) 64-Bit Server VM (build 24.60-b09, mixed mode)
$ wget
$ tar -xvf hadoop-1.2.1.tar.gz
$ cd hadoop-1.2.1
Edit 'conf/', uncomment JAVA_HOME, and make sure it points to a valid Java home:
export JAVA_HOME=/usr/java/jdk1.7.0_60
NOTE: Hadoop 1.2.1 needs Java 1.6 or higher.
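The Java requirement can be checked from a script before starting Hadoop. The following is a minimal sketch, not part of the original setup: `java_version_ok` and `java_version_from_output` are hypothetical helper names, and the check only compares the version string against the "1.6 or higher" rule stated above.

```shell
#!/bin/sh
# Sketch: test a Java version string against the "Java 1.6 or higher" rule.
# Both function names are hypothetical helpers introduced for illustration.

java_version_ok() {
    # $1 is a version string such as "1.7.0_60".
    major=$(printf '%s' "$1" | cut -d. -f1)
    minor=$(printf '%s' "$1" | cut -d. -f2)
    case "$major" in ''|*[!0-9]*) return 1 ;; esac
    # Versions reported as "9", "11.0.2", ... are numerically above 1.6.
    if [ "$major" -gt 1 ]; then return 0; fi
    case "$minor" in ''|*[!0-9]*) return 1 ;; esac
    [ "$major" -eq 1 ] && [ "$minor" -ge 6 ]
}

java_version_from_output() {
    # Reads `java -version` output on stdin and prints the quoted
    # version found on its first line.
    sed -n '1s/.*"\(.*\)".*/\1/p'
}
```

Because `java -version` writes to stderr, a typical call would be `java_version_ok "$(java -version 2>&1 | java_version_from_output)"`.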
Edit 'conf/core-site.xml' and add the following properties:
NOTE: the property values should match your own settings.
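As an illustration only, the sketch below writes a 'core-site.xml' from a script. The `write_core_site` function is a hypothetical helper, and both the `fs.default.name` property and the `hdfs://localhost:9000` value used in the usage line are assumptions borrowed from the stock Hadoop 1.x pseudo-distributed example; they may differ from the properties this guide intends, so adjust them to your own settings.

```shell
#!/bin/sh
# Sketch: generate a minimal core-site.xml.
# write_core_site is a hypothetical helper; fs.default.name is assumed here
# as the NameNode URI property for a Hadoop 1.x single-node setup.

write_core_site() {
    # $1: path of the file to write
    # $2: NameNode URI (assumed value, e.g. hdfs://localhost:9000)
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}
```

For example, `write_core_site conf/core-site.xml hdfs://localhost:9000` would overwrite the file, so back up any existing configuration first.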
Edit 'conf/hdfs-site.xml' and add the following 2 properties:
Format a new distributed filesystem by executing:
hadoop-1.2.1/bin/hadoop namenode -format
Start all Hadoop services by executing:
$ ./bin/
NOTE: 5 Java processes, representing 5 services, are started. Execute 'jps -l' to check the Java processes:
$ jps -l
4056 org.apache.hadoop.hdfs.server.namenode.NameNode
4271 org.apache.hadoop.hdfs.server.datanode.DataNode
4483 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
4568 org.apache.hadoop.mapred.JobTracker
4796 org.apache.hadoop.mapred.TaskTracker
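The check above can be automated. The following is a rough sketch, assuming the five class names shown in the 'jps -l' listing; `missing_daemons` is a hypothetical helper that reads 'jps -l' output on stdin, prints any expected daemon class that is absent, and returns non-zero if at least one is missing.

```shell
#!/bin/sh
# Sketch: verify that all five Hadoop 1.x daemons appear in `jps -l` output.
# missing_daemons is a hypothetical helper introduced for illustration.

missing_daemons() {
    jps_output=$(cat)
    rc=0
    for class in \
        org.apache.hadoop.hdfs.server.namenode.NameNode \
        org.apache.hadoop.hdfs.server.datanode.DataNode \
        org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode \
        org.apache.hadoop.mapred.JobTracker \
        org.apache.hadoop.mapred.TaskTracker
    do
        # Compare against the whole second field so that NameNode does not
        # match SecondaryNameNode by accident.
        if ! printf '%s\n' "$jps_output" |
            awk -v c="$class" '$2 == c { found = 1 } END { exit !found }'
        then
            echo "$class"
            rc=1
        fi
    done
    return $rc
}
```

A typical call would be `jps -l | missing_daemons`, which prints nothing when all five services are up.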
Web consoles are available for viewing and monitoring the services. Web access URLs for the services:
http://localhost:50030/ for the Jobtracker
http://localhost:50070/ for the Namenode
http://localhost:50060/ for the Tasktracker
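To probe these consoles from the command line, something like the sketch below could be used. It assumes `curl` is installed; `console_url` and `check_consoles` are hypothetical helper names, and the URLs are the three listed above.

```shell
#!/bin/sh
# Sketch: map each service to its web console URL and probe it with curl.
# console_url and check_consoles are hypothetical helpers.

console_url() {
    case "$1" in
        jobtracker)  echo "http://localhost:50030/" ;;
        namenode)    echo "http://localhost:50070/" ;;
        tasktracker) echo "http://localhost:50060/" ;;
        *) return 1 ;;
    esac
}

check_consoles() {
    for svc in jobtracker namenode tasktracker; do
        url=$(console_url "$svc")
        # -s: silent, -f: fail on HTTP errors, -o: discard the body,
        # --max-time: give up after 5 seconds.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            echo "$svc: up ($url)"
        else
            echo "$svc: not reachable ($url)"
        fi
    done
}
```

Running `check_consoles` after the services have started prints one status line per console.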
Stop all Hadoop services by executing:
# bin/
This section gives step-by-step procedures for installing Apache Hive and setting up HiveServer2.
Hadoop is a prerequisite; refer to the steps above to install and start Hadoop.
$ tar -xvf apache-hive-1.2.1-bin.tar.gz
$ cd apache-hive-1.2.1-bin
Create a '' under 'conf'
$ cd conf/
$ cp
$ vim
Uncomment HADOOP_HOME and make sure it points to a valid Hadoop home, for example:
Navigate to the Hadoop home, then create '/tmp' and '/user/hive/warehouse' in HDFS and make them group-writable (chmod g+w) before running Hive:
$ ./bin/hadoop fs -mkdir /tmp
$ ./bin/hadoop fs -mkdir /user/hive/warehouse
$ ./bin/hadoop fs -chmod g+w /tmp
$ ./bin/hadoop fs -chmod g+w /user/hive/warehouse
$ ./bin/hadoop fs -chmod 777 /tmp/hive
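The directory preparation can also be wrapped in a small function. This is a sketch under assumptions: `prepare_hive_dirs` and the `HADOOP_CMD` variable are hypothetical names, the function covers only the mkdir and 'chmod g+w' steps, and it stops at the first command that fails (for instance when a directory already exists).

```shell
#!/bin/sh
# Sketch: create the HDFS directories Hive needs and make them group-writable.
# prepare_hive_dirs and HADOOP_CMD are hypothetical names; HADOOP_CMD defaults
# to ./bin/hadoop, i.e. it expects to run from the Hadoop home.

prepare_hive_dirs() {
    hadoop_cmd=${HADOOP_CMD:-./bin/hadoop}
    for dir in /tmp /user/hive/warehouse; do
        "$hadoop_cmd" fs -mkdir "$dir" || return 1
        "$hadoop_cmd" fs -chmod g+w "$dir" || return 1
    done
}
```

Setting `HADOOP_CMD=echo` before the call gives a dry run that only prints the arguments that would be passed to Hadoop.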
NOTE: The Hadoop services must be restarted afterwards, to avoid the 'Filesystem closed' error in DFSClient checkOpen.
Create a 'hive-site.xml' file under the 'conf' folder:
$ cd apache-hive-1.2.1-bin/conf/
$ touch hive-site.xml
Edit 'hive-site.xml' and add the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
NOTE: there are other optional properties; for more, refer to Setting+Up+HiveServer2
$ ./bin/hiveserver2
The following steps show how to install Apache Spark and start the Thrift JDBC/ODBC server.
$ tar -xvf spark-1.4.0-bin-hadoop2.4.tgz
$ cd spark-1.4.0-bin-hadoop2.4
$ ./sbin/