-
Setup - http://raseshmori.wordpress.com/2012/09/23/install-hadoop-2-0-1-yarn-nextgen/
-
Setup - https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation
-
Setup - http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/ClusterSetup.html
-
Subqueries - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries
-
Hive CLI - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli
-
Hive Serde - https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ApacheWeblogData
-
Hive Compressed Storage - https://cwiki.apache.org/confluence/display/Hive/CompressedStorage
-
Hive ORC & Vectorized Query Execution - https://cwiki.apache.org/confluence/display/Hive/Vectorized+Query+Execution
-
Hadoop Shell commands - http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html#rm
-
Hadoop native library - http://stackoverflow.com/a/20242755/294552
-
Array Operations - http://stackoverflow.com/questions/8039751/hadoop-hive-query-to-split-one-column-into-several-ones
-
Array Operations - http://stackoverflow.com/questions/17212623/project-array-to-columns-in-hive
-
UDTF - http://stackoverflow.com/questions/12160304/hadoop-hive-split-a-single-row-into-multiple-rows
-
UDTF - http://stackoverflow.com/questions/11373543/explode-the-array-of-struct-in-hive
-
Lateral Views - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
-
Word Count in Hive - http://stackoverflow.com/questions/10039949/word-count-program-in-hive
-
Custom UDFs - https://github.com/rathboma/hive-extension-examples
-
Custom UDFs - https://cwiki.apache.org/confluence/display/Hive/HivePlugins
-
Analytics - http://www.postgresql.org/docs/9.1/static/tutorial-window.html
-
Analytics - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
-
Analytics - http://www.slideshare.net/Hadoop_Summit/analytical-queries-with-hive
-
Parameterizing scripts - http://stackoverflow.com/questions/12464636/how-to-set-variables-in-hive-scripts
-
Permanent Functions - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PermanentFunctions
-
Error logs - https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ErrorLogs
-
UDAF - http://ragrawal.wordpress.com/2013/10/26/writing-hive-custom-aggregate-functions-udaf-part-ii/
-
Tez - http://hortonworks.com/hadoop-tutorial/supercharging-interactive-queries-hive-tez/
-
Tez - https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez
-
Source shell scripts - http://bash.cyberciti.biz/guide/Source_command
-
Source shell scripts - http://stackoverflow.com/questions/670191/getting-a-source-not-found-error-when-using-source-in-a-bash-script
-
Shell scripts - AWK - sum - http://stackoverflow.com/questions/450799/shell-command-to-sum-integers-one-per-line
-
Redirecting standard output and error to a log file - http://stackoverflow.com/questions/4721635/redirect-standard-output-error-to-log-file
-
source vs sh - http://stackoverflow.com/questions/13786499/source-vs-sh-in-linux-what-is-the-difference
-
Git - http://stackoverflow.com/questions/173919/is-there-a-theirs-version-of-git-merge-s-ours
-
protobuf - http://www.confusedcoders.com/random/how-to-install-protocol-buffer-2-5-0-on-ubuntu-13-04
-
Hive Tez - java.io.FileNotFoundException - hdfs:/user/root - http://osdir.com/ml/general/2014-07/msg31819.html
-
Hive Tez - java.lang.NoSuchMethodError - https://mail-archives.apache.org/mod_mbox/tez-user/201408.mbox/%3C84672461-ED68-44DC-80BF-2CE5B4EF46E0@apache.org%3E
-
Giraph - Build - http://giraph.apache.org/build.html
-
Giraph - Run - http://giraph.apache.org/quick_start.html#qs_section_5
-
Algorithms - Union Find - http://algs4.cs.princeton.edu/15uf/
-
Hive UDF with Parameters - https://blogs.oracle.com/datawarehousing/entry/three_little_hive_udfs_part2
-
Hive 'user' keyword - https://issues.apache.org/jira/browse/HIVE-10294
-
Hive variables - https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution
-
HBASE quick start - http://hbase.apache.org/0.94/book/quickstart.html
-
HBASE quick start Pseudo - https://hbase.apache.org/book.html#quickstart_pseudo
-
HBASE Tutorial - http://www.tutorialspoint.com/hbase/
-
Hive HBASE integration - https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
Hive 1.x will remain compatible with HBase 0.98.x and lower versions. Hive 2.x will be compatible with HBase 1.x and higher. (See HIVE-10990 for details.) Consumers wanting to work with HBase 1.x using Hive 1.x will need to compile Hive 1.x stream code themselves.
The hbase.mapred.output.outputtable property is optional; it's needed if you plan to insert data into the table (the property is used by hbase.mapreduce.TableOutputFormat).
-
jline
-
YARN Tuning - http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_installing_manually_book/content/determine-hdp-memory-config.html
    $ python yarn-utils.py -c 4 -m 24 -d 1 -k True
    Using cores=4 memory=24GB disks=1 hbase=True
    Profile: cores=4 memory=16384MB reserved=8GB usableMem=16GB disks=1
    Num Container=3
    Container Ram=5120MB
    Used Ram=15GB
    Unused Ram=8GB
    yarn.scheduler.minimum-allocation-mb=5120
    yarn.scheduler.maximum-allocation-mb=15360
    yarn.nodemanager.resource.memory-mb=15360
    mapreduce.map.memory.mb=5120
    mapreduce.map.java.opts=-Xmx4096m
    mapreduce.reduce.memory.mb=5120
    mapreduce.reduce.java.opts=-Xmx4096m
    yarn.app.mapreduce.am.resource.mb=5120
    yarn.app.mapreduce.am.command-opts=-Xmx4096m
    mapreduce.task.io.sort.mb=2048
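
The arithmetic behind the numbers above can be approximated with a short sketch. This is not the yarn-utils.py source; it is a hypothetical helper (`yarn_memory_settings`) that takes the usable memory (16 GB after the 8 GB reservation shown in the output) as an input instead of deriving it. The 1024 MB minimum container size, the floor of three containers, and the rounding down to a multiple of 512 MB are assumptions inferred from the linked guide and from the values in the output (Num Container=3, Container Ram=5120MB).

```python
import math


def yarn_memory_settings(cores, disks, usable_mem_mb, min_container_mb=1024):
    """Approximate the container sizing shown in the output (sizes in MB).

    Hypothetical helper; min_container_mb=1024 is an assumed default.
    """
    # Containers are bounded by CPU, disk, and memory, whichever is smallest.
    containers = min(2 * cores,
                     math.ceil(1.8 * disks),
                     usable_mem_mb // min_container_mb)
    # Assumption inferred from "Num Container=3": never go below 3 containers.
    if containers <= 2:
        containers = 3

    container_mb = usable_mem_mb // containers
    # Assumption inferred from "Container Ram=5120MB": round down to 512 MB.
    if container_mb > 1024:
        container_mb = (container_mb // 512) * 512

    map_mb = container_mb
    # Small containers get a double-sized reducer; larger ones are left alone.
    reduce_mb = 2 * container_mb if container_mb <= 2048 else container_mb
    am_mb = max(map_mb, reduce_mb)

    return {
        "yarn.scheduler.minimum-allocation-mb": container_mb,
        "yarn.scheduler.maximum-allocation-mb": containers * container_mb,
        "yarn.nodemanager.resource.memory-mb": containers * container_mb,
        "mapreduce.map.memory.mb": map_mb,
        # JVM heap is set to 80% of the container, sort buffer to 40% of map memory.
        "mapreduce.map.java.opts": "-Xmx%dm" % (map_mb * 8 // 10),
        "mapreduce.reduce.memory.mb": reduce_mb,
        "mapreduce.reduce.java.opts": "-Xmx%dm" % (reduce_mb * 8 // 10),
        "yarn.app.mapreduce.am.resource.mb": am_mb,
        "yarn.app.mapreduce.am.command-opts": "-Xmx%dm" % (am_mb * 8 // 10),
        "mapreduce.task.io.sort.mb": map_mb * 4 // 10,
    }


# Same machine as above: 4 cores, 1 disk, 16 GB usable after reservations.
settings = yarn_memory_settings(cores=4, disks=1, usable_mem_mb=16 * 1024)
for name, value in settings.items():
    print("%s=%s" % (name, value))
```

With these inputs the sketch reproduces the property values listed in the output; for other machine sizes, treat the result as a starting point and compare it against the actual script.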
-
PEGASUS - http://www.cs.cmu.edu/~pegasus/
* Paper - http://www.cs.cmu.edu/~ukang/papers/PegasusICDM2009.pdf
* Getting Started - http://www.cs.cmu.edu/~pegasus/getting%20started.htm