I found a video lecture series, Data Science Boot Camp. It is great for learning, and I hope readers will benefit from it. I went through part of it today and downloaded the example code to try it out. In Lesson 1, I learned to use a Flume agent and to read the collected data with Hive.
Terminal 1: run Flume
[surachart@centos01 ~]$ git clone https://github.com/oraclebigdata/oa_lesson_1_source_and_acquire
Initialized empty Git repository in /home/surachart/oa_lesson_1_source_and_acquire/.git/
remote: Counting objects: 16, done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 16 (delta 5), reused 14 (delta 3)
Unpacking objects: 100% (16/16), done.
[surachart@centos01 ~]$ cd oa_lesson_1_source_and_acquire
[surachart@centos01 oa_lesson_1_source_and_acquire]$ ls
commands.sh example.xml flume_example.conf hive_examples.hql LICENSE README.md
[surachart@centos01 oa_lesson_1_source_and_acquire]$ less README.md
[surachart@centos01 oa_lesson_1_source_and_acquire]$ less commands.sh
[surachart@centos01 oa_lesson_1_source_and_acquire]$ cp flume_example.conf flume_example.conf.orig
[surachart@centos01 oa_lesson_1_source_and_acquire]$ vi flume_example.conf
[surachart@centos01 oa_lesson_1_source_and_acquire]$ diff flume_example.conf.orig flume_example.conf
16c16
< hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://localhost:8020/user/oracle/flume_example
---
> hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://localhost:8020/user/surachart/flume_example
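The same one-line change can be made without opening vi. The following is only a sketch: the function name retarget_hdfs_path is made up for illustration, and it assumes the original file still points at /user/oracle/ as shown in the diff above.

```shell
# Hypothetical helper: point the sink's HDFS path at another user's home.
# sed -i.orig edits the file in place and keeps a .orig backup,
# which mirrors the manual "cp ... .orig" step above.
retarget_hdfs_path() {
  conf_file=$1   # path to the Flume config, e.g. flume_example.conf
  hdfs_user=$2   # HDFS user whose home directory should receive the data
  sed -i.orig "s#/user/oracle/#/user/${hdfs_user}/#" "$conf_file"
}

# Example (not run here):
#   retarget_hdfs_path flume_example.conf "$(whoami)"
#   diff flume_example.conf.orig flume_example.conf
```

The "#" delimiter is used so the slashes in the path do not need escaping.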
[surachart@centos01 oa_lesson_1_source_and_acquire]$ hadoop fs -ls hdfs://localhost:8020/user/surachart/flume_example
13/11/16 20:47:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: `hdfs://localhost:8020/user/surachart/flume_example': No such file or directory
[surachart@centos01 oa_lesson_1_source_and_acquire]$ flume-ng agent -n hdfs-agent -f ./flume_example.conf
Warning: No configuration directory set! Use --conf <dir> to override.
Info: Including Hadoop libraries found via (/usr/bin/hadoop) for HDFS access
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
Info: Including HBASE libraries found via (/usr/bin/hbase) for HBASE access
Info: Excluding /usr/lib/hbase/bin/../lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-api-1.6.1.jar from classpath
Info: Excluding /usr/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar from classpath
+ exec /usr/java/latest/bin/java -Xmx20m -cp '/usr/lib/flume/lib/*:/etc/hadoop/conf:[... long list of Hadoop, HDFS, YARN, MapReduce, HBase, and ZooKeeper classpath entries omitted ...]:/conf' -Djava.library.path=:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib:/usr/lib/hbase/bin/../lib/native/Linux-amd64-64 org.apache.flume.node.Application -n hdfs-agent -f ./flume_example.conf
13/11/16 20:47:52 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 1
13/11/16 20:47:52 INFO node.FlumeNode: Flume node starting - hdfs-agent
13/11/16 20:47:52 INFO nodemanager.DefaultLogicalNodeManager: Node manager starting
13/11/16 20:47:52 INFO lifecycle.LifecycleSupervisor: Starting lifecycle supervisor 11
13/11/16 20:47:52 INFO properties.PropertiesFileConfigurationProvider: Configuration provider starting
13/11/16 20:47:52 INFO properties.PropertiesFileConfigurationProvider: Reloading configuration file:./flume_example.conf
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Processing:hdfs-write
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Processing:hdfs-write
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Processing:hdfs-write
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Added sinks: hdfs-write Agent: hdfs-agent
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Processing:hdfs-write
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Processing:hdfs-write
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Processing:hdfs-write
13/11/16 20:47:52 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [hdfs-agent]
13/11/16 20:47:52 INFO properties.PropertiesFileConfigurationProvider: Creating channels
13/11/16 20:47:52 INFO properties.PropertiesFileConfigurationProvider: created channel memoryChannel
13/11/16 20:47:52 INFO interceptor.StaticInterceptor: Creating RegexFilteringInterceptor: regex=^echo.*,excludeEvents=true
13/11/16 20:47:52 INFO sink.DefaultSinkFactory: Creating instance of sink: hdfs-write, type: hdfs
13/11/16 20:47:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/11/16 20:47:54 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
13/11/16 20:47:54 INFO nodemanager.DefaultLogicalNodeManager: Starting new configuration:{ sourceRunners:{netcat-collect=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:netcat-collect,state:IDLE} }} sinkRunners:{hdfs-write=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@1bf3f158 counterGroup:{ name:null counters:{} } }} channels:{memoryChannel=org.apache.flume.channel.MemoryChannel{name: memoryChannel}} }
13/11/16 20:47:54 INFO nodemanager.DefaultLogicalNodeManager: Starting Channel memoryChannel
13/11/16 20:47:54 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: CHANNEL, name: memoryChannel, registered successfully.
13/11/16 20:47:54 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: memoryChannel started
13/11/16 20:47:54 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink hdfs-write
13/11/16 20:47:54 INFO instrumentation.MonitoredCounterGroup: Monitoried counter group for type: SINK, name: hdfs-write, registered successfully.
13/11/16 20:47:54 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: hdfs-write started
13/11/16 20:47:54 INFO nodemanager.DefaultLogicalNodeManager: Starting Source netcat-collect
13/11/16 20:47:54 INFO source.NetcatSource: Source starting
13/11/16 20:47:54 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:11111]
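The startup log above names every piece of the agent: a netcat source (netcat-collect) listening on 127.0.0.1:11111, a regex filtering interceptor that excludes events matching ^echo.*, a memory channel (memoryChannel), and an HDFS sink (hdfs-write). As a rough sketch, a config with that shape could look like the one written by the function below. This is a reconstruction from the log and the diff, not the repository's flume_example.conf. The interceptor name "filter", the output file name, and the hdfs.fileType setting are my assumptions.

```shell
# Hypothetical helper: write a minimal Flume config with the same shape
# as the agent seen in the log. It writes to the file given as $1 so the
# real flume_example.conf is never overwritten.
write_flume_sketch() {
  cat > "$1" <<'EOF'
# Agent "hdfs-agent": one source, one channel, one sink.
hdfs-agent.sources = netcat-collect
hdfs-agent.channels = memoryChannel
hdfs-agent.sinks = hdfs-write

# Netcat source: listens on a TCP port and turns each line into an event.
hdfs-agent.sources.netcat-collect.type = netcat
hdfs-agent.sources.netcat-collect.bind = 127.0.0.1
hdfs-agent.sources.netcat-collect.port = 11111
hdfs-agent.sources.netcat-collect.channels = memoryChannel

# Interceptor name "filter" is an assumption; the regex and excludeEvents
# values are the ones printed in the log.
hdfs-agent.sources.netcat-collect.interceptors = filter
hdfs-agent.sources.netcat-collect.interceptors.filter.type = regex_filter
hdfs-agent.sources.netcat-collect.interceptors.filter.regex = ^echo.*
hdfs-agent.sources.netcat-collect.interceptors.filter.excludeEvents = true

# In-memory channel between source and sink.
hdfs-agent.channels.memoryChannel.type = memory

# HDFS sink. fileType = DataStream is an assumption so that the output
# is plain text; the path is the one from the diff.
hdfs-agent.sinks.hdfs-write.type = hdfs
hdfs-agent.sinks.hdfs-write.channel = memoryChannel
hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://localhost:8020/user/surachart/flume_example
hdfs-agent.sinks.hdfs-write.hdfs.fileType = DataStream
EOF
}

# Example (not run here):
#   write_flume_sketch flume_sketch.conf
#   flume-ng agent -n hdfs-agent -f ./flume_sketch.conf
```

Roll settings are left out on purpose because the log does not show them, so file sizes and counts would probably differ from the run in this post.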
Terminal 2: send record batches and read the data with Hive
[surachart@centos01 ~]$ hadoop fs -ls hdfs://localhost:8020/user/surachart/flume_example
13/11/16 20:48:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: `hdfs://localhost:8020/user/surachart/flume_example': No such file or directory
[surachart@centos01 ~]$
[surachart@centos01 ~]$ cd oa_lesson_1_source_and_acquire
[surachart@centos01 oa_lesson_1_source_and_acquire]$ head -n 20 example.xml | nc localhost 11111
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
OK
[surachart@centos01 oa_lesson_1_source_and_acquire]$ hadoop fs -ls hdfs://localhost:8020/user/surachart/flume_example
13/11/16 20:49:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 4 items
-rw-r--r-- 1 surachart supergroup 1203 2013-11-16 20:49 hdfs://localhost:8020/user/surachart/flume_example/FlumeData.1384609744355
-rw-r--r-- 1 surachart supergroup 1147 2013-11-16 20:49 hdfs://localhost:8020/user/surachart/flume_example/FlumeData.1384609744356
-rw-r--r-- 1 surachart supergroup 1177 2013-11-16 20:49 hdfs://localhost:8020/user/surachart/flume_example/FlumeData.1384609744357
-rw-r--r-- 1 surachart supergroup 0 2013-11-16 20:49 hdfs://localhost:8020/user/surachart/flume_example/FlumeData.1384609744358.tmp
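The last entry in the listing ends in .tmp and has size 0. By default the Flume HDFS sink gives the file it is currently writing a .tmp suffix and renames it when the file is rolled, so that entry is the file still open. A small filter, sketched below, lists only the finished files. The function name completed_flume_files is made up for illustration.

```shell
# Hypothetical helper: read "hadoop fs -ls" output on stdin and print
# only the paths of regular files that no longer carry the .tmp suffix.
# Lines that are not file entries (the "Found N items" header, log
# warnings, directories) are skipped because their first field does not
# start with "-".
completed_flume_files() {
  awk '$1 ~ /^-/ && $NF !~ /\.tmp$/ { print $NF }'
}

# Example (not run here):
#   hadoop fs -ls hdfs://localhost:8020/user/surachart/flume_example | completed_flume_files
```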
[surachart@centos01 oa_lesson_1_source_and_acquire]$ hive
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j.properties
Hive history file=/tmp/surachart/hive_job_log_surachart_201311162053_1256958590.txt
hive> show tables;
OK
Time taken: 13.118 seconds
hive> CREATE EXTERNAL TABLE flume_example (record string) ROW FORMAT DELIMITED LINES TERMINATED BY '\n' LOCATION '/user/surachart/flume_example';
OK
Time taken: 1.616 seconds
hive> show tables;
OK
flume_example
Time taken: 0.413 seconds
hive> SELECT xpath_int(record, "/record/PurchaseDate"), xpath_string(record, "/record/FlightsAvailable") FROM flume_example;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1384583763440_0003, Tracking URL = http://centos01:8088/proxy/application_1384583763440_0003/
Kill Command = /usr/bin/hadoop job -kill job_1384583763440_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-11-16 20:55:19,963 Stage-1 map = 0%, reduce = 0%
2013-11-16 20:55:50,703 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.68 sec
2013-11-16 20:55:52,539 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 3.68 sec
MapReduce Total cumulative CPU time: 3 seconds 680 msec
Ended Job = job_1384583763440_0003
MapReduce Jobs Launched:
Job 0: Map: 1 Cumulative CPU: 3.68 sec HDFS Read: 4433 HDFS Write: 357 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 680 msec
OK
1368111464 97, 13
1335377562 7, 67, 41, 43, 3
1383190668 97, 67
1366662121
1379048005 3, 43, 61
1376098854
1343518807 43, 29
1341610164
1353358109 41, 43, 97
1377499154
1339295392 23
1358062165 13, 23, 61
1338964230 3, 97
1348172911
1361978449 83, 31
1379977738
1347007379 83, 79, 7, 59
1341128392 3
1351321689 59, 13, 41, 67, 71
1384062904 2, 17, 41
Time taken: 58.03 seconds
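The first column of the result looks like Unix epoch seconds, although the lesson data does not say so here, so treat that as my assumption. Under that assumption, a value can be turned into a readable UTC timestamp from the shell. The sketch below relies on GNU date (the -d "@seconds" form), and the function name epoch_to_utc is made up for illustration.

```shell
# Hypothetical helper: convert Unix epoch seconds to a UTC timestamp.
# Requires GNU date; BSD/macOS date uses a different option (-r).
epoch_to_utc() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

# Example (not run here):
#   epoch_to_utc 1368111464
```

For the first row, 1368111464 corresponds to 2013-05-09 14:57:44 UTC.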
Wow! Good for learning.
Written By: Surachart Opun
http://surachartopun.com