Below is the output from my attempt to get a multi-node Docker/CDH 5.8.1 setup up and running on an AWS EC2 "m4.4xlarge" instance (16 vCPUs and 64GB of memory) running CentOS 7.2.
I followed the instructions in this Cloudera blog post:
http://blog.cloudera.com/blog/2016/08/multi-node-clusters-with-cloudera-quickstart-for-docker/
As you can see, everything goes well until the "Exception: Timed out after waiting 10 minutes for services to start" (I'm guessing the Python errors above this line are the cause).
It would be great if you could cast your eyes over the output.
Install Docker Engine:
[root@client2-dev-cdh-docker-launcher-instance ~]# yum install -y docker-engine
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.ventraip.net.au
* epel: epel.mirror.digitalpacific.com.au
* extras: mirror.ventraip.net.au
* updates: mirror.ventraip.net.au
Resolving Dependencies
--> Running transaction check
---> Package docker-engine.x86_64 0:1.12.0-1.el7.centos will be installed
--> Processing Dependency: docker-engine-selinux >= 1.12.0-1.el7.centos for package: docker-engine-1.12.0-1.el7.centos.x86_64
--> Processing Dependency: libltdl.so.7()(64bit) for package: docker-engine-1.12.0-1.el7.centos.x86_64
--> Running transaction check
---> Package docker-engine-selinux.noarch 0:1.12.0-1.el7.centos will be installed
---> Package libtool-ltdl.x86_64 0:2.4.2-21.el7_2 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
.....
.....
Installed:
docker-engine.x86_64 0:1.12.0-1.el7.centos
Dependency Installed:
docker-engine-selinux.noarch 0:1.12.0-1.el7.centos libtool-ltdl.x86_64 0:2.4.2-21.el7_2
Complete!
Start Docker Service:
[root@client2-dev-cdh-docker-launcher-instance ~]# service docker start
Redirecting to /bin/systemctl start docker.service
[root@client2-dev-cdh-docker-launcher-instance ~]# systemctl | grep -i dock
sys-devices-virtual-net-docker0.device loaded active plugged /sys/devices/virtual/net/docker0
sys-subsystem-net-devices-docker0.device loaded active plugged /sys/subsystem/net/devices/docker0
var-lib-docker-devicemapper.mount loaded active mounted /var/lib/docker/devicemapper
docker.service loaded active running Docker Application Container Engine
Test/Launch the "hello world" container:
[root@client2-dev-cdh-docker-launcher-instance ~]# docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
c04b14da8d14: Pull complete
Digest: sha256:0256e8a36e2070f7bf2d0b0763dbabdd67798512411de4cdcf9431a1feb60fd9
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker Hub account:
https://hub.docker.com
For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/
Check Docker Containers:
[root@client2-dev-cdh-docker-launcher-instance ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d73ebc648479 hello-world "/hello" 18 seconds ago Exited (0) 17 seconds ago compassionate_rosalind
[root@client2-dev-cdh-docker-launcher-instance ~]# docker rm d73ebc648479
d73ebc648479
[root@client2-dev-cdh-docker-launcher-instance ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
Download the "clusterdock" script with curl:
[root@client2-dev-cdh-docker-launcher-instance ~]# curl -sL http://tiny.cloudera.com/clusterdock.sh > clusterdock.sh
[root@client2-dev-cdh-docker-launcher-instance ~]# ls -lrt
-rwxrwxrwx 1 root root 5641 Aug 18 16:12 clusterdock.sh
Source the "clusterdock" script to set up the environment:
Note: I edited the "clusterdock.sh" script and added "set -x" to it in order to see the trace output.
[root@client2-dev-cdh-docker-launcher-instance ~]# source clusterdock.sh
++ printf '\033]0;%s@%s:%s\007' root client2-dev-cdh-docker-launcher-instance '~'
[root@client2-dev-cdh-docker-launcher-instance ~]#
++ printf '\033]0;%s@%s:%s\007' root client2-dev-cdh-docker-launcher-instance '~'
[root@client2-dev-cdh-docker-launcher-instance ~]#
++ printf '\033]0;%s@%s:%s\007' root client2-dev-cdh-docker-launcher-instance '~'
[root@client2-dev-cdh-docker-launcher-instance ~]#
++ printf '\033]0;%s@%s:%s\007' root client2-dev-cdh-docker-launcher-instance '~'
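For anyone repeating this without editing the downloaded script, tracing can also be switched on for a single call. The wrapper below is only a sketch: "traced" is a name made up here and is not part of clusterdock.sh.

```shell
# Hypothetical helper (not part of clusterdock.sh): run one command with
# shell tracing enabled, then switch tracing off again.
traced() {
  set -x
  "$@"
  traced_status=$?   # remember the exit status of the traced command
  set +x
  return "$traced_status"
}
```

Because the command runs in the current shell, a shell function such as clusterdock_run is traced as well, e.g. "traced clusterdock_run ./bin/start_cluster" followed by the usual arguments.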
Now create a 4-node CDH cluster:
[root@client2-dev-cdh-docker-launcher-instance ~]# clusterdock_run ./bin/start_cluster -n client2-cdh-dev-cluster cdh --primary-node=machine-1 --secondary-nodes='machine-{2..4}' --exclude-service-types=IMPALA
+ clusterdock_run ./bin/start_cluster -n client2-cdh-dev-cluster cdh --primary-node=machine-1 '--secondary-nodes=machine-{2..4}' --exclude-service-types=IMPALA
+ '[' -z '' ']'
+ local CONSTANTS_CONFIG_URL=https://raw.githubusercontent.com/cloudera/clusterdock/master/clusterdock/constants.cfg
++ curl -s https://raw.githubusercontent.com/cloudera/clusterdock/master/clusterdock/constants.cfg
++ awk -F ' *= *' '/^docker_registry_url/ {print $2}'
+ local DOCKER_REGISTRY_URL=docker.io
++ curl -s https://raw.githubusercontent.com/cloudera/clusterdock/master/clusterdock/constants.cfg
++ awk -F ' *= *' '/^cloudera_namespace/ {print $2}'
+ local CLOUDERA_NAMESPACE=cloudera
+ CLUSTERDOCK_IMAGE=docker.io/cloudera/clusterdock:latest
+ '[' '' '!=' false ']'
+ sudo docker pull docker.io/cloudera/clusterdock:latest
+ '[' -n '' ']'
+ '[' -n '' ']'
+ '[' -n '' ']'
+ '[' -n '' ']'
+ '[' -n '' ']'
+ sudo docker run --net=host -t --privileged -v /tmp/clusterdock -v /etc/hosts:/etc/hosts -v /etc/localtime:/etc/localtime -v /var/run/docker.sock:/var/run/docker.sock docker.io/cloudera/clusterdock:latest ./bin/start_cluster -n client2-cdh-dev-cluster cdh --primary-node=machine-1 '--secondary-nodes=machine-{2..4}' --exclude-service-types=IMPALA
INFO:clusterdock.topologies.cdh.actions:Pulling image docker.io/cloudera/clusterdock:cdh580_cm581_primary-node. This might take a little while...
cdh580_cm581_primary-node: Pulling from cloudera/clusterdock
3eaa9b70c44a: Pull complete
99ba8e23f310: Pull complete
c9c08e9a0d03: Pull complete
7434a9a99daa: Pull complete
d52d9baa0ee6: Pull complete
00ca224ba661: Pull complete
Digest: sha256:9feffbfc5573262a6efbbb0a969efde890e63ced8a4ab3c9982f4f0dc607e429
Status: Downloaded newer image for cloudera/clusterdock:cdh580_cm581_primary-node
INFO:clusterdock.topologies.cdh.actions:Pulling image docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node. This might take a little while...
cdh580_cm581_secondary-node: Pulling from cloudera/clusterdock
3eaa9b70c44a: Already exists
99ba8e23f310: Already exists
c9c08e9a0d03: Already exists
7434a9a99daa: Already exists
d52d9baa0ee6: Already exists
f70deff0592f: Pull complete
Digest: sha256:251778378b362adff4e93b99d423848216e4823965dabd1bd4c41dbb4c79afcf
Status: Downloaded newer image for cloudera/clusterdock:cdh580_cm581_secondary-node
INFO:clusterdock.cluster:Network (client2-cdh-dev-cluster) not present, creating it...
INFO:clusterdock.cluster:Successfully setup network (name: client2-cdh-dev-cluster).
INFO:clusterdock.cluster:Successfully started machine-2.client2-cdh-dev-cluster (IP address: 192.168.123.3).
INFO:clusterdock.cluster:Successfully started machine-3.client2-cdh-dev-cluster (IP address: 192.168.123.4).
INFO:clusterdock.cluster:Successfully started machine-4.client2-cdh-dev-cluster (IP address: 192.168.123.5).
INFO:clusterdock.cluster:Successfully started machine-1.client2-cdh-dev-cluster (IP address: 192.168.123.2).
INFO:clusterdock.cluster:Started cluster in 6.81 seconds.
INFO:clusterdock.topologies.cdh.actions:Changing server_host to machine-1.client2-cdh-dev-cluster in /etc/cloudera-scm-agent/config.ini...
INFO:clusterdock.topologies.cdh.actions:Removing files (/var/lib/cloudera-scm-agent/uuid, /dfs*/dn/current/*) from hosts (machine-3.client2-cdh-dev-cluster, machine-4.client2-cdh-dev-cluster)...
INFO:clusterdock.topologies.cdh.actions:Restarting CM agents...
cloudera-scm-agent is already stopped
Starting cloudera-scm-agent: [ OK ]
Stopping cloudera-scm-agent: [ OK ]
Stopping cloudera-scm-agent: [ OK ]
Stopping cloudera-scm-agent: [ OK ]
Starting cloudera-scm-agent: [ OK ]
Starting cloudera-scm-agent: [ OK ]
Starting cloudera-scm-agent: [ OK ]
INFO:clusterdock.topologies.cdh.actions:Waiting for Cloudera Manager server to come online...
INFO:clusterdock.topologies.cdh.actions:Detected Cloudera Manager server after 35.04 seconds.
INFO:clusterdock.topologies.cdh.actions:CM server is now accessible at http://client2-dev-cdh-docker-launcher-instance:32768
INFO:clusterdock.topologies.cdh.cm:Detected CM API v13.
INFO:clusterdock.topologies.cdh.cm_utils:Adding hosts (Ids: 484aa22a-44af-4593-b6d0-ad91806fa944, 3df146ff-5139-4196-962e-873a285fecb7) to Cluster 1 (clusterdock)...
INFO:clusterdock.topologies.cdh.cm_utils:Creating secondary node host template...
INFO:clusterdock.topologies.cdh.cm_utils:Sleeping for 30 seconds to ensure that parcels are activated...
INFO:clusterdock.topologies.cdh.cm_utils:Applying secondary host template...
INFO:clusterdock.topologies.cdh.cm_utils:Updating database configurations...
INFO:clusterdock.topologies.cdh.cm:Updating NameNode references in Hive metastore...
INFO:clusterdock.topologies.cdh.actions:Removing service impala from Cluster 1 (clusterdock)...
INFO:clusterdock.topologies.cdh.actions:Deploying client configuration...
INFO:clusterdock.topologies.cdh.actions:Starting cluster...
INFO:clusterdock.topologies.cdh.actions:Starting Cloudera Management service...
INFO:clusterdock.topologies.cdh.cm:Beginning service health validation...
Traceback (most recent call last):
File "./bin/start_cluster", line 70, in <module>
main()
File "./bin/start_cluster", line 63, in main
actions.start(args)
File "/root/clusterdock/clusterdock/topologies/cdh/actions.py", line 151, in start
deployment.validate_services_started()
File "/root/clusterdock/clusterdock/topologies/cdh/cm.py", line 91, in validate_services_started
"(at fault: {1}).").format(timeout_min, at_fault_services))
Exception: Timed out after waiting 10 minutes for services to start (at fault: [[u'zookeeper', "Failed health checks: [u'ZOOKEEPER_SERVERS_HEALTHY']"], [u'hdfs', "Failed health checks: [u'HDFS_CANARY_HEALTH', u'HDFS_DATA_NODES_HEALTHY', u'HDFS_FREE_SPACE_REMAINING', u'HDFS_HA_NAMENODE_HEALTH']"], [u'hbase', "Failed health checks: [u'HBASE_MASTER_HEALTH', u'HBASE_REGION_SERVERS_HEALTHY']"], [u'solr', "Failed health checks: [u'SOLR_SOLR_SERVERS_HEALTHY']"], [u'yarn', "Failed health checks: [u'YARN_JOBHISTORY_HEALTH', u'YARN_NODE_MANAGERS_HEALTHY', u'YARN_RESOURCEMANAGERS_HEALTH']"], [u'ks_indexer', "Failed health checks: [u'KS_INDEXER_HBASE_INDEXERS_HEALTHY']"], [u'hive', "Failed health checks: [u'HIVE_HIVEMETASTORES_HEALTHY', u'HIVE_HIVESERVER2S_HEALTHY']"], [u'oozie', "Failed health checks: [u'OOZIE_OOZIE_SERVERS_HEALTHY']"], [u'hue', "Failed health checks: [u'HUE_HUE_SERVERS_HEALTHY']"], [u'mgmt', "Failed health checks: [u'MGMT_ALERT_PUBLISHER_HEALTH', u'MGMT_EVENT_SERVER_HEALTH', u'MGMT_HOST_MONITOR_HEALTH', u'MGMT_SERVICE_MONITOR_HEALTH']"]]).
+ '[' -n '' ']'
++ printf '\033]0;%s@%s:%s\007' root client2-dev-cdh-docker-launcher-instance '~'
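The "at fault" list in that exception is hard to read as a single line. Here is a small sketch that splits it into one line per service. It assumes the message keeps exactly the [u'service', "Failed health checks: [...]"] shape shown above, and "list_failed_checks" is a made-up name.

```shell
# Read a clusterdock timeout message on stdin and print one line per
# failing service in the form "service: CHECK_A, CHECK_B".
list_failed_checks() {
  # Pull out each "u'<service>', "Failed health checks: [<checks>" fragment.
  grep -oE "u'[a-z_]+', \"Failed health checks: \[[^]]*" |
    # Drop the Python unicode markers and the fixed wording around the names.
    sed -E "s/^u'([a-z_]+)', \"Failed health checks: \[/\1: /; s/u'([A-Z0-9_]+)'/\1/g"
}
```

If the output above were saved to a file (say start_cluster.log, a hypothetical name), it could be run as "list_failed_checks < start_cluster.log". Lines that do not contain an at-fault entry are ignored.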
OK, that failed, so kill off the Docker containers:
[root@client2-dev-cdh-docker-launcher-instance ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
16319733938c docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node "/sbin/init" 22 minutes ago Up 22 minutes drunk_mestorf
0fcaf9c6ddc3 docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node "/sbin/init" 22 minutes ago Up 22 minutes pensive_kirch
efb3495ae56b docker.io/cloudera/clusterdock:cdh580_cm581_secondary-node "/sbin/init" 22 minutes ago Up 22 minutes angry_turing
904e8f4a6290 docker.io/cloudera/clusterdock:cdh580_cm581_primary-node "/sbin/init" 22 minutes ago Up 22 minutes 0.0.0.0:32768->7180/tcp infallible_ritchie
d152d82e497f docker.io/cloudera/clusterdock:latest "python ./bin/start_c" 33 minutes ago Exited (1) 6 minutes ago loving_fermat
[root@client2-dev-cdh-docker-launcher-instance ~]# docker stop d152d82e497f 904e8f4a6290 efb3495ae56b 0fcaf9c6ddc3 16319733938c
Error response from daemon: No such container: d152d82e497f
904e8f4a6290
efb3495ae56b
0fcaf9c6ddc3
16319733938c
[root@client2-dev-cdh-docker-launcher-instance ~]# docker rm d152d82e497f 904e8f4a6290 efb3495ae56b 0fcaf9c6ddc3 16319733938c
904e8f4a6290
efb3495ae56b
0fcaf9c6ddc3
16319733938c
Error response from daemon: No such container: d152d82e497f
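Rather than copying container IDs by hand, the cleanup can be sketched as a filter on the image name. This is an assumption-laden sketch, not something from the Cloudera instructions: both function names are made up, and it relies on "docker ps --format" and "docker rm -f" being available in this Docker version.

```shell
# Read "ID IMAGE" pairs on stdin and print the IDs whose image comes from
# the cloudera/clusterdock repository (with or without the docker.io/ prefix).
clusterdock_ids() {
  awk '$2 ~ /^(docker\.io\/)?cloudera\/clusterdock:/ { print $1 }'
}

# Force-remove (stop and delete) every matching container.
# Defined only; nothing is removed until this function is called.
remove_clusterdock_containers() {
  docker ps -a --format '{{.ID}} {{.Image}}' | clusterdock_ids |
    while read -r id; do
      docker rm -f "$id"
    done
}
```

Note that this matches every clusterdock container on the host, so it would also remove any other clusterdock cluster running there, and it leaves the Docker network created for the cluster in place.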
Cheers,
Damion.