Introduction.
This article walks through installing HA OpenNebula with Ceph as the datastore on three nodes (disks: 6x 240GB SSD, backend network: IPoIB, OS: CentOS 7), plus one additional node used for backup.
The equipment scheme is shown below:
We use this solution to virtualize our imagery processing servers.
Preparing.
All actions should be performed on all nodes; on kosmo-arch, do everything except bridge-utils and the FrontEnd network setup.
FrontEnd network.
Configure bond0 (mode 0) and run the script below to create the frontend bridge interface for the VMs (OpenNebula).
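The original script is not reproduced here; a minimal sketch, assuming the bridge is named br0 and sits on top of bond0, could look like this:
#!/bin/bash
# hypothetical sketch: create bridge br0 on top of bond0 for VM traffic
brctl addbr br0
brctl addif br0 bond0
ip link set br0 up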
BackEnd network. Configuration of IPoIB:
Enable IPoIB and switch InfiniBand to connected mode. This link explains the differences between connected and datagram modes.
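For reference, connected mode can be switched per interface at runtime (the interface names ib0/ib1 are assumptions; connected mode also allows a large MTU):
# assumes the IB ports show up as ib0 and ib1
echo connected > /sys/class/net/ib0/mode
echo connected > /sys/class/net/ib1/mode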
Start the InfiniBand services.
Check that it is working:
and
Set up bond1 (mode 1) over the two IB interfaces and assign the IP 172.19.254.X, where X is the node number. Example below:
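The exact ifcfg files are not shown; a sketch of /etc/sysconfig/network-scripts/ifcfg-bond1 for node 1 (the /24 netmask is an assumption):
# example for node 1; adjust IPADDR per node
DEVICE=bond1
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=1 miimon=100"
IPADDR=172.19.254.1
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes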
Disable the firewall:
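On CentOS 7 this is typically:
systemctl stop firewalld
systemctl disable firewalld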
Tuning sysctl.
Installing Ceph.
Preparation
Configure passwordless SSH access between the nodes for the root user. The key should be created on one node and then copied to the others into /root/.ssh/.
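For example (run on one node, then push the key to the others):
ssh-keygen -t rsa
ssh-copy-id root@kosmo-virt2
ssh-copy-id root@kosmo-virt3
ssh-copy-id root@kosmo-arch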
Disable SELinux on all nodes:
setenforce 0
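setenforce 0 only lasts until the next reboot; to make it permanent, set SELINUX=disabled in /etc/selinux/config, for example:
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config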
Add max open files to /etc/security/limits.conf (depends on your requirements) on all nodes
* hard nofile 1000000
* soft nofile 1000000
Setup /etc/hosts on all nodes:
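A sketch of the /etc/hosts entries, assuming the backend (IPoIB) addresses are used and kosmo-arch sits at 172.19.254.4 (adjust to your addressing):
# kosmo-arch address is an assumption
172.19.254.1   kosmo-virt1
172.19.254.2   kosmo-virt2
172.19.254.3   kosmo-virt3
172.19.254.4   kosmo-arch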
Installing
Install a kernel >= 3.15 on all nodes (this is needed for the CephFS kernel client).
Set the new kernel as the default for booting.
Reboot.
Set up repository: (on all nodes)
Import the GPG key: (on all nodes)
Set up ntpd (on all nodes).
Edit /etc/ntp.conf and start ntpd (on all nodes).
Install: (on all nodes)
Deploying.
MON deploying: (on kosmo-virt1)
OSD deploying:
(on kosmo-virt1)
(on kosmo-virt2)
(on kosmo-virt3)
where sd[b-g] are the SSD disks.
MDS deploying:
The new Giant release of Ceph no longer creates the default data and metadata pools.
Use ceph osd lspools to check.
Check the pool IDs of data and metadata with:
Configure FS
where 4 is the ID of the metadata pool and 3 is the ID of the data pool.
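Since the exact commands are not reproduced above, a consolidated sketch of the FS setup (pool names and PG counts are assumptions; 4 and 3 are the pool IDs reported for your pools, and newer releases use ceph fs new with pool names instead):
# create the pools (names/PG counts are only an example)
ceph osd pool create data 128
ceph osd pool create metadata 128
# look up the pool IDs
ceph osd lspools
# create the filesystem from the metadata and data pool IDs
ceph mds newfs 4 3 --yes-i-really-mean-it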
Configure MDS
(on kosmo-virt1)
(on kosmo-virt2)
(on all nodes)
Configure kosmo-arch.
Copy /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring from any of the kosmo-virt nodes to kosmo-arch.
Preparing Ceph for OpenNebula.
Create pool:
Setup authorization to pool one:
Get key from keyring:
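The commands for the three steps above (create the pool, set up authorization, extract the key) are not reproduced; a sketch, assuming the pool is called one and the client oneadmin as used later (PG count and capability details are assumptions):
# create the pool
ceph osd pool create one 128
# authorize the oneadmin client for the pool
ceph auth get-or-create client.oneadmin mon 'allow r' osd 'allow rwx pool=one' -o /etc/ceph/ceph.client.oneadmin.keyring
# extract the bare key for libvirt/OpenNebula
ceph auth get-key client.oneadmin > /etc/ceph/oneadmin.key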
Checking:
Copy /etc/ceph/ceph.client.oneadmin.keyring and /etc/ceph/oneadmin.key to the second node.
Preparing for OpenNebula HA
Configuring MariaDB cluster
Configure MariaDB cluster on all nodes except kosmo-arch
Setup repo:
Install:
Start the service:
Prepare for the cluster:
Configuring the cluster: (for kosmo-virt1)
(for kosmo-virt2)
(for kosmo-virt3)
(on kosmo-virt1)
(on kosmo-virt2)
(on kosmo-virt3)
Check on all nodes (SHOW STATUS LIKE 'wsrep%'); the output should look similar to this:
Variable_name                 Value
wsrep_local_state_uuid        739895d5-d6de-11e4-87f6-3a3244f26574
wsrep_protocol_version        7
wsrep_last_committed          0
wsrep_replicated              0
wsrep_replicated_bytes        0
wsrep_repl_keys               0
wsrep_repl_keys_bytes         0
wsrep_repl_data_bytes         0
wsrep_repl_other_bytes        0
wsrep_received                6
wsrep_received_bytes          425
wsrep_local_commits           0
wsrep_local_cert_failures     0
wsrep_local_replays           0
wsrep_local_send_queue        0
wsrep_local_send_queue_max    1
wsrep_local_send_queue_min    0
wsrep_local_send_queue_avg    0.000000
wsrep_local_recv_queue        0
wsrep_local_recv_queue_max    1
wsrep_local_recv_queue_min    0
wsrep_local_recv_queue_avg    0.000000
wsrep_local_cached_downto     18446744073709551615
wsrep_flow_control_paused_ns  0
wsrep_flow_control_paused     0.000000
wsrep_flow_control_sent       0
wsrep_flow_control_recv       0
wsrep_cert_deps_distance      0.000000
wsrep_apply_oooe              0.000000
wsrep_apply_oool              0.000000
wsrep_apply_window            0.000000
wsrep_commit_oooe             0.000000
wsrep_commit_oool             0.000000
wsrep_commit_window           0.000000
wsrep_local_state             4
wsrep_local_state_comment     Synced
wsrep_cert_index_size         0
wsrep_causal_reads            0
wsrep_cert_interval           0.000000
wsrep_incoming_addresses      172.19.254.1:3306,172.19.254.3:3306,172.19.254.2:3306
wsrep_evs_delayed
wsrep_evs_evict_list
wsrep_evs_repl_latency        0/0/0/0/0
wsrep_evs_state               OPERATIONAL
wsrep_gcomm_uuid              7397d6d6-d6de-11e4-a515-d3302a8c2342
wsrep_cluster_conf_id         2
wsrep_cluster_size            2
wsrep_cluster_state_uuid      739895d5-d6de-11e4-87f6-3a3244f26574
wsrep_cluster_status          Primary
wsrep_connected               ON
wsrep_local_bf_aborts         0
wsrep_local_index             0
wsrep_provider_name           Galera
wsrep_provider_vendor         Codership Oy <info@codership.com>
wsrep_provider_version        25.3.9(r3387)
wsrep_ready                   ON
wsrep_thread_count            2
Creating user and database:
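The exact statements are not shown; a sketch, assuming the database is named opennebula and the user oneadmin (match these to what you later put into oned.conf):
mysql -u root -p
-- database name, user, and password below are assumptions
CREATE DATABASE opennebula;
GRANT ALL PRIVILEGES ON opennebula.* TO 'oneadmin'@'%' IDENTIFIED BY 'oneadmin';
FLUSH PRIVILEGES;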
Remember: if all nodes go down, the most up-to-date node must be started with /etc/init.d/mysql start --wsrep-new-cluster, and you have to find that node yourself. If you bootstrap the cluster from a node with an outdated view, the other nodes will fail with an error like this in their logs: [ERROR] WSREP: gcs/src/gcs_group.cpp:void group_post_state_exchange(gcs_group_t*)():319: Reversing history: 0 → 0, this member has applied 140536161751824 more events than the primary component. Data loss is possible. Aborting.
Configuring HA cluster
Unfortunately, pcs conflicts with the OpenNebula server package; that is why we go with pacemaker, corosync, and crmsh.
Installing HA
Set up repo on all nodes except kosmo-arch:
Install on all nodes except kosmo-arch:
On kosmo-virt1, create the corosync configuration:
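The configuration itself is not reproduced here; a minimal /etc/corosync/corosync.conf sketch for corosync 2.x with unicast transport (the cluster name, the use of the backend addresses, and udpu transport are assumptions):
# cluster name and addresses are assumptions
totem {
    version: 2
    cluster_name: kosmo
    transport: udpu
}

nodelist {
    node {
        ring0_addr: 172.19.254.1
        nodeid: 1
    }
    node {
        ring0_addr: 172.19.254.2
        nodeid: 2
    }
    node {
        ring0_addr: 172.19.254.3
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
}

logging {
    to_syslog: yes
}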
and create the authkey on kosmo-virt1:
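For example (this writes /etc/corosync/authkey):
corosync-keygen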
Copy corosync.conf and the authkey to kosmo-virt2 and kosmo-virt3.
Enabling (on all nodes except kosmo-arch):
Starting (on all nodes except kosmo-arch):
Checking:
Add properties:
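The exact properties are not listed; typical settings for a small cluster without fencing hardware would be something like (an assumption, adjust to your environment):
# typical for a lab cluster without STONITH devices
crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore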
Installing OpenNebula
Installing
Setup repo on all nodes except kosmo-arch:
Installing (on all nodes except kosmo-arch):
Ruby Runtime Installation:
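OpenNebula ships a helper script for this; it should be something like the following (the path may differ between versions):
/usr/share/one/install_gems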
Change the oneadmin password:
Create passwordless access for oneadmin (on kosmo-virt1):
Copy it to the other nodes (remember that oneadmin's home directory is /var/lib/one).
Change the listen address for the sunstone-server (on all nodes):
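That is, in /etc/one/sunstone-server.conf set the :host: parameter so Sunstone is reachable from outside, for example:
# listen on all interfaces instead of 127.0.0.1
:host: 0.0.0.0
:port: 9869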
On kosmo-virt1:
Copy all /var/lib/one/.one/*.auth files and the one.key file to OTHER_NODES:/var/lib/one/.one/
Start/stop the services on kosmo-virt1:
Try to connect to http://node:9869.
Check logs for errors (/var/log/one/oned.log /var/log/one/sched.log /var/log/one/sunstone.log).
If no errors:
Add Ceph support to qemu-kvm (for all nodes except kosmo-arch).
If there is no rbd support, then you have to compile and install it yourself:
Download:
Compiling.
Change %define rhev 0 to %define rhev 1.
Installing (for all nodes except kosmo-arch).
Check for ceph support.
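A quick check (rbd should appear among the supported formats):
qemu-img --help | grep rbd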
Try to write a test image (on all nodes except kosmo-arch):
where N is the node number.
Add Ceph support for libvirt
On all nodes:
On kosmo-virt1 create uuid:
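For example:
uuidgen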
Create secret.xml
where AQDp1aqz+JPAJhAAIcKf/Of0JfpJRQvfPLqn9Q== is the output of cat /etc/ceph/oneadmin.key.
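The secret.xml from the original is not reproduced; the usual libvirt Ceph secret definition looks like this (replace the UUID placeholder with the one generated above):
<!-- UUID below is a placeholder for the uuidgen output -->
<secret ephemeral='no' private='no'>
  <uuid>GENERATED-UUID-GOES-HERE</uuid>
  <usage type='ceph'>
    <name>client.oneadmin secret</name>
  </usage>
</secret>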
Copy secret.xml to other nodes.
Add the key to libvirt (for all nodes except kosmo-arch):
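For example (the UUID is the one from secret.xml):
virsh secret-define --file secret.xml
# set the oneadmin key as the secret value
virsh secret-set-value --secret GENERATED-UUID-GOES-HERE --base64 $(cat /etc/ceph/oneadmin.key)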
Check:
Restart libvirtd:
Converting the database to MySQL:
Downloading script:
Converting:
Change /etc/one/oned.conf from
to
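The two config fragments are not shown; the change is the DB section of /etc/one/oned.conf, roughly as follows (user, password, and db_name are assumptions and must match the MariaDB setup above):
# from:
DB = [ backend = "sqlite" ]

# to:
DB = [ backend = "mysql",
       server  = "localhost",
       port    = 0,
       user    = "oneadmin",
       passwd  = "oneadmin",
       db_name = "opennebula" ]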
Copy oned.conf as root to the other nodes, except kosmo-arch.
Check kosmo-virt2 and kosmo-virt3 nodes in turn:
Check the logs for errors (/var/log/one/oned.log, /var/log/one/sched.log, /var/log/one/sunstone.log).
Creating HA resources
On all nodes except kosmo-arch:
From any of the nodes except kosmo-arch:
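The resource definitions are not reproduced; a rough crmsh sketch with a virtual IP plus the OpenNebula services grouped together (the VIP 192.168.14.250 and the systemd unit names are assumptions):
# VIP address and unit names are assumptions; adjust to your environment
crm configure primitive one_vip ocf:heartbeat:IPaddr2 params ip=192.168.14.250 cidr_netmask=24 op monitor interval=30s
crm configure primitive opennebula systemd:opennebula op monitor interval=60s
crm configure primitive opennebula-sunstone systemd:opennebula-sunstone op monitor interval=60s
crm configure group opennebula_ha one_vip opennebula opennebula-sunstone
The result can then be inspected with crm status.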
Check
Configuring OpenNebula
Web management is available at http://active_node:9869.
Via web management: 1. Create a cluster. 2. Add the hosts (using the 192.168.14.0 network).
Console management.
3. Add net. (su oneadmin)
4. Create the RBD image datastore (su oneadmin):
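A sketch of the datastore template (the datastore name, monitor list, and secret UUID are assumptions; CEPH_SECRET is the libvirt secret UUID created earlier):
cat > ceph_image_ds.tmpl <<EOF
# names, hosts, and UUID below are assumptions
NAME        = ceph_img
DS_MAD      = ceph
TM_MAD      = ceph
DISK_TYPE   = RBD
POOL_NAME   = one
CEPH_USER   = oneadmin
CEPH_SECRET = "GENERATED-UUID-GOES-HERE"
CEPH_HOST   = "kosmo-virt1:6789 kosmo-virt2:6789 kosmo-virt3:6789"
BRIDGE_LIST = "kosmo-virt1 kosmo-virt2 kosmo-virt3"
EOF
onedatastore create ceph_image_ds.tmpl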
5. Create the Ceph system datastore.
Check the last datastore ID number, N.
On all nodes, create the directory and mount CephFS,
where K is the IP of the current node.
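For example (N is the datastore ID determined above, K the last octet of the current node's backend IP; passing the admin key on the command line is just one option):
mkdir -p /var/lib/one/datastores/N
mount -t ceph 172.19.254.K:6789:/ /var/lib/one/datastores/N -o name=admin,secret=$(ceph auth get-key client.admin)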
From one node, change the permissions:
Create the Ceph system datastore (su oneadmin):
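A sketch of the system datastore template (the name is an assumption; TM_MAD=shared because the datastore directory is a CephFS mount shared by all nodes):
cat > ceph_system_ds.tmpl <<EOF
# datastore name is an assumption
NAME    = ceph_system
TYPE    = SYSTEM_DS
TM_MAD  = shared
EOF
onedatastore create ceph_system_ds.tmpl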
6. Add the nodes, vnets, and datastores to the created cluster via web management.
HA VM
The official documentation is here.
One comment, though: I use the migrate command instead of recreate.
BACKUP
Some words about backup.
Use the persistent image type for this backup scheme.
For backup, a single Linux server, kosmo-arch (a Ceph client), with ZFS on Linux installed is used. ZFS deduplication is enabled on the zpool. (Remember that deduplication requires about 2 GB of RAM per 1 TB of storage space.)
An example of a simple script run by cron:
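The original script is not reproduced here; a rough sketch that exports every image of the one pool into the ZFS-backed directory (the pool name, client name, and target path are assumptions):
#!/bin/bash
# hypothetical cron backup sketch: dump all RBD images from the "one" pool
DEST=/backup/one
DATE=$(date +%Y-%m-%d)
mkdir -p "$DEST/$DATE"
for IMG in $(rbd ls -p one --id oneadmin); do
    rbd export --id oneadmin one/"$IMG" "$DEST/$DATE/$IMG.img"
done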
Use the onevm utility or the web interface (see the VM template) to find out which image is assigned to which VM.
PS
Don't forget to change the VM storage driver to virtio (vda device); there are virtio drivers for Windows as well. Without this you will face low I/O performance (no more than 100 MB/s). I saw 415 MB/s with the virtio drivers.
Links.
1. Official OpenNebula documentation
2. Official Ceph documentation
3. Converting SQLite to MySQL
4. Converting the OpenNebula DB to MySQL
5. HA on RHEL 7
6. Cluster with crmsh