In this post, I'll walk you through:
- Installing the RPMs that comprise the DevKit
- Using devnodectl to start up a cluster
- Accessing the database instance with the mysql client
- Dumping and importing data from a MySQL instance
- Starting MySQL replication as a slave
Along the way I want to highlight the following features of Clustrix:
- Drop-in compatibility with MySQL, including replication
- Single-instance database (no sharding!)
Let's get started!
Installation
First download from the main page at www.clustrix.com, the "Try it now" button. You'll need to download and install the common and devnode packages as follows:
[root@beta001 v3.2]# rpm -ivh clustrix-common-v3.2-493.x86_64.rpm clustrix-devnode-v3.2-7371.x86_64.rpm
Preparing... ########################################### [100%]
1:clustrix-common ########################################### [ 50%]
2:clustrix-devnode ########################################### [100%]
If you run into any errors installing the RPMs, it's likely to be due to missing dependencies, which you may be able to fill in using yum. Otherwise, you'll probably need to find a client with a more recent Redhat/CentOS install (for this exercise, I'm using CentOS 6).
What does the RPM actually install? Everything installs under /opt/clustrix; support libraries in
lib/, and the two executables we care about in bin/: devnode and devnodectl.
You'll also need a data directory for the cluster to store files which will serve as virtual disks. It is highly recommended that you store these on local disk rather than over NFS. Bear in mind that with proper Clustrix nodes, this storage is provided with SSDs, thus the throughput capacity of your local storage will be a constraining factor in performance of a devnode cluster (a subject I plan to revisit in a future blog post). Here I'm going to use /data/clustrix, which is the default for the devnodectl utility (if you use a different dir, you'll need to use the -d/--data-dir argument each time you run devnodectl):
[root@beta001 v3.2]# mkdir /data/clustrix
[root@beta001 v3.2]# chown nparrish /data/clustrix
If you are running as a non-root user (which is advised), you'll also want to chown the directory to your username, as shown.
Initializing the cluster
devnodectl takes care of starting up devnode processes, specifying flags so that they automatically join together into a cluster. In later blog posts we'll take a more manual approach, in order to demonstrate things like adding nodes to an existing cluster, running devnode instances on multiple clients, and fault tolerance features. But for now, firing up the cluster is as simple as:
[nparrish@beta001 ~]$ /opt/clustrix/bin/devnodectl --init --nodes 3 start
Let's look at the output this returns:
[exec] /opt/clustrix/bin/devnode -clusterpath /data/clustrix -cluster nparrish -setpnid 1 -anyport -unix -glue -noautostart -nclean 4 -vdev-size 256
[exec] /opt/clustrix/bin/devnode -clusterpath /data/clustrix -cluster nparrish -setpnid 2 -anyport -unix -noautostart -nclean 4 -vdev-size 256
[exec] /opt/clustrix/bin/devnode -clusterpath /data/clustrix -cluster nparrish -setpnid 3 -anyport -unix -noautostart -nclean 4 -vdev-size 256
This shows us the actual devnode commands being run. I'll draw your attention to a few of the options specified:
-clusterpath /data/clustrix: All cluster state, including the files used for virtual disks, will be in this dir.
-nclean 4 -vdev-size 256: This gives us 4 disks of 256MB each, for each of the three nodes.
Waiting for node 1 to enter quorum.
done.
Node 1 [RUNNING]: /data/clustrix/p1
mysql: 3306 control: 52924 healthmon:3581
mysql socket: /data/clustrix/p1/mysql.sock
Node 2 [RUNNING]: /data/clustrix/p2
mysql: 57059 control: 57079 healthmon:52498
mysql socket: /data/clustrix/p2/mysql.sock
Node 3 [RUNNING]: /data/clustrix/p3
mysql: 58157 control: 2048 healthmon:52002
mysql socket: /data/clustrix/p3/mysql.sock
Example command to access your cluster:
mysql -u root -S /data/clustrix/p3/mysql.sock
This tells us how to access each of the three nodes that have been started. The -anyport command means that each node gets a random MySQL port (after trying for the default 3306, which you can see node 1 gets -- if your client is already running a mysqld instance, all nodes would get some different port number), and this output tells you what port to specify for your clients to connect. You can also connect via socket, as shown in the example. The control and healthmon ports will be covered in a future blog post.
Connecting to the cluster
[nparrish@beta001 ~]$ mysql -h 127.0.0.1 -P 58157 -u root
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3074
Server version: 5.0.45
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select @@version;
+----------------------------------------------------+
| @@version |
+----------------------------------------------------+
| 5.0.45-clustrix-v3.2-7371-0f71a74d89c512ac-release |
+----------------------------------------------------+
1 row in set (0.00 sec)
A few things to note here:
- This is standard, off-the-shelf mysql command line client (hence the Oracle copyright)
- I connected using the IP for localhost; mysql client tries to be smart and use the default mysql socket (/var/lib/mysql/mysql.sock) if you specify -h localhost
- Clustrix reports server version 5.0.45 for compatibility only; from the full version string in @@version, the relevant Clustrix part starts at v3.2)
- We are connecting with built-in root user, which initially has no password
Let's create a table and insert a little bit of data:
mysql> use test;
Database changed
mysql> create table foo (id int key, v varchar(100));
Query OK, 0 rows affected (0.09 sec)
mysql> insert into foo values (1, 'bar'), (2, 'baz');
Query OK, 2 rows affected (0.00 sec)
mysql> select * from foo;
+----+------+
| id | v |
+----+------+
| 1 | bar |
| 2 | baz |
+----+------+
2 rows in set (0.03 sec)
Nothing fancy, but let's now connect to a different node and see how things look:
mysql> exit
Bye
[nparrish@beta001 ~]$ mysql -h 127.0.0.1 -P 57059 -u root test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
[...]
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| foo |
+----------------+
1 row in set (0.00 sec)
mysql> select * from foo;
+----+------+
| id | v |
+----+------+
| 2 | baz |
| 1 | bar |
+----+------+
2 rows in set (0.01 sec)
Note that here I've connected to a different node, but I have the exact same view of my data. You might also note that the rows were returned in a different order. On Clustrix, rows are distributed across multiple nodes, and so if no ORDER BY clause is specified, may be returned in different order. This differs from MySQL, which implicitly orders by primary key (since it must only read from one place).
mysqldump and clustrix_import
Let's import data from an existing MySQL instance (here running on a different client, called hefty). We start by dumping with mysqldump:
[nparrish@hefty tmp]$ mysqldump -h 127.0.0.1 --single-transaction --master-data=2 sbtest > /tmp/sbtest.sql
- --single-transaction ensures that all tables in the database are consistent with respect to eachother
- --master-data=2 gives us the position in the replication binlog that corresponds to that single transaction, so we can start up a replication slave
- Here we are just dumping a specific database, sbtest. We'll also need to create this database on our cluster, as the resulting dump file will not have DROP/CREATE DATABASE statements for it.
We can then import this using mysql client:
[nparrish@hefty tmp]$ mysql -h beta001 -P 3306 sbtest < /tmp/sbtest.sql
A few things to note here:
- As noted, we need to indicate which database to import into, so we must first CREATE DATABASE sbtest; then specify sbtest argument to indicate the target database.
- If we were importing into actual nodes, we'd use a tool called clustrix_import, which does the inserts in parallel; the tool has not been (de)tuned for use with devnode, so using these together will result in an out-of-memory condition.
Recall that these nodes were created with 4 256MB disks each, so by default the cluster will accomodate only a small (~1GB) dump file. If you want larger disks, use the --drive-size option when running devnodectl --init (see devnodectl -h output for details).
MySQL Replication
We'll dedicate a whole future blog post to replication, but for just a taste let's start up a slave to catch up from the point at which we took the above dump. First we need to find the log file and position, which is stored as a comment in our dump file:
[nparrish@hefty tmp]$ head -24 /tmp/sbtest.sql
-- MySQL dump 10.11
--
[....]
-- Position to start replication or point-in-time recovery from
--
-- CHANGE MASTER TO MASTER_LOG_FILE='hefty-bin.000001', MASTER_LOG_POS=610868;
So we just need to add the hostname and user account to use in order to point our cluster at this MySQL instance (note that we can run this on any of our nodes):
mysql> CHANGE MASTER TO MASTER_LOG_FILE='hefty-bin.000001', MASTER_LOG_POS=610868, MASTER_HOST='hefty', MASTER_USER='root';
Query OK, 0 rows affected (0.04 sec)
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_Name: default
Slave_Status: Running
Master_Host: hefty
Master_Port: 3306
Master_User: root
Master_Log_File: hefty-bin
Slave_Enabled: Enabled
Log_File_Seq: 2
Log_File_Pos: 267
Last_Error: no error
Connection_Status: Connected
Relay_Log_Bytes_Read: 0
Relay_Log_Current_Size: 0
Seconds_Behind_Master: 0
1 row in set (0.01 sec)
And we're off and replicating any write events on the MySQL instance.
Stopping devnodes
To shut down your cluster, run devnodectl stop:
[nparrish@beta001 clustrix]$ /opt/clustrix/bin/devnodectl stop
[node /data/clustrix/p1] kill -9 21089
[node /data/clustrix/p2] kill -9 20078
[node /data/clustrix/p3] kill -9 20079
As noted, the processes are simply stopped with kill -9. You can restart them with devnodectl start, this time leaving out the --init and --nodes options; note that these processes will probably get different mysql ports than for the prior run. We'll spend more time with restarting devnode processes when we cover fault tolerance in a future post.
Recap
So we've covered some of the initial steps you'll be taking with the Clustrix developers kit:
- Installing the RPMs
- Starting up a simple three node cluster with devnodectl
- Connecting to the cluster with mysql
- Using mysqldump to get data from an existing MySQL instance, then import with mysql onto your cluster
- Using MySQL replication to set up your cluster as a slave to that MySQL instance
If you're familiar with MySQL, most of this should look pretty familiar, and that's quite the point -- the drop-in compatibility you get with Clustrix means there's no application rewrite involved. Out of the box you get a cluster which hosts a single-instance database, no federation or sharding mess.
What's next
Stay tuned for more. Future topics will include a closer look at how data is stored on Clustrix, deeper coverage of replication capabilities, running devnode on multiple hosts, fault tolerance, examining query performance, and more.
Please come talk to us on the support forums at https://groups.google.com/a/clustrix.com/group/support-public/topics if you run into any problems or have more technical questions.
0 comments:
Post a Comment