Mysql Table Horizontal Split By Mycat

Server environment

1
2
3
4
5
6
7
8
$ cat /etc/issue
Ubuntu 14.04.5 LTS

$ mysqld -V
mysqld Ver 5.5.58-0ubuntu0.14.04.1 for debian-linux-gnu on x86_64 ((Ubuntu))

$ java -version
java version "1.7.0_151"

Install Java7+

1
$ sudo apt-get install -y openjdk-7-jre

Install Mycat

Download the zip file from github

1
2
3
4
5
6
7
8
9
10
11
$ wget http://dl.mycat.io/1.6-RELEASE/Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz
$ tar zxvf Mycat-server-1.6-RELEASE-20161028204710-linux.tar.gz
$ cd mycat/
$ tree -L 1 .
.
├── bin
├── catlet
├── conf
├── lib
├── logs
└── version.txt

Set up Mysql account

1
$ vim conf/server.xml
1
2
3
4
5
6
7
8
9
10
11
12
...
<user name="user1">
<property name="password">password1</property>
<property name="schemas">virtual_database</property>
</user>

<user name="user2">
<property name="password">password2</property>
<property name="schemas">virtual_database</property>
<property name="readOnly">true</property>
</user>
...

There are 2 users, user1 and user2, passwords are password1 and password2, user2 users are read-only

Set up Mysql nodes

1
$ vim conf/schema.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">

<schema name="virtual_database" checkSQLschema="false" sqlMaxLimit="100">

<!-- auto sharding by id (long) -->
<table name="travelrecord" dataNode="dn1,dn2,dn3" rule="auto-sharding-long" />

<!-- global table is auto cloned to all defined data nodes ,so can join with any table whose sharding node is in the same data node -->
<table name="company" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3" />
<table name="goods" primaryKey="ID" type="global" dataNode="dn1,dn2" />

<!-- random sharding using mod sharind rule -->
<table name="hotnews" primaryKey="ID" autoIncrement="true" dataNode="dn1,dn2,dn3" rule="mod-long" />

<table name="employee" primaryKey="ID" dataNode="dn1,dn2" rule="sharding-by-intfile" />

<table name="customer" primaryKey="ID" dataNode="dn1,dn2" rule="sharding-by-intfile">
<childTable name="orders" primaryKey="ID" joinKey="customer_id" parentKey="id">
<childTable name="order_items" joinKey="order_id" parentKey="id" />
</childTable>
<childTable name="customer_addr" primaryKey="ID" joinKey="customer_id" parentKey="id" />
</table>

</schema>

<dataNode name="dn1" dataHost="devdb" database="app1" />
<dataNode name="dn2" dataHost="devdb" database="app2" />
<dataNode name="dn3" dataHost="devdb" database="app3" />

<dataHost name="devdb" maxCon="1000" minCon="10" balance="0" writeType="0" dbType="mysql" dbDriver="native" switchType="1" slaveThreshold="100">

<heartbeat>select user()</heartbeat>

<!-- can have multi write hosts -->
<writeHost host="hostM1" url="localhost:3306" user="user1" password="password1">
<!-- can have multi read hosts -->
<!--<readHost host="hostS2" url="127.0.0.1:3306" user="user1" password="password1" />-->
</writeHost>

<writeHost host="hostS1" url="localhost:3306" user="user2" password="password2" />
</dataHost>
</mycat:schema>

There are three main parts:
The first is schema block, which describe the tables in schema of virtual database. Which nodes or algorithms are using on these table.

  • For example It means that travelrecord table is distributed in dn1,dn2,dn3 these three nodes and used auto-sharding-long algorithm.

The second is the data nodes. It tells Mycat Which nodes are included in database and mapping the actual data server and database.
I am going to configure three nodes dn1, dn2, dn3 which are in local server and databases ‘s names are app1, app2, app3.

The third is data host. This part is the actual server configuration.
I will config two Mysql database, hostM1 and hostS1, address is “localhost:3306” and username are “user1”,”user1” and password are “password1”,”password2”.

Then specify the heart beat method “select user()”.

Change the log level from info to debug

1
$ vim conf/log4j2.xml
1
2
3
4
5
6
<asyncRoot level="debug" includeLocation="true">

<AppenderRef ref="Console" />
<AppenderRef ref="RollingFile"/>

</asyncRoot>

Create three database

In this tutorial, just one database, so hostM1 and hostS1 are the same machine

hostM1:

1
2
3
mysql> create database app1;
mysql> create database app2;
mysql> create database app3;

hostS1:

1
2
3
mysql> create database app1;
mysql> create database app2;
mysql> create database app3;

Create two database user

hostM1:

1
2
3
mysql> CREATE USER user1 IDENTIFIED BY 'password1';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'user1'@'%' ;
mysql> FLUSH PRIVILEGES;

hostS1:

1
2
3
mysql> CREATE USER user2 IDENTIFIED BY 'password2';
mysql> GRANT ALL PRIVILEGES ON *.* TO 'user2'@'%' ;
mysql> FLUSH PRIVILEGES;

Start Mycat

1
2
$ ./bin/mycat start
Starting Mycat-server...

Track the log for connecting info

1
$ tail -f logs/mycat.log

Login Mycat

  • Mycat default port is 8066
1
$ mysql -uuser1 -ppassword1 -h127.0.0.1 -P8066 -Dvirtual_database

Check databases

1
2
3
4
5
6
7
mysql> show databases;
+------------------+
| DATABASE |
+------------------+
| virtual_database |
+------------------+
1 row in set (0.00 sec)

Create a table in schema rule

1
2
3
4
5
6
7
8
9
10
11
12
mysql> explain create table travelrecord (id bigint not null primary key,user_id varchar(100),traveldate DATE, fee decimal,days int);
+-----------+-----------------------------------------------------------------------------------------------------------------------+
| DATA_NODE | SQL |
+-----------+-----------------------------------------------------------------------------------------------------------------------+
| dn1 | create table travelrecord (id bigint not null primary key,user_id varchar(100),traveldate DATE, fee decimal,days int) |
| dn2 | create table travelrecord (id bigint not null primary key,user_id varchar(100),traveldate DATE, fee decimal,days int) |
| dn3 | create table travelrecord (id bigint not null primary key,user_id varchar(100),traveldate DATE, fee decimal,days int) |
+-----------+-----------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> create table travelrecord (id bigint not null primary key,user_id varchar(100),traveldate DATE, fee decimal,days int);
Query OK, 0 rows affected (0.02 sec)

Ok, Mycat is working. Table will be create in three different node.

Create a table that not include in schema rule

1
2
mysql> create table my_table_not_in_rule (id bigint not null primary key, name varchar(100));
ERROR 1064 (HY000): op table not in schema----MY_TABLE_NOT_IN_RULE

Add three records which id are 1000000, 5000001, 10000001

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
mysql> explain insert into travelrecord (id,user_id,traveldate,fee,days) values(1000000,'liang','2017-08-01',96.05,3);
+-----------+-------------------------------------------------------------------------------------------------------+
| DATA_NODE | SQL |
+-----------+-------------------------------------------------------------------------------------------------------+
| dn1 | insert into travelrecord (id,user_id,traveldate,fee,days) values(1000000,'liang','2017-08-01',96.05,3) |
+-----------+-------------------------------------------------------------------------------------------------------+
1 row in set (0.08 sec)

mysql> explain insert into travelrecord (id,user_id,traveldate,fee,days) values(5000001,'liang','2017-08-01',96.05,3);
+-----------+-------------------------------------------------------------------------------------------------------+
| DATA_NODE | SQL |
+-----------+-------------------------------------------------------------------------------------------------------+
| dn2 | insert into travelrecord (id,user_id,traveldate,fee,days) values(5000001,'liang','2017-08-01',96.05,3) |
+-----------+-------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

mysql> explain insert into travelrecord (id,user_id,traveldate,fee,days) values(10000001,'liang','2017-08-01',96.05,3);
+-----------+---------------------------------------------------------------------------------------------------------+
| DATA_NODE | SQL |
+-----------+---------------------------------------------------------------------------------------------------------+
| dn3 | insert into travelrecord (id,user_id,traveldate,fee,days) values(10000001,'liang','2017-08-01',96.05,3) |
+-----------+---------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Auto sharding into different node by id

1
2
3
4
5
6
7
8
mysql> insert into travelrecord (id,user_id,traveldate,fee,days) values(1000000,'liang','2017-08-01',96.05,3);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into travelrecord (id,user_id,traveldate,fee,days) values(5000001,'liang','2017-08-01',96.05,3);
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> insert into travelrecord (id,user_id,traveldate,fee,days) values(10000001,'liang','2017-08-01',96.05,3);
Query OK, 1 row affected, 1 warning (0.00 sec)

Find the data

1
2
3
4
5
6
7
8
9
mysql> explain select * from travelrecord where id >= 2000000;
+-----------+----------------------------------------------------------+
| DATA_NODE | SQL |
+-----------+----------------------------------------------------------+
| dn1 | SELECT * FROM travelrecord WHERE id >= 2000000 LIMIT 100 |
| dn2 | SELECT * FROM travelrecord WHERE id >= 2000000 LIMIT 100 |
| dn3 | SELECT * FROM travelrecord WHERE id >= 2000000 LIMIT 100 |
+-----------+----------------------------------------------------------+
3 rows in set (0.00 sec)

It’s easy to know that it is querying on each database and limit 100 default

Add New Table Rule

MyCAT ‘s schema.xml can be loaded dynamically

1
$ vim conf/schema.xml
1
2
3
4
<schema name="virtual_database" checkSQLschema="false" sqlMaxLimit="100">
...
<table name="t_order" primaryKey="id" autoIncrement="true" dataNode="dn1,dn2,dn3" rule="sharding-by-month" />
</schema>

Add new table t_order and use “sharding-by-month” rule. It is Mycat configured rule for the field “create_time” to splitting monthly which in file “conf/rule.xml”.

Use Mycat manage port for real-time refresh rules

  • Mycat admin port default is 9066
1
$ mysql -uuser1 -ppassword1 -h127.0.0.1 -P9066 -Dvirtual_database

Refresh Mycat config in admin

1
2
3
mysql> reload @@config;
Query OK, 1 row affected (0.04 sec)
Reload config success

Access Mycat, you can see a new t_order table which is just a virtual table for mapping management

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
$ mysql -uuser1 -ppassword1 -h127.0.0.1 -P8066 -Dvirtual_database
...
mysql> show tables;
+----------------------------+
| Tables in virtual_database |
+----------------------------+
| company |
| customer |
| customer_addr |
| employee |
| goods |
| hotnews |
| orders |
| order_items |
| travelrecord |
| t_order |
+----------------------------+
10 rows in set (0.00 sec)

Create table t_order

1
2
3
4
5
CREATE TABLE `t_order` (
`id` varchar(32) NOT NULL,
`order_id` varchar(64) DEFAULT NULL,
`create_time` datetime
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
1
2
3
4
5
6
7
8
mysql> insert into t_order(id,order_id,create_time) values('11','order_1001','2017-01-01 10:00:00');
ERROR 1064 (HY000): Can't find a valid data node for specified node index :T_ORDER -> CREATE_TIME -> 2017-01-01 10:00:00 -> Index : 24

mysql> insert into t_order(id,order_id,create_time) values('12','order_1002','2017-02-01 10:00:00');
ERROR 1064 (HY000): Can't find a valid data node for specified node index :T_ORDER -> CREATE_TIME -> 2017-02-01 10:00:00 -> Index : 25

mysql> insert into t_order(id,order_id,create_time) values('13','order_1003','2017-03-01 00:00:00');
ERROR 1064 (HY000): Can't find a valid data node for specified node index :T_ORDER -> CREATE_TIME -> 2017-03-01 00:00:00 -> Index : 26

Because the rule “sharding-by-month” in file conf/rule.xml, the start time is defined as 2015-01-1, need to be modified to 2017-01-01

1
2
3
4
5
6
7
8
9
10
11
12
13
<tableRule name="sharding-by-month">
<rule>
<columns>create_time</columns>
<algorithm>partbymonth</algorithm>
</rule>
</tableRule>
...
<function name="partbymonth"
class="io.mycat.route.function.PartitionByMonth">
<property name="dateFormat">yyyy-MM-dd</property>
<!-- <property name="sBeginDate">2015-01-01</property> -->
<property name="sBeginDate">2017-01-01</property>
</function>

Increase the number of nodes to 12

1
$ vim conf/schema.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
<schema name="virtual_database" checkSQLschema="false" sqlMaxLimit="100">
...
<table name="t_order" primaryKey="id" autoIncrement="true" dataNode="dn$1-12" rule="sharding-by-month" />
</schema>
...
<dataNode name="dn1" dataHost="devdb" database="app1" />
<dataNode name="dn2" dataHost="devdb" database="app2" />
<dataNode name="dn3" dataHost="devdb" database="app3" />

<dataNode name="dn4" dataHost="devdb" database="app1" /> <!-- new -->
<dataNode name="dn5" dataHost="devdb" database="app2" /> <!-- new -->
<dataNode name="dn6" dataHost="devdb" database="app3" /> <!-- new -->

<dataNode name="dn7" dataHost="devdb" database="app1" /> <!-- new -->
<dataNode name="dn8" dataHost="devdb" database="app2" /> <!-- new -->
<dataNode name="dn9" dataHost="devdb" database="app3" /> <!-- new -->

<dataNode name="dn10" dataHost="devdb" database="app1" /> <!-- new -->
<dataNode name="dn11" dataHost="devdb" database="app2" /> <!-- new -->
<dataNode name="dn12" dataHost="devdb" database="app3" /> <!-- new -->
...

Refresh Mycat config

1
2
3
mysql> reload @@config_all;
Query OK, 1 row affected (0.03 sec)
Reload config success

Add New records

1
2
3
4
5
6
mysql> insert into t_order(id,order_id,create_time) values('11','order_1001','2017-01-01 10:00:00');
mysql> insert into t_order(id,order_id,create_time) values('12','order_1002','2017-02-01 10:00:00');
mysql> insert into t_order(id,order_id,create_time) values('13','order_1003','2017-03-01 00:00:00');
mysql> insert into t_order(id,order_id,create_time) values('14','order_1003','2017-04-01 00:00:00');
mysql> insert into t_order(id,order_id,create_time) values('15','order_1003','2017-05-01 00:00:00');
mysql> insert into t_order(id,order_id,create_time) values('16','order_1003','2017-06-01 00:00:00');

Mycat monthly division can only be scheduled within a year!!!!!
I do not know if there is any other way

Meet some errors

  • Unknown database
    1
    2
    $ mysql -uuser1 -ppassword1 -h127.0.0.1 -P8066 -Dvirtual_database
    ERROR 1049 (42000): Unknown database 'virtual_database'

    The solution is to specify the ip address…

    1
    2
    3
    4
    5
    6
    7
    $ mysql -uuser1 -ppassword1 -h127.0.0.1 -P8066 -Dvirtual_database

    Welcome to the MySQL monitor. Commands end with ; or \g.
    Server version: 5.6.29-mycat-1.6-RELEASE-20161028204710 MyCat Server (OpenCloundDB)

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    mysql>
  • After dynamically loading scheme, NullPointerException occurs
    1
    ERROR 3009 (HY000): java.lang.NullPointerException

    adjusting nodes, you should use config_all

    1
    2
    3
    mysql> reload @@config_all;
    Query OK, 1 row affected (0.03 sec)
    Reload config success
  • Can’t find table define in schema
    1
    java.sql.SQLNonTransientException: can't find table define in schema "USERS" schema:virtual_database

    Many sql syntax errors will be reported this error…
    I’m using this library catfan/Medoo then show this error

  • please make sure the primaryKey’s config is not null in schemal.xml

    Make sure the schema.xml ‘s configuration is right: primaryKey=”ID” autoIncrement=”true”

    1
    <table name="users" primaryKey="ID" autoIncrement="true" dataNode="dn1,dn2,dn3" rule="auto-sharding-long" />
  • can’t find any valid datanode :USERS -> ID -> 947870356817973248

    If sharding rule is auto-sharding-long, it will limit the id range.
    Datanode1: 1~5000001
    Datanode2: 5000001~10000001
    Datanode3: 10000001~15000001

    When id is more than 15000001, it will show this error:
    [Err] 1064 - can’t find any valid datanode :TB_ITEM -> ID -> 15000002
    At this point you need to add a new node.

Share