mysql的海量数据优化

阅读次数: 51

  • A+
所属分类:MYSQL 建站 数据库

针对海量数据的优化主要有2种

大表拆小表的方式、SQL语句的优化。

SQL语句的优化:可以通过增加索引等来调整,但是数据量的增大将会导致索引的维护代价增大。

大表拆小表:

竖切

横切

这两种都会有缺点

分区技术:

RANGE分区:基于属于一个给定的连续区间的列值,把多行分配给分区

LIST分区:类似于RANGE分区,区别在于LIST分区是基于列值匹配一个离散值集合中的某个值来进行选择(枚举类型的值,可以用这个)

HASH分区:基于用户定义的表达式的返回值来进行选择的分区,该表达式使用将要插入到表中的这些行的列值进行计算,这个函数可以包含MYSQL中有效的,产生非负整数值的任何表达式

KEY分区:类似于按HASH分区,区别在于KEY分区只支持计算一列或多列,且MYSQL服务器提供其自身的哈希函数

遗留问题:

一张表中

a.主键 ID

b.user_id 用户ID

c.pro_name 产品名称

其中 同一个用户不能拥有相同的产品名称(即为 user_id 和 pro_name 为unique索引)

那么此时用 HASH来分区我真的不知道要怎么分区了,因为一直报错误。

附例子


Partitioning Keys, Primary Keys, and Unique Keys

分区键、               主键、                     唯一性索引

This section discusses the relationship of partitioning keys with primary keys and unique keys. The rule governing this relationship can be expressed as follows: All columns used in the partitioning expression for a partitioned table must be part of every unique key that the table may have.

这一节将讨论分区键和主键索引及唯一性索引之间的关系:

可以这样说:

在分区表上,用于分区表达式里的每一个字段都必须是唯一性索引的一部分。

In other words, every unique key on the table must use every column in the table's partitioning expression. (This also includes the table's primary key, since it is by definition a unique key. This particular case is discussed later in this section.) For example, each of the following table creation statements is invalid:

换句话说,表上的每一个唯一性索引必须用于分区表的表达式上(其中包括主键索引)。

例如:以下案例,建立分区表是无效的。

mysql> CREATE TABLE t1 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     UNIQUE KEY (col1, col2)
    -> )
    -> PARTITION BY HASH(col3)
    -> PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

mysql> CREATE TABLE t2 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     UNIQUE KEY (col1),
    ->     UNIQUE KEY (col3)
    -> )
    -> PARTITION BY HASH(col1 + col3)
    -> PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

In each case, the proposed table would have at least one unique key that does not include all columns used in the partitioning expression.

  以上案例:对于唯一性索引键,至少有一个字段不包含在分区表达式里

Each of the following statements is valid, and represents one way in which the corresponding invalid table creation statement could be made to work:

mysql> CREATE TABLE t1 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     UNIQUE KEY (col1, col2, col3)
    -> )
    -> PARTITION BY HASH(col3)
    -> PARTITIONS 4;

Query OK, 0 rows affected (4.70 sec)

mysql> CREATE TABLE t2 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     UNIQUE KEY (col1, col3)
    -> )
    -> PARTITION BY HASH(col1 + col3)
    -> PARTITIONS 4;

Query OK, 0 rows affected (2.93 sec)

This example shows the error produced in such cases:

mysql> CREATE TABLE t3 (
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->          col4 INT NOT NULL,
    ->          UNIQUE KEY (col1, col2),
    ->         UNIQUE KEY (col3)
    ->      )
    ->      PARTITION BY HASH(col1 + col3)
    ->     PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

mysql> CREATE TABLE t3a (
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->          col4 INT NOT NULL,
    ->          UNIQUE KEY (col1, col2),
    ->         UNIQUE KEY (col3)
    ->      )
    ->      PARTITION BY HASH(col1 + col2)
    ->     PARTITIONS 4;

ERROR 1503 (HY000): A UNIQUE INDEX must include all columns in the table's partitioning function

CREATE TABLE t3 (
         col1 INT NOT NULL,
         col2 DATE NOT NULL,
         col3 INT NOT NULL,
         col4 INT NOT NULL,
         UNIQUE KEY (col1, col2),
        UNIQUE KEY (col3)
     )
     PARTITION BY HASH(col1 + col2+col3 )
    PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

The CREATE TABLE statement fails because both col1 and col3 are included in the proposed partitioning key, but neither of these columns is part of both of unique keys on the table. This shows one possible fix for the invalid table definition:

    以下cretate table语句失败了,是因为col1和col3不同时属于表里两个唯一性索引键。另外一个案例显示如何修复这个问题:

mysql> CREATE TABLE t3 (
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->          col4 INT NOT NULL,
    ->          UNIQUE KEY (col1, col2,col3),
    ->         UNIQUE KEY (col3)
    ->      )
    ->      PARTITION BY HASH(col1 + col2+col3 )
    ->     PARTITIONS 4;

ERROR 1503 (HY000): A UNIQUE INDEX must include all columns in the table's partitioning function

mysql> CREATE TABLE t3 (
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->          col4 INT NOT NULL,
    ->          UNIQUE KEY (col1, col2,col3),
    ->         UNIQUE KEY (col3)
    ->      )
    ->      PARTITION BY HASH(col3 )
    ->     PARTITIONS 4;

Query OK, 0 rows affected (3.11 sec)

In this case, the proposed partitioning key col3 is part of both unique keys, and the table creation statement succeeds.

以上可以看出,col3同时属于两个唯一性索引的键,所以create table执行成功。

The following table cannot be partitioned at all, because there is no way to include in a partitioning key any columns that belong to both unique keys:

   如下所示,此分区表是无法建立的,因为没有一个分区键,可以同时属于两个唯一性索引的键。

CREATE TABLE t4 (
    col1 INT NOT NULL,
    col2 INT NOT NULL,
    col3 INT NOT NULL,
    col4 INT NOT NULL,
    UNIQUE KEY (col1, col3),
    UNIQUE KEY (col2, col4)
);

Since every primary key is by definition a unique key, this restriction also includes the table's primary key, if it has one. For example, the next two statements are invalid:

主键也属于唯一性索引,所以以上规则适合于primary key。

mysql> CREATE TABLE t5 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     PRIMARY KEY(col1, col2)
    -> )
    -> PARTITION BY HASH(col3)
    -> PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

mysql> CREATE TABLE t6 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     PRIMARY KEY(col1, col3),
    ->     UNIQUE KEY(col2)
    -> )
    -> PARTITION BY HASH( YEAR(col2) )
    -> PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

mysql> CREATE TABLE t6 (
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->        col4 INT NOT NULL,
    ->          PRIMARY KEY(col1,col2, col3),
    ->          UNIQUE KEY(col2)
    ->      )
    ->      PARTITION BY HASH( YEAR(col2) )
    ->      PARTITIONS 4;

Query OK, 0 rows affected (2.88 sec)

mysql> CREATE TABLE t6a(
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->        col4 INT NOT NULL,
    ->          PRIMARY KEY(col1,col2, col3)
    ->      )
    ->      PARTITION BY HASH( YEAR(col2) )
    ->      PARTITIONS 4;

Query OK, 0 rows affected (3.51 sec)

mysql> CREATE TABLE t6b(
    ->          col1 INT NOT NULL,
    ->          col2 DATE NOT NULL,
    ->          col3 INT NOT NULL,
    ->        col4 INT NOT NULL,
    ->          PRIMARY KEY(col1,col2, col3)
    ->      )
    ->      PARTITION BY HASH( col3)
    ->      PARTITIONS 4;

Query OK, 0 rows affected (4.26 sec)

In both cases, the primary key does not include all columns referenced in the partitioning expression. However, both of the next two statements are valid:

以上两个案例,分区表达式里的字段不包含所有的主键字段。

mysql> CREATE TABLE t7 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     PRIMARY KEY(col1, col2)
    -> )
    -> PARTITION BY HASH(col1 + YEAR(col2))
    -> PARTITIONS 4;

Query OK, 0 rows affected (4.40 sec)

mysql> CREATE TABLE t8 (
    ->     col1 INT NOT NULL,
    ->     col2 DATE NOT NULL,
    ->     col3 INT NOT NULL,
    ->     col4 INT NOT NULL,
    ->     PRIMARY KEY(col1, col2, col4),
    ->     UNIQUE KEY(col2, col1)
    -> )
    -> PARTITION BY HASH(col1 + YEAR(col2))
    -> PARTITIONS 4;

Query OK, 0 rows affected (3.51 sec)

If a table has no unique keys—this includes having no primary key—then this restriction does not apply, and you may use any column or columns in the partitioning expression as long as the column type is compatible with the partitioning type.

对于没有建立主键或唯一性索引的表,可以将任何兼容分区表的字段放在分区表达式中。

For the same reason, you cannot later add a unique key to a partitioned table unless the key includes all columns used by the table's partitioning expression. Consider the partitioned table created as shown here:

基于同样的原因,你在分区表上添加唯一性索引时,必须符合以上规则。分区表表达式的键,必须包含在所有的唯一性索引中

mysql> CREATE TABLE t_no_pk (c1 INT, c2 INT)
    ->     PARTITION BY RANGE(c1) (
    ->         PARTITION p0 VALUES LESS THAN (10),
    ->         PARTITION p1 VALUES LESS THAN (20),
    ->         PARTITION p2 VALUES LESS THAN (30),
    ->         PARTITION p3 VALUES LESS THAN (40)
    ->     );

Query OK, 0 rows affected (0.12 sec)

It is possible to add a primary key to t_no_pk using either of these ALTER TABLE statements:

#  possible PK

mysql> ALTER TABLE t_no_pk ADD PRIMARY KEY(c1);

Query OK, 0 rows affected (0.13 sec)

Records: 0  Duplicates: 0  Warnings: 0

# drop this PK

mysql> ALTER TABLE t_no_pk DROP PRIMARY KEY;

Query OK, 0 rows affected (0.10 sec)

Records: 0  Duplicates: 0  Warnings: 0

#  use another possible PK

mysql> ALTER TABLE t_no_pk ADD PRIMARY KEY(c1, c2);

Query OK, 0 rows affected (0.12 sec)

Records: 0  Duplicates: 0  Warnings: 0

# drop this PK

mysql> ALTER TABLE t_no_pk DROP PRIMARY KEY;

Query OK, 0 rows affected (0.09 sec)

Records: 0  Duplicates: 0  Warnings: 0

However, the next statement fails, because c1 is part of the partitioning key, but is not part of the proposed primary key:

 然而,以下语句会失败,因为分区键 c1,不属于primary key

#  fails with error 1503

mysql> ALTER TABLE t_no_pk ADD PRIMARY KEY(c2);

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

Since t_no_pk has only c1 in its partitioning expression, attempting to adding a unique key on c2 alone fails. However, you can add a unique key that uses both c1 and c2.

These rules also apply to existing nonpartitioned tables that you wish to partition using ALTER TABLE ... PARTITION BY. Consider a table np_pk created as shown here:

mysql> CREATE TABLE np_pk (
    ->     id INT NOT NULL AUTO_INCREMENT,
    ->     name VARCHAR(50),
    ->     added DATE,
    ->     PRIMARY KEY (id)
    -> );

Query OK, 0 rows affected (0.08 sec)

The following ALTER TABLE statement fails with an error, because the added column is not part of any unique key in the table:

以下,alter table语句将会失败,因为分区键,不包含于唯一性索引。

mysql> ALTER TABLE np_pk
    ->     PARTITION BY HASH( TO_DAYS(added) )
    ->     PARTITIONS 4;

ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

However, this statement using the id column for the partitioning column is valid, as shown here:

以下语句将会成功:因为分区键id,包含于主键primary key

mysql> ALTER TABLE np_pk
    ->     PARTITION BY HASH(id)
    ->     PARTITIONS 4;

Query OK, 0 rows affected (0.11 sec)

Records: 0  Duplicates: 0  Warnings: 0

In the case of np_pk, the only column that may be used as part of a partitioning expression is id; if you wish to partition this table using any other column or columns in the partitioning expression, you must first modify the table, either by adding the desired column or columns to the primary key, or by dropping the primary key altogether.

  如以上案例,只有id字段可以作为分区表达式键,如果你还想使用其他的字段用于分区表达式中,你必须修改表结构,添加字段到你的主键里,或者删除主键。


维护命令

添加分区

alter table xxxxxxx add partition (partition p0 values less than(1991));  //只能添加大于分区键的分区 

删除分区

alter table xxxxxxx drop partition p0; //可以删除任意分区

删除分区数据

alter table xxxxxx  truncate partition p1,p2;  
alter table xxxxxx  truncate partition all;  
或  
delete from xxxxxx where separated < '2006-01-01' or (separated >= '2006-01-01' and separated<'2011-01-01'); 

重定义分区(包括重命名分区,伴随移动数据;合并分区)

alter table xxxxx reorganize partition p1,p3,p4 into (partition pm1 values less than(2006),  
partition pm2 values less than(2011));  

rebuild重建分区

alter  table xxxxxx rebuild partition pm1/all; //相当于drop所有记录,然后再reinsert;可以解决磁盘碎片

优化表

alter  table tt2 optimize partition pm1; //在大量delete表数据后,可以回收空间和碎片整理。但在5.5.30后支持。在5.5.30之前可以通过recreate+analyze来替代,如果用rebuild+analyze速度慢 

analzye表

alter  table xxxxxx analyze partition pm1/all; 

check表

alter  table xxxxxx check partition pm1/all;  

sql代码

show create table employees2;  //查看分区表的定义  
show table status like 'employees2'\G;    //查看表时候是分区表 如“Create_options: partitioned”  
select * from information_schema.KEY_COLUMN_USAGE where table_name='employees2';   //查看索引  
SELECT * FROM information_schema.partitions WHERE table_name='employees2'   //查看分区表  
explain partitions select * from employees2 where separated < '1990-01-01' or separated > '2016-01-01';   //查看分区是否被select使用  

 

 

  • 我的微信
  • 这是我的微信扫一扫
  • weinxin
  • 我的微信公众号
  • 我的微信公众号扫一扫
  • weinxin

发表评论

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: