mysql快速导出导入大量数据

mysql快速导出导入大量数据

我的电脑操作系统时ubuntu 16.04,配置是4核16G.
上面装的软件应用除了常用的一些还有:intellij, webstorm, hadoop, zookeeper, cassandra, mysql, canal, nginx, tomcat等应用.
话说一个项目需要导入700w左右的数据,在windows上用navicat导入岁数慢但是还可以看到希望,在ubuntu上到了一天一夜还有三分一没到完因为还有其他表要导入所以决定换方案.

查看数据的存储引擎

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
show engines;
mysql> show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO |
| CSV | YES | CSV storage engine | NO | NO | NO |
| MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO |
| BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) | NO | NO | NO |
| InnoDB | DEFAULT | Supports transactions, row-level locking, and foreign keys | YES | YES | YES |
| MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO |
| ARCHIVE | YES | Archive storage engine | NO | NO | NO |
| MyISAM | YES | MyISAM storage engine | NO | NO | NO |
| FEDERATED | NO | Federated MySQL storage engine | NULL | NULL | NULL |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+

修改数据库的存储引擎

1
2
3
4
5
6
7
root@bls:/usr/local/mysql/mysql-5.6.33# cd /etc/mysql
root@bls:/etc/mysql# ls
conf.d debian.cnf debian-start fabric.cfg my.cnf my.cnf.fallback mysql.cnf mysql.conf.d mysql-fabric-doctrine-1.4.0.zip
root@bls:/etc/mysql# cd conf.d/
root@bls:/etc/mysql/conf.d# ls
mysql.cnf mysqldump.cnf
root@bls:/etc/mysql/conf.d# vim mysql.cnf

编辑mysql.cnf文件,修改默认InnoDB配置

1
2
3
4
5
6
7
8
9
10
11
[mysqld]
#skip_grant_tables 这个先不要,这是忽略权限。
#设置3306端口
port = 3306
# 允许最大连接数
max_connections=200
# 服务端使用的字符集默认为8比特编码的latin1字符集
character-set-server=utf8
# 创建新表时将使用的默认存储引擎, 在这里调整数据库的默认存储引擎
default-storage-engine=MyISAM
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

重启mysql服务

1
service mysql restart

查看当前存储引擎

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
mysql> show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO |
| CSV | YES | CSV storage engine | NO | NO | NO |
| MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO |
| BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) | NO | NO | NO |
| InnoDB | YES | Supports transactions, row-level locking, and foreign keys | YES | YES | YES |
| MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO |
| ARCHIVE | YES | Archive storage engine | NO | NO | NO |
| MyISAM | DEFAULT | MyISAM storage engine | NO | NO | NO |
| FEDERATED | NO | Federated MySQL storage engine | NULL | NULL | NULL |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
9 rows in set (0.00 sec)

文件数据量大的时候(上万条记录虽然不算大,但是也不建议使用)不建议使用source或者mysql -u username -p 库名<文件名.sql导入这两种方式,第一种比第二种更慢.
上面写的是更改存储引擎,innodb是支持事务的所以调整为myisam不支持事务的引擎来提高导入效率.
下面是推荐的导出导入方式:

数据导出

进入到需要导出表的数据库

1
SELECT * INTO OUTFILE 'data.txt' FIELDS TERMINATED BY ',' FROM 表名;

执行上面命令的时候会提示以下error信息:

1
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement

这个问题出现的原因是启动MySql的时候使用了–secure-file-priv这个参数,这个参数主要的目的就是限制LOAD DATA INFILE或者SELECT INTO OUTFILE之类文件的目录位置,
使用

1
2
3
4
5
6
7
mysql> select @@global.secure_file_priv;
+---------------------------+
| @@global.secure_file_priv |
+---------------------------+
| /var/lib/mysql-files/ |
+---------------------------+
1 row in set (0.00 sec)

命令来查看mysql设置的安全路径,
将导入或导出文件在这个建议的文件目录下就行操作就可以解决这个问题().
正确的导出sql应该为

1
SELECT * INTO OUTFILE '/var/lib/mysql-files/data.txt' FIELDS TERMINATED BY ',' FROM 表名;

数据导入

1
2
3
mysql> LOAD DATA INFILE '/var/lib/mysql-files/Uploads/area.txt' INTO TABLE area FIELDS TERMINATED BY ',';
Query OK, 766962 rows affected (1.95 sec)
Records: 766962 Deleted: 0 Skipped: 0 Warnings: 0

以上,总结:
我的这次数据量其实不算大不到100w,但是调整前和调整后差距太大了.
调整前: 两个表加一起不到200w条数据,使用常规(非source)命令行导入花费超过12小时(没导完换方案).
调整: 更换存储引擎Innodb到MyIsAM,没有对数据库其他参数进行优化
调整后: 使用SELECT * INTO OUTFILE,LOAD DATA INFILE命令,不到10秒钟导完.