What is the relationship between MySQL partitioned tables and HBase?

Published: 2021-08-04 16:39:46 · Author: Leah · Source: 亿速云

What is the relationship between MySQL partitioned tables and HBase? To answer that question, this article walks through the corresponding analysis and solution in detail, in the hope of giving anyone facing the same problem a simpler, more practical approach.

  Creating the MySQL partition data

  DROP TABLE IF EXISTS ord_order;

  -- Create the partitioned order table
  CREATE TABLE ord_order(
      order_id BIGINT NOT NULL AUTO_INCREMENT COMMENT 'order ID',
      user_id INT NOT NULL COMMENT 'user ID',
      goods_id INT NOT NULL COMMENT 'goods ID',
      order_price INT NOT NULL DEFAULT 0 COMMENT 'actual order price (in cents)',
      create_time DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
      PRIMARY KEY(order_id, create_time)
  )
  PARTITION BY LIST (YEAR(create_time)*100 + MONTH(create_time))
  (
      PARTITION p201601 VALUES IN (201601),
      PARTITION p201602 VALUES IN (201602),
      PARTITION p201603 VALUES IN (201603),
      PARTITION p201604 VALUES IN (201604),
      PARTITION p201605 VALUES IN (201605),
      PARTITION p201606 VALUES IN (201606),
      PARTITION p201607 VALUES IN (201607),
      PARTITION p201608 VALUES IN (201608),
      PARTITION p201609 VALUES IN (201609),
      PARTITION p201610 VALUES IN (201610),
      PARTITION p201611 VALUES IN (201611),
      PARTITION p201612 VALUES IN (201612)
  );
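
  To confirm that all twelve partitions exist (and, once data is loaded, to see how the rows are distributed across them), a quick look at information_schema works. A minimal sketch, assuming the table lives in the test database that the Sqoop connect string below also uses:

  SELECT PARTITION_NAME, TABLE_ROWS
  FROM information_schema.PARTITIONS
  WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'ord_order';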

  -- Insert sample data
  INSERT INTO ord_order VALUES
  (NULL, 10000001, 11111111, 1000, '2016-01-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-01-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-01-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-01-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-01-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-02-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-02-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-02-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-02-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-02-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-03-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-03-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-03-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-03-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-03-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-04-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-04-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-04-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-04-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-04-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-05-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-05-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-05-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-05-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-05-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-06-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-06-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-06-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-06-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-06-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-07-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-07-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-07-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-07-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-07-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-08-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-08-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-08-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-08-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-08-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-09-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-09-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-09-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-09-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-09-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-10-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-10-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-10-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-10-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-10-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-11-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-11-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-11-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-11-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-11-13 05:00:50'),
  (NULL, 10000001, 11111111, 1000, '2016-12-13 01:00:10'),
  (NULL, 10000001, 11111112, 2000, '2016-12-13 02:00:20'),
  (NULL, 10000001, 11111113, 3000, '2016-12-13 03:00:30'),
  (NULL, 10000001, 11111114, 4000, '2016-12-13 04:00:40'),
  (NULL, 10000001, 11111115, 5000, '2016-12-13 05:00:50');

  -- View the data in partition p201601
  SELECT * FROM ord_order PARTITION(p201601);

  -- The composed row key
  SELECT CONCAT(user_id, 10000000000-UNIX_TIMESTAMP(create_time), goods_id)
  FROM ord_order PARTITION(p201601);
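
  As a sanity check on the key expression, the subtraction can be evaluated by hand. The numbers below assume a UTC+8 session time zone, which is what the key values in the HBase scan further down imply:

  SELECT 10000000000 - UNIX_TIMESTAMP('2016-01-13 01:00:10');
  -- => 8547381990, so the full key for that order is
  --    CONCAT(10000001, 8547381990, 11111111) = 10000001854738199011111111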

  Bringing in HBase

  Create the HBase table ord_order

  Because of a version-compatibility issue, the HBase table has to be created ahead of time here; otherwise the import fails with an error saying the column family cannot be created automatically.

  Create the ord_order table in the hbase shell:

  hbase(main):033:0> create 'ord_order', {NAME => 'cf1'}
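
  One side note on this key layout: every key for a given user shares the same user_id prefix, so a very active user funnels all of its writes into a single region. The setup above does not address this, but a common refinement worth sketching is to pre-split the table at creation time at a few user_id boundaries (the split points here are illustrative only):

  hbase(main):034:0> create 'ord_order', {NAME => 'cf1'}, SPLITS => ['10000003', '10000005', '10000008']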

  Use Sqoop to import the p201601 partition of the MySQL ord_order table into the HBase table:

  /usr/local/sqoop/bin/sqoop import \
  --connect jdbc:mysql://192.168.137.11:3306/test \
  --username HH \
  --password oracle \
  --query 'SELECT CONCAT(user_id, 10000000000-UNIX_TIMESTAMP(create_time), goods_id) AS order_id, order_price, create_time FROM ord_order PARTITION(p201601) WHERE $CONDITIONS' \
  --hbase-table ord_order \
  --hbase-create-table \
  --hbase-row-key order_id \
  --split-by order_id \
  --column-family cf1 \
  -m 1
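
  Before touching the MySQL partitions it is worth confirming the import actually landed. For this small data set a plain row count in the hbase shell is enough; it should report 5 rows, one per order in p201601 (the prompt number is only a sketch):

  hbase(main):002:0> count 'ord_order'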

  Once the import succeeds, the corresponding partition can be dropped on the MySQL side and the partition needed next can be added:

  ALTER TABLE ord_order
  ADD PARTITION (PARTITION p201701 VALUES IN (201701));

  ALTER TABLE ord_order DROP PARTITION p201601;
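
  The same rotation then repeats every month: import the oldest remaining partition into HBase, drop it, and add a new partition twelve months ahead. A sketch of the next cycle, assuming the p201602 import has succeeded:

  ALTER TABLE ord_order
  ADD PARTITION (PARTITION p201702 VALUES IN (201702));

  ALTER TABLE ord_order DROP PARTITION p201602;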

  View the imported data in HBase

  hbase(main):001:0> scan 'ord_order'
  ROW                          COLUMN+CELL
  10000001854736755011111115  column=cf1:create_time, timestamp=1479224942888, value=2016-01-13 05:00:50.0
  10000001854736755011111115  column=cf1:order_price, timestamp=1479224942888, value=5000
  10000001854737116011111114  column=cf1:create_time, timestamp=1479224942888, value=2016-01-13 04:00:40.0
  10000001854737116011111114  column=cf1:order_price, timestamp=1479224942888, value=4000
  10000001854737477011111113  column=cf1:create_time, timestamp=1479224942888, value=2016-01-13 03:00:30.0
  10000001854737477011111113  column=cf1:order_price, timestamp=1479224942888, value=3000
  10000001854737838011111112  column=cf1:create_time, timestamp=1479224942888, value=2016-01-13 02:00:20.0
  10000001854737838011111112  column=cf1:order_price, timestamp=1479224942888, value=2000
  10000001854738199011111111  column=cf1:create_time, timestamp=1479224942888, value=2016-01-13 01:00:10.0
  10000001854738199011111111  column=cf1:order_price, timestamp=1479224942888, value=1000
  5 row(s) in 0.5390 seconds

  Row key design in detail

  The HBase row key is built from three fields: user_id, 10000000000-UNIX_TIMESTAMP(create_time), and goods_id.

  The part worth noting is 10000000000-UNIX_TIMESTAMP(create_time). HBase stores rows sorted lexicographically by key, so subtracting the creation timestamp from a constant makes newer orders yield smaller values; the orders therefore come back in reverse chronological order, which matches the requirement that the newest data shows first.
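
  A quick worked comparison makes the effect concrete (timestamps again assume the UTC+8 session time zone implied by the scan output above):

  -- order at 2016-01-13 01:00:10: 10000000000 - 1452618010 = 8547381990
  -- order at 2016-01-13 05:00:50: 10000000000 - 1452632450 = 8547367550
  -- The newer order yields the smaller complement, so its key
  -- 10000001854736755011111115 sorts before 10000001854738199011111111.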

  For example, suppose the orders of user 10000001 need to be paginated, two per page, sorted by time descending (newest order first):

  hbase(main):003:0> scan 'ord_order', {COLUMNS=>['cf1:order_price'], ROWPREFIXFILTER=>'10000001', LIMIT=>2}
  ROW                          COLUMN+CELL
  10000001854736755011111115  column=cf1:order_price, timestamp=1479224942888, value=5000
  10000001854737116011111114  column=cf1:order_price, timestamp=1479224942888, value=4000

  Fetching the next page:

  hbase(main):004:0> scan 'ord_order', {COLUMNS=>['cf1:order_price'], LIMIT=>3, STARTROW=>'10000001854737116011111114'}
  ROW                          COLUMN+CELL
  10000001854737116011111114  column=cf1:order_price, timestamp=1479224942888, value=4000
  10000001854737477011111113  column=cf1:order_price, timestamp=1479224942888, value=3000
  10000001854737838011111112  column=cf1:order_price, timestamp=1479224942888, value=2000
  3 row(s) in 0.0260 seconds

  Three rows come back because STARTROW is inclusive; drop the first row before displaying. What is actually shown:

  10000001854737477011111113  column=cf1:order_price, timestamp=1479224942888, value=3000
  10000001854737838011111112  column=cf1:order_price, timestamp=1479224942888, value=2000
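
  Discarding the duplicate row works, but it can also be avoided at scan time: since STARTROW is inclusive, appending a zero byte to the last key of the previous page makes the start effectively exclusive, and LIMIT=>2 then returns exactly the next page. A sketch (double quotes let the shell interpret the \x00 escape):

  hbase(main):005:0> scan 'ord_order', {COLUMNS=>['cf1:order_price'], LIMIT=>2, STARTROW=>"10000001854737116011111114\x00"}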

  Fetching the previous page:

  hbase(main):008:0> scan 'ord_order', {COLUMNS=>['cf1:order_price'], LIMIT=>3, STARTROW=>'10000001854737477011111113', REVERSED=>true}
  ROW                          COLUMN+CELL
  10000001854737477011111113  column=cf1:order_price, timestamp=1479224942888, value=3000
  10000001854737116011111114  column=cf1:order_price, timestamp=1479224942888, value=4000
  10000001854736755011111115  column=cf1:order_price, timestamp=1479224942888, value=5000
  3 row(s) in 0.0640 seconds

  Again three rows come back; drop the first one, leaving this raw result set:

  10000001854737116011111114  column=cf1:order_price, timestamp=1479224942888, value=4000
  10000001854736755011111115  column=cf1:order_price, timestamp=1479224942888, value=5000

  Then traverse that set in reverse to get what is finally displayed:

  10000001854736755011111115  column=cf1:order_price, timestamp=1479224942888, value=5000
  10000001854737116011111114  column=cf1:order_price, timestamp=1479224942888, value=4000
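
  One caveat with this reversed scan: nothing bounds it to user 10000001, so once it walks past that user's newest order it would continue into the keys of users with smaller IDs. Adding a STOPROW at the prefix boundary keeps it contained; this is a sketch relying on STOPROW being exclusive and '10000001' sorting before every key that carries it as a prefix:

  hbase(main):009:0> scan 'ord_order', {COLUMNS=>['cf1:order_price'], LIMIT=>3, STARTROW=>'10000001854737477011111113', STOPROW=>'10000001', REVERSED=>true}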

That concludes the answer to what the relationship between MySQL partitioned tables and HBase is. Hopefully the content above is of some help to anyone facing the same question.
