About GlusterFS

Published: 2020-07-02 23:13:58  Author: xingliguang
Source: Internet

I. Introduction to Gluster

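GlusterFS is the core of Gluster, a scale-out storage solution. It is an open-source distributed file system with strong horizontal scalability: by adding nodes it can grow to several petabytes of capacity and serve thousands of clients. GlusterFS aggregates physically distributed storage resources over TCP/IP or InfiniBand RDMA (InfiniBand is a switched-fabric interconnect that supports many concurrent connections) and manages the data under a single global namespace. It is built on a stackable user-space design and performs well across a wide range of workloads. The original article illustrated the architecture with a diagram:

[Figure: GlusterFS architecture diagram]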

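GlusterFS supports standard clients running standard applications over any standard IP network. As Figure 2 of the original article shows, users can reach application data in the globally unified namespace through standard protocols such as NFS and CIFS. GlusterFS frees users from standalone, expensive, closed storage systems: ordinary low-cost storage hardware can be used to deploy a centrally managed, horizontally scalable, virtualized storage pool whose capacity can grow to the TB/PB range.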

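The main features of GlusterFS are:

1. Scalability and high performance

GlusterFS combines two properties, a scale-out architecture and high performance, to provide storage solutions ranging from a few terabytes to several petabytes. The scale-out architecture lets you raise capacity and performance simply by adding resources; disk, compute, and I/O resources can each be added independently, and high-speed interconnects such as 10GbE and InfiniBand are supported. Gluster's Elastic Hash removes the need for a metadata server, which eliminates that single point of failure and performance bottleneck and allows truly parallel data access.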

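2. High availability

GlusterFS can replicate files automatically (mirroring or multiple copies), so data stays accessible even when hardware fails. Self-healing restores data to the correct state; the repair runs incrementally in the background and adds almost no performance overhead. GlusterFS does not define its own private on-disk format. It stores files on mainstream standard disk file systems of the operating system (such as EXT3 or ZFS), so the data can be copied and accessed with ordinary tools.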

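3. Globally unified namespace

The globally unified namespace aggregates disk and memory resources into a single virtual storage pool and hides the underlying physical hardware from users and applications. Storage can be expanded or shrunk elastically within the pool as needed. When storing virtual machine images, there is no limit on the number of image files, and thousands of virtual machines can share data through a single mount point. VM I/O is load-balanced automatically across all servers in the namespace, which removes the hot spots and performance bottlenecks that commonly occur in SAN environments.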

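4. Elastic hash algorithm

GlusterFS locates data in the storage pool with an elastic hash algorithm rather than a centralized or distributed metadata index. In other scale-out storage systems, the metadata server often becomes an I/O bottleneck and a single point of failure. In GlusterFS, every storage system in the scale-out configuration can locate any piece of data on its own, without consulting an index or querying another server. This design fully parallelizes data access and lets performance scale roughly linearly as servers are added.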
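To make the idea of index-free placement concrete, here is a minimal shell sketch: every client computes the same hash of the file name and maps it onto a brick, so no lookup service is needed. This is only an illustration under stated assumptions. It uses `cksum` (CRC-32) and an even split of the 32-bit hash range, whereas the real GlusterFS algorithm uses its own hash function and per-directory range layouts. The function name `pick_brick` and the brick names are hypothetical.

```shell
#!/bin/sh
# Simplified illustration of hash-based file placement.
# NOT the real GlusterFS algorithm: cksum and the even range split are assumptions.

# pick_brick FILENAME BRICK...
# Prints the brick whose slice of the 32-bit hash range contains the file's hash.
pick_brick() (
    [ "$#" -ge 2 ] || exit 1
    name=$1
    shift
    count=$#
    # cksum prints "<crc32> <byte count>"; keep only the CRC (0 .. 2^32-1).
    hash=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
    # Split the 32-bit range into $count equal slices (ceiling division).
    range=$(( (4294967296 + count - 1) / count ))
    # Drop the bricks whose slices lie below the hash; the next one owns it.
    shift "$(( hash / range ))"
    printf '%s\n' "$1"
)
```

Calling `pick_brick testfile.txt server1:/data/gluster server2:/data/gluster` on any machine gives the same answer, which is the property that lets clients skip a metadata server.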

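5. Elastic volume management

Data is stored in logical volumes, which are carved out independently from the virtualized physical storage pool. Storage servers can be added and removed online without interrupting applications. Logical volumes can grow and shrink across all configured servers and can be migrated between servers to balance capacity, and systems can be added or removed, all while online. File system configuration changes can also be made and applied online in real time, to adapt to changing workloads or to tune performance.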

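6. Based on standard protocols

The Gluster storage service supports NFS, CIFS, HTTP, FTP, and the native Gluster protocol, and is fully POSIX-compatible. Existing applications can access data in Gluster without modification and without a special API. This is particularly useful when deploying Gluster in a public cloud: Gluster abstracts the cloud provider's proprietary API and exposes a standard POSIX interface.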

Reference link

http://www.tuicool.com/articles/AbE7Vr

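GlusterFS terminology:

Brick: the basic unit of storage in GlusterFS, usually an export directory on a server in the trusted storage pool. It is identified by host name and directory, e.g. 'SERVER:EXPORT'.

Client: a machine that has mounted a GlusterFS volume.

Extended Attributes: xattr is a file system feature that lets users or programs associate metadata with files and directories.

FUSE: Filesystem in Userspace is a loadable kernel module that lets unprivileged users create their own file systems without modifying kernel code. The file system code runs in user space, and the FUSE module bridges it to the kernel.

Geo-Replication: asynchronous replication of a volume to a remote site over a LAN, WAN, or the Internet.

GFID: every file or directory in a GlusterFS volume has a unique 128-bit identifier associated with it, which is used to emulate an inode.

Namespace: each Gluster volume exports a single namespace as a POSIX mount point.

Node: a machine that hosts one or more bricks.

RDMA: Remote Direct Memory Access, which allows direct memory access between machines without going through either side's operating system.

RRDNS: round-robin DNS, a load-balancing method in which DNS returns different machines in rotation.

Self-heal: a background process that detects inconsistencies in files and directories of replicated volumes and resolves them.

Split-brain: the state in which the replicas of a file disagree and the system cannot decide which copy is correct.

Translator: a stackable module through which GlusterFS implements its features; translators are chained together to process requests.

Volfile: the configuration file of a glusterfs process, usually located under /var/lib/glusterd/vols/volname.

Volume: a logical collection of bricks.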

II. Installation and configuration

1. Cluster layout

OS: CentOS 6.7  x86_64
Servers: 192.168.159.128
         192.168.159.129
Client:  192.168.159.130

2. Installing on the servers

First install the Gluster yum repository
#yum install centos-release-gluster38 -y
Install GlusterFS
#yum install glusterfs-server -y
After installation, check the version
#glusterfs -V
glusterfs 3.8.12 built on May 11 2017 18:24:27
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc. <http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
Start glusterd
#service glusterd start
Starting glusterd:[  OK  ]
Enable start at boot
#chkconfig --add glusterd
#chkconfig glusterd on
Check the listening ports
#netstat -tunlp
tcp        0      0 0.0.0.0:24007               0.0.0.0:*                   LISTEN      65940/glusterd

3. Installing on the client

First install the Gluster yum repository
#yum install centos-release-gluster38 -y
Install the GlusterFS client packages
#yum install glusterfs glusterfs-fuse glusterfs-client glusterfs-libs -y

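4. Configuring the servers

Note: when setting up a GlusterFS cluster, you normally attach a new disk to each server, partition and format it,
and mount the whole disk on a directory dedicated to data. This improves data availability to some extent: if the system disk has a problem, the data is not lost.
Official reference: http://gluster.readthedocs.io/en/latest/Quick-Start-Guide/Quickstart/
A GlusterFS cluster must meet the following requirements:
1. At least 2 nodes
2. A network (the nodes can reach each other)
3. Each node needs 2 virtual disks, one for the OS and one for the GlusterFS brick; otherwise volume creation fails with an error like: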
volume create: gfs01: failed: The brick 192.168.159.128:/data/gluster is being created in the root partition. It is recommended that you don't use the system's root partition for storage backend. Or use 'force' at the end of the command if you want to override this behavior.
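The official docs recommend formatting the disk as XFS, but I have seen people format it as ext4 and add it to /etc/fstab, for example:
/dev/sdb1               /opt                    ext4    defaults        0 0
So add a new disk to both servers, partition and format it, and mount it on /data: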
#mkdir /data
#fdisk /dev/sdb      # create one partition: enter n, p, 1, press Enter twice, then w
#mkfs.ext4 /dev/sdb1
#mount /dev/sdb1 /data
Add it to /etc/fstab
/dev/sdb1               /data           ext4    defaults        0 0
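Because volume creation refuses bricks on the root partition, it can help to check where a brick directory lives before creating the volume. The following is a small sketch, not part of the original procedure; the function name `on_root_fs` is hypothetical, and it assumes the POSIX output format of `df -P` (mount point in the sixth column).

```shell
#!/bin/sh
# on_root_fs PATH
# Exit status 0: PATH is on the filesystem mounted at "/".
# Exit status 1: PATH is on some other filesystem.
# Exit status 2: PATH could not be inspected (for example, it does not exist).
on_root_fs() (
    out=$(df -P "$1" 2>/dev/null) || exit 2
    # Line 1 is the header; line 2 describes the filesystem holding PATH.
    mount_point=$(printf '%s\n' "$out" | awk 'NR == 2 { print $6 }')
    [ "$mount_point" = "/" ]
)
```

For example, `on_root_fs /data && echo "warning: /data is on the root partition"` would print the warning if /dev/sdb1 had not actually been mounted on /data.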
#####################################
Start configuring; these commands can be run on either server:
Add the peers:
#gluster peer probe 192.168.159.128           # reports that probing the local host is not needed
peer probe: success. Probe on localhost not needed
#gluster peer probe 192.168.159.129
peer probe: success. 
Check the status:
#gluster peer status
Number of Peers: 1
Hostname: 192.168.159.129
Uuid: 1aed6e01-c497-4890-9447-c3bd548dd37f
State: Peer in Cluster (Connected)
Check the status on the other server
# gluster peer status
Number of Peers: 1
Hostname: 192.168.159.128
Uuid: 851be337-84b2-460d-9a73-4eee6ad95e95
State: Peer in Cluster (Connected)
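When scripting the setup, the peer status shown above can be checked automatically. This sketch counts the peers reported as connected; the helper name `connected_peers` is hypothetical, and it assumes the `State: Peer in Cluster (Connected)` line format shown in the output above.

```shell
#!/bin/sh
# connected_peers
# Reads `gluster peer status` output on stdin and prints the number of peers
# whose state line is "State: Peer in Cluster (Connected)".
connected_peers() {
    # grep -c exits non-zero when the count is 0, so normalize the status.
    grep -cF 'State: Peer in Cluster (Connected)' || true
}
```

Typical use would be `gluster peer status | connected_peers`, which should print 1 on each server of this two-node cluster.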
##################################################
Create the brick directory (required on both 128 and 129)
#mkdir /data/gluster
Create a volume named gfs01:
#gluster volume create gfs01 replica 2 192.168.159.128:/data/gluster 192.168.159.129:/data/gluster
volume create: gfs01: success: please start the volume to access data
Start the volume
#gluster volume start gfs01
volume start: gfs01: success
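A replicated volume needs its bricks in multiples of the replica count (here, replica 2 with two bricks). The sketch below checks that rule before calling `gluster volume create`; the helper name `check_brick_count` is hypothetical and it does not contact the cluster.

```shell
#!/bin/sh
# check_brick_count REPLICA BRICK...
# Succeeds when at least one brick is given and the number of bricks
# is a multiple of REPLICA.
check_brick_count() (
    [ "$#" -ge 2 ] || exit 1
    replica=$1
    shift
    [ "$replica" -ge 1 ] && [ $(( $# % replica )) -eq 0 ]
)
```

For example, `check_brick_count 2 192.168.159.128:/data/gluster 192.168.159.129:/data/gluster` succeeds, while the same call with three bricks fails.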
Show volume info:
#gluster volume info
 
Volume Name: gfs01
Type: Replicate
Volume ID: 1cdefebd-5831-4857-b1f9-f522fc868c60
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 192.168.159.128:/data/gluster
Brick2: 192.168.159.129:/data/gluster
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
Show volume status:
#gluster volume status
Status of volume: gfs01
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.159.128:/data/gluster         49152     0          Y       2319 
Brick 192.168.159.129:/data/gluster         49152     0          Y       4659 
Self-heal Daemon on localhost               N/A       N/A        Y       2339 
Self-heal Daemon on 192.168.159.129         N/A       N/A        Y       4683 
 
Task Status of Volume gfs01
------------------------------------------------------------------------------
There are no active volume tasks
Other commands
gluster volume stop gfs01       # stop the volume
gluster volume delete gfs01     # delete the volume

5. Mounting on the client

Create a mount point
#mkdir /opt/test_gluster
#mount -t glusterfs 192.168.159.128:/gfs01 /opt/test_gluster/
[root@client ~]# ll /opt/test_gluster/
total 0
[root@client ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              18G  822M   16G   5% /
tmpfs                 491M     0  491M   0% /dev/shm
/dev/sda1             190M   27M  154M  15% /boot
192.168.159.128:/gfs01
                      4.8G   11M  4.6G   1% /opt/test_gluster
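Note: 128 and 129 form one cluster, so you can mount from either server. If there are many clients, spread them across the servers to balance the load.
Mount automatically at boot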
#cat /etc/rc.local
mount -t glusterfs 192.168.159.128:/gfs01 /opt/test_gluster/
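An alternative to rc.local is an /etc/fstab entry. The sketch below only prints such a line; it does not modify any file. The helper name `fstab_line` is hypothetical. The `_netdev` option (wait for the network) and the `backup-volfile-servers` mount option (fetch the volume file from a second server if the first is down) are assumptions to verify against the mount.glusterfs documentation of your installed version.

```shell
#!/bin/sh
# fstab_line SERVER VOLUME MOUNTPOINT [BACKUP_SERVER]
# Prints an /etc/fstab entry for a GlusterFS native (FUSE) mount.
fstab_line() (
    opts="defaults,_netdev"
    if [ -n "${4:-}" ]; then
        # Assumed option name; check it against your GlusterFS version.
        opts="$opts,backup-volfile-servers=$4"
    fi
    printf '%s:/%s %s glusterfs %s 0 0\n' "$1" "$2" "$3" "$opts"
)
```

For this cluster, `fstab_line 192.168.159.128 gfs01 /opt/test_gluster 192.168.159.129` prints a line that could be appended to /etc/fstab after review.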

6. Testing with a file

On the client (130), create a test file testfile.txt, write some content into it, and compute its hash with md5sum
#ll /opt/test_gluster/
total 3
-rw-r--r-- 1 root root 2262 Jun  1 00:10 testfile.txt
[root@client test_gluster]# md5sum testfile.txt 
6263c0489d0567985c28d7692bde624c  testfile.txt
############################################################
Check the result on the 2 servers
On 128
[root@server1 ~]# ll /data/gluster/
total 8
-rw-r--r-- 2 root root 2262 Jun  1 00:10 testfile.txt
[root@server1 ~]# md5sum /data/gluster/testfile.txt 
6263c0489d0567985c28d7692bde624c  /data/gluster/testfile.txt
On 129
[root@server2 ~]# ll /data/gluster/
total 8
-rw-r--r-- 2 root root 2262 Jun  1 00:10 testfile.txt
[root@server2 ~]# md5sum  /data/gluster/testfile.txt 
6263c0489d0567985c28d7692bde624c  /data/gluster/testfile.txt
The md5sum hashes are identical.
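The manual comparison above can be wrapped in a small helper when several copies are reachable from one host, for instance on a server that also mounts the volume (an assumption; the walkthrough itself checks each machine separately). The function name `same_md5` is hypothetical.

```shell
#!/bin/sh
# same_md5 FILE FILE...
# Exit status 0: all files have the same MD5 digest.
# Exit status 1: at least one digest differs.
# Exit status 2: fewer than two files given, or a file could not be read.
same_md5() (
    [ "$#" -ge 2 ] || exit 2
    out=$(md5sum "$@" 2>/dev/null) || exit 2
    # md5sum prints "<digest>  <path>"; compare every digest with the first.
    printf '%s\n' "$out" | awk 'NR == 1 { first = $1 } $1 != first { exit 1 }'
)
```

For example, `same_md5 /opt/test_gluster/testfile.txt /data/gluster/testfile.txt` compares the copy seen through the mount with the copy stored in the brick.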

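7. Testing cluster data consistency

A cluster is used both to spread load and to provide high availability. Let's test that:
Shut down server 129 directly!
Then check the cluster status on 128: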
[root@server1 ~]#gluster volume status
Status of volume: gfs01
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.159.128:/data/gluster         49152     0          Y       2319 
Self-heal Daemon on localhost               N/A       N/A        Y       2589 
 
Task Status of Volume gfs01
------------------------------------------------------------------------------
There are no active volume tasks
As you can see, the brick on 129 is no longer listed.
########################################################################
Now delete testfile.txt on client 130
#rm -f /opt/test_gluster/testfile.txt
testfile.txt then also disappears from /data/gluster on server 128.
Now boot server 129 again
#service glusterd status            # running, because start at boot was enabled
glusterd (pid  1368) is running... 
[root@server2 ~]# ll /data/gluster/       # the file still exists, because this server was shut down before the file was deleted on the client
total 8
-rw-r--r-- 2 root root 2262 Jun  1 00:10 testfile.txt
[root@server2 ~]#md5sum /data/gluster/testfile.txt 
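6263c0489d0567985c28d7692bde624c  /data/gluster/testfile.txt  # the md5sum hash is still the same
After about 10 seconds, testfile.txt also disappeared on server 129, which shows that the Gluster cluster keeps the data consistent.

Note: in a GlusterFS cluster, do not write data directly into the brick directories on the servers. Write through a client mount, and the data is synchronized to the servers automatically.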
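Instead of waiting and re-listing the directory, the self-heal backlog in the test above can be read from `gluster volume heal gfs01 info`. The sketch below sums the pending entries; the helper name `pending_heal_entries` is hypothetical, and it assumes the output contains one `Number of entries: N` line per brick (non-numeric values, such as those reported for an unreachable brick, are ignored).

```shell
#!/bin/sh
# pending_heal_entries
# Reads `gluster volume heal <volume> info` output on stdin and prints the
# total of all numeric "Number of entries:" lines.
pending_heal_entries() {
    awk -F': ' '
        /^Number of entries:/ && $2 ~ /^[0-9]+$/ { total += $2 }
        END { print total + 0 }
    '
}
```

Typical use would be `gluster volume heal gfs01 info | pending_heal_entries`; a result of 0 suggests the replicas have caught up.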

III. Common gluster cluster commands

# Delete a volume
gluster volume stop gfs01
gluster volume delete gfs01
# Remove a machine from the cluster
gluster peer detach 192.168.1.100
# Restrict access to the volume to specific networks or hosts, e.g. only 172.28.26.*
gluster volume set gfs01 auth.allow 172.28.26.*
gluster volume set gfs01 auth.allow 192.168.222.1,192.168.*.*
# Add new machines to the cluster before adding them to the volume (the replica count is 2, so bricks must be added in multiples of 2: 2, 4, 6, 8, ...)
gluster peer probe 192.168.222.134
gluster peer probe 192.168.222.135
# Add bricks (expand the volume)
gluster volume add-brick gfs01 replica 2 192.168.222.134:/data/gluster 192.168.222.135:/data/gluster force
# Remove bricks (shrink the volume)
gluster volume remove-brick gfs01 replica 2 192.168.222.134:/opt/gfs 192.168.222.135:/opt/gfs start
gluster volume remove-brick gfs01 replica 2 192.168.222.134:/opt/gfs 192.168.222.135:/opt/gfs status
gluster volume remove-brick gfs01 replica 2 192.168.222.134:/opt/gfs 192.168.222.135:/opt/gfs commit
Note: when expanding or shrinking a volume, the number of bricks added or removed must satisfy the requirements of the volume type.
# After expanding or shrinking a volume, rebalance its data.
gluster volume rebalance gfs01 start|stop|status
###########################################################
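Migrating a brick: online migration of data from one brick to another
Note: the start/status/commit sequence below comes from older releases; if your version rejects it, check `gluster volume help` for the supported replace-brick form.
# Start the migration
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test start force
# Check the migration status
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test status
# Commit once the migration has finished
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test commit
# If the machine has failed, force the commit
gluster volume replace-brick gfs01 192.168.222.134:/opt/gfs 192.168.222.134:/opt/test commit force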
###########################################################
Trigger replica self-heal
gluster volume heal gfs01 # heal only the files that need it
gluster volume heal gfs01 full # heal all files
gluster volume heal gfs01 info # show heal details
#####################################################
data-self-heal, metadata-self-heal and entry-self-heal
These options enable or disable self-heal of file contents, file metadata, and directory entries. All three are "on" by default.
# Example: turn one of them off
gluster volume set gfs01 entry-self-heal off

IV. Drawbacks of GlusterFS
Further reading: https://www.cnblogs.com/langren1992/p/5316328.html

Corrections and suggestions are welcome!
