Linux HA high-availability cluster notes: learn it and you profit

An HA (High Availability) cluster is characterized by backing up its key nodes (the front-end Director, the back-end RS servers, the database server, shared storage, and so on) with one or more standby servers according to actual needs. Once the primary server fails, a backup server immediately detects this and takes over the resources on the primary server to keep the service running, minimizing the service interruption caused by server downtime.


Active (primary) node and passive (standby) node

The main scheduler (Director) is usually a key node in the cluster, so it generally has a backup node; the back-end RS servers can add backup nodes according to actual reliability requirements; and a storage server such as a MySQL server, also a key node in the cluster, is generally deployed as a master-slave pair.

An HA cluster focuses on two aspects of a service: reliability and stability.

Availability = service uptime / (service uptime + fault repair time)

Availability is commonly graded as 99%, 99.9%, 99.99%, 99.999%: each additional 9 raises the availability level by a factor of ten (i.e., cuts the permitted downtime to a tenth). Some applications, such as financial trading systems, require five-nines availability.
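The effect of each extra 9 can be checked with a quick calculation. A minimal sketch, assuming a 365-day year:

```python
# Allowed downtime per year implied by an availability level.
SECONDS_PER_YEAR = 365 * 24 * 3600

def downtime_per_year(availability: float) -> float:
    """Return the allowed downtime in seconds per year."""
    return (1 - availability) * SECONDS_PER_YEAR

for nines, a in [(2, 0.99), (3, 0.999), (4, 0.9999), (5, 0.99999)]:
    hours = downtime_per_year(a) / 3600
    print(f"{a} ({nines} nines): {hours:.2f} hours of downtime/year")
```

At five nines the budget is only about five minutes of downtime per year, which is why failover must be automatic rather than manual.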

HA resources: resources that must be transferred to a backup node once a node fails, including the VIP, the service itself, the fencing (isolation) device, and the file system. Each RS runs a service resource. When there are multiple RS nodes and one fails, its resources are immediately transferred to another node, which takes over the unprocessed requests, and the Director stops forwarding front-end requests to the failed node. But with many nodes, which node should the resources be transferred to when a failure occurs? And what happens when the failed node recovers? This is where resource stickiness, resource constraints, and so on must be defined.

Resource stickiness: how strongly a resource prefers to run on a given node, i.e., the affinity between a resource and a node.

For example, if the web service's resource stickiness on server A is defined as 120 and its stickiness on server B as 100, then once A fails and later returns to normal, the web service is transferred back from server B to server A.

Resource stickiness: whether a resource tends to stay on its current node. Score > 0: inclined to stay; Score < 0: not inclined (that is, if another node can run this service, the resource is immediately transferred there).

Resource constraints: define the tendencies between resources, and between resources and nodes.

Colocation (colocation constraint): defines whether different resources can run on the same node. Score > 0: they can; Score < 0: they cannot.

-inf (negative infinity): the resources can never run on the same node.

inf (positive infinity): the resources must run on the same node.

Location (location constraint): each node can give a resource a score. Score > 0: the resource tends to run on this node;

Score < 0: the resource tends not to run on this node.

In general, whichever node has the larger sum of (resource stickiness + location constraint score) is the node the resource prefers to run on.
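This placement rule can be sketched in a few lines. The node names and scores below are illustrative, mirroring the A=120 / B=100 stickiness example above:

```python
def preferred_node(stickiness, location):
    """Return the node with the largest (stickiness + location score).

    stickiness, location: dicts mapping node name -> score.
    """
    nodes = set(stickiness) | set(location)
    return max(nodes, key=lambda n: stickiness.get(n, 0) + location.get(n, 0))

# A has returned to service; with stickiness 120 vs 100 and equal
# location scores, the web resource moves back from B to A.
print(preferred_node({"A": 120, "B": 100}, {"A": 0, "B": 0}))  # -> A
```

A large location score can override stickiness: with a location score of 200 on B, the same call would pick B instead, which is exactly the "whichever sum is larger" rule.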

Order (order constraint): defines the order in which resources are started and shut down, because resources may depend on one another. For example, the VIP and the ipvs rules: the VIP is started after the ipvs rules are in place.
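Order constraints form a dependency graph: startup follows a topological order and shutdown reverses it. A sketch using Python's standard `graphlib` (Python 3.9+); the resource names are illustrative:

```python
from graphlib import TopologicalSorter

# Map each resource to the resources that must start before it
# (per the text above, ipvs rules come up before the VIP).
constraints = {"vip": {"ipvs_rules"}, "web_service": {"vip"}}

start_order = list(TopologicalSorter(constraints).static_order())
stop_order = list(reversed(start_order))  # shutdown reverses startup

print(start_order)  # ['ipvs_rules', 'vip', 'web_service']
print(stop_order)   # ['web_service', 'vip', 'ipvs_rules']
```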

Resource classification

Primitive: the resource runs on only one node (a basic, or primary, resource).

Clone: the resource runs on every node.

Group: multiple resources are grouped together; resources in the same group start and stop together and are transferred between nodes together.

Master/slave: the resource runs on exactly two nodes, one as master and one as slave.

How does the backup node know that the primary node has failed?

Heartbeat: each node communicates with its backup node continuously in order to detect whether the other side is online.

But when there are three or more nodes exchanging heartbeat information with one another (for example, RS nodes running the same service acting as each other's backups), how do they decide which nodes have failed and which nodes are legitimate members?

All nodes are placed in one multicast group and ping each other. For example, five RS nodes A, B, C, D, and E run a web service. At some moment A, B, and C can ping each other, and D and E can ping each other. A quorum (voting) mechanism can be defined: each node gets one vote, so the five nodes hold five votes in total, and a partition is legitimate only if it holds more than half of the votes. Here the A/B/C partition has three votes and the D/E partition has two, so D and E are considered illegitimate nodes (that is, D and E are treated as failed).

Or: if node A cannot reach any other node, it holds one vote, while B, C, D, and E can ping each other and hold four votes, so node A is considered illegitimate.

For a multi-node cluster, an odd number of nodes is preferable for the voting mechanism: a partition holding more than half of the votes is considered legitimate.

Different nodes can also be given different numbers of votes. For example, a high-performance node A may hold two votes while an ordinary node B holds one. In that case an odd number of nodes is no longer required; a decision can be made as long as the total number of votes is odd.
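The majority rule described above, including weighted votes, can be sketched directly (node names and weights are illustrative):

```python
def has_quorum(partition, votes):
    """True if the nodes in `partition` hold more than half of all votes.

    votes: dict mapping node name -> vote count.
    """
    total = sum(votes.values())
    held = sum(votes[n] for n in partition)
    return held * 2 > total

votes = {n: 1 for n in "ABCDE"}            # five nodes, one vote each
print(has_quorum({"A", "B", "C"}, votes))  # True  (3 of 5 votes)
print(has_quorum({"D", "E"}, votes))       # False (2 of 5 votes)
```

With weighted votes, e.g. {"A": 2, "B": 1, "C": 1}, the partition {A} alone already holds 2 of 4 votes, which is not *more* than half, so it still lacks quorum; this is why an odd total vote count avoids ties.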

What measures should a node take when it is judged illegitimate?

Freeze: the illegitimate node only finishes processing already-connected requests and accepts no new ones; its resources are transferred after those requests complete.

Stop: the illegitimate node stops the service immediately and its resources are transferred. This is the most commonly used measure.

Ignore: the node simply ignores the verdict and keeps the service running normally.

When is ignore used?

When there are only two nodes backing each other up.

With only two mutually backing-up nodes, once the primary can no longer ping the backup, a voting mechanism cannot work: each node holds only one vote, so each considers itself illegitimate. Not only would the primary stop the service, but the backup that should take over would also refuse to, believing itself illegitimate. With ignore, the primary simply continues running the service until it is fenced by a STONITH or fence device, at which point its resources are transferred and the backup node takes over.

What resources are needed to provide a MySQL service?

A VIP dedicated to providing the service

An FIP (floating IP) that can move between nodes

The MySQL service

A file system (to be mounted)

Once a node fails, which node are its resources transferred to?

Define a resource constraint score for each node; the resource is more likely to be transferred to the node with the larger score.

Split brain: suppose a cluster has four RS servers A, B, C, and D.

A is writing data to a file. Because A's CPU is too busy, or an iptables rule was mistakenly added that blocks heartbeat transmission, A fails to send its heartbeat to the backup nodes. The CRM (Cluster Resource Manager), which collects cluster resource and service information, detects no heartbeat from A, assumes A has failed, and transfers all of A's resources to another node such as B. Node B then continues A's task (writing data to the file). But A is in fact still alive, so A and B write to the same file simultaneously, which corrupts the file system and scrambles the file.

How is split brain avoided?

Isolate the resources from the original node before transferring them:

Node-level isolation

A STONITH device, e.g., a direct power cut ("headshot"): once a node is found not to be transmitting heartbeats, power it off directly.

Resource-level isolation

An FC-SAN (Fibre Channel) switch can cut off the isolated node's access to shared storage, isolating it at the resource level.

How is a node failure detected?

Quorum disk: the master node continuously writes data to a shared disk. If the standby node finds that it can access the shared disk but sees no data written by the master, the master can be considered down and is isolated.
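The standby's side of this check is essentially a staleness test on the master's last write. A minimal sketch; the timeout value and timestamps are illustrative:

```python
import time

def master_alive(last_write_ts, timeout=10.0, now=None):
    """Standby-side quorum-disk check: the master is considered alive if
    its most recent write to the shared disk is within `timeout` seconds."""
    if now is None:
        now = time.time()
    return (now - last_write_ts) <= timeout

# Master wrote 3 s ago -> alive; wrote 30 s ago -> considered down, fence it.
print(master_alive(100.0, timeout=10.0, now=103.0))  # True
print(master_alive(100.0, timeout=10.0, now=130.0))  # False
```

In a real quorum disk the "timestamp" is a heartbeat block the master rewrites on the shared device; the key point is that losing network heartbeats alone is not enough evidence, because the disk gives a second, independent channel.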

Ping gateway: as long as a node can ping the gateway, it is considered healthy; once it cannot, it can be considered failed and is isolated.

Watchdog: processes on a node write to the watchdog device at regular intervals. Once the writes stop, the system tries to restart the process; if the restart fails, the node is considered failed and is removed from the cluster.

Messaging Layer: responsible for transmitting heartbeats, resource stickiness, resource constraints, and other information between the master and standby nodes, typically by multicast over UDP. The Messaging Layer is itself a service (UDP/694) and must be started at boot.

CRM (Cluster Resource Manager): collects the state of every resource in the cluster, such as resource stickiness, resource constraints, and node health. The CRM's PE component computes which node each resource should now run on; the CRM then directs each node's LRM to carry out the corresponding operations, e.g., migrating the service from node A to node B, bringing up the VIP on node B, mounting the file system, and so on.

Service startup on an HA cluster node is decided by the CRM, so services must not auto-start at boot; run: chkconfig service-name off

PE: Policy Engine

TE: Transition Engine

LRM: Local Resource Manager

PE, TE, and LRM are all components of the CRM.

RA: Resource Agent

All scripts responsible for starting, stopping, restarting, and status-monitoring resources are called RAs, and an RA runs on every node.
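The shape of an RA is a script that dispatches on an action argument. A minimal Python sketch: the action names follow the LSB/OCF set described below, while the service hooks and exit codes here are simplified placeholders:

```python
import sys

def ra_dispatch(action, service):
    """Dispatch an RA action; return 0 on success, 1 for unknown actions
    (real RAs follow the LSB/OCF exit-code conventions)."""
    handlers = {
        "start":   lambda: print(f"starting {service}") or 0,
        "stop":    lambda: print(f"stopping {service}") or 0,
        "status":  lambda: 0,   # placeholder: probe the real service here
        "monitor": lambda: 0,   # OCF-style extended health check
    }
    handler = handlers.get(action)
    if handler is None:
        print(f"usage: {service} {{start|stop|status|monitor}}")
        return 1
    return handler()

if __name__ == "__main__":
    sys.exit(ra_dispatch(sys.argv[1], "httpd"))
```

The CRM (via the LRM) is the caller of these actions; this is why the agents must be idempotent and report status accurately rather than relying on boot-time init.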

RA categories

Legacy heartbeat v1 RA

LSB: standard Linux init scripts supporting start|stop|restart|status, such as those under /etc/rc.d/init.d/.

OCF (Open Cluster Framework): such scripts accept start|stop|restart|status and additional operations such as monitor.

DC (Designated Coordinator): the transition coordinator; also a CRM component, it is a node elected from among the cluster nodes.

Software implementations of the Messaging Layer

Heartbeat (three versions: v1, v2, v3)

Heartbeat v3 is split into heartbeat, pacemaker, and cluster-glue.

Corosync: the Messaging Layer used by default in Red Hat 6.0.

cman: the default Messaging Layer in Red Hat 5.0, but because it works in kernel space and is complex to configure, 6.0 replaced it with Corosync, which works in user space.

keepalived: its configuration and use differ from the implementations above; for example, its VIP handling is based on VRRP (Virtual Router Redundancy Protocol).

Software implementations of the CRM (Cluster Resource Manager) layer

A CRM must work on top of a Messaging Layer.

haresources (bundled with heartbeat v1/v2)

crm (bundled with heartbeat v2)

Pacemaker (spun off from heartbeat v3 as an independent project)

rgmanager (a CRM specifically for cman)

So the common Messaging Layer + CRM combinations are as follows:

haresources + heartbeat v1/v2

crm + heartbeat v2

Pacemaker + Corosync

Pacemaker + heartbeat v3

cman + rgmanager

So how many nodes are needed to build a high-availability cluster for a web service, and how many resources should be defined?

At least two nodes are required, each running a Messaging Layer and a CRM.

At least four resources should be defined: the VIP, the httpd service, the filesystem, and a STONITH device.

How do we prevent an arbitrary server, on which someone has installed a Messaging Layer and CRM and synchronized the time, from joining our cluster?

Each node must authenticate and encrypt (e.g., with a hash-based scheme) the heartbeats and other information it transmits at the Messaging Layer and CRM layers. With two nodes, heartbeat information can be sent by unicast; with more than two nodes, it can be sent by unicast, multicast, or broadcast. Services on an HA cluster node must be controlled by the CRM, so the CRM is set to start at boot while the managed services are disabled with chkconfig. The Messaging Layer is also a service and must start at boot; it listens on UDP/694 and transmits heartbeats and other information over UDP.

What should I pay attention to when configuring an HA cluster?

Each node's name must match the output of uname -n; node name/IP resolution is best kept in the /etc/hosts file rather than DNS, otherwise a DNS server failure would affect the cluster; the clocks of the nodes must be synchronized; and passwordless SSH trust must be established between nodes (when you want to stop or start another node's HA cluster service, you cannot do it from that node itself; you must shut down or start the HA service remotely via SSH from a healthy node).

What about the first node?

The first node starts itself and then starts the services on the other nodes.
