Redis Enterprise Watchdogs - 天翼云

2023-05-29 10:45:37 Views: 112

Summary:

Notes analyzing the watchdog mechanism used in the high-availability design of Redis Enterprise.

Highly Available Redis | Redis

Auto failover

A Redis Enterprise cluster uses two watchdog processes to detect failures:

Node watchdog: Monitors all processes running on a given node. For example, the node watchdog triggers a shard failover event if a specific shard is not responsive.
Cluster watchdog: Responsible for the health of the cluster nodes and uses a gossip protocol to manage the membership of the nodes in the cluster. For example, the cluster watchdog triggers a node failure event or detects a network split incident.
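To make the node watchdog idea concrete, here is a minimal sketch of timeout-based liveness tracking. It is not Redis Enterprise's implementation: the class `HeartbeatMonitor`, the shard names, and the 3-second timeout are all hypothetical, chosen only to show how "a specific shard is not responsive" could be decided.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Toy watchdog: a target is unresponsive once its last heartbeat is older than `timeout` seconds."""
    timeout: float  # hypothetical value, not a Redis Enterprise setting
    last_seen: dict = field(default_factory=dict)

    def record(self, target: str, now: float) -> None:
        # Remember when this target last reported in.
        self.last_seen[target] = now

    def unresponsive(self, now: float) -> list:
        # Every target whose silence exceeds the timeout, in sorted order.
        return sorted(t for t, seen in self.last_seen.items() if now - seen > self.timeout)


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("shard-1", now=0.0)
monitor.record("shard-2", now=0.0)
monitor.record("shard-1", now=4.0)    # shard-2 stays silent
print(monitor.unresponsive(now=5.0))  # ['shard-2']
```

A real watchdog would act on that list, for example by raising a shard failover event. Timestamps are passed in explicitly here so the behaviour is easy to reason about and test.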

These watchdog processes are part of the distributed cluster manager entity and reside on each node of the cluster. It is extremely important for failure detection to be managed by entities that run inside the cluster, in order to avoid situations like the one shown on the left side of the figure below. In this example, the watchdog entity is located on the wrong side of the network split and cannot trigger the failover process:

[Figure: Redis Enterprise watchdog placement during a network split (image dated 2022-04-11)]
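The network split scenario can be illustrated with a small sketch. It assumes a strict-majority rule for deciding which side of a split may act; the quoted text does not spell out the rule Redis Enterprise applies, so the function `may_trigger_failover` and the three-node cluster below are illustrative only.

```python
def may_trigger_failover(visible_nodes: set, cluster_size: int) -> bool:
    """Return True when this side of a split sees a strict majority of the cluster.

    Assumption: majority quorum. This is a common convention, not a rule
    taken from the quoted documentation.
    """
    return len(visible_nodes) > cluster_size // 2


cluster = {"node1", "node2", "node3"}
side_a = {"node1", "node2"}  # watchdogs here can still see each other
side_b = {"node3"}           # isolated node (or an external monitor stuck with it)

for side in (side_a, side_b):
    print(sorted(side), may_trigger_failover(side, len(cluster)))
```

Because a watchdog runs on every node, some watchdog always ends up on the majority side. A single external monitor has no such guarantee: if it lands on the minority side, it cannot drive the failover.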

Once a failure event is detected, the Redis Enterprise cluster automatically and transparently runs a set of internal distributed processes that fail over the relevant shard(s) and endpoint(s) (if needed) to healthy cluster nodes. If necessary, they also reroute user traffic through a different proxy or proxies.

The Redis Enterprise cluster has out-of-the-box HA profiles for noisy (public cloud) and quiet (virtual private cloud, on-premises) environments. We have found that triggering failovers too aggressively can create stability issues. On the other hand, in a quiet network environment, a Redis Enterprise cluster can be easily tuned to support a constant single-digit (<10 sec) failover time in all failure scenarios.
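The trade-off between fast detection and stability can be sketched by counting how often a healthy but jittery node would be declared failed under different timeouts. The profile names, timeout values, and heartbeat gaps below are invented for illustration and are not the actual Redis Enterprise HA profiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionProfile:
    name: str
    timeout_s: float  # hypothetical: silence longer than this counts as a failure


def false_alarms(profile: DetectionProfile, gaps_s: list) -> int:
    """Count heartbeat gaps from a healthy node that this profile would treat as a failure."""
    return sum(1 for gap in gaps_s if gap > profile.timeout_s)


# Made-up gaps (seconds) between heartbeats from a node that never actually failed.
noisy_network_gaps = [0.5, 0.7, 4.2, 0.6, 6.5, 0.5, 3.1]

aggressive = DetectionProfile("aggressive", timeout_s=2.0)
conservative = DetectionProfile("conservative", timeout_s=8.0)

for profile in (aggressive, conservative):
    print(profile.name, false_alarms(profile, noisy_network_gaps))
```

On this sample the aggressive timeout reports three spurious failures and the conservative one reports none, at the cost of reacting more slowly to a real outage. In a quiet network the gaps stay small, so a short timeout becomes safe.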


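Analysis:

1. The watchdog is split into two separate processes: a node watchdog and a cluster watchdog

2. Node watchdog design

3. Cluster watchdog design

Key points of failure detection:

1. Time taken to detect a failure

2. Accuracy of failure detection

Common watchdog approaches to failure detection:

1. Heartbeats and pings

2. Phi accrual failure detector

3. Gossip-based failure detection

4. Reverse failure detection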
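Of the approaches listed above, the phi accrual failure detector is the least self-explanatory, so here is a minimal sketch. It uses a simplification: heartbeat inter-arrival times are modelled as exponentially distributed, which gives phi = elapsed / (mean * ln 10). The class name `PhiAccrualDetector` and the window size are hypothetical, and nothing here is taken from Redis Enterprise.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Simplified phi accrual detector using an exponential inter-arrival model."""

    def __init__(self, window: int = 100):
        self.intervals = deque(maxlen=window)  # recent gaps between heartbeats
        self.last_arrival = None

    def heartbeat(self, now: float) -> None:
        # Record the gap since the previous heartbeat, then update the arrival time.
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now: float) -> float:
        # phi = -log10(P(next heartbeat arrives later than `elapsed`)).
        # With an exponential model, P = exp(-elapsed / mean).
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_arrival
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)

print(round(detector.phi(now=4.0), 2))   # 0.43, one normal interval of silence
print(round(detector.phi(now=13.0), 2))  # 4.34, ten intervals of silence
```

Instead of a yes/no verdict, the detector outputs a suspicion level that grows with silence. The caller picks a threshold on phi, which ties back to the two key points in the outline: a lower threshold detects failures sooner, a higher one is more accurate.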
