Let's give a new chance to phpredis to get an answer during a Redis cluster failover #2058
reporter4u started this conversation in Ideas
Replies: 0 comments
Hi,
I'm using PhpRedis in a LAMP-FPM environment with a six-node Redis cluster, and I would like to suggest a small new feature.
Redis Cluster always returns a 'CLUSTERDOWN' response from the moment a master node goes down and is marked as 'fail' until a replica node takes its place. I would therefore suggest adding a new parameter to the phpredis RedisCluster class (for example node_timeout) and a new redis.ini option, which we could call redis.clusters.node_timeout. With it, phpredis would wait node_timeout seconds and retry the request once, and only pass the CLUSTERDOWN response to PHP if the cluster is still down.
Moreover, we need an in-memory cached variable that stores the cluster status for a specific cluster name or a specific list of cluster seeds, so that later requests can skip the second attempt described above while the cluster is still down. For example, an APCu userland key-value entry could work, although a server-wide global variable would be better, because it would signal the status to all RedisCluster instances that use that cluster name or that list of cluster seeds.
I'm not a developer, but I would like to illustrate the idea with some rough pseudocode:
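A minimal sketch of the flow described above, written in Python purely for illustration. It is not phpredis code, and every name in it (ClusterDown, status_cache, request_with_retry, the send callback) is hypothetical. It also assumes that the cached 'CLUSTERDOWN' flag is cleared as soon as any request succeeds, which the proposal does not spell out.

```python
import time


class ClusterDown(Exception):
    """Hypothetical stand-in for a CLUSTERDOWN error reply."""


# Hypothetical shared status store, keyed by cluster name or seed list.
# In the proposal this role would be played by an APCu entry or a
# server-wide variable visible to every worker process.
status_cache = {}


def request_with_retry(cluster_key, send, node_timeout, sleep=time.sleep):
    """Call send(); on CLUSTERDOWN wait node_timeout seconds and retry once.

    While the cluster is already flagged as down, only a single attempt
    is made, so parallel requests do not all pay the waiting time.
    """
    if status_cache.get(cluster_key) == "CLUSTERDOWN":
        # Cluster already known to be down: one attempt, no wait.
        result = send()  # raises ClusterDown if it is still down
        status_cache.pop(cluster_key, None)  # assumption: success clears the flag
        return result

    try:
        return send()
    except ClusterDown:
        # First CLUSTERDOWN: give a replica time to be promoted.
        sleep(node_timeout)
        try:
            return send()
        except ClusterDown:
            # Still down after the wait: remember it and report the error.
            status_cache[cluster_key] = "CLUSTERDOWN"
            raise
```

Here send would be whatever issues the actual command to the cluster. Keying the cache by cluster name (or seed list) is what lets unrelated clusters on the same server fail independently.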
IMHO this gives requests one more chance to be satisfied, and only in the case where a master node has failed and a replica is about to be promoted to master.
This new option should be set to at least the Redis cluster-node-timeout value (taking cluster-slave-validity-factor into account if it differs from the default) plus 1 second, to leave time for a replica to be promoted to master.
It is worth noting that if node_timeout is set too high, every parallel request will wait that long until the first request that gets a CLUSTERDOWN response sets the in-memory cached variable to 'CLUSTERDOWN'. With many parallel requests, for example in a LAMP scenario, this could exhaust the web server's maximum request threshold or the maximum number of php-fpm processes allowed. In my environment I wouldn't set node_timeout to more than 6-7 seconds if the Redis cluster-node-timeout is set to 4-5 seconds.
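To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The function names and the traffic figures are made up for illustration. The estimate assumes a steady arrival rate in which every request hits the cluster and blocks for the full node_timeout before the cached flag is set.

```python
def min_node_timeout(cluster_node_timeout_s, margin_s=1.0):
    """Lower bound suggested above: cluster-node-timeout plus a 1 second margin."""
    return cluster_node_timeout_s + margin_s


def workers_held(requests_per_second, node_timeout_s):
    """Estimated number of workers blocked at once while requests wait.

    Arrival rate multiplied by waiting time (Little's law), assuming
    every incoming request waits for the full node_timeout.
    """
    return requests_per_second * node_timeout_s


# Illustrative numbers only: cluster-node-timeout of 5 s, 50 requests/s,
# and a php-fpm pool limited to 100 processes (pm.max_children).
timeout_s = min_node_timeout(5)        # 6.0 seconds
blocked = workers_held(50, timeout_s)  # 300.0 workers
pool_exhausted = blocked > 100         # True: the pool would fill up
```

Under these assumed figures the pool would be exhausted well before the first retry completes, which is why a value only slightly above cluster-node-timeout seems the safer choice.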
I hope this idea can help to resolve issue #1270.
Thank you in advance for your considerations!