Skip to content

[maintenance events] Push notification listener #4194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 29 commits into
base: feature/maintenance-events
Choose a base branch
from

Conversation

ggivo
Copy link
Collaborator

@ggivo ggivo commented Jul 1, 2025

Description

This PR enhances Jedis and UnifiedJedis with improved support for RESP3 push messages and introduces a configurable option for relaxed timeout behavior.

These changes are part of a broader effort to support more robust and controlled upgrade and maintenance scenarios. Specifically, they lay the groundwork for:

  • Temporarily relaxing timeouts during planned maintenance windows.
  • Gracefully re-binding pooled connections to updated endpoints without disrupting active operations.

The goal is to increase resilience and flexibility in production environments that require connection stability during topology changes or Redis upgrades.

Backround

Starting with RESP3, Redis servers can send push messages to clients out-of-band (i.e., outside the standard request/response flow). See the RESP3 Push Specification for more details.

In the current state of the client:

  • Jedis supports only known Push notifications like invalidate. Push notifications are supported using dedicated client-side caching (CSC) connection (CacheConnection)
  • This leads to runtime errors if CLIENT TRACKING ON is enabled in a context other than CSC, or if other RESP3 push events are received.

Changes Introduced

  • Added support for processing and discarding all RESP3 push messages, preventing errors in non-CSC scenarios.
  • Introduced support for registering a custom PushListener, enabling applications to handle push notifications as needed.
  • Pub/Sub push messages are still handled as before and propagated to the client for backward compatibility.
  • Added a configurable option to enable relaxed timeout behavior, intended for usage during maintenance events or when expecting slow responses from the server.

ggivo added 3 commits June 23, 2025 13:23
   - Preparation step for processing custom push notifications
   - Push notification can appear out-of band in-between executed commands
   - Current Connection implementation does not support out of band Push notifications
   - Meaning it will crash if "CLIENT TRACKING ON is enabled" on regular Jedis Connection and "invalidation" push event is triggered

 This commit provides a way to register push handler for the connection which process incoming push messages, before actual command is executed.  To preserve backward compatibility unprocessed push messages are forward to application logic as before.

   - By default Connection will start with NOOP push handler which marks any incoming push event as processed and skips it
   - On subcsribe/psubscribe a dedicated push handler is registered which propagates to the app only supported push  vents such as (message, subscribe, unsubscribe ...)
   - CacheConection is refactored to use a push handler handling "invalidate" push events only, and skipping any other
This commit adds a new PushHandlerChain class that implements the Chain of
Responsibility pattern for Redis RESP3 push message handling. Key features:

- Allows composing multiple PushHandlers in a processing chain
- Push events propagate through the complete chain in sequence
- Events marked as not processed are propagated to the client application
- Provides both constructor-based and fluent builder API for chain creation
- Includes predefined handlers for common use cases (CONSUME_ALL, PROPAGATE_ALL)
- Supports immutable chain transformations via methods like then(),

The chain approach provides a flexible way to handle different types of push
messages (invalidations, pub/sub, etc.) with specialized handlers while
maintaining a clean separation of concerns.

Example usage:
  PushHandlerChain chain = PushHandlerChain.of(loggingHandler)
      .then(invalidationHandler)
      .then(PushHandlerChain.PROPAGATE_PUB_SUB_PUSH_HANDLER);
  - code clean up
  - added relaxed timeout configuration
  - fix unit tests
@ggivo ggivo marked this pull request as draft July 1, 2025 12:17
@ggivo ggivo self-assigned this Jul 7, 2025
Copy link
Contributor

@atakavci atakavci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!! there are couple of things not entirely clear to me. Just to clear them on my mind i will summarize as i understand;
all push messages goes through all the consumers in the chain,
in current impl, they go through;

  • CONSUME_ALL_HANDLER
  • PUBSUB_ONLY_HANDLER
  • MaintenanceEventConsumer
  • ListenerNotificationConsumer - helps to plug customer app push listeners
  • PushInvalidateConsumer (in case of CSC -CacheConnection-)

what i don't like is each push messages visit all consumers no matter what. And the last one in the list, it kind of decides to set if data handed back to command execution or not.
so we expect each consumer to do whatever it wants to do with message and set the context appropriately.
this setup brings both flexibility and chaotic behaviour as well, because if someone shows up and marks it as they wish, they challenge to other consumer implementations plugged previously.

i think no push messages, other than pub/sub ones should be handed back to command executer anyway.
(and at some point, even they are not going to following the command response path in the case we introduce a way to customer app handle pub/sub messages via push listeners).
IMHO there should be something separate from letting client received/notified of push messages as well. May be with PUBSUB_ONLY_HANDLER functions the same way with ListenerNotificationConsumer but only for pub/sub messages.

let me try to explain the way i see it;

  • a consumer chooses if each message propagates further to next consumer or not, and returns it as a value of consume result.
    Each consumer along the chain should receive only what is let by its preceding consumers. this secures if messages should be seen or not, by any other consumer. Also this is more efficient in case there is a load of invalidation and/or pubsub messages.
interface PushConsumer {
   bool consume(PushMessageContext);
}
  • no messages other than pub/sub ones handed back to customer app (or received by command executor). isProcessed flag in context still remains the same to identify if message is chosen to be returned to the app
  • each push message initializes with default isProcessed=true and there is no need for a CONSUME_ALL_HANDLER. Then current naming of flag "isProcessed" might be suboptimal in this case. Alternatives would be forwardToClient or isRequesterBound

maintenanceEventHandler.addListener(new ConnectionRebindHandler());
}

if (TimeoutOptions.isRelaxedTimeoutEnabled(config.getTimeoutOptions().getRelaxedTimeout())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we sure relaxedTimeout s are something that makes sense as standalone item? All this should be considered part of hitless upgrade setup and shouldnt be some standalone item in the configuration.

Copy link
Collaborator Author

@ggivo ggivo Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I look at them as a functionality not bound to maintenance events (hitless upgrade. Maintenance events are just a way to trigger a relaxed timeout for a period of time, but this can also be done from different external triggers.
Moreover, similar to Lettuce, at some point, we can add a configurable command timeout per command type.

ggivo and others added 4 commits July 24, 2025 09:00
Issue : If Maintenace notifications are received during blocking command, relaxTimeout is enforced instead of infinit timeout.

Fix: Introduce dedicated relax timeout setting for blocking commands. It will fall back to infinit timeout if not set
@ggivo ggivo changed the base branch from master to feature/maintenance-events July 28, 2025 10:44
@ggivo ggivo changed the title Push notification listener [maintenance events] Push notification listener Jul 28, 2025
@ggivo
Copy link
Collaborator Author

ggivo commented Jul 29, 2025

looks good!! there are couple of things not entirely clear to me. Just to clear them on my mind i will summarize as i understand; all push messages goes through all the consumers in the chain, in current impl, they go through;

  • CONSUME_ALL_HANDLER
  • PUBSUB_ONLY_HANDLER
  • MaintenanceEventConsumer
  • ListenerNotificationConsumer - helps to plug customer app push listeners
  • PushInvalidateConsumer (in case of CSC -CacheConnection-)

what i don't like is each push messages visit all consumers no matter what. And the last one in the list, it kind of decides to set if data handed back to command execution or not. so we expect each consumer to do whatever it wants to do with message and set the context appropriately. this setup brings both flexibility and chaotic behaviour as well, because if someone shows up and marks it as they wish, they challenge to other consumer implementations plugged previously.

i think no push messages, other than pub/sub ones should be handed back to command executer anyway. (and at some point, even they are not going to following the command response path in the case we introduce a way to customer app handle pub/sub messages via push listeners). IMHO there should be something separate from letting client received/notified of push messages as well. May be with PUBSUB_ONLY_HANDLER functions the same way with ListenerNotificationConsumer but only for pub/sub messages.

let me try to explain the way i see it;

  • a consumer chooses if each message propagates further to next consumer or not, and returns it as a value of consume result.
    Each consumer along the chain should receive only what is let by its preceding consumers. this secures if messages should be seen or not, by any other consumer. Also this is more efficient in case there is a load of invalidation and/or pubsub messages.
interface PushConsumer {
   bool consume(PushMessageContext);
}
  • no messages other than pub/sub ones handed back to customer app (or received by command executor). isProcessed flag in context still remains the same to identify if message is chosen to be returned to the app
  • each push message initializes with default isProcessed=true and there is no need for a CONSUME_ALL_HANDLER. Then current naming of flag "isProcessed" might be suboptimal in this case. Alternatives would be forwardToClient or isRequesterBound

We can simplify the logic by switching the default and removing the need for CONSUME_ALL_HANDLER.
(Done)

However, all applicable consumers must still be invoked if present. For example, if a user registers a listener for maintenance events, the MaintenanceEventConsumer must be called; likewise, we need the ListenerNotificationConsumer to notify any registered PushListener.

Until we refactor the Pub/Sub system and verify that no external code relies on propagated push events, we still need a mechanism to determine whether a given Push event should be propagated. Any consumer may update this decision, so the order of consumers remains important.

Implementing an option for skipping a consumer is possible, but if propagation is disabled too early, again we end up with the same uncertainty. So I think the final resolution is to make sure no push messages are propagated at all, but don't think this can be handled easily and need to be tackled separately.

My idea is that PushConsumers should be kept internal and should not be exposed to users. Instead, users should interact with the system via PushListener.

@ggivo
Copy link
Collaborator Author

ggivo commented Jul 29, 2025

Discussed with @atakavci and we came up with the idea that it will be more optimal if instead of invoking the consumers, for each push message and each consumer decides if it is interested to process it, it will be better for consumers to register themself for a specific message type e.g. ("INVALIDATE") and get invoked only for it.

Considering this an optimisation, we will try to improve it with a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants