Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support async RESP responses to pending GET calls (due to disk read) #387

Merged
merged 64 commits into from
May 29, 2024

Conversation

badrishc
Copy link
Contributor

@badrishc badrishc commented May 15, 2024

Supports the following:

  • async on command to turn on async mode for session
  • async off command to turn off async mode for session
  • async barrier command to complete all ongoing pending ops on that session

Limitations:

  • Async works only when RESP3 is turned on (HELLO 3)
  • Async works only for GET operation right now, when the GET hits disk (goes pending)

Protocol summary:

  • Immediate (sync) response of: -ASYNC token to GET that goes pending
  • Asynchronous PUSH RESP3 reply on the same stream (TCP connection) as per the format in the linked proposal by @mgravell

Fixes #157

Open tasks:

  • Prevent interspersing of multi-part command results that might arrive and send replies in chunks (e.g., MGET, MSET) with async results.
  • Verify no perf impact for non-async case, due to networkSender locking; optimize if needed

@badrishc badrishc changed the title [Sketch] Support for async out-of-band RESP responses - focusing on GET [WIP] Support for async out-of-band RESP responses - focusing on GET May 16, 2024
Copy link
Contributor

@mgravell mgravell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome to see this being investigated, and I fully understand this is very early! probably the main thing to describe somewhere is: what exactly is the expected usage, in particular what this means in terms of connections and RESP flow. Who sends and receives what, etc - and on which connection (out-of-band kinda demands 2 connections, or RESP3). Happy to help poke with sample client usage as this develops - I'll keep a watch, but feel free to ping me.

libs/server/API/IGarnetAdvancedApi.cs Show resolved Hide resolved
libs/server/Resp/AsyncProcessor.cs Show resolved Hide resolved
@badrishc
Copy link
Contributor Author

badrishc commented May 16, 2024

How to test out-of-band responses

  1. Launch GarnetServer in low memory mode (only 2 pages of 4k size each) with storage tier enabled

GarnetServer --memory 8k --page 4k --logger-level Trace --storage-tier true --port 6379

  1. Insert 3 keys of around 4000-byte values so that oldest key (key1) has to evict to storage

redis-cli -h 127.0.0.1 mset key1 "$(head -c 3000 /dev/urandom | base64)" key2 "$(head -c 3000 /dev/urandom | base64)" key3 "$(head -c 3000 /dev/urandom | base64)"

  1. Use telnet to try the new API on single session, as no client supports it yet:

telnet 127.0.0.1 6379

Request:

*2
$5
hello
$1
3
*2
$5
async
$2
on

Reply:

<...hello response...>
+OK

Request:

*2
$3
get
$4
key1

Reply (sync):

-ASYNC 0

Reply (async):

>3
$5
async
$1
0
$4052
8klktiEJiSfag5XvDiMCKXLv6G8GOvTF1U5hXsGtJ3D7KIL1N8MwVv8xPwxnsBw7QwZfQVP0MzlN
<...snipped...>
2RR9d3nCbWAx9oclQpdUEV5iskgomho/OZf3kHY7bsi5lWT8

Request:

*2
$5
async
$7
barrier

Reply:

+OK

@badrishc badrishc changed the title [WIP] Support for async out-of-band RESP responses - focusing on GET [WIP] Support for async RESP responses to disk cache reads with GET May 16, 2024
@badrishc
Copy link
Contributor Author

badrishc commented May 16, 2024

awesome to see this being investigated, and I fully understand this is very early! probably the main thing to describe somewhere is: what exactly is the expected usage, in particular what this means in terms of connections and RESP flow. Who sends and receives what, etc - and on which connection (out-of-band kinda demands 2 connections, or RESP3). Happy to help poke with sample client usage as this develops - I'll keep a watch, but feel free to ping me.

I have clarified that this is specifically for async responses on the SAME client-server connection. Creating a separate connection to send responses "out-of-band" is not scalable for the intended use case, so we will not pursue that.

When user sends a GET command that hits disk, and async is turned on for the session, the server will immediately reply with -ASYNC token. At some point in the future, the server will send the async response packet back on the same connection.

Needless to mention, we guarantee that the async response will not break in between a sync response to a different command issued on the same connection, while we were waiting for the async to complete in the background. The only exception to this is the async barrier command, for which the response of +OK will arrive after all async packets are delivered (on the same connection as before).

@mgravell
Copy link
Contributor

mgravell commented May 17, 2024 via email

@badrishc
Copy link
Contributor Author

badrishc commented May 17, 2024

IMO that single-connection approach cannot be done safely before RESP3, which adds a token prefix to specify an out-of-band message. I do not believe it is 100% safe or reasonable to hand-wave a "that looks like it might be an async response, treat it as such"

Oh, sure, we can do what it takes for this to be RESP3 compatible (as long as it's not going to add overhead on the critical path or too much work). I just followed the protocol as I understood it from your issue. Any link to exactly what needs to be done?

As mentioned, I'm not seeing how we can reasonably expect clients to double the number of connections to the server just for this.

@mgravell
Copy link
Contributor

mgravell commented May 17, 2024

OK; RESP3; the absolute minimum you'd need here would be:

  • add protover as a concept at the connection level, i.e. default to 2 with capacity to switch to 3 via:
  • implement HELLO in a way that HELLO 3 works (oh, and it MUST also support the optional args; SE.Redis uses all of them, conditionally - i.e. "do we have a password? use it; do we want to set the client lib name? pass it" - totally fine if you just drop the lib name on the floor if you don't have anything sensible to do with it, but you can't reject the HELLO because it was included)
  • validity checks
    • ensure protover is at least 3 before enabling same-connection ASYNC, i.e. disallow if not
    • think about what happens if someone tries HELLO 2 while an ASYNC is outstanding (you can issue HELLO at any time)
  • change out-of-band async to use the Push type, as described here; the short version would be "use >3\r\n instead of *3\r\n"

Now; strictly speaking there's actually a whole bunch of other things that change between RESP2 and RESP3 with differing levels of documentation - there's a whole story there, and the final "here's the changes in one place" never actually got merged, you can see it here: https://github.com/redis/redis-doc/pull/2513/files - however, to be honest, I suspect you can largely limp by without those secondary changes; the reason I say that is:

  • libraries already need to cope with both the RESP2 and RESP3 layouts
  • the most jarring deltas are in Lua, and Garnet doesn't do Lua

SE.Redis copes fine with either layout, for example. I cannot say that every client will be fine if that client has hard-coded RESP3 vs RESP2 response handling.

@badrishc
Copy link
Contributor Author

As first step, added basic HELLO support here: #398

@mgravell
Copy link
Contributor

mgravell commented May 18, 2024 via email

@badrishc
Copy link
Contributor Author

badrishc commented May 21, 2024

@mgravell - this work item is now complete and ready to review. Would be great to see sample client integration for end-to-end async wiring.

@mgravell
Copy link
Contributor

I can confirm that I can connect as RESP3 (although I have not done exhaustive tests), and I've started a very early spike here: https://github.com/StackExchange/StackExchange.Redis/tree/marc/async - zero relevant testing yet

libs/server/Resp/AsyncProcessor.cs Outdated Show resolved Hide resolved
libs/server/Resp/AsyncProcessor.cs Show resolved Hide resolved
libs/server/Resp/AdminCommands.cs Show resolved Hide resolved
libs/server/Resp/AdminCommands.cs Show resolved Hide resolved
libs/server/Resp/BasicCommands.cs Show resolved Hide resolved
Copy link
Contributor

@TalZaccai TalZaccai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with some minor comments

libs/server/Resp/AsyncProcessor.cs Outdated Show resolved Hide resolved
libs/server/Resp/AsyncProcessor.cs Outdated Show resolved Hide resolved
playground/CommandInfoUpdater/GarnetCommandsInfo.json Outdated Show resolved Hide resolved
test/Garnet.test/RespTests.cs Show resolved Hide resolved
@TalZaccai TalZaccai linked an issue May 29, 2024 that may be closed by this pull request
@badrishc badrishc merged commit 53cff60 into main May 29, 2024
33 checks passed
@badrishc badrishc deleted the badrishc/async-resp branch May 29, 2024 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants