Performance tuning via reduced durability #54
Comments
IIUC, currently I like the option of continuing user code before data is written to the disk. It looked like
We could add a 'reallyUnsafeUpdate' function which would make the result immediately available without waiting for serialization. This wouldn't work for the Remote backend, obviously. And God have mercy on your soul (and your data) if you use the function carelessly.
Well, it's not "really unsafe"; it depends on the context. My test code above implements a storage backend for server-side sessions. On most websites it will be no big deal if the session data from the last few seconds is lost forever. However, adding hundreds of milliseconds per request due to
What about, instead of providing a different update function, adding an option that could be set when opening the Local backend? The option could be named "really unsafe" without hurting the legibility of every bit of code that updates something.
Permanently losing data doesn't qualify as "really unsafe"? I hope you're not working in banking. :) I would be willing to downgrade it from "reallyUnsafe" to just "unsafe". At first I was against adding an option to make this behavior the default, but actually that's the only approach that makes sense. An 'unsafeUpdate' function would taint your entire application, not just the thread that used it. So, we want an option for the Local backend to return results immediately without waiting for fsync, and we want a 'waitUntilMyDataIsSafe :: AcidState -> IO ()' function as well.
Well, even in banking I imagine losing session data would be harmless. You could log out some logged-in users, or forget that some users had asked to be logged out. Doesn't look that bad :).
BTW, I like your suggestion.
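The suggestion above (an option on the Local backend to return before the data is on disk, plus a `waitUntilMyDataIsSafe`-style barrier) could look roughly like the following. This is only a minimal sketch using `base`: `Store`, `Durability`, `openStore` and `update` are hypothetical names, not acid-state's API, and an `IORef` list stands in for the fsynced event log.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forever)
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)

-- Hypothetical option chosen when opening the store.
data Durability = WaitForDisk | ReturnEarly

-- Work items for the background writer thread.
data Job
  = Persist Int        -- a change that still has to reach the disk
  | Barrier (MVar ())  -- signalled once every earlier job has been persisted

-- Hypothetical handle; acid-state's real AcidState type looks different.
data Store = Store
  { mode     :: Durability
  , memState :: IORef Int    -- in-memory state, updated immediately
  , onDisk   :: IORef [Int]  -- stand-in for the fsynced event log
  , jobs     :: Chan Job
  }

openStore :: Durability -> IO Store
openStore m = do
  store <- Store m <$> newIORef 0 <*> newIORef [] <*> newChan
  _ <- forkIO (writer store)
  pure store

-- Background thread: a real backend would write and fsync here.
writer :: Store -> IO ()
writer store = forever $ do
  job <- readChan (jobs store)
  case job of
    Persist n    -> modifyIORef' (onDisk store) (n :)
    Barrier done -> putMVar done ()

-- Block until everything queued before this call has been persisted.
waitUntilMyDataIsSafe :: Store -> IO ()
waitUntilMyDataIsSafe store = do
  done <- newEmptyMVar
  writeChan (jobs store) (Barrier done)
  takeMVar done

-- Apply the change in memory; wait for the disk only if the store asks for it.
-- Assumes a single updating thread, so memory order and log order agree.
update :: Store -> Int -> IO Int
update store n = do
  new <- atomicModifyIORef' (memState store) (\s -> (s + n, s + n))
  writeChan (jobs store) (Persist n)
  case mode store of
    WaitForDisk -> waitUntilMyDataIsSafe store
    ReturnEarly -> pure ()
  pure new

main :: IO ()
main = do
  store <- openStore ReturnEarly
  total <- update store 2 >> update store 3
  waitUntilMyDataIsSafe store      -- explicit barrier before trusting the log
  logged <- readIORef (onDisk store)
  print (total, reverse logged)    -- prints (5,[2,3])
```

With `ReturnEarly`, callers only pay for the disk when they call the barrier themselves, which is the tradeoff the session-storage use case is after; with `WaitForDisk`, every `update` behaves like a fully durable one.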
I have some test code that right now is taking 582s of wall time and 51s of CPU time to complete. If I completely remove `fsync` calls by editing the UnixFileIO, I get 16s of wall time and 35s of CPU time. That's a 36-fold improvement to wall time!

I'm not suggesting we should not call `fsync`, as that would rename the package to `aci-state`. However, many DBMSs provide knobs to adjust performance vs durability. For example:

- […] `fsync`, to `fsync` every second or `fsync` every query.
- […] `fsync` entirely. You can keep `fsync` but return query results before `fsync` completes.

I'm not proposing anything concrete that should be done, I'd just like to start a conversation about possible tradeoffs.
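To make the kind of knob described above concrete, here is a small sketch of a policy type covering "never `fsync`", "`fsync` at most once per interval" and "`fsync` on every write". All names (`SyncPolicy`, `shouldSync`, `countSyncs`) are made up for illustration and do not exist in acid-state; times are plain `Int` milliseconds to keep the example pure.

```haskell
-- Hypothetical durability knob; none of these names exist in acid-state.
data SyncPolicy
  = SyncNever      -- leave flushing to the operating system
  | SyncEvery Int  -- fsync at most once per interval (milliseconds)
  | SyncAlways     -- fsync after every write
  deriving (Eq, Show)

-- Decide whether a write arriving at time `now` should trigger an fsync,
-- given the time of the previous fsync (both in milliseconds).
shouldSync :: SyncPolicy -> Int -> Int -> Bool
shouldSync SyncNever _ _ = False
shouldSync SyncAlways _ _ = True
shouldSync (SyncEvery interval) lastSync now = now - lastSync >= interval

-- Count how many fsyncs a sequence of write timestamps would cause,
-- treating time 0 as the moment of the last fsync before the first write.
countSyncs :: SyncPolicy -> [Int] -> Int
countSyncs policy = go 0 0
  where
    go :: Int -> Int -> [Int] -> Int
    go _ acc [] = acc
    go lastSync acc (t : ts)
      | shouldSync policy lastSync t = go t (acc + 1) ts
      | otherwise = go lastSync acc ts

main :: IO ()
main = do
  -- 30 writes, one every 100 ms, over three seconds.
  let writes = [100, 200 .. 3000]
  print (map (\p -> countSyncs p writes) [SyncNever, SyncEvery 1000, SyncAlways])
  -- prints [0,3,30]
```

For this made-up workload the once-per-second policy issues 3 syncs instead of 30, at the cost of losing up to a second of writes on a crash.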