Unexpectedly using the initial state for an existing local disk state component #20

Open
dcoutts opened this issue Oct 24, 2013 · 4 comments

Comments

@dcoutts

dcoutts commented Oct 24, 2013

The behaviour of openLocalState is such that if there is existing on-disk state but no checkpoint yet, then it replays the events on top of the initial state passed to this call of openLocalState, rather than on top of the initial state used to create the on-disk state originally.

This means that if you use code like:

now <- getCurrentTime
db <- openLocalState (initDB now)

then we get (arguably) wrong results. The user expects the initial state to be used only once, when the database is first created. After that a db already exists and it really should use the saved state. If you don't look carefully, it looks like data loss, as if we're reverting to the initial state.
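To make the effect concrete, here is a small pure model of the replay, not acid-state itself. The DB type, the two events and the integer timestamps are all hypothetical stand-ins; the only point is that the same event log folded over two different initial states yields two different results.

```haskell
module Main where

import Data.List (foldl')

-- Hypothetical toy state: a creation timestamp plus a counter.
data DB = DB { createdAt :: Int, counter :: Int }
  deriving (Eq, Show)

-- Hypothetical logged events, standing in for update transactions.
data Event = Increment | StampCounter
  deriving (Eq, Show)

-- Events are pure functions of the state they are applied to.
applyEvent :: DB -> Event -> DB
applyEvent db Increment    = db { counter = counter db + 1 }
applyEvent db StampCounter = db { counter = counter db + createdAt db }

-- Replaying a log is a left fold of the events over a starting state.
replay :: DB -> [Event] -> DB
replay = foldl' applyEvent

-- Stand-in for `initDB now`, with the time passed in as an Int.
initDB :: Int -> DB
initDB now = DB { createdAt = now, counter = 0 }

main :: IO ()
main = do
  let eventLog = [Increment, StampCounter]
  -- First run: the db is created at "time" 100 and the events are logged.
  print (replay (initDB 100) eventLog)
  -- Restart at "time" 200 with no checkpoint: same log, different state.
  print (replay (initDB 200) eventLog)
```

The two printed states differ even though nothing was written between the runs, which is what makes it look like data loss.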

Given the mechanism that acid-state's Local driver uses, I think the best fix would be to write the initial state into that first checkpoints-0000000000.log rather than leaving it empty.

@dcoutts
Author

dcoutts commented Oct 24, 2013

For other users: in the meantime a workaround would be to create a checkpoint manually after opening the db, though it'd be best to optimise this so you only create the checkpoint if the db didn't exist previously.
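A minimal sketch of that workaround follows. It assumes that the existence of the state directory is an acceptable proxy for "the db existed previously"; that check is a heuristic of this sketch, not something acid-state provides. openWithInitialCheckpoint is a hypothetical helper name, while openLocalStateFrom and createCheckpoint come from Data.Acid.

```haskell
module OpenWithCheckpoint (openWithInitialCheckpoint) where

import Control.Monad (unless)
import Data.Acid (AcidState, IsAcidic, createCheckpoint, openLocalStateFrom)
import Data.SafeCopy (SafeCopy)
import Data.Typeable (Typeable)
import System.Directory (doesDirectoryExist)

-- Hypothetical helper: open the state stored in `dir`, and write a
-- checkpoint straight away if the directory was not there beforehand.
-- The constraint list is deliberately generous so that it covers the
-- slightly different signatures of different acid-state versions.
openWithInitialCheckpoint
  :: (IsAcidic st, SafeCopy st, Typeable st)
  => FilePath -> st -> IO (AcidState st)
openWithInitialCheckpoint dir initial = do
  -- Assumption: no directory means no previously created db.
  existed <- doesDirectoryExist dir
  db <- openLocalStateFrom dir initial
  -- Persist the initial state so later opens do not depend on `initial`.
  unless existed (createCheckpoint db)
  return db
```

This only narrows the problem: if the process dies between opening the state and finishing the checkpoint, the next open sees an existing directory and no checkpoint, and falls back to the behaviour described above.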

@dag
Contributor

dag commented Nov 6, 2013

I think this is a subtly important issue. Since transactions need to be pure, the only way to do things like, say, generating version 4 UUIDs is to put a seed in the initial state. Now if you don't checkpoint, all your IDs will change every time you replay the transaction log (i.e. restart your application), assuming you randomize the initial seed.

@lemmih
Member

lemmih commented Nov 11, 2013

Agreed. Will fix this in the next iteration.

@dcoutts
Author

dcoutts commented Aug 12, 2015

I just bumped into this issue again, but now with regard to migrations. With complex migrations it's sometimes useful to insert a flag into the state and then, in the application logic, check that flag and perform some complex one-off transaction. So suppose we originally had:

data OurState = OurState { theRealData :: Map This That }

And now we introduce this new flag:

data OurState = OurState { theRealData :: Map This That, doneTheTrickyMigration :: Bool }

Obviously the safe-copy migration will set this flag to False, but what do we now use for the initialOurState value passed to openLocalState? For newly created dbs we would want it to be True, but if we do use:

initialOurState = OurState Map.empty True

then when there's no checkpoint yet, an existing db ends up with the flag being True, which is obviously wrong. This can be even worse if some of the transactions themselves need to consult the flag.

The basic problem is that by changing the initialOurState value we're actually changing the past; changing the initial state on which we replay all the old transactions. So for correctness, a newly initialised acid state component really must create a checkpoint with that original initial value, so that it's not affected by subsequent code changes to the initial value for new dbs.
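The difference can be sketched with a pure toy model of the store, not acid-state's actual on-disk format. Store, createWithoutCheckpoint, createWithCheckpoint, logEvent and open are hypothetical names, theRealData is simplified to an association list, and oldInitial stands for what the old initial value looks like after the migration has set the flag to False.

```haskell
module Main where

import Data.List (foldl')
import Data.Maybe (fromMaybe)

-- Simplified version of the state from the issue (a list replaces the Map).
data OurState = OurState
  { theRealData            :: [(Int, String)]
  , doneTheTrickyMigration :: Bool
  } deriving (Eq, Show)

-- Hypothetical logged event.
newtype Event = Insert (Int, String)

applyEvent :: OurState -> Event -> OurState
applyEvent st (Insert kv) = st { theRealData = kv : theRealData st }

-- Toy store: an optional checkpoint plus the events logged after it.
data Store = Store { checkpoint :: Maybe OurState, eventLog :: [Event] }

-- Behaviour described in the issue: creation writes no checkpoint.
createWithoutCheckpoint :: OurState -> Store
createWithoutCheckpoint _ = Store Nothing []

-- Proposed behaviour: creation stores the initial state as a checkpoint.
createWithCheckpoint :: OurState -> Store
createWithCheckpoint initial = Store (Just initial) []

logEvent :: Event -> Store -> Store
logEvent e s = s { eventLog = eventLog s ++ [e] }

-- Opening replays the log on the checkpoint if there is one, and on the
-- caller's initial value otherwise.
open :: OurState -> Store -> OurState
open initial store =
  foldl' applyEvent (fromMaybe initial (checkpoint store)) (eventLog store)

oldInitial, newInitial :: OurState
oldInitial = OurState [] False  -- old initial value, as seen after migration
newInitial = OurState [] True   -- initial value wanted for brand-new dbs

main :: IO ()
main = do
  let e      = Insert (1, "a")
      legacy = logEvent e (createWithoutCheckpoint oldInitial)
      fixed  = logEvent e (createWithCheckpoint oldInitial)
  print (doneTheTrickyMigration (open newInitial legacy))  -- True
  print (doneTheTrickyMigration (open newInitial fixed))   -- False
```

In the legacy store the code change to the initial value leaks into an existing db; in the store that checkpointed on creation, the value passed to open is ignored and the flag stays False until the application runs the one-off migration.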

dcoutts added a commit to haskell/hackage-server that referenced this issue Aug 12, 2015