
Append/writes recovery failure due to inconsistent inode numbers #66

Open
OmSaran opened this issue May 2, 2022 · 0 comments

OmSaran commented May 2, 2022

The append recovery logic currently depends on the inode numbers of the file and the staging file, as stored in the append log. However, the inode number of the file may change across recovery in some scenarios.

Consider the following example happening in order (SplitFS Strict mode):

  1. file1 is created with size 0. Let its inode number be 1. A LOG_FILE_CREATE entry is added to the op log.
  2. An append is performed on file1. The contents are written to a staging file with, say, inode number 2.
    An append log entry is also created, storing the source (1) and destination (2) inode numbers.
  3. The system crashes (power failure) before any fsync. Let's assume the crash happens after the append/write call has returned to the application.

During recovery, the following happens:

  1. Op log recovery replays step 1 of the example and re-creates the file via ext4-dax (file1 was lost due to the lack of fsync, so it relies on log recovery). The new inode number is not guaranteed to be 1. Let's say it is 3 now.
  2. Append log recovery attempts the relink using the inode number recorded in the append log (1), which is now invalid because the file has inode 3, and thus the append is lost.
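
The mismatch can be illustrated with a toy model. This is only a sketch: `ToyFS`, `crash_losing`, and the dictionary-based append log are hypothetical stand-ins, not SplitFS or ext4-dax code, and the sequential inode allocator is an assumption chosen to reproduce the numbers used in the example.

```python
import itertools


class ToyFS:
    """Hypothetical stand-in for a file system that hands out inode numbers."""

    def __init__(self):
        self._next_ino = itertools.count(1)  # assumption: inodes allocated sequentially
        self.inodes = {}                     # path -> inode number

    def create(self, path):
        ino = next(self._next_ino)
        self.inodes[path] = ino
        return ino

    def crash_losing(self, path):
        # Model a crash in which an un-fsynced file does not survive.
        self.inodes.pop(path, None)


fs = ToyFS()
file_ino = fs.create("file1")          # step 1: file gets inode 1
staging_ino = fs.create("staging")     # step 2: staging file gets inode 2
append_log = [{"file_ino": file_ino, "staging_ino": staging_ino}]

fs.crash_losing("file1")               # step 3: crash before fsync
new_ino = fs.create("file1")           # op log recovery re-creates file1

# Append log entries whose file inode no longer refers to a live file.
live = set(fs.inodes.values())
stale = [e for e in append_log if e["file_ino"] not in live]
```

In this model the re-created file gets a fresh inode number, so the entry in `append_log` still refers to the old one and ends up in `stale`.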

To fix this, one solution I can think of is to track old and new inode numbers during op log recovery by building a mapping from each old inode number to its new one. Append log recovery would then consult the mapping and use the new inode number in place of the old one.
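
A minimal sketch of that idea, in Python for brevity rather than in the project's own code. The function names, the entry layout, and the `create_file` callback are all hypothetical. It also assumes that the old inode number is available for each LOG_FILE_CREATE entry during op log recovery, which the description above does not confirm.

```python
def recover_op_log(op_log, create_file):
    """Replay create entries and return a mapping {old_ino: new_ino}.

    `create_file` is a hypothetical callback that re-creates the file and
    returns the inode number the file system assigned to it.
    """
    ino_map = {}
    for entry in op_log:
        if entry["op"] == "LOG_FILE_CREATE":
            new_ino = create_file(entry["path"])
            # Assumption: the old inode number is known for this entry.
            ino_map[entry["old_ino"]] = new_ino
    return ino_map


def recover_append_log(append_log, ino_map):
    """Return (file_ino, staging_ino) pairs with file inodes translated.

    Inodes that were not re-created during op log recovery are left unchanged.
    """
    return [
        (ino_map.get(e["file_ino"], e["file_ino"]), e["staging_ino"])
        for e in append_log
    ]


# The scenario from the example: file1 had inode 1 and is re-created as inode 3.
op_log = [{"op": "LOG_FILE_CREATE", "path": "file1", "old_ino": 1}]
append_log = [{"file_ino": 1, "staging_ino": 2}]

ino_map = recover_op_log(op_log, create_file=lambda path: 3)
relinks = recover_append_log(append_log, ino_map)
```

Here `relinks` pairs the re-created file's inode (3) with the staging inode (2), so the relink would target the file that actually exists after recovery. The mapping only needs to live for the duration of recovery, since it is rebuilt from the op log each time.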
