Improve non-transactional metadata-sync experience considering clusters with large number of shards #7487

Open
onurctirtir opened this issue Feb 8, 2024 · 3 comments

Comments

@onurctirtir
Member

  1. Investigate how we end up with a large number of locks in non-transactional mode.
  2. Make non-transactional metadata-sync truly idempotent:
    e.g., the commands that we send to detach partitions are not truly idempotent even if we use ALTER TABLE IF EXISTS DETACH PARTITION.
    IF EXISTS guards us only against the case where the parent relation does not exist. It doesn't save us in the cases where:
    • the child relation does not exist
    • the child relation is not a partition of the parent relation
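
One way to cover the two cases listed above is to wrap the detach in a guard that consults pg_inherits first. The following is a minimal sketch, not the Citus implementation: `build_idempotent_detach_command` is a hypothetical helper, and it assumes `parent` is a partitioned table and that both names arrive already quoted, schema-qualified, and safe to embed in SQL.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Citus): writes a guarded detach command
 * into buf. The generated DO block runs DETACH PARTITION only when
 * pg_inherits still records child as a child of parent. to_regclass()
 * returns NULL for a missing relation, so a missing parent, a missing
 * child, or an already detached child all turn the command into a no-op.
 *
 * Returns 0 on success, -1 if buf is too small or formatting fails.
 */
int
build_idempotent_detach_command(char *buf, size_t buflen,
                                const char *parent, const char *child)
{
    int written = snprintf(buf, buflen,
        "DO $$\n"
        "BEGIN\n"
        "  IF EXISTS (\n"
        "    SELECT 1 FROM pg_catalog.pg_inherits\n"
        "    WHERE inhparent = to_regclass('%s')\n"
        "      AND inhrelid = to_regclass('%s')\n"
        "  ) THEN\n"
        "    ALTER TABLE %s DETACH PARTITION %s;\n"
        "  END IF;\n"
        "END\n"
        "$$;",
        parent, child, parent, child);

    return (written < 0 || (size_t) written >= buflen) ? -1 : 0;
}
```

Because the guard and the detach travel as one command, re-sending it after a partially completed sync should not raise an error on a node that has already detached the partition.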
@onurctirtir
Member Author

onurctirtir commented Feb 8, 2024

@aykut-bozkurt / @halilozanakgul / @mtuncer / @vlkncetin, please feel free to add more comments if I'm missing something.

cc: @austuner / @pinodeca on prioritization of this issue.

@aykut-bozkurt
Contributor

  1. The locks that accumulate on the coordinator are normal. Even for a huge cluster there are not that many of them, so they do not cause any failure. We verified that non-transactional sync regularly clears the accumulated locks on all workers by using many small transactions.
  2. This sounds right to me. We need to make sure that every place where we propagate "detach partition" is idempotent. This is especially important for non-transactional metadata-sync, since it can leave the sync in an intermediate state that still has to be completed.

@aykut-bozkurt
Contributor

I found one place where we accumulate locks on the coordinator. extern_IsColumnarTableAmTable points to IsColumnarTableAmTable, which acquires a lock on the relation without releasing it. See below:

if (accessMethod == NULL && extern_IsColumnarTableAmTable(relationId))

We can solve this with a variant of IsColumnarTableAmTable that takes no lock. It is fine to use that variant in the snippet above, since we already take a lock on the metadata tables there.
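
To show why a per-relation check matters at this scale, here is a toy model of the lock accounting. It is only an illustration: every name in it is hypothetical, it does not call any PostgreSQL or Citus API, and it assumes that a lock taken by the check stays held until the surrounding transaction ends.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one transaction's lock table; all names are hypothetical. */
typedef struct ToyTransaction
{
    size_t heldLocks;   /* locks that stay held until the transaction ends */
} ToyTransaction;

/* Models a check that locks the relation and keeps the lock afterwards. */
static bool
toy_is_columnar_with_lock(ToyTransaction *xact, bool relationIsColumnar)
{
    xact->heldLocks++;  /* acquired here, never released before commit */
    return relationIsColumnar;
}

/* Models the proposed variant: same answer, lock table untouched. */
static bool
toy_is_columnar_no_lock(ToyTransaction *xact, bool relationIsColumnar)
{
    (void) xact;
    return relationIsColumnar;
}

/* Runs the check once per relation in one transaction; returns locks held. */
size_t
toy_sync_lock_count(size_t relationCount, bool useLockingCheck)
{
    ToyTransaction xact = {0};

    for (size_t i = 0; i < relationCount; i++)
    {
        if (useLockingCheck)
            (void) toy_is_columnar_with_lock(&xact, false);
        else
            (void) toy_is_columnar_no_lock(&xact, false);
    }

    return xact.heldLocks;
}
```

In this model the locking check leaves one held lock per relation visited, while the no-lock variant leaves none, which is the behavior the proposed variant aims for.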
