Unexpected error when moving partition shards with different owners than the parent's #7585

Open
agedemenli opened this issue Apr 17, 2024 · 0 comments


In LogicallyReplicateShards (multi_logical_replication.c), we filter partitioned shard intervals out of the shard list using the function PrepareReplicationSubscriptionList.

/*
 * PrepareReplicationSubscriptionList returns list of shards to be logically
 * replicated from given shard list. This is needed because Postgres does not
 * allow logical replication on partitioned tables, therefore shards belonging
 * to a partitioned tables should be exluded from logical replication
 * subscription list.
 */

Then we create the publication info hash from the filtered list, which excludes the partitioned shards. But when creating the target list, we pass the original, unfiltered list, which still includes the partitioned shard.

	List *replicationSubscriptionList = PrepareReplicationSubscriptionList(shardList);
	....
	....
	List *logicalRepTargetList = CreateShardMoveLogicalRepTargetList(publicationInfoHash, shardList);

	HTAB *groupedLogicalRepTargetsHash = CreateGroupedLogicalRepTargetsHash(logicalRepTargetList);

As a result, the publication info hash is keyed only by the owner ids of the filtered shard intervals (partitioned ones excluded). When creating the target list, Citus expects the owner id of every shard in the unfiltered list to be present among the hash keys. That lookup fails for an owner that appears only on a partitioned shard, because the partitioned shards were filtered out before the keys were created.

This doesn't cause any error if the ignored (partitioned) shard's owner also owns another shard in the list, most likely one of its children. But when the parent's owner differs from the children's, Citus errors out because the parent's owner id is not among the hash keys.
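
The mismatch can be illustrated outside Citus with a small standalone sketch. Everything here is hypothetical and for illustration only: DemoShard, FilterOutPartitioned, OwnerIsKey, CountMissingOwners and DemoMissingOwners are made-up names, not Citus structs or functions, and a linear scan stands in for the real hash table. The sketch builds the set of owner keys from the filtered list, then looks up owners from either the unfiltered or the filtered list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shard interval; NOT the Citus struct. */
typedef struct
{
	unsigned long shardId;
	unsigned int ownerId;
	bool isPartitioned;
} DemoShard;

/* Copy only the non-partitioned shards into out and return how many were kept. */
static size_t
FilterOutPartitioned(const DemoShard *shards, size_t count, DemoShard *out)
{
	assert(shards != NULL && out != NULL);

	size_t kept = 0;
	for (size_t i = 0; i < count; i++)
	{
		if (!shards[i].isPartitioned)
		{
			out[kept++] = shards[i];
		}
	}
	return kept;
}

/* True if ownerId owns at least one of the shards that were used to build the keys. */
static bool
OwnerIsKey(const DemoShard *keyShards, size_t keyCount, unsigned int ownerId)
{
	for (size_t i = 0; i < keyCount; i++)
	{
		if (keyShards[i].ownerId == ownerId)
		{
			return true;
		}
	}
	return false;
}

/* Count the shards in lookupShards whose owner is not among the keys. */
static size_t
CountMissingOwners(const DemoShard *keyShards, size_t keyCount,
				   const DemoShard *lookupShards, size_t lookupCount)
{
	size_t missing = 0;
	for (size_t i = 0; i < lookupCount; i++)
	{
		if (!OwnerIsKey(keyShards, keyCount, lookupShards[i].ownerId))
		{
			missing++;
		}
	}
	return missing;
}

/*
 * One partitioned parent shard and one child shard (shard ids are arbitrary).
 * Keys always come from the filtered list; lookups use either the unfiltered
 * list (the behavior described in this issue) or the filtered list.
 */
size_t
DemoMissingOwners(unsigned int parentOwner, unsigned int childOwner,
				  bool lookupWithFilteredList)
{
	DemoShard all[2] = {
		{ 1UL, parentOwner, true },   /* partitioned parent */
		{ 2UL, childOwner, false }    /* child partition */
	};
	DemoShard filtered[2];
	size_t filteredCount = FilterOutPartitioned(all, 2, filtered);

	if (lookupWithFilteredList)
	{
		return CountMissingOwners(filtered, filteredCount, filtered, filteredCount);
	}
	return CountMissingOwners(filtered, filteredCount, all, 2);
}
```

With the same owner for parent and child, DemoMissingOwners reports no missing owners. With different owners and a lookup over the unfiltered list, it reports one missing owner, which corresponds to the failing case above. Looking up with the filtered list reports none, which suggests one possible direction for a fix, though whether that is the right change inside Citus is an assumption not confirmed here.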

To reproduce:

SET citus.next_shard_id TO 980000;
CREATE TABLE partitioning_hash_test(id int, subid int) PARTITION BY HASH(subid);
CREATE TABLE partitioning_hash_test_1 PARTITION OF partitioning_hash_test FOR VALUES WITH (MODULUS 3, REMAINDER 0);
 
create role r1 with login superuser;
alter table partitioning_hash_test owner to r1;
SELECT create_distributed_table('partitioning_hash_test', 'id');
select citus_move_shard_placement(980000, 'localhost', 9701, 'localhost', 9702, 'force_logical');