[Feat] KNN filtering with limit and KNN distance function #4036

Merged
merged 28 commits into from
May 24, 2024

emmanuel-keller
Contributor

@emmanuel-keller emmanuel-keller commented May 14, 2024

What is the motivation?

KNN operations should be able to be filtered, as described in #4000.

What does this change do?

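This PR modifies the KNN search so that it still reaches the K goal when the query has a filter: documents that do not match the WHERE clause are bypassed, and the search continues until K matching documents have been found (or the candidates are exhausted).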

It also introduces the vector::distance::knn() function, which returns the distance computed during the query (avoiding recomputation).

E.g.:

SELECT id, vector::distance::knn() AS distance FROM pts
WHERE flag = true && point <|2|> [4.5f] ORDER BY distance;

  • HNSW implementation
  • MTree implementation
  • Bruteforce implementation
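
To illustrate the behavior described above, here is a minimal brute-force sketch in Rust. It is not the code from this PR: the `Point` struct, the `filtered_knn` function, and the sample data are hypothetical, and the HNSW and MTree indexes traverse their structures differently. The sketch only shows the idea that non-matching documents are skipped before counting toward K, and that each result carries the distance already computed during the search.

```rust
/// Hypothetical record mirroring the `pts` table in the example query.
/// The field names are assumptions made for illustration.
#[derive(Debug, Clone)]
struct Point {
    id: u32,
    flag: bool,
    point: Vec<f64>,
}

/// Euclidean distance between two vectors of equal length.
fn euclidean(a: &[f64], b: &[f64]) -> f64 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// Brute-force filtered KNN (illustrative only).
/// Candidates are ranked by distance, those failing `keep` are skipped,
/// and collection stops once `k` matching candidates are found.
/// Each result keeps its distance so it does not need to be recomputed.
fn filtered_knn<F>(points: &[Point], query: &[f64], k: usize, keep: F) -> Vec<(u32, f64)>
where
    F: Fn(&Point) -> bool,
{
    let mut ranked: Vec<(&Point, f64)> = points
        .iter()
        .map(|p| (p, euclidean(&p.point, query)))
        .collect();
    // Sort by ascending distance; total_cmp gives a total order on f64.
    ranked.sort_by(|a, b| a.1.total_cmp(&b.1));
    ranked
        .into_iter()
        .filter(|(p, _)| keep(*p)) // apply the WHERE-style filter first
        .take(k)                   // then count toward K
        .map(|(p, d)| (p.id, d))
        .collect()
}

fn main() {
    let pts = vec![
        Point { id: 1, flag: true, point: vec![1.0] },
        Point { id: 2, flag: false, point: vec![4.0] },
        Point { id: 3, flag: false, point: vec![5.0] },
        Point { id: 4, flag: true, point: vec![6.0] },
    ];
    // The two nearest points to 4.5 (ids 2 and 3) both have flag = false.
    // Taking K = 2 first and filtering afterwards would return nothing.
    let result = filtered_knn(&pts, &[4.5], 2, |p| p.flag);
    println!("{:?}", result); // prints [(4, 1.5), (1, 3.5)]
}
```

In this sample, a search that picks the 2 nearest neighbors and only then applies `flag = true` would return an empty result, which is the situation reported in the linked issue.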

What is your testing strategy?

Tests have been written (for MTree and HNSW).

Is this related to any issues?

Fixes #4000

Does this change need documentation?

If this pull request requires changes, updates, or improvements to the documentation, then add a corresponding issue on the docs.surrealdb.com repository, and link to it here.

Have you read the Contributing Guidelines?

@emmanuel-keller emmanuel-keller changed the title KNN filtering with limit KNN filtering with limit and knn distance function May 15, 2024
@emmanuel-keller emmanuel-keller marked this pull request as ready for review May 21, 2024 13:05
@emmanuel-keller emmanuel-keller requested review from a team and tobiemh as code owners May 21, 2024 13:05
@emmanuel-keller emmanuel-keller modified the milestones: v2.0.0, v1.6.0-beta.1 May 21, 2024
@emmanuel-keller emmanuel-keller changed the title KNN filtering with limit and knn distance function [Feat] KNN filtering with limit and knn distance function May 22, 2024
@emmanuel-keller emmanuel-keller changed the title [Feat] KNN filtering with limit and knn distance function [Feat] KNN filtering with limit and KNN distance function May 22, 2024
@emmanuel-keller emmanuel-keller added the topic:surrealql and topic:indexing labels May 22, 2024
phughk
phughk previously approved these changes May 23, 2024
Contributor

@phughk phughk left a comment

Nice, really cool thanks!

Review comment on core/src/idx/planner/mod.rs (outdated, resolved)
@emmanuel-keller emmanuel-keller modified the milestones: v1.6.0-beta.1, v2.0.0 May 24, 2024
Contributor

@phughk phughk left a comment

Lgtm thanks!

@emmanuel-keller emmanuel-keller added this pull request to the merge queue May 24, 2024
Merged via the queue into main with commit 7495611 May 24, 2024
22 checks passed
@emmanuel-keller emmanuel-keller deleted the emmanuel/knn_with_limit branch May 24, 2024 14:10
Labels
topic:indexing (This is related to indexing and full-text search)
topic:surrealql (This is related to the SurrealQL query language)
Development

Successfully merging this pull request may close these issues.

Bug: Vector NN search count does not respect WHERE clause
3 participants