[BUG] usage of nested knn query and efficient filter with nested query is not correct #2511

Open
wdongyu opened this issue Feb 10, 2025 · 0 comments
Labels
bug Something isn't working untriaged

wdongyu commented Feb 10, 2025

What is the bug?
According to #1356, to run a nested knn query together with an efficient filter on a nested field, the advised query looks like this:

{
  "query": {
    "nested": {
      "path": "test_nested",
      "query": {
        "knn": {
          "test_nested.test_vector": {
            "vector": [
              5
            ],
            "k": 24,
            "filter": {
              "nested": {
                "path": "test_nested",
                "query": {
                  "term": {
                    "test_nested.parking": "false"
                  }
                }
              }
            }
          }
        }
      },
      "inner_hits": {}
    }
  }
}

However, this query returns the wrong number of hits. The following query, which passes the term filter directly without the nested wrapper, works as expected:

{
  "query": {
    "nested": {
      "path": "test_nested",
      "query": {
        "knn": {
          "test_nested.test_vector": {
            "vector": [
              5
            ],
            "k": 24,
            "filter": {
              "term": {
                "test_nested.parking": "false"
              }
            }
          }
        }
      },
      "inner_hits": {}
    }
  }
}
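
The only difference between the two queries is whether the term filter is wrapped in its own nested query. As a minimal sketch (the class and method names are hypothetical and not part of the k-NN plugin), the two request bodies can be generated from one template so the difference is explicit:

```java
public class NestedKnnQueries {

    // Term filter on the nested field, exactly as used in both queries above.
    static final String TERM_FILTER = "{\"term\":{\"test_nested.parking\":\"false\"}}";

    // Builds the outer nested knn query; filterJson is placed verbatim under "filter".
    static String knnQuery(String filterJson) {
        return "{\"query\":{\"nested\":{\"path\":\"test_nested\",\"query\":{\"knn\":{"
            + "\"test_nested.test_vector\":{\"vector\":[5],\"k\":24,\"filter\":" + filterJson + "}}},"
            + "\"inner_hits\":{}}}}";
    }

    // Variant 1: filter wrapped in a nested query (reported to return the wrong hit count).
    static String nestedFilterQuery() {
        return knnQuery("{\"nested\":{\"path\":\"test_nested\",\"query\":" + TERM_FILTER + "}}");
    }

    // Variant 2: bare term filter (reported to work as expected).
    static String plainFilterQuery() {
        return knnQuery(TERM_FILTER);
    }

    public static void main(String[] args) {
        System.out.println(nestedFilterQuery());
        System.out.println(plainFilterQuery());
    }
}
```

Either printed body can be sent to the _search endpoint of the index under test to compare the total hits of the two variants.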

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Based on the test case AdvancedFilteringUseCasesIT.testFiltering_whenNestedKNNAndFilterFieldWithNestedQueries_thenSuccess, first increase k to 100, so that the nested knn query can match nested docs in all parent docs (NUM_DOCS = 50):
private static final int k = 100;
  2. In the final validation, change the expected totalSearchHits to NUM_DOCS / 2 = 25, since half of the docs meet the filter requirement:
Assert.assertEquals("For engine " + engine + ", totalSearchHits: ", NUM_DOCS / 2, parseTotalSearchHits(response));
if (KNNEngine.getEngine(engine) == KNNEngine.FAISS) {
    ...
    Assert.assertEquals("For engine " + engine + ", totalSearchHits with ANN search :", NUM_DOCS / 2, parseTotalSearchHits(response));
    ...
}
  3. Run the test. It fails, while the other tests in the same IT pass:
 totalSearchHits:  expected:<25> but was:<24>
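
The expected count of 25 can be checked with a small brute-force model of the intended semantics: filter the nested docs, take the k nearest, and count distinct parents. This is only a sketch. The data layout (one nested doc per parent, a one-dimensional vector equal to the doc index, and an alternating parking flag) is an assumption for illustration and may differ from what the IT actually indexes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExpectedHitsModel {
    static final int NUM_DOCS = 50;
    static final int K = 100;

    // Returns the number of parent docs a filtered nested knn search should hit.
    static int expectedTotalHits(float queryVector) {
        // Hypothetical data: nested doc i belongs to parent i.
        float[] vectors = new float[NUM_DOCS];
        boolean[] parking = new boolean[NUM_DOCS];
        for (int i = 0; i < NUM_DOCS; i++) {
            vectors[i] = i;
            parking[i] = (i % 2 == 0); // half of the docs have parking = false
        }

        // Step 1: apply the filter (parking == false) to the nested docs.
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < NUM_DOCS; i++) {
            if (!parking[i]) {
                candidates.add(i);
            }
        }

        // Step 2: order the survivors by distance to the query vector and keep at most K.
        candidates.sort(Comparator.comparingDouble((Integer i) -> Math.abs(vectors[i] - queryVector)));
        List<Integer> topK = candidates.subList(0, Math.min(K, candidates.size()));

        // Step 3: collapse nested hits to distinct parents.
        Set<Integer> parents = new HashSet<>(topK);
        return parents.size();
    }

    public static void main(String[] args) {
        System.out.println(expectedTotalHits(5f)); // prints 25
    }
}
```

Because k = 100 exceeds the 25 nested docs that pass the filter, none of them can be cut off by the k limit in this model, so every matching parent should be returned.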

What is the expected behavior?
A nested query used inside the efficient filter should work and return the correct number of hits.

What is your host/environment?

  • OS: Debian GNU/Linux 8
  • Version: 2.18.0
  • Plugins: KNN

