[Feature Request] Avoid invalid retries on multiple replicas when querying #17361
Labels: enhancement, Search, untriaged
Is your feature request related to a problem? Please describe
When a query fails on a shard, OpenSearch selects another replica of that shard and retries. However, for certain failures, such as an IllegalArgumentException or a TaskCancelledException, this retry is invalid, because retrying on another replica cannot change the outcome.
OpenSearch/server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java
Line 558 in 9de21d1
In our product, some indices have a replica count greater than 10. This leads to an excessive number of invalid retries, which increases response time for users and puts unnecessary pressure on the cluster.
Describe the solution you'd like
When a query fails on a shard and the exception maps to a 4xx status (IllegalArgumentException or TaskCancelledException), we should fail fast for that shard instead of retrying on another replica.
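A minimal sketch of the proposed behavior, to make the request concrete. It is not the code in AbstractSearchAsyncAction: the class `ShardRetrySketch`, the methods `isRetryableOnAnotherCopy` and `attemptsUntilDone`, and the local `TaskCancelledException` stand-in are all hypothetical names used only so the example compiles on its own.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical, self-contained illustration; none of these names exist in OpenSearch.
class ShardRetrySketch {

    /** Local stand-in for OpenSearch's TaskCancelledException (assumption for this sketch). */
    public static class TaskCancelledException extends RuntimeException {
        public TaskCancelledException(String message) {
            super(message);
        }
    }

    /** Returns false when the failure is one that another shard copy cannot fix. */
    public static boolean isRetryableOnAnotherCopy(Throwable failure) {
        // Walk the cause chain, since a shard failure may arrive wrapped in another exception.
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (t instanceof IllegalArgumentException || t instanceof TaskCancelledException) {
                return false;
            }
        }
        return true;
    }

    /**
     * Tries the shard copies in order and returns how many attempts were made.
     * runOnCopy returns null on success, or the failure raised by that copy.
     */
    public static int attemptsUntilDone(List<String> copies,
                                        Function<String, RuntimeException> runOnCopy) {
        int attempts = 0;
        for (String copy : copies) {
            attempts++;
            RuntimeException failure = runOnCopy.apply(copy);
            if (failure == null || !isRetryableOnAnotherCopy(failure)) {
                break; // success, or fail fast without visiting the remaining copies
            }
        }
        return attempts;
    }
}
```

With a shard that has many copies, a request that fails with IllegalArgumentException would stop after the first attempt here, while a failure such as a node going away would still move on to the next copy.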
Related component
Search
Describe alternatives you've considered
No response
Additional context
No response