Query requests return with sample size defined in advanced setting #9356

Open
wants to merge 6 commits into
base: main
from

Conversation

abbyhu2000
Member

@abbyhu2000 abbyhu2000 commented Feb 7, 2025

Description

The size of the SQL and PPL responses should be determined by advanced setting discover:sampleSize. Default value is 500.

SQL:

Default

SELECT * FROM dataset
  • We use the fetch_size param when sending the SQL request to define how many hits we get back.

PPL:

Default:

source = dataset | head ${advanced setting sampleSize value, default is 500}

async SQL/PPL:

Default:

SELECT * FROM dataset LIMIT ${advanced setting sampleSize value, default is 500}
source = dataset | head ${advanced setting sampleSize value, default is 500}
  • fetch_size is not currently supported for async queries, so we interpolate the sample size into the default queries instead.
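
A minimal sketch of the default queries listed above, for illustration only. The helper name `buildDefaultQuery` and the `isAsync` flag are hypothetical and do not come from this PR; the query shapes and the default of 500 are taken from the description.

```typescript
type QueryLanguage = 'SQL' | 'PPL';

// Hypothetical helper: the name and signature are illustrative, not from the PR.
function buildDefaultQuery(
  language: QueryLanguage,
  datasetTitle: string,
  sampleSize: number = 500, // mirrors the documented default of discover:sampleSize
  isAsync: boolean = false // assumption: the caller knows whether the query runs async
): string {
  if (language === 'PPL') {
    // PPL default queries carry the sample size in the query string itself.
    return `source = ${datasetTitle} | head ${sampleSize}`;
  }
  // Sync SQL relies on fetch_size in the request, so no LIMIT is added.
  // Async SQL cannot use fetch_size, so the limit is interpolated.
  return isAsync
    ? `SELECT * FROM ${datasetTitle} LIMIT ${sampleSize}`
    : `SELECT * FROM ${datasetTitle}`;
}
```

With the defaults, `buildDefaultQuery('PPL', 'dataset')` yields `source = dataset | head 500`.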

Screenshot

Screen.Recording.2025-02-13.at.4.56.28.PM.mov

Testing the changes

Changelog

  • feat: Remove sample size in default query string but use advanced setting

Check List

  • All tests pass
    • yarn test:jest
    • yarn test:jest_integration
  • New functionality includes testing.
  • New functionality has been documented.
  • Update CHANGELOG.md
  • Commits are signed per the DCO using --signoff

Contributor

github-actions bot commented Feb 7, 2025

❌ Empty Changelog Section

The Changelog section in your PR description is empty. Please add a valid changelog entry or entries. If you did add a changelog entry, check to make sure that it was not accidentally included inside the comment block in the Changelog section.

@abbyhu2000 abbyhu2000 force-pushed the sending_size_response branch from d3fe9fb to b164a62 Compare February 7, 2025 23:32

codecov bot commented Feb 7, 2025

Codecov Report

Attention: Patch coverage is 25.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 61.70%. Comparing base (0994dbe) to head (49a56fd).
Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
...gins/query_enhancements/public/datasets/s3_type.ts 0.00% 3 Missing ⚠️
..._enhancements/server/search/sql_search_strategy.ts 33.33% 2 Missing ⚠️
...c/plugins/query_enhancements/server/utils/facet.ts 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9356      +/-   ##
==========================================
- Coverage   61.71%   61.70%   -0.01%     
==========================================
  Files        3816     3817       +1     
  Lines       91829    91847      +18     
  Branches    14543    14549       +6     
==========================================
+ Hits        56668    56670       +2     
- Misses      31506    31520      +14     
- Partials     3655     3657       +2     
Flag Coverage Δ
Linux_1 28.98% <0.00%> (-0.01%) ⬇️
Linux_2 56.46% <ø> (ø)
Linux_3 39.19% <100.00%> (+0.01%) ⬆️
Linux_4 ?
Windows_1 29.00% <0.00%> (-0.01%) ⬇️
Windows_2 56.41% <ø> (ø)
Windows_3 39.19% <100.00%> (+0.01%) ⬆️
Windows_4 28.89% <12.50%> (-0.02%) ⬇️


getInitialQueryString: (query: Query) => {
switch (query.language) {
case 'PPL':
return `source = ${query.dataset?.title} | head 10`;
Member

@kavilla kavilla Feb 10, 2025

This can be a follow-up, but did we want to potentially interpolate the sample size here?

So by default, without changing the advanced settings, this would be source = foo | head 500?

@canascar however, I don't know if that would cause some workload issues.

@@ -43,6 +43,7 @@ export class Facet {
const { format, lang } = request.body;
const params = {
body: {
fetch_size: request.body.fetch_size,
Member

If we still have the context here, are we not able to just access advanced settings in this file and set the param? Or was there a technical blocker that means we should specify the size for each strategy, whether it is set or not?

And should we check whether this value is set, and only pass it if it is? What are the implications of passing undefined to the API? Because right now I believe it will fire a request like

{
  body: {
    fetch_size: undefined,
    query: query.query,
    // ... other properties
  },
  // ... format property if applicable
}

will it be ignored?

Member

@kavilla kavilla Feb 10, 2025

I don't see updates to the async search strategies (like S3), but I think doing it here in the fetch call should actually cover everything. Since async searches eventually hit this fetch method anyway, we should be good.

Mind double-checking that though? Just want to make sure I'm not missing something with the async flow

Member Author

Directly passing the fetch size in the facet will cause async queries to fail, since they don't accept fetch_size as a param.

const fetchSize = await context.core.uiSettings.client.get('discover:sampleSize');
const params = {
  body: {
    fetch_size: fetchSize,
    query: query.query,
    ...(meta?.name && { datasource: meta.name }),
    ...(meta?.sessionId && {
      sessionId: meta.sessionId,
    }),
    ...(lang && { lang }),
  },
  ...(format !== 'jdbc' && { format }),
};
Screenshot 2025-02-12 at 5 51 14 PM

Member Author

I think since async search for S3 and PPL doesn't support fetch_size yet, we can interpolate the default query string for S3 dataset types and PPL queries for now. We get the limit from the advanced setting:

source = ${dataset.title} | head 500

Member Author

Will change to only pass in fetch_size if fetch_size is defined.
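
One way to pass fetch_size only when it is defined, as a rough sketch. `buildRequestBody` and `QueryRequestBody` are hypothetical names, not the code in this PR. As background, `JSON.stringify` drops properties whose value is `undefined`, but whether the transport used here serializes the body that way is an assumption, so omitting the key explicitly avoids relying on it.

```typescript
interface QueryRequestBody {
  query: string;
  fetch_size?: number;
}

// Hypothetical helper: adds fetch_size only when a value was actually provided.
function buildRequestBody(query: string, fetchSize?: number): QueryRequestBody {
  return {
    query,
    // The key is left out entirely rather than being sent as undefined.
    ...(fetchSize !== undefined ? { fetch_size: fetchSize } : {}),
  };
}

// For comparison: plain JSON serialization omits undefined-valued properties.
const serialized: string = JSON.stringify({ fetch_size: undefined, query: 'SELECT 1' });
```

Async strategies, which per the thread do not accept fetch_size, would simply call the helper without a second argument.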

@@ -35,6 +35,7 @@ export const pplSearchStrategyProvider = (
return {
search: async (context, request: any, options) => {
try {
request.body.fetch_size = await context.core.uiSettings.client.get('discover:sampleSize');
Member

this is my bad - I get why you did it this way.

After thinking about it more, pulling from discover's advanced settings on the server side probably isn't the right move. If we want dashboards to have query editor + ppl/sql support down the road, we shouldn't make everything use discover's sample size setting.

however - until we deprecate these search strategies which is something we need to do and move to the right ones (based on data type + proper internal routes), we might be stuck. Could you check if we're already passing size params in the request/options somewhere? If we are, we should probably use that instead.

If we're not though, trying to refactor this now might be more trouble than it's worth, since we know we need to clean it up later anyway. Might make more sense to keep it on the query enhancements side for now and just add a TODO and link whatever issue we open in the code for follow up.

Let me know what you find re: the request params and we can figure out next steps.

Member Author

Got it. Looking at the runSearch and fetch functions in the data plugin, I don't see us passing any size param there.

Do you mind keeping it on the query enhancements side for now?

@@ -33,6 +33,7 @@ export const sqlSearchStrategyProvider = (
return {
search: async (context, request: any, options) => {
try {
request.body.fetch_size = await context.core.uiSettings.client.get('discover:sampleSize');
Member

@kavilla kavilla Feb 10, 2025

nit: instead of modifying objects directly, we should try to spread them like:
const newObj = { ...oldObj, newStuff }

It's a small thing but helps avoid unexpected side effects 👍
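
For illustration, the spread approach applied to the request in question could look roughly like this. `SearchRequest` and `withFetchSize` are hypothetical names with a simplified shape, not the actual types in the search strategy.

```typescript
// Simplified, hypothetical request shape for the sketch.
interface SearchRequest {
  body: { query: string; fetch_size?: number };
}

// Returns a new request with fetch_size set; the input object is left untouched.
function withFetchSize(request: SearchRequest, fetchSize: number): SearchRequest {
  return {
    ...request,
    body: { ...request.body, fetch_size: fetchSize },
  };
}

const original: SearchRequest = { body: { query: 'SELECT * FROM dataset' } };
const updated: SearchRequest = withFetchSize(original, 500);
```

Note that the nested `body` needs its own spread; spreading only the outer object would still share and mutate the same `body` reference.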

Member Author

changed

@@ -98,7 +98,7 @@ const queriesTestSuite = () => {
// Default SQL query should be set
cy.waitForLoader(true);
cy.getElementByTestId(`osdQueryEditor__multiLine`).contains(
Member

js file 😭

@kavilla
Member

kavilla commented Feb 10, 2025

smart! some comments though first.

Signed-off-by: abbyhu2000 <[email protected]>
Signed-off-by: abbyhu2000 <[email protected]>
@abbyhu2000 abbyhu2000 changed the title Remove sample size in default query string but use advanced setting Query requests return with sample size defined in advanced setting Feb 14, 2025
2 participants