Reindexing job reported as completed, even though it didn't process a single record. #4020

Open
rafalbroll-datapharm opened this issue Aug 14, 2024 · 9 comments
Labels
Bug Bug bug bug. VSTS-Planned Planned for an upcoming sprint


@rafalbroll-datapharm

Describe the bug
When running the reindex job for a newly defined custom search parameter, the FHIR Server reports the job as completed even though no record has been indexed (and no resource can be found using the new search parameter).

Hosting details:
The FHIR Server is hosted in an Azure Container App with the following resources assigned:
1 CPU, 2 GB RAM

Data sizes:
The entire Resources table is ~60 GB, 978 676 rows
The biggest Resource row (measured as datalength(rawresource)) is 5 MB; the average is 58 KB

FHIR Version?
R4B, [3.4.342]

Data provider?
SQL Server

To Reproduce
Steps to reproduce the behaviour:

  1. Set up the custom search parameter running the command:
curl --location --request PUT "http://${FHIR_HOST}/SearchParameter/datapharm-product-family" \
--header 'Content-Type: application/json' \
--data-raw '{
 "resourceType" : "SearchParameter",
 "id": "datapharm-product-family",
 "url" : "https://medicines.org.uk/productFamily",
 "version" : "0.0.1",
 "name" : "datapharm-product-family",
 "status" : "active",
 "date" : "2023-02-01",
 "publisher" : "Datapharm Ltd",
 "contact" : [
   {
     "telecom" : [
       {
         "system" : "other",
         "value" : "https://www.datapharm.com/"
       }
     ]
   }
 ],
 "description" : "Searching by product family",
 "jurisdiction" : [
   {
     "coding" : [
       {
         "system" : "https://medicines.org.uk/productFamily",
         "code" : "UK",
         "display" : "United Kingdom"
       }
     ]
   }
 ],
 "code" : "datapharm-product-family",
 "base" : [
   "MedicinalProductDefinition"
 ],
 "type" : "token",
 "expression" : "MedicinalProductDefinition.extension.where(url='\''https://medicines.org.uk/productFamily'\'').value"
}'
  2. Run the reindexing job with the following request body:
{
    "resourceType": "Parameters",
    "parameter": [
        {
            "name": "targetSearchParameterTypes",
            "valueString": "https://medicines.org.uk/productFamily"
        }
    ]
}
  3. Check the results by calling GET /_operations/reindex/<reindex job id>
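
Steps 2 and 3 above can also be scripted. The following is a minimal sketch rather than an official client: it assumes the job is started with `POST /$reindex` and that the status endpoint is the one named in step 3. `build_reindex_parameters`, `start_reindex` and `get_reindex_status` are hypothetical helper names used only for illustration.

```python
import json
import urllib.request


def build_reindex_parameters(search_param_url, max_resources_per_query=None):
    """Build the Parameters body from step 2, optionally with a batch size."""
    parameters = [
        {"name": "targetSearchParameterTypes", "valueString": search_param_url}
    ]
    if max_resources_per_query is not None:
        # Sent as a JSON number here, as valueInteger normally expects.
        parameters.append(
            {
                "name": "maximumNumberOfResourcesPerQuery",
                "valueInteger": max_resources_per_query,
            }
        )
    return {"resourceType": "Parameters", "parameter": parameters}


def _send_json(method, url, body=None):
    """Send an optional JSON body and decode the JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def start_reindex(fhir_host, body):
    # Assumption: the reindex job is created via POST /$reindex.
    return _send_json("POST", f"http://{fhir_host}/$reindex", body)


def get_reindex_status(fhir_host, job_id):
    # Endpoint taken from step 3 of the reproduction steps.
    return _send_json("GET", f"http://{fhir_host}/_operations/reindex/{job_id}")
```

Calling `build_reindex_parameters("https://medicines.org.uk/productFamily")` yields the same body as in step 2; the two network helpers are only needed when talking to a running server.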

Expected behavior
The result is

...
{
    {
       "name":"totalResourcesToReindex",
       "valueDecimal": 18514.0
    },
    {
       "name":"resourcesSuccessfullyReindexed",
       "valueDecimal": 18514.0
    },
    {
       "name":"progress",
       "valueDecimal": 100.0
    },
    {
       "name":"status",
       "valueString": "Completed"
    }
}
...

and searches using that parameter succeed.

Actual behavior
The result is

...
{
    {
       "name":"totalResourcesToReindex",
       "valueDecimal": 18514.0
    },
    {
       "name":"resourcesSuccessfullyReindexed",
       "valueDecimal": 0.0
    },
    {
       "name":"progress",
       "valueDecimal": 0.0
    },
    {
       "name":"status",
       "valueString": "Completed"
    }
}
...

and searches using that parameter fail.
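
A monitoring script can flag this inconsistent state instead of trusting the status alone. The sketch below assumes the job status is delivered as a list of `name`/`value[x]` entries like the excerpt above; `reindex_summary` and `is_false_completion` are hypothetical helper names, not part of the FHIR Server.

```python
def reindex_summary(parameters):
    """Flatten [{"name": ..., "value<Type>": ...}, ...] into {name: value}."""
    summary = {}
    for entry in parameters:
        for key, value in entry.items():
            if key.startswith("value"):
                summary[entry["name"]] = value
    return summary


def is_false_completion(parameters):
    """True when the job says Completed but reindexed fewer resources than targeted."""
    summary = reindex_summary(parameters)
    return (
        summary.get("status") == "Completed"
        and summary.get("resourcesSuccessfullyReindexed", 0)
        < summary.get("totalResourcesToReindex", 0)
    )


# Values copied from the "Actual behavior" excerpt.
actual = [
    {"name": "totalResourcesToReindex", "valueDecimal": 18514.0},
    {"name": "resourcesSuccessfullyReindexed", "valueDecimal": 0.0},
    {"name": "progress", "valueDecimal": 0.0},
    {"name": "status", "valueString": "Completed"},
]
print(is_false_completion(actual))  # True
```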


Applying maximumNumberOfResourcesPerQuery:

In addition, we experimented with adding the maximumNumberOfResourcesPerQuery parameter when starting the reindexing job. That partially works around the problem, but sometimes it still indexes only about 99% of the records.
For example, for a reindex job defined like this:

{
    "resourceType": "Parameters",
    "parameter": [
        {
            "name": "targetSearchParameterTypes",
            "valueString": "https://medicines.org.uk/productFamily"
        },
        {
            "name": "maximumNumberOfResourcesPerQuery",
            "valueInteger": "5000"
        }
    ]
}

we got this response:

...
{
    {
       "name":"totalResourcesToReindex",
       "valueDecimal": 18514.0
    },
    {
       "name":"resourcesSuccessfullyReindexed",
       "valueDecimal": 18513.0
    },
    {
       "name":"progress",
       "valueDecimal": 99.99
    },
    {
       "name":"status",
       "valueString": "Completed"
    }
}
...
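
The counts in this response are at least consistent with each other: exactly one resource is missing. The arithmetic below is a small check, assuming (this is not confirmed by the server documentation quoted here) that `progress` is simply the reindexed count divided by the total, expressed as a percentage.

```python
total = 18514      # totalResourcesToReindex from the response above
reindexed = 18513  # resourcesSuccessfullyReindexed from the response above

missing = total - reindexed
progress = reindexed / total * 100  # assumed definition of "progress"

print(missing)             # 1
print(f"{progress:.2f}")   # 99.99
```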
@rafalbroll-datapharm rafalbroll-datapharm added the Bug Bug bug bug. label Aug 14, 2024
@EXPEkesheth
Collaborator

@rafalbroll-datapharm - Are you provisioned on Cosmos DB/ SQL persistence layer?

@rafalbroll-datapharm
Author

> @rafalbroll-datapharm - Are you provisioned on Cosmos DB/ SQL persistence layer?

It's the SQL Server.

@fahadnadeem1995

@rafalbroll-datapharm I am facing the same issue with Azure FHIR Service with SQL server.

@rafalbroll-datapharm
Author

> @rafalbroll-datapharm I am facing the same issue with Azure FHIR Service with SQL server.

Just to clarify, which scenario are you seeing: a completed job with 0.0% progress, or a job that completes at about 99%?

@fahadnadeem1995

@rafalbroll-datapharm It shows completion with 0.0% progress, although it also indicates that a few resources were reindexed successfully. A few resources are also returned when the custom search parameter is used in a query.

Please have a look at the screenshot below:

image

@EXPEkesheth
Collaborator

Thanks for reporting, we will investigate further.
#AB125852

@EXPEkesheth EXPEkesheth added the VSTS-Planned Planned for an upcoming sprint label Sep 9, 2024
@EXPEkesheth
Collaborator

@rafalbroll-datapharm - we recently made improvements to the reindexing logic. Can you please run the reindex again and let us know if you still see the issue? Looking forward to your response.

@fahadnadeem1995

@rafalbroll-datapharm I opened a ticket with Microsoft and the support team told me about the fix/improvement to the reindexing logic they made afterwards. I then retried and it began working.

@rafalbroll-datapharm
Author

> @rafalbroll-datapharm - we recently made improvements to the reindexing logic, can you please execute reindex and let us know if you still see the issue. Looking forward to your response

Many thanks. Which FHIR Server version introduced those changes?
