Minion Task to support automatic Segment Refresh #14300
@vvivekiyer : the idea is quite interesting and the value add I see here is:
But for upserts, I think one of the biggest costs is recomputing the validDocId map, so for upsert tables we won't see any specific benefits, right? (Outside of the ones that also apply to realtime tables.)
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #14300 +/- ##
============================================
+ Coverage 61.75% 63.71% +1.96%
- Complexity 207 1567 +1360
============================================
Files 2436 2672 +236
Lines 133233 146635 +13402
Branches 20636 22487 +1851
============================================
+ Hits 82274 93435 +11161
- Misses 44911 46281 +1370
- Partials 6048 6919 +871
Are you suggesting an increase in concurrency at the minion level or the server level? At the server level, it seems we would still issue a SegmentRefreshTask, which means the default concurrency would remain at 1. We can investigate performance improvements that might allow us to adjust the concurrency configuration.

Overall, this appears to be a valuable feature to reduce index build time and associated costs for servers! However, we need to consider the trade-off between SegmentRefresh and SegmentReload costs. Ultimately, we would still issue a SegmentRefresh call to the servers, if I understand correctly.

For upsert tables with snapshot enabled, we risk losing validDocIDSnapshot during downloads from deep store, since deep store lacks snapshot copies. This could increase refresh times for these tables, as we wouldn't be able to use the preload feature.
Yes, that's right. Exploring possibilities here: if we extend the segment refresh minion task to also do other things (like upsert compaction), would that help?
The benefits of including compaction in this task will vary from use case to use case, depending on the number of invalid docIDs.
// tableMTime > segmentZKMetadata.getCreationTime() || schemaMTime > segmentZKMetadata.getCreationTime();

boolean segmentProcessedBeforeUpdate = tableMTime > lastProcessedTime || schemaMTime > lastProcessedTime;
return segmentProcessedBeforeUpdate;
We can also add a crc check to figure out if we need to trigger a refresh or not. This way we also ensure deepstore copy gets updated with latest indexes / schemas.
Didn't get this. Are you suggesting we need to check the crc in the ZK metadata against the crc in deepstore? They are bound to always be the same right?
Not necessarily! I have seen this many times in the upsert-compaction task as well: #13491.
This is a race condition that we should fix at the root, but I think for now we can let this task refresh the segment in deepstore anyway.
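A minimal sketch of how the CRC check suggested in this thread could be combined with the timestamp check. All names here (`RefreshCheck`, `needsRefresh`, and its parameters) are hypothetical and not taken from the PR. It assumes the CRC recorded in ZK metadata and the CRC of the deepstore copy are both already available as strings, and it does not show how either value would be fetched.

```java
// Hypothetical helper, not part of the PR: combines the two refresh signals
// discussed in this thread.
final class RefreshCheck {
  private RefreshCheck() {
  }

  static boolean needsRefresh(long tableMTime, long schemaMTime, long lastProcessedTime,
      String zkCrc, String deepStoreCrc) {
    // Signal 1: the table config or schema changed after the segment was last processed.
    boolean configChanged = tableMTime > lastProcessedTime || schemaMTime > lastProcessedTime;
    // Signal 2: the ZK metadata and the deepstore copy disagree, so the deepstore copy may be stale.
    boolean crcMismatch = !zkCrc.equals(deepStoreCrc);
    return configChanged || crcMismatch;
  }
}
```

With this shape, a segment whose timestamps look current is still refreshed when the two CRC values diverge, which is the case described above.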
Thanks for capturing our discussion in the description.
@TaskExecutorFactory
public class SegmentRefreshTaskExecutorFactory implements PinotTaskExecutorFactory {
Could you rename this as well to RefreshSegment...?
I actually prefer "SegmentRefreshTask" over "RefreshSegmentTask". Whichever name you pick, please make sure it's used everywhere.
@EventObserverFactory
public class SegmentRefreshTaskProgressObserverFactory extends BaseMinionProgressObserverFactory {
Could you rename this as well to RefreshSegment...?
Other than a few minor comments, LGTM.
if (fieldSpecInSchema.isVirtualColumn()) {
  continue;
}
Do virtual columns show up in the schema?
Yes, virtual columns such as $segmentName show up in the schema.
    SegmentConversionResult segmentConversionResult) {
  return new SegmentZKMetadataCustomMapModifier(SegmentZKMetadataCustomMapModifier.ModifyMode.UPDATE,
      Collections.singletonMap(MinionConstants.RefreshSegmentTask.TASK_TYPE + MinionConstants.TASK_TIME_SUFFIX,
          String.valueOf(_taskStartTime)));
Can you put a human-readable time string here? Something like 2024-11-09T03:21:59.989Z makes debugging much easier.
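A minimal sketch of what the suggestion above could look like, assuming the task start time is held as epoch milliseconds. The class and method names (`TaskTimeFormat`, `toIsoString`, `toEpochMillis`) are hypothetical and not from the PR; only `java.time.Instant` is a real API here.

```java
import java.time.Instant;

// Hypothetical helper, not part of the PR: converts between epoch milliseconds
// and an ISO-8601 UTC string for storing in segment metadata.
final class TaskTimeFormat {
  private TaskTimeFormat() {
  }

  /** Formats epoch milliseconds as an ISO-8601 UTC timestamp string. */
  static String toIsoString(long epochMillis) {
    return Instant.ofEpochMilli(epochMillis).toString();
  }

  /** Parses an ISO-8601 UTC timestamp string back to epoch milliseconds. */
  static long toEpochMillis(String isoString) {
    return Instant.parse(isoString).toEpochMilli();
  }
}
```

If the stored value is a string like this, any code that compares it against table or schema modification times would need to parse it back to milliseconds first, which is what `toEpochMillis` is meant to illustrate.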
Created followup issue #14483
* Minion Task to support automatic Segment Refresh
* Address review comments
* Address review comments.
Currently, when new columns are added or indexes are added/removed, the segment reloads happen on the server. There are a number of issues with this approach:
This PR creates a minion task to automatically refresh segments when there are index/column updates to table config/schema. It can support automatic refresh for the following operations:
Followup Work:
Notes on Implementation
The premise used to solve this was: keep the deepstore segment in sync with the table config (this automatically ensures that the servers have the updated segments). Please see #9360. Keeping the deep store in sync is crucial for reducing server startup times when servers are replaced or migrated.
Task Generation Flow:
Task Execution Flow:
Relying on a server API call to indicate whether a segment needs to be refreshed was not preferred because:
Servers might indicate that a segment doesn’t need refresh (using a mechanism like Support API for checking if segments need to be reloaded for a table #12117) just because they were restarted. This will still leave the segments on deepstore outdated.
Server Preprocess currently supports very limited operations. As we add more capability like datatype changes/compression changes, relying on server Preprocess will give the wrong signal just because serverPreprocess doesn’t support the operation.
Using server APIs to get all segment Metadata to the controller for all tables every time the periodic task runs can be overkill.
The downside of this approach is that minion tasks are created for every segment of a table on each table config update.
To overcome this problem, we can use a server side API that will return the list of segments to be refreshed. It is being developed in #14450. We can incorporate these changes in the Task Generation Flow once it is merged. (cc: @swaminathanmanish)
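The timestamp-based selection discussed in the review (`tableMTime > lastProcessedTime || schemaMTime > lastProcessedTime`) can be sketched as follows. This is an illustration under stated assumptions, not the generator's actual code: `RefreshCandidateSelector`, `selectSegments`, and the map of segment name to last processed time are hypothetical stand-ins for the segment ZK metadata the task generator reads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of the PR: picks segments whose last processed
// time is older than the latest table config or schema modification time.
final class RefreshCandidateSelector {
  private RefreshCandidateSelector() {
  }

  static List<String> selectSegments(Map<String, Long> lastProcessedTimeBySegment, long tableMTime,
      long schemaMTime) {
    List<String> selected = new ArrayList<>();
    for (Map.Entry<String, Long> entry : lastProcessedTimeBySegment.entrySet()) {
      long lastProcessedTime = entry.getValue();
      // Same comparison as the review snippet: the segment was processed before the latest update.
      if (tableMTime > lastProcessedTime || schemaMTime > lastProcessedTime) {
        selected.add(entry.getKey());
      }
    }
    return selected;
  }
}
```

Right after a table config update, every segment's last processed time is older than the table modification time, so every segment is selected. That is the downside noted above, and it is why a server-side API that returns only the segments needing refresh would narrow the list.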