Infinite waits while resolving data stores / requests #4508
Comments
@leeviana, @montselozanod, @anthony-murphy, @curtisman I think we should go with the second approach. It would be great to bring more people from the app side into the discussion to get their feedback.
I like the second approach. :) I think offline mode was always going to be a "con" for whatever solution we ended up with here (or for similar challenges), and we'll need to work out our vision and support there in a separate, larger architecture design effort. I'm not sure who on the app side cares much about this right now (it mostly just affects future offline thinking), but let's start with @AndreiZe
Personally, I like them both. I view the sequence number as a hint: given the scenarios listed above, and because it is user-modifiable, we can never fully trust it, but if it is present and reasonable we can leverage it. We will likely always want a fallback, though, and waiting for catch-up seems appropriate.
The URL is the identity of the reference, so including the minimum sequence number at which the URL is valid makes sense.
I'll use this uber task for all related issues (and dedup the rest). Here is what we need to do:
Not really related, but occasionally mentioned in similar topics:
This issue has been automatically marked as stale because it has had no activity for 180 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!
not stale
This PR has been automatically marked as stale because it has had no activity for 60 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!
not stale
Problem Statement
ContainerRuntime.getDataStore() and ContainerRuntime.request() wait forever when a request targets a non-existent data store.
The wait ends if and when the container eventually processes the attach op for that data store. However, if the request is for a wrong (non-existent) data store, it deadlocks.
This also shows up in end-to-end workflows where the container is created paused: URI resolution with waits results in a deadlock. The host may therefore need to implement a multi-stage process: make the initial request without waits and, on a 404, resume the container and retry the request with an infinite wait.
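The multi-stage host workaround described above could look roughly like the sketch below. `HostContainer`, `ResolveResult`, and `resolveInStages` are hypothetical, simplified shapes invented for illustration; they are not the Fluid Framework interfaces.

```typescript
// Hypothetical, simplified shapes; not the Fluid Framework API.
interface ResolveResult {
    status: number;
    value?: unknown;
}

interface HostContainer {
    paused: boolean;
    resume(): void;
    // wait = false: answer 404 immediately for an unknown data store.
    // wait = true: stay pending until the attach op for the data store is processed.
    request(url: string, wait: boolean): Promise<ResolveResult>;
}

async function resolveInStages(container: HostContainer, url: string): Promise<ResolveResult> {
    // Stage 1: ask without waiting, so a paused container cannot deadlock the host.
    const first = await container.request(url, false);
    if (first.status !== 404) {
        return first;
    }
    // Stage 2: let the container process ops again, then retry with the wait.
    if (container.paused) {
        container.resume();
    }
    return container.request(url, true);
}
```

Note that stage 2 still waits forever if the data store truly does not exist, which is the problem both proposals below try to remove.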
Proposed solutions
Use waitContainerToCatchUp() mechanism
Use it as a preparation step before resolving a URI (or as a step in the algorithm mentioned above, after getting a 404) as a way to wait for the container to catch up. This does not fundamentally change how resolution is done, but it allows us to remove waits from getDataStore() and request resolution.
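As a rough illustration of this option, the sketch below uses a hypothetical in-memory model (`ModelContainer`, `resolveAfterCatchUp`) rather than the real runtime. Its `catchUp()` method only stands in for waitContainerToCatchUp(); the actual signature and behavior of that function are not assumed here.

```typescript
// Hypothetical in-memory model; not the Fluid Framework implementation.
class ModelContainer {
    private readonly dataStores = new Set<string>();
    // Attach ops the service has sequenced but this client has not processed yet.
    private readonly pendingAttachOps: string[];

    constructor(pendingAttachOps: string[]) {
        this.pendingAttachOps = [...pendingAttachOps];
    }

    // Stand-in for waitContainerToCatchUp(): process every op known at call time.
    async catchUp(): Promise<void> {
        for (const id of this.pendingAttachOps.splice(0)) {
            await Promise.resolve(); // yield, since real op processing is asynchronous
            this.dataStores.add(id);
        }
    }

    // Resolution with no waits: a missing data store is an immediate 404.
    resolve(id: string): { status: number } {
        return { status: this.dataStores.has(id) ? 200 : 404 };
    }
}

async function resolveAfterCatchUp(container: ModelContainer, id: string): Promise<{ status: number }> {
    await container.catchUp();
    return container.resolve(id);
}
```

Once the container has caught up, a 404 is a definitive answer, so the resolution step itself never needs to wait.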
Incorporate sequence number into URIs
When a URI is generated, the client can add (as an optional argument) the current reference sequence number. This would allow (or rather, require) the client resolving the URI to catch up to this sequence number before it finishes resolving the request.
The nice pro of this design is that it allows for a better user experience: we can decide to show progress UI or some other indication while we catch up, so the user better understands why it takes a long time to get to what this URI represents, and we do not show a partial or previous state of the document / component.
The potential con of this approach is that URIs can be generated only for content that is actually in the file, i.e. was acked by the server. While this looks like a con, it also prevents a scenario where a user is offline, creates a URI and sends it over Teams or email, and the URI reaches consumers without the referenced Fluid content ever making it to the file (note that background sync of email happens on some systems while the laptop/phone looks off, so the browser may be suspended and Fluid content not syncing, but the URI may make it through just fine). As a result, a user might never be able to resolve such a URI.
Pushing this problem to the producer has certain pros and cons, but it is in line with work in other areas, in particular sequential IDs to represent ranges of text in Scriptor and URIs being generated only for online content.
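A minimal sketch of how a sequence number might travel in a URI follows. The `minSeq` query parameter and all function names are assumptions made for illustration; the issue does not specify an encoding. Because the value is user-modifiable, the sketch treats a missing or malformed number as "no hint" and falls back to a full catch-up, as suggested in the comments.

```typescript
// Hypothetical query parameter name; the issue does not define one.
const SEQ_PARAM = "minSeq";

// Producer side: stamp the URI with the current reference sequence number.
function withSequenceNumber(uri: string, referenceSequenceNumber: number): string {
    const parsed = new URL(uri);
    parsed.searchParams.set(SEQ_PARAM, String(referenceSequenceNumber));
    return parsed.toString();
}

// Consumer side: read the hint, rejecting anything that is not a plain non-negative integer.
function readSequenceNumber(uri: string): number | undefined {
    const raw = new URL(uri).searchParams.get(SEQ_PARAM);
    if (raw === null || !/^\d+$/.test(raw)) {
        return undefined;
    }
    const value = Number(raw);
    return Number.isSafeInteger(value) ? value : undefined;
}

type ResolutionPlan = "resolve-now" | "catch-up-to-sequence-number" | "catch-up-fully";

// Decide what the resolving client has to do before answering the request.
function planResolution(uri: string, currentSequenceNumber: number): ResolutionPlan {
    const required = readSequenceNumber(uri);
    if (required === undefined) {
        return "catch-up-fully"; // no usable hint: fall back to waiting for catch-up
    }
    return required <= currentSequenceNumber ? "resolve-now" : "catch-up-to-sequence-number";
}
```

In the "catch-up-to-sequence-number" case the host knows exactly how far it has to go, which is what would make a progress indicator possible.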