Fluffy: Implement offer cache #2827
base: master
Conversation
# Caches the most recently received content/offers.
# Content is only stored in this cache if it falls within our radius and similarly
# the cache is only checked if the content id is within our radius.
OfferCache = LruCache[ContentId, seq[byte]]
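A minimal sketch (with stand-in types, not the PR's code) of the radius gating described in the comments above: the cache is written and read only for content ids that fall within our radius.

import std/tables

type
  ContentId = uint64 # simplified stand-in; Fluffy uses a 256-bit id

var offerCache = initTable[ContentId, seq[byte]]() # stand-in for the LRU

proc withinRadius(localId, contentId, radius: ContentId): bool =
  (localId xor contentId) <= radius # XOR distance metric

proc cachePut(localId, radius, id: ContentId, content: seq[byte]) =
  # Only content within our radius is ever stored.
  if withinRadius(localId, id, radius):
    offerCache[id] = content

proc cacheGet(localId, radius, id: ContentId): seq[byte] =
  # Out-of-radius ids can never be in the cache, so skip the lookup.
  if withinRadius(localId, id, radius):
    result = offerCache.getOrDefault(id)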
I initially thought of this as a cache that holds only the content ids or the content keys (and could thus store more of them for the same data size). The cache would only be there to avoid hammering the database on a spam of offers.
After checking the PR I understand that it is similar to the ContentCache, in the sense that it is also used to check for data on requests, but with the difference that it only gets populated via the offer/accept flow (and the JSON-RPC store method). This was not immediately clear to me.
In general the ContentCache is more useful because it foremost avoids re-doing recent network requests, which is something that the OfferCache does not avoid.
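To make that distinction concrete, a rough sketch of the two population paths; the proc names and the Cache type are illustrative, not the actual Fluffy API.

import std/tables

type
  Cache = ref object
    entries: Table[uint64, seq[byte]]

proc put(c: Cache, id: uint64, v: seq[byte]) =
  c.entries[id] = v

# ContentCache: filled after a successful network lookup, so repeating the
# same lookup skips the network round-trip entirely.
proc onLookupSuccess(contentCache: Cache, id: uint64, v: seq[byte]) =
  contentCache.put(id, v)

# OfferCache: filled only via the offer/accept flow (and the JSON-RPC store
# method), so it speeds up local reads but cannot avoid a network request.
proc onOfferAccepted(offerCache: Cache, id: uint64, v: seq[byte]) =
  offerCache.put(id, v)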
Yes, I remember we discussed storing only the content keys/ids and mapping them to a bool in order to save space. I'm open to doing it that way, which does provide some benefit for rejecting existing offers quickly, but I believe that caching the recently received offers themselves provides other benefits as well, because the cached offers are reusable in the other flows.
The Sqlite in-memory caching is limited, and having some form of in-memory caching per sub-network that is not specifically tied to recent queries will be useful. For example, once we have the network fully synced and following the latest blocks, the most recently received offered content is likely to be looked up more frequently during normal usage of the portal network by applications.
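For comparison, the two designs discussed here, with simplified types (the real cache is an LRU keyed by a 256-bit ContentId):

import std/tables

type
  ContentId = uint64

  # Option A: remember only that an offer was seen. Entries are tiny, so far
  # more ids fit in the same memory budget, but the cache can only be used
  # to reject duplicate offers quickly.
  SeenOffers = Table[ContentId, bool]

  # Option B (this PR): keep the content bytes themselves, so cached offers
  # are reusable in the other flows (findContent, local lookups, re-offers).
  OfferContentCache = Table[ContentId, seq[byte]]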
Compare the in-memory caching of Sqlite to RocksDb:
- Sqlite only uses the OS page cache.
- RocksDb uses the OS page cache, the block cache (a cache of recent reads) and the MemTable (a cache of recent writes).
For RocksDb the block cache improves the performance of recent reads, and the MemTable improves the performance of reading recently written values (this is not the only purpose of the MemTable, but it is one of its benefits). For the caching in Fluffy I was thinking we should have something similar in purpose to the RocksDb MemTable, where we cache recent writes (we can keep the default size small to avoid increased memory usage).
In Fluffy we have a single database/table shared across multiple logically isolated sub-networks whose data doesn't normally need to be queried together. Ideally each sub-network would be stored in a separate table, but that leads to complications with pruning. With the offer cache we at least get a separate write cache per sub-network to alleviate load on the shared database.
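A minimal sketch of that MemTable-like idea, with stand-in types (DbRef and the proc names are illustrative, not Fluffy's API): every write goes to the shared database and to a small per-sub-network cache of recent writes.

import std/tables

type
  ContentId = uint64
  DbRef = ref object
    rows: Table[ContentId, seq[byte]] # stand-in for the shared Sqlite table

  SubNetwork = ref object
    db: DbRef                                 # shared across sub-networks
    recentWrites: Table[ContentId, seq[byte]] # small, per-sub-network cache

proc store(n: SubNetwork, id: ContentId, content: seq[byte]) =
  n.db.rows[id] = content      # durable write to the shared database
  n.recentWrites[id] = content # cheap in-memory copy of the recent write

proc get(n: SubNetwork, id: ContentId): seq[byte] =
  # Recently written values are served from memory, sparing the shared db.
  if id in n.recentWrites:
    n.recentWrites[id]
  else:
    n.db.rows.getOrDefault(id)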
My motivation for this change was partly that I noticed returning a value from a large, fully populated LRU cache is dramatically faster than even doing a contains check on a small Sqlite db.
-  n.portalProtocol.storeContent(contentKey, contentId, contentItem)
+  n.portalProtocol.storeContent(
+    contentKey, contentId, contentItem, cacheOffer = true
+  )
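Based on this diff, a rough sketch of how a cacheOffer flag might gate the cache write inside storeContent (stand-in types; the body is an assumption, not the PR's exact implementation):

import std/tables

type
  ContentId = uint64
  Protocol = ref object
    localId: ContentId
    radius: ContentId
    db: Table[ContentId, seq[byte]]
    offerCache: Table[ContentId, seq[byte]]

proc inRange(p: Protocol, id: ContentId): bool =
  (p.localId xor id) <= p.radius

proc storeContent(p: Protocol, contentId: ContentId, content: seq[byte],
                  cacheOffer = false) =
  # Content outside our radius is never persisted or cached.
  if p.inRange(contentId):
    p.db[contentId] = content
    if cacheOffer: # opt-in, e.g. for offers received over the network
      p.offerCache[contentId] = content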
I wasn't sure whether to cache in the store call. It may not be necessary, and it might be better to only cache offers received over the network.
ContentMessage(
  contentMessageType: contentType, content: ByteList[2048](content)
)
let contentResult = p.getLocalContent(fc.contentKey, contentId)
When handling findContent messages we should return the content from the cache if available. When following the latest blocks, we will likely see more of these queries looking up the most recently received offers.
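A sketch of the lookup order this implies when serving findContent, with illustrative stand-in types: check the caches before touching the database.

import std/[options, tables]

type
  ContentId = uint64
  Node = ref object
    contentCache: Table[ContentId, seq[byte]] # recent network lookups
    offerCache: Table[ContentId, seq[byte]]   # recently accepted offers
    db: Table[ContentId, seq[byte]]           # stand-in for the Sqlite db

proc getLocalContent(n: Node, id: ContentId): Option[seq[byte]] =
  if id in n.contentCache:
    return some(n.contentCache[id])
  # Hot when following the head of the chain: the latest accepted offers.
  if id in n.offerCache:
    return some(n.offerCache[id])
  if id in n.db:
    return some(n.db[id])
  none(seq[byte])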
@@ -955,7 +1020,7 @@ proc offer(
   if contentIdResult.isOk():
     let
       contentId = contentIdResult.get()
-      contentResult = p.dbGet(contentKey, contentId)
+      contentResult = p.getLocalContent(contentKey, contentId)
When offering content from the database, the cache should be checked. The only downside I can think of is that it would influence the ordering of the content cache, which may not be desired. It might be better to check only the offer cache and not the content cache; I'm not sure about this yet.
I did some quick performance testing and confirmed that looking up data from a cache is always faster than even using contains on a small Sqlite db (with the current default settings).
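A small timing sketch along the lines of that quick test; the Sqlite side is left as a comment since it needs the project's db wrapper, and absolute numbers will vary with settings and data size.

import std/[monotimes, times, tables]

var cache = initTable[uint64, seq[byte]]()
for i in 0'u64 ..< 100_000:
  cache[i] = @[byte(i and 0xff)]

let start = getMonoTime()
var hits = 0
for i in 0'u64 ..< 100_000:
  if i in cache:
    inc hits
let elapsed = getMonoTime() - start
echo "cache: ", hits, " hits in ", elapsed.inMicroseconds, " us"
# For the Sqlite side, time the equivalent contains-style check over the
# same ids and compare; in the quick test mentioned above the in-memory
# cache was consistently faster, even against a small database.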
This PR introduces a cache that stores the most recently accepted offers. This is useful to speed up checking for existing offers, which happens often during the gossip process because each node may receive offers for the same content from multiple peers. It also reduces load on the database, improves lookup performance in general, and should speed up state bridge gossip performance.
We may not need to split up the content databases as suggested in this issue if we use caching as an alternative.