Fluffy: Implement offer cache #2827
base: master
Conversation
# Caches the most recently received content/offers.
# Content is only stored in this cache if it falls within our radius and similarly
# the cache is only checked if the content id is within our radius.
OfferCache = LruCache[ContentId, seq[byte]]
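A minimal sketch (with stand-in types, not the PR's code) of the radius gating described in the comments above: the cache is written and read only for content ids that fall within our radius.

import std/tables

type
  ContentId = uint64 # simplified stand-in; Fluffy uses a 256-bit id

var offerCache = initTable[ContentId, seq[byte]]() # stand-in for the LRU

proc withinRadius(localId, contentId, radius: ContentId): bool =
  (localId xor contentId) <= radius # XOR distance metric

proc cachePut(localId, radius, id: ContentId, content: seq[byte]) =
  # Only content within our radius is ever stored.
  if withinRadius(localId, id, radius):
    offerCache[id] = content

proc cacheGet(localId, radius, id: ContentId): seq[byte] =
  # Out-of-radius ids can never be in the cache, so skip the lookup.
  if withinRadius(localId, id, radius):
    result = offerCache.getOrDefault(id)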
I initially thought of this as a cache that holds only the content ids or the content keys (and could thus store more of them for the same data size). The cache would only be there to avoid hammering the database on a spam of offers.
After checking the PR I understand that it is similar to the ContentCache, in the sense that it is also used to check for data on requests, but with the difference that it only gets populated via the offer/accept flow (and the JSON-RPC store method). This was not immediately clear to me.
In general the ContentCache is more useful because it foremost avoids re-doing recent network requests, which is something that the OfferCache does not avoid.
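To make that distinction concrete, a rough sketch of the two population paths; the proc names and the Cache type are illustrative, not the actual Fluffy API.

import std/tables

type
  Cache = ref object
    entries: Table[uint64, seq[byte]]

proc put(c: Cache, id: uint64, v: seq[byte]) =
  c.entries[id] = v

# ContentCache: filled after a successful network lookup, so repeating the
# same lookup skips the network round-trip entirely.
proc onLookupSuccess(contentCache: Cache, id: uint64, v: seq[byte]) =
  contentCache.put(id, v)

# OfferCache: filled only via the offer/accept flow (and the JSON-RPC store
# method), so it speeds up local reads but cannot avoid a network request.
proc onOfferAccepted(offerCache: Cache, id: uint64, v: seq[byte]) =
  offerCache.put(id, v)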
Yes, I remember we discussed storing only the content keys/ids and mapping them to a bool in order to save space. I'm open to doing it that way, which does provide some benefit for rejecting existing offers quickly, but I believe that caching the recently received offers themselves provides other benefits as well, because the cached offers are reusable in the other flows.
The Sqlite in-memory caching is limited, and having some form of in-memory caching per sub-network that is not specifically tied to recent queries will be useful. For example, once we have the network fully synced and following the latest blocks, the most recently received offered content is likely to be looked up more frequently during normal usage of the portal network by applications.
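For comparison, the two designs discussed here, with simplified types (the real cache is an LRU keyed by a 256-bit ContentId):

import std/tables

type
  ContentId = uint64

  # Option A: remember only that an offer was seen. Entries are tiny, so far
  # more ids fit in the same memory budget, but the cache can only be used
  # to reject duplicate offers quickly.
  SeenOffers = Table[ContentId, bool]

  # Option B (this PR): keep the content bytes themselves, so cached offers
  # are reusable in the other flows (findContent, local lookups, re-offers).
  OfferContentCache = Table[ContentId, seq[byte]]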
Compare the in-memory caching of Sqlite to RocksDb:
- Sqlite only uses the OS page cache.
- RocksDb uses the OS page cache, the block cache (a cache of recent reads) and the MemTable (a cache of recent writes).
For RocksDb the block cache improves the performance of recent reads, and the MemTable improves the performance of reading recently written values (this is not the only purpose of the MemTable, but it is one of its benefits). For the caching in Fluffy I was thinking we should have something similar in purpose to the RocksDb MemTable, where we cache recent writes (we can keep the default size small to avoid increased memory usage).
In Fluffy we have a single database/table shared across multiple logically isolated sub-networks whose data doesn't normally need to be queried together. Ideally each sub-network would be stored in a separate table, but that leads to complications with pruning. With the offer cache we at least get a separate write cache per sub-network to alleviate load on the shared database.
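A minimal sketch of that MemTable-like idea, with stand-in types (DbRef and the proc names are illustrative, not Fluffy's API): every write goes to the shared database and to a small per-sub-network cache of recent writes.

import std/tables

type
  ContentId = uint64
  DbRef = ref object
    rows: Table[ContentId, seq[byte]] # stand-in for the shared Sqlite table

  SubNetwork = ref object
    db: DbRef                                 # shared across sub-networks
    recentWrites: Table[ContentId, seq[byte]] # small, per-sub-network cache

proc store(n: SubNetwork, id: ContentId, content: seq[byte]) =
  n.db.rows[id] = content      # durable write to the shared database
  n.recentWrites[id] = content # cheap in-memory copy of the recent write

proc get(n: SubNetwork, id: ContentId): seq[byte] =
  # Recently written values are served from memory, sparing the shared db.
  if id in n.recentWrites:
    n.recentWrites[id]
  else:
    n.db.rows.getOrDefault(id)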
My motivation for this change was partly that I noticed returning a value from a large, fully populated LRU cache is dramatically faster than even doing a contains check on a small Sqlite db.
-  n.portalProtocol.storeContent(contentKey, contentId, contentItem)
+  n.portalProtocol.storeContent(
+    contentKey, contentId, contentItem, cacheOffer = true
+  )
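Based on this diff, a rough sketch of how a cacheOffer flag might gate the cache write inside storeContent (stand-in types; the body is an assumption, not the PR's exact implementation):

import std/tables

type
  ContentId = uint64
  Protocol = ref object
    localId: ContentId
    radius: ContentId
    db: Table[ContentId, seq[byte]]
    offerCache: Table[ContentId, seq[byte]]

proc inRange(p: Protocol, id: ContentId): bool =
  (p.localId xor id) <= p.radius

proc storeContent(p: Protocol, contentId: ContentId, content: seq[byte],
                  cacheOffer = false) =
  # Content outside our radius is never persisted or cached.
  if p.inRange(contentId):
    p.db[contentId] = content
    if cacheOffer: # opt-in, e.g. for offers received over the network
      p.offerCache[contentId] = content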
I wasn't sure whether to cache in the store call. It may not be necessary, and it might be better to only cache offers received over the network.
ContentMessage(
  contentMessageType: contentType, content: ByteList[2048](content)
)
let contentResult = p.getLocalContent(fc.contentKey, contentId)
When handling findContent messages we should return the content from the cache if available. When following the latest blocks, we will likely see more of these queries looking up the most recently received offers.
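A sketch of the lookup order this implies when serving findContent, with illustrative stand-in types: check the caches before touching the database.

import std/[options, tables]

type
  ContentId = uint64
  Node = ref object
    contentCache: Table[ContentId, seq[byte]] # recent network lookups
    offerCache: Table[ContentId, seq[byte]]   # recently accepted offers
    db: Table[ContentId, seq[byte]]           # stand-in for the Sqlite db

proc getLocalContent(n: Node, id: ContentId): Option[seq[byte]] =
  if id in n.contentCache:
    return some(n.contentCache[id])
  # Hot when following the head of the chain: the latest accepted offers.
  if id in n.offerCache:
    return some(n.offerCache[id])
  if id in n.db:
    return some(n.db[id])
  none(seq[byte])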
@@ -955,7 +1020,7 @@ proc offer(
   if contentIdResult.isOk():
     let
       contentId = contentIdResult.get()
-      contentResult = p.dbGet(contentKey, contentId)
+      contentResult = p.getLocalContent(contentKey, contentId)
When offering content from the database, the cache should be checked. The only downside I can think of is that it would influence the ordering of the content cache, which may not be desired. It might be better to check only the offer cache and not the content cache; I'm not sure about this yet.
I did some quick performance testing and confirmed that looking up data from a cache is always faster than even using contains on a small Sqlite db (with the current default settings).
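A small timing sketch along the lines of that quick test; the Sqlite side is left as a comment since it needs the project's db wrapper, and absolute numbers will vary with settings and data size.

import std/[monotimes, times, tables]

var cache = initTable[uint64, seq[byte]]()
for i in 0'u64 ..< 100_000:
  cache[i] = @[byte(i and 0xff)]

let start = getMonoTime()
var hits = 0
for i in 0'u64 ..< 100_000:
  if i in cache:
    inc hits
let elapsed = getMonoTime() - start
echo "cache: ", hits, " hits in ", elapsed.inMicroseconds, " us"
# For the Sqlite side, time the equivalent contains-style check over the
# same ids and compare; in the quick test mentioned above the in-memory
# cache was consistently faster, even against a small database.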
This PR introduces a cache that stores the most recently accepted offers. This is useful to speed up checking for existing offers, which happens often during the gossip process because each node may receive offers for the same content from multiple peers. It also reduces load on the database, improves lookup performance in general, and should speed up state bridge gossip performance.
We may not need to split up the content databases as suggested in this issue if we use caching as an alternative.