Is your feature request related to a problem? Please describe.
We have a setup where we use a bytes fast field to store serialized data. As part of querying, a collector retrieves this data via BytesColumn::ord_to_bytes. We are seeing limited performance, and flame graphs show it is dominated (>90%) by this path.
From a quick look at the code, it appears that random access to the fast field requires decompressing the block holding the term on every ord lookup, which is why the CPU time is dominated by decompression. As the number of documents being collected increases, efficiency drops quickly because the same block is decompressed again and again.
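The cost asymmetry described above can be sketched with a toy model (this is not tantivy's actual code; block size and the lookup loop are made up for illustration):

```rust
// Toy model of the hot path: each ord lookup locates its block and
// decompresses it from scratch, so N lookups that land in the same
// block pay for N decompressions.

const BLOCK_LEN: u64 = 4; // ords per block (real sstable blocks hold many terms)

// One decompression per lookup, even when consecutive lookups hit the
// same block: this models the current random-access behavior.
fn decompressions_without_cache(ords: &[u64]) -> usize {
    ords.len()
}

// Decompressions when the most recently decompressed block is kept
// around: repeated hits on the same block become free.
fn decompressions_with_last_block_cache(ords: &[u64]) -> usize {
    let mut count = 0;
    let mut cached_block: Option<u64> = None;
    for &ord in ords {
        let block = ord / BLOCK_LEN;
        if cached_block != Some(block) {
            count += 1;
            cached_block = Some(block);
        }
    }
    count
}

fn main() {
    // 8 lookups spread over 2 blocks.
    let ords = [0, 1, 2, 3, 4, 5, 6, 7];
    assert_eq!(decompressions_without_cache(&ords), 8);
    assert_eq!(decompressions_with_last_block_cache(&ords), 2);
}
```

With sorted ords the single-block cache already collapses most of the work; with randomly ordered ords the gap between the two counters is what the flame graphs are showing.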
Describe the solution you'd like
Ideally there would be a setting that disables SSTable compression for a given fast field, for callers willing to pay more space for faster random access. This could be the default for dictionaries built as part of fast fields, where fast random access is the desired outcome, but an opt-in would also work.
[Optional] describe alternatives you've considered
We experimented with walking the entire dictionary sequentially so that each block is decompressed only once, but there are simply too many unique terms for this to be faster. Even with range limits on the dictionary stream, it is slower than today's random access.
We also considered storing this field and using the doc store where we can control the compression via index settings, but we need the fast field for some range filtering so we're hoping to avoid duplicate storage and the overhead of retrieving the entire document.
Lastly, an alternative that may help is caching blocks, like the StoreReader does. In a synthetic test where the only stored field was the single BytesColumn, going through the store reader was 3x faster than the fast field column, largely because the cache reduced the amount of decompression needed.
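A block cache in the spirit of the StoreReader's could look roughly like the sketch below. Everything here is hypothetical: `BlockCache`, the `u64` block ids, and the `decompress` closure are stand-ins, not tantivy APIs.

```rust
use std::collections::HashMap;

// Minimal LRU block cache sketch: maps a block id to its decompressed
// bytes and evicts the least recently used block once full.
struct BlockCache {
    capacity: usize,
    blocks: HashMap<u64, Vec<u8>>,
    lru: Vec<u64>, // block ids, most recently used last
    misses: usize,
}

impl BlockCache {
    fn new(capacity: usize) -> Self {
        BlockCache { capacity, blocks: HashMap::new(), lru: Vec::new(), misses: 0 }
    }

    // Returns the decompressed block, decompressing only on a cache miss.
    fn get_or_decompress<F>(&mut self, block_id: u64, decompress: F) -> &[u8]
    where
        F: FnOnce() -> Vec<u8>,
    {
        // Move this block to the most-recently-used position.
        self.lru.retain(|&b| b != block_id);
        self.lru.push(block_id);
        if !self.blocks.contains_key(&block_id) {
            self.misses += 1;
            if self.blocks.len() >= self.capacity {
                // Evict the least recently used cached block.
                let evicted = self.lru.remove(0);
                self.blocks.remove(&evicted);
            }
            self.blocks.insert(block_id, decompress());
        }
        self.blocks.get(&block_id).unwrap()
    }
}

fn main() {
    let mut cache = BlockCache::new(2);
    let fake_decompress = |id: u64| move || vec![id as u8; 4];
    cache.get_or_decompress(0, fake_decompress(0)); // miss
    cache.get_or_decompress(0, fake_decompress(0)); // hit
    cache.get_or_decompress(1, fake_decompress(1)); // miss
    cache.get_or_decompress(0, fake_decompress(0)); // hit
    cache.get_or_decompress(2, fake_decompress(2)); // miss, evicts block 1
    assert_eq!(cache.misses, 3);
}
```

Even a small capacity helps when ord lookups cluster, which matches the 3x speedup observed through the store reader's cache.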
One other option that may help is zstd's skippable frames, which could also speed up random access at the cost of (presumably) a worse compression ratio.
The sstable block codec already has an 8-bit field encoding whether a block is compressed, so it should be a matter of passing a bool to DeltaWriter and modifying this heuristic. We could also experiment with a negative compression level here; this would disable zstd's entropy coding, which would likely improve decompression speed at the cost of less compression.
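Given that per-block flag, an uncompressed mode only needs to force the "raw" branch of the existing heuristic. The sketch below illustrates the idea; the one-byte wire format and the `compress` stand-in are assumptions (tantivy uses zstd and its own block layout), not the real codec.

```rust
// Hypothetical per-block encoder: a leading flag byte says whether the
// payload is compressed (1) or raw (0). Disabling compression for a
// field means always taking the raw branch, so random access never
// needs to decompress.
fn encode_block(raw: &[u8], compress_enabled: bool, compress: impl Fn(&[u8]) -> Vec<u8>) -> Vec<u8> {
    if compress_enabled {
        let compressed = compress(raw);
        // Existing-style heuristic: keep the compressed form only if it
        // actually saves space.
        if compressed.len() < raw.len() {
            let mut out = vec![1u8];
            out.extend_from_slice(&compressed);
            return out;
        }
    }
    // Flag byte 0: stored raw, readable without any decompression.
    let mut out = vec![0u8];
    out.extend_from_slice(raw);
    out
}

fn main() {
    let raw = b"aaaaaaaaaaaaaaaa"; // 16 highly compressible bytes
    let shrink = |b: &[u8]| b[..b.len() / 2].to_vec(); // fake compressor
    // Compression on: the smaller compressed form wins, flag = 1.
    assert_eq!(encode_block(raw, true, &shrink)[0], 1);
    // Compression off: raw payload behind flag = 0, one byte of overhead.
    assert_eq!(encode_block(raw, false, &shrink)[0], 0);
    assert_eq!(encode_block(raw, false, &shrink).len(), raw.len() + 1);
}
```

Because the flag already exists in the format, readers would not need any change: a block written with flag 0 decodes the same way whether it was skipped by the heuristic or by a per-field setting.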