
Update Rust crate lz4_flex to 0.11.0 #113

Open
wants to merge 1 commit into
base: master



@renovate renovate bot commented May 5, 2024

This PR contains the following updates:

Package   Type          Update  Change
lz4_flex  dependencies  minor   0.9.5 -> 0.11.0

Release Notes

pseitz/lz4_flex (lz4_flex)

v0.11.3

==================

  • Fix support for --deny=unsafe_code compilation (https://redirect.github.com/PSeitz/lz4_flex/pull/152)
  • make get_maximum_output_size const (https://redirect.github.com/PSeitz/lz4_flex/pull/153)

v0.11.2

==================

  • Include license file in the published crate

v0.11.1

==================

v0.11.0

==================

Documentation
  • Docs: add decompress block example
Fixes
  • Handle empty input in Frame Format (https://redirect.github.com/PSeitz/lz4_flex/pull/120)
Empty input was previously ignored and nothing was written. Now an empty Frame is written. This improves compatibility with the reference implementation and some corner cases.
  • Fix: Small dict leads to panic (https://redirect.github.com/PSeitz/lz4_flex/pull/133)
compress_into_with_dict panicked when the dict passed was smaller than 4 bytes. A match has a minimum length of 4 bytes, so smaller dicts are now ignored.
Features
  • [breaking] invert checked-decode to unchecked-decode (https://redirect.github.com/PSeitz/lz4_flex/pull/134)
Invert the `checked-decode` feature flag to `unchecked-decode`.
Previously, setting `default-features=false` removed the bounds checks provided by the
`checked-decode` feature flag. `unchecked-decode` inverts this, so the bounds checks now need to be
deliberately deactivated.

To migrate, just remove the `checked-decode` feature flag.
  • Allow passing a buffer larger than size (https://redirect.github.com/PSeitz/lz4_flex/pull/78)
This removes an unnecessary check in decompression for the case where the passed buffer is larger than required.
  • Autodetect frame blocksize (https://redirect.github.com/PSeitz/lz4_flex/pull/81)
The default blocksize of FrameInfo is now auto instead of 64kb; it detects the blocksize
from the size of the first write call. This increases
compression ratio and speed for use cases where the data is larger than
64kb.
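To make the autodetection idea concrete, here is a minimal sketch that picks a block size from the length of the first write. The four sizes are the ones defined by the LZ4 frame format; the type `BlockSizeSketch`, the helper `pick_block_size`, and the exact selection rule (smallest size that fits) are assumptions for illustration and may differ from what lz4_flex actually does.

```rust
/// Block sizes defined by the LZ4 frame format (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlockSizeSketch {
    Max64KB,
    Max256KB,
    Max1MB,
    Max4MB,
}

impl BlockSizeSketch {
    /// Size of one block in bytes.
    fn bytes(self) -> usize {
        match self {
            Self::Max64KB => 64 * 1024,
            Self::Max256KB => 256 * 1024,
            Self::Max1MB => 1024 * 1024,
            Self::Max4MB => 4 * 1024 * 1024,
        }
    }
}

/// Hypothetical helper: choose the smallest block size that can hold the
/// first write, falling back to the largest size for anything bigger.
fn pick_block_size(first_write_len: usize) -> BlockSizeSketch {
    use BlockSizeSketch::*;
    for candidate in [Max64KB, Max256KB, Max1MB] {
        if first_write_len <= candidate.bytes() {
            return candidate;
        }
    }
    Max4MB
}
```

Under this assumed rule, a first write of a few bytes keeps the old 64kb behaviour, while a first write of several megabytes selects the largest block size.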
  • Add fluent API style construction for FrameInfo (https://redirect.github.com/PSeitz/lz4_flex/pull/99) (thanks @CosmicHorrorDev)
This adds fluent API style construction for FrameInfo. Now you can do:

let info = FrameInfo::new()
    .block_size(BlockSize::Max1MB)
    .content_checksum(true);
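For readers unfamiliar with the pattern, the following sketch shows how fluent setters of this shape are typically written in Rust: each setter takes `self` by value and returns it. `FrameInfoSketch` and `BlockSizeOption` are hypothetical stand-ins, not the lz4_flex types, and the defaults chosen here are assumptions.

```rust
/// Hypothetical stand-in for a block size setting.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BlockSizeOption {
    Auto,
    Max64KB,
    Max1MB,
}

/// Hypothetical stand-in for a frame configuration struct.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FrameInfoSketch {
    block_size: BlockSizeOption,
    content_checksum: bool,
}

impl FrameInfoSketch {
    /// Start from defaults (assumed here: auto block size, no checksum).
    fn new() -> Self {
        Self {
            block_size: BlockSizeOption::Auto,
            content_checksum: false,
        }
    }

    /// Consumes the value, sets one field, and returns it so calls can chain.
    fn block_size(mut self, block_size: BlockSizeOption) -> Self {
        self.block_size = block_size;
        self
    }

    /// Same pattern for the checksum flag.
    fn content_checksum(mut self, enabled: bool) -> Self {
        self.content_checksum = enabled;
        self
    }
}
```

Because every setter returns `Self`, a configuration can be built in a single expression without a mutable binding.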
Performance
Replace calls to memcpy with custom function
  • Perf: optimize wildcopy (https://redirect.github.com/PSeitz/lz4_flex/pull/109)
The initial check in the 16 byte wild copy is unnecessary, since it is already done before calling the method.
  • Perf: faster duplicate_overlapping (https://redirect.github.com/PSeitz/lz4_flex/pull/114)
Replace the aggressive compiler unrolling, following the
failed attempt #69 (which wrote out of bounds in some cases).

The aggressive unrolling is avoided by manually unrolling less aggressively.
Decompression performance improves by about 4%, except in the
smallest test case.

  • Perf: simplify extend_from_within_overlapping (https://redirect.github.com/PSeitz/lz4_flex/pull/72)
extend_from_within_overlapping is used in safe decompression when
overlapping data has been detected. The previous version had unnecessary
assertions/safeguards, since this method is only used in safe code.
Removing the temporary &mut slice also simplified the assembly output.
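As background on why overlapping data needs its own code path, here is a minimal safe sketch of an overlapping match copy. In LZ4, a match may start fewer bytes back than its length, so the copy reads bytes it has just written. The function name and the `Vec<u8>` output are assumptions for illustration; this is not the lz4_flex implementation.

```rust
/// Appends `len` bytes to `output`, reading them starting `offset` bytes
/// before the current end. When `offset < len`, the source range overlaps
/// the bytes being written, so the copy proceeds one byte at a time and
/// repeats the last `offset` bytes as a pattern.
fn extend_from_within_overlapping_sketch(output: &mut Vec<u8>, offset: usize, len: usize) {
    assert!(
        offset > 0 && offset <= output.len(),
        "offset must point into existing output"
    );
    let start = output.len() - offset;
    output.reserve(len);
    for i in 0..len {
        // `start + i` is always below the current length, because the vector
        // grows by one byte in every iteration.
        let byte = output[start + i];
        output.push(byte);
    }
}
```

With an offset of 1 this degenerates to run-length expansion of the last byte, which is the case a plain non-overlapping slice copy cannot express.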

uiCA Code Analyzer

Prev
Tool         Skylake  Ice Lake  Tiger Lake  Rocket Lake
uiCA Cycles  28.71    30.67     28.71       27.57

Simplified
Tool         Skylake  Ice Lake  Tiger Lake  Rocket Lake
uiCA Cycles  13.00    15.00     13.00       11.00
  • Perf: remove unnecessary assertions
Those assertions are only used in safe code and are therefore unnecessary.
Improve safe decompression speed by 8-18%

Reduce multiple slice fetches. Every slice access, including nested ones,
carries some overhead. In the hot loop a fixed &[u8;16] is fetched to
operate on. This is done purely to pass the length information to the compiler.
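The fixed-size fetch can be sketched as follows: one checked slice access produces a `&[u8; 16]`, after which the length is part of the type and later accesses into the chunk need no further checks. The helper names `fetch_16` and `copy_literals_16` are hypothetical and only illustrate the technique.

```rust
use std::convert::TryInto;

/// Fetches a 16-byte window starting at `pos` with a single bounds check.
/// Returns `None` if fewer than 16 bytes are available.
fn fetch_16(input: &[u8], pos: usize) -> Option<&[u8; 16]> {
    let end = pos.checked_add(16)?;
    // `get` performs the only bounds check; `try_into` converts the
    // 16-byte slice into a reference to a fixed-size array.
    input.get(pos..end)?.try_into().ok()
}

/// Hypothetical caller: appends one 16-byte chunk to `output`.
/// Returns false when the input is too short.
fn copy_literals_16(input: &[u8], pos: usize, output: &mut Vec<u8>) -> bool {
    match fetch_16(input, pos) {
        Some(chunk) => {
            output.extend_from_slice(chunk);
            true
        }
        None => false,
    }
}
```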

Remove error handling that only carries overhead. As we are in safe
mode, we can rely on bounds checks where custom error handling would only add overhead.
In normal operation no error should occur.

The strategy to identify improvements was counting the lines of
assembly. A rough heuristic, but it seems effective.
cargo asm --release --example decompress_block decompress_block::main | wc -l
The frame encoding uses a fixed size hashtable.
By creating a special hashtable backed by a Box<[u32; 4096]>,
in combination with the bit shift of 4, which is also moved into a constant,
the compiler can remove the bounds checks.
For that to happen, the compiler also needs to recognize the `>> 48` right
shift from the hash algorithm (u64 >> 52 < 4096), which is the case. Yay.

It also means we can use less `unsafe` for the unsafe version.
  • Perf: switch to use only 3 kinds of hashtable (https://redirect.github.com/PSeitz/lz4_flex/pull/77)
Use only hashtables with fixed sizes and bit shifts, which allows the
bounds checks to be removed.
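The fixed-size hashtable idea can be sketched like this: when the table length and the shift are both compile-time constants, a `u64` shifted right by 52 is at most 4095, so the optimizer has what it needs to prove every index is in range for a `[u32; 4096]`. The type name, the single combined shift, and the multiplier are assumptions for illustration; lz4_flex's actual hash function and table layout may differ.

```rust
const TABLE_SIZE: usize = 4096; // 2^12 slots
const HASH_SHIFT: u32 = 52; // 64 - 12, so the result always fits in 12 bits

// Arbitrary odd multiplier chosen for illustration only.
const MULTIPLIER: u64 = 0x9E37_79B9_7F4A_7C15;

/// Hypothetical fixed-size hashtable storing positions as u32.
struct HashTable4KSketch {
    slots: Box<[u32; TABLE_SIZE]>,
}

impl HashTable4KSketch {
    fn new() -> Self {
        Self {
            slots: Box::new([0u32; TABLE_SIZE]),
        }
    }

    /// Maps a 64-bit sequence to a slot index below TABLE_SIZE.
    #[inline]
    fn index(sequence: u64) -> usize {
        (sequence.wrapping_mul(MULTIPLIER) >> HASH_SHIFT) as usize
    }

    /// Indexing a fixed-size array with a provably small index lets the
    /// compiler drop the bounds check; the code stays fully safe either way.
    #[inline]
    fn get(&self, sequence: u64) -> u32 {
        self.slots[Self::index(sequence)]
    }

    #[inline]
    fn put(&mut self, sequence: u64, pos: u32) {
        self.slots[Self::index(sequence)] = pos;
    }
}
```

Whether the check is actually elided depends on inlining and the optimizer, which is why the notes above verify it by inspecting the generated assembly.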
Refactor
  • Refactor: remove VecSink (https://redirect.github.com/PSeitz/lz4_flex/pull/71)
Remove VecSink, since it can be fully replaced with a slice.
This will reduce code bloat from generics.
Testing
  • Tests: add proptest roundtrip (https://redirect.github.com/PSeitz/lz4_flex/pull/69)

v0.10.0

==================

Features

Add support for decoding legacy frames, used by the Linux kernel (thanks to @yestyle)


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot force-pushed the renovate/lz4_flex-0.x branch from 08cbf4f to 026efd3 Compare May 26, 2024 00:51
@renovate renovate bot force-pushed the renovate/lz4_flex-0.x branch from 026efd3 to 7c33d43 Compare June 23, 2024 00:53
@renovate renovate bot force-pushed the renovate/lz4_flex-0.x branch from 7c33d43 to 04eeb54 Compare June 30, 2024 00:55

stale bot commented Jan 31, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Jan 31, 2025