📝 Release note for 3.23.0 (#5410)
**Description**

<!-- Please provide a short description and potentially linked issues
justifying the need for this PR -->

<!-- * Your PR is fixing a bug or regression? Check for existing issues
related to this bug and link them -->
<!-- * Your PR is adding a new feature? Make sure there is a related
issue or discussion attached to it -->

<!-- You can provide any additional context to help into understanding
what's this PR is attempting to solve: reproduction of a bug, code
snippets... -->

**Checklist** — _Don't delete this checklist and make sure you do the
following before opening the PR_

- [ ] The name of my PR follows [gitmoji](https://gitmoji.dev/)
specification
- [ ] My PR references one or several related issues (if any)
- [ ] New features or breaking changes must come with an associated
Issue or Discussion
- [ ] My PR does not add any new dependency without an associated Issue
or Discussion
- [ ] My PR includes bumps details, please run `yarn bump` and flag the
impacts properly
- [ ] My PR adds relevant tests and they would have failed without my PR
(when applicable)

<!-- More about contributing at
https://github.com/dubzzz/fast-check/blob/main/CONTRIBUTING.md -->

**Advanced**

<!-- How to fill the advanced section is detailed below! -->

- [ ] Category: ...
- [ ] Impacts: ...

<!-- [Category] Please use one of the categories below, it will help us
into better understanding the urgency of the PR -->
<!-- * ✨ Introduce new features -->
<!-- * 📝 Add or update documentation -->
<!-- * ✅ Add or update tests -->
<!-- * 🐛 Fix a bug -->
<!-- * 🏷️ Add or update types -->
<!-- * ⚡️ Improve performance -->
<!-- * _Other(s):_ ... -->

<!-- [Impacts] Please provide a comma separated list of the potential
impacts that might be introduced by this change -->
<!-- * Generated values: Can your change impact any of the existing
generators in terms of generated values, if so which ones? when? -->
<!-- * Shrink values: Can your change impact any of the existing
generators in terms of shrink values, if so which ones? when? -->
<!-- * Performance: Is there a potential performance impact? In which cases?
-->
<!-- * Typings: Can it require some typings changes on user side?
Please give more details -->
dubzzz authored Nov 5, 2024
1 parent 9374983 commit 48f9798
Showing 2 changed files with 137 additions and 1 deletion.
@@ -86,7 +86,7 @@ We have also started deprecating a few variations around `BigInt`. We believe th

Please note that these arbitraries will remain available in all versions 3.x of fast-check, although they will likely be removed in version 4.x.

-## Changelog since 3.22.0
+## Changelog since 3.21.0

The version 3.22.0 is based on version 3.21.0.

136 changes: 136 additions & 0 deletions website/blog/2024-11-05-whats-new-in-fast-check-3-23-0/index.md
@@ -0,0 +1,136 @@
---
title: What's new in fast-check 3.23.0?
authors: [dubzzz]
tags: [what's new, arbitrary, performance]
---

The previous release introduced the `unit` concept to the string arbitraries, enhancing control over string generation. In version 3.23.0, we’ve extended this feature to even more arbitraries, broadening its applicability. This update also includes several performance optimizations to make your testing faster and more efficient.

Continue reading to explore the detailed updates it brings.

<!--truncate-->

## Polyglot `string` to the next level

Polyglot string was introduced in [version 3.22.0 of fast-check](/blog/2024/08/29/whats-new-in-fast-check-3-22-0/#polyglot-string), allowing for more diverse string generation. In this release, we've extended the `unit` concept to several additional arbitraries:

- `anything`
- `json`
- `object`

With this change, we recommend using the new `stringUnit` constraint to specify the type of characters you'd like in generated JSONs, objects, and other structures. The new syntax is as follows:

```ts
fc.json({ stringUnit: 'grapheme' });
```

## Dichotomic search in `mapToConstant`

The `mapToConstant` arbitrary provides an easy-to-use API for generating values based on an index. Our documentation includes an example that shows how to create an arbitrary to generate characters from `a` to `z` and numbers from `0` to `9`:

```ts
fc.mapToConstant(
  { num: 26, build: (v) => String.fromCharCode(v + 0x61) },
  { num: 10, build: (v) => String.fromCharCode(v + 0x30) },
);
// Examples of generated values: "6", "8", "d", "9", "r"…
```

This arbitrary is actually more central to the library than it might seem at first glance. It’s a core building block backing the `string` arbitrary and playing a crucial role in the codebase. Given its importance, maximizing its efficiency is essential.

When benchmarking different variations of `string` with various `unit` values, we noticed a large performance disparity: the `grapheme` unit was much slower than the `grapheme-ascii` unit. We traced this slowdown to a linear search within the `mapToConstant` implementation, where each call to `generate` involved looking up the right `build` function. Initially, our code looked something like this:

```ts
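function buildForChoiceIndex(choiceIndex: number) {
  let idx = -1;
  let numSkips = 0;
  // Walk the entries one by one until the cumulated size exceeds choiceIndex
  while (choiceIndex >= numSkips) {
    numSkips += entries[++idx].num;
  }
  // numSkips now points past the selected range, so add its size back
  return entries[idx].build(choiceIndex - numSkips + entries[idx].num);
}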
```

Unfortunately, this lookup is linear in the number of entries. For small entries arrays (like in our example with just two entries), the impact is minimal. However, for larger entries arrays (as with `grapheme`), it results in significant slowdowns, as the average time of each lookup grows in proportion to the number of entries.

To improve this, we introduced a new `start` field to each entry (only internally), marking the starting index for each range. Applied to our example, the entries now look like this:

```ts
const entries = [
  { start: 0, num: 26, build: (v) => String.fromCharCode(v + 0x61) },
  { start: 26, num: 10, build: (v) => String.fromCharCode(v + 0x30) },
];
```
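As a rough illustration of how such `start` values can be derived, the sketch below computes them as a running sum of the `num` fields. The names `Entry`, `IndexedEntry` and `withStartOffsets` are hypothetical and only serve this example; they are not taken from the fast-check codebase.

```ts
// Hypothetical types and helper, for illustration only.
type Entry<T> = { num: number; build: (index: number) => T };
type IndexedEntry<T> = Entry<T> & { start: number };

// Attach to each entry the index at which its range begins.
function withStartOffsets<T>(source: Entry<T>[]): IndexedEntry<T>[] {
  let nextStart = 0;
  return source.map((entry) => {
    const indexed = { ...entry, start: nextStart };
    nextStart += entry.num; // the next range begins right after this one
    return indexed;
  });
}

const indexedEntries = withStartOffsets<string>([
  { num: 26, build: (v) => String.fromCharCode(v + 0x61) },
  { num: 10, build: (v) => String.fromCharCode(v + 0x30) },
]);
const startOffsets = indexedEntries.map((entry) => entry.start);
```

Because the offsets are computed once when the entries are set up, each later lookup can rely on them without re-summing the sizes.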

With this additional information, we can perform a dichotomic search to locate the correct `build` function. Instead of iterating through each entry, we can efficiently narrow down the search range, achieving logarithmic complexity. Here’s the updated implementation:

```ts
function buildForChoiceIndex(choiceIndex: number) {
  // Invariant: the selected entry always lies within [min, max)
  let min = 0;
  let max = entries.length;
  while (max - min > 1) {
    const mid = Math.floor((min + max) / 2);
    if (choiceIndex < entries[mid].start) {
      max = mid;
    } else {
      min = mid;
    }
  }
  // Offset within the selected range
  return entries[min].build(choiceIndex - entries[min].start);
}
```

This optimization yielded a performance improvement from 1,800 operations per second to 3,700 operations per second for the following code:

```ts
fc.assert(fc.property(fc.string({ unit: 'grapheme' }), (s) => true));
```

While 1,800 operations per second may already sound substantial, the impact becomes more noticeable in real-world applications where multiple strings are generated in combination, such as arrays of strings or objects with string properties. This improvement can make a tangible difference in CI performance.

For more details on this change, take a look at [PR#5386](https://github.com/dubzzz/fast-check/pull/5386). We also applied a similar optimization to reduce the cost of `canShrinkWithoutContext` on `constant`, where we replaced a linear search with direct access, as described in [PR#5372](https://github.com/dubzzz/fast-check/pull/5372).
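To make the equivalence of the two lookups described above concrete, here is a small self-contained sketch that runs both strategies over the same table and counts disagreements. The table of 50 ranges and the names `RangeEntry`, `rangeEntries`, `linearLookup` and `dichotomicLookup` are assumptions made up for this example, not code from the library.

```ts
// Hypothetical table: 50 ranges of 3 indices each, with precomputed starts.
type RangeEntry = { start: number; num: number; build: (index: number) => string };

const rangeEntries: RangeEntry[] = Array.from({ length: 50 }, (_, i) => ({
  start: i * 3,
  num: 3,
  build: (index: number) => `${i}:${index}`,
}));

// Linear scan: accumulate sizes until we pass choiceIndex.
function linearLookup(choiceIndex: number): string {
  let idx = -1;
  let numSkips = 0;
  while (choiceIndex >= numSkips) {
    numSkips += rangeEntries[++idx].num;
  }
  return rangeEntries[idx].build(choiceIndex - numSkips + rangeEntries[idx].num);
}

// Dichotomic search: halve [min, max) until a single entry remains.
function dichotomicLookup(choiceIndex: number): string {
  let min = 0;
  let max = rangeEntries.length;
  while (max - min > 1) {
    const mid = Math.floor((min + max) / 2);
    if (choiceIndex < rangeEntries[mid].start) {
      max = mid;
    } else {
      min = mid;
    }
  }
  return rangeEntries[min].build(choiceIndex - rangeEntries[min].start);
}

// Compare both strategies on every valid index (50 ranges * 3 indices).
const totalChoices = 150;
let mismatches = 0;
for (let choiceIndex = 0; choiceIndex < totalChoices; ++choiceIndex) {
  if (linearLookup(choiceIndex) !== dichotomicLookup(choiceIndex)) {
    ++mismatches;
  }
}
```

Both functions assume `choiceIndex` falls within the total number of choices; neither guards against out-of-range values in this sketch.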

## Cached computations for faster instantiation

Our next set of performance optimizations focused on reducing the cost of instantiating the `string` arbitrary. In version 3.22.0, this cost was still relatively high — _and for good reason_.

By default, the `string` arbitrary in fast-check can generate values such as `__proto__` and `key`, among others. To achieve this, we need to confirm that these specific values are valid outputs of the generator. This requires checking if `__proto__` or other reserved values could actually be produced by `string({ unit })`, where `unit` might be something like `constantFrom('a', 'b', 'c')` or even more complex. This validation step is computationally intensive.

To improve efficiency, we realized that for a given `unit`, the set of acceptable strings remains consistent across calls to `string({ unit })`. Based on this, we implemented a caching mechanism backed by a `WeakMap`, with the arbitrary linked to the specified `unit` used as the cache key.
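The general shape of such a cache can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `UnitArbitrary`, `candidateSlices`, `computeAcceptedSlices` and `getAcceptedSlices` are hypothetical names, and the validation is reduced to a per-character check.

```ts
// Hypothetical stand-in for a unit arbitrary: it only tells whether
// a given single character could be produced.
type UnitArbitrary = { canGenerate: (value: string) => boolean };

const candidateSlices = ["__proto__", "key", "toString"];
const slicesCache = new WeakMap<UnitArbitrary, string[]>();
let computations = 0; // counts how often the expensive path runs

// Expensive step: keep only the candidates the unit could fully produce.
function computeAcceptedSlices(unit: UnitArbitrary): string[] {
  computations += 1;
  return candidateSlices.filter((slice) =>
    Array.from(slice).every((char) => unit.canGenerate(char)),
  );
}

// Cached access: the unit object itself is the cache key.
function getAcceptedSlices(unit: UnitArbitrary): string[] {
  const cached = slicesCache.get(unit);
  if (cached !== undefined) {
    return cached;
  }
  const computed = computeAcceptedSlices(unit);
  slicesCache.set(unit, computed);
  return computed;
}

const lowercaseUnit: UnitArbitrary = { canGenerate: (c) => c >= "a" && c <= "z" };
const firstCall = getAcceptedSlices(lowercaseUnit);
const secondCall = getAcceptedSlices(lowercaseUnit); // served from the cache
```

A `WeakMap` holds its keys weakly, so a cached entry does not prevent its unit object from being garbage-collected once nothing else references it.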

This change led to a dramatic improvement, boosting instantiation performance from 20,000 instances per second to 260,000 instances per second.

These optimizations are detailed in [PR#5387](https://github.com/dubzzz/fast-check/pull/5387), [PR#5388](https://github.com/dubzzz/fast-check/pull/5388) and [PR#5389](https://github.com/dubzzz/fast-check/pull/5389).

## Changelog since 3.22.0

The version 3.23.0 is based on version 3.22.0.

### Features

- ([PR#5366](https://github.com/dubzzz/fast-check/pull/5366)) Add support for string-`unit` on `object`/`anything` arbitrary
- ([PR#5367](https://github.com/dubzzz/fast-check/pull/5367)) Add support for string-`unit` on `json` arbitrary
- ([PR#5390](https://github.com/dubzzz/fast-check/pull/5390)) Add back strong unmapping capabilities to `string`

### Fixes

- ([PR#5327](https://github.com/dubzzz/fast-check/pull/5327)) Bug: Resist even more to external poisoning for `string`
- ([PR#5368](https://github.com/dubzzz/fast-check/pull/5368)) Bug: Better support for poisoning on `stringMatching`
- ([PR#5344](https://github.com/dubzzz/fast-check/pull/5344)) CI: Adapt some tests for Node v23
- ([PR#5346](https://github.com/dubzzz/fast-check/pull/5346)) CI: Drop usages of `it.concurrent` due to Node 23 failing
- ([PR#5363](https://github.com/dubzzz/fast-check/pull/5363)) CI: Move to Vitest for `examples/`
- ([PR#5391](https://github.com/dubzzz/fast-check/pull/5391)) CI: Preview builds using `pkg.pr.new`
- ([PR#5392](https://github.com/dubzzz/fast-check/pull/5392)) CI: Connect custom templates to `pkg.pr.new` previews
- ([PR#5394](https://github.com/dubzzz/fast-check/pull/5394)) CI: Install dependencies before building changesets
- ([PR#5396](https://github.com/dubzzz/fast-check/pull/5396)) CI: Proper commit name on changelogs
- ([PR#5393](https://github.com/dubzzz/fast-check/pull/5393)) Clean: Drop unused `examples/jest.setup.js`
- ([PR#5249](https://github.com/dubzzz/fast-check/pull/5249)) Doc: Release note for fast-check 3.22.0
- ([PR#5369](https://github.com/dubzzz/fast-check/pull/5369)) Doc: Typo fix in model-based-testing.md
- ([PR#5370](https://github.com/dubzzz/fast-check/pull/5370)) Doc: Add new contributor jamesbvaughan
- ([PR#5383](https://github.com/dubzzz/fast-check/pull/5383)) Doc: Properly indent code snippets for the documentation
- ([PR#5372](https://github.com/dubzzz/fast-check/pull/5372)) Performance: Faster `canShrinkWithoutContext` for constants
- ([PR#5386](https://github.com/dubzzz/fast-check/pull/5386)) Performance: Faster generate process for `mapToConstant`
- ([PR#5387](https://github.com/dubzzz/fast-check/pull/5387)) Performance: Faster tokenizer of strings
- ([PR#5388](https://github.com/dubzzz/fast-check/pull/5388)) Performance: Faster initialization of `string` with faster slices
- ([PR#5389](https://github.com/dubzzz/fast-check/pull/5389)) Performance: Faster initialization of `string` with pre-cached slices
- ([PR#5371](https://github.com/dubzzz/fast-check/pull/5371)) Test: Add extra set of tests for `constant*`
