Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STL Hardening #5274

Open
wants to merge 42 commits into
base: main
Choose a base branch
from
Open

Conversation

StephanTLavavej
Copy link
Member

@StephanTLavavej StephanTLavavej commented Feb 11, 2025

🗺️ Overview

This was preceded by the "prelude" PRs #5255 and #5270, but they aren't critical for understanding this PR.

Resolves #5090.

After this, I'll work on a distinct-but-related feature, "destructor tombstones".

See the STL Hardening wiki page for details, most of which I won't repeat here. The highlights are:

  • STL Hardening supersedes and removes an earlier (2019) attempt in this area, _CONTAINER_DEBUG_LEVEL (aka CDL).
  • STL Hardening has simple, minimal interactions with _ITERATOR_DEBUG_LEVEL (aka IDL).
    • They share their checks (with IDL being a superset) and their termination mechanism, being overhauled here. More on that below. IDL is otherwise generally not being changed here (exceptions will be noted below).
  • STL Hardening is binary-compatible; it does not affect ABI.
    • All checking is performed with information that containers/views already have.
  • STL Hardening follows the design and wording of WG21-P3471R2 Standard Library Hardening (after extensive review, comparing it to our prior work with CDL).
    • Thanks to the paper authors for their thoughtful and careful work! Their design and rationale have helped me become very confident in this new STL Hardening mode, and resolves the major concerns I had with CDL's completely unstructured development. Additionally, they helped identify several places that we had missed in CDL. Conversely, I found a couple of classes that were missed in R2, which the authors will be adding in R3.
  • STL Hardening is a deliberate exception to our general policy of implementing only what's in the Working Draft. That said, because P3471 is not yet part of the Standard, we aren't defining its feature-test macros (but can easily do so in the future).
  • STL Hardening is similar to libc++'s Hardening Modes, with a few differences:
    • We intentionally don't harden algorithms against reversed iterator pairs. (WG21-P3471R2 also excludes this. We have debug checks, of course.)
    • We intentionally don't have an "extensive" mode.
    • We aren't doing anything ABI-breaking with iterators.
  • STL Hardening is initially shipping off-by-default, allowing us to gather performance feedback from early adopters that will help the optimizer team focus their efforts. We intend to make this on-by-default in the future (but not for the VS 2022 17.x release series).

I don't have concrete numbers about the performance impact of these changes, especially in real-world programs. (Remember, I write one-page programs named meow.cpp all day.) Microbenchmarks aren't terribly illuminating for this, so I'm intentionally not adding any. We've gathered a set of MS-internal customers who have promised to be early adopters and provide performance feedback; we also welcome external feedback.

For programs where STL Hardening is disabled (which is all of them, until users take action to opt-in, for the time being), there is zero performance impact in release mode - the checks are all preprocessed out. And in debug mode (where our long-standing IDL=2 debug checks supersede STL Hardening), this work led me to improve the non-optimized codegen for debug checks - that was in #5270.

This has been approved by my bosses and boss-like entities, but not my ultrabosses (apparently I worked a bit faster than expected), so while I'm getting this out for review as soon as possible, I'll wait for their blessings before merging this.

As usual for my PRs, this is organized into well-structured commits - let's look at them. Make rocket go now.

💀 Remove CDL

In these commits, I'm taking the legacy CDL checks that we don't want to be part of STL Hardening, and changing them from testing #if _CONTAINER_DEBUG_LEVEL > 0 to #if _ITERATOR_DEBUG_LEVEL != 0. It helps to understand that CDL was only ever 0 or 1, and (because nobody ever bothered to use it), it was defined as:

STL/stl/inc/yvals.h

Lines 160 to 166 in d43d49a

#ifndef _CONTAINER_DEBUG_LEVEL
#if _ITERATOR_DEBUG_LEVEL == 0
#define _CONTAINER_DEBUG_LEVEL 0
#else // ^^^ _ITERATOR_DEBUG_LEVEL == 0 / _ITERATOR_DEBUG_LEVEL != 0 vvv
#define _CONTAINER_DEBUG_LEVEL 1
#endif // _ITERATOR_DEBUG_LEVEL == 0
#endif // !defined(_CONTAINER_DEBUG_LEVEL)

That is, for IDL=0 (the release mode default) we defined CDL to be 0, and for IDL != 0 (including IDL=2, the debug mode default) we defined CDL to be 1.

So this replacement isn't altering when the checks are emitted, as long as nobody was defining CDL, which they weren't (outside of our test suite).

  • Exclude valarray's 28 binary ops, no test impact.
  • Exclude condition_variable::wait_until, no test impact.
  • Exclude basic_string::resize_and_overwrite, no test impact.
  • Exclude basic_string_view's ctor, update test coverage.
  • Exclude span::size_bytes, update test coverage.
  • Exclude most of <mdspan>, update test coverage.
    • Need to silence Clang -Wunused-variable warnings in the tests.
  • Exclude <generator>, update test coverage.
  • Exclude iota_view, update test coverage.
  • Exclude repeat_view, update test coverage.
  • Exclude filter_view, update test coverage.
  • Exclude take_view, update test coverage.
  • Exclude take_while_view, update test coverage.
  • Exclude drop_view, update test coverage.
  • Exclude drop_while_view, update test coverage.
  • Exclude views::counted, update test coverage.
  • Exclude chunk_view, update test coverage.
  • Exclude slide_view, update test coverage.
  • Exclude chunk_by_view, update test coverage.
  • Exclude stride_view, update test coverage.
  • Exclude cartesian_product_view, update test coverage.

☑️ Control macros

Unlike CDL's weird system which was entangled with IDL, STL Hardening has a simpler, yet more powerful scheme.

We have a coarse-grained control macro, _MSVC_STL_HARDENING, which should be either 0 or 1. The default value is currently 0.

We then have fine-grained control macros, which are per-class. For example, _MSVC_STL_HARDENING_VECTOR controls whether vector (both vector<T> and vector<bool>) are hardened. The default value is whatever the coarse-grained macro is.

This allows the user to have a one-touch setting to control the whole scheme. But if they want, they can add per-class control. (For example, disabled in general but enabled for just span as an initial test of perf impact. Or, enabled in general but disabled for just basic_string.)

As for all control macros, if the user wishes to provide a definition, they should do this project-wide, i.e. on the command line or in their build system, not in source files.

  • Add _MSVC_STL_HARDENING control macros, initially off-by-default.

💎 Add hardening checks

After all of this buildup, the changes to add hardening are generally quite simple, because we already had most of the checks for CDL. The pattern is that (e.g. in <array>) #if _CONTAINER_DEBUG_LEVEL > 0 is being replaced by #if _MSVC_STL_HARDENING_ARRAY || _ITERATOR_DEBUG_LEVEL != 0. That's it - the actual check below remains unchanged.

I've previously explained how converting a CDL check into IDL != 0 is behavior-preserving (for everyone who didn't touch CDL). This simply adds the fine-grained macro _MSVC_STL_HARDENING_ARRAY as an additional way to enable the check.

This "logical OR" pattern, as well as the unified nature of the checking macros (_STL_VERIFY etc.) makes the interaction between STL Hardening and IDL easy to understand:

  • If STL Hardening is enabled (right now, if the user opts in; in the future, by default), then we will perform the check regardless of IDL. This means that we'll check in IDL=0 release mode (and failfast; more on termination later). We'll also check in IDL=2 debug mode, in which case we continue to emit cool assertion dialogs (it would be annoying to failfast in debug mode and make it harder for the programmer-user to figure out what happened).
  • If STL Hardening is disabled (right now, by default; in the future, if the user opts out), then IDL controls whether the check happens. Ignoring the accursed IDL=1 (which is a tedious story, and this is already long enough, and I'm not interfering with its behavior here), this means that disabling STL Hardening means no checking in IDL=0 release mode, and debug checks in IDL=2 debug mode, with the usual cool assertion dialogs. That's the usual behavior today, no surprises.

In a few of these commits, I had to add checking that was outright missing, or not controlled by CDL, or was entangled with normal control flow. To support "Continue on Error" (more on that later), our new pattern is that checks should always be performed prior to library logic, and not interact with it.

  • Harden array.
  • Harden deque, overhaul pop_front/pop_back.
    • They had checks that were entangled with library logic, and (even more annoyingly) split into IDL=2 and IDL < 2 cases.
  • Harden forward_list, add pop_front.
    • We didn't even have a debug check here!
  • Harden list.
  • Harden vector, add vector<bool>::pop_back.
    • CDL missed vector<bool>::pop_back. I believe IDL=2 debug checks would have caught it, but with a worse message (because it would be detected when forming end() - 1).
  • Harden basic_string, add pop_back.
    • This was missing from CDL, but not IDL, at least.
  • Harden basic_string_view.
  • Harden span.
  • Harden mdspan.
  • Harden expected.
  • Harden optional.
  • Harden valarray.
  • Harden ranges::view_interface.
  • Harden bitset, overhauling operator[].

☠️ Removing CDL from human memory

  • Update test coverage for CDL removal.
    • In VSO_0677157_flist_merge_edge_cases, avoid duplicate coverage of front().
    • In VSO_0000000_string_view_idl, remove comment citing VSO-830211, the bug that requested CDL (added by MSVC-PR-174936 on 2019-04-15).
  • Remove CDL's default definition and emit an error if users define it.

😸 Test everything

This tests every single hardened check. I've intentionally avoided extracting commonality into template machinery - there just isn't enough of it to be worth the added complexity. It's much easier to read 2-3 lines and see what each test function is doing.

  • Add test coverage: GH_005090_stl_hardening

🎁 Bonus enhancement: span::size_bytes() debug checks

While reviewing the prelude PR, @sivadeilra observed that this could be improved: #5270 (comment)

It was too much effort to move into the prelude PR. This isn't strictly part of hardening, but it's related to debug checks, so it's here.

  • Enhancement: Move span::size_bytes() debug checks into span's ctors.
    • We don't need to push-disable-pop warning C4127 "conditional expression is constant" because we're going to separately handle static vs. dynamic extents.
    • Add a comment to size_bytes() explaining that we've already checked. (Having constructor checks is sufficient because spans can't grow.)
    • The checks are moving into _Span_extent_type, which has different cases for static vs. dynamic extents. We need to say sizeof(_Ty) because we don't have the element_type typedef.
    • In the debug messages, drop qualification from "std::numeric_limits"; the name is obvious, and if we can mention span unqualified, surely numeric_limits is fine.
    • For static extents, we can unconditionally static_assert, as the throughput cost is trivial. I'm doing this in the constructor, (1) for consistency with dynamic extents below, and (2) because I could imagine pathological code (in STL test suites?) claiming that span<int, static_cast<size_t>(-2)> isn't forbidden to simply instantiate as a type - but the moment that an object of such a type is constructed, the constructor arguments must surely be lying, so we're justified in performing a debug check then.
    • For dynamic extents, _Span_extent_type construction performs the runtime check. This is IDL != 0 (as opposed to hardening), so it's okay if we check more frequently than absolutely necessary. (For example, when constructing from a built-in array or a std::array, we don't need to worry about overflow.)

🦹‍♂️ DOOM RULES ALL

Finally, I'm overhauling how the STL's termination mechanism works. Our cool debug assert messages are unchanged (that's controlled by _RPTF0(_CRT_ASSERT, mesg); in _STL_REPORT_ERROR), but I'm changing what happens when we need to end program execution.

  • Enhancement: Overhaul the STL's "doom function".
    • Emit errors if the old macros are defined.
    • _MSVC_STL_DOOM_FUNCTION is so named because it could expand to many things: a user-customized function, abort(), _invoke_watson(), or (in the future) __fastfail().
    • Add comments explaining how the doom function can be replaced.
      • We now attempt to support "Continue on Error" - not as part of this macro scheme, but as part of how we structure our checks and library logic. I may add automated testing of this in a future PR (it would be full of undefined behavior, so no promises).
    • Provide _MSVC_STL_USE_ABORT_AS_DOOM_FUNCTION as users have specifically requested abort().
    • Change the _STL_CRT_SECURE_INVALID_PARAMETER(expr) parameter to _MSVC_STL_DOOM_FUNCTION(mesg). We always call it with a mesg string, so stringizing it again with #expr is a mistake.
    • Calling _invoke_watson() will immediately call __fastfail() (on Win8+), without calling any user-customized invalid parameter handlers. This gives attackers fewer opportunities to exploit running code.
    • Apparently, in the _SECURE_SCL era of 2005, the CRT's invalid parameter handler was considered a desirable path for termination. We've got new guidance from our internal teams that fastfail is the best path, which makes a lot of sense.
    • When we can drop Win7 support in Dev18, we should be able to directly call __fastfail(), resulting in less codegen (I suspect pushing all of the nullptr / 0 arguments onto the stack is a cost).
    • Cleanup: Add global scope qualification to _invoke_watson in the no-exceptions definition of _RAISE, for consistency.
    • Remove /D_STL_CALL_ABORT_INSTEAD_OF_INVALID_PARAMETER from the test suites.
      • I recall that we needed this to avoid the CRT's invalid parameter handler creating dialogs, even after suppressing the CRT assertion dialogs. But failfast ends so abruptly, there is no dialog.

Need to silence Clang -Wunused-variable warnings in the tests.
StephanTLavavej and others added 6 commits February 10, 2025 21:24
Filed GH 5262 "`<bitset>`: `operator[]` could be SAL annotated with `_In_range_(<, _Bits)`" for followup.

Fusing _Validate() into operator[] fixes some libcxx bitset tests that were failing due to constexpr step limits.
In VSO_0677157_flist_merge_edge_cases, avoid duplicate coverage of front().

In VSO_0000000_string_view_idl, remove comment citing VSO-830211, the bug that requested CDL (added by MSVC-PR-174936 on 2019-04-15).
We don't need to push-disable-pop warning C4127 "conditional expression is constant" because we're going to separately handle static vs. dynamic extents.

Add a comment to size_bytes() explaining that we've already checked. (Having constructor checks is sufficient because spans can't grow.)

The checks are moving into `_Span_extent_type`, which has different cases for static vs. dynamic extents. We need to say `sizeof(_Ty)` because we don't have the `element_type` typedef.

In the debug messages, drop qualification from "std::numeric_limits"; the name is obvious, and if we can mention `span` unqualified, surely `numeric_limits` is fine.

For static extents, we can unconditionally `static_assert`, as the throughput cost is trivial. I'm doing this in the constructor, (1) for consistency with dynamic extents below, and (2) because I could imagine pathological code (in STL test suites?) claiming that `span<int, static_cast<size_t>(-2)>` isn't forbidden to simply instantiate as a type - but the moment that an object of such a type is constructed, the constructor arguments must surely be lying, so we're justified in performing a debug check then.

For dynamic extents, `_Span_extent_type` construction performs the runtime check. This is IDL != 0 (as opposed to hardening), so it's okay if we check more frequently than absolutely necessary. (For example, when constructing from a built-in array or a `std::array`, we don't need to worry about overflow.)
Emit errors if the old macros are defined.

`_MSVC_STL_DOOM_FUNCTION` is so named because it could expand to many things: a user-customized function, abort(), _invoke_watson(), or (in the future) __fastfail().

Add comments explaining how the doom function can be replaced.

Provide `_MSVC_STL_USE_ABORT_AS_DOOM_FUNCTION` as users have specifically requested abort().

Change the `_STL_CRT_SECURE_INVALID_PARAMETER(expr)` parameter to `_MSVC_STL_DOOM_FUNCTION(mesg)`. We always call it with a `mesg` string, so stringizing it again with `#expr` is a mistake.

Calling _invoke_watson() will immediately call __fastfail(), without calling any user-customized invalid parameter handlers. This gives attackers fewer opportunities to exploit running code.

(When we can drop Win7 support in Dev18, we should be able to directly call __fastfail(), resulting in less codegen.)

Cleanup: Add global scope qualification to _invoke_watson in the no-exceptions definition of _RAISE, for consistency.

Remove /D_STL_CALL_ABORT_INSTEAD_OF_INVALID_PARAMETER from the test suites.
@StephanTLavavej StephanTLavavej added enhancement Something can be improved high priority Important! labels Feb 11, 2025
@StephanTLavavej StephanTLavavej requested a review from a team as a code owner February 11, 2025 13:23
Copy link
Member

@DanielJump DanielJump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me.

@davidmrdavid davidmrdavid self-assigned this Feb 11, 2025
Copy link
Member

@davidmrdavid davidmrdavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work. I'm excited for💎hardening💎 to help improve the correctness of C++ programs everywhere.

I left some comments, most nitpicks, some questions. No major blockers, but I'd like to resolve these before approving. Thanks!

Copy link
Member

@davidmrdavid davidmrdavid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: No concerns from me in moving forward, LGTM. I'll be monitoring for more reviews in this thread in case any of them pick up on something I missed.

Before enabling this by default (in another PR), I think it would be wise to benchmark. 🔬 . Those numbers may make for a good blog post!

(I won't be moving this through the kanban board as I understand you're waiting for even more approvals from other stakeholders)

@StephanTLavavej
Copy link
Member Author

WG21-P3471R4 was just voted into the Working Paper. R3 added expected<void, E> as I implemented it here. (I've asked why ranges::view_interface wasn't added.) R4 then specified the feature in terms of Contracts, which we obviously don't have yet, so we can't define the feature-test macros. (Conveniently, the feature-test macros are per-class, matching my fine-grained scheme.)

As a result, I have no further changes for this PR.

@StephanTLavavej
Copy link
Member Author

I've pushed an additional commit to pre-emptively resolve a merge conflict with the toolset update #5284. (When mirroring, I always sequence toolset updates first, so all following PRs run on the new pool.) The toolset update entirely SKIPPED the bitset tests that were inconsistently triggering constexpr step limits, which supersedes how this PR was selectively removing FAIL from them. By making the same change in both PRs, they will merge cleanly with each other here.

@StephanTLavavej
Copy link
Member Author

I'm speculatively mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Something can be improved high priority Important!
Projects
Status: Blocked
Development

Successfully merging this pull request may close these issues.

Implement a hardened mode
6 participants