Jit: Conditional Escape Analysis and Cloning #111473

AndyAyersMS · 2025-01-15T22:06:41Z

Enhance escape analysis to determine if an object escapes only under failed GDV tests. If so, clone to create a path of code so that the object doesn't escape, and then stack allocate the object.

More details in the included document.

Contributes to #108913

dotnet-policy-service · 2025-01-15T22:07:19Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

AndyAyersMS · 2025-01-15T22:07:29Z

@dotnet/jit-contrib PTAL

Not sure who wants to sign up to review this. Let me know.

SPMI is not going to be useful here. We can try MihuBot but that may not show much either.

amanasifkhalid

Nice writeup! I haven't looked at the code yet, though you should probably get someone else to look at it

docs/design/coreclr/jit/DeabstractionAndConditionalEscapeAnalysis.md

Co-authored-by: Aman Khalid <[email protected]>

AndyAyersMS · 2025-01-17T23:16:43Z

One of the libraries failures is that we're not updating the enclosing EH extents properly.

Assertion failed 'found && "BBJ_EHFINALLYRET predecessor of block that doesn't follow a BBJ_CALLFINALLY!"' in 'System.IO.File:InternalWriteAllLines(System.IO.StreamWriter,System.Collections.Generic.IEnumerable`1[System.String])' during 'Allocate Objects' (IL size 84; hash 0xfef048f0; Tier1)

fgCloneTryRegion assumes the cloned try will be lexically after the original, so any EH extent that ended at the original now ends at the clone. But here we're often putting the clone before the original, and so we get the enclosing extent wrong.

It should be possible to reorder these and put the clone after, but the EH lexicality constraints have been a consistent source of trouble.

AndyAyersMS · 2025-01-23T20:02:34Z

@EgorBo this is the PR I was mentioning earlier

AndyAyersMS · 2025-01-23T23:12:47Z

Hmm, still not getting the EH right.

AndyAyersMS · 2025-01-24T19:01:27Z

Failures on linux libraries tests to look into.

AndyAyersMS · 2025-01-30T02:40:51Z

/azp run runtime-coreclr pgo, runtime-coreclr pgostress, runtime-coreclr libraries-pgo, runtime-coreclr jitstress

azure-pipelines · 2025-01-30T02:41:19Z

Azure Pipelines successfully started running 4 pipeline(s).

AndyAyersMS · 2025-01-30T15:57:59Z

Looking at outerloop failures:

- jitstress is Test failure: baseservices/exceptions/stackoverflow/stackoverflowtester/stackoverflowtester.cmd #110133
- libraries-pgo is
  - inconsistent profile data at Invert Loops (@amanasifkhalid are these known?) for System.Text.Json.Tests and System.Linq.Tests (possibly related)
  - gc hole assertion on crypto for x86 (possibly related)
  - linux x64 assert during allocate objects (related)
- pgostress is Test failure: baseservices/exceptions/stackoverflow/stackoverflowtester/stackoverflowtester.cmd #110133

So more things to investigate.

amanasifkhalid · 2025-01-30T16:20:40Z

> inconsistent profile data at Invert Loops (@amanasifkhalid are these known?) for System.Text.Json.Tests and System.Linq.Tests (possibly related)

Yes, I have a fix for this in #111684

AndyAyersMS · 2025-01-31T14:33:20Z

/azp run runtime-coreclr libraries-pgo

azure-pipelines · 2025-01-31T14:33:35Z

Azure Pipelines successfully started running 1 pipeline(s).

AndyAyersMS · 2025-01-31T17:33:28Z

libraries-pgo testing still ragged... think the failures are unrelated but there are way too many to check. Will wait for CI issues to be sorted, merge up, and try again.

AndyAyersMS · 2025-01-31T21:14:07Z

/azp run runtime-coreclr libraries-pgo

azure-pipelines · 2025-01-31T21:14:21Z

Azure Pipelines successfully started running 1 pipeline(s).

AndyAyersMS · 2025-01-31T23:17:33Z

I think the remaining libraries-pgo failures are unrelated.

@amanasifkhalid still seeing some inconsistent profiles after loop inversion.

jakobbotsch · 2025-01-31T23:47:00Z

src/coreclr/jit/objectalloc.cpp

-            unsigned const lclNum = tree->AsLclVarCommon()->GetLclNum();
+            GenTree* const   tree   = *use;
+            unsigned const   lclNum = tree->AsLclVarCommon()->GetLclNum();
+            LclVarDsc* const varDsc = m_compiler->lvaGetDesc(lclNum);


Is this used?

jakobbotsch · 2025-01-31T23:52:55Z

src/coreclr/jit/objectalloc.cpp

+
+    for (unsigned p = 0; p < m_numPseudoLocals; p++)
+    {
+        unsigned const pseudoLocal = p + comp->lvaCount;


Does the base analysis actually need to allocate an index for all locals in the first place? Shouldn't it be sufficient to allocate indices for only TYP_REF, TYP_BYREF, TYP_I_IMPL and TYP_STRUCT locals?

Yeah, it is using bigger BVs than necessary. We would need to create some convenient dense numbering scheme. I can revisit this when I look at field-sensitive approaches perhaps.

We could probably just reuse LclVarDsc::lvVarIndex for the dense numbering scheme. Doing this later sounds good to me.

It also seems like there should be some kind of limit on the number of tracked locals here, since this analysis is inherently super-linear.
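
To illustrate the dense numbering idea from this thread, here is a minimal standalone sketch, not the JIT's code. `VarType`, `isTrackable`, `buildDenseIndex`, and the `maxTracked` budget are hypothetical names made up for the example; the real analysis works on `LclVarDsc` and might reuse `lvVarIndex` as suggested above. The sketch gives a bit vector index only to locals whose type could carry an object reference, and reports when the budget is exceeded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the JIT's local types (TYP_REF, TYP_BYREF, ...).
enum class VarType { Int, Float, Ref, Byref, NativeInt, Struct };

// Marker for locals that get no bit vector index.
constexpr unsigned kNotTracked = ~0u;

// Only these types can hold (or contain) an object reference.
bool isTrackable(VarType t)
{
    return t == VarType::Ref || t == VarType::Byref ||
           t == VarType::NativeInt || t == VarType::Struct;
}

struct DenseLocalIndex
{
    std::vector<unsigned> lclToDense;   // local number -> dense index, or kNotTracked
    unsigned              trackedCount = 0;
    bool                  overLimit    = false; // true if the budget was exceeded
};

// Assign consecutive indices to trackable locals, stopping at maxTracked.
DenseLocalIndex buildDenseIndex(const std::vector<VarType>& locals, unsigned maxTracked)
{
    DenseLocalIndex result;
    result.lclToDense.assign(locals.size(), kNotTracked);

    for (std::size_t lclNum = 0; lclNum < locals.size(); lclNum++)
    {
        if (!isTrackable(locals[lclNum]))
        {
            continue;
        }

        if (result.trackedCount == maxTracked)
        {
            // A caller could skip the analysis entirely in this case.
            result.overLimit = true;
            break;
        }

        result.lclToDense[lclNum] = result.trackedCount++;
    }

    return result;
}
```

With this mapping the bit vectors would be sized by `trackedCount` instead of the total local count, and `overLimit` gives the caller a place to bail out before doing super-linear work.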

jakobbotsch · 2025-02-01T00:03:33Z

src/coreclr/jit/objectalloc.cpp

+        return nullptr;
+    }
+
+    GenTree* const tree = jumpTree->AsOp()->gtOp1;


Nit (here and also below)

Suggested change

-    GenTree* const tree = jumpTree->AsOp()->gtOp1;
+    GenTree* const tree = jumpTree->gtGetOp1();

jakobbotsch · 2025-02-01T00:14:24Z

src/coreclr/jit/objectalloc.cpp

+    }
+    else
+    {
+        JITDUMP("... no pseudo local?\n");


Seems this would be more readable with some early outs.

jakobbotsch · 2025-02-01T00:16:23Z

src/coreclr/jit/objectalloc.cpp

+        {
+            assert(comp->m_dfsTree != nullptr);
+            assert(comp->m_domTree == nullptr);
+            comp->m_domTree = FlowGraphDominatorTree::Build(comp->m_dfsTree);


Is there any benefit to waiting until we actually want to use it to compute it? Or do we know at this point we are going to end up using it?

Yeah we pretty much know we're going to use it.

jakobbotsch · 2025-02-01T00:25:18Z

src/coreclr/jit/objectalloc.cpp

+    BitVecTraits traits(comp->compBasicBlockID, comp);
+    BitVec       visitedBlocks(BitVecOps::MakeEmpty(&traits));
+    toVisit.Push(allocBlock);


Seems this can use post order traits.

jakobbotsch · 2025-02-01T00:26:42Z

src/coreclr/jit/objectalloc.cpp

+
+        JITDUMP("walking through " FMT_BB "\n", visitBlock->bbNum);
+
+        for (BasicBlock* const succ : visitBlock->Succs())


VisitRegularSuccs would be better to use here

Should this actually use VisitAllSuccs? It seems this needs to take into account any possible control flow path if it's trying to validate some form of post dominance.

See note below: since we're in an inlined GDV region, and we're validating that all blocks are in the same EH region, any EH successor would bypass the assignment and so would not flow into the cloned code we're going to create.

Makes sense.

jakobbotsch · 2025-02-01T00:27:13Z

src/coreclr/jit/objectalloc.cpp

+            if (BitVecOps::IsMember(&traits, visitedBlocks, succ->bbID))
+            {
+                continue;
+            }
+            toVisit.Push(succ);


Can we add to visitedBlocks here to avoid unnecessarily pushing the same block on toVisit?

I should probably revise this bit, but would like to hold off.

Ok with me to do this later. There is another graph walk below that can use the same kind of enhancement.
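
The enhancement discussed in this thread can be sketched on a toy flow graph. This is an illustration only: `FlowGraph`, `WalkResult`, and `walkFrom` are hypothetical names, and plain vectors stand in for `BitVec` and the JIT's block stack. The point is that a block is marked visited at the moment it is pushed, so it can never be on the worklist twice.

```cpp
#include <cassert>
#include <vector>

// Toy flow graph: succs[b] lists the successor block indices of block b.
using FlowGraph = std::vector<std::vector<unsigned>>;

struct WalkResult
{
    std::vector<unsigned> order;      // blocks in the order they were processed
    unsigned              pushes = 0; // how many times anything was pushed
};

WalkResult walkFrom(const FlowGraph& succs, unsigned start)
{
    std::vector<bool>     visited(succs.size(), false);
    std::vector<unsigned> toVisit;
    WalkResult            result;

    // Mark on push, not on pop.
    visited[start] = true;
    toVisit.push_back(start);
    result.pushes++;

    while (!toVisit.empty())
    {
        unsigned const block = toVisit.back();
        toVisit.pop_back();
        result.order.push_back(block);

        for (unsigned const succ : succs[block])
        {
            if (visited[succ])
            {
                continue;
            }

            visited[succ] = true; // prevents a second push of the same block
            toVisit.push_back(succ);
            result.pushes++;
        }
    }

    return result;
}
```

On a diamond such as `{{1, 2}, {3}, {3}, {}}` the join block is pushed once even though it has two predecessors; marking on pop instead would push it twice.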

jakobbotsch · 2025-02-01T00:37:41Z

src/coreclr/jit/objectalloc.cpp

+            }
+            toVisit.Push(succ);
+        }
+    }


Is this loop missing some check for "sink" blocks like BBJ_RETURN that cannot reach defBlock? What would happen in those cases? Unnecessary cloning?

I suppose so.

But with the current hinting the only allocation sites we look at are those that are part of an inlinee on the fast side of a GDV, so with normal flow they can't really avoid reaching the bottom of the GDV diamond.

We also know the allocation site is not in a loop (via our lexical analysis) or we wouldn't consider it eligible in the first place. Might be nice to replace that with a DFS based check but I don't know if we have a general "this block is in a cycle" utility.

So the only other possibility for unexpected flow is some kind of exception. If the exception is handled within the fast GDV side we would expect it to assign some kind of return value, otherwise the GetEnumerator we've inlined is defective. If it's not handled or the GetEnumerator is defective then if the program tries to use the enumerator it should get a null ref (or whatever the reaching def is for the enumerator var).

So I don't think there is anything wasteful going on here, but if we generalize the analysis to consider other kinds of conditional allocations then we'll need to reconsider.

Makes sense, thanks.
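
The thread leaves open whether the JIT has a general "this block is in a cycle" utility. As a rough sketch of what a DFS-based check could look like, assuming a toy graph of block indices (the names `BlockGraph` and `isInCycle` are invented for the example): a block is in a cycle exactly when it is reachable from one of its own successors.

```cpp
#include <cassert>
#include <vector>

// Toy flow graph: succs[b] lists the successor block indices of block b.
using BlockGraph = std::vector<std::vector<unsigned>>;

// Returns true if 'block' can reach itself along one or more edges.
bool isInCycle(const BlockGraph& succs, unsigned block)
{
    std::vector<bool>     visited(succs.size(), false);
    std::vector<unsigned> stack(succs[block].begin(), succs[block].end());

    while (!stack.empty())
    {
        unsigned const current = stack.back();
        stack.pop_back();

        if (current == block)
        {
            return true; // found a path back to the starting block
        }

        if (visited[current])
        {
            continue;
        }
        visited[current] = true;

        for (unsigned const succ : succs[current])
        {
            stack.push_back(succ);
        }
    }

    return false;
}
```

This costs one graph walk per query, so a real implementation would more likely reuse loop or DFS information that has already been computed.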

jakobbotsch · 2025-02-01T00:40:28Z

src/coreclr/jit/objectalloc.cpp

+    auto latestStmt = [](Statement* stmt1, Statement* stmt2) {
+        if (stmt1 == stmt2)
+        {
+            return stmt1;
+        }
+
+        Statement* cursor1 = stmt1->GetNextStmt();
+        Statement* cursor2 = stmt2->GetNextStmt();
+
+        while (true)
+        {
+            if ((cursor1 == stmt2) || (cursor2 == nullptr))
+            {
+                return stmt2;
+            }
+
+            if ((cursor2 == stmt1) || (cursor1 == nullptr))
+            {
+                return stmt1;
+            }
+
+            cursor1 = cursor1->GetNextStmt();
+            cursor2 = cursor2->GetNextStmt();
+        }
+    };


Extract this and the other versions into a Compiler method?
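
For reference, here is one way the lambda above might look as a free-standing function. It is a sketch over a toy singly linked `Stmt` node, not the utility that was eventually added, and the names `Stmt` and `latestStmt` are assumptions. Both statements must belong to the same list; the function walks forward from each in lockstep and returns whichever one appears later.

```cpp
#include <cassert>

// Toy stand-in for the JIT's Statement: a singly linked list node.
struct Stmt
{
    Stmt* next = nullptr;
};

// Returns whichever of stmt1 / stmt2 comes later in their shared list.
Stmt* latestStmt(Stmt* stmt1, Stmt* stmt2)
{
    if (stmt1 == stmt2)
    {
        return stmt1;
    }

    Stmt* cursor1 = stmt1->next;
    Stmt* cursor2 = stmt2->next;

    while (true)
    {
        // Walking from stmt1 reached stmt2, or stmt2's walk hit the end first:
        // stmt2 is the later statement.
        if ((cursor1 == stmt2) || (cursor2 == nullptr))
        {
            return stmt2;
        }

        // Symmetric case: stmt1 is the later statement.
        if ((cursor2 == stmt1) || (cursor1 == nullptr))
        {
            return stmt1;
        }

        cursor1 = cursor1->next;
        cursor2 = cursor2->next;
    }
}
```

Advancing both cursors together bounds the walk by the distance from the later statement to the end of the list, plus the gap between the two statements when that is shorter.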

jakobbotsch · 2025-02-01T00:41:51Z

src/coreclr/jit/objectalloc.cpp

+            }
+        }
+
+        // todo: proper check for same block


Still needed?

amanasifkhalid · 2025-02-04T01:00:12Z

> @amanasifkhalid still seeing some inconsistent profiles after loop inversion.

Sorry I missed this; taking a look

jakobbotsch · 2025-02-04T16:15:43Z

src/coreclr/jit/objectalloc.cpp

+
+    // Verify this appearance is under the same guard
+    //
+    if ((info.m_local == lclNum) && (pseudoGuardInfo->m_local == lclNum) && (info.m_type == pseudoGuardInfo->m_type))


Does the m_type == pseudoGuardInfo->m_type check work correctly for NAOT? From what I can tell one is an actual compile time class handle from GenTreeAllocObj::gtAllocObjClsHnd, while the other is the constant that we were pointing out is not a compile time handle above. Perhaps that one needs to be changed to use GenTreeIntCon::gtCompileTimeHandle.

Could be -- I will take a look.

That's why I pointed out that we should never cast an IntCon (including VN) to CLASS_HANDLE; use a helper instead (feel free to rename it) 🙂. I suspect it shouldn't be too hard to eliminate the compile-time handle for classes, just as we don't need it for many other handle types.

Yes, it should use the compile time handle from the icon. It is currently hard to reach this opt in R2R/NAOT because it requires GDV and hence PGO.

I'll have a PR up in a bit with this and some of the other deferred fixes.

I guess for R2R/NAOT the priority may be not that high given that now we have late devirt inlining. It should already unblock many cases of stack allocating an enumerator, unless the MoveNext is too large to inline.
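
The concern in this thread can be illustrated with a small model. Everything here is hypothetical (`ClassHandle`, `HandleConstant`, `guardMatchesAllocation`); it only assumes, as the comments above suggest, that under R2R/NAOT the value embedded in a handle constant need not equal the compile-time class handle. The comparison therefore goes through the compile-time handle carried alongside the constant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opaque identity the compiler uses for a class.
using ClassHandle = std::uintptr_t;

// Hypothetical model of a handle constant node.
struct HandleConstant
{
    std::uintptr_t embeddedValue;     // what the generated code would load or compare
    ClassHandle    compileTimeHandle; // what the compiler uses to identify the class
};

// Compare a guard's class against the allocated class using compile-time
// identities only; never reinterpret embeddedValue as a ClassHandle.
bool guardMatchesAllocation(const HandleConstant& guardConstant, ClassHandle allocatedClass)
{
    return guardConstant.compileTimeHandle == allocatedClass;
}
```

In this model a guard built from `HandleConstant{0x1000, 42}` matches an allocation of class `42`, while comparing `0x1000` directly against `42` would wrongly report a mismatch.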

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 15, 2025

dotnet-policy-service bot assigned AndyAyersMS Jan 15, 2025

AndyAyersMS mentioned this pull request Jan 15, 2025

JIT: De-abstraction in .NET 10 #108913

Open

AndyAyersMS added 2 commits January 15, 2025 14:45

fix build; fix doc

f7af8e7

restore fix lost in a merge somewhere

a785363

amanasifkhalid reviewed Jan 16, 2025

AndyAyersMS and others added 3 commits January 17, 2025 09:35

Clarify sentence in Linq methods section

9f18c69

Apply suggestions from code review

2b5546e

Co-authored-by: Aman Khalid <[email protected]>

properly detect if allocation escapes via an alloc temp

c2a15ad

build-analysis bot mentioned this pull request Jan 17, 2025

The hosted runner encountered an error while running your job. (Error Type: Disconnect). dotnet/dnceng#1919

Open

3 tasks

AndyAyersMS added 5 commits January 17, 2025 15:37

fix insertion point logic for the cloning

670b407

Merge branch 'main' into ArrayDeAbstractionCloning

f5c2d93

merge main; fix up some conflicts

e9fa669

fix bad merge

95f8ea5

fix enclosing EH region extent

45badb6

build-analysis bot mentioned this pull request Jan 23, 2025

Test failure WindowsAlternateDataStreamOverwrite #83659

Open

better fix for enclosing regions

41f8397

AndyAyersMS mentioned this pull request Jan 24, 2025

Special case List<ClaimsIdentity> in SelectPrimaryIdentity #111799

Open

AndyAyersMS added 2 commits January 24, 2025 17:38

Merge branch 'main' into ArrayDeAbstractionCloning

59a010f

wip

49d9ade

cleanup some unneded code

cd03844

process try regions outer to inner

32ad17b

Merge branch 'main' into ArrayDeAbstractionCloning

6d43cdc

jakobbotsch reviewed Jan 31, 2025

jakobbotsch reviewed Feb 1, 2025

AndyAyersMS added 2 commits February 3, 2025 14:28

common latest statement utility

3a18636

review feedback

0bc4908

AndyAyersMS merged commit f6c74b8 into dotnet:main Feb 4, 2025
114 checks passed

jakobbotsch reviewed Feb 4, 2025
