Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Never use heap for return buffers #112060

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 2 additions & 6 deletions docs/design/coreclr/botr/clr-abi.md
Original file line number Diff line number Diff line change
Expand Up @@ -108,9 +108,7 @@ Passing/returning structs according to hardware floating-point calling conventio

## Return buffers

The same applies to some return buffers. See `MethodTable::IsStructRequiringStackAllocRetBuf()`. When that returns `false`, the return buffer might be on the heap, either due to reflection/remoting code paths mentioned previously or due to a JIT optimization where a call with a return buffer that then assigns to a field (on the GC heap) are changed into passing the field reference as the return buffer. Conversely, when it returns true, the JIT does not need to use a write barrier when storing to the return buffer, but it is still not guaranteed to be a compiler temp, and as such the JIT should not introduce spurious writes to the return buffer.

NOTE: This optimization is now disabled for all platforms (`IsStructRequiringStackAllocRetBuf()` always returns `false`).
Since .NET 10, return buffers must always be allocated on the stack by the caller. After the call, the caller is responsible for copying the return buffer to the final destination using write barriers if necessary. The JIT can assume that the return buffer is always on the stack and may optimize accordingly, such as by omitting write barriers when writing GC pointers to the return buffer. In addition, the buffer is allowed to be used for temporary storage within the method since its content must not be aliased or cross-thread visible.

ARM64-only: When a method returns a structure that is larger than 16 bytes the caller reserves a return buffer of sufficient size and alignment to hold the result. The address of the buffer is passed as an argument to the method in `R8` (defined in the JIT as `REG_ARG_RET_BUFF`). The callee isn't required to preserve the value stored in `R8`.

Expand Down Expand Up @@ -778,9 +776,7 @@ The return value is handled as follows:
1. Floating-point values are returned on the top of the hardware FP stack.
2. Integers up to 32 bits long are returned in EAX.
3. 64-bit integers are passed with EAX holding the least significant 32 bits and EDX holding the most significant 32 bits.
4. All other cases require the use of a return buffer, through which the value is returned.

In addition, there is a guarantee that if a return buffer is used a value is stored there only upon ordinary exit from the method. The buffer is not allowed to be used for temporary storage within the method and its contents will be unaltered if an exception occurs while executing the method.
4. All other cases require the use of a return buffer, through which the value is returned. See [Return buffers](#return-buffers).

# Control Flow Guard (CFG) support on Windows

Expand Down
2 changes: 2 additions & 0 deletions src/coreclr/inc/corinfo.h
Original file line number Diff line number Diff line change
Expand Up @@ -594,6 +594,8 @@ enum CorInfoHelpFunc
CORINFO_HELP_VALIDATE_INDIRECT_CALL, // CFG: Validate function pointer
CORINFO_HELP_DISPATCH_INDIRECT_CALL, // CFG: Validate and dispatch to pointer

CORINFO_HELP_ENSURE_NONHEAP, // Ensure that the target was not in the heap.

CORINFO_HELP_COUNT,
};

Expand Down
1 change: 1 addition & 0 deletions src/coreclr/inc/jithelpers.h
Original file line number Diff line number Diff line change
Expand Up @@ -341,6 +341,7 @@
JITHELPER(CORINFO_HELP_VALIDATE_INDIRECT_CALL, NULL, METHOD__NIL)
JITHELPER(CORINFO_HELP_DISPATCH_INDIRECT_CALL, NULL, METHOD__NIL)
#endif
JITHELPER(CORINFO_HELP_ENSURE_NONHEAP, JIT_EnsureNonHeapTarget,METHOD__NIL)

#undef JITHELPER
#undef DYNAMICJITHELPER
Expand Down
5 changes: 3 additions & 2 deletions src/coreclr/inc/readytorun.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,10 +19,10 @@
// src/coreclr/nativeaot/Runtime/inc/ModuleHeaders.h
// If you update this, ensure you run `git grep MINIMUM_READYTORUN_MAJOR_VERSION`
// and handle pending work.
#define READYTORUN_MAJOR_VERSION 11
#define READYTORUN_MAJOR_VERSION 12
#define READYTORUN_MINOR_VERSION 0x0000

#define MINIMUM_READYTORUN_MAJOR_VERSION 11
#define MINIMUM_READYTORUN_MAJOR_VERSION 12

// R2R Version 2.1 adds the InliningInfo section
// R2R Version 2.2 adds the ProfileDataInfo section
Expand All @@ -39,6 +39,7 @@
// R2R Version 10.0 adds support for the statics being allocated on a per type basis instead of on a per module basis, disable support for LogMethodEnter helper
// R2R Version 10.1 adds Unbox_TypeTest helper
// R2R Version 11 uses GCInfo v4, which encodes safe points without -1 offset and does not track return kinds in GCInfo
// R2R Version 12 requires all return buffers to be always on the stack
EgorBo marked this conversation as resolved.
Show resolved Hide resolved

struct READYTORUN_CORE_HEADER
{
Expand Down
7 changes: 4 additions & 3 deletions src/coreclr/jit/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -4938,7 +4938,7 @@ class Compiler
Statement** pAfterStmt = nullptr,
const DebugInfo& di = DebugInfo(),
BasicBlock* block = nullptr);
GenTree* impStoreStructPtr(GenTree* destAddr, GenTree* value, unsigned curLevel);
GenTree* impStoreStructPtr(GenTree* destAddr, GenTree* value, unsigned curLevel, GenTreeFlags indirFlags = GTF_EMPTY);

GenTree* impGetNodeAddr(GenTree* val, unsigned curLevel, GenTreeFlags* pDerefFlags);

Expand Down Expand Up @@ -5228,8 +5228,8 @@ class Compiler
static LONG jitNestingLevel;
#endif // DEBUG

static bool impIsInvariant(const GenTree* tree);
static bool impIsAddressInLocal(const GenTree* tree, GenTree** lclVarTreeOut = nullptr);
bool impIsInvariant(const GenTree* tree);
bool impIsAddressInLocal(const GenTree* tree, GenTree** lclVarTreeOut = nullptr);

void impMakeDiscretionaryInlineObservations(InlineInfo* pInlineInfo, InlineResult* inlineResult);

Expand Down Expand Up @@ -10793,6 +10793,7 @@ class Compiler
STRESS_MODE(POISON_IMPLICIT_BYREFS) \
STRESS_MODE(STORE_BLOCK_UNROLLING) \
STRESS_MODE(THREE_OPT_LAYOUT) \
STRESS_MODE(NONHEAP_RET_BUFFER) \
STRESS_MODE(COUNT)

enum compStressArea
Expand Down
11 changes: 11 additions & 0 deletions src/coreclr/jit/flowgraph.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2457,6 +2457,17 @@ PhaseStatus Compiler::fgAddInternal()

noway_assert(!dbgHandle || !pDbgHandle);

#if DEBUG
// JitStress: Insert a helper call to ensure that the return buffer is not on the GC heap.
if (compStressCompile(STRESS_NONHEAP_RET_BUFFER, 50) && (info.compRetBuffArg != BAD_VAR_NUM) &&
!opts.IsReadyToRun())
{
GenTree* retBuffAddr = gtNewLclvNode(info.compRetBuffArg, TYP_BYREF);
fgNewStmtAtBeg(fgFirstBB, gtNewHelperCallNode(CORINFO_HELP_ENSURE_NONHEAP, TYP_VOID, retBuffAddr));
madeChanges = true;
}
#endif

if (dbgHandle || pDbgHandle)
{
// Test the JustMyCode VM global state variable
Expand Down
62 changes: 53 additions & 9 deletions src/coreclr/jit/importer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -843,7 +843,23 @@ GenTree* Compiler::impStoreStruct(GenTree* store,
// TODO-Bug?: verify if flags matter here
GenTreeFlags indirFlags = GTF_EMPTY;
GenTree* destAddr = impGetNodeAddr(store, CHECK_SPILL_ALL, &indirFlags);
NewCallArg newArg = NewCallArg::Primitive(destAddr).WellKnown(wellKnownArgType);

// Make sure we don't pass something other than a local address to the return buffer arg.
// It is allowed to pass current's method return buffer as it is a local too.
if (!impIsAddressInLocal(destAddr) && !eeIsByrefLike(srcCall->gtRetClsHnd))
{
unsigned tmp = lvaGrabTemp(false DEBUGARG("stack copy for value returned via return buffer"));
lvaSetStruct(tmp, srcCall->gtRetClsHnd, false);

GenTree* spilledCall = gtNewStoreLclVarNode(tmp, srcCall);
GenTree* comma = gtNewOperNode(GT_COMMA, store->TypeGet(), spilledCall,
gtNewLclvNode(tmp, lvaGetDesc(tmp)->TypeGet()));
store->Data() = comma;
comma->AsOp()->gtOp1 = impStoreStruct(spilledCall, curLevel, pAfterStmt, di, block);
return impStoreStruct(store, curLevel, pAfterStmt, di, block);
}

NewCallArg newArg = NewCallArg::Primitive(destAddr).WellKnown(wellKnownArgType);

if (destAddr->OperIs(GT_LCL_ADDR))
{
Expand Down Expand Up @@ -953,6 +969,27 @@ GenTree* Compiler::impStoreStruct(GenTree* store,
// TODO-Bug?: verify if flags matter here
GenTreeFlags indirFlags = GTF_EMPTY;
GenTree* destAddr = impGetNodeAddr(store, CHECK_SPILL_ALL, &indirFlags);

// Make sure we don't pass something other than a local address to the return buffer arg.
// It is allowed to pass current's method return buffer as it is a local too.
if (!impIsAddressInLocal(destAddr) && !eeIsByrefLike(call->gtRetClsHnd))
{
unsigned tmp = lvaGrabTemp(false DEBUGARG("stack copy for value returned via return buffer"));
lvaSetStruct(tmp, call->gtRetClsHnd, false);
destAddr = gtNewLclVarAddrNode(tmp, TYP_BYREF);

// Insert address of temp into existing call
NewCallArg retBufArg = NewCallArg::Primitive(destAddr).WellKnown(WellKnownArg::RetBuffer);
call->gtArgs.InsertAfterThisOrFirst(this, retBufArg);

// Now the store needs to copy from the new temp instead.
call->gtType = TYP_VOID;
src->gtType = TYP_VOID;
var_types tmpType = lvaGetDesc(tmp)->TypeGet();
store->Data() = gtNewOperNode(GT_COMMA, tmpType, src, gtNewLclvNode(tmp, tmpType));
return impStoreStruct(store, CHECK_SPILL_ALL, pAfterStmt, di, block);
}

call->gtArgs.InsertAfterThisOrFirst(this,
NewCallArg::Primitive(destAddr).WellKnown(WellKnownArg::RetBuffer));

Expand Down Expand Up @@ -1010,21 +1047,22 @@ GenTree* Compiler::impStoreStruct(GenTree* store,
// impStoreStructPtr: Store (copy) the structure from 'src' to 'destAddr'.
//
// Arguments:
// destAddr - address of the destination of the store
// value - value to store
// curLevel - stack level for which a spill may be being done
// destAddr - address of the destination of the store
// value - value to store
// curLevel - stack level for which a spill may be being done
// indirFlags - flags to be used on the store node
//
// Return Value:
// The tree that should be appended to the statement list that represents the store.
//
// Notes:
// Temp stores may be appended to impStmtList if spilling is necessary.
//
GenTree* Compiler::impStoreStructPtr(GenTree* destAddr, GenTree* value, unsigned curLevel)
GenTree* Compiler::impStoreStructPtr(GenTree* destAddr, GenTree* value, unsigned curLevel, GenTreeFlags indirFlags)
{
var_types type = value->TypeGet();
ClassLayout* layout = (type == TYP_STRUCT) ? value->GetLayout(this) : nullptr;
GenTree* store = gtNewStoreValueNode(type, layout, destAddr, value);
GenTree* store = gtNewStoreValueNode(type, layout, destAddr, value, indirFlags);
store = impStoreStruct(store, curLevel);

return store;
Expand Down Expand Up @@ -11247,18 +11285,19 @@ bool Compiler::impReturnInstruction(int prefixFlags, OPCODE& opcode)

if (info.compRetBuffArg != BAD_VAR_NUM)
{
var_types retBuffType = lvaGetDesc(info.compRetBuffArg)->TypeGet();
// Assign value to return buff (first param)
GenTree* retBuffAddr =
gtNewLclvNode(info.compRetBuffArg, TYP_BYREF DEBUGARG(impCurStmtDI.GetLocation().GetOffset()));
gtNewLclvNode(info.compRetBuffArg, retBuffType DEBUGARG(impCurStmtDI.GetLocation().GetOffset()));

op2 = impStoreStructPtr(retBuffAddr, op2, CHECK_SPILL_ALL);
op2 = impStoreStructPtr(retBuffAddr, op2, CHECK_SPILL_ALL, GTF_IND_TGT_NOT_HEAP);
impAppendTree(op2, CHECK_SPILL_NONE, impCurStmtDI);

// There are cases where the address of the implicit RetBuf should be returned explicitly.
//
if (compMethodReturnsRetBufAddr())
{
op1 = gtNewOperNode(GT_RETURN, TYP_BYREF, gtNewLclvNode(info.compRetBuffArg, TYP_BYREF));
op1 = gtNewOperNode(GT_RETURN, retBuffType, gtNewLclvNode(info.compRetBuffArg, retBuffType));
}
else
{
Expand Down Expand Up @@ -12645,6 +12684,11 @@ bool Compiler::impIsAddressInLocal(const GenTree* tree, GenTree** lclVarTreeOut)
return true;
}

if (op->OperIsScalarLocal() && (op->AsLclVarCommon()->GetLclNum() == impInlineRoot()->info.compRetBuffArg))
{
return true;
}

return false;
}

Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/jit/lclvars.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -485,7 +485,7 @@ void Compiler::lvaInitRetBuffArg(InitVarDscInfo* varDscInfo, bool useFixedRetBuf
info.compRetBuffArg = varDscInfo->varNum;

LclVarDsc* varDsc = varDscInfo->varDsc;
varDsc->lvType = TYP_BYREF;
varDsc->lvType = TYP_I_IMPL;
varDsc->lvIsParam = 1;
varDsc->lvIsRegArg = 0;

Expand Down
1 change: 1 addition & 0 deletions src/coreclr/jit/utils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1770,6 +1770,7 @@ void HelperCallProperties::init()
isNoGC = true;
FALLTHROUGH;
case CORINFO_HELP_ASSIGN_REF_ENSURE_NONHEAP:
case CORINFO_HELP_ENSURE_NONHEAP:
case CORINFO_HELP_BULK_WRITEBARRIER:
mutatesHeap = true;
break;
Expand Down
2 changes: 1 addition & 1 deletion src/coreclr/nativeaot/Runtime/inc/ModuleHeaders.h
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ struct ReadyToRunHeaderConstants
{
static const uint32_t Signature = 0x00525452; // 'RTR'

static const uint32_t CurrentMajorVersion = 11;
static const uint32_t CurrentMajorVersion = 12;
static const uint32_t CurrentMinorVersion = 0;
};

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ internal struct ReadyToRunHeaderConstants
{
public const uint Signature = 0x00525452; // 'RTR'

public const ushort CurrentMajorVersion = 11;
public const ushort CurrentMajorVersion = 12;
public const ushort CurrentMinorVersion = 0;
}
#if READYTORUN
Expand Down
2 changes: 2 additions & 0 deletions src/coreclr/tools/Common/JitInterface/CorInfoHelpFunc.cs
Original file line number Diff line number Diff line change
Expand Up @@ -284,6 +284,8 @@ which is the right helper to use to allocate an object of a given type. */
CORINFO_HELP_VALIDATE_INDIRECT_CALL, // CFG: Validate function pointer
CORINFO_HELP_DISPATCH_INDIRECT_CALL, // CFG: Validate and dispatch to pointer

CORINFO_HELP_ENSURE_NONHEAP, // Ensure that the target was not in the heap.

CORINFO_HELP_COUNT,
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -182,13 +182,6 @@ private void FakeGcScanRoots(MethodDesc method, ArgIterator argit, CORCOMPILE_GC
return;
}

// Also if the method has a return buffer, then it is the first argument, and could be an interior ref,
// so always promote it.
if (argit.HasRetBuffArg())
{
frame[_transitionBlock.GetRetBuffArgOffset(argit.HasThis)] = CORCOMPILE_GCREFMAP_TOKENS.GCREFMAP_INTERIOR;
}

//
// Now iterate the arguments
//
Expand Down
14 changes: 0 additions & 14 deletions src/coreclr/vm/frames.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1376,13 +1376,6 @@ void TransitionFrame::PromoteCallerStackHelper(promote_func* fn, ScanContext* sc
(fn)(PTR_PTR_Object(pThis), sc, CHECK_APP_DOMAIN);
}

if (argit.HasRetBuffArg())
{
PTR_PTR_VOID pRetBuffArg = dac_cast<PTR_PTR_VOID>(pTransitionBlock + argit.GetRetBuffArgOffset());
LOG((LF_GC, INFO3, " ret buf Argument promoted from" FMT_ADDR "\n", DBG_ADDR(*pRetBuffArg) ));
PromoteCarefully(fn, PTR_PTR_Object(pRetBuffArg), sc, GC_CALL_INTERIOR|CHECK_APP_DOMAIN);
}

int argOffset;
while ((argOffset = argit.GetNextOffset()) != TransitionBlock::InvalidOffset)
{
Expand Down Expand Up @@ -1982,13 +1975,6 @@ void FakeGcScanRoots(MetaSig& msig, ArgIterator& argit, MethodDesc * pMD, BYTE *
return;
}

// Also if the method has a return buffer, then it is the first argument, and could be an interior ref,
// so always promote it.
if (argit.HasRetBuffArg())
{
FakePromote((Object **)(pFrame + argit.GetRetBuffArgOffset()), &sc, GC_CALL_INTERIOR);
}

//
// Now iterate the arguments
//
Expand Down
10 changes: 10 additions & 0 deletions src/coreclr/vm/gchelpers.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1466,6 +1466,16 @@ extern "C" HCIMPL2_RAW(VOID, JIT_WriteBarrierEnsureNonHeapTarget, Object **dst,
}
HCIMPLEND_RAW

extern "C" HCIMPL1_RAW(VOID, JIT_EnsureNonHeapTarget, void *dst)
{
STATIC_CONTRACT_MODE_COOPERATIVE;
STATIC_CONTRACT_THROWS;
STATIC_CONTRACT_GC_NOTRIGGER;

_ASSERT(!GCHeapUtilities::GetGCHeap()->IsHeapPointer(dst));
}
HCIMPLEND_RAW

// This function sets the card table with the granularity of 1 byte, to avoid ghost updates
// that could occur if multiple threads were trying to set different bits in the same card.

Expand Down
1 change: 1 addition & 0 deletions src/coreclr/vm/jitinterface.h
Original file line number Diff line number Diff line change
Expand Up @@ -235,6 +235,7 @@ extern "C" FCDECL2(VOID, JIT_CheckedWriteBarrier, Object **dst, Object *ref);

extern "C" FCDECL2(VOID, JIT_WriteBarrier, Object **dst, Object *ref);
extern "C" FCDECL2(VOID, JIT_WriteBarrierEnsureNonHeapTarget, Object **dst, Object *ref);
extern "C" FCDECL1(VOID, JIT_EnsureNonHeapTarget, void *dst);

// ARM64 JIT_WriteBarrier uses special ABI and thus is not callable directly
// Copied write barriers must be called at a different location
Expand Down
31 changes: 29 additions & 2 deletions src/coreclr/vm/reflectioninvocation.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -492,6 +492,9 @@ extern "C" void QCALLTYPE RuntimeMethodHandle_InvokeMethod(
// If an exception occurs a gc may happen but we are going to dump the stack anyway and we do
// not need to protect anything.

// Allocate a local buffer for the return buffer if necessary
PVOID pLocalRetBuf = nullptr;

{
BEGINFORBIDGC();
#ifdef _DEBUG
Expand All @@ -501,8 +504,19 @@ extern "C" void QCALLTYPE RuntimeMethodHandle_InvokeMethod(
// Take care of any return arguments
if (fHasRetBuffArg)
{
PVOID pRetBuff = gc.retVal->GetData();
*((LPVOID*) (pTransitionBlock + argit.GetRetBuffArgOffset())) = pRetBuff;
_ASSERT(hasValueTypeReturn);
PTR_MethodTable pMT = retTH.GetMethodTable();
size_t localRetBufSize = retTH.GetSize();

// Allocate a local buffer. The invoked method will write the return value to this
// buffer which will be copied to gc.retVal later.
pLocalRetBuf = _alloca(localRetBufSize);
ZeroMemory(pLocalRetBuf, localRetBufSize);
*((LPVOID*) (pTransitionBlock + argit.GetRetBuffArgOffset())) = pLocalRetBuf;
if (pMT->ContainsGCPointers())
{
pValueClasses = new (_alloca(sizeof(ValueClassInfo))) ValueClassInfo(pLocalRetBuf, pMT, pValueClasses);
}
}

// copy args
Expand Down Expand Up @@ -572,6 +586,19 @@ extern "C" void QCALLTYPE RuntimeMethodHandle_InvokeMethod(
// Call the method
CallDescrWorkerWithHandler(&callDescrData);

if (fHasRetBuffArg)
{
// Copy the return value from the return buffer to the object
if (retTH.GetMethodTable()->ContainsGCPointers())
{
memmoveGCRefs(gc.retVal->GetData(), pLocalRetBuf, retTH.GetSize());
}
else
{
memcpyNoGCRefs(gc.retVal->GetData(), pLocalRetBuf, retTH.GetSize());
}
}

// It is still illegal to do a GC here. The return type might have/contain GC pointers.
if (fConstructor)
{
Expand Down
Loading