KM XDPAPI prototype #409

Open
wants to merge 49 commits into
base: main

nigriMSFT
Contributor

KM XDPAPI prototype + hacks to test with xskbench and measure perf

@nigriMSFT
Contributor Author

Here are some results comparing the user-mode and kernel-mode APIs, taken on a physical machine. The differences are mostly insignificant, as expected.

perf suite commands:

.\tools\xskperfsuite.ps1 -Verbose -Fndis -RawResultsFile "artifacts/logs/xskperfsuite-user.csv" -XperfDirectory "artifacts/logs" -Config Release
.\tools\xskperfsuite.ps1 -Verbose -Fndis -RawResultsFile "artifacts/logs/xskperfsuite-kernel.csv" -XperfDirectory "artifacts/logs" -KernelMode -Config Release

PS N:\repos\xdp-for-windows> .\tools\xskperfcompare.ps1 -DataFile1 N:\shared\nigri\httpperf\xskperfsuite-user.csv -DataFile2 N:\shared\nigri\httpperf\xskperfsuite-kernel-cleaned.csv
Test Case                                               Avg1   Avg2  %Diff  Significance
XDPMP-NATIVE-RX-BUSY-2048chunksize-64iosize-FNDIS      19314  19178     -1  NOT Significant
XDPMP-NATIVE-RX-BUSY-2048chunksize-1514iosize-FNDIS     5491   5478      0  NOT Significant
XDPMP-NATIVE-TX-BUSY-2048chunksize-64iosize-FNDIS      34434  34267      0  NOT Significant
XDPMP-NATIVE-TX-BUSY-2048chunksize-1514iosize-FNDIS    34462  34305      0  NOT Significant
XDPMP-NATIVE-FWD-BUSY-2048chunksize-64iosize-FNDIS     12897  12956      0  NOT Significant
XDPMP-NATIVE-FWD-BUSY-2048chunksize-1514iosize-FNDIS    4791   4784      0  NOT Significant
XDPMP-GENERIC-RX-BUSY-2048chunksize-64iosize-FNDIS      8699   8896      2  NOT Significant
XDPMP-GENERIC-RX-BUSY-2048chunksize-1514iosize-FNDIS    3832   3809     -1  NOT Significant
XDPMP-GENERIC-TX-BUSY-2048chunksize-64iosize-FNDIS      5691   5805      2  NOT Significant
XDPMP-GENERIC-TX-BUSY-2048chunksize-1514iosize-FNDIS    3455   3529      2  NOT Significant
XDPMP-GENERIC-FWD-BUSY-2048chunksize-64iosize-FNDIS     1841   1935      5  Significant
XDPMP-GENERIC-FWD-BUSY-2048chunksize-1514iosize-FNDIS   1214   1277      5  NOT Significant

* I used xskperfsuite-kernel-cleaned.csv because I had to remove the "KERNEL" added to each test case name so that xskperfcompare could compare the two files.
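
The %Diff column in the table above is consistent with a plain relative change from Avg1 to Avg2, rounded to the nearest integer. The sketch below assumes that formula; the actual computation inside xskperfcompare is not shown in this thread, and `percent_diff` is a hypothetical helper name. It does not attempt to reproduce the Significance column.

```c
#include <assert.h>

/*
 * Relative change from avg1 to avg2 in percent, rounded to the nearest
 * integer (halves round away from zero).
 * ASSUMPTION: this formula is inferred from the table; it is not taken
 * from the xskperfcompare script.
 */
long percent_diff(double avg1, double avg2)
{
    double pct;

    assert(avg1 != 0.0); /* a zero baseline has no relative change */

    pct = (avg2 - avg1) / avg1 * 100.0;
    return (long)(pct >= 0.0 ? pct + 0.5 : pct - 0.5);
}
```

For example, `percent_diff(1841, 1935)` evaluates to 5 and `percent_diff(19314, 19178)` evaluates to -1, matching the GENERIC-FWD 64-byte and NATIVE-RX 64-byte rows.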

Resolved review threads: published/external/xdpapi.h (3, outdated), src/xdp/apikernel.c (1, outdated), src/xdp/xsk.c (5 outdated, 1 current), test/xskbench/xskbench.c (1, outdated), src/xdp/offload.c (1, outdated)

inline
VOID
PlatWaitThread(PLAT_THREAD *P, ULONG TimeoutMs)
Contributor

Can we take a dependency on the CxPlat library used by MsQuic/usersim?

Member

Unfortunately, not yet. We haven't done all the refactoring to move stuff into microsoft/cxplat yet.

Contributor Author

@nigriMSFT nigriMSFT Feb 3, 2024

That cxplat repository looks stale. Should we take a dependency on / contribute to https://github.com/microsoft/usersim/tree/main/cxplat ?

Member

No, don't add any extra dependencies yet.

Contributor

Hmm, why not? A dependency seems better than duplicating code to me.

Contributor Author

OK, I now have the context: providing these facilities in cxplat is the end goal, but it just hasn't been prioritized yet. I think we can cheaply add a cxplat copy into xdp now and migrate to the proper cxplat once it is in place. This avoids the extra work of adding and then untangling a temporary dependency on usersim.

microsoft/usersim#167

Contributor Author

@nigriMSFT nigriMSFT Apr 10, 2024

@nibanks This is being brought up again. cxplat copies are littered throughout Microsoft GitHub repositories (msquic, ebpf-for-windows, usersim, ntosebpfext, win-net-test). Adding another in xdp brings the count higher than can be counted on one hand. What is the tipping point here? Would it be reasonable to start populating the cxplat repo with what xdp needs and have xdp be the first consumer?

Member

I absolutely want to move stuff there, but just don't have the time. But I do think we've reached that tipping point.

Contributor Author

I can start this work by bringing in the cxplat pieces that xdp needs. Other workstreams can contribute more to cxplat as needed and eliminate cxplat copies when ready/possible. Does that seem like a reasonable plan?

Resolved review thread: test/xskbench/xskbench.c (outdated)
@mtfriesen
Contributor

Should XskRequiresTxBounceBuffer return false for kernel sockets? The idea was that user-mode sockets in generic mode cannot be trusted with respect to TOCTOU (time-of-check to time-of-use) races, but for kernel sockets we could rely on the caller to promise not to modify the packet during TX.
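
To illustrate the trade-off in this question, here is a minimal, portable sketch of the bounce-buffer idea. It is not the XDP implementation: `tx_frame`, `tx_select_buffer`, `MAX_FRAME`, and the `caller_trusted` flag are all hypothetical names invented for this example. For an untrusted caller, the frame is snapshotted into a private buffer so later writes to the shared memory cannot change what gets validated and sent; for a trusted caller, the copy is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAME 2048 /* hypothetical upper bound on frame size */

/* Hypothetical TX descriptor: the frame lives in memory that the socket
 * owner can still write to while the transmit is in flight. */
typedef struct {
    const unsigned char *shared; /* caller-visible frame memory */
    size_t length;               /* frame length in bytes */
} tx_frame;

/*
 * Returns the buffer to validate and transmit from, or NULL if the frame
 * is too large for the bounce buffer.
 *
 * Untrusted caller: copy into the private bounce buffer first, so checks
 * and transmission both see the same bytes (no TOCTOU window).
 * Trusted caller: transmit in place and rely on the caller's promise not
 * to modify the frame during TX.
 */
const unsigned char *
tx_select_buffer(const tx_frame *frame, bool caller_trusted,
                 unsigned char *bounce)
{
    assert(frame != NULL && bounce != NULL);

    if (frame->length > MAX_FRAME) {
        return NULL;
    }
    if (caller_trusted) {
        return frame->shared;
    }
    memcpy(bounce, frame->shared, frame->length);
    return bounce;
}
```

The cost of the untrusted path is one copy per frame, which is the overhead a kernel-mode caller could avoid if it is allowed to opt out.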

src/xdp/xsk.c Outdated
goto Exit;
}
if (RequestorMode != KernelMode) {
Umem->Mapping.Mdl =
Contributor

Don't we always need an MDL, or at least need it if the bounce buffer is disabled and the underlying interface uses the MDL extension? Should be easy to repro a bugcheck if the XskDisableTxBounce regkey is set.

Contributor Author

Fixed. Needs a test case though

</Link>
</ItemDefinitionGroup>
<Target Name="BuildCxPlat" BeforeTargets="PrepareForBuild">
<MSBuild Projects="$(SolutionDir)submodules\cxplat\src\lib\cxplat.kernel.vcxproj" Properties="UndockedOut=$(OutDir)cxplat\"/>
Contributor

Hm, we need to find a way to avoid building cxplat multiple times. We probably need to create a single dummy project in xdp, now that it's being referenced in multiple places.

@ami-GS ami-GS force-pushed the nigriMSFT/km-proto branch 2 times, most recently from 8436a65 to 7ec7922 Compare October 2, 2024 06:40

inline
NTSTATUS
XdpNmrClientDetachProvider(
Member

According to MSDN this may be called at DISPATCH_LEVEL. This means we must not use locks that require PASSIVE_LEVEL, right?

Also, on the question of the client cleaning up in the Detach call: that means it could not wait for cleanup, right? It seems odd to require that cleanup always support DISPATCH_LEVEL...

Contributor Author

Also in the documentation, a client can return STATUS_PENDING and then call NmrClientDetachProviderComplete at a later time. Client "cleanup" logic can be moved to ClientCleanupBindingContext which happens after both client and provider have completed detach.
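
The deferred-detach pattern described in this comment can be modeled in portable C as follows. This is only a sketch under stated assumptions: `binding`, `binding_detach`, `binding_call_done`, and `signal_detach_complete` are hypothetical names, the last one standing in for a call to NmrClientDetachProviderComplete. The model is single-threaded; real kernel code would need interlocked operations or rundown protection around the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for STATUS_SUCCESS and STATUS_PENDING. */
typedef enum { DETACH_SUCCESS, DETACH_PENDING } detach_status;

/* Hypothetical per-binding state kept by the client. */
typedef struct {
    int  outstanding_calls; /* calls into the provider still in flight */
    bool detach_requested;  /* detach callback has already run */
    int  completions;       /* times detach completion was signalled */
} binding;

/* Stand-in for NmrClientDetachProviderComplete in this model. */
static void signal_detach_complete(binding *b)
{
    b->completions++;
}

/* Detach callback: never waits. If calls are still in flight, report
 * "pending" and let the last in-flight call finish the detach. */
detach_status binding_detach(binding *b)
{
    assert(!b->detach_requested);
    b->detach_requested = true;
    return b->outstanding_calls == 0 ? DETACH_SUCCESS : DETACH_PENDING;
}

/* Called when one in-flight call into the provider returns. */
void binding_call_done(binding *b)
{
    assert(b->outstanding_calls > 0);
    b->outstanding_calls--;
    if (b->outstanding_calls == 0 && b->detach_requested) {
        signal_detach_complete(b);
    }
}
```

Because the detach callback only flips a flag and reads a counter, nothing in it needs to block, which is what makes it compatible with being invoked at DISPATCH_LEVEL.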

As far as I understand, it is possible to use PASSIVE_LEVEL lock with DISPATCH_LEVEL function.
If components at multiple levels are mixed, the lower IRQL (PASSIVE_LEVEL) needs to be used.

Contributor Author

It is possible, but the code will be inherently buggy. NickB is correct that we must not use passive level locks. Windows Internals is a highly recommended resource for this concept; I'll send a link offline.

Contributor

clientDetachProvider and providerDetachClient are called at dispatch level only when you deregister client/provider at dispatch level. If we have not treated these as functions that can be called at dispatch elsewhere in xdp like XdpNmrClientDetachProvider, why do we care now?

Member

@nibanks nibanks Oct 11, 2024

> As far as I understand, it is possible to use PASSIVE_LEVEL lock with DISPATCH_LEVEL function.

As NickG mentioned, you cannot use a passive lock at DISPATCH_LEVEL (because the lock can block/wait, which isn't allowed at dispatch). But you can use a dispatch lock at PASSIVE_LEVEL.

> clientDetachProvider and providerDetachClient are called at dispatch level only when you deregister client/provider at dispatch level. If we have not treated these as functions that can be called at dispatch elsewhere in xdp like XdpNmrClientDetachProvider, why do we care now?

The NMR interface is generically defined to allow DISPATCH_LEVEL. My preference would be to not impose restrictions based on some assumptions for how XDP uses the interface today. What if XDP needs to be updated to deregister at dispatch in the future? @mtfriesen any opinion here?
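
The lock rule stated in this comment can be written down as a small model. This is a simplified sketch, not driver code: `irql_t`, `lock_kind_t`, and `lock_usable_at` are hypothetical names, and the model assumes that blocking locks may only be acquired below DISPATCH_LEVEL while spin locks may be acquired at or below DISPATCH_LEVEL.

```c
#include <assert.h>
#include <stdbool.h>

/* The three lowest Windows IRQLs, in increasing order. */
typedef enum {
    IRQL_PASSIVE  = 0,
    IRQL_APC      = 1,
    IRQL_DISPATCH = 2
} irql_t;

/* Hypothetical classification of locks for this model. */
typedef enum {
    LOCK_BLOCKING, /* may wait for the owner; waiting is not allowed at dispatch */
    LOCK_SPIN      /* busy-waits and runs its critical section at dispatch */
} lock_kind_t;

/* Highest IRQL at which each kind of lock may be acquired.
 * ASSUMPTION: blocking locks are capped at APC level in this model. */
static irql_t lock_max_irql(lock_kind_t kind)
{
    return kind == LOCK_SPIN ? IRQL_DISPATCH : IRQL_APC;
}

/* True if code running at `current` may acquire a lock of `kind`. */
bool lock_usable_at(lock_kind_t kind, irql_t current)
{
    assert(current <= IRQL_DISPATCH); /* higher IRQLs are out of scope here */
    return current <= lock_max_irql(kind);
}
```

Under this model, a callback that may run at DISPATCH_LEVEL can only take `LOCK_SPIN` locks, while the same spin lock remains usable from PASSIVE_LEVEL callers, which matches the asymmetry described above.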

Contributor

For code that is compiled into 3rd party code rather than something we own ourselves, we should do the right thing. IMO we should also fix code in XDP that makes incorrect assumptions about IRQL, but do that in a separate PR.

* unified close handle api

* add one more API to XDP_FILE_DISPATCH

* XdpDeleteProgram to follow the change, fix static annotations