[CSIT-1950] 9000B tests with encap overhead and non-dpdk plugins see fragmented packets #4032

Open
vvalderrv opened this issue Feb 4, 2025 · 8 comments

Comments

@vvalderrv
Contributor

Description

Seen in the rls2404 coverage job. The packet trace [0] shows mtu:9000 in ip4-rewrite; it is not yet clear why. Either the CSIT test forgets to set a higher value somewhere, or VPP code inserts 9000 by default, as here [1]. That fragments cannot be longer than one buffer is a known limitation.

[0] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2402-2n-clx/21/log.html.gz#s1-s1-s1-s1-s2-t6-k3-k7-k1-k1-k1-k8-k14-k2-k1-k1-k1-k1

[1] https://github.com/FDio/vpp/blob/455960759b5417c767ed331748c7ee76662ffd18/src/vnet/interface_funcs.h#L323
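
To illustrate the suspected mechanism, here is a minimal Python sketch of the size arithmetic. It is not taken from CSIT or VPP. It assumes that the nominal 9000B frame size counts the Ethernet header and FCS, that the tunnel headers have their standard minimum sizes (IPv4 outer header, no options), and that the effective limit is the mtu:9000 value seen in the trace. The names `ENCAP_OVERHEAD`, `outer_l3_len` and `needs_fragmentation` are hypothetical.

```python
# Sketch only: checks whether an encapsulated packet exceeds an L3 MTU.
# Assumptions (not confirmed by the ticket): the frame size includes the
# Ethernet header and FCS; outer header is IPv4 without options.

ETH_HDR = 14   # Ethernet header, bytes
FCS = 4        # Ethernet frame check sequence, bytes
IP4_HDR = 20   # IPv4 header without options
UDP_HDR = 8
VXLAN_HDR = 8
GENEVE_HDR = 8  # fixed part only, no Geneve options
GRE_HDR = 4     # base GRE header, no optional fields

# Bytes added to the inner L3 packet by each encapsulation.
ENCAP_OVERHEAD = {
    # L2 tunnels also carry the inner Ethernet header.
    "vxlan": IP4_HDR + UDP_HDR + VXLAN_HDR + ETH_HDR,
    "geneve": IP4_HDR + UDP_HDR + GENEVE_HDR + ETH_HDR,
    # IP-in-GRE carries only the inner L3 packet.
    "gre": IP4_HDR + GRE_HDR,
}


def outer_l3_len(frame_size: int, encap: str) -> int:
    """Return the outer IP packet length for a given L2 frame size."""
    inner_l3 = frame_size - ETH_HDR - FCS
    return inner_l3 + ENCAP_OVERHEAD[encap]


def needs_fragmentation(frame_size: int, encap: str, mtu: int = 9000) -> bool:
    """True if the encapsulated packet is longer than the L3 MTU."""
    return outer_l3_len(frame_size, encap) > mtu


if __name__ == "__main__":
    for encap in ENCAP_OVERHEAD:
        length = outer_l3_len(9000, encap)
        print(encap, length, needs_fragmentation(9000, encap))
```

Under these assumptions every listed encapsulation pushes a 9000B frame past an L3 MTU of 9000, which would be consistent with the fragmentation seen in the trace.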

Assignee

Unassigned

Reporter

Vratko Polak

Comments

  • vrpolak (Tue, 19 Nov 2024 11:56:08 +0000): For completeness, some lisp tests are also affected, e.g. ethip6lispip4-ip6base [9] and ethip6lispip6-ip6base.

[9] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2410-3nb-spr/9/log.html.gz#s1-s1-s1-s1-s1-t3-k3-k7-k1-k1-k1-k8-k14-k1-k1-k1-k1

  • vrpolak (Mon, 18 Nov 2024 12:05:17 +0000): Maglev symptom still present [8] in rls2410.

[8] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2410-2n-icx/25/log.html.gz#s1-s1-s1-s1-s2-t6-k3-k7-k1-k1-k1-k8-k14-k2-k1-k1-k1-k1

  • vrpolak (Wed, 31 Jul 2024 10:21:37 +0000): A different symptom with the same cause (internal MTU of 9000B) appears with the dpdk plugin: in tests without fragmentation (e.g. SRv6), the packets are dropped [7]. Not opening a separate ticket for that yet, as any fix for this issue is likely to fix that as well.

[7] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2406-3n-alt/22/log.html.gz#s1-s1-s1-s1-s1-t7-k3-k7-k1-k1-k1-k8-k14-k2-k1-k1-k1-k1

  • vrpolak (Tue, 30 Jul 2024 11:44:46 +0000): Maglev applies GRE encapsulation, so it is also affected [6].

[6] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2406-2n-clx/24/log.html.gz#s1-s1-s1-s1-s2-t6-k3-k7-k1-k1-k1-k8-k14-k2-k1-k1-k1-k1

  • vrpolak (Tue, 30 Jul 2024 08:58:04 +0000): In rls2406 this still fails with the AVF plugin. A verify run [5] confirms that the rdma-core plugin also fails, but the dpdk plugin (including mlx5) does not. More investigation is needed to see whether the issue is in CSIT or in VPP.

[5] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-master-2n-clx/1285/log.html.gz#s1-s1-s1-s1-s3-t1-k3-k7-k1-k1-k1-k8-k14-k1-k1-k1-k1

  • vrpolak (Thu, 27 Jun 2024 13:31:37 +0000): I expect this to be fixed by [4], but waiting for coverage runs to confirm.

[4] 40901: fix(perf): Increase threshold for jumbo | https://gerrit.fd.io/r/c/csit/+/40901

  • vrpolak (Wed, 27 Mar 2024 14:31:05 +0000): VXLAN has a different symptom [3]: the packets are fragmented, but reassembly fails because there are more than 3 fragments per packet.

[3] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-report-coverage-2402-3n-icx/29/log.html.gz#s1-s1-s1-s1-s1-t6-k3-k7-k1-k1-k1-k8-k14-k1-k1-k1-k1

  • vrpolak (Wed, 27 Mar 2024 09:54:53 +0000): Another example: geneve [2].

[2] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-verify-master-2n-icx/336/log.html.gz#s1-s1-s1-s1-s1-t1-k3-k7-k1-k1-k1-k8-k14-k2-k1-k1-k1-k1
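
The VXLAN comment above (more than 3 fragments per packet) follows from the one-buffer limit on fragment length. A rough Python sketch of the fragment count is given below. The 2048-byte per-fragment limit is an assumption standing in for the buffer size, the 9032-byte packet length assumes a 9000B frame with VXLAN over IPv4, and `count_fragments` is a hypothetical helper, not a VPP function.

```python
import math


def count_fragments(l3_len: int, max_fragment_len: int, ip_hdr_len: int = 20) -> int:
    """Estimate how many IPv4 fragments a packet of l3_len bytes needs.

    max_fragment_len is the largest allowed fragment (header included).
    Assumption: it stands in for the one-buffer limit from the ticket.
    """
    if l3_len <= max_fragment_len:
        return 1  # fits as is, no fragmentation needed
    payload = l3_len - ip_hdr_len
    # Payload of every fragment except the last must be a multiple of 8 bytes.
    per_fragment = (max_fragment_len - ip_hdr_len) // 8 * 8
    return math.ceil(payload / per_fragment)


if __name__ == "__main__":
    # Assumed values: 9032 B outer packet, 2048 B per-fragment limit.
    fragments = count_fragments(9032, 2048)
    print(fragments, fragments > 3)
```

With these assumed numbers the packet splits into 5 fragments, above the limit of 3 mentioned in the comment. Actual values depend on the configured buffer size and encapsulation.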

Original issue: https://jira.fd.io/browse/CSIT-1950
