Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(otelgin): remove multipartform temporary file #6609

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

Yiling-J
Copy link

Fix #5946

@Yiling-J Yiling-J requested a review from a team as a code owner January 13, 2025 13:45
Copy link

linux-foundation-easycla bot commented Jan 13, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@github-actions github-actions bot requested a review from akats7 January 13, 2025 13:45
Copy link

codecov bot commented Jan 13, 2025

Codecov Report

Attention: Patch coverage is 75.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.6%. Comparing base (9640704) to head (65ce788).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
...ation/github.com/gin-gonic/gin/otelgin/gintrace.go 75.0% 2 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #6609     +/-   ##
=======================================
+ Coverage   68.5%   68.6%   +0.1%     
=======================================
  Files        200     200             
  Lines      16781   16882    +101     
=======================================
+ Hits       11500   11589     +89     
- Misses      4935    4943      +8     
- Partials     346     350      +4     
Files with missing lines Coverage Δ
...ation/github.com/gin-gonic/gin/otelgin/gintrace.go 87.8% <75.0%> (-2.1%) ⬇️

... and 2 files with indirect coverage changes

_ = c.AbortWithError(http.StatusInternalServerError, err)
return
}
_ = ff
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the content of ff can be detected here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of the test is to ensure that temporary files are automatically removed, not to test multipart file parsing. The sole purpose of using c.FormFile is to trigger multipart parsing, which creates temporary files. I added a fileHeader.Open() test in this function to confirm that the temporary file is available. However, I think checking the value of ff is unnecessary.

//
// This means that when we are on returning path from handler middlewares up in chain from this middleware
// can not access these temporary files anymore because we deleted them here.
if c.Request.MultipartForm != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering, if there's another middleware layer outside this one, would it be removed because of this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to say, but in all cases, regardless of how many middlewares are chained, temporary files should be removed. I didn't check the return error of RemoveAll here, which aligns with the behavior in both Go's http package and echo-contrib. I believe each middleware should independently attempt to remove temporary files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@flc1125
Copy link
Member

flc1125 commented Jan 13, 2025

I have written 2 examples to facilitate the analysis of this issue.

I believe that gin falls into the first category, so I think the main logic of the modifications in this PR should be fine.

Here I am using a URL to simulate the behavior of MultipartForm, and I believe their logic should be consistent.


  • The first example:
package gin

import (
	"context"
	"net/http"
	"net/url"
	"testing"

	"github.com/stretchr/testify/assert"
)

type handler func(t *testing.T, req *http.Request)

var test1Handler handler = func(t *testing.T, req *http.Request) {
	t.Logf("req1.0: %p, req.URL: %p", req, req.URL)
}

var test2Handler handler = func(t *testing.T, req *http.Request) {
	t.Logf("req2.0: %p, req.URL: %p", req, req.URL)

	saveCtx := req.Context()
	defer func() {
		t.Logf("req2.3: %p, req.URL: %p", req, req.URL)
		req = req.WithContext(saveCtx)
		t.Logf("req2.4: %p, req.URL: %p", req, req.URL)
	}()

	defer func() {
		t.Logf("req2.1: %p, req.URL: %p", req, req.URL)
		req.URL = &url.URL{
			Scheme: "https",
		}
		t.Logf("req2.2: %p, req.URL: %p", req, req.URL)
	}()

	req = req.WithContext(context.WithValue(req.Context(), "key", "value"))
}

var test3Handler handler = func(t *testing.T, req *http.Request) {
	t.Logf("req3.0: %p, req.URL: %p", req, req.URL)
	t.Logf("req3.1: %v", req.Context().Value("key"))
}

func TestIssue_Middleware(t *testing.T) {
	req, err := http.NewRequest("POST", "/upload", nil)
	assert.NoError(t, err)
	req.URL = &url.URL{
		Scheme: "http",
	}

	t.Logf("req0.0: %p, req.URL: %p", req, req.URL)

	handlers := []handler{
		test1Handler,
		test2Handler,
		test3Handler,
	}
	for _, h := range handlers {
		h(t, req)
	}
}

output:

=== RUN   TestIssue_Middleware
    issue_test.go:51: req0.0: 0x1400049ba40, req.URL: 0x1400016ca20
    issue_test.go:15: req1.0: 0x1400049ba40, req.URL: 0x1400016ca20
    issue_test.go:19: req2.0: 0x1400049ba40, req.URL: 0x1400016ca20
    issue_test.go:29: req2.1: 0x1400049bb80, req.URL: 0x1400016ca20
    issue_test.go:33: req2.2: 0x1400049bb80, req.URL: 0x1400016cab0
    issue_test.go:23: req2.3: 0x1400049bb80, req.URL: 0x1400016cab0
    issue_test.go:25: req2.4: 0x1400049bcc0, req.URL: 0x1400016cab0
    issue_test.go:40: req3.0: 0x1400049ba40, req.URL: 0x1400016ca20
    issue_test.go:41: req3.1: <nil>
--- PASS: TestIssue_Middleware (0.00s)
  • The secend example:
package gin

import (
	"context"
	"net/http"
	"net/url"
	"testing"

	"github.com/stretchr/testify/assert"
)

type handler func(t *testing.T, req *http.Request) *http.Request

var test1Handler handler = func(t *testing.T, req *http.Request) *http.Request {
	t.Logf("req1.0: %p, req.URL: %p", req, req.URL)
	return req
}

var test2Handler handler = func(t *testing.T, req *http.Request) *http.Request {
	t.Logf("req2.0: %p, req.URL: %p", req, req.URL)

	saveCtx := req.Context()
	defer func() {
		t.Logf("req2.3: %p, req.URL: %p", req, req.URL)
		req = req.WithContext(saveCtx)
		t.Logf("req2.4: %p, req.URL: %p", req, req.URL)
	}()

	defer func() {
		t.Logf("req2.1: %p, req.URL: %p", req, req.URL)
		req.URL = &url.URL{
			Scheme: "https",
		}
		t.Logf("req2.2: %p, req.URL: %p", req, req.URL)
	}()

	req = req.WithContext(context.WithValue(req.Context(), "key", "value"))

	return req
}

var test3Handler handler = func(t *testing.T, req *http.Request) *http.Request {
	t.Logf("req3.0: %p, req.URL: %p", req, req.URL)
	t.Logf("req3.1: %v", req.Context().Value("key"))
	return req
}

func TestIssue_Middleware(t *testing.T) {
	req, err := http.NewRequest("POST", "/upload", nil)
	assert.NoError(t, err)
	req.URL = &url.URL{
		Scheme: "http",
	}

	t.Logf("req0.0: %p, req.URL: %p", req, req.URL)

	handlers := []handler{
		test1Handler,
		test2Handler,
		test3Handler,
	}
	for _, h := range handlers {
		req = h(t, req)
	}
}

output:

=== RUN   TestIssue_Middleware
    issue_test.go:55: req0.0: 0x140001b6500, req.URL: 0x14000130360
    issue_test.go:15: req1.0: 0x140001b6500, req.URL: 0x14000130360
    issue_test.go:20: req2.0: 0x140001b6500, req.URL: 0x14000130360
    issue_test.go:30: req2.1: 0x140001b6640, req.URL: 0x14000130360
    issue_test.go:34: req2.2: 0x140001b6640, req.URL: 0x140001303f0
    issue_test.go:24: req2.3: 0x140001b6640, req.URL: 0x140001303f0
    issue_test.go:26: req2.4: 0x140001b6780, req.URL: 0x140001303f0
    issue_test.go:43: req3.0: 0x140001b6640, req.URL: 0x140001303f0
    issue_test.go:44: req3.1: value
--- PASS: TestIssue_Middleware (0.00s)
PASS

@@ -85,6 +85,17 @@ func Middleware(service string, opts ...Option) gin.HandlerFunc {

// pass the span through the request context
c.Request = c.Request.WithContext(ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we be fixing this rather more easily with the following:

Suggested change
c.Request = c.Request.WithContext(ctx)
c.Request = c.Request.Clone(ctx)

WithContext performs a shallow clone, and therefore doesn't clone the multipart form.
However, Clone performs a deep copy, and makes an explicit clone of MultipartForm.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is, go http request will call MultipartForm.RemoveAll() automatically when request finishes. However, the otelgin middleware creates a new request, and temporary files will be stored in this new request's MultipartForm field. Unfortunately, no one is calling MultipartForm.RemoveAll() on this new request.

A deep clone wouldn’t resolve the problem because the MultipartForm is also copied. A shallow clone might seem like it should work since the MultipartForm field is shared, but it doesn’t. This is because the MultipartForm field is lazily initialized—it’s only assigned a value when you parse the form. As a result, in the middleware, the field is nil.

Related issues: golang/go#58809, labstack/echo#2413

I think other OpenTelemetry middlewares, such as otelmux and otelehco, might also have the same issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we parse the form before making the new request?
Wouldn't we share the object then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach could work, but it feels a bit unusual to enforce form parsing in the middleware.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having to force remove multipart files is a bit unusual too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmathieu maybe we should discuss this in the SIG since it might have implications in the other instrumentation modules

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't be able to attend this week's meeting (it's the week where it's late for EU). But yes, feel free to bring it there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Yiling-J what are you thoughts on parsing the form in the middleware, the more I think about it, the more that seems to be the better approach, for example what about certain edge cases where down the chain middlewares might want to access the files

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akats7 Another option is to save c.Request to a variable before calling c.Request.WithContext(ctx). Then, in the defer function, if c.Request.MultipartForm != nil, we can reattach the MultipartForm to the original request. This way, there won’t be any automatic removing or parsing. If other middlewares handle the form correctly, this approach should work.
In http/server.go, the ServeHTTP method is called first, which invokes the Gin handlers/middlewares, and then finishRequest is called, which removes the temporary files.

I haven’t tested yet, but I think it should work. However, I’m considering opening an issue in Go to get their suggestions. Since it’s closely related to the http package and the WithContext method is widely used by middlewares and frameworks, I think they might be able to suggest best practice for this situation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akats7 It seems I can’t get any help from the Go team. I’m not in favor of parsing the form in the middleware, as this step should be explicitly done by the user. Parsing large forms can consume resources, and the ParseMultipartForm function has a maxMemory parameter that should also be decided by the user.

After giving this more thought over the past few days, I believe this is more of a Go design issue. The WithContext and ParseMultipartForm behavior is implicit, hard to notice, poorly documented, and lacks a clear best practice solution. I think the reattach approach might be better because it’s less aggressive and can likely be safely applied to other OpenTelemetry middlewares facing the same issue. The only downside, compared to RemoveAll, is that it relies on other middlewares to also handle WithContext and ParseMultipartForm correctly. If all middlewares do the right thing, net/http should eventually clean up the temporary files automatically.

@Yiling-J
Copy link
Author

I skipped the failed test on the Windows platform. This is consistent with how Go handles form cleanup tests (https://github.com/golang/go/blob/6da16013ba4444e0d71540f68279f0283a92d05d/src/net/http/serve_test.go#L7028), so I think it’s fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using otelgin.Middleware will cause temporary MultipartForm files not to be destroyed.
4 participants