-
Notifications
You must be signed in to change notification settings - Fork 591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(otelgin): remove multipartform temporary file #6609
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6609 +/- ##
=======================================
+ Coverage 68.5% 68.6% +0.1%
=======================================
Files 200 200
Lines 16781 16882 +101
=======================================
+ Hits 11500 11589 +89
- Misses 4935 4943 +8
- Partials 346 350 +4
|
_ = c.AbortWithError(http.StatusInternalServerError, err) | ||
return | ||
} | ||
_ = ff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the content of ff
can be detected here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goal of the test is to ensure that temporary files are automatically removed, not to test multipart file parsing. The sole purpose of using c.FormFile
is to trigger multipart parsing, which creates temporary files. I added a fileHeader.Open()
test in this function to confirm that the temporary file is available. However, I think checking the value of ff
is unnecessary.
instrumentation/github.com/gin-gonic/gin/otelgin/test/gintrace_test.go
Outdated
Show resolved
Hide resolved
// | ||
// This means that when we are on returning path from handler middlewares up in chain from this middleware | ||
// can not access these temporary files anymore because we deleted them here. | ||
if c.Request.MultipartForm != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering, if there's another middleware layer outside this one, would it be removed because of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to say, but in all cases, regardless of how many middlewares are chained, temporary files should be removed. I didn't check the return error of RemoveAll
here, which aligns with the behavior in both Go's http
package and echo-contrib. I believe each middleware should independently attempt to remove temporary files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see: #6609 (comment)
I have written 2 examples to facilitate the analysis of this issue. I believe that gin falls into the first category, so I think the main logic of the modifications in this PR should be fine.
package gin
import (
"context"
"net/http"
"net/url"
"testing"
"github.com/stretchr/testify/assert"
)
type handler func(t *testing.T, req *http.Request)
var test1Handler handler = func(t *testing.T, req *http.Request) {
t.Logf("req1.0: %p, req.URL: %p", req, req.URL)
}
var test2Handler handler = func(t *testing.T, req *http.Request) {
t.Logf("req2.0: %p, req.URL: %p", req, req.URL)
saveCtx := req.Context()
defer func() {
t.Logf("req2.3: %p, req.URL: %p", req, req.URL)
req = req.WithContext(saveCtx)
t.Logf("req2.4: %p, req.URL: %p", req, req.URL)
}()
defer func() {
t.Logf("req2.1: %p, req.URL: %p", req, req.URL)
req.URL = &url.URL{
Scheme: "https",
}
t.Logf("req2.2: %p, req.URL: %p", req, req.URL)
}()
req = req.WithContext(context.WithValue(req.Context(), "key", "value"))
}
var test3Handler handler = func(t *testing.T, req *http.Request) {
t.Logf("req3.0: %p, req.URL: %p", req, req.URL)
t.Logf("req3.1: %v", req.Context().Value("key"))
}
func TestIssue_Middleware(t *testing.T) {
req, err := http.NewRequest("POST", "/upload", nil)
assert.NoError(t, err)
req.URL = &url.URL{
Scheme: "http",
}
t.Logf("req0.0: %p, req.URL: %p", req, req.URL)
handlers := []handler{
test1Handler,
test2Handler,
test3Handler,
}
for _, h := range handlers {
h(t, req)
}
} output:
package gin
import (
"context"
"net/http"
"net/url"
"testing"
"github.com/stretchr/testify/assert"
)
type handler func(t *testing.T, req *http.Request) *http.Request
var test1Handler handler = func(t *testing.T, req *http.Request) *http.Request {
t.Logf("req1.0: %p, req.URL: %p", req, req.URL)
return req
}
var test2Handler handler = func(t *testing.T, req *http.Request) *http.Request {
t.Logf("req2.0: %p, req.URL: %p", req, req.URL)
saveCtx := req.Context()
defer func() {
t.Logf("req2.3: %p, req.URL: %p", req, req.URL)
req = req.WithContext(saveCtx)
t.Logf("req2.4: %p, req.URL: %p", req, req.URL)
}()
defer func() {
t.Logf("req2.1: %p, req.URL: %p", req, req.URL)
req.URL = &url.URL{
Scheme: "https",
}
t.Logf("req2.2: %p, req.URL: %p", req, req.URL)
}()
req = req.WithContext(context.WithValue(req.Context(), "key", "value"))
return req
}
var test3Handler handler = func(t *testing.T, req *http.Request) *http.Request {
t.Logf("req3.0: %p, req.URL: %p", req, req.URL)
t.Logf("req3.1: %v", req.Context().Value("key"))
return req
}
func TestIssue_Middleware(t *testing.T) {
req, err := http.NewRequest("POST", "/upload", nil)
assert.NoError(t, err)
req.URL = &url.URL{
Scheme: "http",
}
t.Logf("req0.0: %p, req.URL: %p", req, req.URL)
handlers := []handler{
test1Handler,
test2Handler,
test3Handler,
}
for _, h := range handlers {
req = h(t, req)
}
} output:
|
instrumentation/github.com/gin-gonic/gin/otelgin/test/gintrace_test.go
Outdated
Show resolved
Hide resolved
@@ -85,6 +85,17 @@ func Middleware(service string, opts ...Option) gin.HandlerFunc { | |||
|
|||
// pass the span through the request context | |||
c.Request = c.Request.WithContext(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't we be fixing this rather more easily with the following:
c.Request = c.Request.WithContext(ctx) | |
c.Request = c.Request.Clone(ctx) |
WithContext performs a shallow clone, and therefore doesn't clone the multipart form.
However, Clone performs a deep copy, and makes an explicit clone of MultipartForm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is, go http request will call MultipartForm.RemoveAll() automatically when request finishes. However, the otelgin middleware creates a new request, and temporary files will be stored in this new request's MultipartForm field. Unfortunately, no one is calling MultipartForm.RemoveAll() on this new request.
A deep clone wouldn’t resolve the problem because the MultipartForm is also copied. A shallow clone might seem like it should work since the MultipartForm field is shared, but it doesn’t. This is because the MultipartForm field is lazily initialized—it’s only assigned a value when you parse the form. As a result, in the middleware, the field is nil.
Related issues: golang/go#58809, labstack/echo#2413
I think other OpenTelemetry middlewares, such as otelmux and otelehco, might also have the same issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we parse the form before making the new request?
Wouldn't we share the object then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach could work, but it feels a bit unusual to enforce form parsing in the middleware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having to force remove multipart files is a bit unusual too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmathieu maybe we should discuss this in the SIG since it might have implications in the other instrumentation modules
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I won't be able to attend this week's meeting (it's the week where it's late for EU). But yes, feel free to bring it there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Yiling-J what are you thoughts on parsing the form in the middleware, the more I think about it, the more that seems to be the better approach, for example what about certain edge cases where down the chain middlewares might want to access the files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akats7 Another option is to save c.Request
to a variable before calling c.Request.WithContext(ctx)
. Then, in the defer function, if c.Request.MultipartForm != nil
, we can reattach the MultipartForm
to the original request. This way, there won’t be any automatic removing or parsing. If other middlewares handle the form correctly, this approach should work.
In http/server.go, the ServeHTTP
method is called first, which invokes the Gin handlers/middlewares, and then finishRequest
is called, which removes the temporary files.
I haven’t tested yet, but I think it should work. However, I’m considering opening an issue in Go to get their suggestions. Since it’s closely related to the http
package and the WithContext
method is widely used by middlewares and frameworks, I think they might be able to suggest best practice for this situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@akats7 It seems I can’t get any help from the Go team. I’m not in favor of parsing the form in the middleware, as this step should be explicitly done by the user. Parsing large forms can consume resources, and the ParseMultipartForm
function has a maxMemory
parameter that should also be decided by the user.
After giving this more thought over the past few days, I believe this is more of a Go design issue. The WithContext
and ParseMultipartForm
behavior is implicit, hard to notice, poorly documented, and lacks a clear best practice solution. I think the reattach approach might be better because it’s less aggressive and can likely be safely applied to other OpenTelemetry middlewares facing the same issue. The only downside, compared to RemoveAll
, is that it relies on other middlewares to also handle WithContext
and ParseMultipartForm
correctly. If all middlewares do the right thing, net/http
should eventually clean up the temporary files automatically.
I skipped the failed test on the Windows platform. This is consistent with how Go handles form cleanup tests (https://github.com/golang/go/blob/6da16013ba4444e0d71540f68279f0283a92d05d/src/net/http/serve_test.go#L7028), so I think it’s fine. |
Fix #5946