Describe the issue
I am running a BERT model in production with C# and ONNX Runtime. The code is practically a copy of the tutorial at https://onnxruntime.ai/docs/tutorials/csharp/bert-nlp-csharp-console-app.html
I run it wrapped as a .NET Core REST service in a Docker container on Ubuntu, so I can easily monitor the container's memory usage.
I added a garbage collection call, "GC.Collect()", after each request and limited the service to 2 simultaneous calls via SemaphoreSlim.
Under load, memory usage keeps growing and the container eventually crashes. The container has 2 GB.
I attached a memory graph.
I create only one InferenceSession, with the following options, and it is used from 2 threads simultaneously.
_opt = new Microsoft.ML.OnnxRuntime.SessionOptions()
{
    EnableCpuMemArena = false,
    ExecutionMode = ExecutionMode.ORT_SEQUENTIAL,
    GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_ALL,
    InterOpNumThreads = 1,
    IntraOpNumThreads = 1,
    LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_WARNING,
    LogVerbosityLevel = 0,
};
Essentially, the code is as follows (where the Api object is a wrapper around the tutorial code):
string path = Directory.GetCurrentDirectory();
Api api = new();
api.Initialize(Path.Combine(path, "bert.onnx"));
// Allow at most 2 requests to run inference at the same time.
SemaphoreSlim semaphore = new(2, 2);
app.MapPost("/api", async ([FromBody] ApiRequest rq) =>
{
    await semaphore.WaitAsync();
    try
    {
        return api.ProcessRequest(rq);
    }
    finally
    {
        // Release the slot even if ProcessRequest throws, then force a collection.
        semaphore.Release();
        GC.Collect();
    }
});
To reproduce
I cannot easily provide reproducible code.
Urgency
No response
Platform
Linux
OS Version
Ubuntu 22.04.5 LTS
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.19.2 and 1.20.1
ONNX Runtime API
C#
Architecture
X64
Execution Provider
Default CPU
Execution Provider Library Version
No response