[Question/FR] measuring execution time along the memory usage #347

Closed
NightMachinery opened this issue Jan 17, 2022 · 2 comments

I want to benchmark some functions and compare them. (I do NOT want to profile these functions' internals.) I need to run the functions multiple times (more runs for functions that finish faster) and get statistics on their execution time and peak memory usage. Basically what https://github.com/JuliaCI/BenchmarkTools.jl does for Julia.
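For the timing half (not memory), the standard library's `timeit` can already adapt the number of runs to the function's speed; a minimal sketch, with `dummy` standing in for the function under test:

```python
import timeit

def dummy():
    return sum(range(1000))

timer = timeit.Timer(dummy)

# autorange() picks a loop count so each trial takes a reasonable
# amount of wall time -- i.e. more loops for faster functions.
loops, total = timer.autorange()
per_call = total / loops

# repeat() then runs several independent trials for statistics;
# the minimum is the conventional "least noisy" estimate.
trials = [t / loops for t in timer.repeat(repeat=5, number=loops)]
print(f"min per-call time: {min(trials):.3e} s")
```

This covers only durations; `timeit` has no notion of memory or timeouts.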

I'd also like to abort the functions when they take more than X minutes.
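As far as I can tell, memory_profiler does not offer a hard abort, so this part would need something like running the function in a child process and joining with a timeout; a minimal sketch using the standard library's multiprocessing (the helper name `run_with_timeout` is my own):

```python
import multiprocessing
import time

def run_with_timeout(fn, timeout_s, *args):
    """Run fn(*args) in a child process; kill it after timeout_s seconds.

    Returns True if fn finished in time, False if it was aborted.
    Note: this does not capture fn's return value; a Queue or Pipe
    would be needed for that.
    """
    p = multiprocessing.Process(target=fn, args=args)
    p.start()
    p.join(timeout_s)
    if p.is_alive():
        p.terminate()  # SIGTERM the child that overran its budget
        p.join()
        return False
    return True

if __name__ == "__main__":
    # Aborts a 60 s sleep after roughly half a second.
    print(run_with_timeout(time.sleep, 0.5, 60))
```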

Is this possible with memory_profiler? Or any other Python packages you happen to know?

Thanks.


NightMachinery commented Jan 17, 2022

After reading around, I came up with:

import memory_profiler as mp
from time import time
import gc
import numpy
np = numpy

def benchmark_one(proc, *, name, interval=1):
  mp_kwargs = { 
    "max_usage": True, 
    "retval": True,
    # "timeout": timeout, #: after reading its source code, this basically only manipulates the number of iterations
    "max_iterations": 1,
    # "multiprocess": True, #: useless for functions
    # "include_children": True, 
    #: @upstreamBug? This double counts (well, probably a copy-on-write fork causes this) memory when measuring functions, but not when measuring the starting memory
    "include_children": False,
    "interval": interval  
  }
  start_mem = mp.memory_usage(**mp_kwargs)
  start_time = time()
  res = mp.memory_usage(proc, **mp_kwargs)
  end_time = time()
  dur = end_time - start_time
  ret_val = res[1]
  max_mem = res[0] #: I don't know why this is not returning a list of samples, per its doc.
  used_mem = max_mem - start_mem

  print(f"{name}: dur={dur}, used_mem={used_mem}, max_mem={max_mem}")
  # print(f"res={repr(res)}")

  return ret_val, dur, used_mem

def benchmark_n(n, *args, **kwargs):
  durs = np.full(n, np.nan) #: pre-filled with NaNs (np.empty would leave them uninitialized)
  used_mems = np.full(n, np.nan)
  losses = np.full(n, np.nan)
  ret_val = None #: needed for correct scoping
  for i in range(n):
    ret_val, dur, used_mem = benchmark_one(*args, **kwargs)
    loss = ret_val['loss']
    losses[i] = loss
    durs[i] = dur
    used_mems[i] = used_mem
    if i != (n-1):
      ret_val = None #: might help GC

  return (ret_val, durs, used_mems, losses)

But testing it with:

def dummy():
  gc.collect()
  a = np.zeros(10_000_000)
  return {'loss': 0}

benchmark_n(3,
            (dummy, []),
            name="dummy")
dummy: dur=0.14468979835510254, used_mem=76.5546875, max_mem=191.30078125
dummy: dur=0.19488978385925293, used_mem=0.3828125, max_mem=191.7109375
dummy: dur=0.1732182502746582, used_mem=0.015625, max_mem=191.7265625

The memory results are completely wrong; it reports near-zero memory usage for every run after the first one. My guess is that Python is reusing memory it has already freed but not returned to the OS, so the process's OS-level footprint stops growing. But I'd expect memory_profiler to be able to handle that.
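If peak *allocation* is what matters, rather than OS-level RSS, the standard library's tracemalloc sidesteps the reuse problem, since it counts allocations at the Python allocator level instead of watching the process's resident size. A sketch (note that `reset_peak` needs Python >= 3.9, and tracemalloc will not see memory that C extensions allocate outside the tracked allocators):

```python
import tracemalloc

def allocate():
    # Roughly 80 MB of pointer storage on 64-bit CPython.
    data = [0] * 10_000_000
    return len(data)

tracemalloc.start()
peaks = []
for i in range(3):
    tracemalloc.reset_peak()  # Python >= 3.9: zero the peak counter per run
    allocate()
    _current, peak = tracemalloc.get_traced_memory()
    peaks.append(peak)
    print(f"run {i}: peak ~ {peak / 2**20:.1f} MiB")
tracemalloc.stop()
```

Unlike the RSS-based numbers above, every run should report roughly the same peak, because tracemalloc counts the allocation itself, not whether the OS had to hand out new pages.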

@NightMachinery

I am closing this issue in favor of the clearer issue I submitted today.
