Sort Data in-place and consume Node on the fly to use less memory #317
base: master
Conversation
In-place Merge Sort:

```python
def merge_sort(arr):
```

Example usage:

```python
data = [4, 2, 9, 6, 1, 5, 8, 7, 3]
for item in data:
```
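The snippet above is truncated in this page snapshot. A minimal in-place merge sort in Python might look like the following (a reconstruction sketch using the classic rotation-based in-place merge, not necessarily the commenter's exact code):

```python
def merge_sort(arr, left=0, right=None):
    """Sort arr[left:right] in place, using no auxiliary list."""
    if right is None:
        right = len(arr)
    if right - left <= 1:
        return  # zero or one element: already sorted
    mid = (left + right) // 2
    merge_sort(arr, left, mid)
    merge_sort(arr, mid, right)
    # In-place merge: i scans the merged prefix, j the start of the
    # (still sorted) right half. When arr[j] is smaller, rotate it
    # into position i instead of copying into a temporary buffer.
    i, j = left, mid
    while i < j and j < right:
        if arr[i] <= arr[j]:
            i += 1
        else:
            arr.insert(i, arr.pop(j))  # move arr[j] in front of arr[i]
            i += 1
            j += 1

# Example usage with the data from the comment above:
data = [4, 2, 9, 6, 1, 5, 8, 7, 3]
merge_sort(data)
```

The pop/insert rotation keeps the merge allocation-free at the cost of O(n) element shifts per move, which matches the memory-over-speed trade-off discussed in this thread.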
I measured both the performance and maximum RSS (as reported by GNU time) of this patch against a synthetic dataset, and it seems to reduce maximum RSS by about 2% while increasing runtime by about 5.4%.
These ratios may change significantly with different input; I see in your Phabricator posts that you're using significantly larger input. Is there a way that I can download …
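For reference, the peak RSS that GNU time prints with its `%M` format field can also be read from inside a Python process via the standard `resource` module (the helper name here is illustrative):

```python
import resource

def max_rss_kb():
    """Return this process's peak resident set size.

    ru_maxrss is reported in kilobytes on Linux (bytes on macOS),
    and corresponds to the max RSS that GNU time reports as %M.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
```

This is handy for instrumenting a single phase of a script rather than the whole process lifetime.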
@matthew-olson-intel Yes! The raw data behind the visualisations at https://performance.wikimedia.org/php-profiling/ is published daily at https://performance.wikimedia.org/arclamp/logs/daily/. The …
@Krinkle OK, thanks -- I ran another round of experiments. These are all 10 iterations per config, reporting the mean (runtime in seconds, RSS in MB).
That seems to have changed things quite a lot: a 90% reduction in max RSS and a 7.5% hit to performance. A clear win for this input, IMO.
See https://phabricator.wikimedia.org/T315056
Use in-place sort for `@Data` and consume `$Node` on the fly to use less memory. This helped stave off out-of-memory errors in cron scripts that build SVGs from large log files.
Re-creates PR #298 by @AaronSchulz.
Ref #58.