Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPU Usage #10

Open
IcyFlamingArrow opened this issue Apr 6, 2021 · 34 comments · Fixed by #20 or #96
Open

CPU Usage #10

IcyFlamingArrow opened this issue Apr 6, 2021 · 34 comments · Fixed by #20 or #96
Labels
help wanted Extra attention is needed

Comments

@IcyFlamingArrow
Copy link

IcyFlamingArrow commented Apr 6, 2021

The xlpr file manager results in intense CPU usage when the binary is executed.
How can I avoid this?
https://ibb.co/7j1Rm77

The version I am using is the xrpl-git from the aur.

@sayanarijit
Copy link
Owner

Hey.Thanks for reporting this. I have a good idea of where the problems are and can definitely optimize this.

I've been delaying working on cleanups and optimizations to focus on the implementations. But now I think it's time I start looking into that area.

sayanarijit added a commit that referenced this issue Apr 6, 2021
- Write to pipes only when the value changes.
- Introduce a minimal delay to the background threads.

Fixes: #10
sayanarijit added a commit that referenced this issue Apr 6, 2021
- Write to pipes only when the value changes.
- Sleep when not reading key event or messages.

Fixes: #10
sayanarijit added a commit that referenced this issue Apr 6, 2021
- Write to pipes only when the value changes.
- Sleep when not reading key event or messages.

Fixes: #10
@amalic
Copy link

amalic commented Apr 6, 2021

When I run your current version via cargo run --release it still shows 50-60% CPU usage. I don't think this issue got fixed.

image

@sayanarijit
Copy link
Owner

Reopening this.

@sayanarijit sayanarijit reopened this Apr 7, 2021
@sayanarijit
Copy link
Owner

I'll probably need to get a test suit up to measure the cpu usage in a vm environment.

@amalic
Copy link

amalic commented Apr 7, 2021

I think profiling the app could help.
e.g.: https://gist.github.com/KodrAus/97c92c07a90b1fdd6853654357fd557a

I tested on Ubuntu 20.04. Peeking into your code (I'm a Rust beginner) I see a lot of object cloning and threading going on.

@sayanarijit
Copy link
Owner

sayanarijit commented Apr 7, 2021

Initial flamegraph:
1
CPU usage: 51%
Conclusion: We are reading the keyboard inputs wrong.

@sayanarijit
Copy link
Owner

sayanarijit commented Apr 7, 2021

Try 1: 1bfb011

Flamegraph:

2

CPU usage: 17% (for other reasons)
Conclusion: Reading keys in a blocking way partly fixes the load issue, but we get error when we invoke shell :!

[+ 73879 suspended (tty input)  cargo run

Probably because bash tried to capture the tty which was being read in a blocking way.

@sayanarijit
Copy link
Owner

sayanarijit commented Apr 9, 2021

I'll continue working on it after #39 to make sure we don't break anything in order to optimize things.

@sayanarijit
Copy link
Owner

sayanarijit commented Apr 9, 2021

Related: crossterm-rs/crossterm#397

@sayanarijit
Copy link
Owner

sayanarijit commented Apr 22, 2021

The suggested solution was to rewrite the codebase in async rust which I don't think is a good idea for this project as of now since we're still in the exploring phase. async runtime might add many limitations to how we can use and extend xplr because I have plans for xplr to be also used as a library, not just a bin.

So, this is my 2nd attempt to fix the issue without changing much of the code. @IcyFlamingArrow @amalic @maximbaz can you please test if this works for you?

@amalic
Copy link

amalic commented Apr 22, 2021

Cranking up sleep time will definitely be helpful, but like the author in the other thread said, it could mess up rrsponsiveness.

Will give it another try later, and let you know asap.

@sayanarijit
Copy link
Owner

sayanarijit commented Apr 22, 2021

It doesn't seem to have much impact in my machine since the sleep time should apply only for the first click. When holding the key, subsequent events should be faster.

@maximbaz
Copy link
Contributor

To me it looks like a significant improvement, it dropped from 50% cpu usage in idle to 2.5%.
Of course in an ideal world I would prefer to get to 0% when idle, but this is definitely a good move, and perhaps sufficient for now.

I don't feel any decrease in responsiveness (remember to build in release mode when testing, just in case).

@sayanarijit
Copy link
Owner

Awesome. Thanks.

sayanarijit added a commit that referenced this issue Apr 22, 2021
@amalic
Copy link

amalic commented Apr 27, 2021

Can confirm. It uses about 2 - 2.5% CPU time, which is still more than necessary but 20 to 25 times better than 50%.

image

@sayanarijit
Copy link
Owner

It will definitely get better as xplr stabilizes.

@sayanarijit
Copy link
Owner

sayanarijit commented May 29, 2021

Just to keep everyone in the know, a lot of CPU and performance optimizations were made. The latest version uses about 0.1% - 0.7% CPU when idle and also feels a lot snappier.

  scroll coordinates: y = 1/275 (tasks), x = 1/12 (fields)
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    854 root      20   0 1398192 100436  66536 S   4.3   1.3   3:24.68 Xorg
  19985 sayanar+  20   0 2822512 295844 138324 R   3.0   3.9   0:36.73 Web Content
   1382 sayanar+  20   0 2836484 277496 153912 S   2.0   3.7   0:37.69 plasmashell
    314 root     -51   0       0      0      0 D   1.7   0.0   0:15.99 irq/71-MSFT0002
   2115 sayanar+  20   0   15.1g 421448 148976 S   1.7   5.6   2:33.93 WebExtensions
   1305 sayanar+  20   0 2207360 181156 117912 S   1.3   2.4   0:39.46 kwin_x11
   1875 sayanar+  20   0 3985168 578012 248472 S   1.3   7.6  14:47.54 firefox
    798 root      20   0  406804  20676  17344 S   1.0   0.3   0:03.14 NetworkManager
   1557 sayanar+  20   0  245224  34020  28060 S   1.0   0.4   0:16.41 ksystemstats
    796 dbus      20   0   14004   7564   5152 S   0.7   0.1   0:03.24 dbus-daemon
   1301 sayanar+  20   0 1171892  79360  59388 S   0.7   1.0   0:03.48 kded5
  21287 sayanar+  20   0 1812768  97704  69684 S   0.7   1.3   0:00.80 alacritty
     25 root      20   0       0      0      0 S   0.3   0.0   0:00.71 ksoftirqd/1
     32 root      20   0       0      0      0 S   0.3   0.0   0:00.63 ksoftirqd/2
    428 root     -51   0       0      0      0 S   0.3   0.0   0:00.53 irq/78-iwlwifi:
   1226 sayanar+  20   0  658608  26148  22508 S   0.3   0.3   0:02.34 appimagelaunche
   2272 sayanar+  20   0 2819244 212652 124308 S   0.3   2.8   0:46.70 Web Content
   9728 root      20   0       0      0      0 I   0.3   0.0   0:00.19 kworker/u32:2-events_unbound
  21342 sayanar+  20   0  279196   7296   4948 S   0.3   0.1   0:00.36 xplr

@sayanarijit
Copy link
Owner

Although, it still has plenty of potential for more optimization.

@dm9pZCAq
Copy link

in previous message were strace of main process but as you can see from pic there also thread which uses CPU

strace of this thread:

Trace of process 16035 - xplr
strace: Process 16035 attached
09:35:15.103256 epoll_pwait(3, [], 3, 10, NULL, 8) = 0 <0.010103>
09:35:15.113581 epoll_pwait(3, [], 3, 49, NULL, 8) = 0 <0.049194>
09:35:15.162911 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000030>
09:35:15.163056 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000029>
09:35:15.163169 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000017>
09:35:15.163266 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000016>
09:35:15.163349 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.163430 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.163511 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000013>
09:35:15.163593 epoll_pwait(3, [], 3, 49, NULL, 8) = 0 <0.049171>
09:35:15.212896 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000030>
09:35:15.213041 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000019>
09:35:15.213146 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000014>
09:35:15.213229 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.213312 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.213393 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000013>
09:35:15.213471 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000013>
09:35:15.213550 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000013>
09:35:15.213632 epoll_pwait(3, [], 3, 49, NULL, 8) = 0 <0.049189>
09:35:15.262941 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000019>
09:35:15.263084 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000028>
09:35:15.263197 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000019>
09:35:15.263296 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000017>
09:35:15.263392 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000016>
09:35:15.263474 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.263555 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000013>
09:35:15.263638 epoll_pwait(3, [], 3, 49, NULL, 8) = 0 <0.049180>
09:35:15.312939 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000030>
09:35:15.313086 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000028>
09:35:15.313198 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000017>
09:35:15.313289 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000016>
09:35:15.313372 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000016>
09:35:15.313454 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.313535 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000015>
09:35:15.313615 epoll_pwait(3, [], 3, 0, NULL, 8) = 0 <0.000013>

here also constantly happens epoll_pwait

@sayanarijit
Copy link
Owner

sayanarijit commented Jun 21, 2021

Thanks for reporting. I see it went up again.

As for why:

  1. Reading inputs
  2. Watch pwd for modifications. When you create, delete, or rename files using another program, lf (r22) won't auto load the changes.
  3. Auto refresh UI every sec. It renders the ui every sec which I agree could be improved and made more reactive. I added it when we needed to manually refresh after passing each batch of messages. But now the manual refresh messages will be added to each batch automatically. So there shouldn't be any need for a service refreshing the UI now.

@dm9pZCAq
Copy link

  1. but reading input should not consume much CPU
  2. i think for this, at least in UNIX, can be used inotify(7)

@sayanarijit
Copy link
Owner

but reading input should not consume much CPU

As you can see from the previous comments, it's due to an open crossterm issue.

i think for this, at least in UNIX, can be used inotify(7)

It's more or less the same, some simple code running on a thread. I'm not sure if it's optimal, but comparing modification times shouldn't cause any major overhead.

@dm9pZCAq
Copy link

here is similar issue about watching directory in lf

It's more or less the same

no, it should be faster and it's not take so much CPU, because it's build in kernel

you can check this with inotify-tools:
$ inotifywait -rm /test/dir/

and watch this process in htop and strace

@sayanarijit
Copy link
Owner

sayanarijit commented Jun 21, 2021

Hmm wow didn't know. Regardless, as I understand inotify will make it Linux dependent.

I did a simple experiment.

  1. Removed the probably unnecessary auto_refresher here. It brought down the CPU usage by about 50%.
  2. Commented out the pwd_watcher::keep_watching. No noticeable changes.
  3. Commented out event_reader::keep_reading (input reader). And it reduced the CPU usage drastically. Basically removed xplr from the top page.

So, I'd first try to optimize reading inputs. Maybe switching the input reading logic to termion might help.

sayanarijit added a commit that referenced this issue Jun 21, 2021
@dm9pZCAq
Copy link

inotify will make it Linux dependent

yes, but you can use cfg attribute to use it only on UNIX

@sayanarijit sayanarijit added the help wanted Extra attention is needed label Jun 21, 2021
@maximbaz
Copy link
Contributor

To consider, cross-platform notify library, uses inotify under the hood for Unix: https://crates.io/crates/notify

@sayanarijit
Copy link
Owner

sayanarijit commented Jun 21, 2021

To consider, cross-platform notify library, uses inotify under the hood for Unix: https://crates.io/crates/notify

Looks good. But from the experiments in my last comment, we know that the real issue isn't with the watcher. It's the input reading mechanism. I think something changed when I last upgraded crossterm. I'll have a closer look.

@sayanarijit sayanarijit pinned this issue Jun 21, 2021
sayanarijit added a commit that referenced this issue Jun 22, 2021
sayanarijit added a commit that referenced this issue Jun 22, 2021
sayanarijit added a commit that referenced this issue Oct 5, 2021
Following https://github.com/crossterm-rs/crossterm/blob/master/examples/event-stream-async-std.rs
let's see if we can somehow reduce CPU usage and improve responsiveness
using async rust.

Ref: #10
sayanarijit added a commit that referenced this issue Oct 5, 2021
Following https://github.com/crossterm-rs/crossterm/blob/master/examples/event-stream-async-std.rs
let's see if we can somehow reduce CPU usage and improve responsiveness
using async rust.

Ref: #10
sayanarijit added a commit that referenced this issue Oct 5, 2021
Following https://github.com/crossterm-rs/crossterm/blob/master/examples/event-stream-async-std.rs
let's see if we can somehow reduce CPU usage and improve responsiveness
using async rust.

Ref: #10
sayanarijit added a commit that referenced this issue Oct 5, 2021
Following https://github.com/crossterm-rs/crossterm/blob/master/examples/event-stream-async-std.rs
let's see if we can somehow reduce CPU usage and improve responsiveness
using async rust.

Ref: #10
@sayanarijit sayanarijit unpinned this issue May 22, 2022
@sayanarijit
Copy link
Owner

NOTE: keeping this issue open to implement notify/inotify for Linux with test cases for all platforms.

@litoj
Copy link

litoj commented Jul 3, 2023

This is an uneducated question, so please take it for what it is, but:

Why is polling for input needed at all? I haven't tried the Windows side of things, but can't we just wait for input? (example reader in C); After all, the application usually exits on user input / command, I thought. The problem, I assume, comes if input is read, but parsed/actions executed in a separate thread.

Edit: I just noticed just going up and down through contents of a directory uses a lot of CPU (45% when holding, vs 16% in ranger), even in a small dir (24 files). I assume xplr doesn't reload the directory on every action/move, so could all that be caused just by updating the relative item numbers or all the extra file information?

@sayanarijit
Copy link
Owner

I guess it's because of the lua function calls along with all the serialisation and deserialization required for that. The number of columns being rendering in the table somewhat impacts scrolling performance. Using plugins like zentable.xplr could be an easy fix, still, each render requires at-least one lua function call.

@litoj
Copy link

litoj commented Jul 7, 2023

Using plugins like zentable.xplr could be an easy fix

Unfortunately, whether manually disabling all columns, except the filename, or using the plugin, the CPU usage on holding down arrow is still ~48%.

@sayanarijit
Copy link
Owner

Hmm... I think we can now start porting some of the custom Lua stuff to the Rust side as built-in defaults. That should optimize things significantly, as long as it's not being customized.

@sayanarijit
Copy link
Owner

Need to ensure that we don't break compatibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants