Profiling Javascript with quickjs ? #183

Open
mingodad opened this issue Jun 4, 2023 · 15 comments

Comments

@mingodad commented Jun 4, 2023

Has anyone managed to profile JavaScript code with quickjs?
Ideally, qjs would have an option to save/output a profile report for the running script, something like:

qjs -p myscript.js myscript_arg

I'm looking at the source code to try to find the ideal point to install hooks for such functionality. Any help is welcome!

Cheers!

@mingodad (Author) commented Jun 5, 2023

I've got an initial implementation of it that outputs something like:

[1	2	1	103	<eval>]
[16589	1	1	13352741	load_model]
[16248	1	1	1	glp_mpl_free_wksp]
[16315	1	1	13342411	glp_mpl_generate]
[13023	1	1	13342408	mpl_generate]
[12441	1	1	236	mpl_internal_build_problem]
[12424	1	1	13342158	mpl_internal_generate_model]
[16243	1	1	233	glp_mpl_alloc_wksp]
[12763	1	1	231	mpl_initialize]
[746	1	1	216	rng_create_rand]
[491	1	1	3	glp_version]
[14860	1	1	2	mpl_get_num_cols]
[14841	1	1	1	mpl_get_num_rows]
[16260	1	1	9873	glp_mpl_read_model]
[12887	1	1	9870	mpl_read_model]
[12889	1	1	68	<null>]
[5720	1	1	12	mpl_internal_end_statement]
[6994	1	1	958	mpl_internal_data_section]
[6865	1	1	147	mpl_internal_parameter_data]
[6553	1	1	50	mpl_internal_plain_format]
[6506	1	1	29	mpl_internal_read_value]
[6016	1	1	5	mpl_internal_read_number]
[6462	1	1	2	mpl_internal_select_parameter]
[6326	1	1	783	mpl_internal_set_data]
[6155	20	2	79	mpl_internal_simple_format]
[5935	20	2	2	mpl_internal_slice_arity]
[5953	2	1	1	mpl_internal_fake_slice]
[6118	1	1	2	mpl_internal_select_set]
[5964	2	1	1	mpl_internal_delete_slice]
[6006	9	1	4	mpl_internal_is_literal]
[5991	95	10	10	mpl_internal_is_symbol]
[6029	40	4	47	mpl_internal_read_symbol]
[5981	41	5	5	mpl_internal_is_number]
[5892	2	1	2	mpl_internal_expand_slice]
[12363	1	1	23	mpl_internal_alloc_content]
[5844	1	1	7595	mpl_internal_model_section]
[12547	1	1	914	mpl_internal_open_input]
[454	2	1	6	xfillArr]
...

After post-processing, that produces output like this:

Ordered by time_spent
line	count	atime%	time%	ttime	octime	name
10129	439413	20.5373%	20.5373%	47159683	1073.23	mpl_internal_eval_numeric
8693	174751	34.3818%	13.8445%	31792325	1819.20	mpl_internal_eval_domain_func
8630	739605	47.1288%	12.7470%	29270565	395.76	mpl_internal_enter_domain_block
8802	722102	58.2187%	11.0899%	25465667	352.66	mpl_internal_loop_domain_func
8917	98969	64.8072%	6.5885%	15129071	1528.65	mpl_internal_loop_within_domain
9396	57911	69.3048%	4.4976%	10329331	1783.38	mpl_internal_eval_num_func
8762	769232	73.7825%	4.4776%	10281944	133.66	mpl_internal_eval_within_domain
9415	57910	78.0582%	4.2757%	9818108	1695.41	mpl_internal_eval_member_num
9362	57910	82.2578%	4.1997%	9643524	1665.26	mpl_internal_take_member_num
10522	1606326	86.3781%	4.1203%	9461348	58.90	mpl_internal_eval_symbolic
10102	42060	90.0949%	3.7167%	8534538	2029.13	mpl_internal_iter_num_func
10662	1220160	93.4589%	3.3641%	7724772	63.31	mpl_internal_eval_logical
10641	28039	94.2256%	0.7667%	1760589	627.88	mpl_internal_iter_log_func
16589	1	94.8071%	0.5815%	13352741	13352741.00	load_model
16315	1	95.3882%	0.5810%	13342411	13342411.00	glp_mpl_generate
13023	1	95.9692%	0.5810%	13342408	13342408.00	mpl_generate
12424	1	96.5502%	0.5810%	13342158	13342158.00	mpl_internal_generate_model
11043	795292	97.0891%	0.5389%	1237369	15.56	<:isGone:>
7426	5421311	97.3777%	0.2886%	662714	1.22	mpl_internal_compare_symbols
12117	1	97.6260%	0.2483%	5702333	5702333.00	mpl_internal_execute_display
12018	1	97.8744%	0.2483%	5702331	5702331.00	mpl_internal_display_func
9661	1	98.1227%	0.2483%	5702316	5702316.00	mpl_internal_eval_whole_par
9643	1	98.3710%	0.2483%	5702313	5702313.00	mpl_internal_whole_par_func
7607	3545108	98.6101%	0.2391%	549030	1.55	mpl_internal_compare_tuples
8585	1479210	98.8198%	0.2097%	481505	3.26	mpl_internal_update_dummy_indices
8388	1480241	99.0142%	0.1944%	446461	3.02	mpl_internal_find_member
8545	1212141	99.1875%	0.1733%	397859	3.28	mpl_internal_assign_dummy_index
9163	710940	99.3594%	0.1719%	394659	5.55	mpl_internal_eval_member_set
7699	711010	99.5236%	0.1642%	377154	5.30	mpl_internal_find_tuple
8257	2439184	99.6493%	0.1257%	288632	1.18	mpl_internal_delete_value
9095	710941	99.7225%	0.0731%	167945	2.36	mpl_internal_eval_set_func
9054	710940	99.7881%	0.0656%	150619	2.12	mpl_internal_take_member_set
7542	1019407	99.8442%	0.0561%	128859	1.26	mpl_internal_expand_tuple
7628	710930	99.8976%	0.0535%	122775	1.73	mpl_internal_build_subtuple
11833	40	99.9280%	0.0303%	69687	17421.75	mpl_internal_check_func
8414	55858	99.9387%	0.0107%	24629	4.41	<null>
...

@mingodad (Author) commented Jun 6, 2023

I've just opened pull request #184 with my initial working implementation.

@mingodad (Author) commented Jun 8, 2023

Here is a sample script to format the output:

import * as std from "std";
import * as os from "os";

const fname = "tsc-csparser-noEmit.prof.txt";
let txt_fd = std.open(fname, "r");

let txt_profile;
try {
	txt_profile = txt_fd.readAsString();
}
catch(e) {
	print(e);
	std.exit(1); // exit() is provided by the std module
}
txt_fd.close();

let func_list = {};

let total_time_spent = 0;

// Each profile record is: [code_line  call_count  time_spent_count  time_spent  func_name]
const regexp = /\[(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\]/g;

let match;
while ((match = regexp.exec(txt_profile)) !== null) {
	let code_line = parseInt(match[1]);
	let call_count = parseInt(match[2]);
	let time_spent_count = parseInt(match[3]);
	let time_spent = parseFloat(match[4]);
	let func_name = match[5];
	//print(code_line, call_count, time_spent_count, time_spent, func_name);

	let time_spent_calc = (time_spent / time_spent_count)*call_count;
	total_time_spent += time_spent_calc;
	func_list[code_line] = {
		"call_count":call_count,
		"time_spent_calc":time_spent_calc,
		"time_spent":time_spent,
		"time_spent_count":time_spent_count,
		"func_name":func_name
		};
}

function showProfile(ary)
{
	print("line\tcount\tatime%\ttime%\tttime\toctime\tname");
	let total_time_spent_pct = 0.0;
	for(const elm in ary)
	{
		let code_line = ary[elm][0];
		let rec = ary[elm][1];
		let time_spent_pct = (rec.time_spent_calc/total_time_spent);
		total_time_spent_pct += time_spent_pct;
		let time_spent_pct_fmt = std.sprintf("%.4f%%", time_spent_pct*100.0);
		let acc_time_spent_pct_fmt = std.sprintf("%.4f%%", total_time_spent_pct*100.0);
		let oc_time_spent = std.sprintf("%.2f", rec.time_spent_calc/rec.call_count);
		std.printf("%d\t%d\t%s\t%s\t%d\t%s\t%s\n",
			code_line, rec.call_count,acc_time_spent_pct_fmt,
			time_spent_pct_fmt, rec.time_spent, oc_time_spent,
			rec.func_name);
	}
}

let ary = [];
for (const key in func_list) {
	//print(key)
	if(func_list.hasOwnProperty(key))
		ary.push([key, func_list[key]]);
}
//print(ary.length);

print("\nOrdered by code_line");
ary.sort((a,b) => a[0] - b[0]);
showProfile(ary);

print("\nOrdered by call_count");
ary.sort((a,b) => b[1].call_count - a[1].call_count);
showProfile(ary);

print("\nOrdered by time_spent");
ary.sort((a,b) => b[1].time_spent_calc - a[1].time_spent_calc);
showProfile(ary);

TooTallNate pushed a commit to TooTallNate/quickjs that referenced this issue Dec 18, 2023
@chqrlie (Collaborator) commented Feb 11, 2024

Definitely interesting! Will look into merging your pull request.

chqrlie changed the title from "Profinling Javascript with quickjs ?" to "Profiling Javascript with quickjs ?" on Feb 11, 2024
@rubenlg commented Nov 22, 2024

I tried to use the PR linked above, but I always get <:isGone:> as the function name. I suppose that with the latest version of quickjs, the atoms get destroyed before the functions.

Anyway, adding an actual_function_name to the debug structure and populating it from the (still alive) atom when incrementing b->debug.time_spent_count (only the first time) does the trick. Sadly, it adds a tiny bit of overhead because the JS_AtomGetStrRT call isn't free, but it can be incurred only the first time.
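
For illustration, here is a rough sketch of that caching idea against the counters from #184. It is only a sketch: the actual_function_name field is the one described above, the hook placement is assumed to be wherever the function is first entered, and it uses the public JS_AtomToCString plus the internal js_strdup helper in place of the JS_AtomGetStrRT call mentioned above.

/* Sketch only: cache the function's name the first time its profiling
   counter is bumped, while the name atom is still alive.
   b is the JSFunctionBytecode of the function being entered. */
if (b->debug.time_spent_count == 0 && !b->debug.actual_function_name) {
    const char *name = JS_AtomToCString(ctx, b->func_name);
    if (name) {
        /* copy it so it outlives the atom; freed together with the bytecode */
        b->debug.actual_function_name = js_strdup(ctx, name);
        JS_FreeCString(ctx, name);
    }
}
b->debug.time_spent_count++;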

@mingodad (Author)

Hello @rubenlg, thanks for the reply!
Maybe this feature can be guarded by a preprocessor macro and built only for debugging purposes; for release builds it would not be included, so there would be no performance penalty.

@rubenlg commented Nov 22, 2024

Thanks @mingodad! Of course, I don't want to include this in release mode. I was just worried that the overhead would skew the results of the performance measurements.

By the way, I just noticed that it only works for free functions. Class methods and functions declared inside objects have a null atom in the func_name property. I traced that null atom back to js_create_function. QuickJS obviously has a way to get that name somehow, because if I throw an error from these functions, the stack trace contains their names. But I have no idea how to get it from inside the library. Any ideas?

@mingodad (Author)

@rubenlg can you give a minimal example with the actual output?

@rubenlg commented Nov 22, 2024

Sure. Here is a minimal example exercising the three cases (a toplevel function, a function inside an object literal, and a class method):

function test1() {}
const obj = {
  test2() {}
};
class K {
  test3() {}
}
test1();
obj.test2();
new K().test3();

Here is the output of your original patch applied to the latest version of quickjs:

[1	1	0	0	<eval>	mytest.js]
[68	1	0	0	<:isGone:>	@=^]
[70	1	0	0	<null>	@=^]
[74	1	0	0	<null>	@=^]
[73	1	0	0	<null>	@=^]

Already there, only the toplevel function has a non-null atom, even though it's gone by the time the code dumps these stats.

Here is the output with the changes I mentioned (saving the function name inside debug when incrementing the counters, while the atom is still alive):

[1	1	0	0	<eval>	mytest.js]
[68	1	0	0	test1	@=^]
[70	1	0	0	<null>	@=^]
[74	1	0	0	<null>	@=^]
[73	1	0	0	<null>	@=^]

At least the toplevel function works now, but the others still show a null atom for the function name. I've been trying to hack something together using the stack frames, but without understanding the code well enough it's like shooting in the dark.

@rubenlg commented Nov 22, 2024

By looking at how stack frames are computed in JS_CallInternal, I noticed that it uses get_func_name(ctx, func_obj), where func_obj is a parameter to JS_CallInternal. That seems to be the correct way to get the name of the function being called. Now it works:

[1	1	0	0	<eval>	mytest.js]
[68	1	0	0	test1	@=^]
[70	1	0	0	test2	@=^]
[74	1	0	0	K	@=^]
[73	1	0	0	test3	@=^]
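
For reference, here is a rough sketch of how that lookup can be wired in, assuming it runs near the top of JS_CallInternal where func_obj and the bytecode pointer b are both in scope; the actual_function_name field and the js_strdup call are the same assumptions as in the sketch above, not necessarily what the final patch does.

/* Sketch only: resolve the callee's name the way build_backtrace does,
   via the internal get_func_name() helper, and cache it once. */
if (!b->debug.actual_function_name) {
    const char *name = get_func_name(ctx, func_obj); /* reads the "name" property */
    if (name && name[0] != '\0')
        b->debug.actual_function_name = js_strdup(ctx, name);
    JS_FreeCString(ctx, name); /* no-op when name is NULL */
}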

@mingodad (Author)

@rubenlg, thank you!
Can you make a pull request? Then I'll remove mine.

@rubenlg commented Nov 22, 2024

I need to clean it up a bit; it's leaking memory. And I want to show the context for functions that are not toplevel (e.g. show K.test3 instead of just test3). When it's ready I'll open a pull request.

@mingodad (Author)

OK! @rubenlg, thank you again!

@rubenlg commented Nov 23, 2024

I just sent #371 with the fixes mentioned above and a slightly different implementation that produces traces that can be opened in the Chrome DevTools and a bunch of other external tools, where you can still get aggregated results but also see the reconstructed call stacks, etc.
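
For context, the format the Chrome DevTools (and chrome://tracing, Perfetto, etc.) accept here is the Trace Event Format: a JSON array of event objects with name/ph/ts/dur fields. Below is a minimal sketch of emitting one "complete" ('X') event per call; the prof_file, func_name, start_us and end_us names are placeholders, not what #371 actually emits.

#include <stdio.h>
#include <inttypes.h>

/* Sketch only: append one Trace Event Format "complete" event.
   ts and dur are in microseconds; pid/tid are arbitrary here,
   and func_name is written without JSON escaping for brevity. */
static void emit_trace_event(FILE *prof_file, const char *func_name,
                             int64_t start_us, int64_t end_us, int first)
{
    fprintf(prof_file,
            "%s{\"name\":\"%s\",\"cat\":\"js\",\"ph\":\"X\","
            "\"ts\":%" PRId64 ",\"dur\":%" PRId64 ",\"pid\":1,\"tid\":1}",
            first ? "[\n" : ",\n", func_name, start_us, end_us - start_us);
}

The trailing ']' may be omitted in this format, which is convenient when the file is simply appended to while the script runs.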

@mingodad (Author)

Hello @rubenlg, thank you again! I just closed my pull request, pointing out that yours is an improved one.
