Cursor prop cache v2 #225
Conversation
Very cool! One thing I'm wondering: if we parse the same header file twice in Sphinx (e.g. by directly referencing two different symbols from the same file), does this caching also apply? I think probably not? That would be something that could considerably speed up the build in certain scenarios.
I believe it will reuse it as long as it's in the context of a single … Though the bigger issue with caching in that case is the caching of the clang parsing stage, which is still not a fully global cache yet. You'll find some references to that if you look around in the issues / old PRs.
Cool. I will test it out on my code and post some results here :)
Ah, you probably mean #75?
Hm, doesn't seem to be faster.
Without cache:
With cache:
I'm reading the 5 different files roughly twice on average.
Yup, but there's also #157 and #171 at least ;)
I wouldn't expect otherwise. It takes a lot of compounding to make it noticeable, and your use case seems too small. Can you artificially create a test case in which you go over the same file like 100 times? Otherwise you're getting the overhead and none of the benefits. Ideally, the overhead won't matter in small cases, but the benefits will kick in in force for the big cases, which would make this viable.

And there are of course other scenarios in which it might matter: we plan on exposing the cursor to the user through extensions, for instance, allowing the user to exponentially grow the number of calls to these methods.
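A crude way to build such an artificial test case might look like the sketch below, under the assumption that the expensive part is a repeated per-cursor property lookup. All names here are hypothetical, not the project's actual API; the point is only that repeat calls dominate before caching pays off.

```python
import time
from functools import lru_cache

def expensive_lookup(key):
    time.sleep(0.001)          # stand-in for an expensive libclang call
    return key * 2

# Same function, wrapped with the standard functools cache.
cached_lookup = lru_cache(maxsize=None)(expensive_lookup)

start = time.perf_counter()
for _ in range(100):
    expensive_lookup(7)        # pays the full cost every time
uncached = time.perf_counter() - start

start = time.perf_counter()
for _ in range(100):
    cached_lookup(7)           # pays the cost once, then hits the cache
cached = time.perf_counter() - start

assert cached < uncached
assert cached_lookup.cache_info().hits == 99
```

With only a handful of repeats per file, the two timings are close to indistinguishable, which matches the numbers reported above.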
Force-pushed from c31dd04 to cfccbc1
Some of the methods within the cursors are expensive, so caching them can have a compounding effect on performance as they are called more often. Currently, this patch actually worsens performance by a negligible amount over our unit tests, but in more complicated code bases there's higher potential for cursors to be revisited often, even more so once we expose them to user land, over which we naturally have no control.

For this to work, we need to make sure that the `__eq__` method is defined on top of the existing `__hash__`. Together, these guarantee that we can construct two different `DocCursor`s from the same arguments, prime the cache from one of them, and reuse it in the other.

The (maximum) cache size is also made configurable through an environment variable, as the most elegant way of injecting the constant into the module before it gets evaluated. This may later be exposed to the user in a configuration variable. Two functions are also created to dump the caches' status and to wipe them, respectively. These too can later be used to inform the user on how to tune the cache size.

Here we set up caching for all methods that are either callable through multiple paths or directly exposed through the API, with the exception of `_specifiers_fixup`. This has to do with the limitation that all the method arguments must be hashable, together with the `DocCursor` class itself. As it turns out, Clang's types are not always hashable, so using a bare `Type` object as a parameter prevents caching. There are ways around it, sure, but in this case there's no significant performance impact, and it's not a path that might get exposed to the user, as it's only called through internal methods that can themselves be cached.
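A minimal sketch of the mechanism described above, assuming the real `DocCursor` wraps a libclang cursor (the class body, attribute names, and environment variable here are placeholders, not the actual implementation):

```python
import os
from functools import lru_cache

# Cache size injected via an environment variable before the module is
# evaluated; the variable name is hypothetical.
_CACHE_SIZE = int(os.environ.get('DOC_CURSOR_CACHE_SIZE', '128'))

class DocCursor:
    """Toy stand-in for the real cursor wrapper."""
    def __init__(self, path, offset):
        self._key = (path, offset)

    # __eq__ defined on top of __hash__: two cursors constructed from
    # the same arguments compare equal, so they land on the same
    # lru_cache slot even though they are distinct instances.
    def __eq__(self, other):
        return isinstance(other, DocCursor) and self._key == other._key

    def __hash__(self):
        return hash(self._key)

    @lru_cache(maxsize=_CACHE_SIZE)
    def spelling(self):
        return f'symbol@{self._key}'   # stand-in for an expensive libclang call

a = DocCursor('foo.h', 42)
b = DocCursor('foo.h', 42)   # distinct instance, equal by value
assert a.spelling() == b.spelling()
assert DocCursor.spelling.cache_info().hits == 1   # b reused a's entry
```

Note that `lru_cache` keys on `self` like any other argument, which is exactly why the equality/hash pairing makes the cache shareable across equal instances.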
This may be helpful in tuning the cache size.
Don't really know how to do this better at this point, but it allows some crude testing for now.
Force-pushed from 86dcba4 to 24c4c78
Here is the new cache with the `functools` decorator. As expected, that pitfall was actually the feature I wanted all along. Two equal `DocCursor` instances are able to reuse the same values, which is a win when iterating over children / parents while traversing the AST, possibly multiple times.

From limited testing and looking at the code, I have the impression this currently benefits C++ code more than C (greater need to look into parents and children of any given cursor, constantly hitting the same types, and so on). With instrumentation of the code, I saw some methods save 50+% of all calls when running the test suite, but I need to look at it more closely. This is meant only for show and tell at this point.
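That kind of instrumentation can be approximated with a small decorator that tallies duplicate calls, i.e. the calls a cache would have served. This is a hypothetical sketch, not the instrumentation actually used:

```python
from collections import Counter
from functools import wraps

calls = Counter()

def instrument(fn):
    # Tally invocations per (function, args) so repeat calls can be
    # counted afterwards.
    @wraps(fn)
    def wrapper(*args):
        calls[(fn.__name__, args)] += 1
        return fn(*args)
    return wrapper

@instrument
def kind(cursor_id):
    return 'struct'             # stand-in for a cursor property

for cid in (1, 1, 2, 1, 2):
    kind(cid)

total = sum(calls.values())
saved = total - len(calls)      # repeat calls a cache would have saved
assert total == 5 and saved == 3   # 60% of calls were repeats
```

Running something like this over the test suite is what produces the "50+% of all calls" figure mentioned above: savings divided by total calls per method.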
Sadly, this did not improve performance (still over the test suite) in any measurable way. If anything, it makes it ever so slightly worse (some <0.1s per run on average on my machine).
Some of the cached methods were never triggered twice for the same cursor within our code, so I tried enabling the cache only for those that showed promising numbers. But there was no measurable difference there either, so this branch enables the cache for all methods that are called from more than one place in our code and / or are directly exposed through the cursor's API (with one exception, as noted in the relevant commit).
Performance-wise this doesn't worry me; I think it's worth it for future-proofing all sorts of use cases within events and so on. But I'm not sure all methods deserve an equally large maximum cache size, or what the defaults should be. I also don't know how to best expose tuning and profiling values to the user.