-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix include paths #263
base: master
Are you sure you want to change the base?
Fix include paths #263
Conversation
@BrunoMSantos Sorry for being quiet for such a long time. I've been researching and thinking about this on and off though, so rest assured you haven't been ignored even though I haven't replied. I think the overall conclusion is that we'll have to infer the args from the compiler, and we can't trust libclang to do it. It's a bummer, but oh well. As to the implementation, I have a few comments:
|
No worries, we've all been there! Sorry in advance for the long reply :)
So actually I see it as the expected behaviour: you state what compiler you use to compile the code, you get the exact same headers you do when compiling. If the default headers used by this compiler are not ok, then you have to override them just as you would for compiling anything. This is by far the least surprising option I think. Consider the alternative: you select the same compiler you use to compile the code, you give the exact same flags and still it won't parse the code ok despite the same options compiling fine, after which you go file a bug report / request for help.
What do you mean by compatible with libclang? It only needs to support extraction of the standard includes in the gcc/clang manner. If not, it's good to support the As for the default, it's a toss for me... Presumably clang is available, so it's a good default, in fact this is what we thought was libclang's default already. And as long as we default to clang, maybe we could use just
I confess I didn't understand what 'dtrt' should stand for 😅 As for the Finally, regarding the all/c/c++, currently it does assume a single compiler has support for both languages with
Hmm, good point, thanks.
I get the backwards compatibility point, but we add back anything we remove with I'm more sympathetic to the argument name change... Maybe this would avoid widening the API and still nudge it towards language agnosticism? def get_include_args(cpath='clang', lang='c', cc_path=None):
if cc_path is not None:
# Emit deprecation warning, maybe inform about `lang`
cpath = cc_path
... -->8-- Untimely right now, but I'm kind of brewing this idea that maybe we could (optionally) support the extraction of more specific compilation flags from a Not considering working on that for now though, however, a possible middle ground if not a stepping stone towards that might be a configuration dictionary where a user can specify options for any/all/none files (or directories?) individually. |
My point was: Specifying the compiler should not lead to any magic happening. That's just the compiler that would be used for figuring out the header search path if that's desired.
But we aren't using the compiler (front end), we use libclang. We need to figure out the header search path using the front end because libclang is unable to. And we need to specify which front end to use.
The point is, there may be several clang toolchains available. For example, on github macos-latest, there are three. Two from homebrew that use GCC libstd++ and one from Xcode that uses libc++. Whichever libclang you end up using, you need to specify a compiler that gives you the appropriate headers. The C++ headers between libstd++ and libc++ are different. The language versions supported by the compilers/libclang may be different.
Do The Right Thing. :)
This seems to claim it does: https://clang.llvm.org/docs/ClangCommandLineReference.html#include-path-management
The idea was mainly to try to do the right thing, but allow the user to not have that per language if one language works fine and the other not, and let user handle it manually.
Again, the idea is not to use |
Hmm, I'm a bit confused by your mentioning of "front ends" (such an LLVM term) and libclang compatibility. Our only compatibility requirement is that the code is some C/C++ dialect that libclang understands. I.e. But back on the magic claim, that's a fair point of disagreement I guess, though I'm a bit surprised we do disagree on that. It sounds so much more surprising that we should get missing headers despite specifying a compiler and the same compilation flags that work when passed to said compiler. Surely we're not doing enough with the information the user gave us, no?
Sure, but why is it different to have 10 different clangs or 10 different gccs? As for selecting a different libclang, that's out of our purview. But using libclang from toolchain A with headers from clang in toolchain B is totally fine and as valid as using the headers from some (e.g.) gcc based cross compiler. Now if you specify an exotic compiler that doesn't support exfiltration of include headers in this manner, then you're out of luck and you need to do it unaided.
That's for a newer version it seems, maybe they've added it recently? The version I tested with wouldn't work (I did try it originally assuming it would work like gcc). Don't remember if it tripped on the option or if it ignored it, I'll test it again later. But it's not in the man page for sure.
Hmmm, I'm not thoroughly convinced, but yeah, we could allow that. I'd then do
You mean it should fail gracefully unlike the current Maybe we could indeed warn that fetching the headers failed and invite the user to disable the |
I'm starting to feel like I could express myself more precisely writing this in Python rather than English. 😁 I don't think we have any huge disagreements here though. I think we agree we need to be able to configure a compiler to use for automatically figuring out the args to pass to libclang. The rest is details. Let's try to focus on one thing at a time maybe? I would like to separate specifying When the automatic part works, this is not a big deal. But what if it fails, and the user needs to manually specify the args? Wouldnt't it be nice if we had an interface to get the include search path using For the same reason I wouldn't want And we can have an internal function for doing all the magic for the automatic configure, including adding I'd suggest:
I'm simplifying the latter config option from my earlier suggestion. I don't think the existing Maybe we should have an API that takes Sphinx app or env so that it can look up the config to decide on the compiler to use? |
I think I addressed that specifically without a counter. Notably, the whole purpose of the function is to figure out standard includes, prefixing it with To the extent that anyone already relies on it... Maybe we could be more conservative in how we handle the transition? Though I fail to see what case should warrant such caution. (note: originally I thought you were referring to the API change where a user might be calling the function with named parameters as
I'm with you so far, though:
Yeah, it might still surprise a user that specifying a particular libclang should not yield the corresponding clang as default or fail to find it altogether. Not sure how to solve that though...
...and that is a big part of why I left the default as |
I'll probably work on a revised patch over the holidays, whatever the outcome. |
I guess that's fair. I'm just looking for maximum flexibility when that is required. I'm not hung up on this point though. Maybe it should be a separate first step to add
Well, I also meant that. 😁
I think you're looking to add one config option per thing to autoconfigure. I'm thinking true/false could be expanded to a list of things. Same goal, different approaches.
It's just that if If
I don't think we can even reliably figure out which libclang we end up using.
I'm just thinking of sane defaults. The default for libclang is just libclang.so in the library path. It's not too far off to use clang in the regular path. 🤷 |
else: | ||
clang_args.extend(getattr(conf, 'hawkmoth_clang_cpp', [])) | ||
clang_args.extend(compiler.get_include_args('clang', 'c++')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two should get the compiler config from conf.py. We already import it for other configs and use e.g. clang_args = getattr(conf, 'hawkmoth_clang', [])
.
For #254 I imagine we'll need to somehow pass the compiler path from the workflow yaml. It's a bit easier if we can handle that in conf.py alone, not everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order in which the final args are added should be the same in both the extension and here.
I'll split it up 👍
Ah, I see. That makes sense, keep the number of configuration variables down.
Right, possibly splitting hairs, but maybe a matrix of default possibilities can clear it up? Options numbered in my order of preference, with compiler / autoconf pairs as:
Options 1, 3 and 4 have no change in behaviour, but are affected by differences across distros that patch libclang's search paths by default (should we care?). One could argue that Between the 1st and 4th one, I'd wager it's more likely that clang is not the desired compiler than it is likely a user would not want auto path configuration having chosen a compiler, so the 4th option would more often require setting both settings explicitly to get it working as intended. More importantly though, the 1st one is more explicit in what hurdle a user should be clearing: it's not hawkmoth that magically autoconfigures when prompted, it's the user that selects an appropriate compiler so that it can do so. The 4th option places the focus on the magical 'autoconf' against a default/implicit compiler, which is why I rank it lower than my 3rd preference. And this is by far my most important claim I guess. Finally, n.2 changes the behaviour, but towards what we expected from libclang's defaults in the 1st place, so... Fine? There's a possible surprise with default mismatches between clang / libclang, but it should solve any difference across distros as long as a clang is in I really can't make a better case than this though, so I'll defer to you if I have not yet convinced you. |
I must be missing something. To me option 1 is an impossible default, because you can't autoconfigure without the compiler, which defaults to You refer to defaulting to liblang's paths (i.e. status quo), but then what does option 3 mean? Do you mean it would pass [...]
I would like to understand your idea better before I disagree with it. 😁 |
To me it makes sense in 1 of 2 ways, though the 1st one is better:
Option 1 aside, do you give any weight to the argument that most / all users' 1st concern should be with selecting the right compiler and not autoconf? Or maybe see option 2 as viable? It's really a toss between 1 and 2 for me: they both allow the user to select a compiler and get going. |
Right. IOW, autoconfigure by default if we have a compiler to work with. Finally 💡
I agree selecting the right compiler is more important than enabling autoconf. However, the 1st concern shouldn't be fiddling with either, really. If we had no backwards compatibility to consider, I think I'd probably go with How about backwards compatibility? If the user doesn't specify anything manually, or uses |
We've been taking backwards compatibility quite seriously considering we're still at version < 1.0. Too seriously? |
I agree, that option seems to be the perfect one in hindsight.
I may have done it wrong here, but we can and should ensure we add the autoconf Judging by the tickets we got and intuitively, I don't think any of those scenarios are much to worry about. |
I think it's a judgment call. Sometimes you've got to break things even when they are in use (e.g. imagine introducing a breaking change that is rolled out and becomes itself depended upon...). Fortunately I don't think that's our case. I think it's fair to say we have few users, and the ones we have had more problems working around the issue in the first place than not, and the fix (assuming option 2) would take a frankly adversarially constructed case to break anything 🤷♂️ I wouldn't place too much weight on being below v1.0 though. Open source projects have a tendency to dwell there long after general adoption and we haven't committed to a definition of v1.0 yet (have we?). |
Good point. My gut feeling was to add this first so they can be overridden, but really this is stuff that can't be overridden. I commented on the extension and tests having a different order. That's still valid, they should be the same, but these options should come last. |
The case for v1.0+ being different wrt breaking changes comes from semantic versioning. It specifically says the rules aren't the same for < v1.0. We follow it, but we haven't really committed to it. In any case, I think we should just go ahead with this. |
I'll try to recap. Add Modify Add Add If
If Agreed? I don't think that's too far off from your original proposal. Just the extra config option and figuring out stuff early. |
I'll get to it in a few days ;) |
a3b5c61
to
f22be55
Compare
I think it works as we discussed now. Let me know what you think ;) I'm still exploring how to tackle testing though. |
Include paths differ between C and C++ domains (typically C++ has additional search directories), so we need to take that into account. Furthermore, it makes sense to specify `-nostdinc` if we are including the standard include directories explicitly already. Not only we prevent potential duplication of search paths, we ensure that the resulting search is strictly confined to the paths the given compiler would search and not the disjunction of those and libclang's own default search paths. Since this is going to be called from within the directives now, it also makes sense to improve error handling, which we do.
It's a common problem to want hawkmoth to parse the sources with the exact headers a given code is meant to be compiled with and doing so requires some verbosity upfront which can easily be avoided. Provide a couple of options to do just that. The defaults are chosen in a technically backwards incompatible way, but it would take a frankly adversarial case to break it. In fact the new defaults should yield the default behaviour everyone expected from the beginning: we all expected libclang's search paths to be an exact match to clang's, and indeed they are in some distributions. Finally, the new search paths are added after other user's options since the headers are searched in a left to right manner, giving the user the possibility of overriding specific headers despite using this option. For maximum flexibility, this functionality is gated on the 'autoconf' settings, which can easily be extended later to support additional features.
This reverts the effects of commit 1243852, while fixing the inconsistent handling of C/C++ domains since libclang can't always figure out the right search paths for its own version of Clang despite prior claims. It is still unclear whether the upstream issue is an actual issue or if this is the intended behaviour (llvm/llvm-project#18150), though it _is_ an issue and simply using the fixed helper function and configuration parameter comes at a suitably low cost to maintenance. Fixes jnikula#262.
Allow each test case to select which of all the supported tests are active. By default, all of them are, but sometimes it's impossible to create tests that will pass as they target something specific to those tests. The motivation here is the expanding set of configuration variables, which have not been tested so far, but which only affect extension tests. Creating a test case for a breaking configuration would then not work since the parser / cli tests would not see the same breaking configuration. Technically, it's unlikely a test case will ever need to be _just_ for cli or parser, but not extension: anything that breaks those would break the extension ones. But we do it in a general way anyway for simplicity.
f22be55
to
ad093b8
Compare
- bool.c | ||
conf-overrides: | ||
hawkmoth_clang: -nostdinc | ||
hawkmoth_autoconf: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's this one failing in CI, though it passes locally. Don't have the time now (short lunch break). The idea was to add -nostdinc
to normalize behaviour across distros and disable autoconf such that stdbool.h
wouldn't be found that way either.
Not sure this one tests what we want to though, so I may just remove it: is it failing to find the header because we disabled auto configuring or because we set -nostdinc
?
|
||
This is a shortcut to specify ``-nostdinc -I<dir 1> ... -I<dir n>`` in | ||
:data:`hawkmoth_clang` with the search directories of the specified compiler. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the documentation for hawkmoth_compiler
should merely briefly reference what it's used for, and other places (i.e. hawkmoth_autoconf
) should describe the details.
self._clang_args_post = compiler.get_include_args(cpath, 'c++') | ||
else: | ||
self.logger.warning('autoconf: \'stdinc\' option ignored (missing compiler)') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still gets called once per each directive used in the documentation. There's one instance for directive.
The hard part is that setup()
can't use the config options. IIUC you'd have to use a Sphinx event to do it once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative is to simplify and go back to the first version (instead of the constructor) and optimize later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, of course. Well, I did this on a crowded flight, it was bound to happen 😅
I'll look into it when I can. There are other (maybe less clean) alternatives like caching the result if we go for a temporary solution, but the events should work well enough if there's a suitable one.
So here's a possible (interim) solution which also solves a common problem for me at least: always having to specify
-nostdinc -I...
manually when playing around with cross compiled projects and so on.This does not address the fact that the
clang
in the path might not be the one associated with the libclang we're using at any given moment, but does that matter? We're telling hawkmoth to parse these files with the headers used by compiler A, where A may not even be clang at all... Generally, libclang is meant to provide us with a parser, not code.As for our tests... We don't really care I think: both the parser and our code snippets will work with any compliant headers, we just make a fair assumption that there's a clang binary in the path, which is quite fair except for mac perhaps (I didn't look into that, but should be a simple fix, no?).
What do you think @jnikula ?