-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: XML Entity file handling #701
base: integration
Are you sure you want to change the base?
Conversation
- moved the `Jhove` and `JhoveView` classes from default packageas Java 11 tooling complained; - updated `jhove-apps` POM to add new `Jhove` class package to manifest line; - updated windows batch and linux shell start scripts to add new `Jhove` class package to manifest line; and - updated Maven dependencies for `izpack -> 5.1.3` and `jacoco -> 0.8.7`for Java 11 compatibility.
- fixed misfiring entity processor; and - fixed package warnings.
Observation, if you're going to use an automated code formatter, it would make sense to run it on the entire project to create a baseline. This would allow for cleaner PRs and ease the burden of resolving spurious merge conflicts. |
ent = new InputSource(conn.getInputStream()); | ||
} | ||
|
||
ent = new InputSource(obj.openStream()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change here means that the code will no longer follow cross-protocol redirects (eg http->https). By default, HttpUrlConnection
does follow redirects. However, it does not follow redirects across protocols.
I've spent a fair bit of time looking at this function the past couple of days. Here is the modified implementation that I arrived at: https://github.com/UW-Madison-Library/jhove/blob/48d083b2394fc8b9426d9567744eff519284b830/jhove-modules/xml-hul/src/main/java/edu/harvard/hul/ois/jhove/module/xml/XmlModuleHandler.java#L332 Things to note:
|
Hi Peter, thanks for your super helpful review. This all sounds a little familiar, since I did a review of the XML module code a year or two ago, but can barely remember the details now. I fixed a few bugs, and did a lot of house keeping. The PR (#634) is still awaiting integration, but you might find it of interest, if only to prevent duplicating efforts in places. A fix for this issue which still retains the old redirection logic was included in commit d223a28 of that PR. I'm not sure if it addresses all of your concerns, since it's been too long for me to remember, but let me know if it doesn't, and maybe I can amend that older PR. |
@david-russo Yeah, it looks like your old PR does address the same issue as is addressed in this PR. It does still have the bug when resolving relative entities but that's a different problem. I haven't looked at your PR in detail, but it looks like there's some good stuff in there, including a few changes that I have also made. It would be really nice if PRs didn't languish here for years, forcing us to duplicate each other's work and creating an ever growing merge problem for PR authors if/when PRs ever start to be merged again. |
Cheers. Does this comment relate to the relative entity resolution issue you're describing? If so, from what I remember/wrote, it sounds like there's a hacky solution in the module at the moment, but a proper fix may require a change to what information JHOVE passes its modules. |
Yes, but it can be addressed in some (perhaps not all) cases, simply by setting the system id on the returned InputSource. See here I'll create a ticket for it with a demonstration of the problem later today. [Edit] That is for entities that are relative a parent. This fix has no affect on relative entities without parents. |
Fixes #700 output is now a more informative: