-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XML support #22
XML support #22
Conversation
Interesting!!! That's really cool. My main hesitation here is, I'll be honest, I'm not super up on the nitty gritty of XML and particularly how XML differs from HTML. I only had HTML in mind when I originally designed Hickory. If the data model is that compatible, that would be awesome. We could surely implement some XML parser for the Clojure version. I'm not a Clojurescript user, so I am not sure what the XML parsing situation is in Javascript land. IE9+ doesn't sound great, but I defer to @jeluard on these issues. He has selected an HTML parser for the CLJS version, and although none of those options were superb, I do think he made the best compromise possible, so I would think trying to be as close to there would be best. |
A JS-expert friend of mine just said that IE9+ is "pretty reasonable in the modern era." |
Make case insensitive attr selector work even when actual attributes may have case.
This reverts commit 1a39238. Easier to start fresh I think.
I think one of the fiddly bits is that the code base tries to do case On 23 December 2014 at 13:49, David Santiago [email protected]
|
Yeah, less worried about that. As a base line it would be quite broad On 23 December 2014 at 13:53, David Santiago [email protected]
|
Hm, yeah. It's supposed to be case insensitive for HTML. I'm not sure what to do there. Rather than trying to squeeze a round peg through a square hole, maybe we should introduce a new namespace, like hickory.select-xml, with selectors that are adapted specifically for the XML usage case. What would you think of that? I think the thing is some people want to use it outside of browsers. |
Sounds fine to me. I expect it will be able to reuse most things. On 23 December 2014 at 18:03, David Santiago [email protected]
|
IE9+ sounds good enough and the code is pretty similar to the existing so it all sounds good. I agree there probably will be people expecting this feature to work outside browser, including on node platform. Also I am wondering if it would not make more sense to add ClojureScript support to data.xml rather than XML support to |
That sounds strategically sane. I imagine building on Google Closure's goog.dom.xml namespace would avoid Either way, for my current project I think I'll stick with my current cheers, Oliver On 23 December 2014 at 23:41, Julien Eluard [email protected]
|
Quick update regarding node support. Looks like "no browser = no dom = no https://code.google.com/p/closure-library/wiki/NodeJS Does Closure-Library-on-Node.js behave differently than Closure Library in Seems sensible to switch in the loadXml call though. I'll patch to use
|
Okay, those two changes are added. |
Sorry guys, was traveling yesterday. @jeluard Good question about data.xml. I assume Oliver had a reason for doing this work, though. Although I don't have a specific use case in mind for why a library that can work with both XML and HTML is useful, it seems like something that could be useful. But @olivergeorge, you said something above about doing this requiring "a slightly stricter approach." What did you mean by that? I didn't understand there to be any changes to the semantics of HTML handling in any of these patches. Also, I assumed that more selectors would need modification to be correct for XML semantics. There's really only just the one selector that should be case-sensitive? If it's just that one, maybe we should just do a namespace like hickory.xml and stick all of the XML related functions together in there? |
I also would like XML support. I'm currently doing it like this: (defn parse-xml
"Parse an XML string into DOM objects."
[s]
(.parseFromString (js/DOMParser.) s "text/xml")) I'm using as-hiccup now and it seems to work fine. I definitely understand why you'd want to use goog.dom.xml though. Perhaps this just needs a few tests and it's good to go? |
So, the thing that I have there doesn't really work because my code doesn't fix |
I just backported this, and it worked like a charm. I also added: (def as-hiccup-xml
(comp hickory-to-hiccup as-hickory-xml)) What would it take to get this merged? Should I add unit tests or something? |
I wonder if it might make sense to use the browser DOM APIs to reproduce a document and then serialize it. |
I believe data.xml is the proper tool to parse XML and will soon-ish document it in a prominent place in hopes that people seeking an XML parser will find an XML parser and not "misuse" an HTML parser for that usecase.
The main considerations here are scope creep and expertise. Expertise: I write web-scraping programs and am familiar with the ins and outs of network requests and HTML, not XML. I'll try my best to make hickory extensible and applicable to other hiccup producing libraries (#24) |
I've done a little work to make hickory support basic XML parsing and rendering. I'm wondering if this is something you think could be added to the main hickory repo.
My goal was parse an XML document, make changes to it and then serialise the back to XML. Where possible I wanted the rendered XML to be as close to the original as practical to simplify debugging.
The changes needed were all quite minor really:
I cut and pasted a lot to keep things simple (for me) but I think you will have ideas on a more integrated approach.
I'm concious that this code relies on js/DOMParser which might be more restrictive than hickory is otherwise. In particular it's an IE9+ approach.