-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OCamldoc/Odoc highlighting #106
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been running with an older version of this patch for many weeks and it's awesome !
I've noticed a bug due to types in code blocks, in a .mld
page:
{[
type t = int
]}
This is not recognized as text, int, [foo]
It seems related to type decls as it doesn't happen if followed by something else:
{[
type t = int
val f : int
]}
Also happens with invalid syntax so it might be related to ]}
not being a end
token:
{[
module
]}
Thank you :-) We can't reproduce the first example. |
Besides that, the OCaml syntax is heavily using “regions”, for instance for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! I just took a superficial look at the diff, without trying the code myself.
The bug found by @Julow seems very relevant. I’m a bit surprised that you cannot reproduce it. @Julow can you reproduce it with the latest version of the patch? If so, it’s indeed very likely because of ]}
not being an ending token for the type
region. In that case, ]}
should be added to ending tokens of relevant regions. It is unfortunate complexity, but things are what they are.
(Unless, perhaps, there is a way to arrange the syntax-matching instructions in some order that ultimately does The Right Thing, with respect to endings of nested regions; I doubt it, though I admit at the moment I don’t have in mind the details of how nested regions work).
syntax/odoc.vim
Outdated
syn region odocTargetSpecific matchgroup=odocMarker start="{%html:" end="%}" contains=@Spell | ||
endif | ||
|
||
syn region odocDyckWord contained start="{" end="}" contains=odocDyckWord |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny but obscure naming, why not something obvious like odocBracket
or odocGoodBracket
or odocBalancedBracket
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because that’s how they’re called? Especially since we are in the context of syntax, it seemed fitting.
https://en.wikipedia.org/wiki/Dyck_language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This region should have a transparent
attribute (see :help syn-transparent
).
Regarding naming: “Dyck word” is jargon known only to people who are knowledgeable in formal languages, and it equally applies to balanced square brackets in code spans (and to any recursive region really, including odocBold
and so on). odocBalancedBrace
is hopefully obvious to anyone reading that file, and it is more specific.
I checked that I use the last version of the patch, merged on top of #81. I reproduce the bug both in a |
I can reproduce it, I don’t really what happened last time 🤔 |
This also interferes with regular comments: (* type below is highlighted as a comment *)
type t |
60c451b
to
31134bc
Compare
I have this behaviour only with neovim. |
31134bc
to
40cd7d8
Compare
We’ve pushed an update that fixes the bug in which a single
is correctly parsed with vim 9.1.16 and neovim 0.9.4 for us. |
I confirm this now works fine :) Though, the doc comments in ml and mli files are no longer rendered in the same color as regular comments, which is confusing. Also, code spans are not highlighted differently: |
We’re making progress, great! module Make (X: sig end) =
struct
type t = int
end the body of the module is not highlighted correctly. This is similar to the bug we solved for comments, where an opening parenthesis was opening two syntax scopes. The parenthesis for the functor parameter is opening two
You think there is more than getting accustomed to this new highlighting? We could indeed highlight the text body as comments. Our rationale is that comments are usually dimmed while we want documentation to stand out.
Indeed. I think |
We've removed I believe all the known bugs are fixed. |
Hi, sorry for the delay! At last I found time to have a closer look at the PR. I found a couple of bugs, I will send a review soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At last here is my review.
I used the manual of odoc and that of ocamldoc to check for missing syntax features.
Things I noted to be missing, but which would be overkill for this PR:
- crossrefs can contain quotes, e.g.
{! section:"section-label-with-dashes" }
- keywords (like
@param
or@see
) each have their special syntax,
e.g.@param name_of_arg Paragraph of text about the argument
or@see <url_within_angle_brackets> Text for the link
.
To be honest, I’d like this to be highlighted (as the special syntax for@param
and@raise
always tricks me), but this can go in a followup PR.
The bugs mentioned in earlier messages seem resolved…
Except for the one about {[ module ]}
. This also happens for things like {[ object ]}
or even {[ ( ]}
. Is that something we want to fix, or are we happy with linting becoming messy when a code block contains bits of syntax which, alone, are invalid? Or at least do we tolerate it because it’s too hard to fix?
On the other hand, and to my pleased surprise, things like {[ type ]}
or {[ 42 : int ]
work just fine, despite the type linter’s heavy reliance on regions (see e.g. ocamlTypeDef
and ocamlTypeAnnot
), because among the ending tokens for all these regions, there is ]
, which subsumes ]}
. :-)
Regarding doc-comments being highlighted differently from regular comments:
I too find it a bit confusing, but it makes sense that it stands out; plus, trying to colorize doc-comments works badly with {b
, {i
, {e
(see my reviewing comment). I think we can let the users give a custom style to doc-comments if they wish so (by using :hi ...
for ocamlDocumentation, odocBold, odocItalic, odocEmphasis
).
Regarding code spans ([BLA]
and {v BLA v}
and also {@LANGUAGE[ BLA ]}
) not being highlighted:
I would say that it’s okay as long as the delimiters are highlighted and there is a match group the user can use for custom styling (in these cases, it’s respectively odocCode, odocVerbatim, odocCodeBlock
). However, personally I would highlight the text in [BLA]
, because the fact that we use brackets, instead of Markdown-style backticks, always tricks me, in particular I keep visually confusing them with OCaml lists’ delimiters.
FWTW, I tested the PR with Vim 9.1 in a color terminal, and I used the following .ml file for testing:
(* normal comment (* nested normal comment *)
* speling eror
* let *)
let test : int = 42*3
(** oDoc comment inside [.ml] file
* @author me
* @since always
* @unknown or non-existing tag @.
* {1 this is a heading}
* {1this is a wonky heading}
* {2:lbl this is a heading with a label}
* {b this is bold}
* {i this is italic}
* {e this is emphasized {b very much}}
* {^ superscript}
* {_ subscript}
* {% target-specific content, default target %}
* {%mytarget: target-specific content %}
* {%html: HTML content %}
* {%latex: LaTeX content %}
* {math LaTeX content %}
* {m LaTeX content %}
* {v verbatim text [{!@ v}
* {! extension:Module.Sub.crossref}
* {! extension-decl:Module.Sub.crossref}
* {{!val:Module.Sub.crossref} text for the cross-reference}
* {{: ocaml.org/ } text for the web link}
* {nonsensical} curly braces } are syntax errors {
*
* - unordered item
* - unordered item
* - unordered item
*
* + ordered item 1
* + ordered item 2
* + ordered item 3
*
* escaped characters: \@normal text \[ normal text \] \{ normal text \} \\
* (wrong escape: \a)
* some decorative asterisks
* speling eror
*
* (* nested normal comment *)
* (** nested doc-comment *)
* (**)
* let
*)
type t = (int * int (* comment in doc *))
type t = (int * int (** doc-comment in doc *))
(* type below should not be highlighted as a comment: *)
type t
(* empty/ruler comments: *)
(**)
(***)
(* stop comment: *)
(**/**)
module Make (X: sig end) =
struct
type t = int
end
(*after*)
As well as the following .mld file:
this is an oDoc file, the file extension is [.mld]
@author me
@since always
@unknown or non-existing tag @.
{1 this is a heading}
{1this is a wonky heading}
{2:lbl this is a heading with a label}
{b this is bold}
{i this is italic}
{e this is emphasized {b very much}}
{^ superscript}
{_ subscript}
{% target-specific content, default target %}
{%mytarget: target-specific content %}
{%html: HTML content %}
{%latex: LaTeX content %}
{math LaTeX content %}
{m LaTeX content %}
{v verbatim text [{!@ v}
{! extension:Module.Sub.crossref}
{! extension-decl:Module.Sub.crossref}
{{!val:Module.Sub.crossref} text for the cross-reference}
{{: ocaml.org/ } text for the web link}
{nonsensical} curly braces } are syntax errors {
- unordered item
- unordered item
- unordered item
+ ordered item 1
+ ordered item 2
+ ordered item 3
escaped characters: \@normal text \[ normal text \] \{ normal text \} \\
(wrong escape: \a)
(**)
** some decorative asterisks
speling eror
(* is this is a comment? *)
(** is this is a doc-comment? *)
below is a code block:
{@python[
def f(x):
return x + 1
]}
below are OCaml code blocks:
{@ocaml[
(* normal comment (* nested normal comment *)
* speling eror
* let *)
let test : int = 42*3
(** oDoc comment inside [.ml] file
** {1 this is a heading}
** (* nested normal comment *)
** speling eror
** let
**)
type t = (int * int (* comment in doc *))
type t = (int * int (** doc-comment in doc *))
(* type below should not be highlighted as a comment: *)
type t
(* empty/ruler comments: *)
(**)
(***)
]}
{[
type t = int ->
]}
Is this recognized as oDoc text (int, [foo]) ?
{[
let f : int ->
]}
Is this recognized as oDoc text (int, [foo]) ?
{[
module
]}
(* this is still OCaml syntax! *)
Foo = Stdlib.Map.Make (Stdlib.Int)
]}
This is text again!
syntax/odoc.vim
Outdated
" Shamelessly borrowed from HTML syntax | ||
hi def odocBold term=bold cterm=bold gui=bold | ||
hi def odocEmphasis term=bold,underline cterm=bold,underline gui=bold,underline | ||
hi def odocItalic term=italic cterm=italic gui=italic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE:
No action needed, but something to note.
This resets any style set by a surrounding region,
so the matched text (is bolded/italicized but)
falls back to default colors.
For instance, even if we are inside a doc-comment in an OCaml file,
and the user chose to highlight the doc-comments somehow
(e.g. :hi link ocamlDocumentation ocamlComment
or :hi ocamlDocumentation ctermfg=red
),
the bolded text will appear as white.
Unfortunately this is a limitation of Vim,
there is no way of inheriting surrounding style
(see e.g. here or there).
More modestly, with some amount of programming
(I found this technique here,
and someone even wrote a plugin),
we can copy the style given to a specific match group,
in our case ocamlDocumentation
;
something like:
exec 'hi odocBold term=bold cterm=bold gui=bold'
\. ' ctermfg=' . synIDattr(synIDtrans(hlID('ocamlDocumentation')), 'fg', 'cterm')
\. ' ctermbg=' . synIDattr(synIDtrans(hlID('ocamlDocumentation')), 'bg', 'cterm')
\. ' guifg=' . synIDattr(synIDtrans(hlID('ocamlDocumentation')), 'fg', 'gui')
\. ' guibg=' . synIDattr(synIDtrans(hlID('ocamlDocumentation')), 'bg', 'gui')
" ditto for odocItalic, odocEmphasis
… but this would have to be done after style is given to ocamlDocumentation
,
and I found no way of detecting this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, and then bolded text would be colored even in .mld files, which is something we don’t want, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the context. I rather agree that it’s better, at least for now, to keep it simple and eventually fix it if needs be.
Now that I think of it, I think many of the regions in
We don’t want to give the |
Sorry for all these comments! I have finally a higher-level remark: when I tried the PR with a real-life So I did these customizations on my part: hi link ocamlDocEncl PreProc
" ^ defined by PR review
hi link odocMarker PreProc
hi link odocCrossrefMarker odocMarker
" ^ defined by PR review
hi link odocTag PreProc
hi link odocCrossrefKw Structure
" ^ or PreProc
hi link odocLinkText odocItalic
hi link odocCode Constant
hi link odocCodeBlock odocCode
hi link odocVerbatim odocCodeBlock … and I also disabled OCaml syntax linting within code blocks within odoc text. And now it looks better to me. These are just suggestions, in case you like it too. |
Co-authored-by: Samuel Hym <[email protected]> Co-authored-by: Maëlan <[email protected]>
Co-authored-by: Samuel Hym <[email protected]> Co-authored-by: Maëlan <[email protected]>
Disallow nested ocamlModParam. For some reason the ocamlModParam was opened twice, but closed only once. Co-authored-by: Samuel Hym <[email protected]>
In case that helps, in the branch |
Co-authored-by: Samuel Hym <[email protected]>
First of all, thanks a lot for your very thorough review!
I’m not really good at styling, especially in a case that is so dependent on the independently-chosen color scheme.
|
This PR proposes to add syntax highlighting for OCamldoc and Odoc documentation.
This adds syntax highlighting for both
*.mld
files and documentation comments in OCaml files.OCamldoc/Odoc syntax needs to load OCaml syntax in order to highlight code extracts. This PR uses the following strategy to avoid an infinite loop: documentation inside code extract inside documentation is highlighted as simple comments.
This means that we can have the following load chains:
syntax/ocaml.vim
→syntax/odoc.vim
→syntax/ocaml.vim
→ STOPsyntax/odoc.vim
→syntax/ocaml.vim
→ STOPThis is joint work with @shym.