Use JuliaSyntax #767

Open
domluna opened this issue Oct 16, 2023 · 0 comments

domluna (Owner) commented Oct 16, 2023

Intention is to replace CSTParser with JuliaSyntax.

Opening this up for discussion in case there are things I haven't thought of before we proceed. cc @davidanthoff @pfitzseb @c42f

Reasons to switch

  1. Better timings and fewer allocations.
# src/JuliaFormatter.jl
julia> @btime JuliaSyntax.parseall(JuliaSyntax.GreenNode, s);
  640.334 μs (1276 allocations: 418.82 KiB)

julia> @btime CSTParser.parse(s, true);
  1.091 ms (21577 allocations: 994.00 KiB)

# test/runtests.jl
julia> @btime JuliaSyntax.parseall(JuliaSyntax.GreenNode, s);
  140.459 μs (350 allocations: 103.23 KiB)

julia> @btime CSTParser.parse(s, true);
  145.417 μs (3523 allocations: 209.05 KiB)
  2. JuliaSyntax appears to handle cases that currently require hacks on top of CSTParser. Removing even some of those hacks would be a win. See https://julialang.github.io/JuliaSyntax.jl/dev/reference/#Syntax-Trees

  3. Easier to extend node functionality. See https://github.com/JuliaDebug/Cthulhu.jl/tree/master/TypedSyntax.

  4. Node API is similar to CSTParser. The key fields are head, args, trivia, span, etc. So we might not need to change that much, especially since both CSTParser and JuliaSyntax seem to adhere to Meta.parse. head carries a Kind, which we will need to either use directly or convert to its string form.

help?> JuliaSyntax.Kind
  K"name"
  Kind(namestr)


  Kind is a type tag for specifying the type of tokens and interior nodes of a syntax tree. Abstractly, this tag is
  used to define our own sum types for syntax tree nodes. We do this explicitly outside the Julia type system because
  (a) Julia doesn't have sum types and (b) we want concrete data structures which are unityped from the Julia
  compiler's point of view, for efficiency.

  Naming rules:

    •  Kinds which correspond to exactly one textural form are represented with that text. This includes keywords
       like K"for" and operators like K"*".

    •  Kinds which represent many textural forms have UpperCamelCase names. This includes kinds like
       K"Identifier" and K"Comment".

    •  Kinds which exist merely as delimiters are all uppercase

Example of conversion:

julia> x1.head
JuliaSyntax.SyntaxHead(K"macrocall", 0x0020)

julia> x1.head.kind
K"macrocall"

julia> string(x1.head.kind)
"macrocall"
  5. Going forward, JuliaSyntax will likely be maintained far better than CSTParser since it will be the default parser. We would no longer have to wait for syntax to be implemented in CSTParser after it lands in the main language, e.g. multidimensional array syntax.

Things to fix

This is a list of things to fix with the current formatter which might be easier to do with JuliaSyntax as the new backend.

It's not a comprehensive wish list. It's a "hopefully this will be more feasible with JuliaSyntax" list.

  1. Comment purgatory.

#690
#197 - nice to have, as options perhaps
#241
...

  2. Semicolons. CSTParser does not keep track of semicolons at all. We make it work for cases where semicolons are expected, such as array syntax, but solving #565 (Option to add or at least not remove trailing semi-colons?) is well beyond what this kind of heuristic can do. We would want something that tracks semicolons precisely.

#565
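
As a rough illustration of what precise semicolon tracking could look like, here is a minimal sketch that lists the byte offset of every `;` token in a source string using JuliaSyntax's tokenizer. The helper name `semicolon_offsets` is hypothetical and not part of JuliaFormatter or JuliaSyntax. The sketch also assumes that the tokens returned by `JuliaSyntax.tokenize` expose a `range` field, which may differ between JuliaSyntax versions.

```julia
using JuliaSyntax
using JuliaSyntax: @K_str

# Hypothetical helper: collect the 1-based byte offset of each `;` token.
# Semicolons inside string literals or comments are part of those tokens,
# so they are not reported.
function semicolon_offsets(text::AbstractString)
    offsets = Int[]
    for tok in JuliaSyntax.tokenize(text)
        if JuliaSyntax.kind(tok) == K";"
            # Assumption: `tok.range` is the byte range of the token in `text`.
            push!(offsets, Int(first(tok.range)))
        end
    end
    return offsets
end

semicolon_offsets("x = 1;")
```

A formatter could use offsets like these to decide whether a trailing semicolon was present in the original source, rather than inferring it from context.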
