Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

notebook markdown comments #1828

Draft
wants to merge 6 commits into
base: dev
Choose a base branch
from
Draft

notebook markdown comments #1828

wants to merge 6 commits into from

Conversation

Cyrik
Copy link
Member

@Cyrik Cyrik commented Aug 12, 2022

What has changed?

  • comments are handled as better markdown by chopping off the ;; at the start

Fixes #

My Calva PR Checklist

If this PR involves only documentation changes, I have:

  • Read Editing Documentation
  • Directed this pull request at the published branch.
  • Built the site locally (if the changes were more involved than simple typo fixes), and verified that the site is presented as expected.
  • Referenced the issue I am fixing/addressing in a commit message for the pull request (if there was is an issue for the documentation change)
    • If I am fixing the issue, I have used GitHub's fixes/closes syntax
    • If I am fixing just part of the issue, I have just referenced it w/o any of the "fixes” keywords.

If this PR involves code changes, I have:

  • Read How to Contribute.
  • Directed this pull request at the dev branch. (Or have specific reasons to target some other branch.)
  • Made sure I have changed the PR base branch, so that it is not published. (Sorry for the nagging.)
  • Updated the [Unreleased] entry in CHANGELOG.md, linking the issue(s) that the PR is addressing.
  • Figured if anything about the fix warrants tests on Mac/Linux/Windows/Remote/Whatever, and either tested it there if so, or mentioned it in the PR.
  • Added to or updated docs in this branch, if appropriate
  • Tests
    • Tested the particular change
    • Figured if the change might have some side effects and tested those as well.
    • Smoke tested the extension as such.
    • Tested the VSIX built from the PR (so, after you've submitted the PR). You'll find the artifacts by clicking Show all checks in the CI section of the PR page, and then Details on the ci/circleci: build test.
  • Referenced the issue I am fixing/addressing in a commit message for the pull request.
    • If I am fixing the issue, I have used GitHub's fixes/closes syntax
    • If I am fixing just part of the issue, I have just referenced it w/o any of the "fixes” keywords.
  • Created the issue I am fixing/addressing, if it was not present.
  • Formatted all JavaScript and TypeScript code that was changed. (use the prettier extension or run npm run prettier-format)
  • Confirmed that there are no linter warnings or errors (use the eslint extension, run npm run eslint before creating your PR, or run npm run eslint-watch to eslint as you go).

Ping @PEZ, @bpringe, @corasaurus-hex, @Cyrik

@Cyrik Cyrik requested a review from PEZ August 12, 2022 16:44
@Cyrik
Copy link
Member Author

Cyrik commented Aug 12, 2022

So this is my experiment to get markdown comments working. Turns out parsing Clojure can still be pretty difficult in certain edge cases.
It works for some definitions of "work", but it's neither pretty nor general.
I'm debating pulling the parsing into cljs to make my life easier or to maybe use someone else's work 😄 WDYT?

@PEZ
Copy link
Collaborator

PEZ commented Aug 13, 2022

Can you summarize the need for parsing a bit? Where in the process does it happen, how often, etcetera. I'm thinking we are already doing parsing of the document and want to reason around wether that parsing can be used as is, or be augmented.

@Cyrik
Copy link
Member Author

Cyrik commented Aug 13, 2022

What we currently need is not super difficult, I think I'm just being stubborn and wanting to do it the functional, Clojure way, which is more difficult in JS.
Basically what I need is to split up a .clj file into multiple cells while somehow keeping track of the whitespace/comment lines that I'm using differently than they were in the file.
So as an example of a difficulty:

(def hover-map
  {:stuff "in-here"
   :ohter "yeah"
   ;; Yada yada
   :deeper {:a 1, "foo" :bar, [1 2 3] (vec (range 2000))}})

;; Line comment, line 1
;; 
;; Line comment, line 2
(defn foo []
  (println "bar"))

I would want something like this as output:

[{
	value: "(def hover-map\n		{:stuff \"in-here\"\n		 :ohter \"yeah\"\n		 ;; Yada yada\n		 :deeper {:a 1, \"foo\" :bar, [1 2 3] (vec (range 2000))}})",
	kind: vscode.NotebookCellKind.Code,
	languageId: 'clojure',
	metadata: { endingWhitespace: '\n\n' },
  },
  {
	value: "Line comment, line 1\n	\n	Line comment, line 2 ",
	kind: vscode.NotebookCellKind.Markup,
	languageId: 'markdown',
	metadata: { asMarkdown: true, endingWhitespace: '\n'},
  },
  {
	value: "(defn foo []\n		(println \"bar\"))",
	kind: vscode.NotebookCellKind.Code,
	languageId: 'clojure',
	metadata: { endingWhitespace: '\n'},
  }]

I can mostly get there already as you can see with this PR, but there are a lot of edge cases and then I will probably have to do some look-ahead or backtracking to move newlines/whitespace to the correct cell.
If we don't want to switch to cljs for this I should probably use the cursor directly instead of trying to base everything on the existing output 😅

Base automatically changed from feature/notebook to dev August 13, 2022 21:13
@PEZ
Copy link
Collaborator

PEZ commented Aug 18, 2022

Sorry for late reply, I somehow missed that there was new info here!

Thanks for elaborating. Together with the code, I am starting to understand the nature of the problem now. (I think.)

If we don't want to switch to cljs for this I should probably use the cursor directly instead of trying to base everything on the existing output 😅

I'm not totally against using cljs. Though, it would mean that this category of code would be solved in two different ways, with possibly different nuances in how they deal with details/edge cases. I think we're risking causing confusion for ourselves and for the users. Also, the problem seems like a quite perfect fit for the Token Cursor.

Assuming we use the Token Cursor and build on the approach you have started with (top level clojure ranges interpolated with non-clojure), maybe the convention could be that all consecutive blocks of lines starting with ;; (note the trailing single space) is converted to Markdown. Everything else is left as is. Something like:

  1. Get all top-level code ranges
  2. Interpolate with all non-code ranges
  3. With each non-code range:
    1. Partition the lines in groups based on wether they start with ;; or not.
    2. For each partition:
      • If it is of type lines-start-with-;; :
        1. remove the leading ;; from each line. Make a Markdown block of it, adding metadata about the type it is.
      • Else:
        1. Make a Markdown block, not touching the content. (Will sometimes not render the text very nicely. We can improve later if the need is felt.)
  4. With each code range:
    1. Create code block, not touching the content.
  5. Flatten (since the non-code ranges have been converted to arrays of blocks.

When converting notebook blocks to text:

  • If the cell is of type lines-start-with-;; :
    1. Prepend each line with ;;
  • Else:
    1. Leave contents as is

I.e. very much like what is done already, just the extra blocks created by the partitioning and the removal/adding of ;; line prefixes.

Does this make sense?

@Cyrik
Copy link
Member Author

Cyrik commented Aug 19, 2022

Yeah after playing with rich comments a bit I think the cursor should be enough, just feels strange to do "state" things 😅
As for your algorithm to do it: I think it would be a good start. I kind of wanted to prevent having "whitespace only" markdown blocks, but I might leave them for now.
I'll probably also switch to using the cursor directly instead of only having the ranges of top-level stuff. This will probably make it easier to put the whitespace metadata onto the correct cell.

@Cyrik Cyrik changed the base branch from dev to feature/notebook-rich-comments August 20, 2022 14:37
@Cyrik
Copy link
Member Author

Cyrik commented Aug 20, 2022

I've changed this to use the cursor directly. I still need to keep track of lines without the ;; but otherwise this feels like a good solution. Any comments @PEZ?

@PEZ
Copy link
Collaborator

PEZ commented Aug 21, 2022

just feels strange to do "state" things 😅

Are we, though? Sure, the Token Cursor is a very stateful creature, but in this case we are more inspecting state than doing anything ”state”, aren't we? I see it as we are reading a representation of the state of the file contents.

've changed this to use the cursor directly. I still need to keep track of lines without the ;; but otherwise this feels like a good solution. Any comments ?

I'll have a look. I did think the previous approach seemed sufficient, but maybe it was a bit opaque... Will be back when I've checked it out.

@PEZ
Copy link
Collaborator

PEZ commented Aug 21, 2022

Now checked. I can't figure out where the change ”to use the cursor directly” is. I see mostly changes to do with RFCs. Anyway, it looks right to me, what is there now. I'll try to find some time to try it out as well.

@Cyrik
Copy link
Member Author

Cyrik commented Aug 21, 2022

ah sorry, I forgot to remove the old calls 😅 it's now easier to see that I'm using the cursor directly instead of getting the top-level forms.
There's still some cleanup to do, but I wanted to see what the discussion around #1825 (comment) produces. I like hiding the whitespace but it does have its drawbacks.

@Cyrik Cyrik force-pushed the feature/notebook-rich-comments branch from 8979ea5 to eef3bcf Compare August 29, 2022 02:00
Base automatically changed from feature/notebook-rich-comments to dev August 29, 2022 02:15
@PEZ PEZ removed their request for review September 10, 2022 07:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants