-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow copying with ANSI escape code control sequences #17059
base: main
Are you sure you want to change the base?
Conversation
This comment has been minimized.
This comment has been minimized.
@microsoft-github-policy-service agree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if I should be looking to add automated tests with this PR; if so, I'd love a pointer on where would be best to do that (eg if there's an analogous test I could use as reference).
I've also left a few questions inline:
if (WithControlSequences()) | ||
{ | ||
ss << L", withControlSequences: true"; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if I should have actual, translated special case names pulled from the resources when this is set (eg Copy with control sequences
and Copy single line with control sequences
). Happy to make that change if it's preferred!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meh, IMO this is good enough for now. We can always iterate on the naming in post
src/buffer/out/textBuffer.cpp
Outdated
@@ -2102,6 +2102,32 @@ std::wstring TextBuffer::GetPlainText(const CopyRequest& req) const | |||
return selectedText; | |||
} | |||
|
|||
// Routine Description: | |||
// - Retrieves the text data from the buffer *with* ANSI escape code control sequences and presents it in a clipboard-ready format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, any guidance on comment max length? I didn't wrap any of the doc comments I added, they get a bit long heh 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to stick to around 120 characters and 140 at most, because then you can still comfortably read it on smaller ~14" laptop screens. In this instance I'd probably split it into 2 lines and make them even length.
FYI you don't need to write "Routine Description:" on new functions. This is only a hold over from our old code base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I'll go ahead and reflow the comments I've added (and remove that header haha) 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for doing this. I'm just a contributor like you 😄
I've added some helpful suggestions that you can think through while you wait for the core maintainers' reviews.
@@ -548,7 +548,7 @@ namespace winrt::TerminalApp::implementation | |||
{ | |||
if (const auto& realArgs = args.ActionArgs().try_as<CopyTextArgs>()) | |||
{ | |||
const auto handled = _CopyText(realArgs.DismissSelection(), realArgs.SingleLine(), realArgs.CopyFormatting()); | |||
const auto handled = _CopyText(realArgs.DismissSelection(), realArgs.SingleLine(), realArgs.WithControlSequences(), realArgs.CopyFormatting()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we can add a new CopyFormat
, CopyFormat::VT
, that would avoid the need for plumbing WithControlSequences
down to ControlCore
+ introducing a new setting. It will just be carried forward with other copy formats.
If the user has "vt" in the copyFormatting
list:
"copyFormatting": ["vt"],
we copy the text in the new VT format (including in any other formats set by the user).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add the new format VT = 0x04
here:
terminal/src/cascadia/TerminalControl/EventArgs.idl
Lines 7 to 12 in 5f3a857
enum CopyFormat | |
{ | |
HTML = 0x1, | |
RTF = 0x2, | |
All = 0xffffffff | |
}; |
In ControlCore::CopySelectionToClipboard
, you can check if we need to copy in VT format like so:
const auto copyVt = WI_IsFlagSet(copyFormats, CopyFormat::VT);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I had the same thought when implementing this but wasn't sure if that's actually what we want here. In particular, currently the formats represent a wholly different clipboard format. That being said singleLine
does affect all the formats so it's a bit odd to pass this boolean through to just affect the "plain text" (CF_UNICODETEXT
) format, so I'm certainly open to having a "pseudo format" here that modifies the plain text rather than opting into another one! Not totally sure how to make this call haha.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely this isn't plain text, so it should actually be a separate clipboard format? Personally I would expect it to paste as ANSI only if the application can handle that, and the plain text format should still be available for applications that just accept text. Typically it should be the app's decision as to what format they want.
That said, I don't know if there's a generally agreed upon clipboard registration for ANSI text that will work automatically, but worst case I think apps should be able to offer users a choice of the available formats, and let them decide want they want at the time of the paste.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, I don't know if there's a generally agreed upon clipboard registration for ANSI text that will work automatically, but worst case I think apps should be able to offer users a choice of the available formats, and let them decide want they want at the time of the paste.
As far as I'm aware there's no such format. It likely wouldn't work with a custom format either. My understanding is that the goal is that you can paste the ANSI text into markdown documents. For instance Discord supports this:
This will only work if the ANSI text is in the CF_UNICODETEXT
format.
We've briefly talked about this as a team today and we felt like this new format should not be exposed in the settings UI here:
Precisely because of the interop issue that James mentioned. Instead it could be triggered with an action via the command palette. That is, we could for instance introduce a new action type for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an ideal world, the apps that wanted functionality like this would have got together and decided on a custom clipboard format name, say "ANSIX3.64", which they'd all agree to support. Then when someone like Discord saw that on the clipboard, they could have automatically pasted it surrounded by their ansi markdown wrapper. Having to do that manually is sad. But I guess we have to live with what we've got.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahaha indeed, I'm the person who hacked in the ansi
"syntax highlighting" in Discord 😅
It's interesting (and, to be clear, I'm not sure what the "right" answer is here or if there is one), but in the context of using this in markdown (aka in Discord or in some other markdown as mentioned in #15703), what you're authoring is plaintext. Like, if I copied code from a website and it populated both the unicode and the HTML formats in the clipboard, when I pasted it into my markdown editor I'd personally expect it to paste the plaintext and not try to... paste formatted/wysiwyg contents, or to auto-insert a code block for me. And (again, in my opinion/use cases) similarly here. I would expect this to act less like a seamless parallel copy and more like a different behavior.
introduce a new action type for this
Especially given the above that makes sense! Would that mean something like what's done here, or not piggy backing on the copy
action at all and introducing a new one with it's own arguments (if any)?
I really appreciate the discussion here, as well as you taking the time to talk this over with your team! 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when I pasted it into my markdown editor I'd personally expect it to paste the plaintext and not try to... paste formatted/wysiwyg contents
This is what the "Paste Special" menu option is for. Have a look at WordPad if you want to see an example of how it's supposed to work. Just type in some text with some formatting and copy it to the clipboard, then press Ctrl+Alt+V. You should see a popup dialog offering you a bunch of choices for how you want the content to paste. If you paste as "Rich Text" it'll keep the formatting. If you paste as "Unformatted Text", you just get the plain text.
If you're responsible for the Discord clipboard code, you should be able to implement the same sort of thing. So by default you can paste as plain text, but still give users the option to paste as ANSI via a "Paste Special" context menu, or shortcut key.
Edit: I should add that I don't feel strongly about this if everyone else disagrees with me, or it's too much effort. It's just that this is the way I would have expected it to work in a perfect world.
src/buffer/out/textBuffer.cpp
Outdated
ChunkedSerialize(writeThreshold, L"\uFEFF", nullptr, [&](std::wstring& buffer, bool /* isDone */) { | ||
const auto fileSize = gsl::narrow<DWORD>(buffer.size() * sizeof(wchar_t)); | ||
DWORD bytesWritten = 0; | ||
THROW_IF_WIN32_BOOL_FALSE(WriteFile(file.get(), buffer.data(), fileSize, &bytesWritten, nullptr)); | ||
THROW_WIN32_IF_MSG(ERROR_WRITE_FAULT, bytesWritten != fileSize, "failed to write"); | ||
buffer.clear(); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible refactoring:
I think we can avoid std::function
usage here, by making ChunkedSerialize
chunk buffer in row-by-row fashion instead of bytes-sized data. Then, both SerializeToPath
and GetWithControlSequences
can decide to read more or flush based on their specific logic.
This might be more complicated, so it's okay to not get into this right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if you decide to do this, then rename ChunkedSerialize
to SerializeRow
to be explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is a fantastic idea, it really simplifies things! ChunkedSerialize
felt like a clunky interface with the callback and it knew too much about how it might be used (changing behavior based on whether or not copyReq
was passed). This refactor is much cleaner, I think, and completely invalidates a number of the questions I had below (no longer have the inline lambda for the rowInfo
, no longer have the question about how to pass a maybe-CopyReq
) 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh thank you so much for the review @tusharsnx! I really appreciate it 😊
src/buffer/out/textBuffer.cpp
Outdated
ChunkedSerialize(writeThreshold, L"\uFEFF", nullptr, [&](std::wstring& buffer, bool /* isDone */) { | ||
const auto fileSize = gsl::narrow<DWORD>(buffer.size() * sizeof(wchar_t)); | ||
DWORD bytesWritten = 0; | ||
THROW_IF_WIN32_BOOL_FALSE(WriteFile(file.get(), buffer.data(), fileSize, &bytesWritten, nullptr)); | ||
THROW_WIN32_IF_MSG(ERROR_WRITE_FAULT, bytesWritten != fileSize, "failed to write"); | ||
buffer.clear(); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is a fantastic idea, it really simplifies things! ChunkedSerialize
felt like a clunky interface with the callback and it knew too much about how it might be used (changing behavior based on whether or not copyReq
was passed). This refactor is much cleaner, I think, and completely invalidates a number of the questions I had below (no longer have the inline lambda for the rowInfo
, no longer have the question about how to pass a maybe-CopyReq
) 🙏
@@ -548,7 +548,7 @@ namespace winrt::TerminalApp::implementation | |||
{ | |||
if (const auto& realArgs = args.ActionArgs().try_as<CopyTextArgs>()) | |||
{ | |||
const auto handled = _CopyText(realArgs.DismissSelection(), realArgs.SingleLine(), realArgs.CopyFormatting()); | |||
const auto handled = _CopyText(realArgs.DismissSelection(), realArgs.SingleLine(), realArgs.WithControlSequences(), realArgs.CopyFormatting()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I had the same thought when implementing this but wasn't sure if that's actually what we want here. In particular, currently the formats represent a wholly different clipboard format. That being said singleLine
does affect all the formats so it's a bit odd to pass this boolean through to just affect the "plain text" (CF_UNICODETEXT
) format, so I'm certainly open to having a "pseudo format" here that modifies the plain text rather than opting into another one! Not totally sure how to make this call haha.
src/buffer/out/textBuffer.cpp
Outdated
// - addLineBreak - Whether to add a line break at the end of the serialized row. | ||
// - buffer - A string to write the serialized row into. | ||
// - previousTextAttr - Used for tracking state across multiple calls to `SerializeRow` for sequential rows. The value will be mutated by the call. The initial call should contain `nullopt`, and subsequent calls should pass the value that was written by the previous call. | ||
void TextBuffer::SerializeRow(const ROW& row, const til::CoordType startX, const til::CoordType endX, const bool addLineBreak, std::wstring& buffer, std::optional<TextAttribute>& previousTextAttr) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't actually use anything on TextBuffer
, should it still live on this class (as opposed to, eg, a freestanding function or maybe even on ROW
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ROW
is technically an internal implementation detail of our TextBuffer
. It's not intentional that it has leaked out into tons of places over the years. Ideally it should stay internal 1, and if it can't, then it should at least be a member of TextBuffer
.
Footnotes
-
The only reason we have a
ROW
at all is because our current text buffer implementation happens to store text in a list of "rows" (often called a line buffer). But that's not the only way to implement text buffers. In fact, a line buffer is usually considered the worst kind of text buffer, as it's neither trivial to implement nor fast. Alternative algorithms like the rope often don't have "rows" at all. They do know the current cursor position in rows and columns of course, but navigation between rows/columns happens by walking through the underlying text - you cannot simply "jump" to the preceding/next row. This shows why it's paramount that we don't leak algorithm-specific internals out ofTextBuffer
. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense! I've moved this to be a private method since it's only used internally by TextBuffer
, so we're not adding a usage of ROW
to its external interface. Is that roughly what you had in mind? Again, there's still not anything in the method that actually needs it to be a TextBuffer
member, but I wasn't sure what else "stay internal" might mean here haha.
src/buffer/out/textBuffer.cpp
Outdated
// - buffer - A string to write the serialized row into. | ||
// - previousTextAttr - Used for tracking state across multiple calls to `SerializeRow` for sequential rows. The value will be mutated by the call. The initial call should contain `nullopt`, and subsequent calls should pass the value that was written by the previous call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular way to notate out params (like buffer
) and in/out params (like previousTextAttr
)?
Also for previousTextAttr
, another option is to have it take a copy of the value and return a TextAttribute
indicating the final state, which callers would then need to pass to subsequent calls; is that better? (maybe less spooky action at a distance? Mutating input params gives me the heebie jeebies 👻)
I'll try to review your PR sometime later this week and hopefully help you with designing the new TextBuffer functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/buffer/out/textBuffer.cpp
Outdated
@@ -2102,6 +2102,32 @@ std::wstring TextBuffer::GetPlainText(const CopyRequest& req) const | |||
return selectedText; | |||
} | |||
|
|||
// Routine Description: | |||
// - Retrieves the text data from the buffer *with* ANSI escape code control sequences and presents it in a clipboard-ready format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, I'll go ahead and reflow the comments I've added (and remove that header haha) 👍
src/buffer/out/textBuffer.cpp
Outdated
// - addLineBreak - Whether to add a line break at the end of the serialized row. | ||
// - buffer - A string to write the serialized row into. | ||
// - previousTextAttr - Used for tracking state across multiple calls to `SerializeRow` for sequential rows. The value will be mutated by the call. The initial call should contain `nullopt`, and subsequent calls should pass the value that was written by the previous call. | ||
void TextBuffer::SerializeRow(const ROW& row, const til::CoordType startX, const til::CoordType endX, const bool addLineBreak, std::wstring& buffer, std::optional<TextAttribute>& previousTextAttr) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes a lot of sense! I've moved this to be a private method since it's only used internally by TextBuffer
, so we're not adding a usage of ROW
to its external interface. Is that roughly what you had in mind? Again, there's still not anything in the method that actually needs it to be a TextBuffer
member, but I wasn't sure what else "stay internal" might mean here haha.
@@ -548,7 +548,7 @@ namespace winrt::TerminalApp::implementation | |||
{ | |||
if (const auto& realArgs = args.ActionArgs().try_as<CopyTextArgs>()) | |||
{ | |||
const auto handled = _CopyText(realArgs.DismissSelection(), realArgs.SingleLine(), realArgs.CopyFormatting()); | |||
const auto handled = _CopyText(realArgs.DismissSelection(), realArgs.SingleLine(), realArgs.WithControlSequences(), realArgs.CopyFormatting()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hahaha indeed, I'm the person who hacked in the ansi
"syntax highlighting" in Discord 😅
It's interesting (and, to be clear, I'm not sure what the "right" answer is here or if there is one), but in the context of using this in markdown (aka in Discord or in some other markdown as mentioned in #15703), what you're authoring is plaintext. Like, if I copied code from a website and it populated both the unicode and the HTML formats in the clipboard, when I pasted it into my markdown editor I'd personally expect it to paste the plaintext and not try to... paste formatted/wysiwyg contents, or to auto-insert a code block for me. And (again, in my opinion/use cases) similarly here. I would expect this to act less like a seamless parallel copy and more like a different behavior.
introduce a new action type for this
Especially given the above that makes sense! Would that mean something like what's done here, or not piggy backing on the copy
action at all and introducing a new one with it's own arguments (if any)?
I really appreciate the discussion here, as well as you taking the time to talk this over with your team! 🙏
I've reviewed the code now and I think it's ready to merge. Ideally we would replace all those boolean parameters with something "more compact", an enum, struct, or something else, but I think that's outside the scope of the PR. However, I'd personally like to wait merging this PR until we've released 1.21 to Preview (in a month or so?) and only merge it into version 1.22. That way we'll have enough time to test this and noodle about any followup improvements we may want to do. |
Thank you for taking the time to review it! That's great to hear and makes sense regarding waiting for the release to merge 👍 I'm happy to make some refactors regarding the "more compact" thing you were talking about, but I'd need a little more about what you've got in mind. More compact in terms of syntax or bytes taken (or something else)? Would this effectively be combining |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super fun. Let's do it.
if (WithControlSequences()) | ||
{ | ||
ss << L", withControlSequences: true"; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meh, IMO this is good enough for now. We can always iterate on the naming in post
The merge between this and main has gotten a little gnarly - I'm gonna take it out of the bugbash queue for today. Pretty sure it's not an impossible merge, but longer than I have time ATM |
This extends the copy command to be able to include control sequences, for use in tools that subsequently know how to parse and display that. At a high level, this: - Expands the `CopyTextArgs` to have a `withControlSequences` bool. - Plumbs that bool down through many layers to where we actuall get data out of the text buffer. - Pulls out the core logic of `TextBuffer::Serialize` to a private `TextBuffer::_SerializeRow`. - Uses the new `_SerializeRow` to generate the data for the copy request. To test this I've manually: - Generated some styled terminal contents, copied it with the control sequences, pasted it into a file, `cat`ed the file and seen that it looks the same. - Set `"firstWindowPreference": "persistedWindowLayout"` and validated that the contents of windows are saved and restored with styling intact.
5f92f02
to
2d1d192
Compare
Uh oh. I forgot about this PR. 😥 I'll try to review & get this merged soon. |
Naw, it was my bad for taking so long to do the rebase, sorry about that! I've got it on top of latest |
Summary of the Pull Request
This extends the copy command to be able to include control sequences, for use in tools that subsequently know how to parse and display that.
References and Relevant Issues
#15703
Detailed Description of the Pull Request / Additional comments
At a high level, this:
CopyTextArgs
to have awithControlSequences
bool.TextBuffer::Serialize
to be more generic and renames it toTextBuffer::ChunkedSerialize
.ChunkedSerialize
to generate the data for the copy request.Validation Steps Performed
To test this I've manually:
cat
ed the file and seen that it looks the same."firstWindowPreference": "persistedWindowLayout"
and validated that the contents of windows are saved and restored with styling intact.I also checked that
Invoke-OpenConsoleTests
passed.PR Checklist