Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add lint step for ambiguous list literals #267

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

annevk
Copy link
Member

@annevk annevk commented Jan 3, 2022

Copy whatwg/whatwg.org@33ad837 and adjust formatting slightly.

@annevk annevk requested a review from domenic January 3, 2022 09:13
@annevk
Copy link
Member Author

annevk commented Jan 4, 2022

@noamr @zcorpan it seems that this list literal check isn't working as expected, at least the error message isn't. See the output at https://github.com/whatwg/html-build/runs/4689408108.

Wrong list literals, use ABBB instead:

lint.sh Outdated
perl -ne '$/ = "\n\n"; print "$_" if (/chosing|approprate|occured|elemenst|\bteh\b|\blabelled\b|\blabelling\b|\bhte\b|taht|linx\b|speciication|attribue|kestern|horiontal|\battribute\s+attribute\b|\bthe\s+the\b|\bthe\s+there\b|\bfor\s+for\b|\bor\s+or\b|\bany\s+any\b|\bbe\s+be\b|\bwith\s+with\b|\bis\s+is\b/si)' "$1" | perl -lpe 'print "\nPossible typos:" if $. == 1'
grep -niE '((anonym|author|categor|custom|emphas|initial|local|minim|neutral|normal|optim|raster|real|recogn|roman|serial|standard|summar|synchron|synthes|token|optim)is(e|ing|ation|ability)|(col|behavi|hono|fav)our)' "$1" | grep -vE "\ben-GB\b" | perl -lpe 'print "\nen-GB spelling (use lang=\"en-GB\", or <!-- en-GB -->, on the same line to override):" if $. == 1'
perl -ne '$/ = "\n\n"; print "$_" if (/\ban\s+(<[^>]*>)*(?!(L\b|http|https|href|hgroup|rb|rp|rt|rtc|li|xml|svg|svgmatrix|hour|hr|xhtml|xslt|xbl|nntp|mpeg|m[ions]|mtext|merror|h[1-6]|xmlns|xpath|s|x|sgml|huang|srgb|rsa|only|option|optgroup)\b|html)[b-df-hj-np-tv-z]/si or /\b(?<![<\/;])a\s+(?!<!--grammar-check-override-->)(<[^>]*>)*(?!&gt|one)(?:(L\b|http|https|href|hgroup|rt|rp|li|xml|svg|svgmatrix|hour|hr|xhtml|xslt|xbl|nntp|mpeg|m[ions]|mtext|merror|h[1-6]|xmlns|xpath|s|x|sgml|huang|srgb|rsa|only|option|optgroup)\b|html|[aeio])/si)' "$1" | perl -lpe 'print "\nPossible grammar problem: \"a\" instead of \"an\" or vice versa (to override, use e.g. \"a <!--grammar-check-override-->apple\"):" if $. == 1'
grep -ni 'and/or' "$1" | perl -lpe 'print "\nOccurrences of making Ms2ger unhappy and/or annoyed:" if $. == 1'
grep -niE '\s+$' "$1" | perl -lpe 'print "\nTrailing whitespace:" if $. == 1'
grep $'\t' "$1" | perl -lpe 'print "\nTab:" if $. == 1'
grep $'\xc2\xa0' "$1" | perl -lpe 'print "\nUnescaped nonbreaking space:" if $. == 1'
grep $'[\u226a\u226b]' "$1" | perl -lpe 'print "\nWrong list literals, use \uAB\uBB instead:" if $. == 1'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grep $'[\u226a\u226b]' "$1" | perl -CSDA -lpe 'print "\nWrong list literals, use \xAB\xBB instead:" if $. == 1'
should work in the context of html-build

Not sure why it shows an error though, the grep on source doesn't find anything

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference? Why would perl require different arguments for this particular line and not the others?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Perl doesn't handle UTF8 by default. The other commands don't have unicode characters

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so should we change the other script as well then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when I run grep $'[\u226a\u226b]' source locally on macOS it seems to return every line in the file, essentially. That's probably why this goes wrong.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange, doesn't on my Mac
Will try to reproduce

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK it was a bash vs zsh thing. The next line works and seems portable:

perl -CSDA -ne '$/ = "\n\n"; print "$_" if (/[\x{226A}\x{226B}]/si)' $1 | perl -CSDA -lpe 'print "\nWrong list literals, use \xAB\xBB instead:" if $. == 1'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants