-
Notifications
You must be signed in to change notification settings - Fork 4.6k
Fix trailing )
from interfering with extraction in Clojure keywords
#18345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@eneroth Can you see if the checkbox to allow maintainers to edit the PR is checked? I've got a few tweaks I want to make. |
fn is_keyword_terminator(byte: u8) -> bool { | ||
matches!( | ||
byte, | ||
b'"' | b';' | b'@' | b'^' | b'`' | b'~' | b'(' | b')' | b'[' | b']' | b'{' | b'}' | b'\\' | ||
) || byte.is_ascii_whitespace() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn is_keyword_terminator(byte: u8) -> bool { | |
matches!( | |
byte, | |
b'"' | b';' | b'@' | b'^' | b'`' | b'~' | b'(' | b')' | b'[' | b']' | b'{' | b'}' | b'\\' | |
) || byte.is_ascii_whitespace() | |
} | |
fn is_keyword_character(byte: u8) -> bool { | |
matches!( | |
byte, | |
b'+' | b'-' | b'/' | b'*' | b'_' | b'#' | b'.' | b':' | b'?' | |
) | byte.is_ascii_alphanumeric() | |
} |
I'd go for a positive check versus a negative one. Using |
instead of ||
produces better, branchless assembly too: https://godbolt.org/z/94WqeEz1n
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, but this excludes a lot of characters that are valid in keywords. Are you explicitly selecting the ones valid in Tailwind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, the assembly inspector is cool!
Also could use a changelog entry |
In order to pre-empt any further problems from Clojure keywords, this commit inverts the previous logic: Instead of consuming everything and handling special cases, consume characters as defined by the Clojure keyword specification, and nothing else. In addition, leave string consumption as is, but—again—drop the `"`s, in order to align with the strategy of throwing away everything that is not the content of a string or a keyword, including delimiters (`"`, `:` respectively).
Sorry @thecrypticace, from what I can see on Github docs, there's supposed to be a checkbox for me to check here: But I can't see any. In lieu of that, rebased on main and amended with your changes, and added a change log entry. |
Summary
In a form like,
:bg-black
will fail to extract, while:bg-white
is extracted as expected. This PR fixes this case, implements more comprehensive candidate filtering, and supersedes a previous PR.Having recently submitted a PR for handling another special case with Clojure keywords (the presence of
:
inside of keywords), I thought it best to invert the previous strategy: Instead of handling special cases one by one, consume keywords according to the Clojure reader spec. Consume nothing else, other than strings.Because of this, this PR is a tad more invasive rather than additive, for which I apologize. The strategy is this:
"
and ends with an unescaped"
. Consume everything between these delimiters (existing case).:
, and end with whitespace, or one out of a small set of specific reserved characters. Everything else is a valid character in a keyword. Consume everything between these delimiters, and apply the class splitting previously contained in the outer loop. My previous special case handling of:
inside of keywords in Extract candidates with variants in Clojure/ClojureScript keywords #18338 is now redundant (and is removed), as this is a more general solution.I'm hoping that a strategy that is based on Clojure's definition of strings and keywords will pre-empt any further issues with edge cases.
Closes #18344.
Test plan
cargo test
-> failurecargo test
-> success