replace string by pattern ? #14
Comments
This would allow a nice iterative workflow to create the patterns from a list of messy names, using count, or wrapping our own function around it.
It won't be for this version, but I think what I want is the perfect function to help one build patterns when they don't know the data. It could be named [...]. I imagine a table that would show how many messages were matched by each of several patterns. In the example, pattern B has been matched 6 times, but in one of these instances the message was matched by A too, which has priority, so the total matched for pattern B is 5.
Then we'd display the number of unmatched messages and the first 10 of them, making it easy for the user to iterate on their pattern vector.
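To make the idea concrete, here is a minimal sketch of such a summary. It is written in Python purely as a neutral illustration and is not unglue's API. The helper names (`pattern_to_regex`, `match_summary`), the file names, and the choice to treat every `{placeholder}` as a lazy wildcard are all assumptions.

```python
import re

def pattern_to_regex(pattern):
    """Turn a glue-like pattern into a regex (assumption: {x} means 'anything')."""
    parts = re.split(r"\{(\w+)\}", pattern)  # odd indices hold placeholder names
    pieces = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pieces.append(re.escape(part))       # literal text
        else:
            pieces.append(f"(?P<{part}>.*?)")    # placeholder
    return re.compile("".join(pieces))

def match_summary(strings, patterns, n_unmatched=10):
    """Count matches per pattern, with and without priority, and list unmatched strings."""
    regexes = [pattern_to_regex(p) for p in patterns]
    matched = [0] * len(patterns)        # every match, regardless of priority
    matched_first = [0] * len(patterns)  # matches not already claimed by an earlier pattern
    unmatched = []
    for s in strings:
        hits = [i for i, rx in enumerate(regexes) if rx.fullmatch(s)]
        for i in hits:
            matched[i] += 1
        if hits:
            matched_first[hits[0]] += 1  # earlier patterns have priority
        else:
            unmatched.append(s)
    return {
        "per_pattern": [
            {"pattern": p, "matched": m, "matched_first": f}
            for p, m, f in zip(patterns, matched, matched_first)
        ],
        "n_unmatched": len(unmatched),
        "unmatched_head": unmatched[:n_unmatched],
    }

# Hypothetical data: the second pattern also matches the "Summary" file.
files = ["Anna doc January.doc", "Bob doc March.doc", "Summary 2019.doc", "notes.txt"]
patterns = ["Summary {year}.doc", "{title}.doc"]
summary = match_summary(files, patterns)
```

With this data, "{title}.doc" is matched 3 times but only 2 times once the priority of "Summary {year}.doc" is taken into account, and "notes.txt" is reported as the single unmatched string.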
This could actually be a parameter, so we can use it in unglue_unnest, unglue_vec... We could have a default [...]. Better maybe, this could be a parameter of [...]. This wouldn't be type stable though. Another parameter of unglue_detect could have us get a logical column per pattern, making the type of table above easier to get.
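The "logical column per pattern" idea could look roughly like the sketch below. Again this is Python used for illustration only, not an existing unglue parameter. `detect_by_pattern`, `pattern_to_regex`, and the file names are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Turn a glue-like pattern into a regex (assumption: {x} means 'anything')."""
    parts = re.split(r"\{(\w+)\}", pattern)  # odd indices hold placeholder names
    return re.compile("".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.*?)"
        for i, part in enumerate(parts)
    ))

def detect_by_pattern(strings, patterns):
    """Return one list of booleans per pattern, aligned with the input strings."""
    return {
        p: [pattern_to_regex(p).fullmatch(s) is not None for s in strings]
        for p in patterns
    }

# Hypothetical data for illustration.
files = ["Summary 2019.doc", "Anna doc January.doc", "notes.txt"]
columns = detect_by_pattern(files, ["Summary {year}.doc", "{title}.doc"])
# columns["Summary {year}.doc"] is [True, False, False]
# columns["{title}.doc"] is [True, True, False]
```

Summing each column gives the per-pattern totals, and a row with no True value is an unmatched string.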
These are cool features, but the names are not good. Since there's no demand, we'll leave it as "nice to have".
It's not an unglueing feature but more about aggregating by pattern.
Say I have some file names, like these but in big numbers and with more patterns:
In order to count or to aggregate, it would be nice to be able to give as input the patterns "{name} doc {month}.doc" and "Summary {year}.doc", and get as an output:
Maybe the default should be to output:
And then it's an option to keep the original string if unmatched?
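A rough sketch of the requested behaviour, in Python for illustration only. The function name `to_pattern`, the `keep_unmatched` argument, the file names, and the assumption that unmatched strings become missing values by default are all hypothetical, not part of unglue.

```python
import re
from collections import Counter

def pattern_to_regex(pattern):
    """Turn a glue-like pattern into a regex (assumption: {x} means 'anything')."""
    parts = re.split(r"\{(\w+)\}", pattern)  # odd indices hold placeholder names
    return re.compile("".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.*?)"
        for i, part in enumerate(parts)
    ))

def to_pattern(strings, patterns, keep_unmatched=False):
    """Replace each string by the first pattern it matches.

    Unmatched strings become None, or are kept as-is when keep_unmatched is True.
    """
    regexes = [(p, pattern_to_regex(p)) for p in patterns]
    out = []
    for s in strings:
        found = next((p for p, rx in regexes if rx.fullmatch(s)), None)
        if found is None and keep_unmatched:
            found = s
        out.append(found)
    return out

# Hypothetical file names matching the two patterns from the issue.
files = ["Anna doc January.doc", "Bob doc March.doc", "Summary 2019.doc", "notes.txt"]
patterns = ["{name} doc {month}.doc", "Summary {year}.doc"]

replaced = to_pattern(files, patterns)
kept = to_pattern(files, patterns, keep_unmatched=True)
counts = Counter(replaced)  # aggregate by pattern
```

Here `replaced` holds the first pattern twice, the second pattern once, and None for "notes.txt", while `kept` ends with "notes.txt" instead of None.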
No real good name idea...
Maybe something like unglue_simplify(), unglue_generalize(), unglue_to_pattern()?