The dictionary-based cleaning could use something like:
from      to        variable
hopsital  hospital  location|structure_type
hopital   hospital  location|structure_type
hopsital  hospital  location|structure_type
feild     field     location
homw      home      location
maison    home      location
household home      location
<NA>      unknown   .all
.default  unknown   location|structure_type|sex|exposure
Here, the variable field illustrates the following new features:
| to list several variables
.all as a wildcard meaning "all variables"
A way to implement the above is to treat entries in variable as regular expressions to be matched against column names, with an exception rule for .all.
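The matching rule described above could be sketched roughly as follows. This is only an illustration, not the package's implementation: `match_variables()` is a hypothetical helper, and wrapping each entry in `^(...)$` is an assumption made here so that an entry such as `sex` does not also pick up longer column names.

```r
# Hypothetical helper, not part of linelist: resolve one entry of the
# dictionary's `variable` field to the column names it applies to.
match_variables <- function(entry, column_names) {
  # Exception rule: ".all" is a keyword, not a regular expression
  if (identical(entry, ".all")) {
    return(column_names)
  }
  # Assumption: anchor the pattern so only whole column names match
  pattern <- paste0("^(", entry, ")$")
  grep(pattern, column_names, value = TRUE)
}

# Example column names; "age" is added only to show a non-matching column
columns <- c("location", "structure_type", "sex", "exposure", "age")

several <- match_variables("location|structure_type", columns)
everything <- match_variables(".all", columns)
single <- match_variables("location", columns)
```

With this approach, `|` needs no special handling because it is already the regular-expression alternation operator; only `.all` requires a dedicated branch.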
I've implemented a .regex keyword for clean_variable_spelling() in my linelist branch, to allow matching multiple variables as Thibaut describes.
We initially went with a regex = TRUE argument, to treat all vars as regular expressions, but found it was cumbersome and inelegant to anchor all the variables for which we just wanted literal matches. So we switched to the .regex keyword approach, which has been working well in some of our linelist work at Epicentre.
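The anchoring problem mentioned here can be seen with base R alone. In this small sketch, `sex_partner` is a made-up column name used only for illustration; it does not come from the dictionary above.

```r
# "sex_partner" is a hypothetical column name, for illustration only
columns <- c("sex", "sex_partner", "location")

# Unanchored: the pattern matches anywhere in the column name
unanchored <- grep("sex", columns, value = TRUE)

# Anchored: only the exact column name matches
anchored <- grep("^sex$", columns, value = TRUE)
```

If every entry were treated as a regular expression, each literal variable name would need the `^...$` wrapping shown in the second call.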
Let me know if you're interested, and I can create a pull request.
Hi Patrick,
That sounds great! A PR is most welcome, ideally with some new unit tests and an example in the function's documentation. Please also add yourself as a contributor in the DESCRIPTION file. It's really cool to see contributions to this package, and to hear that Epicentre is using it :)