added tokenization and transformation #4
Maybe you can talk me through this
Can't really judge the low-level correctness of some stuff (especially findR), because the control flow is quite deeply nested and I find it hard to understand what some of the variables mean
} else {
err = fn(cKey, entry)
// value/search terms transformation (eg lowercase)
tokens := indexParts[0].Tokenize(seek)
quite complex, could use some refactoring maybe?
It does require a refactor, but this PR is not the time to do it. See #5
if a, ok := val.([]interface{}); ok {
var ra []interface{}
for _, ai := range a {
interm := j.matchRecursive(parts, ai)
interm, err := j.matchRecursive(parts, ai)
All these 1- or 2-character variable names make it hard to understand the code (because I don't work on it), since I've got to really analyze it to guess/derive what it actually means
part of the refactor.
Tokenization happens first.
A whitespace tokenizer has been provided.
This allows for "care organization A" to be indexed with terms: care, organization and a.
It's pluggable, so other transformers (like pronunciation) can be used.
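To make the description above concrete, here is a minimal sketch of how a whitespace tokenizer and a pluggable transformer could fit together. It is not the code from this PR: the `Tokenizer` and `Transformer` interfaces, the `WhitespaceTokenizer` and `Lowercase` types, and the `IndexTerms` helper are all hypothetical names chosen for illustration. The only assumption taken from the diff is that tokenizing is exposed through a `Tokenize` method.

```go
package main

import (
	"fmt"
	"strings"
)

// Tokenizer splits a raw value into candidate index terms.
// Hypothetical interface; the PR's actual types may differ.
type Tokenizer interface {
	Tokenize(value string) []string
}

// Transformer normalizes a single term (e.g. lowercasing).
// Hypothetical interface, assumed here to operate per term.
type Transformer interface {
	Transform(term string) string
}

// WhitespaceTokenizer splits on runs of whitespace.
type WhitespaceTokenizer struct{}

func (WhitespaceTokenizer) Tokenize(value string) []string {
	return strings.Fields(value)
}

// Lowercase is one possible transformer; a pronunciation-based
// transformer could be plugged in the same way.
type Lowercase struct{}

func (Lowercase) Transform(term string) string {
	return strings.ToLower(term)
}

// IndexTerms tokenizes first, then applies each transformer in order
// to every token.
func IndexTerms(value string, tok Tokenizer, transformers ...Transformer) []string {
	tokens := tok.Tokenize(value)
	terms := make([]string, 0, len(tokens))
	for _, token := range tokens {
		for _, t := range transformers {
			token = t.Transform(token)
		}
		terms = append(terms, token)
	}
	return terms
}

func main() {
	terms := IndexTerms("care organization A", WhitespaceTokenizer{}, Lowercase{})
	fmt.Println(terms) // prints "[care organization a]"
}
```

Under these assumptions, the same tokenize-then-transform chain would be applied to search terms as well, so that a query for "Organization" matches the indexed term organization.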