Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Really odd error report from an innocuous search string #222

Open
walterdavis opened this issue Jul 5, 2024 · 10 comments
Open

Really odd error report from an innocuous search string #222

walterdavis opened this issue Jul 5, 2024 · 10 comments

Comments

@walterdavis
Copy link

walterdavis commented Jul 5, 2024

Environment:

  • Rails 7.0.8.3
  • scoped_search (4.1.11)
  • MariaDB 15.1
  • mysql2 0.5.4

I have a scoped_search set on a single column, and the user entered a long-ish query string:

action_dispatch.request.parameters: {"p"=>"every fact which has a beginning has a cause", "controller"=>"people", "action"=>"quoted_search"}

Which resulted in this error report:

A ScopedSearch::QueryNotSupported occurred in people#quoted_search:

 Field 'a' not recognized for searching!
 app/controllers/people_controller.rb:158:in `quoted_search'

That might be coming from https://github.com/wvanbergen/scoped_search/blob/master/lib/scoped_search/query_builder.rb#L476 or https://github.com/wvanbergen/scoped_search/blob/master/lib/scoped_search/query_builder.rb#L509, but I can't see why either would be relevant. Is there anything about the words in the request such that a might be interpreted as being a column name?

The model has this definition:

  scoped_search on: %i[search_name]

(search_name is just a downcased and simplified version of the name, created with parameterize.gsub('-', ' '))

The controller uses it here:

  def quoted_search
    @term = params.fetch(:p, '')
    @people = []
    @people = Person.quote_authors
                    .distinct
                    .search_for(@term).limit(40) if @term.length > 2
    render :quoted, formats: [ :turbo_stream ]
  end

I just can't see why Scoped Search was considering part of the query string as a mention of a column. There were no control characters or quotes in the input. Can anyone see something I have missed here?

@walterdavis
Copy link
Author

Follow-up: running this in development, I can see that the error is coming from https://github.com/wvanbergen/scoped_search/blob/master/lib/scoped_search/query_builder.rb#L476 definitively. Screwing around with the input, I can determine that the breaking point is when the word has is followed by the word a. I wonder why that is being interpreted as a query statement, when it is coming in from the user.

@walterdavis
Copy link
Author

walterdavis commented Jul 5, 2024

I was able to work around this by adding a before_action to load and process the query term:

  def search_term
    @term = params.fetch(:p, '')
    @term = "\"#{@term}\"" if @term.match? /\s/
  end

I searched through the Wiki and read through a lot of the code, and while I was able to find the keywords in the Tokenizer module, I was not able to find any way to configure Scoped Search to not act on those. It seems determined to interpret has a as a literal command meaning table_name.a IS NOT NULL, when the user just meant the literal words.

Wrapping the user input as a phrase works around this, but it feels more as though the casual use of this gem would not be to start slinging SQL operations around in a text search field. Let that be an expert option that you enable by choice, so you know to explain to your users that they must know what all of the fieldnames are before they start using the common English word 'has' followed by literally any other word as their search term.

@adamruzicka
Copy link
Collaborator

You're mostly right about where this comes from. This gem provides a query language (and enables you to search using this language), has as a keyword is an unary operator in the language.

and while I was able to find the keywords in the Tokenizer module, I was not able to find any way to configure Scoped Search to not act on those

Correct, there's no way to disable it as that would mean disabling a part of the query language.

It seems determined to interpret has a as a literal command meaning table_name.a IS NOT NULL, when the user just meant the literal words.

Correct, has is an alias for set?. The user should be aware of the language, not just type in anything and hope that it will do what they mean.

Wrapping the user input as a phrase works around this

Just bear in mind that the semantics of searching for a b c and "a b c" are different. The first searches for anything containing ain any searchable field and containing b in any searchable field and containing c in any searchable field, while the latter searches for things containing a b c literally.

Let that be an expert option that you enable by choice, so you know to explain to your users that they must know what all of the fieldnames are before they start using the common English word 'has' followed by literally any other word as their search term.

If you really only want to search for exactly what the user passed in, then scoped_search might be too big of a hammer.

@walterdavis
Copy link
Author

Thanks very much. I totally agree about the hammer bit. I reached for it because I was in a hurry, and I didn't want to be completely irresponsible by just allowing a SQL injection. I will do some more searching for a simpler solution. I really do like what you've built here, and can see it has great value where all of its features are needed.

I do realize that the phrase search is limiting the possible matches. Since the implementation here actually didn't anticipate someone entering an entire phrase (it was used for a search by name feature, triggered on keyup after the third character) the resulting error came out of left field from my perspective. But given an infinite universe of users, after four or five months someone decided to paste a phrase into the field...

Thanks very much for your feedback and for making this gem.

@walterdavis
Copy link
Author

walterdavis commented Jul 8, 2024

One more suggestion here: would it be possible to use ActiveRecord's column_names list to flavor which tokens are considered to mean that you are in the presence of an expert user, rather than an unknowing use of the keywords? That way has a would be skipped for evaluation as a to_null_sql target, rather than going down the error path (unless there actually was an a column available for evaluation), while has name would be interpreted as intending the logical operation. This could mean that instead of an error, you would get a working query. It might not give you the answer you sought, but it wouldn't just throw a 500 error.

One last thought: could there be a specific error that means "tried to search on a column, but there's no matching column"? That way an end-user could trap that error and decide what to do. The current QueryNotSupported is not unique to this case, and the only way to discern the source of the error state would be by filtering on the message text.

@adamruzicka
Copy link
Collaborator

One more suggestion here: would it be possible to use ActiveRecord's column_names list to flavor which tokens are considered to mean that you are in the presence of an expert user, rather than an unknowing use of the keywords?

In the situation where foo is a searchable column and bar is not, I'm a little uneasy about the idea of searching for has foo and has bar doing different things. How about instead of doing this, we'd rename the has operator to has? to signify it is something special? It would be more in line with other prefix unary operators (set? and null?) and hopefully it would be harder to trigger by accident.

One last thought: could there be a specific error that means "tried to search on a column, but there's no matching column"?

Sure, that's a good idea. In case you do stick with scoped_search in the end, would you be willing to take a stab at it?

@walterdavis
Copy link
Author

walterdavis commented Jul 10, 2024 via email

@walterdavis
Copy link
Author

walterdavis commented Jul 11, 2024

@adamruzicka could I get a little hint on the test setup? I'm trying to tack on to the existing query builder spec like so:

  context 'with interpreted null sql' do
    before do
      field = double('field')
      field.stub(:field).and_return(:test_field)
      field.stub(:key_field).and_return(nil)
      field.stub(:to_sql).and_return('')
      @definition.stub(:field_by_name).and_return(field)
    end
    it 'should be happy if the target exists' do
      ScopedSearch::QueryBuilder.build_query(@definition, 'has test_field').should eq(conditions: ['? IS NOT NULL', 'test_field'])
    end
  end

but I am getting

  1) ScopedSearch::QueryBuilder with interpreted null sql should be happy if the target exists
     Failure/Error: ScopedSearch::QueryBuilder.build_query(@definition, 'has test_field').should eq(conditions: ['? IS NOT NULL', 'test_field'])
     
       expected: {:conditions=>["? IS NOT NULL", "test_field"]}
            got: {:conditions=>["( IS NOT NULL)"]}
etc.

I've tried a number of different variations on this setup, with and without the ? placeholder, and while I can see that I am getting the correct AST type from the definition, nothing I do seems to give it the rhs value in the right place for it to be useful. (I always see a blank space in place of that value in the test output.) Do I need to mock that explicitly? Are there other tests that do? (Couldn't find any.)

I am certain this is down to the way I am setting up the stubs here, as I am just cribbing from the other tests in this file as best I can understand them. Any suggestions would be appreciated.

Walter

@adamruzicka
Copy link
Collaborator

Oof, this is tricky. I tried explaining that some time ago in here #217 (comment) , maybe it will give you a bit of an insight about what's going on.

I am certain this is down to the way I am setting up the stubs here

Spot on, by stubbing to_sql on the field, we bypass part of the behaviour that sets up the placeholders and so on. We can lean into the dynamic nature of ruby and define the missing method ourselves, but i'll leave it up to you decide if such a test actually tests anything apart from itself.

  context 'with interpreted null sql' do
    before do
      field = double('field')
      field.stub(:field).and_return(:test_field)
      field.stub(:key_field).and_return(nil)
      def field.to_sql(builder)
        yield :parameter, builder.ast.rhs.value
        '?'
      end
      @definition.stub(:field_by_name).and_return(field)
    end
    it 'should be happy if the target exists' do
      ScopedSearch::QueryBuilder.build_query(@definition, 'has test_field').should eq(conditions: ['(? IS NOT NULL)', 'test_field'])
    end
  end

Another option would be having this with less stubbing as an integration test and then making assertions about the sql query that gets generated by calling klass.search_for("has something").to_sql

@walterdavis
Copy link
Author

@adamruzicka when you have a chance, #223 needs a button pressed from you. My first run through exposed a cross-db issue in my test, which I believe the latest push has resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants