[BUG] Dynamic template for vectorized output fields #2247

Open
juntezhang opened this issue Jul 13, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@juntezhang

What is the bug?

I want to use dynamic templates to create vectorized output fields. I want to configure each output field only once, in the ingest processor, without having to configure it again in the index mapping.

This is not working, because I am getting the following error:

"error": {
    "type": "mapper_parsing_exception",
    "reason": "failed to parse field [Body_vector] of type [knn_vector] in document with id '9'. Preview of field's value: '-0.083671086'",
    "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Vector dimension mismatch. Expected: 384, Given: 1"
    }
}

I have a field called Body that contains text, and the neural ingest pipeline creates an output field called Body_vector. The dimension is already set to 384 in the template, but the field is either created with a dimension of 1 or its value is parsed as a single-element vector.
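
One possible reading of the error message (an assumption, not something confirmed in this thread) is that the mapper is handed a single array element instead of the whole 384-element array. The sketch below only simulates that situation in plain Python. `check_vector` is a hypothetical stand-in and is not the k-NN plugin's code; the message text and the preview value are copied from the error above.

```python
EXPECTED_DIMENSION = 384  # dimension configured in the dynamic template


def check_vector(value, expected=EXPECTED_DIMENSION):
    """Hypothetical dimension check; a scalar is treated as a 1-element vector."""
    values = value if isinstance(value, list) else [value]
    if len(values) != expected:
        raise ValueError(
            f"Vector dimension mismatch. Expected: {expected}, Given: {len(values)}"
        )
    return values


# A made-up 384-element vector whose first value matches the error preview.
full_vector = [-0.083671086] + [0.0] * 383

# Passing the whole array satisfies the check.
check_vector(full_vector)

# Passing only the first element reproduces the wording of the reported error.
try:
    check_vector(full_vector[0])
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

If this reading is right, the preview value '-0.083671086' would be the first element of the embedding and not the entire field value.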

How can one reproduce the bug?

Steps to reproduce the behavior.

Follow the Neural Search plugin tutorial created by Sease, but create an index with a dynamic template like this:

"dynamic_templates": [
      {
        "vectorized": {
          "match_mapping_type": "double",
          "match_pattern": "regex",
          "path_match": ".*_vector.*",
          "mapping": {
            "type": "knn_vector",
            "dimension": 384,
            "method": {
              "name": "hnsw",
              "engine": "lucene"
            }
          }
        }
      }
]

Index documents and observe that the exception above is thrown.
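
For anyone trying to reproduce this, here is a minimal sketch of how the dynamic template could be embedded in a full index-creation body. The index name `my_index` and the pipeline name `neural_pipeline` are hypothetical placeholders, and the `settings` block is my assumption about what the tutorial setup looks like; only the template itself is taken from this issue.

```python
import json

# Dynamic template exactly as given in this issue.
vectorized_template = {
    "vectorized": {
        "match_mapping_type": "double",
        "match_pattern": "regex",
        "path_match": ".*_vector.*",
        "mapping": {
            "type": "knn_vector",
            "dimension": 384,
            "method": {"name": "hnsw", "engine": "lucene"},
        },
    }
}

# Assumed settings: k-NN enabled and a default ingest pipeline attached.
# "neural_pipeline" is a hypothetical name, not taken from the issue.
index_body = {
    "settings": {
        "index.knn": True,
        "default_pipeline": "neural_pipeline",
    },
    "mappings": {"dynamic_templates": [vectorized_template]},
}

# Body for: PUT /my_index  (hypothetical index name)
print(json.dumps(index_body, indent=2))
```

Note that the body deliberately contains no explicit `properties` entry for Body_vector, since the point of the report is to rely on the dynamic template alone.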

What is the expected behavior?

A clear and concise description of what you expected to happen.

The expected behavior is that the dynamic template should create the vectorized output fields as configured in the mapping and index without errors.

What is your host/environment?

Operating system, version.

Mac 13.3, but running OpenSearch in Docker with Ubuntu.

Do you have any screenshots?

If applicable, add screenshots to help explain your problem.

N/A

Do you have any additional context?

Add any other context about the problem.

I am happy to contribute to a solution.

@juntezhang juntezhang added the bug and untriaged labels Jul 13, 2023
@navneet1v
Collaborator

@juntezhang Thanks for reporting the issue. I will try to reproduce it and see what is happening.

@navneet1v navneet1v self-assigned this Jul 15, 2023
@martin-gaievski
Member

Looks like the issue is related to the knn_vector field type. @navneet1v @jmazanec15 any objections if we move it to the k-NN repo?

@navneet1v navneet1v transferred this issue from opensearch-project/neural-search Nov 5, 2024
@navneet1v navneet1v removed their assignment Nov 5, 2024
@heemin32
Collaborator

heemin32 commented Nov 5, 2024

Preview of field's value: '-0.083671086'

What is the actual field's value generated by ingest processor?
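
One way to check this might be the ingest simulate API, which returns the document as it looks after the pipeline has run. The sketch below only builds the request and does not send it. The pipeline id `neural_pipeline` and the sample text are hypothetical, and I am assuming the pipeline reads from the Body field as described in the report.

```python
import json


def build_simulate_request(pipeline_id, body_text):
    """Build the path and payload for the ingest pipeline simulate API."""
    path = f"/_ingest/pipeline/{pipeline_id}/_simulate"
    payload = {"docs": [{"_source": {"Body": body_text}}]}
    return path, payload


# "neural_pipeline" and the sample text are placeholders for illustration.
path, payload = build_simulate_request("neural_pipeline", "example text")

print("POST", path)
print(json.dumps(payload, indent=2))
```

The response should show whether Body_vector comes back as a single array of 384 numbers or in some other shape.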

@dblock
Member

dblock commented Nov 11, 2024

[Catch All Triage - 1, 2, 3]

@dblock dblock removed the untriaged label Nov 11, 2024
Projects
Status: Backlog (Hot)
Development

No branches or pull requests

5 participants