[feature request] Expose means of configuring/providing a custom Object Store extension #194

Open
alexkohler opened this issue Jan 25, 2024 · 9 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@alexkohler

alexkohler commented Jan 25, 2024

Hi there, I'm a user of lancedb, which leverages tantivy-py for full text search (FTS) indices (see https://lancedb.github.io/lancedb/fts). A current shortcoming of lancedb's FTS support is that it only works with local file paths (object stores such as S3 are not supported).

The lance maintainers have authored an Object Store extension, but as I understand it, there's no means of specifying/providing this extension within tantivy-py. Would love it if this could be supported!

@changhiskhan

LanceDB maintainer here. We'd be happy to contribute if you can point us to the right place to configure the underlying tantivy Rust code to integrate this extension. Thanks!

@cjrh cjrh added the enhancement New feature or request label Jan 27, 2024
@dgarnitz

dgarnitz commented Feb 4, 2024

I'm also in need of this feature. Any idea when it will be scheduled?

@dtiarks

dtiarks commented Mar 1, 2024

Also very much interested!

@cjrh cjrh added the help wanted Extra attention is needed label Mar 3, 2024
@erixison

erixison commented Mar 8, 2024

Also interested

cjrh pushed a commit to cjrh/tantivy-py that referenced this issue Mar 16, 2024
@zmtbnv

zmtbnv commented Mar 17, 2024

+1

@cjrh
Collaborator

cjrh commented Mar 19, 2024

I'm looking at the example here: https://github.com/lancedb/tantivy-object-store/blob/main/examples/index_wiki_local.rs

    <snip>
    let dir = new_object_store_directory(
        Arc::new(LocalFileSystem::new()),
        dir.path().to_str().unwrap(),
        None,
        0,
        None,
        None,
    )
    .unwrap();

    let index_using_object_store =
        tantivy::Index::create(dir, schema.clone(), IndexSettings::default()).unwrap();
        
    let mut writer = index_using_object_store.writer(1024 * 1024 * 1024).unwrap();
    <snip>

So the only difference from a regular local index is the dir object that is passed to Index::create(). Is that correct?
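
To make the question concrete, here is a minimal, dependency-free sketch of the pattern the snippet seems to rely on: the index is written against a storage abstraction, so swapping the backend only changes the value handed to the constructor. Every name below (`Directory`, `LocalDirectory`, `ObjectStoreDirectory`, `Index`, `round_trip`, and the `s3://bucket/index` prefix) is a hypothetical stand-in for illustration, not tantivy's or tantivy-object-store's actual API, and both backends are kept in memory so the sketch has no side effects.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a storage abstraction (NOT tantivy's real trait).
trait Directory {
    fn write(&mut self, path: &str, data: &[u8]);
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// Stand-in for a local-disk directory (in memory here, for illustration only).
#[derive(Default)]
struct LocalDirectory {
    files: HashMap<String, Vec<u8>>,
}

impl Directory for LocalDirectory {
    fn write(&mut self, path: &str, data: &[u8]) {
        self.files.insert(path.to_string(), data.to_vec());
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }
}

/// Stand-in for an object-store-backed directory: every path is stored
/// under a base prefix, the way keys live under a bucket/prefix.
struct ObjectStoreDirectory {
    prefix: String,
    objects: HashMap<String, Vec<u8>>,
}

impl ObjectStoreDirectory {
    fn new(prefix: &str) -> Self {
        ObjectStoreDirectory { prefix: prefix.to_string(), objects: HashMap::new() }
    }
    fn key(&self, path: &str) -> String {
        format!("{}/{}", self.prefix, path)
    }
}

impl Directory for ObjectStoreDirectory {
    fn write(&mut self, path: &str, data: &[u8]) {
        let key = self.key(path);
        self.objects.insert(key, data.to_vec());
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(&self.key(path)).cloned()
    }
}

/// Toy index that only knows about the `Directory` abstraction.
struct Index {
    dir: Box<dyn Directory>,
}

impl Index {
    /// The storage backend is the only thing the caller chooses here.
    fn create(dir: Box<dyn Directory>) -> Self {
        Index { dir }
    }
    fn add_document(&mut self, id: u32, text: &str) {
        self.dir.write(&format!("doc-{}", id), text.as_bytes());
    }
    fn get_document(&self, id: u32) -> Option<String> {
        self.dir
            .read(&format!("doc-{}", id))
            .map(|bytes| String::from_utf8_lossy(&bytes).into_owned())
    }
}

/// Same indexing code path for any backend; only `dir` differs.
fn round_trip(dir: Box<dyn Directory>) -> Option<String> {
    let mut index = Index::create(dir);
    index.add_document(1, "hello object store");
    index.get_document(1)
}
```

Calling `round_trip(Box::new(LocalDirectory::default()))` and `round_trip(Box::new(ObjectStoreDirectory::new("s3://bucket/index")))` exercises identical indexing code. If tantivy-py's index constructor could accept a directory object in a similar way, the binding change might be limited to how that object is built, though that is an assumption rather than something confirmed in this thread.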

@alex-au-922
Contributor

Hi LanceDB developers, would you be happy to create PyO3 bindings for your tantivy-object-store crate? I'm very willing to help.

@cjrh
Collaborator

cjrh commented Oct 16, 2024

@alex-au-922 are you currently working on this somewhere else? Alternatively, @changhiskhan, what is the current status of this from the lancedb point of view?

@alex-au-922
Contributor

Nope. I'm afraid there might be some code plagiarism issues, so I haven't started. I'd also like to hear LanceDB's view on this, as I think LanceDB is going to build their own retriever?

8 participants