Open
Labels
enhancement (New feature or request)
Description
Problem:
Our database initialization script assumes that input data comes with precomputed vectors for each item. However, some users may have data where vectors are not available, and we need to provide a solution to generate embeddings on the fly for such data.
Proposed Solution:
I propose adding a new command-line option to our database initialization script, allowing users to specify data items without predefined vectors. When this option is used, the script should generate embeddings for these items during initialization.
Detailed Requirements:
- Option Name: --generate-embeddings or -ge
- Usage Example:
  $ ./initialize-database.py --no-vector
- Embedding Generation Method: The script should use a predefined embedding generation method (e.g., word embeddings, image feature extraction) for items without vectors.
- Logging: The script should log the embedding generation process to provide transparency and debugging capabilities.
- Performance Considerations: Ensure that the embedding generation process is efficient to avoid excessive computation time during initialization.
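The requirements above could be sketched roughly as follows. This is only an illustration, not the script's actual code: the item layout (dicts with "text" and "vector" keys), the function names, EMBEDDING_DIM, and the hash-based embed_batch placeholder are all assumptions made for the example. A real implementation would swap embed_batch for whichever embedding method the team agrees on. The flag uses the option name listed above (--generate-embeddings / -ge).

```python
import argparse
import hashlib
import logging
import time

logger = logging.getLogger("initialize-database")

EMBEDDING_DIM = 8  # hypothetical size; a real model defines its own dimension


def embed_batch(texts):
    """Placeholder embedder (assumption): deterministic vectors derived from a hash.

    Stands in for a real model so the sketch runs without extra dependencies.
    """
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([byte / 255.0 for byte in digest[:EMBEDDING_DIM]])
    return vectors


def fill_missing_vectors(items, batch_size=64):
    """Generate vectors in batches for items that have none; return how many were filled."""
    missing = [item for item in items if item.get("vector") is None]
    logger.info("%d of %d items have no precomputed vector", len(missing), len(items))
    start = time.perf_counter()
    for offset in range(0, len(missing), batch_size):
        batch = missing[offset:offset + batch_size]
        vectors = embed_batch([item["text"] for item in batch])
        for item, vector in zip(batch, vectors):
            item["vector"] = vector
        logger.debug("Embedded items %d-%d", offset, offset + len(batch) - 1)
    elapsed = time.perf_counter() - start
    logger.info("Generated %d embeddings in %.2fs", len(missing), elapsed)
    return len(missing)


def build_parser():
    """Command-line parser exposing the proposed option."""
    parser = argparse.ArgumentParser(prog="initialize-database.py")
    parser.add_argument(
        "-ge", "--generate-embeddings",
        action="store_true",
        help="generate embeddings for items that have no precomputed vector",
    )
    return parser


def run(argv, items):
    """Apply the option to already-loaded items (loading and storage are out of scope)."""
    args = build_parser().parse_args(argv)
    if args.generate_embeddings:
        fill_missing_vectors(items)
    return items
```

Calling run(["--generate-embeddings"], items) fills in only the items whose vector is missing and leaves precomputed vectors untouched. Batching keeps the number of calls to the embedder low, which matters once the placeholder is replaced by a real model. The two info-level log lines give a before-and-after summary for debugging.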
Benefits:
- Enhanced usability: Users can now initialize the database with data lacking precomputed vectors, expanding the data types our system can handle.
- Improved flexibility: Our system becomes more versatile, accommodating users with data sources that don't provide vector representations.
- Reduced manual work: Users won't need to manually precompute vectors for data items, simplifying the data integration process.
Related Issues:
- None
Assignees:
- Unassigned
Labels:
- Enhancement
Milestone:
- To be determined
Additional Information:
- The database initialization script is located in the scripts directory.
- It's important to document this new feature in the project's README or documentation to ensure users are aware of how to use it.
- Consider discussing this feature with the team to gather input and reach a consensus on its implementation.
- This enhancement will make our database initialization script more versatile and accommodating to a broader range of user data, aligning with our goal of providing a seamless experience for our users.