Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fuzzy search and general CLI improvements #94

Merged
merged 21 commits into from
Jan 15, 2025

Conversation

erik-brink
Copy link
Collaborator

Contents

The What

This pull request introduces several enhancements and new features to the NGPIris project, particularly focusing on the CLI and HCP handling functionalities. The most notable changes include adding new search capabilities, improving object listing options, and updating dependencies.

Enhancements to CLI functionalities:

  • NGPIris/cli/__init__.py: Enhanced the _list_objects_generator function to support filtering by path and file-only options. Added new options to the list_objects command for pagination and file-only output. Introduced a new fuzzy_search command utilizing the RapidFuzz library for improved search capabilities. [1] [2] [3] [4] [5] [6] [7]

Improvements to HCP handling:

  • NGPIris/hcp/hcp.py: Updated the list_objects method to include filtering by path and file-only options. Refined the delete_objects and delete_folder methods to handle non-existent objects more gracefully and ensure proper deletion of folders with subfolders. Added a new fuzzy_search_in_bucket method for fuzzy search functionality using RapidFuzz. [1] [2] [3] [4] [5] [6]

Dependency updates:

  • pyproject.toml: Updated the project version to 5.2.0.dev1 and added the RapidFuzz library as a new dependency for enhanced search capabilities.

The Why

Mostly quality of life changes

This update is:

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Tests

  • pytest
  • pytype

@erik-brink erik-brink requested a review from sylvinite December 18, 2024 09:52
@erik-brink erik-brink self-assigned this Dec 18, 2024
@erik-brink erik-brink marked this pull request as draft December 18, 2024 10:53
@erik-brink erik-brink marked this pull request as ready for review December 18, 2024 12:39
This was referenced Jan 9, 2025
Copy link
Member

@sylvinite sylvinite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand the need to use pagination just to check if items trail with /.
How slow is fuzzy? Compared to say, running a regex post query.
You absolutely want some self-tests for this type of function.

@erik-brink
Copy link
Collaborator Author

erik-brink commented Jan 14, 2025

@sylvinite

I don't quite understand the need to use pagination just to check if items trail with /.

I'm not entirely sure which part you are referring to, but I'm gonna guess that you mean in function like _list_objects_generator. The reason for checking if items end with the "/" character is to determine if it is a file or not. The idea for this was simply for the users that might want to just list files and not "folders".

How slow is fuzzy? Compared to say, running a regex post query.

Fuzzy search is on par with the old "simple search" in terms of speed, which is not super fast 😕 I have not tested with a regex post query specifically. The point of implementing fuzzy was just to be a slightly better simple search (which was just a substring search that runs in linear time (O(n)))

You absolutely want some self-tests for this type of function.

Sounds like a good idea. Should probably review and add more tests to the test suite soon 😅

Copy link
Member

@sylvinite sylvinite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Please add tests in regard to the features added.

@erik-brink erik-brink merged commit 680588a into dev Jan 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants