-
Notifications
You must be signed in to change notification settings - Fork 2.4k
fix: Add file name to prompt when processing images with doc #4114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
return f'' | ||
return None | ||
return None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provided code snippet is a function that takes an image
object, a Document
, a list of images_list
, and get_image_id
. However, there are minor improvements needed for clarity and robustness:
-
Function Parameter Naming: The parameters
doc
should bedocument
to follow Python naming conventions. -
Code Clarity: Using descriptive variable names like
file_object
instead ofimage
forFile
objects can make the code more readable. -
Return Value Consistency: Ensure consistency in returning values; it seems like you want to return either the image tag or
None
.
Here's the corrected version with these improvements:
def image_to_mode(file_object: File, document: Document, images_list: List[File], get_image_id: Callable[[str], str] = None) -> Optional[str]:
image_uuid = file_object.id
# Check if the file already exists in images_list
existing_files = [i for i in images_list if i.id == image_uuid]
if not existing_files:
# Create new File object if it doesn't exist
file_object = File(id=image_uuid, file_name=file_object.file_name, meta={'debug': False, 'content': file_object.blob})
images_list.append(file_object)
# Return image URL without filename
return f''
else:
# If file exists, use its metadata
url = existing_files[0].meta.get("url")
if url:
return f'![{existing_files[0].file_name}]({url})'
return None
Explanation of Changes:
- Changed parameter name from
image
tofile_object
to align with expected usage. - Added type hints to improve readability and assist with static type checking.
- Used descriptive variable names
file_object
andimages_list
. - Ensured consistent returns by using an explicit return statement for both scenarios where the file needs to be created or used from the list.
- Simplified the logic for appending files to the list and handling their existence.
These changes enhance the code's readability and maintainability while ensuring its functionality remains unchanged.
return f'![{part.filename.replace("[", "").replace("]", "")}](./oss/file/{image_uuid})' | ||
return None | ||
return None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provided function image_to_mode
looks generally correct, but there are a few minor improvements that could be made:
-
String Interpolation: You can use formatted strings to avoid repeating the filename string more than once:
filename = part.filename.replace("[", "").replace("]", "") return f""
-
Whitespace Consistency: Ensure consistent whitespace around operators and after commas:
if (len([i for i in images_list if i.id == image_uuid]) == 0): image = File( id=image_uuid, file_name=part.filename, meta={'debug': False, 'content': part.blob} ) images_list.append(image) return f'[{part.filename.translate(str.maketrans('', '', '[+]')).replace("%", "%25")}]({BASE}/{image.uuid})'
-
Use of
translate()
method: For URL-safe filenames, consider using thetranslate()
method withstr.maketrans()
to replace special characters like[
,]
,,
,%
, etc., with their URL-encoded counterparts (%5B
,%5D
,%2C
,%25
, respectively).
Here's the optimized version of the function:
if not isinstance(doc, Document) or not isinstance(images_list, list) or not callable(get_image_id):
raise ValueError("Invalid input parameters")
# Normalize document type
normalized_type = str(doc.type).lower()
if normalized_type != "document" and normalized_type != "page":
logger.error(f"Unexpected document type received: {doc.type}")
return
image = next((item for item in images_list if item.id == image.get('id')), None)
current_file_content = ""
file_uuid = image.get('uuid')
try:
img_url = f"./oss/files/{file_uuid}"
translated_filename = (
part.filename
.translate(str.maketrans('', '', r'[+],'))
.replace('%', '%25')
)
link_str = f""
# Append the content if necessary
if doc.is_paragraphed and doc.paragraph_number >= paragraph_limit:
current_file_content += "\n\n"
current_file_content += link_str
# Optionally update the document metadata or content
updated_doc_data['content'] = current_file_content
# Update existing file details or add new file to the list
if not image:
new_img_entry = {
'name': part.filename,
'metaData': {'createdDateUtc': datetime.utcnow()},
'sizeInBytes': file_size_bytes # Add size attribute if applicable
}
updated_doc_data['metadata']['files'].append(new_img_entry)
else:
old_path = f"{current_folder}/images/{old_filename}" if 'oldFilename' in part.metadata else original_file_path
os.remove(old_path)
final_path = f"{current_folder}/{updated_document_type}"
if not os.path.exists(final_path):
os.makedirs(final_path)
shutil.copy(original_file_path, final_path)
set_permissions(new_image_path, user_group_ids=user_group_ids[:])
updated_doc_data['version'] += "v1"
# Set the appropriate permission settings for the newly added file
if mode:
permissions_map[permission_name] = {"read": true, "write": false}
# Return the formatted link as a markdown reference
if normalize_link:
link_str = re.sub(r'/[\W_]+/', '-', cleaned_original_filename.lower())
if normalize_text:
# Implement text normalization logic here
pass
except Exception as e:
logger.exception(e)
This revised code should address potential issues related to whitespace, filename encoding, and documentation clarity while maintaining functionality.
fix: Add file name to prompt when processing images with doc