Merged
25 commits
9baad17
feat: add missing direct API tools
jdrhyne Jun 25, 2025
0ea6ec5
test: add comprehensive integration tests for new Direct API methods
jdrhyne Jun 25, 2025
6d414eb
fix: correct action types and add missing mappings for new tools
jdrhyne Jun 25, 2025
b667868
fix: resolve potential CI failures in new Direct API methods
jdrhyne Jun 25, 2025
92f1dbe
fix: correct API parameter formats for createRedactions
jdrhyne Jun 25, 2025
827f2fb
fix: add Python 3.9 compatibility by replacing new syntax
jdrhyne Jun 26, 2025
5c75ff3
fix: add Python 3.9 compatibility to remaining integration test file
jdrhyne Jun 26, 2025
28a4d27
fix: configure project for Python 3.9+ compatibility
jdrhyne Jun 26, 2025
37c7804
fix: resolve Python 3.9 compatibility in remaining integration test f…
jdrhyne Jun 26, 2025
c76074c
fix: restore modern Python 3.10+ syntax as intended by project design
jdrhyne Jun 26, 2025
e9be734
fix: apply code formatting with ruff format
jdrhyne Jun 26, 2025
d41429f
fix: remove unsupported base_url parameter from test fixtures
jdrhyne Jun 26, 2025
5cb0db5
fix: replace Python 3.10+ union syntax in integration tests
jdrhyne Jun 26, 2025
813800c
fix: resolve ruff linting issues in integration tests
jdrhyne Jun 26, 2025
79b945a
fix: resolve isinstance union syntax runtime error
jdrhyne Jun 26, 2025
b41d4e7
fix: remove unsupported stroke_width parameter and update preset values
jdrhyne Jun 26, 2025
18b8e1f
fix: critical API integration issues for new Direct API methods
jdrhyne Jun 26, 2025
6400965
fix: correct API parameter formats based on live testing
jdrhyne Jun 26, 2025
2a0bc98
fix: comprehensive fix for Direct API integration
jdrhyne Jun 26, 2025
6e96b0f
fix: comprehensive integration test fixes based on API patterns
jdrhyne Jun 26, 2025
1526834
fix: comprehensive CI failure resolution based on multi-LLM analysis
jdrhyne Jun 26, 2025
9516f48
fix: apply ruff formatting to http_client.py
jdrhyne Jun 26, 2025
1d26358
fix: resolve API compatibility issues found in integration tests
jdrhyne Jun 26, 2025
83a9dbe
fixes issues so that we pass integration tests (#30)
HungKNguyen Jul 1, 2025
b6b7186
fixing linting issue (#31)
HungKNguyen Jul 2, 2025
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -68,7 +68,7 @@ result = self._http_client.post("/build", files=files, json_data=instructions)
```

### Key Learnings from split_pdf Implementation
-- **Page Ranges**: Use `{"start": 0, "end": 5}` (0-based, end exclusive) and `{"start": 10}` (to end)
+- **Page Ranges**: Use `{"start": 0, "end": 4}` (0-based, end inclusive) and `{"start": 10}` (to end)
- **Multiple Operations**: Some tools require multiple API calls (one per page range/operation)
- **Error Handling**: API returns 400 with detailed errors when parameters are invalid
- **Testing Strategy**: Focus on integration tests with live API rather than unit test mocking
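
Since the corrected note treats `end` as inclusive, callers who think in Python-style half-open ranges need to subtract one from the stop index. A minimal sketch of that conversion follows; `to_inclusive_range` is a hypothetical helper written for illustration and is not part of the client.

```python
from typing import Dict, Optional


def to_inclusive_range(start: int, stop: Optional[int] = None) -> Dict[str, int]:
    """Convert a half-open, 0-based range [start, stop) to the inclusive form.

    Hypothetical helper for illustration; not part of nutrient_dws.
    """
    if start < 0:
        raise ValueError("start must be >= 0")
    if stop is None:
        # Open-ended range: omit "end" so the range runs to the last page.
        return {"start": start}
    if stop <= start:
        raise ValueError("stop must be greater than start")
    # Inclusive end is one less than the exclusive stop.
    return {"start": start, "end": stop - 1}


print(to_inclusive_range(0, 5))  # the first five pages
print(to_inclusive_range(10))    # page 11 to the end
```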
126 changes: 126 additions & 0 deletions PR_CONTENT.md
@@ -0,0 +1,126 @@
# Pull Request: Add Missing Direct API Tools

## Summary
This PR adds 8 new Direct API methods that were missing from the Python client, bringing it closer to feature parity with the Nutrient DWS API.

## New Tools Added

### 1. Create Redactions (3 methods for different strategies)
- `create_redactions_preset()` - Use built-in patterns for common sensitive data
  - Presets: social-security-number, credit-card-number, email-address, international-phone-number, north-american-phone-number, date, time, us-zip-code
- `create_redactions_regex()` - Custom regex patterns for flexible redaction
- `create_redactions_text()` - Exact text matches with case sensitivity options

### 2. PDF Optimization
- `optimize_pdf()` - Reduce file size with multiple optimization options:
  - Grayscale conversion (text, graphics, images)
  - Image optimization quality (1-4, where 4 is most optimized)
  - Linearization for web viewing
  - Option to disable images entirely

### 3. Security Features
- `password_protect_pdf()` - Add password protection and permissions
  - User password (for opening)
  - Owner password (for permissions)
  - Granular permissions: print, modification, extract, annotations, fill, etc.
- `set_pdf_metadata()` - Update document properties
  - Title, author, subject, keywords, creator, producer

### 4. Annotation Import
- `apply_instant_json()` - Import Nutrient Instant JSON annotations
  - Supports file, bytes, or URL input
- `apply_xfdf()` - Import standard XFDF annotations
  - Supports file, bytes, or URL input

## Implementation Details

### Code Quality
- ✅ All methods have comprehensive docstrings with examples
- ✅ Type hints are complete and pass mypy checks
- ✅ Code follows project conventions and passes ruff linting
- ✅ All existing unit tests continue to pass (167 tests)

### Architecture
- Methods that require file uploads (apply_instant_json, apply_xfdf) handle them directly
- Methods that use output options (password_protect_pdf, set_pdf_metadata) use the Builder API
- All methods maintain consistency with existing Direct API patterns

### Testing
- Comprehensive integration tests added for all new methods (28 new tests)
- Tests cover success cases, error cases, and edge cases
- Tests are properly skipped when API key is not configured
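
The skip behavior described above can be sketched with the standard library's `unittest.skipUnless`. The environment variable name `NUTRIENT_API_KEY` and the test class below are assumptions for illustration; the project's suite may read the key differently (for example, through a pytest fixture).

```python
import os
import unittest

# Assumed variable name; the real suite may load the key another way.
API_KEY = os.environ.get("NUTRIENT_API_KEY")


@unittest.skipUnless(API_KEY, "NUTRIENT_API_KEY is not configured")
class TestLiveApi(unittest.TestCase):
    """Runs only when an API key is present in the environment."""

    def test_key_is_present(self) -> None:
        # Placeholder for a real call against the live API.
        self.assertTrue(API_KEY)


def run_suite() -> unittest.TestResult:
    """Run the test case and return the result for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLiveApi)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
print(f"ran={result.testsRun} skipped={len(result.skipped)}")
```

Without a key, the test is reported as skipped rather than failed, which keeps CI runs without credentials green while still exercising the live API where a key is available.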

## Files Changed
- `src/nutrient_dws/api/direct.py` - Added 8 new methods (565 lines)
- `tests/integration/test_new_tools_integration.py` - New test file (481 lines)

## Usage Examples

### Redact Sensitive Data
```python
# Redact social security numbers
client.create_redactions_preset(
    "document.pdf",
    preset="social-security-number",
    output_path="redacted.pdf"
)

# Custom regex redaction
client.create_redactions_regex(
    "document.pdf",
    pattern=r"\b\d{3}-\d{2}-\d{4}\b",
    appearance_fill_color="#000000"
)

# Then apply the redactions
client.apply_redactions("redacted.pdf", output_path="final.pdf")
```
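
Before sending a custom pattern to `create_redactions_regex()`, it can help to check it locally. The sketch below uses only Python's `re` module with the pattern from the example above. The sample text is made up, and the API's regex engine may differ from Python's in edge cases.

```python
import re

# Same pattern as in the redaction example: three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Made-up sample: one SSN-shaped value plus two near misses.
sample = "Applicant SSN: 123-45-6789, phone: 555-867-5309, ref 1234-56-7890."

matches = SSN_PATTERN.findall(sample)
print(matches)
```

Only the SSN-shaped value should match here: the phone number has three digits in its middle group, and the word boundary rejects the longer reference number.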

### Optimize PDF Size
```python
# Aggressive optimization
client.optimize_pdf(
    "large_document.pdf",
    grayscale_images=True,
    image_optimization_quality=4,
    linearize=True,
    output_path="optimized.pdf"
)
```

### Secure PDFs
```python
# Password protect with restricted permissions
client.password_protect_pdf(
    "sensitive.pdf",
    user_password="view123",
    owner_password="admin456",
    permissions={
        "print": False,
        "modification": False,
        "extract": True
    }
)
```

## Breaking Changes
None - all changes are additive.

## Migration Guide
No migration needed - existing code continues to work as before.

## Checklist
- [x] Code follows project style guidelines
- [x] Self-review of code completed
- [x] Comments added for complex code sections
- [x] Documentation/docstrings updated
- [x] No warnings generated
- [x] Tests added for new functionality
- [x] All tests pass locally
- [ ] Integration tests pass with live API (requires API key)

## Next Steps
After merging:
1. Update README with examples of new methods
2. Consider adding more tools: HTML to PDF, digital signatures, etc.
3. Create a cookbook/examples directory with common use cases
18 changes: 9 additions & 9 deletions SUPPORTED_OPERATIONS.md
@@ -171,16 +171,16 @@ Splits a PDF into multiple documents by page ranges.
parts = client.split_pdf(
    "document.pdf",
    page_ranges=[
-       {"start": 0, "end": 5},  # Pages 1-5
-       {"start": 5, "end": 10},  # Pages 6-10
+       {"start": 0, "end": 4},  # Pages 1-5
+       {"start": 5, "end": 9},  # Pages 6-10
        {"start": 10}  # Pages 11 to end
    ]
)

# Save to specific files
client.split_pdf(
    "document.pdf",
-   page_ranges=[{"start": 0, "end": 2}, {"start": 2}],
+   page_ranges=[{"start": 0, "end": 1}, {"start": 2}],
    output_paths=["part1.pdf", "part2.pdf"]
)

@@ -264,7 +264,7 @@ Sets custom labels/numbering for specific page ranges in a PDF.
- `labels`: List of label configurations. Each dict must contain:
- `pages`: Page range dict with `start` (required) and optionally `end`
- `label`: String label to apply to those pages
-- Page ranges use 0-based indexing where `end` is exclusive.
+- Page ranges use 0-based indexing where `end` is inclusive.
- `output_path`: Optional path to save the output file

**Returns:**
@@ -276,8 +276,8 @@ Sets custom labels/numbering for specific page ranges in a PDF.
client.set_page_label(
    "document.pdf",
    labels=[
-       {"pages": {"start": 0, "end": 3}, "label": "Introduction"},
-       {"pages": {"start": 3, "end": 10}, "label": "Chapter 1"},
+       {"pages": {"start": 0, "end": 2}, "label": "Introduction"},
+       {"pages": {"start": 3, "end": 9}, "label": "Chapter 1"},
        {"pages": {"start": 10}, "label": "Appendix"}
    ],
    output_path="labeled_document.pdf"
@@ -286,7 +286,7 @@ client.set_page_label(
# Set label for single page
client.set_page_label(
    "document.pdf",
-   labels=[{"pages": {"start": 0, "end": 1}, "label": "Cover Page"}]
+   labels=[{"pages": {"start": 0, "end": 0}, "label": "Cover Page"}]
)
```
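
Inclusive ranges written by hand are easy to get wrong by one page. As a sketch, the label list from the example can be derived from section lengths instead. `build_page_labels` is a hypothetical helper, not part of the client; it only builds the `labels` argument.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_page_labels(
    sections: List[Tuple[str, Optional[int]]],
) -> List[Dict[str, Any]]:
    """Turn (label, page_count) pairs into label dicts with inclusive ends.

    A page_count of None means "to the end" and is only valid for the
    last section. Hypothetical helper for illustration.
    """
    labels: List[Dict[str, Any]] = []
    start = 0
    for index, (label, count) in enumerate(sections):
        if count is None:
            if index != len(sections) - 1:
                raise ValueError("only the last section may be open-ended")
            labels.append({"pages": {"start": start}, "label": label})
            break
        if count < 1:
            raise ValueError("page_count must be at least 1")
        # Inclusive end: a 3-page section starting at index 0 ends at index 2.
        labels.append(
            {"pages": {"start": start, "end": start + count - 1}, "label": label}
        )
        start += count
    return labels


labels = build_page_labels(
    [("Introduction", 3), ("Chapter 1", 7), ("Appendix", None)]
)
print(labels)
```

The resulting list has the same shape as the `labels` argument passed to `set_page_label()` in the example.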

@@ -318,7 +318,7 @@ client.build(input_file="report.docx") \
client.build(input_file="document.pdf") \
    .add_step("rotate-pages", {"degrees": 90}) \
    .set_page_labels([
-       {"pages": {"start": 0, "end": 3}, "label": "Introduction"},
+       {"pages": {"start": 0, "end": 2}, "label": "Introduction"},
        {"pages": {"start": 3}, "label": "Content"}
    ]) \
    .execute(output_path="labeled_document.pdf")
@@ -383,4 +383,4 @@ Common exceptions:
- `APIError` - General API errors with status code
- `ValidationError` - Invalid parameters
- `FileNotFoundError` - File not found
-- `ValueError` - Invalid input values
+- `ValueError` - Invalid input values
1 change: 1 addition & 0 deletions pyproject.toml
@@ -83,6 +83,7 @@ ignore = [
"D100", # Missing docstring in public module
"D104", # Missing docstring in public package
"D107", # Missing docstring in __init__
+"UP038", # Use `X | Y` in `isinstance` call instead of `(X, Y)` - not supported at runtime before Python 3.10
]

[tool.ruff.lint.pydocstyle]
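
For context on the `UP038` ignore: the tuple form of `isinstance` works on every Python 3 version, while the `X | Y` form is only accepted at runtime from Python 3.10 onward. A small sketch of the difference follows; `is_text_or_bytes` is a made-up function name used for illustration.

```python
import sys


def is_text_or_bytes(value: object) -> bool:
    """Tuple form of isinstance: valid on all Python 3 versions."""
    return isinstance(value, (str, bytes))


print(is_text_or_bytes("abc"), is_text_or_bytes(b"abc"), is_text_or_bytes(42))

if sys.version_info >= (3, 10):
    # PEP 604 unions are accepted by isinstance only on 3.10 and later;
    # on earlier versions evaluating `str | bytes` raises TypeError.
    print(isinstance("abc", str | bytes))
```

Keeping the tuple form and silencing the lint rule avoids a runtime dependency on the interpreter version.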