- added support for PDF/A-4 including Level F (file attachments) and Level E (engineering)
- added more informative logs in batch processing
- added new parameter to specify the default validation profile in case of missing standard identification in XMP Metadata
- extended the model to support PDF/A-4 rules
- more robust handling of malicious PDF documents
- improved parsing of PostScript and CFF fonts
- allow empty Lang Alt arrays in XMP metadata
- excluded All and None colorants from the PDF/A-2 and PDF/A-3 requirement to have the same tintTransform and alternateSpace
- disabled JPEG2000
colr
box checks in case of explicitly defined ColorSpace in the Image dictionary - fixed validation of predefined XMP value types if they are redefined in the extension schema
- validate XMP URL type as Text
- fixed CIDSet and CharSet validation for PDF/A-1
- fixed validation of the permissions dictionary in PDF/A-2 and PDF/A-3
- added validation of Lang against RFC 1766 regular expression
- fixed validation of permitted transfer functions in Halftone dictionaries
- added test corpus of ~600 new atomic documents covering PDF/A-4 specification
- extended PDF/A-2u tests on ToUnicode mapping and character encodings in simple fonts
- added support of Java versions from 11 to 16
- fixed validation of documents with non-PDF extension in multi-process mode
- added support for PDF/UA-1 (Machine) validation
- fixed issues with STDIN support
- added support for input files with non-pdf extension
- extended the model to support PDF/UA-1 rules
- fixed parsing of inline images
- fixed token parsing on non-ASCII systems
- fixed TrueType font parsing in case of different number of glyphs specified in maxp and post tables
- fixed infinite loop in circular dependency of CMaps
- added support for documents with zero pages
- improved parsing of Type1 font private data
- improved glyph width calculation for CFF fonts
- fixed null pointer exceptions on invalid PDF documents (multiple places)
- adjusted validation reports to support PDF/UA-1 profile
- added drag&drop support for input files in the GUI
- added PDFunction type to support validation of Type4 functions
- added types corresponding to the PDF 1.7 standard structure tagset
- added actualEncoding property of XMP Metadata
- fixed null pointer exception on empty trailer
- improved the logic of parsing page content consisting of multiple streams
- improved memory management in case of large number of validation errors
- fixed installer CLI args not forwarded [#apps-253], [#iss-1021]
- fixed Java 11 XML bind dependencies [#apps-254], [#iss-986]
- command line application termination codes [#apps-255], [#apps-256], [#iss-841]
- fixed XRefStream handling for trailers [#par-359]
- fixed space skipping after RD for type1 [#par-360], [#par-361]
- fixed various issues with AES decryption [#par-362], [#par-364], [#par-365]
- fixed font encoding issue [#par-363]
- fixed buffered in filter and COSStream problems [#par-366]
- fixed SOF exception for colorspaces [#par-368]
- fixed date parser issue [#par-368]
- fixed creation of PD resources from dictionary when cos object empty [#par-370]
- logging improvements [#par-367], [#par-371]
- fixed toUnicode for PDF/A-1 glyph name check [#par-372], [#iss-1012]
- add underlying color space for PDPattern [#par-373], [#iss-984]
- fixed xmp-core byte buffer [#lib-974]
- logging improvements [#lib-980], [#lib-1023]
- input stream and file processing XML results now consistent [#lib-1020], [#iss-1014]
- fixed Java 11 XML bind dependencies [#lib-1022], [#iss-986]
- performance tweaks [#lib-1028]
- fix metadata creation [#val-270]
- fix null pointer exception when processing glyphs [#val-271]
- fix null pointer exception when validating embedded files [#val-273], [#iss-976]
- fixed graphic state initial colorspace creation and font inheritance [#val-274], [#iss-975], [#iss-978]
- deny operators q, Q, cm inside Text object [#val-276], [#iss-985]
- preflight passes, veraPDF shows clause="6.2.11.4" error [#val-277], [#iss-1019]
- fix Ignore trailing zero for info dictionary values during metadata info match xmp check [#val-278], [#iss-1017]
- fixed Java 11 XML bind dependencies [#val-279], [#iss-986]
- fix processColorspace model link implementation for PDDeviceN [#val-280], [#iss-902]
- added warning for invalid color space objects [#val-281], [#iss-797]
- fix process color operator flag logic, add underlying color space processing for PDPattern [#val-282], [#iss-984]
- fix metadata creation [#pdf-195]
- fixed Java 11 XML bind dependencies [#pdf-197], [#iss-986]
- fix processColorspace model link implementation for PDDeviceN [#pdf-198], [#iss-902]
- added support for PDF files over 2Gb [#par-334]
- fixed date parsing issues with trailing apostrophe [#par-335]
- fixed bug with Standard Encoding in PostScript font programs [#par-340]
- fixed issue parsing digital signatures for files below 1024 bytes [#par-343], [#par-351], [#par-353]
- fixed issue with recognition of standard font with differences entry and parsing empty differences array [#par-345][#par-350]
- close streams properly during parsing and ensure tempfile deletion [#par-346], [#par-352], [#par-353]
- fixed signature EOF stream logic causing ByteRange issues[#par-348]
- Fixed detection of the CFF font charset in case of a predefined charset with incomplete glyph plus issues with Adobe Type 3 Font, Adobe Type 1 Font and ASCII85 parsing [#par-348], [#par-355]
- added check to disallow and report Type 1 PFB fonts [#par-349]
- add Identifier to validator type for reporting, added details to HTML report [#val-261], [#lib-940]
- fixed issue parsing digital signatures for files below 1024 bytes [#val-265]
- exclude process colors from spot color validation in DeviceN / NChannel color spaces for PDF/A-2 and 3 [#val-267] [#pdf-191]
- fixed metadata extensions support across different PDF/A levels [#lib-947]
- fixed bug with automatic selection and processing of PDF/A flavour [#pdf-190]
- fixed validation of smooth shading color spaces and inline images in presence of default color space [#val-263]
- fixed bug in embedded file features data extraction [#lib-951]
- fixed bug with validation rule caching [#lib-963]
- added fixes for thread safety and multithreading support [#apps-246], [#apps-248], [#par-342], [#lib-941], [#lib-960], [#val-262], [#pdf-189], [#lib-950] [#apps-244]
- added multi-process parallelization in the CLI [#apps-242]
- improved formatting of XML reports [#apps-243]
- fixed plug-ins loading mechanism [#apps-244]
- updated copyright statement for GUI [#apps-245]
- merged veraPDF-xmp project with library [#lib-964]
- fixed retrieval of glyph widths from PS and CFF font programs (multiple issues);
- optimized creation and cleanup of temporary files; and
- optimized parsing of text-related data in PDF documents (up to 3 times faster for PDF documents with primarily text content).
- fixed checks on the presence of SMask, NeedAppearances keys in case of invalid value types;
- fixed ByteRange check of digital signatures in case of incrementally updated files;
- fixed Unicode checks in PDF/A-1A validation for Type1 and Type3 fonts;
- fixed inheritance of /FT entry in Widget annotations; and
- fixed role map retrieval in case of remapping standard structure types.
- fixed Schematron warnings in Policy checks; and
- fixed issue with access to temp reports from Schematron stylesheets on some systems and in case of Java 9.
- fixed various issues caused by Java 9, particularly a problem in the start up scripts; and
- fixed access to PDF resources in case of veraPDF integration into web applications.
- fixed PS-specific issues in parsing embedded CMaps, ToUnicode maps and PS Type1 fonts;
- implemented the protection against (invalid) loops in PDF tree structures;
- fixed parsing of CIDSet and CharSet and their comparison with the glyph collection in the embedded font subset (PDF/A-2 and PDF/A-3);
- implemented support for CalCMYK colour space as specified in ISO 32000-1; and
- fixed initialization and inheritance of graphics state for tiling patterns, Type3 fonts and form XObjects.
- implemented check for glyph width consistence in case of Type3 fonts;
- implemented check for the Private Unicode Area use in Level A conformance;
- implemented validation of transfer functions in Halftone dictionaries (PDF/A-2 and PDF/A-3);
- added validation of MIME type value for embedded files (PDF/A-3);
- refactored the validation model to check for presence of certain keys, even if they refer to empty arrays/collections;
- fixed misspelled predefined CMap names for GBK2K-H and GBK2K-V;
- fixed validation of UTF8 encoding for role map names (PDF/A-2 and PDF/A-3); and
- fixed detection of references to Associated files from marked content sequences (PDF/A-3)
- added new features to the report:
- PDF Version;
- form field names and values; and
- page labels.
- implemented automatic configuration of veraPDF feature report by a custom Policy profile;
- implemented workaround for the veraPDF GUI appearance on high resolution screens;
- fixed problems with spaces if full JRE path used on Windows; and
- fixed problems handling spaces in installer path.
- implemented automatic generation of PREFORMA test reports; and
- both greenfield anb PDF Box versions now built and packaged from a single branch
- fixed veraPDF test files to comply with PDFA TechNote 0010; and
- fixed outlines Count value to comply with ISO 32000-1
- GUI and CLI now capable of checking for updated version of the software.
Updated validation logic to comply with Technical Working Group resolutions:
- color spaces are validated now when specified in the content stream;
- a CMap may refer only to predefined CMaps in the ‘usecmap’ operator; and
- OpenType causes an ‘unsupported font type’ error in PDF/A-1 validation.
Other conformance checker fixes and improvements:
- fixed calculation of glyph widths for embedded CFF fonts in some special cases;
- fixed validation of digital signature ByteRange array;
- fixed recursive links of color spaces;
- fixed validation of spaces in the indirect object header;
- fixed Crypt filter handling;
- fixed misprint in ‘Lbl’ structure tag; and
- added support for glyphs with CIDs > 65535 (with an appropriate validation. error)
- fixed feature extraction from encrypted documents.
- added new test corpus for the Technical Working Group resolutions;
- fixed test file for digital signature validation; and
- fixed test files for Extension Schema definitions in XMP.
- significant optimization of performance in the greenfield PDF parser
- fixed parsing of embedded PS Type1 fonts
- fixed default value of WMode entry for embedded CMaps
- fixed inline image data parsing
- fixed digital signature parsing
- refactored feature extraction
- pretty formatted XML reports
- clearer XML structure in veraPDF reports
- improvements to the HTML reports
- added GUI wizard for creating custom policy files
- release artifacts now deployed to Maven Central
- started transfer to external static code QA service
- aligned the existing veraPDF corpus and added 80 new test files to cover Technical Working Group resolutions
PDFBox version downloadable from http://downloads.verapdf.org/rel/verapdf-installer.zip. Greenfield version downloadable from: http://downloads.verapdf.org/gf/verapdf-gf-installer.zip.
This is a maintenance release focused on bug fixing and improvements of the test infrastructure.
- fixed cache issues in parsing embedded CMaps
- fixed multiple issues with glyph widths checks for embedded CID fonts
- fixed CIDSet entry validation
- fixed delimiter handling in parsing content streams
- ignore None colorants when checking DeviceN color spaces
- fixed validation of Order arrays in optional content groups
- fixed parsing of /ToUnicode map
- fixed plug-in infrastructure
- fixed handling of unknown feature types
- added error info into HTML reports in case of broken PDFs
- updated developer samples
- updated GUI documentation
PDFBox version downloadable from http://downloads.verapdf.org/rel/verapdf-installer.zip. Greenfield version downloadable from: http://downloads.verapdf.org/gf/verapdf-gf-installer.zip.
- fixed default values for extracted PDF features; and
- fixed removal of temporary files.
- fixed cmap table parsing in TrueType fonts.
- changed Metadata File provenance test files from fail to pass (as discussed at the validation technical working group); and
- fixed xref table in test case 6-3-3-t01-pass-a.
Last pre-version 1.0 release. PDFBox version downloadable from http://downloads.verapdf.org/rel/verapdf-installer.zip. Greenfield version downloadable from: http://downloads.verapdf.org/gf/verapdf-gf-installer.zip.
- schematron based policy checker implementation:
- Example policy schemas on GitHub https://github.com/veraPDF/veraPDF-policy-docs/tree/master/Schemas; and
- policy functionality available in CLI and GUI;
- greenfield implementation of feature extraction;
- greenfield implementation of metadata fixer;
- GUI now supports checking multiple files or a directory;
- HTML summary report for multiple file results;
- single file detailed report containing policy and feature information; and
- stability improvements and performance optimization of the Greenfield parser.
- fixed glyph width checks in case of exactly 1/1000 point difference;
- fixed default color space processing for Indexed color spaces;
- fixed Order array support for OCG checks in PDF/A-2; and
- fixed Unicode character maps support for PDF/A-1 Level A.
We've made two downloads available for out 0.26 release. There's the usual version, based on Apache PDFBox and downloadable from: http://downloads.verapdf.org/rel/verapdf-installer.zip. For 0.26 we've also prepared the first beta release of our purpose built PDF parser and validation model, also known as the greenfield validator. This is downloadable from: http://downloads.verapdf.org/gf/verapdf-gf-installer.zip. It's not functionally complete yet as it only supports PDF/A validation. Full details of the release features are listed below.
- added the new rule for embedded files to be associated with the document or its parts (PDF/A-3 only).
- first beta release of greenfield PDF/A validation available as a limited functionality app;
- refactoring of sub-component and application configuration for reproducible execution;
- new BatchProcessor producing multi-item reports;
- batch processing is stream/event driven with event handlers for processing results; and
- report structures altered to accommodate batch processing.
- publication of integration tests for Greenfield components;
- memory usage and execution times in test reports;
- example test report available here at time of writing: http://tests.verapdf.org/0.26.8/
- added 7 new test files to cover the new rule in PDF/A-3 validation profile
In order to accommodate batch reporting for this release we've had to sacrifice redirecting output to user files. This isn't permanent and will be re-instated for the next release. The following functionality has been temporarily disabled:
- HTML report from CLI, HTML reporting will be a function of the dedicated reporter in the next release. HTML reports are still available from the GUI;
- the -pw option that allows the user to override the profiles wiki, this was only used to generate the HTML report so is not required;
- the -c load config option, config is automatically loaded from the app area and we're adding user config to the next release;
- the --reportfile, --reportfolder, and --overwriteReportFile as these need a rethink to accommodate batch processing.
The Greenfield release is missing all of the above plus:
- metadata fixing in GUI and CLI; and
- feature extraction in GUI and CLI.
- added extraction of the AFRelationship key for embedded files as a part of veraPDF feature extraction.
- implemented prototype of batch validation from CLI and GUI;
- implemented robust handling of run-time exceptions during batch processing; and
- added error info on the run-time exceptions to the validation report.
- moved feature extraction and metadata fixing code to library; and
- tidied various compiler warnings.
- changed default feature generation to document level features
- added a new GUI dialog for managing feature generation options
- added a user-friendly Java OutOfMemoryError with suggestions for reconfiguration
- CLI can now overwrite report files
- added help message when CLI processes STDIN stream
- synchronized the Web demo validation report with the CLI and GUI report styles
- removed the rules for validating file provenance information (based on veraPDF TWG discussion)
- fixed an issue with structure type mapping in Level A validation
- implemented resource caching for memory optimization
- converted all 'fail' test cases on file provenance information to 'pass' tests
- added signature types to features report;
- depth of feature reporting now configurable; and
- altered log level of some validation methods.
- fix for validation of character encoding requirements of invisible fonts; and
- fix for ICC Profile mluc tag.
- 34 new test files for PDF/A-2b.
This beta release provides fixes for PDF/A Validation, enhanced functionality & usability fixes for the application, and additions to the test corpus. It also marks the launch of our beta documentation site.
- suppress all PDFBox warnings in the CLI interface when parsing PDF documents
- generate error report instead of the exception in case of broken PDF documents
- added a new CLI option to save XML report to a separate file in recursive PDF processing
- enhancements to example pure java plugins available
- plugins now configurable through dedicated config file
- ignore DeviceGray color space in soft mask images
- treat glyph with GID 0 as “.notdef” in case of Type0 fonts
- fixed validation of role map for non-standard structure elements (Level A)
- fixed validation of page size implementation limits in case of negative width or height
- fixed validation of non-standard embedded CMaps referenced from other CMaps
- added 180 new test files for parts 2 and 3
- test coverage now monitored by Codecov online service
- integration tests for 2u and 3b validation profiles added
- using codacy and coverity online code QA services
This beta release features the full support of all PDF/A-2 and PDF/A-3 requirements (all levels). Together with earlier support of PDF/A-1 validation, it represents the first full support for all PDF/A parts and conformance levels.
- Conformance checker
- added validation of digital signature requirements
- added extraction of color space info from JPEG2000 images
- added validation of permissions dictionary (Parts 2 and 3)
- PDF/A-2B fix: correct implementation of CIDSystemInfo entry requirements
- command line support for plugin execution to extend feature extraction
- first set of example pure java plugins available
- optional sample plugin pack available through installer
- Added further 112 atomic test files for parts 2 and 3
- added support for all parts and conformance levels at http://demo.verapdf.org
This beta release features Transparency and Unicode character map validation in PDF/A-2 levels B and U.
- Conformance checker:
- added all transparency-related validation rules in PDF/A-2 and PDF/A-3
- added full Level U support in PDF/A-2 and PDF/A-3
- code refactoring to synchronize GUI, API and CLI interfaces
- PDF/A-1B fix: check both Tiling patterns used as different fill and stroke colour spaces in the same drawing operations
- added initial versions of PDF/A-2U, PDF/A-2A, PDF/A-3U, PDF/A-3A validation profiles. We now have initial validation for all PDF/A flavours.
- added a further 65 atomic test files for PDF/A-2 specific requirements
- a demo of the REST interface is available at http://demo.verapdf.org
This beta release features improved PDF/A-2b and PDF/A-3b validation and the fully featured REST API.
-
Conformance checker:
- PDF/A-2 and PDF-A/3 improvements: implement checks for optional content, JPEG2000 requirements
- full compliance with BFO test suite (PDF/A-2b)
- PDF/A-1b fix: check for form field appearance
- code refactoring to enable PDF model implementation via different PDF parsers
- performance and memory optimization
-
Test corpus:
- full coverage of all predefined XMP properties
-
Documentation
- first version of wiki on all validation rules available at: https://github.com/veraPDF/veraPDF-validation-profiles/wiki
-
Command line:
- CLI now supports metadata fixing
-
Infrastructure
- veraPDF-library project refactored into multiple projects.
- PDF Box validator implementation in separate project.
- Automated source packaging with dependencies.
This beta release features command line interface enhancements
-
Conformance checker:
-
new implementation of the XMP validation
-
proper CharSet / CIDSet validation
-
Command line:
-
processes stdin if no file paths are supplied for use in *nix pipes;
-
directory and recursive sub-directory processing; and
-
text mode output with summarised output
-
Test corpus:
-
initial set of PDF/A-2 test files
-
Conformance checker:
-
fixed CMap / WMode validation
-
minor fixes in PDF/A-2b and PDF/A-3b validation profiles
-
based on TWG resolution, fixed validation normal appearance object type (Dict vs Stream) for Button widgets
-
Command line fixes:
-
all CLI output for a single file now in one XML document; and
-
error output now all to stderr, keeping stdin clean;
This beta release features a command line interface for validation and feature extraction, with supporting install scripts.
- Refactored plug-in architecture.
- Re-designed command line interface for PDF/A validation and feature reporting.
- Updated validation profile syntax.
- Simplified machine-readable report format.
- Bug fixes for:
- comparison of Info dictionary and XMP metadata (PDF/A-1);
- support for missing resources and resource inheritance mechanism (PDF/A-1); and
- parsing TrueType fonts with zero-length tables.
- Synchronization with PDFBox 2.0 RC1 library.
- Installer adds command line scripts.
- Refactored integration tests.
This beta release includes a fully functional, internally-tested implementation for complete PDF/A-1b validation, PDF Feature Report generation, and the Metadata Fixer.
- Stable (beta version) implementation of the formal PDF model for PDF/A-1b
- Prototype the formal PDF model for PDF/A-1a and PDF/A-2b, 3b
- Minor refactoring and stricter naming conventions in validation rules for PDF/A-1b
- Prototype validation rules for PDF/A-1a and PDF/A-2b, 3b
- Prototype implementation of the Metadata Fixer
- Prototype implementation of the plug-in architecture for PDF Feature Report generation
- Optimized performance for PDF/A font rules validation (glyphs presence, widths consistency)
- Added cross-platform installer
- Full coverage of Bavaria, Isartor, veraPDF test suites in automated integration tests
- New test files for the use of Device colour spaces and XMP metadata
- 200 new atomic test files for PDF/A-1b extending Isartor and Bavaria test suites
The release includes a fully functional implementation (alpha version) for the complete PDF/A-1B validation and the PDF Feature Report generation
- A number of bug fixes in the implementation of the formal PDF model for PDF/A Level B validation
- Added missing validation rules for the full coverage of ISO 19005-1:2005, 19005-1:2005/Cor.1:2011, 19005-1:2005/Cor.1:2007, 19005-1:2005/Cor.2:2011, Level B
- Complete implementation of the PDF Feature Report generation
- Minor improvements in the GUI and the Human-readable Report in HTML format
- Added extra parameters to limit the number of rule failures and the number of reported errors
- Optimized performance
- Increased unit test coverage to 70%
- Increased the number of atomic validation tests to 226, including the full coverage of tests against Isartor test corpus
- Total 176 atomic test files for PDF/A-1B extending Isartor test suite
The release includes a fully functional prototype for the PDF/A-1b validation and the PDF Feature Report generation
- The formal PDF model for PDF/A Level B validation
- The set of validation rules covering ISO 19005-1:2005, 19005-1:2005/Cor.1:2011, 19005-1:2005/Cor.1:2007, 19005-1:2005/Cor.2:2011, Level B
- Implementation of the rules covering the following sections of ISO 19005-1:
- 6.1 File structure
- 6.2 Graphics
- 6.4 Transparency
- 6.5 Annotations
- 6.6 Actions
- 6.7 Metadata
- 6.9 Interactive Forms
- Initial implementation of the PDF Feature Report generation
- Minor improvements in the GUI and the Human-readable Report in HTML format
- Fully automated build and deployment procedures based on quality gateways
- More restrictive quality criteria (50% unit test coverage, 80% documentation of public API, zero-tolerance to critical Sonar issues)
- Integration tests framework with 150 atomic validation test
- Draft set of new test files for ISO 19005-1:
- 6.1 File structure
- 6.2 Graphics (under review of veraPDF Technical Workgroup of the PDF Association)
This is a minimal viable product release of the veraPDF Conformance Checker
- Implementation of the generic validation model
- Parser for validation profile
- Initial version of machine-readable reports
- Proof of concept for human-readable reports in HTML format
- Initial version of the CLI interface for the Conformance Checker
- Initial version of the GUI application
- Partial implementation of the COS layer for the PDF/A validation model
- PDF/A-1 validation profiles for ISO 19005-1:2005, 6.1 File structure
- Draft set of test files for ISO 19005-1:2005, 6.1 File structure
© 2015 [veraPDF Consortium](http://www.verapdf.org)