insert, delete pages and manage bookmarks #153
@plasticassius I find your code very useful and I used it in my project, but I can imagine cases where it would fail (see below if you're interested). Currently you are just checking: It could also add a condition to see whether there is a GoTo action type and then whether it refers to some page in the document. However, the /D entry (the destination to jump to) can be not only an indirect reference to a page object but also a named destination (stored in the Root.Names.Dests ... arrays; the ellipsis means that the actual structure can vary, as the array can be spread across multiple objects). I probably haven't covered everything, as I would need to brush up on my knowledge of PDF outlines a little. Anyway, I would greatly appreciate it if you continued to develop this code. |
I haven't come across a PDF with this type of bookmark, but if you have one, I can add this check to see if it would work. One issue I would like to handle more gracefully is merging documents with form fields. In particular, when I merge documents with digital signatures, the resulting document doesn't play well with PDF-Xchange. The approach I've taken is to manually remove the digital signatures with a text editor and to check for the presence of form fields in the utility. I have to admit that reading and understanding PDF reference 1.7 is more than I want to handle, particularly since I don't run into any examples of most of it. |
@plasticassius Here are example PDF files with outlines.
|
I looked at your sample_nametrees.pdf file, and http://www.tecxoft.com/samples.php has some examples of signed pdfs. The problem is that they are considered as read only by some tools, including PDF-Xchange, making them problematic to work with. |
Files may look read-only, probably because some document changes are not permitted. What exactly is your goal? If you want to modify a file but preserve a valid signature, then I don't think that's possible. If you don't care about the signature, then it should work. You can concatenate two files or remove the signature field. See the example code below. Concatenating works for me; the signature becomes invalid. I used the following files 1, 2:
If you don't care about the signature, you can remove it (I used the following PDF):
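As a rough sketch of the removal idea (not the exact code used here), signature fields can be filtered out of the form's field list by their /FT value. Plain Python dicts stand in for the PDF objects, which is an assumption made so the example runs without any PDF library; the helper names `strip_signature_fields` and `strip_widgets` are hypothetical. It also assumes each field dictionary is merged with its widget annotation and that fields are not nested under /Kids.

```python
def strip_signature_fields(acroform):
    """Remove top-level fields whose /FT is /Sig and return the removed ones."""
    fields = acroform.get("/Fields", [])
    removed = [f for f in fields if f.get("/FT") == "/Sig"]
    acroform["/Fields"] = [f for f in fields if f.get("/FT") != "/Sig"]
    # /SigFlags only makes sense while signature fields exist.
    acroform.pop("/SigFlags", None)
    return removed


def strip_widgets(page, removed):
    """Drop the removed fields from a page's /Annots (identity comparison)."""
    annots = page.get("/Annots", [])
    page["/Annots"] = [a for a in annots if not any(a is r for r in removed)]


# Hypothetical stand-ins for a form with one signature and one text field.
sig = {"/FT": "/Sig", "/T": "Signature1", "/Subtype": "/Widget"}
text = {"/FT": "/Tx", "/T": "Name", "/Subtype": "/Widget"}
acroform = {"/Fields": [sig, text], "/SigFlags": 3}
page = {"/Annots": [sig, text]}

removed = strip_signature_fields(acroform)
strip_widgets(page, removed)
```

After this, the form and the page keep only the text field. A real document may need the same pass on every page and on nested fields.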
|
sample_nametrees.pdf contains named destinations. So elt.A.D[0] contains a literal string (the named destination), and the correspondence between strings and destinations is defined by the /Dests entry in the document’s name dictionary (Root.Names.Dests). The value of this entry is a name tree mapping name strings to destinations. In sample_nametrees.pdf, Root.Names.Dests has 5 kids, each containing a names/object PDF array (the actual structure can vary between files). Selected objects from sample_nametrees.pdf:
A name tree object may have three different entries:
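To make the lookup concrete, here is a minimal sketch of resolving a named destination through a name tree. Plain Python dicts and lists stand in for the PDF objects (an assumption for illustration, so no PDF library is needed), and the helper names `lookup_name` and `resolve_destination` as well as the sample tree are hypothetical. Per the PDF reference, a node can carry /Kids (child nodes), /Names (a flat array of alternating key/value pairs) and /Limits (the smallest and largest key below that node).

```python
def lookup_name(node, key):
    """Return the value stored under `key` in a name tree, or None."""
    limits = node.get("/Limits")
    if limits is not None and not (limits[0] <= key <= limits[1]):
        return None  # the key cannot be anywhere below this node
    names = node.get("/Names")
    if names is not None:
        # /Names is flat: [key1, value1, key2, value2, ...]
        for k, v in zip(names[0::2], names[1::2]):
            if k == key:
                return v
    for kid in node.get("/Kids", []):
        found = lookup_name(kid, key)
        if found is not None:
            return found
    return None


def resolve_destination(dest, dests_root):
    """A string /D value is a named destination; anything else is
    treated as an explicit destination array and returned unchanged."""
    if isinstance(dest, str):
        value = lookup_name(dests_root, dest)
        # The tree may store a dictionary whose /D entry holds the array.
        if isinstance(value, dict):
            return value.get("/D")
        return value
    return dest


# Hypothetical tree; "page N" strings stand in for page object references.
dests = {
    "/Kids": [
        {"/Limits": ["chapter1", "chapter2"],
         "/Names": ["chapter1", ["page 1", "/Fit"],
                    "chapter2", {"/D": ["page 5", "/Fit"]}]},
        {"/Limits": ["index", "index"],
         "/Names": ["index", ["page 9", "/XYZ", 0, 700, None]]},
    ]
}
```

With this tree, `resolve_destination("chapter2", dests)` walks into the first kid and unwraps the /D entry, while a destination that is already an array is passed through untouched.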
|
I don't want to modify signatures, rather I combine pdfs into other pdfs. However, when one of the component pdfs has a signature, the entire result gets marked as read only by some tools. This can be fixed by removing the signatures in the first place. This is what I mean by handling signatures more gracefully. At this point I don't have a need to handle destinations, named tree objects, ... |
I think I get your point now. The issue may be caused by pdfrw.PdfWriter. It changes the content of the PDF file (by file I mean the bytes of the binary PDF file) even when you make no changes to the document (by document I mean the content you see when you open the PDF file in a viewer). Try just reading a PDF containing a signature field with PdfReader and then writing it back to a new PDF file with PdfWriter. The file size may change, as well as the order of the objects in the file, because the writing process is unorganized. That makes the byte offsets in the Signature Dictionary (in the ByteRange entry) incorrect. I assume that applications cannot handle that correctly. I tested with Adobe Acrobat Pro 11, which became non-responsive when I tried to check the advanced signature properties, so I had to kill it; PDF-Xchange viewer just says that the signature is invalid because the file was modified or corrupted; Foxit doesn't show the fields at all. So it depends on the implementation. |
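The ByteRange effect described above can be illustrated with a toy example. This is only a sketch: the "file" is a made-up byte string, SHA-256 is an assumed digest (a real signature names its own algorithm), and `byterange_digest` is a hypothetical helper. A /ByteRange array has the form [offset1, length1, offset2, length2] and selects everything except the gap that holds the signature value.

```python
import hashlib


def byterange_digest(data, byte_range):
    """Hash the bytes selected by a /ByteRange array [off1, len1, off2, len2]."""
    h = hashlib.sha256()
    for offset, length in zip(byte_range[0::2], byte_range[1::2]):
        h.update(data[offset:offset + length])
    return h.hexdigest()


# Toy file: signed bytes, a gap reserved for the signature value, signed bytes.
placeholder = b"<SIGNATURE>"
original = b"%PDF-1.7 obj-A obj-B " + placeholder + b" trailer %%EOF"
gap_start = original.index(placeholder)
gap_end = gap_start + len(placeholder)
byte_range = [0, gap_start, gap_end, len(original) - gap_end]

signed_digest = byterange_digest(original, byte_range)

# A rewrite that only reorders objects shows the same document in a viewer,
# but the old offsets now select different bytes.
rewritten = b"%PDF-1.7 obj-B obj-A " + placeholder + b" trailer %%EOF"
rewritten_digest = byterange_digest(rewritten, byte_range)

print(signed_digest == rewritten_digest)
```

The two digests differ even though both toy files hold the same objects, which is why a signature check fails after a plain read and write round trip.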
Actually, I think there's no practical way to modify a pdf and preserve a functional signature. The whole point of a signature is to be able to identify the file as original and unmodified. The problem is rather that if multiple documents are merged and it turns out that one or more of them have signatures, the invalid signatures in the merged document often cause failures in other software. The best solution I can think of is to remove the signatures. Since they've been invalidated, they have no useful purpose. |
The only exception I can think of is when a signed document is saved with incremental updates: then it is possible to undo the changes and recreate the document state as it existed at the time it was signed. |
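A minimal sketch of that recovery, assuming the signature covered the whole file at signing time (the usual case) and that later changes were appended as incremental updates instead of the file being rewritten. Under those assumptions the signed revision ends where the second ByteRange span ends. The byte strings are made up and `signed_revision` is a hypothetical helper.

```python
def signed_revision(data, byte_range):
    """Return the leading part of the file that the signature covered."""
    end = byte_range[2] + byte_range[3]  # offset2 + length2
    return data[:end]


# Toy signed revision with a placeholder where the signature value lives.
placeholder = b"<00000000>"
signed = b"%PDF-1.7\n1 0 obj << /Contents " + placeholder + b" >> endobj\n%%EOF\n"
gap_start = signed.index(placeholder)
gap_end = gap_start + len(placeholder)
byte_range = [0, gap_start, gap_end, len(signed) - gap_end]

# An incremental update only appends bytes after the signed revision.
updated = signed + b"2 0 obj << /Type /Annot >> endobj\n%%EOF\n"

recovered = signed_revision(updated, byte_range)
```

Here `recovered` is byte-for-byte the signed revision. This does not help once a tool such as PdfWriter has rewritten the whole file, since the original bytes are then gone.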
It took me a while to get around to it, but I expanded the ability to read outlines to include GoTo types like those @PeterSlezak mentioned previously. Have a look if it's not too late for your purposes. |
It took me a while to figure out how to use pdfrw, so I thought this example would be useful to show how pages and bookmarks can be manipulated.
If you're interested in this, I can add some documentation to it.