Examples or Documentation for Text Annotations? #342

Open
evdevdev opened this issue Jan 8, 2025 · 5 comments

evdevdev commented Jan 8, 2025

First, thanks so much for building this library.

Second, I was wondering if you had any examples for the application of HexaPDF::Type::Annotations::Text or HexaPDF::Type::Annotations::MarkupAnnotation?

I see that this was discussed in this older issue; however, it looks like you've added a ton of functionality since then.

Thanks in advance for any guidance.

@gettalong gettalong self-assigned this Jan 8, 2025
@gettalong
Owner

Yes, much functionality has been added but not in terms of creating or handling annotations.

Here is a small example of how to add a text annotation:

require 'hexapdf'

HexaPDF::Composer.create('annot.pdf') do |comp|
  comp.text("Text followed by an annotation", mask_mode: :box)
  doc = comp.document
  annot = doc.add({Type: :Annot, Subtype: :Text})
  annot[:Contents] = 'This is the content of the annotation'
  annot[:Rect] = [comp.x, comp.y, comp.x + 10, comp.y + 10]
  annot.create_appearance.canvas.
    fill_color("hp-blue").
    circle(5, 5, 4).fill
  comp.page[:Annots] = [annot]
  comp.box(:base, height: 1)
end

As you can see you still need to add the appropriate PDF annotation object and all the necessary information for the specific annotation type. As of PDF 2.0 each annotation needs to have an associated appearance stream.

Can you describe a bit more what you want to achieve?

@evdevdev
Author

evdevdev commented Jan 8, 2025

Thanks for the super quick response!

I'm still hacking together the prototype, so this is provisional. What I want to build will look like this:

  1. Start with a PDF somebody else generated.
  2. Run it through a QA system that will identify things that need to be corrected in the pdf. This will be returned as a stream of corrections to be made, i.e. "the sentence at x,y coordinates contains a typo" or "the diagram at x,y coordinates contains a mistake." This will be coming in as JSON (although I don't think that detail particularly matters in this case).
  3. The goal of what I'm building is to take that stream of corrections and apply them as annotations to the original pdf.

So if I understand what you're saying correctly, I'll need something like...

doc = HexaPDF::Document.open(original_file)

list_of_corrections.each do |correction|
  annot = doc.add({Type: :Annot, Subtype: :Text})
  annot[:Contents] = correction.text
  annot[:Rect] = [correction.top_x, correction.top_y, correction.bottom_x, correction.bottom_y]
  annot.create_appearance # however I want it to look
  annots = comp.page[:Annots] || []
  annots << annot
  comp.page[:Annots] = annots
end

Does that seem roughly right to you? Also, I'm happy to dig into any appropriate documentation, whether it's in HexaPDF or the pdf spec.

Thanks again for your help.

@gettalong
Owner

Your code seems fine except that you would need to get the page of the correction and use it like so: page = doc.pages[correction.page_num].

You might also want to set a few more attributes on the annotation, like :T, which identifies the author of the annotation (in this case it should be set to the name of your application), :CreationDate (for setting the date) and :C (for defining the color of the annotation).

The text annotation type displays a symbol (like a note) where it is placed and the annotation text can be revealed on clicking that symbol. In my tests it seems that some viewers automatically change the appearance of the annotation as soon as it is slightly moved. That may confuse users. Maybe another type of annotation is better suited for your needs, like the polygon annotation.

As I said before it is necessary that you generate an appearance stream (so as to be forward-compatible with PDF 2.0 even if you generate a 1.7 PDF).

The PDF 2.0 spec is freely available, see https://pdfa.org/sponsored-standards/. I would recommend downloading it and referring to section 12.5 "Annotations" for details.
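
Putting the suggestions above together, here is a minimal sketch of how one correction could be turned into a text annotation. The JSON shape (`page`, `text`, `rect`), the 1-based page number, the helper names `annotation_entries` and `add_annotation`, the author name and the red color are all assumptions made for illustration; the thread does not specify them. The HexaPDF calls are the same ones used in the example earlier in this thread.

```ruby
require 'json'
require 'time'

# Hypothetical JSON for the stream of corrections; the real format is not given in the thread.
SAMPLE = '[{"page": 1, "text": "Typo in this sentence", "rect": [72, 700, 92, 720]}]'

# Build the dictionary entries for one text annotation (plain Ruby, no HexaPDF needed).
def annotation_entries(correction, author:, now: Time.now)
  {
    Type: :Annot,
    Subtype: :Text,
    Contents: correction.fetch('text'),
    Rect: correction.fetch('rect'),  # [x1, y1, x2, y2] in page coordinates
    T: author,                       # author, here the name of the application
    CreationDate: now,
    C: [1.0, 0.0, 0.0],              # annotation color as RGB (red is an arbitrary choice)
  }
end

# Add the annotation to its page. `doc` is expected to be a HexaPDF::Document.
def add_annotation(doc, correction, author:)
  # doc.pages uses zero-based indices; the sample JSON is assumed to count pages from 1.
  page = doc.pages[correction.fetch('page') - 1]
  annot = doc.add(annotation_entries(correction, author: author))

  # Appearance stream: a filled circle centered in the annotation rectangle.
  x1, y1, x2, y2 = correction.fetch('rect')
  width = (x2 - x1).abs
  height = (y2 - y1).abs
  radius = [width, height].min / 2.0 - 1
  annot.create_appearance.canvas.
    fill_color("hp-blue").
    circle(width / 2.0, height / 2.0, radius).fill

  annots = page[:Annots] || []
  annots << annot
  page[:Annots] = annots
  annot
end

corrections = JSON.parse(SAMPLE)
```

With the hexapdf gem loaded and a document opened via HexaPDF::Document.open, calling add_annotation(doc, correction, author: 'My QA Tool') for each entry in corrections and then writing the document with doc.write should produce the annotated file. Treat this as a starting point, not a tested implementation for every viewer.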

@evdevdev
Author

evdevdev commented Jan 9, 2025

@gettalong Fantastic! Thank you. I really appreciate the thoughtful response.

If I can trouble you with one more question: let's say I have a PDF that already has a few "example" annotations, which represent what my annotations should look like.

What's the best way to interrogate those objects using HexaPDF? I imagine there will be a lot I can just copy.

@gettalong
Owner

You could either use irb, i.e. load the document, retrieve a page and then iterate over all annotations using page#each_annotation.

Or you could use hexapdf inspect file.pdf with the operations po 1 and then drill down to the annotation you'd like to inspect, e.g.

$ hexapdf inspect annot.pdf p
page 1 (1,0): 2,0
$ hexapdf inspect annot.pdf po 1
1 0 obj
<<
  /Type /Page
  /MediaBox [0 0 595.275591 841.889764 ]
  /Contents 2 0 R
  /Parent 4 0 R
  /Resources <<
    /Font <<
      /F1 6 0 R
    >>
  >>
  /Annots [7 0 R ]
>>
endobj
$ hexapdf inspect annot.pdf 7
<<
  /Type /Annot
  /Subtype /Text
  /Contents (This is the content of the annotation)
  /Rect [161.54 805.889764 171.54 815.889764 ]
  /AP <<
    /N 8 0 R
  >>
>>
$ hexapdf inspect annot.pdf s 8
0.0 0.501961 1.0 rg
9.0 5.0 m
9.0 6.427288 8.236068 7.750457 7.0 8.464102 c
5.763932 9.177746 4.236068 9.177746 3.0 8.464102 c
1.763932 7.750457 1.0 6.427288 1.0 5.0 c
1.0 3.572712 1.763932 2.249543 3.0 1.535898 c
4.236068 0.822254 5.763932 0.822254 7.0 1.535898 c
8.236068 2.249543 9.0 3.572712 9.0 5.0 c
h
f

And yes, this would be a good way to see how to duplicate an annotation.
