Extract content #1

Open
tcr opened this issue Dec 27, 2012 · 0 comments

tcr commented Dec 27, 2012

Using ps2ascii, content can be extracted as text, images, and fills. This information probably needs to be aligned onto a grid of a certain fineness:

Spindrift
.extract([grid fineness]) // grid fineness creates 

Group
.commands() => [<command>, <command>]
.bound(l, b, r, t) => group()
.rows() => [<group>, <group>] // tries to automatically determine "rows" of elements
.columns() => [<group>, <group>] // tries to automatically determine "columns" of elements
.text() // plaintext
.images() // images
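
To make the proposal above more concrete, here is a rough sketch of how the grid snapping and the Group methods might fit together. Everything in it is an assumption for illustration: the command shape ({ type, x, y, value }), the standalone extract(commands, fineness) signature, and the exact-match bucketing used for rows and columns are hypothetical, and none of it reflects what ps2ascii actually emits or how Spindrift is implemented.

```javascript
// Hypothetical sketch only: the command shape { type, x, y, value } is assumed,
// it is not the format ps2ascii produces.
class Group {
  constructor(commands) {
    this._commands = commands;
  }

  // Raw commands in this group (copied so callers cannot mutate the group).
  commands() {
    return this._commands.slice();
  }

  // Sub-group of commands whose origin lies inside the box (left, bottom, right, top).
  bound(l, b, r, t) {
    return new Group(this._commands.filter(
      (c) => c.x >= l && c.x <= r && c.y >= b && c.y <= t
    ));
  }

  // Bucket commands that share the same snapped coordinate on one axis.
  _bucket(axis, descending) {
    const buckets = new Map();
    for (const c of this._commands) {
      if (!buckets.has(c[axis])) buckets.set(c[axis], []);
      buckets.get(c[axis]).push(c);
    }
    const keys = [...buckets.keys()].sort((a, b) => (descending ? b - a : a - b));
    return keys.map((k) => new Group(buckets.get(k)));
  }

  // Rows run top to bottom, assuming y grows upward as in PostScript.
  rows() {
    return this._bucket('y', true);
  }

  // Columns run left to right.
  columns() {
    return this._bucket('x', false);
  }

  // Plaintext of all text commands, in their original order.
  text() {
    return this._commands
      .filter((c) => c.type === 'text')
      .map((c) => c.value)
      .join(' ');
  }

  // All image commands.
  images() {
    return this._commands.filter((c) => c.type === 'image');
  }
}

// Snap every command origin to a grid of the given fineness (in page units).
function extract(commands, fineness = 1) {
  const snap = (v) => Math.round(v / fineness) * fineness;
  return new Group(commands.map((c) => ({ ...c, x: snap(c.x), y: snap(c.y) })));
}

// Usage: two slightly misaligned lines of text collapse into two clean rows.
const page = extract([
  { type: 'text', x: 10.2, y: 99.6, value: 'Name' },
  { type: 'text', x: 50.1, y: 100.3, value: 'Qty' },
  { type: 'text', x: 9.8, y: 80.4, value: 'Apple' },
  { type: 'image', x: 49.9, y: 79.7, value: 'img0' },
], 5);
console.log(page.rows().map((row) => row.text()));
```

A coarser fineness merges more elements into the same row or column, while a finer one keeps slightly offset elements apart, so the right value likely depends on the document's font size and layout.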
tcr closed this as completed Dec 28, 2012
tcr reopened this Dec 28, 2012
cboulanger pushed a commit that referenced this issue May 14, 2017
Fixing missing 'end' event in the stream API
cboulanger pushed a commit that referenced this issue May 14, 2017
Add API for getting number of pages in a PDF
cboulanger pushed a commit that referenced this issue May 14, 2017
Merging "get num pages" from another fork
cboulanger pushed a commit that referenced this issue Oct 25, 2017
Update README.md because page() actually does not exist