Functions:documentParser
Thomas Timmer edited this page May 7, 2024 · 8 revisions
The documentParser function is an asynchronous function that takes DocumentParserArguments as a parameter and returns a Promise that resolves to DocumentParserResult. It parses a document and extracts its text content.
interface DocumentParserArguments {
  document: string | ArrayBuffer; // document URL or a Buffer containing a file
  parserOptions?: {
    forceImage?: boolean;
    density?: number;
  };
}

interface DocumentParserResult {
  result: string;
}
The parserOptions object has two optional fields. density specifies the image resolution: a higher density gives better output quality but slower processing. forceImage forces the document to be scanned as an image, which can sometimes produce better output.
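The trade-off described above can be sketched as a two-pass strategy: parse with the defaults first, and only pay for a slower image-based pass when the first result comes back empty. This is a minimal sketch and not part of the documented API. The helper name parseWithFallback, the injected parser parameter, and the density value of 300 are assumptions for illustration. In an action function you would pass documentParser as parser.

```javascript
// Hypothetical helper: try a default parse first, then retry as an image.
// `parser` is assumed to behave like documentParser (same arguments and result).
const parseWithFallback = async (parser, document, fallbackDensity = 300) => {
  const first = await parser({ document });

  // Keep the first result if it contains any non-whitespace text.
  if (typeof first.result === 'string' && first.result.trim() !== '') {
    return { result: first.result, usedFallback: false };
  }

  // Otherwise retry, forcing an image scan at the assumed fallback density.
  const second = await parser({
    document,
    parserOptions: { forceImage: true, density: fallbackDensity },
  });
  return { result: second.result, usedFallback: true };
};
```

Whether an empty first result is the right trigger for the retry depends on your documents. It is only one possible heuristic.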
Returns a Promise that resolves to a DocumentParserResult, an object whose result property contains the text extracted from the parsed document.
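As a rough illustration of consuming the resolved value, the sketch below awaits the parser, checks that result is a string, and returns it trimmed. The name extractText and the injected parser parameter are hypothetical. The type check is a defensive assumption, since the page does not say what happens when parsing fails.

```javascript
// Hypothetical wrapper: unwraps a DocumentParserResult into a checked string.
// `parser` is assumed to behave like documentParser.
const extractText = async (parser, args) => {
  const { result } = await parser(args);

  // Guard against an unexpected shape before using the text downstream.
  if (typeof result !== 'string') {
    throw new TypeError('Expected the parser to resolve to { result: string }');
  }
  return result.trim();
};
```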
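// True when the value is a file property object that carries a url.
const isFileProperty = (value) =>
  value && typeof value === 'object' && 'url' in value;

const parseDocument = async ({ document, density, forceImage }) => {
  // Accept either a file property object or a plain URL string.
  const url = isFileProperty(document) ? document?.url : document;

  const { result } = await documentParser({
    document: url,
    // Key name matches the DocumentParserArguments interface.
    parserOptions: { density, forceImage },
  });

  return {
    result,
  };
};

export default parseDocument;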
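The input handling in the example above can be checked in isolation. The sketch below pulls it into a small function and wraps the check in Boolean() so it always yields true or false. The name resolveDocumentUrl is hypothetical, and the shape of a file property (an object with a url field) is inferred from the example rather than documented here.

```javascript
// Same check as above, coerced to a strict boolean.
const isFileProperty = (value) =>
  Boolean(value) && typeof value === 'object' && 'url' in value;

// Hypothetical helper: returns the URL for either supported input shape.
const resolveDocumentUrl = (document) =>
  isFileProperty(document) ? document.url : document;

// Both calls yield the same URL string.
const fromString = resolveDocumentUrl('https://example.com/file.pdf');
const fromFileProperty = resolveDocumentUrl({ url: 'https://example.com/file.pdf' });
```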
const customDocumentParser = async ({ document }) => {
  const customOpts = {
    // your custom fetch options
  };

  // Fetch the file yourself, then pass its contents to the parser.
  const response = await fetch(document, customOpts);

  const { result } = await documentParser({
    document: response.blob().buffer,
    parserOptions: {
      // the same options are available as in the example above
    },
  });

  return {
    result,
  };
};

export default customDocumentParser;