add documentation
masylum committed Jun 28, 2024
1 parent f9c1250 commit bb4d0c9
Showing 1 changed file with 16 additions and 1 deletion.
17 changes: 16 additions & 1 deletion src/index.ts
@@ -83,7 +83,22 @@ const ALTER_TO_DIV_EXCEPTIONS = new Set(['div', 'article', 'section', 'p'])
// These are the classes that readability sets itself.
const CLASSES_TO_PRESERVE = ['page']

type Options = {
/**
* Options for the Readability library. All options are optional.
*
* - `debug` (boolean, default `false`): whether to enable logging.
* - `maxElemsToParse` (number, default `0` i.e. no limit): the maximum number of elements to parse.
* - `nbTopCandidates` (number, default `5`): the number of top candidates to consider when analysing how tight the competition is among candidates.
* - `charThreshold` (number, default `500`): the number of characters an article must have in order to return a result.
* - `classesToPreserve` (array): a set of classes to preserve on HTML elements when the `keepClasses` option is set to `false`.
* - `keepClasses` (boolean, default `false`): whether to preserve all classes on HTML elements. When set to `false` only classes specified in the `classesToPreserve` array are kept.
* - `disableJSONLD` (boolean, default `false`): when extracting page metadata, cheer-reader gives precedence to Schema.org fields specified in the JSON-LD format. Set this option to `true` to skip JSON-LD parsing.
* - `serializer` (function, default `$el => $el.html()`): controls how the `content` property returned by the `parse()` method is produced from the root DOM element. It may be useful to specify the `serializer` as the identity function (`$el => $el`) to obtain a cheerio element instead of a string for `content` if you plan to process it further.
* - `allowedVideoRegex` (RegExp, default `undefined`): a regular expression that matches video URLs that should be allowed to be included in the article content. If `undefined`, the default regex is applied.
* - `linkDensityModifier` (number, default `0`): a number that is added to the base link density threshold during the shadiness checks. This can be used to penalize nodes with a high link density or vice versa.
* - `extraction` (boolean, default `true`): Some libraries are only interested in the metadata and don't want to pay the price of a full extraction. When this option is set to `false`, the `content`, `textContent`, `length` and `excerpt` will be `null`.
*/
export type Options = {
debug: boolean
maxElemsToParse: number
nbTopCandidates: number
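
The defaults listed in the doc comment can be read as a plain object that caller-supplied overrides are merged into. The following is a minimal sketch of that idea, not the library's actual code: `SketchOptions`, `DEFAULTS`, `resolveOptions` and `filterClasses` are hypothetical names, the empty default for `classesToPreserve` is an assumption (the comment gives none), and `serializer` and `allowedVideoRegex` are left out so the sketch needs no cheerio types.

```typescript
// Hypothetical local mirror of the documented options (not imported from the library).
type SketchOptions = {
  debug: boolean
  maxElemsToParse: number
  nbTopCandidates: number
  charThreshold: number
  classesToPreserve: string[]
  keepClasses: boolean
  disableJSONLD: boolean
  linkDensityModifier: number
  extraction: boolean
}

// Default values copied from the doc comment in the diff.
const DEFAULTS: SketchOptions = {
  debug: false,
  maxElemsToParse: 0, // 0 means no limit
  nbTopCandidates: 5,
  charThreshold: 500,
  classesToPreserve: [], // assumption: the doc comment states no default
  keepClasses: false,
  disableJSONLD: false,
  linkDensityModifier: 0,
  extraction: true,
}

// Hypothetical helper: callers pass only the fields they want to override.
function resolveOptions(overrides: Partial<SketchOptions> = {}): SketchOptions {
  return { ...DEFAULTS, ...overrides }
}

// Hypothetical helper illustrating the documented interaction between
// `keepClasses` and `classesToPreserve` on a single `class` attribute value.
function filterClasses(classAttr: string, opts: SketchOptions): string {
  if (opts.keepClasses) return classAttr
  const keep = new Set(opts.classesToPreserve)
  return classAttr
    .split(/\s+/)
    .filter((name) => name !== '' && keep.has(name))
    .join(' ')
}

// Usage: a metadata-only configuration that skips full extraction.
const metadataOnly = resolveOptions({ extraction: false })
console.log(metadataOnly.extraction, metadataOnly.charThreshold)
```

Since every option is optional, accepting a `Partial` of the full type and spreading it over the defaults keeps call sites short while the resolved object stays fully typed.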
