Add imports to customize extension matching #28

Merged
merged 1 commit into from
Feb 8, 2024
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -10,7 +10,7 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
steps:
- name: Check out the repo
-  uses: actions/checkout@v3
+  uses: actions/checkout@v4
- name: Setup Perl
uses: shogo82148/actions-setup-perl@v1
- name: Install Release Dependencies
6 changes: 2 additions & 4 deletions Build.PL
@@ -33,7 +33,8 @@ my $build = $class->new(
license => 'perl',
create_makefile_pl => 'traditional',
configure_requires => { 'Module::Build' => '0.4209' },
- build_requires => {
+ recommends => { 'CommonMark' => '0.290000' },
+ test_requires => {
'File::Spec::Functions' => 0,
'Module::Build' => '0.4209',
'Test::More' => '0.96',
@@ -54,9 +55,6 @@ my $build = $class->new(
'Parse::BBCode' => '0.15',
'Text::WikiCreole' => '0.07',
},
- recommmends => {
- 'CommonMark' => '0.290000',
- },
meta_merge => {
"meta-spec" => { version => 2 },
resources => {
2 changes: 2 additions & 0 deletions Changes
@@ -1,6 +1,8 @@
Revision history for Perl extension Text-Markup.

0.32
- Added the ability to change the regular expression for a format by
passing it in the `use` statement.

0.31 2023-09-10T23:24:43Z
- Fixed the passing of parameters to `parse()`.
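
The 0.32 entry above relies on how Perl's `use` works: `use Module LIST` loads the module and then calls `Module->import(LIST)`, so a regular expression given in the `use` statement arrives in `import` as `$_[1]`. The following is a minimal sketch of that mechanism, not the distribution's code. `My::Registry` and `My::Parser::FooBar` are hypothetical stand-ins, used so the example runs without Text::Markup installed, and `import` is called directly because both packages live in one file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Text::Markup registry (not the real module).
package My::Registry;
our %REGEX_FOR = ( foobar => qr{fb} );    # built-in default for the format
sub register {
    my ($class, $name, $regex) = @_;
    $REGEX_FOR{$name} = $regex;
}

# Hypothetical parser module using the import hook added in this change.
package My::Parser::FooBar;
sub import {
    # $_[0] is the class name; $_[1] is the first item of the `use` list.
    My::Registry->register( foobar => $_[1] ) if $_[1];
}

package main;

# `use My::Parser::FooBar qr{foo};` would make this call at compile time.
My::Parser::FooBar->import( qr{foo} );
print "foobar => $My::Registry::REGEX_FOR{foobar}\n";
```

Because the hook only registers when an argument is passed, loading the parser without a regex leaves the default untouched.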
30 changes: 23 additions & 7 deletions lib/Text/Markup.pm
@@ -3,6 +3,7 @@ package Text::Markup;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Text::Markup::None;
use Carp;

@@ -147,6 +148,15 @@ This distribution includes support for a number of markup formats:

=back

Modules under the Text::Markup namespace provide these parsers, and Text::Markup
automatically loads them on recognizing the file name suffixes documented for
each module. To change the file extensions recognized for a particular parser
(except for L<Text::Markup::None>), load it directly and pass a regular
expression. For example, to have the Mediawiki parser recognize files with the
suffixes C<truck>, C<truc>, C<track>, or C<trac>, load it like so:

use Text::Markup::Mediawiki qr{tr[au]ck?};

Adding support for more markup languages is straight-forward, and patches
adding them to this distribution are also welcome. See L</Add a Parser> for
step-by-step instructions.
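
As a quick check of the `qr{tr[au]ck?}` example in the hunk above, the sketch below tests a few file names against it. It assumes the pattern is applied to the text after the final dot and anchored at the end of the name. That anchoring is an assumption made for illustration, not a statement of how Text::Markup matches internally, and `matches_suffix` is a hypothetical helper.

```perl
use strict;
use warnings;

# Regex taken from the documentation example.
my $regex = qr{tr[au]ck?};

# Hypothetical helper: true if the file name ends in a dot followed by
# text that the pattern matches in full.
sub matches_suffix {
    my ($file, $re) = @_;
    return $file =~ /[.](?:$re)\z/ ? 1 : 0;
}

for my $file (qw(notes.truck notes.truc notes.track notes.trac notes.trick notes.txt)) {
    printf "%-12s %s\n", $file, matches_suffix($file, $regex) ? 'matches' : 'no match';
}
```

Under that assumption the four documented suffixes match, while near misses such as `trick` do not.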
@@ -304,6 +314,11 @@ C<Text::FooBar> module, it might look something like this:
use Text::FooBar ();
use File::BOM qw(open_bom);

sub import {
# Replace the regex if passed one.
Text::Markup->register( foobar => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
my $md = Text::FooBar->new(@{ $opts || [] });
@@ -332,9 +347,8 @@ In such a case, read in the file as raw bytes:
open my $fh, '<:raw', $file or die "Cannot open $file: $!\n";

The returned HTML, however, B<must be encoded in UTF-8>. Please include an
- L<encoding
- declaration|https://en.wikipedia.org/wiki/Character_encodings_in_HTML>, such
- as a content-type C<< <meta> >> element:
+ L<encoding declaration|https://en.wikipedia.org/wiki/Character_encodings_in_HTML>,
+ such as a content-type C<< <meta> >> element:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

@@ -430,13 +444,15 @@ UI.
=back

If you don't want to submit your parser, you can still create and use one
- independently. Rather than add its information to the C<%REGEX_FOR> hash in
- this module, you can just load your parser manually, and have it call the
- C<register> method, like so:
+ independently. Just omit editing the C<%REGEX_FOR> hash in this module and make
+ sure you C<register> the parser manually with a default regular expression
+ in the C<import> method, like so:

package My::Markup::FooBar;
use Text::Markup;
- Text::Markup->register(foobar => qr{fb|foob(?:ar)?});
+ sub import {
+     Text::Markup->register( foobar => $_[1] || qr{fb|foob(?:ar)?} );
+ }

This will be useful for creating private parsers you might not want to
contribute, or that you'd want to distribute independently.
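
One consequence of registering inside `import` is worth spelling out: Perl only calls `import` for `use Module;` or `use Module LIST;`, and skips it for `use Module ();` or a bare `require`. The sketch below walks through the three cases for a private parser like the one above. `My::Registry` is a hypothetical stand-in for Text::Markup so the example is self-contained, and `import` is called by hand to mirror what each form of `use` would do.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Text::Markup's registry.
package My::Registry;
our %REGEX_FOR;
sub register {
    my ($class, $name, $regex) = @_;
    $REGEX_FOR{$name} = $regex;
}

# Private parser modeled on the My::Markup::FooBar example.
package My::Markup::FooBar;
sub import {
    # Use the caller's regex if one was passed, else the default.
    My::Registry->register( foobar => $_[1] || qr{fb|foob(?:ar)?} );
}

package main;

# `use My::Markup::FooBar ();` only loads the module and never calls
# import(), so nothing is registered at this point.
print "after use with (): ",
    ( exists $My::Registry::REGEX_FOR{foobar} ? "registered" : "not registered" ), "\n";

# `use My::Markup::FooBar;` calls import() with no extra arguments.
My::Markup::FooBar->import;
print "default:  $My::Registry::REGEX_FOR{foobar}\n";

# `use My::Markup::FooBar qr{foo};` passes the regex as $_[1].
My::Markup::FooBar->import( qr{foo} );
print "override: $My::Registry::REGEX_FOR{foobar}\n";
```

A parser written this way therefore needs to be loaded with a plain `use` (or have `import` called explicitly) before its format is recognized.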
11 changes: 11 additions & 0 deletions lib/Text/Markup/Asciidoc.pm
@@ -3,11 +3,17 @@
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Text::Markup::Cmd;
use utf8;

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( asciidoc => $_[1] ) if $_[1];
}

my $ASCIIDOC = find_cmd([
(map { (WIN32 ? ("$_.exe", "$_.bat") : ($_)) } qw(asciidoc)),
'asciidoc.py',
@@ -33,7 +39,7 @@

binmode $fh, ":encoding($encoding)";
local $/;
<$fh>;

Check failure on line 42 in lib/Text/Markup/Asciidoc.pm (GitHub Actions, 🐪 Perl 5.8 through 5.38, all 16 even-numbered releases, on 🪟): UTF-8 "\xFC" does not map to Unicode
};

# Make sure we have something.
@@ -87,6 +93,11 @@

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Asciidoc qr{ski?doc};

Normally this parser returns the output of C<asciidoc> wrapped in a minimal
HTML page skeleton. If you would prefer to just get the exact output returned
by C<asciidoc>, you can pass in a true value for the C<raw> option.
12 changes: 10 additions & 2 deletions lib/Text/Markup/Asciidoctor.pm
@@ -3,13 +3,16 @@ package Text::Markup::Asciidoctor;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Text::Markup::Cmd;
use utf8;

our $VERSION = '0.32';

- # Replace Text::Markup::Asciidoc.
- Text::Markup->register( asciidoc => qr{a(?:sc(?:iidoc)?|doc)?} );
+ sub import {
+     # Replace Text::Markup::Asciidoc.
+     Text::Markup->register( asciidoc => $_[1] || qr{a(?:sc(?:iidoc)?|doc)?} );
+ }

# Find Asciidoc.
my $ASCIIDOC = find_cmd([
@@ -98,6 +101,11 @@ Asciidoc:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Asciidoctor qr{ski?doc};

Normally this parser returns the output of C<asciidoctor> wrapped in a minimal
HTML page skeleton. If you would prefer to just get the exact output returned
by C<asciidoctor>, you can pass in a true value for the C<raw> option.
11 changes: 11 additions & 0 deletions lib/Text/Markup/Bbcode.pm
@@ -3,11 +3,17 @@ package Text::Markup::Bbcode;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Parse::BBCode;

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( bbcode => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
my %params = @{ $opts };
@@ -64,6 +70,11 @@ It recognizes files with the following extensions as Markdown:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Bbcode qr{beebee};

Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
11 changes: 9 additions & 2 deletions lib/Text/Markup/CommonMark.pm
@@ -9,8 +9,10 @@ use File::BOM qw(open_bom);

our $VERSION = '0.32';

- # Replace Text::Markup::Markdown.
- Text::Markup->register( markdown => qr{m(?:d(?:own)?|kdn?|arkdown)} );
+ sub import {
+     # Replace Text::Markup::Markdown.
+     Text::Markup->register( markdown => $_[1] || qr{m(?:d(?:own)?|kdn?|arkdown)} );
+ }

sub parser {
my ($file, $encoding, $opts) = @_;
@@ -83,6 +85,11 @@ It recognizes files with the following extensions as CommonMark Markdown:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::CommonMark qr{markd?};

Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
11 changes: 11 additions & 0 deletions lib/Text/Markup/Creole.pm
@@ -3,11 +3,17 @@ package Text::Markup::Creole;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::WikiCreole;

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( creole => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
open_bom my $fh, $file, ":encoding($encoding)";
@@ -60,6 +66,11 @@ It recognizes files with the following extensions as Markdown:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Creole qr{cre+ole+};

Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
11 changes: 11 additions & 0 deletions lib/Text/Markup/HTML.pm
@@ -3,9 +3,15 @@ package Text::Markup::HTML;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( html => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
my $html = do {
@@ -47,6 +53,11 @@ with no decoding. It recognizes files with the following extensions as HTML:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::HTML qr{hachetml};

=head1 Author

David E. Wheeler <david@justatheory.com>
11 changes: 11 additions & 0 deletions lib/Text/Markup/Markdown.pm
@@ -3,11 +3,17 @@ package Text::Markup::Markdown;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::Markdown ();

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( markdown => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
my %params = @{ $opts };
@@ -69,6 +75,11 @@ It recognizes files with the following extensions as Markdown:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Markdown qr{markd?};

Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
10 changes: 10 additions & 0 deletions lib/Text/Markup/Mediawiki.pm
@@ -3,11 +3,17 @@ package Text::Markup::Mediawiki;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::MediawikiFormat 1.0;

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( mediawiki => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
open_bom my $fh, $file, ":encoding($encoding)";
@@ -65,6 +71,10 @@ It recognizes files with the following extensions as MediaWiki:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Mediawiki qr{kwiki?};

Text::Markup::Mediawiki supports the two
L<Text::MediawikiFormat arguments|Text::MediawikiFormat/format>, a hash
11 changes: 11 additions & 0 deletions lib/Text/Markup/Multimarkdown.pm
@@ -3,11 +3,17 @@ package Text::Markup::Multimarkdown;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::MultiMarkdown ();

our $VERSION = '0.32';

sub import {
# Replace the regex if passed one.
Text::Markup->register( multimarkdown => $_[1] ) if $_[1];
}

sub parser {
my ($file, $encoding, $opts) = @_;
my %params = @{ $opts };
@@ -70,6 +76,11 @@ It recognizes files with the following extensions as MultiMarkdown:

=back

To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:

use Text::Markup::Multimarkdown qr{mmm+};

Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to the format options argument to C<parse>.
1 change: 1 addition & 0 deletions lib/Text/Markup/None.pm
@@ -3,6 +3,7 @@ package Text::Markup::None;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use HTML::Entities;
use File::BOM qw(open_bom);
