diff --git a/CONTRIBUTING.pod b/CONTRIBUTING.pod index a261b78..85208d7 100644 --- a/CONTRIBUTING.pod +++ b/CONTRIBUTING.pod @@ -1,6 +1,8 @@ # To read this file, run: # perldoc CONTRIBUTING.pod +=encoding utf8 + =head1 NAME How to contribute to C? @@ -14,30 +16,96 @@ C> branch, then submitted as pull requests at GitHub. The documentation is written using the -L format. Use the C tool +L format. Use the C tool to render it in a terminal: perldoc CONTRIBUTING.pod -=head1 PATCHING, STEP BY STEP +=head1 INITIAL SETUP + +=head2 1. Setup a Perl development environment + +There are various ways to setup a Perl development environment, but here is mine + (@dolmen). The key principles are: =over 4 -=item 1. Get the source +=item a. Do not depend on the Perl of the operating system. Instead, install a recent version which you control upgrades independently of the O/S. + +=item b. Setup environment to install Perl modules from CPAN into that C using C. + +=back + + +Step by step on MacOS (zsh): + +=over 4 + +=item a. Install L (note: this is incompatible with I) + + git clone git://github.com/tokuhirom/plenv.git ~/.plenv + echo 'export PATH="$HOME/.plenv/bin:$PATH"' >> ~/.profile + echo 'eval "$(plenv init -)"' >> ~/.zshrc + exec $SHELL -l + git clone git://github.com/tokuhirom/Perl-Build.git ~/.plenv/plugins/perl-build/ + +=item b. Install a recent perl: + + plenv install 5.40.1 + plenv global 5.40.1 + +=item c. Install L: + + plenv install-cpanm + +=back + + +=head2 2. Get the source git clone --origin upstream git://github.com/dolmen/github-keygen.git cd github-keygen git checkout master -=item 2. Install build dependencies +=head2 3. Install build dependencies Not required for doc patches. curl -L https://cpanmin.us | perl - --installdeps --with-develop . -=item 3. Make your fix/feature +=head2 4. Setup a fork + +=over 4 + +=item 4.1. L + +=item 4.2. Link your local repo to your fork: + +(You are already using C, aren't you?) + + git remote add github .github.com:/github-keygen.git + git remote update + +=back + +=head1 PATCHING, STEP BY STEP + +=over 4 + +=item 1. Update your local fork + + git remote update + git checkout master + git rebase upstream/master + +=item 2. Update Perl dependencies (not required for doc patches) + + curl -L https://cpanmin.us | perl - --installdeps --with-develop . + + +=item 3. Make your fix/feature in a dedicated Git branch - git checkout -b master + git checkout -b upstream/master $EDITOR bin/github-keygen @@ -49,27 +117,22 @@ Not required for doc patches. git commit -=item 4. Setup a fork - -=item 4.1. L - -=item 4.2. Link your local repo to your fork (just once) - -(You are using C isn't it?) - git remote add github .github.com:/github-keygen.git +=item 4. Submit your work -=item 5. Submit your work +=over 4 -=item 5.1 Push! +=item 4.1 Push! git push github -=item 5.2 Submit a pull request on GitHub +=item 4.2 Submit a pull request on GitHub + +=back -=item 6. Loop +=item 5. Loop -Redo from step 3. +Ready for another contribution? Redo from step 1! =back @@ -90,9 +153,10 @@ Redo from step 3. =item * Release git checkout master + ./release.pl -n ./release.pl - git push origin master release - git push origin --tags + git push github master release + git push github --tags =back diff --git a/LICENSE b/LICENSE index 94a9ed0..f288702 100644 --- a/LICENSE +++ b/LICENSE @@ -1,7 +1,7 @@ GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007 - Copyright (C) 2007 Free Software Foundation, Inc. + Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. @@ -645,7 +645,7 @@ the "copyright" line and a pointer to where the full notice is found. GNU General Public License for more details. You should have received a copy of the GNU General Public License - along with this program. If not, see . + along with this program. If not, see . Also add information on how to contact you by electronic and paper mail. @@ -664,11 +664,11 @@ might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see -. +. The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read -. +. diff --git a/README.pod b/README.pod index bf819f1..42aaaf7 100644 --- a/README.pod +++ b/README.pod @@ -1,5 +1,7 @@ =pod +=encoding utf8 + =for stopwords MITM versioning =head1 NAME @@ -100,7 +102,7 @@ Disable bad things that could come from the GitHub hosts ("Trust no-one") Disable the C option to protect you if ever GitHub (or a MITM) tries to exploit the -L vulnerability|http://www.openssh.com/txt/release-7.1p2>. +L vulnerability|https://www.openssh.com/txt/release-7.1p2>. =back @@ -226,7 +228,7 @@ more like a wizard that you use just once. So just get the file, run it, and delete it. I: the tool is written in Perl, but you don't have to install -L (or Cygwin or ActivePerl); the perl +L (or Cygwin or ActivePerl); the perl bundled with L will be automatically detected and used. @@ -258,6 +260,19 @@ L. =over 4 +=item v1.400 + +Change default key type on key creation to C (previously C) +(L). + +On key creation, custom key comment provided by C<-C> was ignored +(L). This is +fixed. Thanks to L<@tinhtruong|https://github.com/tinhtruong> for the report. + +Fix typo in a comment in F<~/.ssh/config>: "I". + +Improve Windows compatibility (fix in parsing of C). + =item v1.306 On key creation, switch default key size from 2048 bits to 4096 bits. @@ -340,7 +355,7 @@ settings were applied before our own. =item v1.101 Config: set C to protect against the -L vulnerability|http://www.openssh.com/txt/release-7.1p2>. +L vulnerability|https://www.openssh.com/txt/release-7.1p2>. =item v1.100 @@ -366,7 +381,7 @@ Various fixes/workarounds to restore full support of the old SSH (4.6p1) that is bundled with msysgit (Git on Win32). Store the C in C<$XDG_RUNTIME_DIR> (see the -L) +L) if available. Doc fixes: change "Github" to "GitHub". @@ -414,7 +429,7 @@ with GitHub. Fixed a message that wrongly told to paste the I key (C<'.pub'> forgotten). Fixed at the -L, +L, but released (too) long later. =item v1.006 @@ -517,7 +532,7 @@ If you want to contribute, have a look to L. =head1 COPYRIGHT & LICENSE -Copyright E 2011-2022 Olivier MenguE. +Copyright E 2011-2025 Olivier MenguE. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -530,6 +545,6 @@ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License -along with this program. If not, see L. +along with this program. If not, see L. =cut diff --git a/github-keygen b/github-keygen index 427b688..900fe2c 100755 --- a/github-keygen +++ b/github-keygen @@ -2452,33 +2452,26 @@ $fatpacked{"Pod/Escapes.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD POD_ESCAPES $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_SIMPLE'; - - require 5; package Pod::Simple; use strict; + use warnings; use Carp (); BEGIN { *DEBUG = sub () {0} unless defined &DEBUG } use integer; use Pod::Escapes 1.04 (); use Pod::Simple::LinkSection (); use Pod::Simple::BlackBox (); + use Pod::Simple::TiedOutFH; #use utf8; - use vars qw( - $VERSION @ISA - @Known_formatting_codes @Known_directives - %Known_formatting_codes %Known_directives - $NL - ); - - @ISA = ('Pod::Simple::BlackBox'); - $VERSION = '3.35'; + our @ISA = ('Pod::Simple::BlackBox'); + our $VERSION = '3.45'; - @Known_formatting_codes = qw(I B C L E F S X Z); - %Known_formatting_codes = map(($_=>1), @Known_formatting_codes); - @Known_directives = qw(head1 head2 head3 head4 item over back); - %Known_directives = map(($_=>'Plain'), @Known_directives); - $NL = $/ unless defined $NL; + our @Known_formatting_codes = qw(I B C L E F S X Z); + our %Known_formatting_codes = map(($_=>1), @Known_formatting_codes); + our @Known_directives = qw(head1 head2 head3 head4 head5 head6 item over back); + our %Known_directives = map(($_=>'Plain'), @Known_directives); + our $NL = $/ unless defined $NL; #----------------------------------------------------------------------------- # Set up some constants: @@ -2528,6 +2521,9 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ #@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ __PACKAGE__->_accessorize( + '_output_is_for_JustPod', # For use only by Pod::Simple::JustPod, + # If non-zero, don't expand Z<> E<> S<> L<>, + # and count how many brackets in format codes 'nbsp_for_S', # Whether to map S<...>'s to \xA0 characters 'source_filename', # Filename of the source, for use in warnings 'source_dead', # Whether to consider this parser's source dead @@ -2557,6 +2553,8 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ 'preserve_whitespace', # whether to try to keep whitespace as-is 'strip_verbatim_indent', # What indent to strip from verbatim + 'expand_verbatim_tabs', # 0: preserve tabs in verbatim blocks + # n: expand tabs to stops every n columns 'parse_characters', # Whether parser should expect chars rather than octets @@ -2622,6 +2620,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ BEGIN { *pretty = \&Pod::Simple::BlackBox::pretty; *stringify_lol = \&Pod::Simple::BlackBox::stringify_lol; + *my_qr = \&Pod::Simple::BlackBox::my_qr; } #@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ @@ -2649,8 +2648,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # my $this = shift; return $this->{'output_string'} unless @_; # GET. - - require Pod::Simple::TiedOutFH; + my $x = (defined($_[0]) and ref($_[0])) ? $_[0] : \( $_[0] ); $$x = '' unless defined $$x; DEBUG > 4 and print STDERR "# Output string set to $x ($$x)\n"; @@ -2673,11 +2671,14 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ my $class = ref($_[0]) || $_[0]; #Carp::croak(__PACKAGE__ . " is a virtual base class -- see perldoc " # . __PACKAGE__ ); - return bless { + my $obj = bless { 'accept_codes' => { map( ($_=>$_), @Known_formatting_codes ) }, 'accept_directives' => { %Known_directives }, 'accept_targets' => {}, }, $class; + + $obj->expand_verbatim_tabs(8); + return $obj; } @@ -2722,7 +2723,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } DEBUG > 6 and print STDERR "$this\'s accept_directives : ", pretty($this->{'accept_directives'}), "\n"; - + return sort keys %{ $this->{'accept_directives'} } if wantarray; return; } @@ -2767,7 +2768,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # TODO: enforce some limitations on what a target name can be? $this->{'accept_targets'}{$t} = $type; DEBUG > 2 and print STDERR "Learning to accept \"$t\" as target of type $type\n"; - } + } return sort keys %{ $this->{'accept_targets'} } if wantarray; return; } @@ -2782,7 +2783,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # TODO: enforce some limitations on what a target name can be? delete $this->{'accept_targets'}{$t}; DEBUG > 2 and print STDERR "OK, won't accept \"$t\" as target.\n"; - } + } return sort keys %{ $this->{'accept_targets'} } if wantarray; return; } @@ -2793,16 +2794,15 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # XXX Probably it is an error that the digit '9' is excluded from these re's. # Broken for early Perls on EBCDIC - my $xml_name_re = eval "qr/[^-.0-8:A-Z_a-z[:^ascii:]]/"; - if (! defined $xml_name_re) { - $xml_name_re = qr/[\x00-\x2C\x2F\x39\x3B-\x40\x5B-\x5E\x60\x7B-\x7F]/; - } + my $xml_name_re = my_qr('[^-.0-8:A-Z_a-z[:^ascii:]]', '9'); + $xml_name_re = qr/[\x00-\x2C\x2F\x39\x3B-\x40\x5B-\x5E\x60\x7B-\x7F]/ + unless $xml_name_re; sub accept_code { shift->accept_codes(@_) } # alias sub accept_codes { # Add some codes my $this = shift; - + foreach my $new_code (@_) { next unless defined $new_code and length $new_code; # A good-enough check that it's good as an XML Name symbol: @@ -2825,7 +2825,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # If we say we accept "W", then a "W" in the treelet simply turns # into "W". } - + return; } @@ -2834,7 +2834,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ sub unaccept_codes { # remove some codes my $this = shift; - + foreach my $new_code (@_) { next unless defined $new_code and length $new_code; # A good-enough check that it's good as an XML Name symbol: @@ -2853,7 +2853,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ DEBUG > 2 and print STDERR "OK, won't accept the code $new_code<...>.\n"; } - + return; } @@ -2936,7 +2936,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # be eol agnostic s/\r\n?/\n/g for @lines; - + # make sure there are only one line elements for parse_lines @lines = split(/(?<=\n)/, join('', @lines)); @@ -2957,13 +2957,13 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ my($self, $source, $to) = @_; $self = $self->new unless ref($self); # so we tolerate being a class method - + if(!defined $source) { $source = *STDIN{IO} } elsif(ref(\$source) eq 'GLOB') { # stet } elsif(ref($source) ) { # stet } elsif(!length $source or $source eq '-' or $source =~ m/^<&(?:STDIN|0)$/i - ) { + ) { $source = *STDIN{IO}; } @@ -3052,7 +3052,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # Otherwise we haven't yet been to this node. Maybe alter it... - + my $content = join "\n", @{$para}[2 .. $#$para]; if($content =~ m/^\s*\*\s*$/s or $content =~ m/^\s*$/s) { @@ -3062,7 +3062,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ return $para->[1]{'~type'} = 'bullet'; } elsif($content =~ m/^\s*\*\s+(.+)/s) { # tolerance - + # Like: "=item * Foo bar baz"; $para->[1]{'~orig_content'} = $content; $para->[1]{'~_freaky_para_hack'} = $1; @@ -3072,13 +3072,13 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } elsif($content =~ m/^\s*(\d+)\.?\s*$/s) { # Like: "=item 1.", "=item 123412" - + $para->[1]{'~orig_content'} = $content; $para->[1]{'number'} = $1; # Yes, stores the number there! splice @$para, 2; # so it ends up just being ['=item', { attrhash } ] return $para->[1]{'~type'} = 'number'; - + } else { # It's anything else. return $para->[1]{'~type'} = 'text'; @@ -3105,20 +3105,21 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } else { $treelet = $self->_treelet_from_formatting_codes(@_); } - - if( $self->_remap_sequences($treelet) ) { + + if( ! $self->{'_output_is_for_JustPod'} # Retain these as-is for pod output + && $self->_remap_sequences($treelet) ) + { $self->_treat_Zs($treelet); # Might as well nix these first $self->_treat_Ls($treelet); # L has to precede E and S $self->_treat_Es($treelet); $self->_treat_Ss($treelet); # S has to come after E - $self->_wrap_up($treelet); # Nix X's and merge texties - + } else { DEBUG and print STDERR "Formatless treelet gets fast-tracked.\n"; # Very common case! } - + splice @$treelet, 0, 2; # lop the top off return $treelet; @@ -3135,8 +3136,8 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ DEBUG > 2 and print STDERR "\nStarting _wrap_up traversal.\n", $merge ? (" Merge mode on\n") : (), $nixx ? (" Nix-X mode on\n") : (), - ; - + ; + my($i, $treelet); while($treelet = shift @stack) { @@ -3157,7 +3158,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ $treelet->[$i-1] .= ( splice(@$treelet, $i, 1) )[0]; DEBUG > 4 and print STDERR " Now: ", $i-1, ":[$treelet->[$i-1]]\n"; --$i; - next; + next; # since we just pulled the possibly last node out from under # ourselves, we can't just redo() @@ -3167,7 +3168,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ if($treelet->[$i][0] eq 'L') { my $thing; - foreach my $attrname ('section', 'to') { + foreach my $attrname ('section', 'to') { if(defined($thing = $treelet->[$i][1]{$attrname}) and ref $thing) { unshift @stack, $thing; DEBUG > 4 and print STDERR " +Enqueuing ", @@ -3188,13 +3189,13 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ sub _remap_sequences { my($self,@stack) = @_; - + if(@stack == 1 and @{ $stack[0] } == 3 and !ref $stack[0][2]) { # VERY common case: abort it. DEBUG and print STDERR "Skipping _remap_sequences: formatless treelet.\n"; return 0; } - + my $map = ($self->{'accept_codes'} || die "NO accept_codes in $self?!?"); my $start_line = $stack[0][1]{'start_line'}; @@ -3212,15 +3213,15 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ ; # A recursive algorithm implemented iteratively! Whee! - + my($is, $was, $i, $treelet); # scratch while($treelet = shift @stack) { DEBUG > 3 and print STDERR " Considering children of this $treelet->[0] node...\n"; for($i = 2; $i < @$treelet; ++$i) { # iterate over children next unless ref $treelet->[$i]; # text nodes are uninteresting - + DEBUG > 4 and print STDERR " Noting child $i : $treelet->[$i][0]<...>\n"; - + $is = $treelet->[$i][0] = $map->{ $was = $treelet->[$i][0] }; if( DEBUG > 3 ) { if(!defined $is) { @@ -3234,7 +3235,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ : "tag $is<...>.\n"; } } - + if(!defined $is) { $self->whine($start_line, "Deleting unknown formatting code $was<>"); $is = $treelet->[$i][0] = '1'; # But saving the children! @@ -3249,9 +3250,9 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ while(@dynasty) { DEBUG > 4 and printf " Grafting a new %s node between %s and %s\n", - $dynasty[-1], $treelet->[0], $treelet->[$i][0], + $dynasty[-1], $treelet->[0], $treelet->[$i][0], ; - + #$nugget = ; splice @$treelet, $i, 1, [pop(@dynasty), {}, $treelet->[$i]]; # relace node with a new parent @@ -3271,7 +3272,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } } } - + DEBUG > 2 and print STDERR "End of _remap_sequences traversal.\n\n"; if(@_ == 2 and @{ $_[1] } == 3 and !ref $_[1][2]) { @@ -3287,7 +3288,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # "Go to an extreme, move back to a more comfortable place" # -- /Oblique Strategies/, Brian Eno and Peter Schmidt - + my($self, $para) = @_; my $content = join ' ', splice @$para, 2; $content =~ s/^\s+//s; @@ -3320,16 +3321,16 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ ); return; } - + if(grep $new_letter eq $_, @Known_formatting_codes) { DEBUG > 2 and print STDERR " $new_letter isn't a good thing to extend, because known.\n"; $self->whine( $para->[1]{'start_line'}, "You can't extend an established code like \"$new_letter\"" ); - + #TODO: or allow if last bit is same? - + return; } @@ -3344,7 +3345,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ ); return; } - + unless($elements_one =~ m/^[^ ,]+(,[^ ,]+)*$/s) { # like "B", "M,I", etc. $self->whine( $para->[1]{'start_line'}, @@ -3423,22 +3424,22 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ unshift @stack, $treelet->[$i]; # recurse next; } - + DEBUG > 1 and print STDERR "Nixing Z node @{$treelet->[$i]}\n"; - + # bitch UNLESS it's empty unless( @{$treelet->[$i]} == 2 or (@{$treelet->[$i]} == 3 and $treelet->[$i][2] eq '') ) { $self->whine( $start_line, "A non-empty Z<>" ); } # but kill it anyway - + splice(@$treelet, $i, 1); # thereby just nix this node. --$i; - + } } - + return; } @@ -3456,7 +3457,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # Functions|perlfunc>", the link-text is "Perl Functions". In # "L" and even "L<|Time::HiRes>", there is no link text. Note # that link text may contain formatting.) - # + # ############# The element children # Second: @@ -3470,7 +3471,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # The name or URL, or undef if none. (E.g., in "L", the name -- also sometimes called the page -- is # "perlfunc". In "L", the name is undef.) - # + # ############# The "section" attribute (which might be next, or a treelet) # Fourth: @@ -3479,7 +3480,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # is not the same as a manpage section like the "5" in "man 5 crontab". # "Section Foo" in the Pod sense means the part of the text that's # introduced by the heading or item whose text is "Foo".) - # + # # Pod parsers may also note additional attributes including: # @@ -3530,13 +3531,18 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ unshift @stack, $treelet->[$i]; # recurse next; } - - + + # By here, $treelet->[$i] is definitely an L node my $ell = $treelet->[$i]; - DEBUG > 1 and print STDERR "Ogling L node $ell\n"; - - # bitch if it's empty + DEBUG > 1 and print STDERR "Ogling L node " . pretty($ell) . "\n"; + + # bitch if it's empty or is just '/' + if (@{$ell} == 3 and $ell->[2] =~ m!\A\s*/\s*\z!) { + $self->whine( $start_line, "L<> contains only '/'" ); + $treelet->[$i] = 'L'; # just make it a text node + next; # and move on + } if( @{$ell} == 2 or (@{$ell} == 3 and $ell->[2] eq '') ) { @@ -3550,7 +3556,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ ) { $self->whine( $start_line, "L<> starts or ends with whitespace" ); } - + # Catch URLs: # there are a number of possible cases: @@ -3605,7 +3611,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ next; } - + # Catch some very simple and/or common cases if(@{$ell} == 3 and ! ref $ell->[2]) { my $it = $ell->[2]; @@ -3632,15 +3638,15 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } # else fall thru... } - - + + # ...Uhoh, here's the real L<...> parsing stuff... # "With the ill behavior, with the ill behavior, with the ill behavior..." DEBUG > 1 and print STDERR "Running a real parse on this non-trivial L\n"; - - + + my $link_text; # set to an arrayref if found my @ell_content = @$ell; splice @ell_content,0,2; # Knock off the 'L' and {} bits @@ -3682,8 +3688,8 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ last; } } - - + + # Now look for the "/" -- only in CHILDREN (not all underlings!) # And afterward, anything left in @ell_content will be the raw name # Like L @@ -3704,7 +3710,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ push @section_name, splice @ell_content, 1+$j; # leaving only things before and including J - + @ell_content = grep ref($_)||length($_), @ell_content ; @section_name = grep ref($_)||length($_), @section_name ; @@ -3743,6 +3749,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ $section_name = [splice @ell_content]; $section_name->[ 0] =~ s/^\"//s; $section_name->[-1] =~ s/\"$//s; + $ell->[1]{'~tolerated'} = 1; } # Turn L into L. @@ -3750,8 +3757,8 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ and grep !ref($_) && m/ /s, @ell_content ) { $section_name = [splice @ell_content]; + $ell->[1]{'~deprecated'} = 1; # That's support for the now-deprecated syntax. - # (Maybe generate a warning eventually?) # Note that it deliberately won't work on L<...|Foo Bar> } @@ -3797,11 +3804,11 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ ); DEBUG > 3 and print STDERR "L-to content: ", pretty($ell->[1]{'to'}), "\n"; } - + # And update children to be the link-text: @$ell = (@$ell[0,1], defined($link_text) ? splice(@$link_text) : ''); - - DEBUG > 2 and print STDERR "End of L-parsing for this node $treelet->[$i]\n"; + + DEBUG > 2 and print STDERR "End of L-parsing for this node " . pretty($treelet->[$i]) . "\n"; unshift @stack, $treelet->[$i]; # might as well recurse } @@ -3830,9 +3837,9 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ next unless ref $treelet->[$i]; # text nodes are uninteresting if($treelet->[$i][0] eq 'L') { # SPECIAL STUFF for semi-processed L<>'s - + my $thing; - foreach my $attrname ('section', 'to') { + foreach my $attrname ('section', 'to') { if(defined($thing = $treelet->[$i][1]{$attrname}) and ref $thing) { unshift @stack, $thing; DEBUG > 2 and print STDERR " Enqueuing ", @@ -3840,14 +3847,14 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ " as an attribute value to tweak.\n"; } } - + unshift @stack, $treelet->[$i]; # recurse next; } elsif($treelet->[$i][0] ne 'E') { unshift @stack, $treelet->[$i]; # recurse next; } - + DEBUG > 1 and print STDERR "Ogling E node ", pretty($treelet->[$i]), "\n"; # bitch if it's empty @@ -3858,7 +3865,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ $treelet->[$i] = 'E<>'; # splice in a literal next; } - + # bitch if content is weird unless(@{$treelet->[$i]} == 3 and !ref($content = $treelet->[$i][2])) { $self->whine( $start_line, "An E<...> surrounding strange content" ); @@ -3907,7 +3914,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ sub _treat_Ss { my($self,$treelet) = @_; - + _change_S_to_nbsp($treelet,0) if $self->{'nbsp_for_S'}; # TODO: or a change_nbsp_to_S @@ -3921,7 +3928,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ sub _change_S_to_nbsp { # a recursive function # Sanely assumes that the top node in the excursion won't be an S node. my($treelet, $in_s) = @_; - + my $is_s = ('S' eq $treelet->[0]); $in_s ||= $is_s; # So in_s is on either by this being an S element, # or by an ancestor being an S element. @@ -3936,7 +3943,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } } else { $treelet->[$i] =~ s/\s/$Pod::Simple::nbsp/g if $in_s; - + # Note that if you apply nbsp_for_S to text, and so turn # "foo S quux" into "foo bar faz quux", you # end up with something that fails to say "and don't hyphenate @@ -3961,6 +3968,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ $Carp::CarpLevel = 1, Carp::croak( "Accessor usage: \$obj->$attrname() or \$obj->$attrname(\$new_value)" ) unless (@_ == 1 or @_ == 2) and ref $_[0]; + (@_ == 1) ? $_[0]->{$attrname} : ($_[0]->{$attrname} = $_[1]); }; @@ -3977,7 +3985,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ my($class, $source) = @_; my $new = $class->new; $new->output_fh(*STDOUT{IO}); - + if(ref($source || '') eq 'SCALAR') { $new->parse_string_document( $$source ); } elsif(ref($source)) { # it's a file handle @@ -3985,7 +3993,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ } else { # it's a filename $new->parse_file($source); } - + return $new; } @@ -3995,22 +4003,22 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ sub _out { # For use in testing: Class->_out($source) # returns the transformation of $source - + my $class = shift(@_); my $mutor = shift(@_) if @_ and ref($_[0] || '') eq 'CODE'; DEBUG and print STDERR "\n\n", '#' x 76, "\nAbout to parse source: {{\n$_[0]\n}}\n\n"; - - + + my $parser = ref $class && $class->isa(__PACKAGE__) ? $class : $class->new; $parser->hide_line_numbers(1); my $out = ''; $parser->output_string( \$out ); DEBUG and print STDERR " _out to ", \$out, "\n"; - + $mutor->($parser) if $mutor; $parser->parse_string_document( $_[0] ); @@ -4023,9 +4031,9 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ # For use in testing: Class->_duo($source1, $source2) # returns the parse trees of $source1 and $source2. # Good in things like: &ok( Class->duo(... , ...) ); - + my $class = shift(@_); - + Carp::croak "But $class->_duo is useful only in list context!" unless wantarray; @@ -4035,7 +4043,7 @@ $fatpacked{"Pod/Simple.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_ unless @_ == 2; my(@out); - + while( @_ ) { my $parser = $class->new; @@ -4104,10 +4112,38 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n use integer; # vroom! use strict; + use warnings; use Carp (); - use vars qw($VERSION ); - $VERSION = '3.35'; + our $VERSION = '3.45'; #use constant DEBUG => 7; + + sub my_qr ($$) { + + # $1 is a pattern to compile and return. Older perls compile any + # syntactically valid property, even if it isn't legal. To cope with + # this, return an empty string unless the compiled pattern also + # successfully matches $2, which the caller furnishes. + + my ($input_re, $should_match) = @_; + # XXX could have a third parameter $shouldnt_match for extra safety + + my $use_utf8 = ($] le 5.006002) ? 'use utf8;' : ""; + + my $re = eval "no warnings; $use_utf8 qr/$input_re/"; + #print STDERR __LINE__, ": $input_re: $@\n" if $@; + return "" if $@; + + my $matches = eval "no warnings; $use_utf8 '$should_match' =~ /$re/"; + #print STDERR __LINE__, ": $input_re: $@\n" if $@; + return "" if $@; + + #print STDERR __LINE__, ": SUCCESS: $re\n" if $matches; + return $re if $matches; + + #print STDERR __LINE__, ": $re: didn't match\n"; + return ""; + } + BEGIN { require Pod::Simple; *DEBUG = \&Pod::Simple::DEBUG unless defined &DEBUG @@ -4116,8 +4152,37 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # Matches a character iff the character will have a different meaning # if we choose CP1252 vs UTF-8 if there is no =encoding line. # This is broken for early Perls on non-ASCII platforms. - my $non_ascii_re = eval "qr/[[:^ascii:]]/"; - $non_ascii_re = qr/[\x80-\xFF]/ if ! defined $non_ascii_re; + my $non_ascii_re = my_qr('[[:^ascii:]]', "\xB6"); + $non_ascii_re = qr/[\x80-\xFF]/ unless $non_ascii_re; + + # Use patterns understandable by Perl 5.6, if possible + my $cs_re = do { no warnings; my_qr('\p{IsCs}', "\x{D800}") }; + my $cn_re = my_qr('\p{IsCn}', "\x{09E4}"); # code point unlikely + # to get assigned + my $rare_blocks_re = my_qr('[\p{InIPAExtensions}\p{InSpacingModifierLetters}]', + "\x{250}"); + $rare_blocks_re = my_qr('[\x{0250}-\x{02FF}]', "\x{250}") unless $rare_blocks_re; + + my $script_run_re = eval 'no warnings "experimental::script_run"; + qr/(*script_run: ^ .* $ )/x'; + my $latin_re = my_qr('[\p{IsLatin}\p{IsInherited}\p{IsCommon}]', "\x{100}"); + unless ($latin_re) { + # This was machine generated to be the ranges of the union of the above + # three properties, with things that were undefined by Unicode 4.1 filling + # gaps. That is the version in use when Perl advanced enough to + # successfully compile and execute the above pattern. + $latin_re = my_qr('[\x00-\x{02E9}\x{02EC}-\x{0374}\x{037E}\x{0385}\x{0387}\x{0485}\x{0486}\x{0589}\x{060C}\x{061B}\x{061F}\x{0640}\x{064B}-\x{0655}\x{0670}\x{06DD}\x{0951}-\x{0954}\x{0964}\x{0965}\x{0E3F}\x{10FB}\x{16EB}-\x{16ED}\x{1735}\x{1736}\x{1802}\x{1803}\x{1805}\x{1D00}-\x{1D25}\x{1D2C}-\x{1D5C}\x{1D62}-\x{1D65}\x{1D6B}-\x{1D77}\x{1D79}-\x{1DBE}\x{1DC0}-\x{1EF9}\x{2000}-\x{2125}\x{2127}-\x{27FF}\x{2900}-\x{2B13}\x{2E00}-\x{2E1D}\x{2FF0}-\x{3004}\x{3006}\x{3008}-\x{3020}\x{302A}-\x{302D}\x{3030}-\x{3037}\x{303C}-\x{303F}\x{3099}-\x{309C}\x{30A0}\x{30FB}\x{30FC}\x{3190}-\x{319F}\x{31C0}-\x{31CF}\x{3220}-\x{325F}\x{327F}-\x{32CF}\x{3358}-\x{33FF}\x{4DC0}-\x{4DFF}\x{A700}-\x{A716}\x{FB00}-\x{FB06}\x{FD3E}\x{FD3F}\x{FE00}-\x{FE6B}\x{FEFF}-\x{FF65}\x{FF70}\x{FF9E}\x{FF9F}\x{FFE0}-\x{FFFD}\x{10100}-\x{1013F}\x{1D000}-\x{1D1DD}\x{1D300}-\x{1D7FF}]', "\x{100}"); + } + + my $every_char_is_latin_re = my_qr("^(?:$latin_re)*\\z", "A"); + + # Latin script code points not in the first release of Unicode + my $later_latin_re = my_qr('[^\P{IsLatin}\p{IsAge=1.1}]', "\x{1F6}"); + + # If this perl doesn't have the Deprecated property, there's only one code + # point in it that we need be concerned with. + my $deprecated_re = my_qr('\p{IsDeprecated}', "\x{149}"); + $deprecated_re = qr/\x{149}/ unless $deprecated_re; my $utf8_bom; if (($] ge 5.007_003)) { @@ -4127,6 +4192,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $utf8_bom = "\xEF\xBB\xBF"; # No EBCDIC BOM detection for early Perls. } + # This is used so that the 'content_seen' method doesn't return true on a + # file that just happens to have a line that matches /^=[a-zA-z]/. Only if + # there is a valid =foo line will we return that content was seen. + my $seen_legal_directive = 0; + #@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ sub parse_line { shift->parse_lines(@_) } # alias @@ -4141,10 +4211,10 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $cut_handler = $self->{'cut_handler'}; my $wl_handler = $self->{'whiteline_handler'}; $self->{'line_count'} ||= 0; - + my $scratch; - DEBUG > 4 and + DEBUG > 4 and print STDERR "# Parsing starting at line ", $self->{'line_count'}, ".\n"; DEBUG > 5 and @@ -4155,9 +4225,15 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # paragraph buffer. Because we need to defer processing of =over # directives and verbatim paragraphs. We call _ponder_paragraph_buffer # to process this. - + $self->{'pod_para_count'} ||= 0; + # An attempt to match the pod portions of a line. This is not fool proof, + # but is good enough to serve as part of the heuristic for guessing the pod + # encoding if not specified. + my $codes = join '', grep { / ^ [A-Za-z] $/x } sort keys %{$self->{accept_codes}}; + my $pod_chars_re = qr/ ^ = [A-Za-z]+ | [\Q$codes\E] < /x; + my $line; foreach my $source_line (@_) { if( $self->{'source_dead'} ) { @@ -4181,7 +4257,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ($line = $source_line) =~ tr/\n\r//d; # If we don't have two vars, we'll end up with that there # tr/// modding the (potentially read-only) original source line! - + } else { DEBUG > 2 and print STDERR "First line: [$source_line]\n"; @@ -4190,7 +4266,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $self->_handle_encoding_line( "=encoding utf8" ); delete $self->{'_processed_encoding'}; $line =~ tr/\n\r//d; - + } elsif( $line =~ s/^\xFE\xFF//s ) { DEBUG and print STDERR "Big-endian UTF-16 BOM seen. Aborting parsing.\n"; $self->scream( @@ -4214,7 +4290,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n next; # TODO: implement somehow? - + } else { DEBUG > 2 and print STDERR "First line is BOM-less.\n"; ($line = $source_line) =~ tr/\n\r//d; @@ -4228,8 +4304,8 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $encoding; - # No =encoding line, and we are at the first line in the input that - # contains a non-ascii byte, that is one whose meaning varies depending + # No =encoding line, and we are at the first pod line in the input that + # contains a non-ascii byte, that is, one whose meaning varies depending # on whether the file is encoded in UTF-8 or CP1252, which are the two # possibilities permitted by the pod spec. (ASCII is assumed if the # file only contains ASCII bytes.) In order to process this line, we @@ -4246,22 +4322,28 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # without conflict. CP 1252 uses most of them for graphic characters. # # Note that all ASCII-range bytes represent their corresponding code - # points in CP1252 and UTF-8. In ASCII platform UTF-8 all other code - # points require multiple (non-ASCII) bytes to represent. (A separate - # paragraph for EBCDIC is below.) The multi-byte representation is - # quite structured. If we find an isolated byte that requires multiple - # bytes to represent in UTF-8, we know that the encoding is not UTF-8. - # If we find a sequence of bytes that violates the UTF-8 structure, we - # also can presume the encoding isn't UTF-8, and hence must be 1252. + # points in both CP1252 and UTF-8. In ASCII platform UTF-8, all other + # code points require multiple (non-ASCII) bytes to represent. (A + # separate paragraph for EBCDIC is below.) The multi-byte + # representation is quite structured. If we find an isolated byte that + # would require multiple bytes to represent in UTF-8, we know that the + # encoding is not UTF-8. If we find a sequence of bytes that violates + # the UTF-8 structure, we also can presume the encoding isn't UTF-8, and + # hence must be 1252. # # But there are ambiguous cases where we could guess wrong. If so, the # user will end up having to supply an =encoding line. We use all # readily available information to improve our chances of guessing # right. The odds of something not being UTF-8, but still passing a # UTF-8 validity test go down very rapidly with increasing length of the - # sequence. Therefore we look at all the maximal length non-ascii - # sequences on the line. If any of the sequences can't be UTF-8, we - # quit there and choose CP1252. If all could be UTF-8, we guess UTF-8. + # sequence. Therefore we look at all non-ascii sequences on the line. + # If any of the sequences can't be UTF-8, we quit there and choose + # CP1252. If all could be UTF-8, we see if any of the code points + # represented are unlikely to be in pod. If so, we guess CP1252. If + # not, we check if the line is all in the same script; if not guess + # CP1252; otherwise UTF-8. For perls that don't have convenient script + # run testing, see if there is both Latin and non-Latin. If so, CP1252, + # otherwise UTF-8. # # On EBCDIC platforms, the situation is somewhat different. In # UTF-EBCDIC, not only do ASCII-range bytes represent their code points, @@ -4272,51 +4354,188 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # very unlikely to be in pod text. So if we encounter one of them, it # means that it is quite likely CP1252 and not UTF-8. The net result is # the same code below is used for both platforms. - while ($line =~ m/($non_ascii_re+)/g) { - my $non_ascii_seq = $1; - - if (length $non_ascii_seq == 1) { - $encoding = 'CP1252'; - goto guessed; - } elsif ($] ge 5.007_003) { - - # On Perls that have this function, we can see if the sequence is - # valid UTF-8 or not. - my $is_utf8; - { - no warnings 'utf8'; - $is_utf8 = utf8::decode($non_ascii_seq); + # + # XXX probably if the line has E that evaluates to illegal CP1252, + # then it is UTF-8. But we haven't processed E<> yet. + + goto set_1252 if $] lt 5.006_000; # No UTF-8 on very early perls + + my $copy; + + no warnings 'utf8'; + + if ($] ge 5.007_003) { + $copy = $line; + + # On perls that have this function, we can use it to easily see if the + # sequence is valid UTF-8 or not; if valid it turns on the UTF-8 flag + # needed below for script run detection + goto set_1252 if ! utf8::decode($copy); + } + elsif (ord("A") != 65) { # Early EBCDIC, assume UTF-8. What's a windows + # code page doing here anyway? + goto set_utf8; + } + else { # ASCII, no decode(): do it ourselves using the fundamental + # characteristics of UTF-8 + use if $] le 5.006002, 'utf8'; + + my $char_ord; + my $needed; # How many continuation bytes to gobble up + + # Initialize the translated line with a dummy character that will be + # deleted after everything else is done. This dummy makes sure that + # $copy will be in UTF-8. Doing it now avoids the bugs in early perls + # with upgrading in the middle + $copy = chr(0x100); + + # Parse through the line + for (my $i = 0; $i < length $line; $i++) { + my $byte = substr($line, $i, 1); + + # ASCII bytes are trivially dealt with + if ($byte !~ $non_ascii_re) { + $copy .= $byte; + next; + } + + my $b_ord = ord $byte; + + # Now figure out what this code point would be if the input is + # actually in UTF-8. If, in the process, we discover that it isn't + # well-formed UTF-8, we guess CP1252. + # + # Start the process. If it is UTF-8, we are at the first, start + # byte, of a multi-byte sequence. We look at this byte to figure + # out how many continuation bytes are needed, and to initialize the + # code point accumulator with the data from this byte. + # + # Normally the minimum continuation byte is 0x80, but in certain + # instances the minimum is a higher number. So the code below + # overrides this for those instances. + my $min_cont = 0x80; + + if ($b_ord < 0xC2) { # A start byte < C2 is malformed + goto set_1252; } - if (! $is_utf8) { - $encoding = 'CP1252'; - goto guessed; + elsif ($b_ord <= 0xDF) { + $needed = 1; + $char_ord = $b_ord & 0x1F; } - } elsif (ord("A") == 65) { # An early Perl, ASCII platform - - # Without utf8::decode, it's a lot harder to do a rigorous check - # (though some early releases had a different function that - # accomplished the same thing). Since these are ancient Perls, not - # likely to be in use today, we take the easy way out, and look at - # just the first two bytes of the sequence to see if they are the - # start of a UTF-8 character. In ASCII UTF-8, continuation bytes - # must be between 0x80 and 0xBF. Start bytes can range from 0xC2 - # through 0xFF, but anything above 0xF4 is not Unicode, and hence - # extremely unlikely to be in a pod. - if ($non_ascii_seq !~ /^[\xC2-\xF4][\x80-\xBF]/) { - $encoding = 'CP1252'; - goto guessed; + elsif ($b_ord <= 0xEF) { + $min_cont = 0xA0 if $b_ord == 0xE0; + $needed = 2; + $char_ord = $b_ord & (0x1F >> 1); + } + elsif ($b_ord <= 0xF4) { + $min_cont = 0x90 if $b_ord == 0xF0; + $needed = 3; + $char_ord = $b_ord & (0x1F >> 2); + } + else { # F4 is the highest start byte for legal Unicode; higher is + # unlikely to be in pod. + goto set_1252; } - # We don't bother doing anything special for EBCDIC on early Perls. - # If there is a solitary variant, CP1252 will be chosen; otherwise - # UTF-8. - } - } # End of loop through all variant sequences on the line + # ? not enough continuation bytes available + goto set_1252 if $i + $needed >= length $line; + + # Accumulate the ordinal of the character from the remaining + # (continuation) bytes. + while ($needed-- > 0) { + my $cont = substr($line, ++$i, 1); + $b_ord = ord $cont; + goto set_1252 if $b_ord < $min_cont || $b_ord > 0xBF; + + # In all cases, any next continuation bytes all have the same + # minimum legal value + $min_cont = 0x80; + + # Accumulate this byte's contribution to the code point + $char_ord <<= 6; + $char_ord |= ($b_ord & 0x3F); + } + + # Here, the sequence that formed this code point was valid UTF-8, + # so add the completed character to the output + $copy .= chr $char_ord; + } # End of loop through line + + # Delete the dummy first character + $copy = substr($copy, 1); + } + + # Here, $copy is legal UTF-8. + + # If it can't be legal CP1252, no need to look further. (These bytes + # aren't valid in CP1252.) This test could have been placed higher in + # the code, but it seemed wrong to set the encoding to UTF-8 without + # making sure that the very first instance is well-formed. But what if + # it isn't legal CP1252 either? We have to choose one or the other, and + # It seems safer to favor the single-byte encoding over the multi-byte. + goto set_utf8 if ord("A") == 65 && $line =~ /[\x81\x8D\x8F\x90\x9D]/; + + # The C1 controls are not likely to appear in pod + goto set_1252 if ord("A") == 65 && $copy =~ /[\x80-\x9F]/; + + # Nor are surrogates nor unassigned, nor deprecated. + DEBUG > 8 and print STDERR __LINE__, ": $copy: surrogate\n" if $copy =~ $cs_re; + goto set_1252 if $cs_re && $copy =~ $cs_re; + DEBUG > 8 and print STDERR __LINE__, ": $copy: unassigned\n" if $cn_re && $copy =~ $cn_re; + goto set_1252 if $cn_re && $copy =~ $cn_re; + DEBUG > 8 and print STDERR __LINE__, ": $copy: deprecated\n" if $copy =~ $deprecated_re; + goto set_1252 if $copy =~ $deprecated_re; + + # Nor are rare code points. But this is hard to determine. khw + # believes that IPA characters and the modifier letters are unlikely to + # be in pod (and certainly very unlikely to be the in the first line in + # the pod containing non-ASCII) + DEBUG > 8 and print STDERR __LINE__, ": $copy: rare\n" if $copy =~ $rare_blocks_re; + goto set_1252 if $rare_blocks_re && $copy =~ $rare_blocks_re; + + # The first Unicode version included essentially every Latin character + # in modern usage. So, a Latin character not in the first release will + # unlikely be in pod. + DEBUG > 8 and print STDERR __LINE__, ": $copy: later_latin\n" if $later_latin_re && $copy =~ $later_latin_re; + goto set_1252 if $later_latin_re && $copy =~ $later_latin_re; + + # On perls that handle script runs, if the UTF-8 interpretation yields + # a single script, we guess UTF-8, otherwise just having a mixture of + # scripts is suspicious, so guess CP1252. We first strip off, as best + # we can, the ASCII characters that look like they are pod directives, + # as these would always show as mixed with non-Latin text. + $copy =~ s/$pod_chars_re//g; + + if ($script_run_re) { + goto set_utf8 if $copy =~ $script_run_re; + DEBUG > 8 and print STDERR __LINE__, ": not script run\n"; + goto set_1252; + } - # All sequences in the line could be UTF-8. Guess that. + # Even without script runs, but on recent enough perls and Unicodes, we + # can check if there is a mixture of both Latin and non-Latin. Again, + # having a mixture of scripts is suspicious, so assume CP1252 + + # If it's all non-Latin, there is no CP1252, as that is Latin + # characters and punct, etc. + DEBUG > 8 and print STDERR __LINE__, ": $copy: not latin\n" if $copy !~ $latin_re; + goto set_utf8 if $copy !~ $latin_re; + + DEBUG > 8 and print STDERR __LINE__, ": $copy: all latin\n" if $copy =~ $every_char_is_latin_re; + goto set_utf8 if $copy =~ $every_char_is_latin_re; + + DEBUG > 8 and print STDERR __LINE__, ": $copy: mixed\n"; + + set_1252: + DEBUG > 9 and print STDERR __LINE__, ": $copy: is 1252\n"; + $encoding = 'CP1252'; + goto done_set; + + set_utf8: + DEBUG > 9 and print STDERR __LINE__, ": $copy: is UTF-8\n"; $encoding = 'UTF-8'; - guessed: + done_set: $self->_handle_encoding_line( "=encoding $encoding" ); delete $self->{'_processed_encoding'}; $self->{'_transcoder'} && $self->{'_transcoder'}->($line); @@ -4338,13 +4557,13 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $self->{'line_count'}, "=cut found outside a pod block. Skipping to next block." ); - + ## Before there were errata sections in the world, it was ## least-pessimal to abort processing the file. But now we can ## just barrel on thru (but still not start a pod block). #splice @_; #push @_, undef; - + next; } else { $self->{'in_pod'} = $self->{'start_of_pod_block'} @@ -4357,7 +4576,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n if $code_handler; # Note: this may cause code to be processed out of order relative # to pods, but in order relative to cuts. - + # Note also that we haven't yet applied the transcoding to $line # by time we call $code_handler! @@ -4368,11 +4587,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG > 1 and print STDERR "# Setting nextline to $1\n"; $self->{'line_count'} = $1 - 1; } - + next; } } - + # . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . # Else we're in pod mode: @@ -4392,12 +4611,13 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # ++$self->{'pod_para_count'}; $self->_ponder_paragraph_buffer(); # by now it's safe to consider the previous paragraph as done. + DEBUG > 6 and print STDERR "Processing any cut handler, line ${$self}{'line_count'}\n"; $cut_handler->(map $_, $line, $self->{'line_count'}, $self) if $cut_handler; # TODO: add to docs: Note: this may cause cuts to be processed out # of order relative to pods, but in order relative to code. - + } elsif($line =~ m/^(\s*)$/s) { # it's a blank line if (defined $1 and $1 =~ /[^\S\r\n]/) { # it's a white line $wl_handler->(map $_, $line, $self->{'line_count'}, $self) @@ -4408,29 +4628,30 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG > 1 and print STDERR "Saving blank line at line ${$self}{'line_count'}\n"; push @{$paras->[-1]}, $line; } # otherwise it's not interesting - + if(!$self->{'start_of_pod_block'} and !$self->{'last_was_blank'}) { DEBUG > 1 and print STDERR "Noting para ends with blank line at ${$self}{'line_count'}\n"; } - + $self->{'last_was_blank'} = 1; - + } elsif($self->{'last_was_blank'}) { # A non-blank line starting a new para... - - if($line =~ m/^(=[a-zA-Z][a-zA-Z0-9]*)(?:\s+|$)(.*)/s) { + + if($line =~ m/^(=[a-zA-Z][a-zA-Z0-9]*)(\s+|$)(.*)/s) { # THIS IS THE ONE PLACE WHERE WE CONSTRUCT NEW DIRECTIVE OBJECTS - my $new = [$1, {'start_line' => $self->{'line_count'}}, $2]; + my $new = [$1, {'start_line' => $self->{'line_count'}}, $3]; + $new->[1]{'~orig_spacer'} = $2 if $2 && $2 ne " "; # Note that in "=head1 foo", the WS is lost. # Example: ['=head1', {'start_line' => 123}, ' foo'] - + ++$self->{'pod_para_count'}; - + $self->_ponder_paragraph_buffer(); # by now it's safe to consider the previous paragraph as done. - + push @$paras, $new; # the new incipient paragraph DEBUG > 1 and print STDERR "Starting new ${$paras}[-1][0] para at line ${$self}{'line_count'}\n"; - + } elsif($line =~ m/^\s/s) { if(!$self->{'start_of_pod_block'} and @$paras and $paras->[-1][0] eq '~Verbatim') { @@ -4463,7 +4684,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } $self->{'last_was_blank'} = $self->{'start_of_pod_block'} = 0; } - + } # ends the big while loop DEBUG > 1 and print STDERR (pretty(@$paras), "\n"); @@ -4474,7 +4695,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n sub _handle_encoding_line { my($self, $line) = @_; - + return if $self->parse_characters; # The point of this routine is to set $self->{'_transcoder'} as indicated. @@ -4576,7 +4797,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n sub _handle_encoding_second_level { # By time this is called, the encoding (if well formed) will already - # have been acted one. + # have been acted on. my($self, $para) = @_; my @x = @$para; my $content = join ' ', splice @x, 2; @@ -4584,7 +4805,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $content =~ s/\s+$//s; DEBUG > 2 and print STDERR "Ogling encoding directive: =encoding $content\n"; - + if (defined($self->{'_processed_encoding'})) { #if($content ne $self->{'_processed_encoding'}) { # Could it happen? @@ -4602,14 +4823,14 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } else { DEBUG > 2 and print STDERR " (Yup, it was successfully handled already.)\n"; } - + } else { # Otherwise it's a syntax error $self->whine( $para->[1]{'start_line'}, "Invalid =encoding syntax: $content" ); } - + return; } @@ -4626,7 +4847,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n return() unless $self->{'errata'} and keys %{$self->{'errata'}}; my @out; - + foreach my $line (sort {$a <=> $b} keys %{$self->{'errata'}}) { push @out, ['=item', {'start_line' => $m}, "Around line $line:"], @@ -4639,7 +4860,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ) ; } - + # TODO: report of unknown entities? unrenderable characters? unshift @out, @@ -4653,7 +4874,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ['=over', {'start_line' => $m, 'errata' => 1}, ''], ; - push @out, + push @out, ['=back', {'start_line' => $m, 'errata' => 1}, ''], ; @@ -4694,7 +4915,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # Document, # Data, Para, Verbatim # B, C, longdirname (TODO -- wha?), etc. for all directives - # + # my $self = $_[0]; my $paras; @@ -4708,11 +4929,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # We have something in our buffer. So apparently the document has started. unless($self->{'doc_has_started'}) { $self->{'doc_has_started'} = 1; - + my $starting_contentless; $starting_contentless = ( - !@$curr_open + !@$curr_open and @$paras and ! grep $_->[0] ne '~end', @$paras # i.e., if the paras is all ~ends ) @@ -4721,7 +4942,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $starting_contentless ? 'contentless' : 'contentful', " document\n" ; - + $self->_handle_element_start( ($scratch = 'Document'), { @@ -4733,15 +4954,32 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my($para, $para_type); while(@$paras) { - last if @$paras == 1 and - ( $paras->[0][0] eq '=over' or $paras->[0][0] eq '~Verbatim' - or $paras->[0][0] eq '=item' ) - ; + + # If a directive, assume it's legal; subtract below if found not to be + $seen_legal_directive++ if $paras->[0][0] =~ /^=/; + + last if @$paras == 1 + and ( $paras->[0][0] eq '=over' + or $paras->[0][0] eq '=item' + or ($paras->[0][0] eq '~Verbatim' and $self->{'in_pod'})); # Those're the three kinds of paragraphs that require lookahead. # Actually, an "=item Foo" inside an region # and any =item inside an region (rare) # don't require any lookahead, but all others (bullets # and numbers) do. + # The verbatim is different from the other two, because those might be + # like: + # + # =item + # ... + # =cut + # ... + # =item + # + # The =cut here finishes the paragraph but doesn't terminate the =over + # they should be in. (khw apologizes that he didn't comment at the time + # why the 'in_pod' works, and no longer remembers why, and doesn't think + # it is currently worth the effort to re-figure it out.) # TODO: whinge about many kinds of directives in non-resolving =for regions? # TODO: many? like what? =head1 etc? @@ -4751,7 +4989,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG > 1 and print STDERR "Pondering a $para_type paragraph, given the stack: (", $self->_dump_curr_open(), ")\n"; - + if($para_type eq '=for') { next if $self->_ponder_for($para,$curr_open,$paras); @@ -4788,7 +5026,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } else { # All non-magical codes!!! - + # Here we start using $para_type for our own twisted purposes, to # mean how it should get treated, not as what the element name # should be. @@ -4828,10 +5066,10 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ; next; } - - + + my $over_type = $over->[1]{'~type'}; - + if(!$over_type) { # Shouldn't happen1 die "Typeless over in stack, starting at line " @@ -4856,7 +5094,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $item_type = $self->_get_item_type($para); # That kills the content of the item if it's a number or bullet. DEBUG and print STDERR " Item is of type ", $para->[0], " under $over_type\n"; - + if($item_type eq 'text') { # Nothing special needs doing for 'text' } elsif($item_type eq 'number' or $item_type eq 'bullet') { @@ -4872,16 +5110,16 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } else { die "Unhandled item type $item_type"; # should never happen } - + # =item-text thingies don't need any assimilation, it seems. } elsif($over_type eq 'number') { my $item_type = $self->_get_item_type($para); # That kills the content of the item if it's a number or bullet. DEBUG and print STDERR " Item is of type ", $para->[0], " under $over_type\n"; - + my $expected_value = ++ $curr_open->[-1][1]{'~counter'}; - + if($item_type eq 'bullet') { # Hm, it's not numeric. Correct for this. $para->[1]{'number'} = $expected_value; @@ -4906,7 +5144,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } elsif($expected_value == $para->[1]{'number'}) { DEBUG > 1 and print STDERR " Numeric item has the expected value of $expected_value\n"; - + } else { DEBUG > 1 and print STDERR " Numeric item has ", $para->[1]{'number'}, " instead of the expected value of $expected_value\n"; @@ -4917,7 +5155,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ); $para->[1]{'number'} = $expected_value; # correcting!! } - + if(@$para == 2) { # For the cases where we /didn't/ push to @$para if($paras->[0][0] eq '~Para') { @@ -4934,13 +5172,13 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $item_type = $self->_get_item_type($para); # That kills the content of the item if it's a number or bullet. DEBUG and print STDERR " Item is of type ", $para->[0], " under $over_type\n"; - + if($item_type eq 'bullet') { # as expected! if( $para->[1]{'~_freaky_para_hack'} ) { DEBUG and print STDERR "Accomodating '=item * Foo' tolerance hack.\n"; - push @$para, delete $para->[1]{'~_freaky_para_hack'}; + push @$para, $para->[1]{'~_freaky_para_hack'}; } } elsif($item_type eq 'number') { @@ -5009,6 +5247,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG > 1 and print STDERR " Pondering known directive ${$para}[0] as $para_type\n"; } else { # An unknown directive! + $seen_legal_directive--; DEBUG > 1 and printf STDERR "Unhandled directive %s (Handled: %s)\n", $para->[0], join(' ', sort keys %{$self->{'accept_directives'}} ) ; @@ -5028,15 +5267,15 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my @fors = grep $_->[0] eq '=for', @$curr_open; DEBUG > 1 and print STDERR "Containing fors: ", join(',', map $_->[1]{'target'}, @fors), "\n"; - + if(! @fors) { DEBUG and print STDERR "Treating $para_type paragraph as such because stack has no =for's\n"; - + #} elsif(grep $_->[1]{'~resolve'}, @fors) { #} elsif(not grep !$_->[1]{'~resolve'}, @fors) { } elsif( $fors[-1][1]{'~resolve'} ) { # Look to the immediately containing for - + if($para_type eq 'Data') { DEBUG and print STDERR "Treating Data paragraph as Plain/Verbatim because the containing =for ($fors[-1][1]{'target'}) is a resolver\n"; $para->[0] = 'Para'; @@ -5055,7 +5294,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n if($para_type eq 'Plain') { $self->_ponder_Plain($para); } elsif($para_type eq 'Verbatim') { - $self->_ponder_Verbatim($para); + $self->_ponder_Verbatim($para); } elsif($para_type eq 'Data') { $self->_ponder_Data($para); } else { @@ -5069,11 +5308,12 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG and print STDERR "\n", pretty($para), "\n"; # traverse the treelet (which might well be just one string scalar) - $self->{'content_seen'} ||= 1; + $self->{'content_seen'} ||= 1 if $seen_legal_directive + && ! $self->{'~tried_gen_errata'}; $self->_traverse_treelet_bit(@$para); } } - + return; } @@ -5108,9 +5348,9 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } DEBUG > 1 and print STDERR "Faking out a =for $target as a =begin $target / =end $target\n"; - + $para->[0] = 'Data'; - + unshift @$paras, ['=begin', {'start_line' => $para->[1]{'start_line'}, '~really' => '=for'}, @@ -5122,7 +5362,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $target, ], ; - + return 1; } @@ -5139,20 +5379,20 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG and print STDERR "Ignoring targetless =begin\n"; return 1; } - + my ($target, $title) = $content =~ m/^(\S+)\s*(.*)$/; $para->[1]{'title'} = $title if ($title); $para->[1]{'target'} = $target; # without any ':' $content = $target; # strip off the title - + $content =~ s/^:!/!:/s; my $neg; # whether this is a negation-match $neg = 1 if $content =~ s/^!//s; my $to_resolve; # whether to process formatting codes $to_resolve = 1 if $content =~ s/^://s; - + my $dont_ignore; # whether this target matches us - + foreach my $target_name ( split(',', $content, -1), $neg ? () : '*' @@ -5160,7 +5400,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG > 2 and print STDERR " Considering whether =begin $content matches $target_name\n"; next unless $self->{'accept_targets'}{$target_name}; - + DEBUG > 2 and print STDERR " It DOES match the acceptable target $target_name!\n"; $to_resolve = 1 @@ -5197,7 +5437,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n if(!$dont_ignore or scalar grep $_->[1]{'~ignore'}, @$curr_open) { DEBUG > 1 and print STDERR "Ignoring ignorable =begin\n"; } else { - $self->{'content_seen'} ||= 1; + $self->{'content_seen'} ||= 1 unless $self->{'~tried_gen_errata'}; $self->_handle_element_start((my $scratch='for'), $para->[1]); } @@ -5223,7 +5463,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG and print STDERR "Ignoring targetless =end\n"; return 1; } - + unless($content =~ m/^\S+$/) { # i.e., unless it's one word $self->whine( $para->[1]{'start_line'}, @@ -5233,7 +5473,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG and print STDERR "Ignoring mistargetted =end $content\n"; return 1; } - + unless(@$curr_open and $curr_open->[-1][0] eq '=for') { $self->whine( $para->[1]{'start_line'}, @@ -5243,11 +5483,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG and print STDERR "Ignoring mistargetted =end $content\n"; return 1; } - + unless($content eq $curr_open->[-1][1]{'target'}) { $self->whine( $para->[1]{'start_line'}, - "=end $content doesn't match =begin " + "=end $content doesn't match =begin " . $curr_open->[-1][1]{'target'} . ". (Stack: " . $self->_dump_curr_open() . ')' @@ -5264,22 +5504,22 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } else { $curr_open->[-1][1]{'start_line'} = $para->[1]{'start_line'}; # what's that for? - - $self->{'content_seen'} ||= 1; + + $self->{'content_seen'} ||= 1 unless $self->{'~tried_gen_errata'}; $self->_handle_element_end( my $scratch = 'for', $para->[1]); } DEBUG > 1 and print STDERR "Popping $curr_open->[-1][0] $curr_open->[-1][1]{'target'} because of =end $content\n"; pop @$curr_open; return 1; - } + } sub _ponder_doc_end { my ($self,$para,$curr_open,$paras) = @_; if(@$curr_open) { # Deal with things left open DEBUG and print STDERR "Stack is nonempty at end-document: (", $self->_dump_curr_open(), ")\n"; - + DEBUG > 9 and print STDERR "Stack: ", pretty($curr_open), "\n"; unshift @$paras, $self->_closers_for_all_curr_open; # Make sure there is exactly one ~end in the parastack, at the end: @@ -5289,11 +5529,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # generate errata, and then another to be at the end # when that loop back around to process the errata. return 1; - + } else { DEBUG and print STDERR "Okay, stack is empty now.\n"; } - + # Try generating errata section, if applicable unless($self->{'~tried_gen_errata'}) { $self->{'~tried_gen_errata'} = 1; @@ -5304,7 +5544,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n return 1; # I.e., loop around again to process these fake-o paragraphs } } - + splice @$paras; # Well, that's that for this paragraph buffer. DEBUG and print STDERR "Throwing end-document event.\n"; @@ -5329,7 +5569,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # The surrounding methods set content_seen, so let us remain consistent. # I do not know why it was not here before -- should it not be here? - # $self->{'content_seen'} ||= 1; + # $self->{'content_seen'} ||= 1 unless $self->{'~tried_gen_errata'}; return; } @@ -5362,8 +5602,9 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $para->[1]{'~type'} = $list_type; push @$curr_open, $para; # yes, we reuse the paragraph as a stack item - + my $content = join ' ', splice @$para, 2; + $para->[1]{'~orig_content'} = $content; my $overness; if($content =~ m/^\s*$/s) { $para->[1]{'indent'} = 4; @@ -5385,13 +5626,13 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $para->[1]{'indent'} = 4; } DEBUG > 1 and print STDERR "=over found of type $list_type\n"; - - $self->{'content_seen'} ||= 1; + + $self->{'content_seen'} ||= 1 unless $self->{'~tried_gen_errata'}; $self->_handle_element_start((my $scratch = 'over-' . $list_type), $para->[1]); return; } - + sub _ponder_back { my ($self,$para,$curr_open,$paras) = @_; # TODO: fire off or or ?? @@ -5408,7 +5649,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n DEBUG > 1 and print STDERR "=back happily closes matching =over\n"; # Expected case: we're closing the most recently opened thing #my $over = pop @$curr_open; - $self->{'content_seen'} ||= 1; + $self->{'content_seen'} ||= 1 unless $self->{'~tried_gen_errata'}; $self->_handle_element_end( my $scratch = 'over-' . ( (pop @$curr_open)->[1]{'~type'} ), $para->[1] ); @@ -5438,10 +5679,10 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ; return 1; } - - + + my $over_type = $over->[1]{'~type'}; - + if(!$over_type) { # Shouldn't happen1 die "Typeless over in stack, starting at line " @@ -5466,7 +5707,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $item_type = $self->_get_item_type($para); # That kills the content of the item if it's a number or bullet. DEBUG and print STDERR " Item is of type ", $para->[0], " under $over_type\n"; - + if($item_type eq 'text') { # Nothing special needs doing for 'text' } elsif($item_type eq 'number' or $item_type eq 'bullet') { @@ -5482,16 +5723,16 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } else { die "Unhandled item type $item_type"; # should never happen } - + # =item-text thingies don't need any assimilation, it seems. } elsif($over_type eq 'number') { my $item_type = $self->_get_item_type($para); # That kills the content of the item if it's a number or bullet. DEBUG and print STDERR " Item is of type ", $para->[0], " under $over_type\n"; - + my $expected_value = ++ $curr_open->[-1][1]{'~counter'}; - + if($item_type eq 'bullet') { # Hm, it's not numeric. Correct for this. $para->[1]{'number'} = $expected_value; @@ -5516,7 +5757,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } elsif($expected_value == $para->[1]{'number'}) { DEBUG > 1 and print STDERR " Numeric item has the expected value of $expected_value\n"; - + } else { DEBUG > 1 and print STDERR " Numeric item has ", $para->[1]{'number'}, " instead of the expected value of $expected_value\n"; @@ -5527,7 +5768,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ); $para->[1]{'number'} = $expected_value; # correcting!! } - + if(@$para == 2) { # For the cases where we /didn't/ push to @$para if($paras->[0][0] eq '~Para') { @@ -5544,13 +5785,13 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $item_type = $self->_get_item_type($para); # That kills the content of the item if it's a number or bullet. DEBUG and print STDERR " Item is of type ", $para->[0], " under $over_type\n"; - + if($item_type eq 'bullet') { # as expected! if( $para->[1]{'~_freaky_para_hack'} ) { DEBUG and print STDERR "Accomodating '=item * Foo' tolerance hack.\n"; - push @$para, delete $para->[1]{'~_freaky_para_hack'}; + push @$para, $para->[1]{'~_freaky_para_hack'}; } } elsif($item_type eq 'number') { @@ -5617,30 +5858,45 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $para->[1]{'xml:space'} = 'preserve'; - my $indent = $self->strip_verbatim_indent; - if ($indent && ref $indent eq 'CODE') { - my @shifted = (shift @{$para}, shift @{$para}); - $indent = $indent->($para); - unshift @{$para}, @shifted; - } - - for(my $i = 2; $i < @$para; $i++) { - foreach my $line ($para->[$i]) { # just for aliasing - # Strip indentation. - $line =~ s/^\Q$indent// if $indent - && !($self->{accept_codes} && $self->{accept_codes}{VerbatimFormatted}); - while( $line =~ - # Sort of adapted from Text::Tabs -- yes, it's hardwired in that - # tabs are at every EIGHTH column. For portability, it has to be - # one setting everywhere, and 8th wins. - s/^([^\t]*)(\t+)/$1.(" " x ((length($2)<<3)-(length($1)&7)))/e - ) {} - - # TODO: whinge about (or otherwise treat) unindented or overlong lines + unless ($self->{'_output_is_for_JustPod'}) { + # Fix illegal settings for expand_verbatim_tabs() + # This is because this module doesn't do input error checking, but khw + # doesn't want to add yet another instance of that. + my $tab_width = $self->expand_verbatim_tabs; + $tab_width = $self->expand_verbatim_tabs(8) + if ! defined $tab_width + || $tab_width =~ /\D/; + + my $indent = $self->strip_verbatim_indent; + if ($indent && ref $indent eq 'CODE') { + my @shifted = (shift @{$para}, shift @{$para}); + $indent = $indent->($para); + unshift @{$para}, @shifted; + } + + for(my $i = 2; $i < @$para; $i++) { + foreach my $line ($para->[$i]) { # just for aliasing + # Strip indentation. + $line =~ s/^\Q$indent// if $indent; + next unless $tab_width; + + # This is commented out because of github issue #85, and the + # current maintainers don't know why it was there in the first + # place. + #&& !($self->{accept_codes} && $self->{accept_codes}{VerbatimFormatted}); + while( $line =~ + # Sort of adapted from Text::Tabs. + s/^([^\t]*)(\t+)/$1.(" " x ((length($2) + * $tab_width) + -(length($1) % $tab_width)))/e + ) {} + + # TODO: whinge about (or otherwise treat) unindented or overlong lines + } } } - + # Now the VerbatimFormatted hoodoo... if( $self->{'accept_codes'} and $self->{'accept_codes'}{'VerbatimFormatted'} @@ -5680,7 +5936,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n my $scratch; $self->_handle_element_start(($scratch=$name), shift @_); - + while (@_) { my $x = shift; if (ref($x)) { @@ -5690,7 +5946,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $self->_handle_text($x); } } - + $self->_handle_element_end($scratch=$name); return; } @@ -5735,7 +5991,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n sub _verbatim_format { my($it, $p) = @_; - + my $formatting; for(my $i = 2; $i < @$p; $i++) { # work backwards over the lines @@ -5743,7 +5999,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $p->[$i] .= "\n"; # Unlike with simple Verbatim blocks, we don't end up just doing # a join("\n", ...) on the contents, so we have to append a - # newline to ever line, and then nix the last one later. + # newline to every line, and then nix the last one later. } if( DEBUG > 4 ) { @@ -5756,7 +6012,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n for(my $i = $#$p; $i > 2; $i--) { # work backwards over the lines, except the first (#2) - + #next unless $p->[$i] =~ m{^#:([ \^\/\%]*)\n?$}s # and $p->[$i-1] !~ m{^#:[ \^\/\%]*\n?$}s; # look at a formatty line preceding a nonformatty one @@ -5764,7 +6020,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n if($p->[$i] =~ m{^#:([ \^\/\%]*)\n?$}s) { DEBUG > 5 and print STDERR " It's a formatty line. ", "Peeking at previous line ", $i-1, ": $$p[$i-1]: \n"; - + if( $p->[$i-1] =~ m{^#:[ \^\/\%]*\n?$}s ) { DEBUG > 5 and print STDERR " Previous line is formatty! Skipping this one.\n"; next; @@ -5780,11 +6036,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # "^" to mean bold, "/" to mean underline, and "%" to mean bold italic. # Example: # What do you want? i like pie. [or whatever] - # #:^^^^^^^^^^^^^^^^^ ///////////// - + # #:^^^^^^^^^^^^^^^^^ ///////////// + DEBUG > 4 and print STDERR "_verbatim_format considers:\n<$p->[$i-1]>\n<$p->[$i]>\n"; - + $formatting = ' ' . $1; $formatting =~ s/\s+$//s; # nix trailing whitespace unless(length $formatting and $p->[$i-1] =~ m/\S/) { # no-op @@ -5800,7 +6056,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } # Make $formatting and the previous line be exactly the same length, # with $formatting having a " " as the last character. - + DEBUG > 4 and print STDERR "Formatting <$formatting> on <", $p->[$i-1], ">\n"; @@ -5825,10 +6081,10 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n #print STDERR "Formatting <$new_line[-1][-1]> as $new_line[-1][0]\n"; } } - my @nixed = + my @nixed = splice @$p, $i-1, 2, @new_line; # replace myself and the next line DEBUG > 10 and print STDERR "Nixed count: ", scalar(@nixed), "\n"; - + DEBUG > 6 and print STDERR "New version of the above line is these tokens (", scalar(@new_line), "):", map( ref($_)?"<@$_> ":"<$_>", @new_line ), "\n"; @@ -5875,29 +6131,46 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # [ 'B', {}, "pie" ], # "!" # ] - + # This illustrates the general format of a treelet. It is an array: + # [0] is a scalar indicating its type. In the example above, the + # types are '~Top' and 'B' + # [1] is a hash of various flags about it, possibly empty + # [2] - [N] are an ordered list of the subcomponents of the treelet. + # Scalars are literal text, refs are sub-treelets, to + # arbitrary levels. Stringifying a treelet will recursively + # stringify the sub-treelets, concatentating everything + # together to form the exact text of the treelet. + my($self, $para, $start_line, $preserve_space) = @_; - + my $treelet = ['~Top', {'start_line' => $start_line},]; - + unless ($preserve_space || $self->{'preserve_whitespace'}) { $para =~ s/\s+/ /g; # collapse and trim all whitespace first. $para =~ s/ $//; $para =~ s/^ //; } - + # Only apparent problem the above code is that N<< >> turns into # N<< >>. But then, word wrapping does that too! So don't do that! - + + + # As a Start-code is encountered, the number of opening bracket '<' + # characters minus 1 is pushed onto @stack (so 0 means a single bracket, + # etc). When closing brackets are found in the text, at least this number + # (plus the 1) will be required to mean the Start-code is terminated. When + # those are found, @stack is popped. my @stack; + my @lineage = ($treelet); my $raw = ''; # raw content of L<> fcode before splitting/processing # XXX 'raw' is not 100% accurate: all surrounding whitespace is condensed - # into just 1 ' '. Is this the regex's doing or 'raw's? + # into just 1 ' '. Is this the regex's doing or 'raw's? Answer is it's + # the 'collapse and trim all whitespace first' lines just above. my $inL = 0; DEBUG > 4 and print STDERR "Paragraph:\n$para\n\n"; - + # Here begins our frightening tokenizer RE. The following regex matches # text in four main parts: # @@ -5930,7 +6203,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n | # Match multiple-bracket end codes. $3 gets the whitespace that # should be discarded before an end bracket but kept in other cases - # and $4 gets the end brackets themselves. + # and $4 gets the end brackets themselves. ($3 can be empty if the + # construct is empty, like C<< >>, and all the white-space has been + # gobbled up already, considered to be space after the opening + # bracket. In this case we use look-behind to verify that there are + # at least 2 spaces in a row before the ">".) (\s+|(?<=\s\s))(>{2,}) | (\s?>) # $5: simple end-codes @@ -5956,23 +6233,48 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n ) { DEBUG > 4 and print STDERR "\nParagraphic tokenstack = (@stack)\n"; if(defined $1) { + my $bracket_count; # How many '<<<' in a row this has. Needed for + # Pod::Simple::JustPod if(defined $2) { DEBUG > 3 and print STDERR "Found complex start-text code \"$1\"\n"; - push @stack, length($2) + 1; - # length of the necessary complex end-code string + $bracket_count = length($2) + 1; + push @stack, $bracket_count; # length of the necessary complex + # end-code string } else { DEBUG > 3 and print STDERR "Found simple start-text code \"$1\"\n"; push @stack, 0; # signal that we're looking for simple + $bracket_count = 1; } - push @lineage, [ substr($1,0,1), {}, ]; # new node object - push @{ $lineage[-2] }, $lineage[-1]; - if ('L' eq substr($1,0,1)) { - $raw = $inL ? $raw.$1 : ''; # reset raw content accumulator - $inL = 1; + my $code = substr($1,0,1); + if ('L' eq $code) { + if ($inL) { + $raw .= $1; + $self->scream( $start_line, + 'Nested L<> are illegal. Pretending inner one is ' + . 'X<...> so can continue looking for other errors.'); + $code = "X"; + } + else { + $raw = ""; # reset raw content accumulator + $inL = @stack; + } } else { $raw .= $1 if $inL; } - + push @lineage, [ $code, {}, ]; # new node object + + # Tell Pod::Simple::JustPod how many brackets there were, but to save + # space, not in the most usual case of there was just 1. It can be + # inferred by the absence of this element. Similarly, if there is more + # than one bracket, extract the white space between the final bracket + # and the real beginning of the interior. Save that if it isn't just a + # single space + if ($self->{'_output_is_for_JustPod'} && $bracket_count > 1) { + $lineage[-1][1]{'~bracket_count'} = $bracket_count; + my $lspacer = substr($1, 1 + $bracket_count); + $lineage[-1][1]{'~lspacer'} = $lspacer if $lspacer ne " "; + } + push @{ $lineage[-2] }, $lineage[-1]; } elsif(defined $4) { DEBUG > 3 and print STDERR "Found apparent complex end-text code \"$3$4\"\n"; # This is where it gets messy... @@ -6001,20 +6303,35 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n } #print STDERR "\nHOOBOY ", scalar(@{$lineage[-1]}), "!!!\n"; + if ($3 ne " " && $self->{'_output_is_for_JustPod'}) { + if ($3 ne "") { + $lineage[-1][1]{'~rspacer'} = $3; + } + elsif ($lineage[-1][1]{'~lspacer'} eq " ") { + + # Here we had something like C<< >> which was a false positive + delete $lineage[-1][1]{'~lspacer'}; + } + else { + $lineage[-1][1]{'~rspacer'} + = substr($lineage[-1][1]{'~lspacer'}, -1, 1); + chop $lineage[-1][1]{'~lspacer'}; + } + } + push @{ $lineage[-1] }, '' if 2 == @{ $lineage[-1] }; # Keep the element from being childless - - pop @stack; - pop @lineage; - unless (@stack) { # not in an L if there are no open fcodes + if ($inL == @stack) { + $lineage[-1][1]{'raw'} = $raw; $inL = 0; - if (ref $lineage[-1][-1] && $lineage[-1][-1][0] eq 'L') { - $lineage[-1][-1][1]{'raw'} = $raw - } } + + pop @stack; + pop @lineage; + $raw .= $3.$4 if $inL; - + } elsif(defined $5) { DEBUG > 3 and print STDERR "Found apparent simple end-text code \"$5\"\n"; @@ -6028,6 +6345,11 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n push @{ $lineage[-1] }, ''; # keep it from being really childless } + if ($inL == @stack) { + $lineage[-1][1]{'raw'} = $raw; + $inL = 0; + } + pop @stack; pop @lineage; } else { @@ -6035,12 +6357,6 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n push @{ $lineage[-1] }, $5; } - unless (@stack) { # not in an L if there are no open fcodes - $inL = 0; - if (ref $lineage[-1][-1] && $lineage[-1][-1][0] eq 'L') { - $lineage[-1][-1][1]{'raw'} = $raw - } - } $raw .= $5 if $inL; } elsif(defined $6) { @@ -6049,6 +6365,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n $raw .= $6 if $inL; # XXX does not capture multiplace whitespaces -- 'raw' ends up with # at most 1 leading/trailing whitespace, why not all of it? + # Answer, because we deliberately trimmed it above } else { # should never ever ever ever happen @@ -6179,7 +6496,7 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n # letters, but I don't know if it has always worked without bugs. It # seemed safest just to list the characters. # s<([^\x20\x21\x23\x27-\x3F\x41-\x5B\x5D-\x7E])> - s<([^ !#'()*+,\-./0123456789:;\<=\>?ABCDEFGHIJKLMNOPQRSTUVWXYZ\[\]^_`abcdefghijklmnopqrstuvwxyz{|}~])> + s<([^ !"#'()*+,\-./0123456789:;\<=\>?ABCDEFGHIJKLMNOPQRSTUVWXYZ\[\]^_`abcdefghijklmnopqrstuvwxyz{|}~])> <$pretty_form{$1} || '\\x{'.sprintf("%x", ord($1)).'}'>eg; #<$pretty_form{$1} || '\\x'.(unpack("H2",$1))>eg; qq{"$_"}; @@ -6213,23 +6530,19 @@ $fatpacked{"Pod/Simple/BlackBox.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n POD_SIMPLE_BLACKBOX $fatpacked{"Pod/Simple/LinkSection.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_SIMPLE_LINKSECTION'; - - require 5; package Pod::Simple::LinkSection; # Based somewhat dimly on Array::Autojoin - use vars qw($VERSION ); - $VERSION = '3.35'; use strict; + use warnings; use Pod::Simple::BlackBox; - use vars qw($VERSION ); - $VERSION = '3.35'; + our $VERSION = '3.45'; use overload( # So it'll stringify nice '""' => \&Pod::Simple::BlackBox::stringify_lol, 'bool' => \&Pod::Simple::BlackBox::stringify_lol, # '.=' => \&tack_on, # grudgingly support - + 'fallback' => 1, # turn on cleverness ); @@ -6304,7 +6617,7 @@ $fatpacked{"Pod/Simple/LinkSection.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\ can just use the normal stringification of objects of this class; they stringify to just the text content of the section, such as "foo" for - C<< LZ<> >>, and "bar" for + C<< LZ<> >>, and "bar" for C<< LZ<>> >>. However, anyone particularly interested in getting the full value of @@ -6353,7 +6666,7 @@ $fatpacked{"Pod/Simple/LinkSection.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\ This module is managed in an open GitHub repository, L. Feel free to fork and contribute, or - to clone L and send patches! + to clone L and send patches! Patches against Pod::Simple are welcome. Please send bug reports to . @@ -6389,6 +6702,113 @@ $fatpacked{"Pod/Simple/LinkSection.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\ =cut POD_SIMPLE_LINKSECTION +$fatpacked{"Pod/Simple/TiedOutFH.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_SIMPLE_TIEDOUTFH'; + package Pod::Simple::TiedOutFH; + use strict; + use warnings; + use Symbol ('gensym'); + use Carp (); + our $VERSION = '3.45'; + + #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + sub handle_on { # some horrible frightening things are encapsulated in here + my $class = shift; + $class = ref($class) || $class; + + Carp::croak "Usage: ${class}->handle_on(\$somescalar)" unless @_; + + my $x = (defined($_[0]) and ref($_[0])) + ? $_[0] + : ( \( $_[0] ) )[0] + ; + $$x = '' unless defined $$x; + + #Pod::Simple::DEBUG and print STDERR "New $class handle on $x = \"$$x\"\n"; + + my $new = gensym(); + tie *$new, $class, $x; + return $new; + } + + #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + sub TIEHANDLE { # Ties to just a scalar ref + my($class, $scalar_ref) = @_; + $$scalar_ref = '' unless defined $$scalar_ref; + return bless \$scalar_ref, ref($class) || $class; + } + + sub PRINT { + my $it = shift; + foreach my $x (@_) { $$$it .= $x } + + #Pod::Simple::DEBUG > 10 and print STDERR " appended to $$it = \"$$$it\"\n"; + + return 1; + } + + sub FETCH { + return ${$_[0]}; + } + + sub PRINTF { + my $it = shift; + my $format = shift; + $$$it .= sprintf $format, @_; + return 1; + } + + sub FILENO { ${ $_[0] } + 100 } # just to produce SOME number + + sub CLOSE { 1 } + + #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + 1; + __END__ + + Chole + + * 1 large red onion + * 2 tomatillos + * 4 or 5 roma tomatoes (optionally with the pulp discarded) + * 1 tablespoons chopped ginger root (or more, to taste) + * 2 tablespoons canola oil (or vegetable oil) + + * 1 tablespoon garam masala + * 1/2 teaspoon red chili powder, or to taste + * Salt, to taste (probably quite a bit) + * 2 (15-ounce) cans chick peas or garbanzo beans, drained and rinsed + * juice of one smallish lime + * a dash of balsamic vinegar (to taste) + * cooked rice, preferably long-grain white rice (whether plain, + basmati rice, jasmine rice, or even a mild pilaf) + + In a blender or food processor, puree the onions, tomatoes, tomatillos, + and ginger root. You can even do it with a Braun hand "mixer", if you + chop things finer to start with, and work at it. + + In a saucepan set over moderate heat, warm the oil until hot. + + Add the puree and the balsamic vinegar, and cook, stirring occasionally, + for 20 to 40 minutes. (Cooking it longer will make it sweeter.) + + Add the Garam Masala, chili powder, and cook, stirring occasionally, for + 5 minutes. + + Add the salt and chick peas and cook, stirring, until heated through. + + Stir in the lime juice, and optionally one or two teaspoons of tahini. + You can let it simmer longer, depending on how much softer you want the + garbanzos to get. + + Serve over rice, like a curry. + + Yields 5 to 7 servings. + + +POD_SIMPLE_TIEDOUTFH + $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TEXT'; # Convert POD data to formatted text. # @@ -6406,23 +6826,21 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE package Pod::Text; - use 5.006; + use 5.010; use strict; use warnings; - use vars qw(@ISA @EXPORT %ESCAPES $VERSION); - use Carp qw(carp croak); use Encode qw(encode); use Exporter (); use Pod::Simple (); - @ISA = qw(Pod::Simple Exporter); + our @ISA = qw(Pod::Simple Exporter); + our $VERSION = '5.01_02'; + $VERSION =~ tr/_//d; # We have to export pod2text for backward compatibility. - @EXPORT = qw(pod2text); - - $VERSION = '4.11'; + our @EXPORT = qw(pod2text); # Ensure that $Pod::Simple::nbsp and $Pod::Simple::shy are available. Code # taken from Pod::Simple 3.32, but was only added in 3.30. @@ -6431,18 +6849,15 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE $NBSP = $Pod::Simple::nbsp; $SHY = $Pod::Simple::shy; } else { - if ($] ge 5.007_003) { - $NBSP = chr utf8::unicode_to_native(0xA0); - $SHY = chr utf8::unicode_to_native(0xAD); - } elsif (Pod::Simple::ASCII) { - $NBSP = "\xA0"; - $SHY = "\xAD"; - } else { - $NBSP = "\x41"; - $SHY = "\xCA"; - } + $NBSP = chr utf8::unicode_to_native(0xA0); + $SHY = chr utf8::unicode_to_native(0xAD); } + # Import the ASCII constant from Pod::Simple. This is true iff we're in an + # ASCII-based universe (including such things as ISO 8859-1 and UTF-8), and is + # generally only false for EBCDIC. + BEGIN { *ASCII = \&Pod::Simple::ASCII } + ############################################################################## # Initialization ############################################################################## @@ -6464,9 +6879,6 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE my $class = shift; my $self = $class->SUPER::new; - # Tell Pod::Simple to handle S<> by automatically inserting  . - $self->nbsp_for_S (1); - # Tell Pod::Simple to keep whitespace whenever possible. if ($self->can ('preserve_whitespace')) { $self->preserve_whitespace (1); @@ -6489,16 +6901,20 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE my @opts = map { ("opt_$_", $opts{$_}) } keys %opts; %$self = (%$self, @opts); - # Send errors to stderr if requested. + # Backwards-compatibility support for the stderr option. if ($$self{opt_stderr} and not $$self{opt_errors}) { $$self{opt_errors} = 'stderr'; } delete $$self{opt_stderr}; - # Validate the errors parameter and act on it. - if (not defined $$self{opt_errors}) { - $$self{opt_errors} = 'pod'; + # Backwards-compatibility support for the utf8 option. + if ($$self{opt_utf8} && !$$self{opt_encoding}) { + $$self{opt_encoding} = 'UTF-8'; } + delete $$self{opt_utf8}; + + # Validate the errors parameter and act on it. + $$self{opt_errors} //= 'pod'; if ($$self{opt_errors} eq 'stderr' || $$self{opt_errors} eq 'die') { $self->no_errata_section (1); $self->complain_stderr (1); @@ -6517,12 +6933,12 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE delete $$self{errors}; # Initialize various things from our parameters. - $$self{opt_alt} = 0 unless defined $$self{opt_alt}; - $$self{opt_indent} = 4 unless defined $$self{opt_indent}; - $$self{opt_margin} = 0 unless defined $$self{opt_margin}; - $$self{opt_loose} = 0 unless defined $$self{opt_loose}; - $$self{opt_sentence} = 0 unless defined $$self{opt_sentence}; - $$self{opt_width} = 76 unless defined $$self{opt_width}; + $$self{opt_alt} //= 0; + $$self{opt_indent} //= 4; + $$self{opt_margin} //= 0; + $$self{opt_loose} //= 0; + $$self{opt_sentence} //= 0; + $$self{opt_width} //= 76; # Figure out what quotes we'll be using for C<> text. $$self{opt_quotes} ||= '"'; @@ -6538,6 +6954,17 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE croak qq(Invalid quote specification "$$self{opt_quotes}"); } + # Configure guesswork based on options. + my $guesswork = $self->{opt_guesswork} || q{}; + my %guesswork = map { $_ => 1 } split(m{,}xms, $guesswork); + if (!%guesswork || $guesswork{all}) { + $$self{GUESSWORK} = {quoting => 1}; + } elsif ($guesswork{none}) { + $$self{GUESSWORK} = {}; + } else { + $$self{GUESSWORK} = {%guesswork}; + } + # If requested, do something with the non-POD text. $self->code_handler (\&handle_code) if $$self{opt_code}; @@ -6639,7 +7066,7 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE my $spaces = ' ' x $$self{MARGIN}; my $width = $$self{opt_width} - $$self{MARGIN}; while (length > $width) { - if (s/^([^\n]{0,$width})\s+// || s/^([^\n]{$width})//) { + if (s/^([^\n]{0,$width})[ \t\n]+// || s/^([^\n]{$width})//) { $output .= $spaces . $1 . "\n"; } else { last; @@ -6657,22 +7084,22 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE local $_ = shift; # If we're trying to preserve two spaces after sentences, do some munging - # to support that. Otherwise, smash all repeated whitespace. + # to support that. Otherwise, smash all repeated whitespace. Be careful + # not to use \s here, which in Unicode input may match non-breaking spaces + # that we don't want to smash. if ($$self{opt_sentence}) { s/ +$//mg; s/\.\n/. \n/g; s/\n/ /g; s/ +/ /g; } else { - s/\s+/ /g; + s/[ \t\n]+/ /g; } return $self->wrap ($_); } # Output text to the output device. Replace non-breaking spaces with spaces - # and soft hyphens with nothing, and then try to fix the output encoding if - # necessary to match the input encoding unless UTF-8 output is forced. This - # preserves the traditional pass-through behavior of Pod::Text. + # and soft hyphens with nothing, and then determine the output encoding. sub output { my ($self, @text) = @_; my $text = join ('', @text); @@ -6682,15 +7109,39 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE if ($SHY) { $text =~ s/$SHY//g; } - unless ($$self{opt_utf8}) { - my $encoding = $$self{encoding} || ''; - if ($encoding && $encoding ne $$self{ENCODING}) { - $$self{ENCODING} = $encoding; - eval { binmode ($$self{output_fh}, ":encoding($encoding)") }; - } - } + + # The logic used here is described in the POD documentation. Prefer the + # configured encoding, then the pass-through option of using the same + # encoding as the input, and then UTF-8, but commit to an encoding for the + # document. + # + # ENCODE says whether to encode or not and is turned off if there is a + # PerlIO encoding layer (in start_document). ENCODING is the encoding + # that we previously committed to and is cleared at the start of each + # document. if ($$self{ENCODE}) { - print { $$self{output_fh} } encode ('UTF-8', $text); + my $encoding = $$self{ENCODING}; + if (!$encoding) { + $encoding = $self->encoding(); + if (!$encoding && ASCII && $text =~ /[^\x00-\x7F]/) { + $encoding = 'UTF-8'; + } + if ($encoding) { + $$self{ENCODING} = $encoding; + } + } + if ($encoding) { + my $check = sub { + my ($char) = @_; + my $display = '"\x{' . hex($char) . '}"'; + my $error = "$display does not map to $$self{ENCODING}"; + $self->whine ($self->line_count(), $error); + return Encode::encode ($$self{ENCODING}, chr($char)); + }; + print { $$self{output_fh} } encode ($encoding, $text, $check); + } else { + print { $$self{output_fh} } $text; + } } else { print { $$self{output_fh} } $text; } @@ -6720,25 +7171,19 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE $$self{MARGIN} = $margin; # Default left margin. $$self{PENDING} = [[]]; # Pending output. - # We have to redo encoding handling for each document. - $$self{ENCODING} = ''; - - # When UTF-8 output is set, check whether our output file handle already - # has a PerlIO encoding layer set. If it does not, we'll need to encode - # our output before printing it (handled in the output() sub). Wrap the - # check in an eval to handle versions of Perl without PerlIO. - $$self{ENCODE} = 0; - if ($$self{opt_utf8}) { - $$self{ENCODE} = 1; - eval { - my @options = (output => 1, details => 1); - my $flag = (PerlIO::get_layers ($$self{output_fh}, @options))[-1]; - if ($flag & PerlIO::F_UTF8 ()) { - $$self{ENCODE} = 0; - $$self{ENCODING} = 'UTF-8'; - } - }; - } + # We have to redo encoding handling for each document. Check whether the + # output file handle already has a PerlIO encoding layer set and, if so, + # disable encoding. + $$self{ENCODE} = 1; + eval { + require PerlIO; + my @options = (output => 1, details => 1); + my $flag = (PerlIO::get_layers ($$self{output_fh}, @options))[-1]; + if ($flag && ($flag & PerlIO::F_UTF8 ())) { + $$self{ENCODE} = 0; + } + }; + $$self{ENCODING} = $$self{opt_encoding}; return ''; } @@ -6782,8 +7227,7 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE # Calculate the indentation and margin. $fits is set to true if the tag # will fit into the margin of the paragraph given our indentation level. - my $indent = $$self{INDENTS}[-1]; - $indent = $$self{opt_indent} unless defined $indent; + my $indent = $$self{INDENTS}[-1] // $$self{opt_indent}; my $margin = ' ' x $$self{opt_margin}; my $tag_length = length ($self->strip_format ($tag)); my $fits = ($$self{MARGIN} - $indent >= $tag_length + 1); @@ -6987,6 +7431,13 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE sub cmd_i { return '*' . $_[2] . '*' } sub cmd_x { return '' } + # Convert all internal whitespace to $NBSP. + sub cmd_s { + my ($self, $attrs, $text) = @_; + $text =~ s{ \s }{$NBSP}xmsg; + return $text; + } + # Apply a whole bunch of messy heuristics to not quote things that don't # benefit from being quoted. These originally come from Barrie Slaymaker and # largely duplicate code in Pod::Man. @@ -6996,23 +7447,35 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE # A regex that matches the portion of a variable reference that's the # array or hash index, separated out just because we want to use it in # several places in the following regex. - my $index = '(?: \[.*\] | \{.*\} )?'; + my $index = '(?: \[[^]]+\] | \{[^}]+\} )?'; # Check for things that we don't want to quote, and if we find any of # them, return the string with just a font change and no quoting. + # + # Traditionally, Pod::Text has not quoted Perl variables, functions, + # numbers, or hex constants, but this is not always desirable. Make this + # optional on the quoting guesswork flag. + my $extra = qr{(?!)}xms; # never matches + if ($$self{GUESSWORK}{quoting}) { + $extra = qr{ + \$+ [\#^]? \S $index # special ($^F, $") + | [\$\@%&*]+ \#? [:\'\w]+ $index # plain var or func + | [\$\@%&*]* [:\'\w]+ + (?: -> )? \(\s*[^\s,\)]*\s*\) # 0/1-arg func call + | [+-]? ( \d[\d.]* | \.\d+ ) + (?: [eE][+-]?\d+ )? # a number + | 0x [a-fA-F\d]+ # a hex constant + }xms; + } $text =~ m{ ^\s* (?: - ( [\'\`\"] ) .* \1 # already quoted - | \` .* \' # `quoted' - | \$+ [\#^]? \S $index # special ($^Foo, $") - | [\$\@%&*]+ \#? [:\'\w]+ $index # plain var or func - | [\$\@%&*]* [:\'\w]+ (?: -> )? \(\s*[^\s,]\s*\) # 0/1-arg func call - | [+-]? ( \d[\d.]* | \.\d+ ) (?: [eE][+-]?\d+ )? # a number - | 0x [a-fA-F\d]+ # a hex constant + ( [\'\`\"] ) .* \1 # already quoted + | \` .* \' # `quoted' + | $extra ) \s*\z - }xo && return $text; + }xms and return $text; # If we didn't return, go ahead and quote the text. return $$self{opt_alt} @@ -7165,7 +7628,7 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE =for stopwords alt stderr Allbery Sean Burke's Christiansen UTF-8 pre-Unicode utf8 nourls - parsers + parsers EBCDIC autodetecting superset unrepresentable FH NNN =head1 NAME @@ -7184,67 +7647,180 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE =head1 DESCRIPTION - Pod::Text is a module that can convert documentation in the POD format - (the preferred language for documenting Perl) into formatted text. It - uses no special formatting controls or codes whatsoever, and its output is - therefore suitable for nearly any device. + Pod::Text is a module that can convert documentation in the POD format (the + preferred language for documenting Perl) into formatted text. It uses no + special formatting controls or codes, and its output is therefore suitable for + nearly any device. - As a derived class from Pod::Simple, Pod::Text supports the same methods and - interfaces. See L for all the details; briefly, one creates a - new parser with C<< Pod::Text->new() >> and then normally calls parse_file(). + =head2 Encoding + + Pod::Text uses the following logic to choose an output encoding, in order: + + =over 4 + + =item 1. + + If a PerlIO encoding layer is set on the output file handle, do not do any + output encoding and will instead rely on the PerlIO encoding layer. + + =item 2. + + If the C or C options are set, use the output encoding + specified by those options. - new() can take options, in the form of key/value pairs, that control the - behavior of the parser. The currently recognized options are: + =item 3. + + If the input encoding of the POD source file was explicitly specified (using + C<=encoding>) or automatically detected by Pod::Simple, use that as the output + encoding as well. + + =item 4. + + Otherwise, if running on a non-EBCDIC system, use UTF-8 as the output + encoding. Since this is a superset of ASCII, this will result in ASCII output + unless the POD input contains non-ASCII characters without declaring or + autodetecting an encoding (usually via EZ<><> escapes). + + =item 5. + + Otherwise, for EBCDIC systems, output without doing any encoding and hope + this works. + + =back + + One caveat: Pod::Text has to commit to an output encoding the first time it + outputs a non-ASCII character, and then has to stick with it for consistency. + However, C<=encoding> commands don't have to be at the beginning of a POD + document. If someone uses a non-ASCII character early in a document with an + escape, such as EZ<><0xEF>, and then puts C<=encoding iso-8859-1> later, + ideally Pod::Text would follow rule 3 and output the entire document as ISO + 8859-1. Instead, it will commit to UTF-8 following rule 4 as soon as it sees + that escape, and then stick with that encoding for the rest of the document. + + Unfortunately, there's no universally good choice for an output encoding. + Each choice will be incorrect in some circumstances. This approach was chosen + primarily for backwards compatibility. Callers should consider forcing the + output encoding via C if they have any knowledge about what encoding + the user may expect. + + In particular, consider importing the L module, if available, + and setting C to C to use an output encoding appropriate to + the user's locale. But be aware that if the user is not using locales or is + using a locale of C, Encode::Locale will set the output encoding to + US-ASCII. This will cause all non-ASCII characters will be replaced with C + and produce a flurry of warnings about unsupported characters, which may or + may not be what you want. + + =head1 CLASS METHODS + + =over 4 + + =item new(ARGS) + + Create a new Pod::Text object. ARGS should be a list of key/value pairs, + where the keys are chosen from the following. Each option is annotated with + the version of Pod::Text in which that option was added with its current + meaning. =over 4 =item alt - If set to a true value, selects an alternate output format that, among other - things, uses a different heading style and marks C<=item> entries with a + [2.00] If set to a true value, selects an alternate output format that, among + other things, uses a different heading style and marks C<=item> entries with a colon in the left margin. Defaults to false. =item code - If set to a true value, the non-POD parts of the input file will be included - in the output. Useful for viewing code documented with POD blocks with the - POD rendered and the code left intact. + [2.13] If set to a true value, the non-POD parts of the input file will be + included in the output. Useful for viewing code documented with POD blocks + with the POD rendered and the code left intact. + + =item encoding + + [5.00] Specifies the encoding of the output. The value must be an encoding + recognized by the L module (see L). If the output + contains characters that cannot be represented in this encoding, that is an + error that will be reported as configured by the C option. If error + handling is other than C, the unrepresentable character will be replaced + with the Encode substitution character (normally C). + + If the output file handle has a PerlIO encoding layer set, this parameter will + be ignored and no encoding will be done by Pod::Man. It will instead rely on + the encoding layer to make whatever output encoding transformations are + desired. + + WARNING: The input encoding of the POD source is independent from the output + encoding, and setting this option does not affect the interpretation of the + POD input. Unless your POD source is US-ASCII, its encoding should be + declared with the C<=encoding> command in the source, as near to the top of + the file as possible. If this is not done, Pod::Simple will will attempt to + guess the encoding and may be successful if it's Latin-1 or UTF-8, but it will + produce warnings. See L for more information. =item errors - How to report errors. C says to throw an exception on any POD - formatting error. C says to report errors on standard error, but - not to throw an exception. C says to include a POD ERRORS section - in the resulting documentation summarizing the errors. C ignores - POD errors entirely, as much as possible. + [3.17] How to report errors. C says to throw an exception on any POD + formatting error. C says to report errors on standard error, but not + to throw an exception. C says to include a POD ERRORS section in the + resulting documentation summarizing the errors. C ignores POD errors + entirely, as much as possible. The default is C. + =item guesswork + + [5.01] By default, Pod::Text applies some default formatting rules based on + guesswork and regular expressions that are intended to make writing Perl + documentation easier and require less explicit markup. These rules may not + always be appropriate, particularly for documentation that isn't about Perl. + This option allows turning all or some of it off. + + The special value C enables all guesswork. This is also the default for + backward compatibility reasons. The special value C disables all + guesswork. Otherwise, the value of this option should be a comma-separated + list of one or more of the following keywords: + + =over 4 + + =item quoting + + If no guesswork is enabled, any text enclosed in CZ<><> is surrounded by + double quotes in nroff (terminal) output unless the contents are already + quoted. When this guesswork is enabled, quote marks will also be suppressed + for Perl variables, function names, function calls, numbers, and hex + constants. + + =back + + Any unknown guesswork name is silently ignored (for potential future + compatibility), so be careful about spelling. + =item indent - The number of spaces to indent regular text, and the default indentation for - C<=over> blocks. Defaults to 4. + [2.00] The number of spaces to indent regular text, and the default + indentation for C<=over> blocks. Defaults to 4. =item loose - If set to a true value, a blank line is printed after a C<=head1> heading. - If set to false (the default), no blank line is printed after C<=head1>, - although one is still printed after C<=head2>. This is the default because - it's the expected formatting for manual pages; if you're formatting + [2.00] If set to a true value, a blank line is printed after a C<=head1> + heading. If set to false (the default), no blank line is printed after + C<=head1>, although one is still printed after C<=head2>. This is the default + because it's the expected formatting for manual pages; if you're formatting arbitrary text documents, setting this to true may result in more pleasing output. =item margin - The width of the left margin in spaces. Defaults to 0. This is the margin - for all text, including headings, not the amount by which regular text is - indented; for the latter, see the I option. To set the right + [2.21] The width of the left margin in spaces. Defaults to 0. This is the + margin for all text, including headings, not the amount by which regular text + is indented; for the latter, see the I option. To set the right margin, see the I option. =item nourls - Normally, LZ<><> formatting codes with a URL but anchor text are formatted - to show both the anchor text and the URL. In other words: + [3.17] Normally, LZ<><> formatting codes with a URL but anchor text are + formatted to show both the anchor text and the URL. In other words: L @@ -7252,72 +7828,131 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE foo - This option, if set to a true value, suppresses the URL when anchor text - is given, so this example would be formatted as just C. This can - produce less cluttered output in cases where the URLs are not particularly - important. + This option, if set to a true value, suppresses the URL when anchor text is + given, so this example would be formatted as just C. This can produce + less cluttered output in cases where the URLs are not particularly important. =item quotes - Sets the quote marks used to surround CE> text. If the value is a - single character, it is used as both the left and right quote. Otherwise, - it is split in half, and the first half of the string is used as the left - quote and the second is used as the right quote. + [4.00] Sets the quote marks used to surround CE> text. If the value is a + single character, it is used as both the left and right quote. Otherwise, it + is split in half, and the first half of the string is used as the left quote + and the second is used as the right quote. This may also be set to the special value C, in which case no quote marks are added around CE> text. =item sentence - If set to a true value, Pod::Text will assume that each sentence ends in two - spaces, and will try to preserve that spacing. If set to false, all - consecutive whitespace in non-verbatim paragraphs is compressed into a - single space. Defaults to false. + [3.00] If set to a true value, Pod::Text will assume that each sentence ends + in two spaces, and will try to preserve that spacing. If set to false, all + consecutive whitespace in non-verbatim paragraphs is compressed into a single + space. Defaults to false. =item stderr - Send error messages about invalid POD to standard error instead of - appending a POD ERRORS section to the generated output. This is - equivalent to setting C to C if C is not already - set. It is supported for backward compatibility. + [3.10] Send error messages about invalid POD to standard error instead of + appending a POD ERRORS section to the generated output. This is equivalent to + setting C to C if C is not already set. It is + supported for backward compatibility. =item utf8 - By default, Pod::Text uses the same output encoding as the input encoding - of the POD source (provided that Perl was built with PerlIO; otherwise, it - doesn't encode its output). If this option is given, the output encoding - is forced to UTF-8. - - Be aware that, when using this option, the input encoding of your POD - source should be properly declared unless it's US-ASCII. Pod::Simple will - attempt to guess the encoding and may be successful if it's Latin-1 or - UTF-8, but it will produce warnings. Use the C<=encoding> command to - declare the encoding. See L for more information. + [3.12] If this option is set to a true value, the output encoding is set to + UTF-8. This is equivalent to setting C to C if C + is not already set. It is supported for backward compatibility. =item width - The column at which to wrap text on the right-hand side. Defaults to 76. + [2.00] The column at which to wrap text on the right-hand side. Defaults to + 76. + + =back =back - The standard Pod::Simple method parse_file() takes one argument naming the - POD file to read from. By default, the output is sent to C, but - this can be changed with the output_fh() method. + =head1 INSTANCE METHODS + + As a derived class from Pod::Simple, Pod::Text supports the same methods and + interfaces. See L for all the details. This section summarizes + the most-frequently-used methods and the ones added by Pod::Text. + + =over 4 + + =item output_fh(FH) + + Direct the output from parse_file(), parse_lines(), or parse_string_document() + to the file handle FH instead of C. + + =item output_string(REF) + + Direct the output from parse_file(), parse_lines(), or parse_string_document() + to the scalar variable pointed to by REF, rather than C. For example: + + my $man = Pod::Man->new(); + my $output; + $man->output_string(\$output); + $man->parse_file('/some/input/file'); + + Be aware that the output in that variable will already be encoded (see + L). + + =item parse_file(PATH) + + Read the POD source from PATH and format it. By default, the output is sent + to C, but this can be changed with the output_fh() or output_string() + methods. + + =item parse_from_file(INPUT, OUTPUT) + + =item parse_from_filehandle(FH, OUTPUT) + + Read the POD source from INPUT, format it, and output the results to OUTPUT. + + parse_from_filehandle() is provided for backward compatibility with older + versions of Pod::Man. parse_from_file() should be used instead. + + =item parse_lines(LINES[, ...[, undef]]) + + Parse the provided lines as POD source, writing the output to either C + or the file handle set with the output_fh() or output_string() methods. This + method can be called repeatedly to provide more input lines. An explicit + C should be passed to indicate the end of input. - The standard Pod::Simple method parse_from_file() takes up to two - arguments, the first being the input file to read POD from and the second - being the file to write the formatted output to. + This method expects raw bytes, not decoded characters. - You can also call parse_lines() to parse an array of lines or - parse_string_document() to parse a document already in memory. As with - parse_file(), parse_lines() and parse_string_document() default to sending - their output to C unless changed with the output_fh() method. + =item parse_string_document(INPUT) - To put the output from any parse method into a string instead of a file - handle, call the output_string() method instead of output_fh(). + Parse the provided scalar variable as POD source, writing the output to either + C or the file handle set with the output_fh() or output_string() + methods. - See L for more specific details on the methods available to - all derived parsers. + This method expects raw bytes, not decoded characters. + + =back + + =head1 FUNCTIONS + + Pod::Text exports one function for backward compatibility with older versions. + This function is deprecated; instead, use the object-oriented interface + described above. + + =over 4 + + =item pod2text([[-a,] [-NNN,]] INPUT[, OUTPUT]) + + Convert the POD source from INPUT to text and write it to OUTPUT. If OUTPUT + is not given, defaults to C. INPUT can be any expression supported as + the second argument to two-argument open(). + + If C<-a> is given as an initial argument, pass the C option to the + Pod::Text constructor. This enables alternative formatting. + + If C<-NNN> is given as an initial argument, pass the C option to the + Pod::Text constructor with the number C as its argument. This sets the + wrap line width to NNN. + + =back =head1 DIAGNOSTICS @@ -7352,61 +7987,66 @@ $fatpacked{"Pod/Text.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_TE =back - =head1 BUGS + =head1 COMPATIBILITY - Encoding handling assumes that PerlIO is available and does not work - properly if it isn't. The C option is therefore not supported - unless Perl is built with PerlIO support. + Pod::Text 2.03 (based on L) was the first version of this module + included with Perl, in Perl 5.6.0. Earlier versions of Perl had a different + Pod::Text module, with a different API. - =head1 CAVEATS + The current API based on L was added in Pod::Text 3.00. + Pod::Text 3.01 was included in Perl 5.9.3, the first version of Perl to + incorporate those changes. This is the first version that correctly supports + all modern POD syntax. The parse_from_filehandle() method was re-added for + backward compatibility in Pod::Text 3.07, included in Perl 5.9.4. - If Pod::Text is given the C option, the encoding of its output file - handle will be forced to UTF-8 if possible, overriding any existing - encoding. This will be done even if the file handle is not created by - Pod::Text and was passed in from outside. This maintains consistency - regardless of PERL_UNICODE and other settings. + Pod::Text 3.12, included in Perl 5.10.1, first implemented the current + practice of attempting to match the default output encoding with the input + encoding of the POD source, unless overridden by the C option or (added + later) the C option. - If the C option is not given, the encoding of its output file handle - will be forced to the detected encoding of the input POD, which preserves - whatever the input text is. This ensures backward compatibility with - earlier, pre-Unicode versions of this module, without large numbers of - Perl warnings. + Support for anchor text in LZ<><> links of type URL was added in Pod::Text + 3.14, included in Perl 5.11.5. - This is not ideal, but it seems to be the best compromise. If it doesn't - work for you, please let me know the details of how it broke. + parse_lines(), parse_string_document(), and parse_file() set a default output + file handle of C if one was not already set as of Pod::Text 3.18, + included in Perl 5.19.5. - =head1 NOTES + Pod::Text 4.00, included in Perl 5.23.7, aligned the module version and the + version of the podlators distribution. All modules included in podlators, and + the podlators distribution itself, share the same version number from this + point forward. - This is a replacement for an earlier Pod::Text module written by Tom - Christiansen. It has a revamped interface, since it now uses Pod::Simple, - but an interface roughly compatible with the old Pod::Text::pod2text() - function is still available. Please change to the new calling convention, - though. + Pod::Text 4.09, included in Perl 5.25.7, fixed a serious bug on EBCDIC + systems, present in all versions back to 3.00, that would cause opening + brackets to disappear. - The original Pod::Text contained code to do formatting via termcap - sequences, although it wasn't turned on by default and it was problematic to - get it to work at all. This rewrite doesn't even try to do that, but a - subclass of it does. Look for L. + Pod::Text 5.00 now defaults, on non-EBCDIC systems, to UTF-8 encoding if it + sees a non-ASCII character in the input and the input encoding is not + specified. It also commits to an encoding with the first non-ASCII character + and does not change the output encoding if the input encoding changes. The + L module is now used for all output encoding rather than PerlIO + layers, which fixes earlier problems with output to scalars. =head1 AUTHOR - Russ Allbery , based I heavily on the original - Pod::Text by Tom Christiansen and its conversion to - Pod::Parser by Brad Appleton . Sean Burke's initial - conversion of Pod::Man to use Pod::Simple provided much-needed guidance on - how to use Pod::Simple. + Russ Allbery , based I heavily on the original Pod::Text + by Tom Christiansen and its conversion to Pod::Parser + by Brad Appleton . Sean Burke's initial conversion of + Pod::Man to use Pod::Simple provided much-needed guidance on how to use + Pod::Simple. =head1 COPYRIGHT AND LICENSE - Copyright 1999-2002, 2004, 2006, 2008-2009, 2012-2016, 2018 Russ Allbery - + Copyright 1999-2002, 2004, 2006, 2008-2009, 2012-2016, 2018-2019, 2022 Russ + Allbery This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself. =head1 SEE ALSO - L, L, L, L + L, L, L, + L, L, L The current version of this module is always available from its web site at L. It is also part of the @@ -7431,19 +8071,19 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U ############################################################################# package Pod::Usage; - use strict; - use vars qw($VERSION @ISA @EXPORT); - $VERSION = '1.69'; ## Current version of this package + use strict; require 5.006; ## requires this Perl version or later - #use diagnostics; use Carp; use Config; use Exporter; use File::Spec; - @EXPORT = qw(&pod2usage); + our $VERSION = '2.03'; + + our @EXPORT = qw(&pod2usage); + our @ISA; BEGIN { $Pod::Usage::Formatter ||= 'Pod::Text'; eval "require $Pod::Usage::Formatter"; @@ -7525,13 +8165,13 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U my @paths = (ref $pathspec) ? @$pathspec : split($pathsep, $pathspec); for my $dirname (@paths) { - $_ = File::Spec->catfile($dirname, $basename) if length; + $_ = length($dirname) ? File::Spec->catfile($dirname, $basename) : $basename; last if (-e $_) && ($opts{'-input'} = $_); } } ## Now create a pod reader and constrain it to the desired sections. - my $parser = new Pod::Usage(USAGE_OPTIONS => \%opts); + my $parser = Pod::Usage->new(USAGE_OPTIONS => \%opts); if ($opts{'-verbose'} == 0) { $parser->select('(?:SYNOPSIS|USAGE)\s*'); } @@ -7552,8 +8192,8 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U ## Check for perldoc my $progpath = $opts{'-perldoc'} ? $opts{'-perldoc'} : - File::Spec->catfile($Config{scriptdirexp} - || $Config{scriptdir}, 'perldoc'); + File::Spec->catfile($Config{scriptdirexp} || $Config{scriptdir}, + 'perldoc'); my $version = sprintf("%vd",$^V); if ($Config{versiononly} and $Config{startperl} =~ /\Q$version\E$/ ) { @@ -7581,9 +8221,17 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U push @perldoc_cmd, ('-F', $f); unshift @perldoc_cmd, $opts{'-perlcmd'} if $opts{'-perlcmd'}; system(@perldoc_cmd); + # RT16091: fall back to more if perldoc failed if($?) { - # RT16091: fall back to more if perldoc failed - system(($Config{pager} || $ENV{PAGER} || '/bin/more'), $1); + # RT131844: prefer PAGER env + my $pager = $ENV{PAGER} || $Config{pager}; + if(defined($pager) && length($pager)) { + my $cmd = $pager . ' ' . ($^O =~ /win/i ? qq("$f") : quotemeta($f)); + system($cmd); + } else { + # the most humble fallback; should work (at least) on *nix and Win + system('more', $f); + } } } else { croak "Unspecified input file or insecure argument.\n"; @@ -7697,7 +8345,12 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U sub seq_i { return $_[1] } # Override Pod::Text->cmd_i to return just "arg", not "*arg*". # newer version based on Pod::Simple - sub cmd_i { return $_[2] } + sub cmd_i { + my $self = shift; + # RT121489: highlighting should be there with Termcap + return $self->SUPER::cmd_i(@_) if $self->isa('Pod::Text::Termcap'); + return $_[1]; + } # This overrides the Pod::Text method to do something very akin to what # Pod::Select did as well as the work done below by preprocess_paragraph. @@ -7795,13 +8448,15 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U __END__ + =for stopwords pod2usage verboseness downcased MSWin32 Marek Rouchal Christiansen ATOOMIC rjbs McDougall + =head1 NAME - Pod::Usage - print a usage message from embedded pod documentation + Pod::Usage - extracts POD documentation and shows usage information =head1 SYNOPSIS - use Pod::Usage + use Pod::Usage; my $message_text = "This text precedes the usage message."; my $exit_status = 2; ## The exit status to use @@ -7813,13 +8468,13 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U pod2usage($exit_status); pod2usage( { -message => $message_text , - -exitval => $exit_status , - -verbose => $verbose_level, + -exitval => $exit_status , + -verbose => $verbose_level, -output => $filehandle } ); pod2usage( -msg => $message_text , - -exitval => $exit_status , - -verbose => $verbose_level, + -exitval => $exit_status , + -verbose => $verbose_level, -output => $filehandle ); pod2usage( -verbose => 2, @@ -7865,39 +8520,39 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U =item C<-msg> I The text of a message to print immediately prior to printing the - program's usage message. + program's usage message. =item C<-exitval> I The desired exit status to pass to the B function. - This should be an integer, or else the string "NOEXIT" to + This should be an integer, or else the string C to indicate that control should simply be returned without terminating the invoking process. =item C<-verbose> I The desired level of "verboseness" to use when printing the usage message. - If the value is 0, then only the "SYNOPSIS" section of the pod documentation - is printed. If the value is 1, then the "SYNOPSIS" section, along with any - section entitled "OPTIONS", "ARGUMENTS", or "OPTIONS AND ARGUMENTS" is - printed. If the corresponding value is 2 or more then the entire manpage is - printed, using L if available; otherwise L is used for - the formatting. For better readability, the all-capital headings are - downcased, e.g. C =E C. + If the value is 0, then only the "SYNOPSIS" and/or "USAGE" sections of the + pod documentation are printed. If the value is 1, then the "SYNOPSIS" and/or + "USAGE" sections, along with any section entitled "OPTIONS", "ARGUMENTS", or + "OPTIONS AND ARGUMENTS" is printed. If the corresponding value is 2 or more + then the entire manpage is printed, using L if available; otherwise + L is used for the formatting. For better readability, the + all-capital headings are downcased, e.g. C =E C. The special verbosity level 99 requires to also specify the -sections parameter; then these sections are extracted and printed. =item C<-sections> I - There are two ways to specify the selection. Either a string (scalar) + There are two ways to specify the selection. Either a string (scalar) representing a selection regexp for sections to be printed when -verbose is set to 99, e.g. "NAME|SYNOPSIS|DESCRIPTION|VERSION" With the above regexp all content following (and including) any of the - given C<=head1> headings will be shown. It is possible to restrict the + given C<=head1> headings will be shown. It is possible to restrict the output to particular subsections only, e.g.: "DESCRIPTION/Algorithm" @@ -7906,7 +8561,7 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U the C<=head1 DESCRIPTION> section. The regexp binding is stronger than the section separator, such that e.g.: - "DESCRIPTION|OPTIONS|ENVIORNMENT/Caveats" + "DESCRIPTION|OPTIONS|ENVIRONMENT/Caveats" will print any C<=head2 Caveats> section (only) within any of the three C<=head1> sections. @@ -7916,7 +8571,7 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U pod2usage(-verbose => 99, -sections => [ qw(DESCRIPTION DESCRIPTION/Introduction) ] ); - This will print only the content of C<=head1 DESCRIPTION> and the + This will print only the content of C<=head1 DESCRIPTION> and the C<=head2 Introduction> sections, but no other C<=head2>, and no other C<=head1> either. @@ -7950,17 +8605,16 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U =item C<-noperldoc> - By default, Pod::Usage will call L when -verbose >= 2 is - specified. This does not work well e.g. if the script was packed - with L. The -noperldoc option suppresses the external call to - L and uses the simple text formatter (L) to - output the POD. + By default, Pod::Usage will call L when -verbose >= 2 is specified. + This does not work well e.g. if the script was packed with L. This option + suppresses the external call to L and uses the simple text formatter + (L) to output the POD. =item C<-perlcmd> By default, Pod::Usage will call L when -verbose >= 2 is specified. In case of special or unusual Perl installations, - the -perlcmd option may be used to supply the path to a L executable + this option may be used to supply the path to a L executable which should run L. =item C<-perldoc> I @@ -7973,7 +8627,7 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U =item C<-perldocopt> I By default, Pod::Usage will call L when -verbose >= 2 is specified. - The -perldocopt option may be used to supply options to L. The + This option may be used to supply options to L. The string may contain several, space-separated options. =back @@ -8081,7 +8735,7 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U =back - B doesn't force the above conventions upon you, but it will + B does not force the above conventions upon you, but it will use them by default if you don't expressly tell it to do otherwise. The ability of B to accept a single number or a string makes it convenient to use as an innocent looking error message handling function: @@ -8270,7 +8924,7 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U By default, B will use C<$0> as the path to the pod input file. Unfortunately, not all systems on which Perl runs will set C<$0> - properly (although if C<$0> isn't found, B will search + properly (although if C<$0> is not found, B will search C<$ENV{PATH}> or else the list specified by the C<-pathlist> option). If this is the case for your system, you may need to explicitly specify the path to the pod docs for the invoking script using something @@ -8287,23 +8941,45 @@ $fatpacked{"Pod/Usage.pm"} = '#line '.(1+__LINE__).' "'.__FILE__."\"\n".<<'POD_U use FindBin; pod2usage(-input => $FindBin::Bin . "/" . $FindBin::Script); - =head1 AUTHOR + =head1 SUPPORT + + This module is managed in a GitHub repository, + L Feel free to fork and contribute, or + to clone and send patches! + + Please use L to file a bug + report. The previous ticketing system, + L, is deprecated for + this package. - Please report bugs using L. + More general questions or discussion about POD should be sent to the + C mail list. Send an empty email to + C to subscribe. + + =head1 AUTHOR Marek Rouchal Emarekr@cpan.orgE + Nicolas R Enicolas@atoomic.orgE + Brad Appleton Ebradapp@enteract.comE Based on code for B written by Tom Christiansen Etchrist@mox.perl.comE + =head1 LICENSE + + Pod::Usage (the distribution) is licensed under the same terms as Perl. + =head1 ACKNOWLEDGMENTS + Nicolas R (ATOOMIC) for setting up the Github repo and modernizing this + package. + rjbs for refactoring Pod::Usage to not use Pod::Parser any more. - Steven McDougall Eswmcd@world.std.comE for his help and patience - with re-writing this manpage. + Steven McDougall Eswmcd@world.std.comE for his help and patience with + re-writing this manpage. =head1 SEE ALSO @@ -9115,7 +9791,7 @@ use Sys::Hostname; use constant HAS_TEXT_DIFF => eval { require Text::Diff; 1 }; -our $VERSION = '1.306'; +our $VERSION = '1.400'; use constant { PROG => (File::Spec->splitpath($0))[2], @@ -9181,7 +9857,7 @@ use constant { my $offline; # No network connection my $remove_all; -my $key_type = 'rsa'; +my $key_type = 'ed25519'; my $key_bits = 4096; my $github_default = ''; @@ -9444,7 +10120,7 @@ foreach my $user (@github_accounts) { printf "Creating private key %s for user %s...\n", compress_path($u->{key_file}), $user; - $u->{key_comment} = hostname."/$user\@github.com"; + $u->{key_comment} //= hostname."/$user\@github.com"; system 'ssh-keygen', '-t', delete $u->{key_type}, @@ -9552,7 +10228,8 @@ if (@github_accounts) { # To rebuild __DATA__ (if GitHub ever revokes its host keys): - # ssh-keyscan -t dsa,rsa github.com gist.github.com ssh.github.com + # ssh-keyscan -t ed25519 github.com gist.github.com 2>/dev/null + # ssh-keyscan -t ed25519 -p 443 ssh.github.com 2>/dev/null my $size = -e KNOWN_HOSTS_FILE ? (stat KNOWN_HOSTS_FILE)[7] @@ -9593,7 +10270,7 @@ if (@github_accounts) { # '6.6.1p1' => 0x6611 # '4.6p1' => 0x4601 my $SSH_VERSION = do { - my @V = $SSH_V =~ /^OpenSSH_([0-9]+)\.([0-9]+)(?:\.([0-9]+))?p([0-9]+)/; + my @V = $SSH_V =~ /^OpenSSH(?:[^0-9]*)_([0-9]+)\.([0-9]+)(?:\.([0-9]+))?p([0-9]+)/; ($V[0] << 12) | ($V[1] << 8) | (($V[2] || 0) << 4) | $V[3] }; @@ -9602,9 +10279,16 @@ if (@github_accounts) { # Algorithms subset recommended by Stribika # See https://stribika.github.io/2015/01/04/secure-secure-shell.html # Last in each list is the minimum supported by GitHub - # TODO: some algorithms are imported by OpenSSH from OpenSSL at runtime - # so we could check the output of `openssl list-cipher-algorithms` + # TODO: Some algorithms are imported by OpenSSH from OpenSSL at runtime + # so we could check the output of `openssl list -cipher-algorithms` # in the case `ssh -Q cipher` is not implemented (old OpenSSH). + # However: + # - this is less and less relevant as recent version of OpenSSH get + # deployed + # - we would have to parse various versions of the openssl tool output + # (`openssl list -cipher-algorithms` vs + # `openssl list-cipher-algorithms`: see + # tools/list-openssl-algorithms) my %algorithms = ( # Ciphers: ssh -Q cipher cipher => [ @@ -9700,7 +10384,7 @@ HostName github.com Host github.com gist.github.com ssh.github.com 443.github.com *.github.com *.gist.github.com *.ssh.github.com *.443.github.com User git -# Knwon options which are available only in some OpenSSH versions +# Known options which are available only in some OpenSSH versions IgnoreUnknown Protocol,UseRoaming,UseKeychain # Force SSH2 Protocol 2 @@ -10049,11 +10733,11 @@ Remove all accounts, except those following on the command-line. =item -t I -Set default key type for key creation. Default is C. +Set default key type for key creation. Default is C. =item -b I -Set default key bits for key creation. Default is C<2048>. +Set default key bits for key creation. Default is C<4096>. =back @@ -10165,7 +10849,7 @@ Olivier MenguE, L. =head1 COPYRIGHT & LICENSE -Copyright E 2011-2022 Olivier MenguE. +Copyright E 2011-2025 Olivier MenguE. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -10178,7 +10862,7 @@ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License -along with this program. If not, see L. +along with this program. If not, see L. =cut diff --git a/github-keygen.cmd b/github-keygen.cmd index 489b0a8..d3b6e3a 100644 --- a/github-keygen.cmd +++ b/github-keygen.cmd @@ -4,7 +4,7 @@ setlocal :: Add %GIT_HOME% to %PATH%: this should provide perl.exe and ssh-keygen.exe :: Unfortunately msysgit only bundles perl 5.8.8 and no Pod::Usage :: See https://github.com/msysgit/msysgit/issues/61 -for %%f in (git.cmd git.exe) do if not !%%~d$PATH:f==! for /D %%i in ("%%~dp$PATH:f..\bin") do path %PATH%;%%~fi +for %%f in (git.cmd git.exe) do if not !%%~d$PATH:f==! for /D %%i in ("%%~dp$PATH:f..\bin" "%%~dp$PATH:f..\usr\bin") do if exist %%~fi\perl.exe path %PATH%;%%~fi ::echo %PATH% ::ssh -V perl %~dpn0 %*