ATTENTION: This package has been replaced by the Unicode-Normalization package!
A Composer package providing a stream filter that normalizes Unicode. Currently only UTF-8 is supported.
php composer.phar require sjorek/unicode-normalization-stream-filter
<?php
\Sjorek\UnicodeNormalization\StreamFilter::register();
$in_file = fopen('utf8-file.txt', 'r');
$out_file = fopen('utf8-normalized-to-nfc-file.txt', 'w');
// It works as a read filter:
stream_filter_append($in_file, 'convert.unicode-normalization.NFC');
// And it also works as a write filter:
// stream_filter_append($out_file, 'convert.unicode-normalization.NFC');
stream_copy_to_stream($in_file, $out_file);
<?php
/**
 * @var resource $stream     The stream to filter.
 * @var string   $form       The form to normalize Unicode to, e.g. NFC.
 * @var int      $read_write STREAM_FILTER_* constant to override the filter injection point.
 *
 * @link http://php.net/manual/en/function.stream-filter-append.php
 * @link http://php.net/manual/en/function.stream-filter-prepend.php
 */
stream_filter_append($stream, "convert.unicode-normalization.$form", $read_write);
Note: Be careful when using the filter on streams opened in r+ or w+ (or similar) modes; by default PHP will assign the filter to both the reading and the writing chain. This means it will attempt to convert the data twice: first when reading from the stream, and once again when writing to it.
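The effect described in the note can be made visible with a minimal sketch. It assumes PHP's built-in `string.rot13` filter as a stand-in for the normalization filter, because applying rot13 twice restores the original text and so exposes the double conversion. The helper `roundTrip()` is hypothetical and not part of this package.

```php
<?php
/**
 * Hypothetical helper: writes $text to a temporary w+ stream with the
 * built-in string.rot13 filter attached, then reads the content back.
 *
 * @param string   $text  The text to write.
 * @param int|null $chain STREAM_FILTER_* constant, or null for PHP's default.
 */
function roundTrip(string $text, ?int $chain): string
{
    $stream = tmpfile(); // tmpfile() opens the stream in read-write mode
    if ($stream === false) {
        throw new RuntimeException('Could not create a temporary file.');
    }

    if ($chain === null) {
        // Default: the filter lands on both the read and the write chain.
        stream_filter_append($stream, 'string.rot13');
    } else {
        // Explicit injection point: only the requested chain is filtered.
        stream_filter_append($stream, 'string.rot13', $chain);
    }

    fwrite($stream, $text);
    rewind($stream);
    $result = stream_get_contents($stream);
    fclose($stream);

    return $result;
}

echo roundTrip('Hello', null), "\n";                // filtered on write and again on read
echo roundTrip('Hello', STREAM_FILTER_WRITE), "\n"; // filtered on write only
```

With the default injection point the text passes through the filter twice, so the first call prints the original text again; the second call prints the text filtered once. For a read-write stream, passing `STREAM_FILTER_READ` or `STREAM_FILTER_WRITE` as the third argument should likewise keep the normalization filter from running on both chains.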
See the contribution guidelines.