Safely truncate HTML markup while preserving tags, handling entities, and supporting Unicode/emoji with optional word-safe truncation.
I recommend installing Shorten with Composer:
composer require marcgoertz/shorten
Of course you can also just require it in your scripts directly.
<?php
use Marcgoertz\Shorten\Shorten;
$shorten = new Shorten();
print $shorten->truncateMarkup('<a href="https://example.com/">Go to example site</a>', 10);
?>
Output:
<a href="https://example.com/">Go to exam</a>…
truncateMarkup(
string $markup,
int $length = 400,
string $appendix = '…',
bool $appendixInside = false,
bool $wordsafe = false,
string $delimiter = ' '
): string
string $markup
: Text containing markup
int $length
: Maximum length of truncated text (default: 400)
string $appendix
: Text added after truncated text (default: '…')
bool $appendixInside
: Add appendix to last content in tags, increases $length by 1 (default: false)
bool $wordsafe
: Wordsafe truncation, cuts at word boundaries (default: false)
string $delimiter
: Delimiter for wordsafe truncation (default: ' ')
<?php
use Marcgoertz\Shorten\Shorten;
$shorten = new Shorten();
// Basic truncation
$result = $shorten->truncateMarkup('<b>Hello world test</b>', 10);
// Output: <b>Hello worl</b>…
// Appendix inside tags
$result = $shorten->truncateMarkup('<b>Hello world test</b>', 10, '...', true);
// Output: <b>Hello worl...</b>
// Wordsafe truncation (cuts at word boundaries)
$result = $shorten->truncateMarkup('<b>Hello world test</b>', 10, '...', false, true);
// Output: <b>Hello</b>...
// Custom delimiter for wordsafe truncation
$result = $shorten->truncateMarkup('<b>Hello-world-test</b>', 10, '...', false, true, '-');
// Output: <b>Hello</b>...
// Preserves HTML structure with nested tags
$result = $shorten->truncateMarkup('<div><b><i>Hello world</i></b></div>', 8);
// Output: <div><b><i>Hello wo</i></b></div>…
// Handles HTML entities correctly
$result = $shorten->truncateMarkup('<b>Café &amp; Restaurant</b>', 8);
// Output: <b>Café &amp; Re</b>…
?>
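The wordsafe examples above cut the text back to the last delimiter before the limit. The following is a minimal sketch of that idea on plain text without tags, assuming the mbstring extension is available. `truncateWordsafe` is a hypothetical helper written for illustration; it is not part of Shorten and not necessarily how the library implements the option.

```php
<?php
// Hypothetical helper, NOT part of Shorten: word-safe truncation of plain text.
function truncateWordsafe(string $text, int $length, string $delimiter = ' '): string
{
    // Nothing to do if the text already fits
    if (mb_strlen($text, 'UTF-8') <= $length) {
        return $text;
    }

    // Hard cut at the character limit (multibyte-safe)
    $cut = mb_substr($text, 0, $length, 'UTF-8');

    // If the delimiter follows the cut directly, the cut already ends on a word boundary
    $next = mb_substr($text, $length, mb_strlen($delimiter, 'UTF-8'), 'UTF-8');
    if ($next === $delimiter) {
        return $cut;
    }

    // Otherwise drop the partial word after the last delimiter inside the cut
    $pos = mb_strrpos($cut, $delimiter, 0, 'UTF-8');

    return $pos === false ? $cut : mb_substr($cut, 0, $pos, 'UTF-8');
}

echo truncateWordsafe('Hello world test', 10), "\n";      // Hello
echo truncateWordsafe('Hello-world-test', 10, '-'), "\n"; // Hello
```

If the cut contains no delimiter at all, this sketch falls back to the hard cut rather than returning an empty string.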
- ✅ Preserves HTML tag structure and proper nesting
- ✅ Handles HTML entities correctly
- ✅ Supports self-closing tags (both XML and HTML5 style)
- ✅ UTF-8 and multibyte character support (including emojis)
- ✅ Wordsafe truncation to avoid cutting words in the middle
- ✅ Configurable appendix text and placement
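
To illustrate why entity and multibyte handling matter for the length limit, here is a small sketch that counts only the characters a reader would see. It assumes the mbstring extension; `visibleLength` is a hypothetical helper for illustration and is not part of Shorten's API.

```php
<?php
// Hypothetical helper, NOT part of Shorten: count visible characters,
// ignoring tags and treating each HTML entity as a single character.
function visibleLength(string $markup): int
{
    $text = strip_tags($markup);                                        // drop tags
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8'); // &amp; -> &
    return mb_strlen($text, 'UTF-8');                                   // characters, not bytes
}

echo visibleLength('<b>Caf&eacute; &amp; Restaurant</b>'), "\n"; // 17

// Byte length and character length differ for multibyte text
$word = "Caf\u{E9}"; // "Café", with é encoded as two bytes in UTF-8
echo strlen($word), ' ', mb_strlen($word, 'UTF-8'), "\n"; // 5 4
```

A byte-based cut such as `substr($word, 0, 4)` would split the two-byte `é`, which is the kind of breakage multibyte-aware truncation avoids.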
marcgoertz/shorten-twig is a Twig extension for this package.
MIT © Marc Görtz