HTML extraction library, based on SimpleXML and nokogiri XpathSubquery.php
- Simple
- Minimal code
- Fast
- Query results are SimpleXMLElement instances
- Supports nested css/xpath queries
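The last two points rest on how SimpleXML itself behaves: an XPath query returns SimpleXMLElement objects, and each of those can be queried again. The sketch below shows this with plain PHP (DOM and SimpleXML) only. It does not use this library's API and is not a description of its internals, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical sample markup, not taken from the library or its tests.
$html = '<html><body>'
      . '<div class="post"><a class="post_title" href="/one">One</a></div>'
      . '<div class="post"><a class="post_title" href="/two">Two</a></div>'
      . '</body></html>';

// Parse the HTML with DOM, then hand the tree over to SimpleXML.
$dom = new DOMDocument();
$dom->loadHTML($html);
$root = simplexml_import_dom($dom);

// Outer query: every div whose class attribute is exactly "post".
$posts = $root->xpath('//div[@class="post"]');

$hrefs = [];
foreach ($posts as $post) {
    // Each result is a SimpleXMLElement, so it can be queried again.
    // The leading "." keeps the inner query relative to this node.
    $links = $post->xpath('.//a/@href');
    $hrefs[] = (string) $links[0];
}

print_r($hrefs);
```

Without the leading `.` the inner expression would search the whole document again instead of the current node.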
#Using Packagist:
composer require 'fizzka/extractor'
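<?php
require_once 'vendor/autoload.php';

// Fetch the page; gzdecode() assumes the response body is gzip-compressed
$html = gzdecode(file_get_contents('http://habrahabr.ru/'));
$ex = Extractor::fromHtml($html);

// Dump the result of a single query
var_dump($ex->get('a.habracut'));

// Nested query: first css match, then an xpath query relative to it
echo $ex->cssPathFirst('div.post')->xpathFirst('.//@href');

// Iterate over all css matches and query inside each one
foreach ($ex->cssPath('div.post') as $post) {
    var_dump($post->cssPathFirst('a.post_title'));
}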
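The example above passes the downloaded body straight to `gzdecode()`, which only works if the server really sent gzip data. A minimal sketch of a more defensive variant follows, assuming the zlib extension is loaded. `decodeIfGzipped` is a hypothetical helper name and is not part of this library.

```php
<?php
// Hypothetical helper, not part of the library.
function decodeIfGzipped(string $body): string
{
    // Gzip streams start with the magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            return $decoded;
        }
    }
    // Not gzip (or decoding failed): return the body unchanged.
    return $body;
}

$plain = '<html><body><p>hi</p></body></html>';
$fromPlain = decodeIfGzipped($plain);            // plain input passes through
$fromGzip  = decodeIfGzipped(gzencode($plain));  // gzip input is decoded
var_dump($fromPlain === $plain, $fromGzip === $plain);
```

The result can then be handed to `Extractor::fromHtml()` in place of the direct `gzdecode()` call.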
Just run phpunit from the top of the project.
Feel free to use & contribute ;)
MIT