Merge branch 'feature/en' into develop
fgmacedo committed Oct 1, 2013
2 parents e86c0c4 + 39469e1 commit d7ce7ac
Showing 33 changed files with 1,016 additions and 957 deletions.
6 changes: 4 additions & 2 deletions .coveragerc
@@ -1,10 +1,12 @@
[run]
include=*raspador*
omit=tasks.py
omit=tasks.py,*ordereddict*

[report]
exclude_lines =

raise NotImplementedError

if __name__ == '__main__':
if __name__ == '__main__':

except ImportError:
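
As a rough illustration of the `exclude_lines` entries above: coverage.py treats each entry as a regular expression and leaves matching lines out of the report. The sketch below only mimics that matching step with `re`. `EXCLUDE_PATTERNS`, `excluded_line_numbers` and `SAMPLE` are hypothetical names for this example, not part of coverage.py or raspador.

```python
import re

# Patterns copied from the [report] exclude_lines setting above.
EXCLUDE_PATTERNS = [
    r"raise NotImplementedError",
    r"if __name__ == '__main__':",
    r"except ImportError:",
]


def excluded_line_numbers(source):
    """Return the 1-based numbers of lines matching any exclusion pattern."""
    return [
        number
        for number, line in enumerate(source.splitlines(), 1)
        if any(re.search(pattern, line) for pattern in EXCLUDE_PATTERNS)
    ]


# Hypothetical source text, used only to show which lines would match.
SAMPLE = """\
try:
    from collections import OrderedDict
except ImportError:
    from ordereddict import OrderedDict

def parse(lines):
    raise NotImplementedError
"""

print(excluded_line_numbers(SAMPLE))
```

The real tool works on parsed source rather than raw text lines, so treat this only as a way to check which lines a pattern would catch.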
1 change: 1 addition & 0 deletions .gitignore
@@ -5,6 +5,7 @@ MANIFEST

# Virtualenvs
env*
.tox

# Distribute
build
1 change: 0 additions & 1 deletion .travis.yml
@@ -6,7 +6,6 @@ python:
- "3.3"
- "pypy"
install:
- "pip install -r requirements_dev.txt --use-mirrors"
# For Python 2.6 support
- "pip install ordereddict --use-mirrors"
- "pip install coveralls"
84 changes: 37 additions & 47 deletions README.rst
@@ -2,7 +2,7 @@
raspador
========

.. image:: https://api.travis-ci.org/fgmacedo/raspador.png
.. image:: https://api.travis-ci.org/fgmacedo/raspador.png?branch=master
:target: https://travis-ci.org/fgmacedo/raspador

.. image:: https://coveralls.io/repos/fgmacedo/raspador/badge.png
@@ -20,7 +20,7 @@ Biblioteca para extração de dados em documentos semi-estruturados.
Extractors are defined through classes that act as models, much like
the Django ORM. Each extractor searches for a pattern given as a
regular expression, and conversion to primitive types is done
automaticamente a partir dos grupos capturados.
automatically from the captured groups.
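
The conversion from captured groups to primitive types described above can be pictured with a small standard-library sketch. It is not raspador's implementation: the `Field` class, its `convert` argument and the `SIZE` sample value are assumptions made for this illustration.

```python
import re


class Field(object):
    """Search a line for a pattern and convert the first captured group."""

    def __init__(self, pattern, convert=str):
        self.regex = re.compile(pattern)
        self.convert = convert  # callable applied to the captured text

    def extract(self, line):
        match = self.regex.search(line)
        if match is None:
            return None  # the pattern does not occur in this line
        return self.convert(match.group(1))


# Hypothetical sample line; SIZE is not part of the README example.
line = 'PART:/dev/sda1 SIZE:512 TYPE:ext4'
size = Field(r'SIZE:(\d+)', convert=int)
kind = Field(r'TYPE:([^\s]+)')

print(size.extract(line))
print(kind.extract(line))
```

Keeping the converter as a plain callable means an integer, date or boolean field differs from a string field only in the function applied to the captured text.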


The parser is implemented as a generator, where each item found can be
@@ -39,66 +39,55 @@ utilização de conceitos e recursos como iteradores, geradores, meta-programaç
and property descriptors.
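
A minimal sketch of the generator idea, using only the standard library. It assumes a record starts at every line matching a `begin` pattern. The function `parse`, its parameters and the sample log lines are hypothetical and do not reproduce raspador's `Parser` class.

```python
import re


def parse(lines, begin, fields):
    """Yield one dict per record; a record starts at a line matching `begin`."""
    begin_re = re.compile(begin)
    compiled = dict((name, re.compile(p)) for name, p in fields.items())
    item = None
    for line in lines:
        if begin_re.search(line):
            if item is not None:
                yield item  # the previous record is complete
            item = {}
        if item is None:
            continue  # ignore lines that come before the first record
        for name, regex in compiled.items():
            match = regex.search(line)
            if match:
                item[name] = match.group(1)
    if item is not None:
        yield item  # flush the last record


# Hypothetical log lines in the style of the README example.
LOG = """\
PART:/dev/sda1 UUID:aaaa-1111 TYPE:ext4
PART:/dev/sda2 UUID:bbbb-2222 TYPE:swap
"""

FIELDS = {
    'PART': r'PART:([^\s]+)',
    'UUID': r'UUID:([^\s]+)',
    'TYPE': r'TYPE:([^\s]+)',
}

records = parse(iter(LOG.splitlines()), r'^PART', FIELDS)
first = next(records)  # only as much input as needed has been consumed
print(first['PART'])
```

Because `parse` is a generator, the caller can stop after the first record or stream a large file without holding every item in memory.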


Compatibilidade e dependências
===============================
Compatibility and dependencies
==============================

O raspador é compatível com Python 2.6, 2.7, 3.2, 3.3 e pypy.
raspador runs on Python 2.6+, 3.2+ and pypy.

Desenvolvimento realizado em Python 2.7.5 e Python 3.2.3.

Não há dependências externas.
There are no external dependencies.

.. note:: Python 2.6

Em Python 2.6, a biblioteca `ordereddict
<https://pypi.python.org/pypi/ordereddict/>`_ é necessária.
With Python 2.6, you must install `ordereddict
<https://pypi.python.org/pypi/ordereddict/>`_.

Você pode instalar com pip::
You can install it with pip::

pip install ordereddict
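
For context, code that has to run on Python 2.6 usually falls back to the backport at import time. The snippet below is a sketch of that common pattern, not a copy of raspador's source. The sample keys and values come from the README example further down.

```python
try:
    # Python 2.7+ and 3.x ship OrderedDict in the standard library.
    from collections import OrderedDict
except ImportError:
    # Python 2.6: use the backport installed with pip.
    from ordereddict import OrderedDict

fields = OrderedDict()
fields['PART'] = '/dev/sda1'
fields['UUID'] = '423k34-3423lk423-sdfsd-43'
fields['TYPE'] = 'ext4'

# Keys come back in insertion order on every supported version.
print(list(fields.keys()))
```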

Testes
Tests
======

Os testes dependem de algumas bibliotecas externas:
To automate tests with all supported Python versions at once, we use `tox
<http://tox.readthedocs.org/en/latest/>`_.

.. code-block:: text
coverage==3.6
nose==1.3.0
flake8==2.0
invoke==0.5.0
Você pode executar os testes com ``nosetests``:
Run all tests with:

.. code-block:: bash
$ nosetests
$ tox
E adicionalmente, verificar a compatibilidade com o PEP8:
Tests depend on several third-party libraries, but tox installs these in
each Python version's virtualenv:

.. code-block:: bash
$ flake8 raspador testes
Ou por conveniência, executar os dois em sequência com invoke:

.. code-block:: bash
.. code-block:: text
$ invoke test
nose==1.3.0
coverage==3.6
flake8==2.0
Exemplos
Examples
========

Extrator de dados em logs
-------------------------
Extract data from logs
----------------------

.. code-block:: python
from __future__ import print_function
import json
from raspador import Analizador, CampoString
from raspador import Parser, StringField
out = """
PART:/dev/sda1 UUID:423k34-3423lk423-sdfsd-43 TYPE:ext4
@@ -107,22 +96,23 @@ Extrator de dados em logs
"""
class AnalizadorDeLog(Analizador):
inicio = r'^PART.*'
fim = r'^PART.*'
PART = CampoString(r'PART:([^\s]+)')
UUID = CampoString(r'UUID:([^\s]+)')
TYPE = CampoString(r'TYPE:([^\s]+)')
class LogParser(Parser):
begin = r'^PART.*'
end = r'^PART.*'
PART = StringField(r'PART:([^\s]+)')
UUID = StringField(r'UUID:([^\s]+)')
TYPE = StringField(r'TYPE:([^\s]+)')
a = AnalizadorDeLog()
a = LogParser()
# res é um gerador
res = a.analizar(linha for linha in out.splitlines())
# res is a generator
res = a.parse(iter(out.splitlines()))
print (json.dumps(list(res), indent=2))
out_as_json = json.dumps(list(res), indent=2)
print (out_as_json)
# Saída:
# Output:
"""
[
{
22 changes: 11 additions & 11 deletions docs/source/index.rst
@@ -1,17 +1,17 @@
Documentação do raspador
========================
.. _topics-index:

================================
Raspador |version| documentation
================================

Conteúdo:

.. toctree::
:maxdepth: 2

raspador
.. toctree::
:hidden:

intro/overview
intro/install
intro/tutorial

Índices e tabelas
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
Looking for specific information? Try the :ref:`genindex` or :ref:`modindex`.
28 changes: 28 additions & 0 deletions docs/source/intro/install.rst
@@ -0,0 +1,28 @@

*******
Install
*******


Package managers
================

You can install using pip or easy_install.

PIP::

pip install raspador

Easy install::

easy_install raspador


From source
===========

Download and install from source::

git clone https://github.com/fgmacedo/raspador.git
cd raspador
python setup.py install
13 changes: 7 additions & 6 deletions docs/source/raspador.rst
@@ -1,30 +1,31 @@

========
raspador
========

The raspador module provides a generic structure for extracting data from
semi-structured text files.


Analizador
Parser
----------

.. automodule:: raspador.analizador
.. automodule:: raspador.parser
:members:


Fields
------

.. automodule:: raspador.campos
.. automodule:: raspador.fields
:members:
:undoc-members:


Coleções
--------
Item
----

.. automodule:: raspador.colecoes
.. automodule:: raspador.item
:members:
:undoc-members:

9 changes: 5 additions & 4 deletions raspador/__init__.py
@@ -1,9 +1,10 @@
# flake8: noqa

from .analizador import Analizador, Dicionario
from .campos import CampoBase, CampoString, CampoNumerico, \
CampoInteiro, CampoData, CampoDataHora, CampoBooleano
from .parser import Parser
from .item import Dictionary
from .fields import BaseField, StringField, FloatField, BRFloatField, \
IntegerField, DateField, DateTimeField, BooleanField

from .decoradores import ProxyDeCampo, ProxyConcatenaAteRE
from .decorators import FieldProxy, UnionUntilRegexProxy

from .cache import Cache