Pig #16

Merged 29 commits on Sep 25, 2015
Commits
32f3ccd
Initial add
akshaisarma Aug 6, 2015
e7d51e4
Merge branch 'pig' of github.com:yahoo/validatar into pig
akshaisarma Aug 13, 2015
d16700f
Basic Pig framework. Need to fetch results
akshaisarma Sep 11, 2015
000e66f
Pig overall. Untested. Modified Hive to use helpers.
akshaisarma Sep 11, 2015
a90c428
Extracting option constants. Adding hadoop. Cleaning up Pig.
akshaisarma Sep 12, 2015
3bd920d
Fixing checkstyle and adding other Pig dependencies
akshaisarma Sep 12, 2015
633c461
Adding a full pig test. Cleaning up some more. Cobertura is broken
akshaisarma Sep 12, 2015
7d084e8
Adding trial clover
akshaisarma Sep 16, 2015
c2a8470
Testing if latest coveralls maven supports clover
akshaisarma Sep 16, 2015
4331ce5
Adding default test
akshaisarma Sep 16, 2015
4448b1b
Fixing Collectors toMap issue with null values. Adding all Pig tests …
akshaisarma Sep 17, 2015
1a50740
Adding type tests. Full coverage
akshaisarma Sep 17, 2015
d79b892
Adding ParseManager tests
akshaisarma Sep 18, 2015
166e23d
Adding Format tests
akshaisarma Sep 18, 2015
d17d9ed
Adding reflection tests and exception test for printing
akshaisarma Sep 18, 2015
d2a70c1
Removing unnecessary defaults in TypeSystem. Cannot test two lines in…
akshaisarma Sep 18, 2015
2b9861e
Documentation first changes
akshaisarma Sep 18, 2015
6d5df56
Documentation first changes
akshaisarma Sep 18, 2015
97c4a2a
Fixing checkstyle. Adding jacoco. Binding to various mvn lifecycles.
akshaisarma Sep 18, 2015
92aeaf0
Removing install as it just wastes time
akshaisarma Sep 18, 2015
ca0c1ac
full is just release artifacts
akshaisarma Sep 18, 2015
cb61492
Updating readme
akshaisarma Sep 18, 2015
966d331
Updating readme
akshaisarma Sep 18, 2015
0d39189
Coverage reached
akshaisarma Sep 19, 2015
e3a9d7e
Merge branch 'pig' of github.com:yahoo/validatar into pig
akshaisarma Sep 19, 2015
c4a7c84
README cleanup
akshaisarma Sep 19, 2015
d9dcd21
Adding shading for dom4j. Added examples and how to run for pig.
akshaisarma Sep 24, 2015
adf6697
Fixing readme
akshaisarma Sep 24, 2015
81a41a4
Fixing readme
akshaisarma Sep 24, 2015
7 changes: 6 additions & 1 deletion .travis.yml
@@ -3,8 +3,13 @@ language: java
jdk:
- oraclejdk8

install: true

script:
- mvn verify

after_success:
- mvn clean cobertura:cobertura coveralls:report
- mvn coveralls:report
- test "${TRAVIS_PULL_REQUEST}" == "false" && test "${TRAVIS_TAG}" != "" && mvn deploy --settings travis/settings.xml

cache:
8 changes: 4 additions & 4 deletions Makefile
@@ -1,13 +1,13 @@
all: full

full:
mvn clean checkstyle:check cobertura:cobertura javadoc:jar package
mvn clean javadoc:jar package

clean:
mvn clean

test:
mvn clean checkstyle:check test
mvn clean verify

jar:
mvn clean package
@@ -16,13 +16,13 @@ release:
mvn -B release:prepare release:clean

coverage:
mvn clean cobertura:cobertura
mvn clean clover2:setup test clover2:aggregate clover2:clover

doc:
mvn clean javadoc:javadoc

see-coverage: coverage
cd target/site/cobertura; python -m SimpleHTTPServer
cd target/site/clover; python -m SimpleHTTPServer

see-doc: doc
cd target/site/apidocs; python -m SimpleHTTPServer
37 changes: 28 additions & 9 deletions README.md
@@ -2,7 +2,9 @@

[![Build Status](https://travis-ci.org/yahoo/validatar.svg?branch=master)](https://travis-ci.org/yahoo/validatar) [![Coverage Status](https://coveralls.io/repos/yahoo/validatar/badge.svg?branch=master)](https://coveralls.io/r/yahoo/validatar?branch=master) [![Download](https://api.bintray.com/packages/yahoo/maven/validatar/images/download.svg)](https://bintray.com/yahoo/maven/validatar/_latestVersion)

Functional testing framework for Big Data pipelines. Current support is only for Hive, but we are planning support for Pig as well as others.
Functional testing framework for Big Data pipelines. Currently supports querying pipeline results through Hive (HiveServer2) and Pig (PigServer).

Validatar is currently compiled against *Pig-0.14*. Running against an older or newer version may cause issues if interfaces have changed. In our experience these issues are minor and can be resolved with small changes to the engine code.

## How to build Validatar

@@ -14,40 +16,56 @@ Run:

## How to run

To run Validatar:
For help, run hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App --help (or -h).

### To run Hive tests in Validatar:

export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/path/to/hive/jdbc/lib/jars/*"
hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App -s tests/ --report report.xml
hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App -s tests/ --report report.xml --hive-jdbc ...

You will also need the settings specified for the engine you are planning to run.
Hive needs the JDBC URI of HiveServer2. Note that the DB is part of the URI. Do not add it if your queries use fully qualified table names of the form:
```
... FROM DB.TABLE WHERE ...

--hive-jdbc "jdbc:hive2://<URI>/<DB>;<Optional params: E.g. sasl.qop=auth;principal=hive/<PRINCIPAL_URL> etc>"
```

### To run Pig tests in Validatar:

export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/path/to/pig/lib/*" (Add other jars here depending on your pig exec type or if hive/hcat is used in Pig)
hadoop jar validatar-jar-with-dependencies.jar com.yahoo.validatar.App -s tests/ --report report.xml --pig-exec-type mr --pig-setting 'mapreduce.job.acl-view-job=*' ...

Pig parameters are not supported in the Pig query. Instead, use our parameter substitution (see below).

## Writing Tests

### Test file format

Test files are written in the YAML format. The schema is as follows:
Test files are written in the YAML format. See example in src/test/resources/. The schema is as follows:

```
name: Test family name : String
description: Test family description : String
queries:
- name: Query name : String : Ex "Analytics"
engine: Execution engine : String ("Hive")
engine: Execution engine : String : Ex "hive" or "pig"
value: Query : String : Ex "SELECT COUNT(*) AS pv_count FROM page_data"
...
tests:
- name: Test name : String
description: Test description : String
asserts:
- Assertion on some query. Query name is prefixed to the value. : Ex: Analytics.pv_count > 10000
- Assertion on some query. Query name is prefixed to the value. : Ex Analytics.pv_count > 10000
...
```

Queries are named, and this name is used as a namespace for all the values returned from the query. In the above example, we created a query named "Analytics". It stores the return value "pv_count". We can then use this value in our later asserts.

Validatar can run a single test file or a folder of test files. Use the --help option to see more details.
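
As an illustration of the schema above, a minimal test file might look like the following sketch. The query name, query text, and assertion are taken from the schema's own examples; the family name, the descriptions, and the file name are assumptions made for illustration and are not taken from the repository.

```yaml
# Hypothetical test file, e.g. tests/analytics.yaml (file name is an assumption)
name: Page view checks
description: Sanity checks on the page_data table
queries:
  # The query name "Analytics" becomes the namespace for the returned columns
  - name: Analytics
    engine: hive
    value: "SELECT COUNT(*) AS pv_count FROM page_data"
tests:
  - name: Page views are present
    description: The page view count should exceed a minimum threshold
    asserts:
      # pv_count is referenced through the query name it came from
      - Analytics.pv_count > 10000
```

A file like this would sit in the folder passed with -s (tests/ in the commands above), so that it is picked up along with any other test files there.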

### Assertions

Assertions are quite flexibile, allowing for the following operations:
Assertions are quite flexible, allowing for the following operations:

```
> : greater than
@@ -82,10 +100,11 @@ Version | Notes
0.1.8 | Null types in Hive results fix
0.1.9 | Empty results handling bug fix
0.2.0 | Internal switch to Java 8. hive-queue is no longer a setting. Use hive-setting.
0.3.0 | Pig support added.

## Members

Akshai Sarma, akshaisarma@gmail.com
Akshai Sarma, akshaisarma@gmail.com
Josh Walters, josh@joshwalters.com

## Contributors