UBC-Stat-ML/briefj

Summary

briefj contains utilities for writing succinct Java.

Installation

Prerequisite software:

  • Java SDK 1.6+
  • Gradle version 1.9+ (not tested on Gradle 2.0)

There are several options available to install the package:

Integrate to a gradle script

Add the following lines to your build script, replacing 1.0.0 with the current version (see the git tags):

repositories {
 mavenCentral()
 jcenter()
 maven {
    url "http://www.stat.ubc.ca/~bouchard/maven/"
  }
}

dependencies {
  compile group: 'ca.ubc.stat', name: 'briefj', version: '1.0.0'
}

Compile using the provided gradle script

  • Check out the source: git clone git@github.com:alexandrebouchard/briefj.git
  • Compile using: gradle installApp
  • Add the jars in build/install/briefj/lib/ to your classpath

Use in Eclipse

  • Check out the source: git clone git@github.com:alexandrebouchard/briefj.git
  • Run gradle eclipse from the root of the repository
  • From Eclipse:
    • Choose Import in the File menu
    • Choose Import existing projects into workspace
    • Select the root of the repository
    • Deselect Copy projects into workspace to avoid having duplicates

BriefIO

Convenient wrappers around common IO operations. The calls below are succinct, do not require handling typed (checked) exceptions, and remain memory-efficient: lines are read lazily rather than dumped into one large list, so files that do not fit in memory can still be iterated over:

for (String line : readLines("src/test/resources/test.csv"))
  System.out.println(line);

for (String line : readLinesFromResource("/test.csv"))
  System.out.println(line);

for (String line : readLinesFromURL("http://stat.ubc.ca/~bouchard/pub/geyser.csv"))
  System.out.println(line);

If you want to add typed exceptions back (e.g., later in development), just add .check():

for (String line : readLinesFromURL("http://stat.ubc.ca/~bouchard/pub/geyser.csv").check())
  System.out.println(line);
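
The lazy, exception-free iteration described above can be approximated with the plain JDK. The following is only a minimal sketch under stated assumptions, not briefj's implementation: the class name LazyLines is hypothetical, and the checked IOException is rethrown as the JDK's UncheckedIOException (Java 8+).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: exposes the lines of a reader as an Iterable,
// reading one line at a time so the whole input is never held in memory.
final class LazyLines implements Iterable<String> {
  private final BufferedReader reader;

  LazyLines(BufferedReader reader) {
    this.reader = reader;
  }

  @Override
  public Iterator<String> iterator() {
    return new Iterator<String>() {
      // The line that the following call to next() will return, or null at end of input.
      private String pending = advance();

      private String advance() {
        try {
          return reader.readLine();
        } catch (IOException e) {
          // Convert the checked exception so callers need no try/catch.
          throw new UncheckedIOException(e);
        }
      }

      @Override
      public boolean hasNext() {
        return pending != null;
      }

      @Override
      public String next() {
        if (pending == null) throw new NoSuchElementException();
        String current = pending;
        pending = advance();
        return current;
      }
    };
  }
}
```

Because the sketch wraps a single reader, it supports one pass only; a second loop over the same LazyLines instance would continue from wherever the reader stopped.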

These calls return a FluentIterable (from the Guava project), so it is easy to limit, filter, etc. (see the Guava documentation for more):

for (String line : readLinesFromURL("http://stat.ubc.ca/~bouchard/pub/geyser.csv").skip(1).limit(10))
  System.out.println(line);

Convenient access to CSV files:

for (List<String> line : readLinesFromURL("http://stat.ubc.ca/~bouchard/pub/geyser.csv").splitCSV().limit(10))
  System.out.println(line);

Rows can also be indexed by the column names found in the first row, via a map:

for (Map<String,String> line : readLinesFromURL("http://stat.ubc.ca/~bouchard/pub/geyser.csv").indexCSV().limit(10))
  System.out.println(line);

Different CSV options can be used (see au.com.bytecode.opencsv for details):

for (Map<String,String> line : readLinesFromURL("http://stat.ubc.ca/~bouchard/pub/geyser.csv").indexCSV(new CSVParser(';')).limit(10))
  System.out.println(line);

Output without checked exceptions, with an optional charset:

File temp = BriefFiles.createTempFile();
PrintWriter out = output(temp);
out.println("Hello world");
out.close();

List the files in a directory, with or without a suffix filter (give the suffix without the period):

for (File f : BriefFiles.ls(new File(".")))
  System.out.println(f);
for (File f : BriefFiles.ls(new File("."), "txt"))
  System.out.println(f);

BriefCollections

To provide a default initial value in a map, which is also inserted if the key was missing:

Map<String,Set<String>> example = Maps.newHashMap();

getOrPutSet(example, "colors").add("blue");
getOrPutSet(example, "colors").add("red");
getOrPutSet(example, "foods").add("apple");

Assert.assertEquals(example.get("colors"), new HashSet<String>(Arrays.asList("blue", "red")));
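
For readers curious about what "get or put" amounts to, here is a minimal plain-JDK sketch. It assumes Java 8+ for Map.computeIfAbsent; the class GetOrPutSketch and its helper are hypothetical stand-ins and may differ from how briefj implements getOrPutSet (for example in the concrete Set type it creates).

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class GetOrPutSketch {
  // Hypothetical helper: returns the set stored under key,
  // first inserting an empty set if the key was missing.
  static <K, V> Set<V> getOrPutSet(Map<K, Set<V>> map, K key) {
    return map.computeIfAbsent(key, k -> new LinkedHashSet<>());
  }

  public static void main(String[] args) {
    Map<String, Set<String>> example = new HashMap<>();
    getOrPutSet(example, "colors").add("blue");
    getOrPutSet(example, "colors").add("red");
    System.out.println(example);
  }
}
```

The returned set is the one stored in the map, so mutating it updates the map entry directly.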

Pick an arbitrary element from a collection:

Set<String> items = Sets.newLinkedHashSet();
items.add("item");
Assert.assertEquals(BriefCollections.pick(items), "item");

Some convenience methods for maps from keys to double values (Counter):

Counter<String> counter = new Counter<String>();
counter.incrementCount("a", 1.1);
counter.incrementCount("a", 0.6);
counter.incrementCount("b", -5);
Assert.assertEquals(counter.getCount("a"), 1.1 + 0.6, 0.0);
counter.setCount("a", -100);
Assert.assertEquals(counter.getCount("a"), -100, 0.0);

// iterate in order of insertion:
for (String item : counter.keySet())
  System.out.println(item + "\t" + counter.getCount(item));
Assert.assertEquals(counter.keySet().iterator().next(), "a");

// iterate in decreasing order of counts
for (String item : counter)
  System.out.println(item + "\t" + counter.getCount(item));
Assert.assertEquals(counter.iterator().next(), "b");

// get sum (normalization)
Assert.assertEquals(counter.totalCount(), -5.0 - 100.0, 0.0);

// destructively normalize
counter.normalize();
Assert.assertEquals(counter.getCount("b"), 5.0/105.0, 1e-10);

// deep copy
Counter<String> c2 = new Counter<String>(counter);

// add all counts from another counter
c2.incrementAll(counter);
Assert.assertEquals(c2.getCount("b"), 2*5.0/105.0, 1e-10);
Assert.assertEquals(counter.getCount("b"), 5.0/105.0, 1e-10);
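
The two iteration orders shown above (insertion order for the key set, decreasing count for the counter itself) can be illustrated with a small plain-JDK sketch. This is an assumption-laden illustration, not briefj's Counter: the class CounterSketch and the method keysByDecreasingCount are hypothetical, and returning 0.0 for a missing key is a choice made here rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a map from keys to double counts.
final class CounterSketch<T> {
  // LinkedHashMap keeps keys in insertion order.
  private final Map<T, Double> counts = new LinkedHashMap<>();

  void incrementCount(T key, double amount) {
    counts.merge(key, amount, Double::sum);
  }

  // Assumption of this sketch: missing keys have count 0.0.
  double getCount(T key) {
    return counts.getOrDefault(key, 0.0);
  }

  double totalCount() {
    double sum = 0.0;
    for (double value : counts.values()) sum += value;
    return sum;
  }

  // Keys sorted so that the largest count comes first.
  List<T> keysByDecreasingCount() {
    List<T> keys = new ArrayList<>(counts.keySet());
    keys.sort((a, b) -> Double.compare(counts.get(b), counts.get(a)));
    return keys;
  }
}
```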

Unordered pairs:

UnorderedPair<Integer, Integer> 
  example = UnorderedPair.of(1, 2),
  example2= UnorderedPair.of(2, 1);

Assert.assertEquals(example, example2);
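
Order-insensitive equality requires equals() and hashCode() to both be symmetric in the two elements. The sketch below shows one way to get that; SimpleUnorderedPair is a hypothetical class with a single type parameter, so it is a simplification and not briefj's UnorderedPair.

```java
import java.util.Objects;

// Hypothetical sketch: a pair whose equality ignores element order.
final class SimpleUnorderedPair<T> {
  private final T first;
  private final T second;

  SimpleUnorderedPair(T first, T second) {
    this.first = first;
    this.second = second;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) return true;
    if (!(other instanceof SimpleUnorderedPair)) return false;
    SimpleUnorderedPair<?> that = (SimpleUnorderedPair<?>) other;
    // Equal if the elements match in either order.
    return (Objects.equals(first, that.first) && Objects.equals(second, that.second))
        || (Objects.equals(first, that.second) && Objects.equals(second, that.first));
  }

  @Override
  public int hashCode() {
    // Addition is commutative, so (a, b) and (b, a) get the same hash.
    return Objects.hashCode(first) + Objects.hashCode(second);
  }
}
```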

Indexers are convenient when you want an array indexed by some arbitrary type of object, e.g. for efficient array-based categorical sampling.

An indexer is just a bijection between the integers 0, 1, ..., N-1 and a set of N objects with .equals() and .hashCode() implemented.

Indexer<String> indexer = new Indexer<String>();
indexer.addToIndex("first");
indexer.addToIndex("second");
indexer.addToIndex("third");

// i2o maps from index to object
// o2i maps from object to index
Assert.assertEquals("first", indexer.i2o(indexer.o2i("first")));
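
To make the bijection concrete, here is a minimal sketch that pairs a list (index to object) with a hash map (object to index). SimpleIndexer and its method names are hypothetical, and silently ignoring duplicate additions is a choice made for this sketch; briefj's Indexer may behave differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a bijection between 0..N-1 and N distinct objects.
final class SimpleIndexer<T> {
  private final List<T> indexToObject = new ArrayList<>();
  private final Map<T, Integer> objectToIndex = new HashMap<>();

  // Assigns the next free index to a new object.
  // Assumption of this sketch: adding an already indexed object is a no-op.
  void add(T object) {
    if (objectToIndex.containsKey(object)) return;
    objectToIndex.put(object, indexToObject.size());
    indexToObject.add(object);
  }

  int indexOf(T object) {
    Integer index = objectToIndex.get(object);
    if (index == null) throw new IllegalArgumentException("Not indexed: " + object);
    return index;
  }

  T objectAt(int index) {
    return indexToObject.get(index);
  }

  int size() {
    return indexToObject.size();
  }
}
```

With such a structure, per-object quantities can live in a plain double[] of length size(), addressed through indexOf, which is what makes array-based sampling convenient.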

BriefStrings

To quickly select a group from a regular expression, use:

String match = firstGroupFromFirstMatch("I need ([0-9]*)", "I need 58 bitcoins");
Assert.assertEquals(match, "58");

List<String> matches = allGroupsFromFirstMatch("I need ([0-9]*)\\s+(.*)", "I need 58 bitcoins");
Assert.assertEquals(matches, Arrays.asList("58", "bitcoins"));

List<String> matchesFromAllMatches = firstGroupFromAllMatches("I need ([0-9]*)\\s+bitcoins\\s*", "I need 58 bitcoins I need 9 bitcoins");
Assert.assertEquals(matchesFromAllMatches, Arrays.asList("58", "9"));
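
For comparison, the same extractions can be written directly against java.util.regex. This is a sketch only: the class RegexSketch is hypothetical, its method names merely mirror the calls above, and returning null when nothing matches is an assumption rather than documented briefj behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexSketch {
  // Returns group 1 of the first match.
  // Assumption of this sketch: null is returned if the pattern matches nowhere.
  static String firstGroupFromFirstMatch(String regex, String input) {
    Matcher matcher = Pattern.compile(regex).matcher(input);
    return matcher.find() ? matcher.group(1) : null;
  }

  // Collects group 1 from every successive, non-overlapping match.
  static List<String> firstGroupFromAllMatches(String regex, String input) {
    List<String> groups = new ArrayList<>();
    Matcher matcher = Pattern.compile(regex).matcher(input);
    while (matcher.find())
      groups.add(matcher.group(1));
    return groups;
  }
}
```

Both methods expect the pattern to contain at least one capturing group; otherwise Matcher.group(1) throws an IndexOutOfBoundsException.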

BriefParallel

Parallelize some tasks indexed by integers, with an explicit control on the number of threads.

int [] items = new int[100];

// execute the operation "items[i]++" for all integers i in [0, 100),
// using 8 threads in parallel
BriefParallel.process(100, 8, i -> items[i]++);

Assert.assertEquals(IntStream.of(items).sum(), 100);
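
One plausible way to run integer-indexed tasks on a fixed number of threads is a fixed-size thread pool from java.util.concurrent. The sketch below is an illustration under that assumption, not briefj's implementation; the class ParallelSketch and its process method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

final class ParallelSketch {
  // Hypothetical helper: runs operation.accept(i) for every i in [0, nTasks)
  // on a pool of exactly nThreads worker threads, and returns when all are done.
  static void process(int nTasks, int nThreads, IntConsumer operation) {
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int i = 0; i < nTasks; i++) {
        final int index = i; // lambdas may only capture effectively final locals
        futures.add(pool.submit(() -> operation.accept(index)));
      }
      for (Future<?> future : futures)
        future.get(); // waits for the task and surfaces its failure, if any
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

In this sketch each index is handled by exactly one task, so an operation such as items[i]++ touches distinct array cells and needs no extra synchronization; operations that share state across indices would.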