Contributing to Stuart

Guidelines

  • Practice test-driven development (TDD) with Busted.
  • Class module names begin with an uppercase letter, and each class lives in its own file whose name also begins with an uppercase letter (e.g. RDD.lua).
  • Indent with two spaces.
  • Use the _ global variable as the stand-in for unused variables.
  • Companion libraries such as Stuart ML (a Lua port of Spark MLlib) will each live in their own separate Git repo and LuaRocks module.
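
The naming, indentation, and unused-variable guidelines above could look like the following minimal sketch. `Counter` and `sum` are hypothetical names chosen for illustration and are not part of Stuart. Note that in the `for` loop Lua scopes `_` to the loop itself; the point here is only the naming convention.

```lua
-- Counter.lua: a hypothetical class module (not part of Stuart).
-- The class name and the file name both start with an uppercase letter.
local Counter = {}
Counter.__index = Counter

-- Constructor: indentation is two spaces throughout.
function Counter.new(start)
  local self = setmetatable({}, Counter)
  self.value = start or 0
  return self
end

-- Adds one to the counter and returns the new value.
function Counter:increment()
  self.value = self.value + 1
  return self.value
end

-- Hypothetical helper: `_` stands in for the index that ipairs
-- yields but that this loop never uses.
local function sum(values)
  local total = 0
  for _, v in ipairs(values) do
    total = total + v
  end
  return total
end

-- A real module file would end with `return Counter`.
```

In a Busted spec, checks such as `Counter.new(1):increment() == 2` would sit inside `describe` and `it` blocks, written before the implementation in keeping with the TDD guideline.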

Where to Start

Here are some areas where contributors are welcome, in order of their value to the project.

1. Port more Apache Spark examples to Lua

Example programs for Spark and Spark Streaming live in the Apache Spark source tree at examples/src/main/java/org/apache/spark/examples and examples/src/main/java/org/apache/spark/examples/streaming.

Once ported to Stuart, they belong in this project under examples/ApacheSpark.

These programs would validate the basic utility of the platform.

2. Contribute to supporting libraries

Stuart's module dependencies are written in pure Lua so that amalgamated Lua scripts can be transpiled to C or run on diverse VMs. This constraint places a burden on developing the libraries that support Stuart, Stuart ML, and Stuart SQL.

  • lua-parquet - needs significant additional support for RLE and for other codecs such as Brotli, gzip, LZO, and Snappy.
  • lua-long - needs Lua 5.3 support; stringify also hangs for many numbers.
  • vstruct - needs Lua 5.3 support.
  • stuart-ml - KMeans clustering was the first algorithm to be ported, but Spark ML contains dozens more classification and regression models.
  • stuart-sql - needs DataFrame support, among other things.

3. Fix Issues

Issues are filed at github.com/nubix-io/stuart/issues.