- Busted-based TDD
- Class modules begin with an uppercase letter, and end up in their own file that begins with an uppercase letter (e.g. `RDD.lua`)
- Two spaces for indents.
- The `_` global variable is the unused variable stand-in.
- Companion libraries such as Stuart ML (a Lua port of Spark MLlib) will end up in their own separate Git repo and LuaRocks module.

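
As a rough illustration of the naming, indentation, and `_` conventions above, here is a minimal sketch of a class module. The `Counter` class, its `new` and `addAll` functions, and the plain-metatable class style are all hypothetical, chosen only for illustration; they are not part of Stuart and do not necessarily match how Stuart defines its classes.

```lua
-- Counter.lua: a hypothetical class module, named with a leading uppercase
-- letter to match the class it defines. Indentation is two spaces.
local Counter = {}
Counter.__index = Counter

-- Creates a counter with an optional starting value (defaults to 0).
function Counter.new(start)
  return setmetatable({value = start or 0}, Counter)
end

-- Adds every number in the array `items` to the counter and returns the total.
function Counter:addAll(items)
  -- `_` stands in for the array index, which this loop does not use.
  for _, item in ipairs(items) do
    self.value = self.value + item
  end
  return self.value
end

-- When saved as its own file, the module would end with: return Counter
```

A caller would then load the module with `require` and use it as `Counter.new(1):addAll({2, 3})`.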
Here are some areas where contributors are welcome, in order of their value to the project.
Apache Spark example programs for Spark and Spark Streaming exist at examples/src/main/java/org/apache/spark/examples and examples/src/main/java/org/apache/spark/examples/streaming.
Once ported to Stuart, they exist in this project at examples/ApacheSpark.
These programs would validate the basic utility of the platform.
Stuart's module dependencies are written in pure Lua so that amalgamated Lua scripts can be transpiled to C or run on diverse VMs. This places a burden on the development of the libraries that support Stuart, Stuart ML, and Stuart SQL.
- lua-parquet - needs significant additional support for RLE, and other codecs such as Brotli, `gzip`, `lzo`, and `snappy`.
- lua-long - needs Lua 5.3 support, and stringify hangs for many numbers.
- vstruct - needs Lua 5.3 support.
- stuart-ml - KMeans Clustering was the first algorithm to be ported, but Spark ML contains dozens more Classification and Regression models.
- stuart-sql - DataFrame support, among other things.
Issues are filed at github.com/nubix-io/stuart/issues.