Introduce new numerics #200

Open: wants to merge 12 commits into base `main`
213 changes: 102 additions & 111 deletions text/0000-expand-math.md
@@ -1,26 +1,23 @@
- Feature Name: Expand `math`
- Feature Name: Introduce new numerics -- `Rational`, `BigInt`, `BigFloat`, `Complex`
- Start Date: 2022-02-28
- RFC PR:
- Pony Issue:

# Summary

This RFC proposes expanding the standard library's `math` package and the shape it might take to leverage the full extent of Pony. There was prior discussion on [Zulip](https://ponylang.zulipchat.com/#narrow/stream/192795-contribute-to.20Pony/topic/math.20lib) years ago, but I expect some opinions to have changed.
This RFC proposes the introduction of new numeric types; in particular the addition of a type representing a fractional number (`Rational`), arbitrary precision integer (`BigInt`), arbitrary precision float (`BigFloat`), and complex number (`Complex`).

# Motivation

Currently, `math` includes limited functionality. This should be expanded to include math types and constants, and to present a structure with further expansion in mind. An expanded `math` library will give Pony developers a unified mathematics library. As was discussed during our [2022-03-01 Sync](https://sync-recordings.ponylang.io/r/2022_03_01.m4a), one core principle for including these types in the stdlib is to have a single canonical implementation of them which allows interoperability of numeric types across the Pony ecosystem. Any numeric types introduced by this RFC **must** exist within the current numerical type hierarchy by being compliant with existing numeric traits.
The primary motivation for adding these types to the stdlib is to have a single canonical implementation of them which allows interoperability of numeric types across the Pony ecosystem.

# Detailed design

The primary goals of this initial expansion are:

1. restructure the `math` package into distinct subpackages; allowing for separation of concerns over continuing to build a monolithic `math` package
2. provide common `math` data types; for example, `BigInt`, `Rational`, and `Complex`
I propose we add the aforementioned numeric types into `builtin` so they exist alongside the other standard numeric types. These introduced numeric types **must** exist within the current numeric type hierarchy by being compliant with existing numeric traits.

Member

For me, the preference would be to keep builtin as minimal as possible, so I'd prefer to put these new types in a new package (or series of packages), unless there is a strong motivation for it to go in builtin.

Pretty much all of the existing public types in builtin have hard requirements forcing them to be there, with a reason like one of the following:

  • they are used by core language constructs (None, string literals, numeric literals, etc)
  • they need to use raw pointers (Array, String, etc)
  • they use compile_intrinsic for one or more function definitions
  • they are part of the Env, which is passed to the Main actor on entry
  • they are used by one of the types that meet the above criteria

I don't think these new types have any hard requirements forcing them to go in builtin, so I believe they shouldn't be there.

Member Author

We can discuss this further on Sync and I will update the RFC according to our conversation. I have no particular need for these to be added to builtin so no objection to an agreed upon other location such as a newly created numerics or adding these to math (as was the RFC state prior to my latest changes).


## Numeric Hierarchy

Currently the Pony numerics hierarchy is as follows.
Currently the Pony numeric type hierarchy is as follows:

```mermaid
classDiagram
@@ -63,7 +60,7 @@ SignedInteger <-- ISize
SignedInteger <-- ILong
```

This RFC introduces a few more numeric types: `Rational`, `Complex`, and arbitrary precision `BigInt`. These fit into the hierarchy in the following manner.
This RFC introduces four more numeric types: `Rational`, `BigInt`, `BigFloat`, and `Complex`. These fit into the numeric type hierarchy in the following manner:

```mermaid
classDiagram
@@ -110,130 +107,124 @@ SignedInteger <-- ISize
SignedInteger <-- ILong
```

## Structure

I propose a structure of distinct subpackages including the following:

+ `math/big`: Arbitrary precision numbers
+ `math/series`: Mathematical series
+ `math/constants`: Mathematical constants
+ `math/rational`: `Rational` data type and related functions
+ `math/complex`: `Complex` data type and related functions
+ `math/(x,exp,etc)`: experimental additions, utilities, and effective "catch-all" for matters that do not neatly fit into other subpackages

## Common Data Types

As previewed above in [Structure](#structure), expanding the `math` package should include implementations of common mathematics data types. Below are some implementation proposals for those data types.

### `math/big`

This is a package for arbitrary precision numerics.

Should include `BigInt`, `BigFloat`, and `BigDecimal` -- see [168](https://github.com/ponylang/rfcs/issues/168)

### `math/series`

`math/series` should include a `Series` trait which is a subtype of `Iterator`. The purpose of creating a new abstract data type is to generalize functions over mathematical series which do not make sense over iterators -- such as whether a `Series` is diverging or converging, a property that is all but meaningless for an `Iterator`.

Example series include `Fibonacci` (already exists), `Pascal` (nCk), `Triangular` ({n+1}C{2}), `Square` (n^2), `Pentagonal` ({n * (3n - 1)} / 2), etc.

Current `math/Fibonacci` would go into this series package.

### `math/constant`

Initial values to include are those with underlying LLVM representations from the [numbers namespace](https://llvm.org/doxygen/namespacellvm_1_1numbers.html).

Once these values exist in `math/constant`, they could be removed from where they are now, which is on `F32` and `F64` of [`Float`](https://github.com/ponylang/ponyc/blob/master/packages/builtin/float.pony).

I foresee this as a primitive `Constant` with methods for each value (e.g., `Constant.pi[A: Float](): A(3.14159...)`).

### `math/rational`

`math/rational` should decide backing precision on `create[A: UnsignedInteger]` and return precision on `apply[B: Number]` -- `let x = Rational.create[U64](where numerator=2)` gives a value which is represented by `U64(2)`/`U64(1)` and which can then be returned as any valid `Number`. `Rational` should be parameterized only on `UnsignedInteger` types and track sign via an internal field.
## Methods of Concern

```pony
// Parameterized on unsigned integers as negative status is tracked by field
class Rational[A: UnsignedInteger[A] val = USize]
  var numerator: A
  var denominator: A
  var negative: Bool = false

  // Allow for creating Rationals from signed or unsigned integer arguments
  new create[B: Integer[B] val = ISize](n: B, d: B = 1)? =>
    // Error if denominator is zero
    if d == B.from[U8](0) then error end

    // Produce unsigned equivalents of arguments
    // NOTE: this produces an error of form
    //   Error:
    //   main.pony:XX:YY: type argument is outside its constraint
    // This is because Integer is not a subtype of UnsignedInteger
    let n': A = if n < B.from[U8](0) then A.from[B](-n) else A.from[B](n) end
    let d': A = if d < B.from[U8](0) then A.from[B](-d) else A.from[B](d) end

    // Find greatest common divisor and reduce fraction
    let divisor: A = try GreatestCommonDivisor[A](n', d')? else A.from[U8](1) end
    numerator = n' / divisor
    denominator = d' / divisor

    // Set negative field if exactly one of numerator and denominator is negative
    if (n < B.from[U8](0)) xor (d < B.from[U8](0)) then
      negative = true
    end

  fun string(): String iso^ =>
    // Size of 10 here is a placeholder for determining the size needed
    let output = recover String(10) end
    if negative then output.append("-") end
    output.append(numerator.string())
    output.append(" / ")
    output.append(denominator.string())
    // Consume the iso so it can be returned to the caller
    consume output

trait val Real[A: Real[A] val] is
  (Stringable & _ArithmeticConvertible & Comparable[A])
  ...
  new val min_value()
  new val max_value()
  ...
```

@jasoncarr0 (Mar 16, 2022)

I understand that this has already been present, but given that this is re-design I'm not clear on the benefit of these constructors, but there's some obvious costs. Is there a use case for max_value() / min_value() that makes sense for all numbers? Given that this doesn't work for BigInt/BigFloat/similar types I'd be hesitant to place it at the top of the hierarchy.

Can we instead make a Bounded interface? This will only break code which relies on these methods being present in any Real[A]

Member

I don't understand what you are suggesting. What's a "bounded interface"?

Member

I think he means a new interface whose name is Bounded, which has the min and max value methods on it.

Member

@jasoncarr0 is that what you meant? re: Bounded interface? if yes, did you specifically mean structural typing or were you using "interface" more loosely?

I meant a literal Pony interface, that is:
These constructors would be part of an interface named Bounded, while the other methods would be part of this trait.
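
A minimal sketch of the split described above, assuming the interface is named `Bounded` as suggested in this thread. `Bounded` and the `Limits` helper are hypothetical and are not part of the Pony standard library. Because Pony interfaces are structural, existing fixed-width types such as `U8` should satisfy `Bounded` unchanged, while `BigInt` or `BigFloat` would simply not provide it.

```pony
// Hypothetical: `Bounded` is the name proposed in this thread, not an
// existing stdlib type. It carries only the two bound constructors.
interface val Bounded
  new val min_value()
  new val max_value()

// Hypothetical helper: generic code that needs bounds asks for `Bounded`
// explicitly instead of relying on every Real[A] providing them.
primitive Limits
  fun bounds[A: Bounded](): (A, A) =>
    (A.min_value(), A.max_value())

actor Main
  new create(env: Env) =>
    // U8 already defines min_value() and max_value(), so it matches
    // the interface structurally without declaring it.
    (let lo, let hi) = Limits.bounds[U8]()
    env.out.print(lo.string() + " .. " + hi.string()) // 0 .. 255
```

With this shape, only code that actually calls `min_value()` or `max_value()` on a generic numeric type would need to change its constraint.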

Actually there is also an argument for removing Ordered, but only because this trait is really the lowest possible. Mostly that the implementations for things like matrices would be sketch but making them numerics makes sense and is powerful.


Changing the underlying precision "in-place" is done via a `prec[C: Real]` method which creates a new instance with a different precision.

(I see this package as subsuming `Decimal` and `Fractional` types, as previously discussed in Zulip.)

### `math/complex`

`Complex` should follow a design similar to `math/rational` in that it is parameterized on `Real` types, which are used to represent both the real and imaginary parts of the number -- `Complex[U128](7, 2)` is `U128(7) + U128(2)i`.
`Real` will sit above `Rational`, `BigInt`, and `BigFloat`, so these types would have to define the above methods. For `Rational`, the bounds can be defined as the minimum and maximum of the numerator. `BigInt` and `BigFloat`, however, have arbitrary precision by definition, which makes defining a minimum and maximum difficult at best (if we define them as the bounds of a machine-sized int and float, or as -Inf and Inf) and impossible at worst (if we define them by their actual limits, which are arbitrary).

Changing the underlying precision is done via a `prec[C: Real]` method which creates a new instance of the value with a different precision.

### `math/(x,exp,etc)`
```pony
trait val Integer[A: Integer[A] val] is Real[A]
  ...
  fun op_and(y: A): A => this and y
  fun op_or(y: A): A => this or y
  fun op_xor(y: A): A => this xor y
  fun op_not(): A => not this

This can't be implemented by a BigInt, so it's odd that a BigInt can't actually be an Integer

Member

Sorry, I don't understand your comment. Can you try explaining in a different way?

Member

You can't do a bitwise "not" operation on a BigInt - at least not by the normal semantics of a bitwise "not" operation.

You can't because you'd expect that any bits "more significant" than the most significant bit of the value would be set to 1.

But a BigInt has no upper bound, and thus there is no limit to the number of leading 1 bits implied by inverting its bits

Member

I still don't understand @jemc.

"This can't be implemented by a BigInt, so it's odd that a BigInt can't actually be an Integer"

Can you explain how what you said is related to Jason's comment?

I don't understand why not being able to be implemented BigInt makes it odd that BigInt can't actually be an Integer.
I don't understand what is being said here.

Member

I'll copy Jason's words and insert parentheticals with my own commentary:

This (the op_not function) can't be implemented by a BigInt (for the reason mentioned in my previous comment)

So it's odd that a BigInt (which is conceptually a kind of integer) can't actually be an Integer (it cannot be a subtype of the Integer trait, because the Integer trait includes the op_not function).

Member

Thanks @jemc. I understand now.


  fun bit_reverse(): A
    """
    Reverse the order of the bits within the integer.
    For example, 0b11101101 (237) would return 0b10110111 (183).
    """

  fun bswap(): A
```
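
The `op_not` concern raised in the review thread on this trait can be illustrated with the existing fixed-width integers. This is a small sketch using only `U8`, `U16`, and `U32` from `builtin`. It inverts the same value at three widths to show that the result of a bitwise "not" depends on how many bits the type has.

```pony
actor Main
  new create(env: Env) =>
    // The same value, inverted at three different fixed widths.
    env.out.print((not U8(5)).string())  // 250
    env.out.print((not U16(5)).string()) // 65530
    env.out.print((not U32(5)).string()) // 4294967290
    // Each wider type adds more leading 1 bits to the result. A BigInt has
    // no fixed width, so there is no last bit at which this would stop.
```

Any `BigInt` design that keeps `op_not` would therefore have to pick a different meaning for it than the fixed-width types use. Which meaning to pick is left open here.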

The name is subject to change and I want comments on what such a "catch all" package should be named to clearly denote it is for matters which do not neatly fit elsewhere.
`Integer` will exist above `BigInt` and as such would require defining the above methods -- however `BigInt` will be defined via other numerics so these methods could be applied recursively.

Member

I don't understand what this means. What are "the above methods"? Those listed on the Integer trait?

"BigInt will be defined via other numerics so these methods could be applied recursively" <-- I don't understand what this means either. There seems to be something that I am supposed to understand about "other numerics" that BigInt will be defined via (did I miss that somewhere, if yes, bringing information forward so that you don't have to hold large chunks of RFC in your head would be good). I'm also not sure what "applied recursively" means in this context.


The explicit intention of this subpackage is to gather useful matters that do not fit into other, more dedicated packages. Examples of matters that would be included here are trigonometry and linear algebra functions before corresponding `math/trig` and `math/la` packages are made.
```pony
trait val FloatingPoint[A: FloatingPoint[A] val] is Real[A]
  new val min_normalised()
  new val epsilon()
  fun tag radix(): U8
  fun tag precision2(): U8
  fun tag precision10(): U8
  fun tag min_exp2(): I16
  fun tag min_exp10(): I16
  fun tag max_exp2(): I16
  fun tag max_exp10(): I16
  ...
  fun abs(): A
  fun ceil(): A
  fun floor(): A
  fun round(): A
  fun trunc(): A

  fun finite(): Bool
  fun infinite(): Bool
  fun nan(): Bool

  fun ldexp(x: A, exponent: I32): A
  fun frexp(): (A, U32)
  fun log(): A
  fun log2(): A
  fun log10(): A
  fun logb(): A

  fun pow(y: A): A
  fun powi(y: I32): A

  fun sqrt(): A

  fun sqrt_unsafe(): A
    """
    Unsafe operation.
    If this is negative, the result is undefined.
    """

  fun cbrt(): A
  fun exp(): A
  fun exp2(): A

  fun cos(): A
  fun sin(): A
  fun tan(): A

  fun cosh(): A
  fun sinh(): A
  fun tanh(): A

  fun acos(): A
  fun asin(): A
  fun atan(): A
  fun atan2(y: A): A

  fun acosh(): A
  fun asinh(): A
  fun atanh(): A
```

Current `math/IsPrime` would go into this catch-all package.
`FloatingPoint` will exist above `BigFloat` and as such would require defining the above methods -- many of which are ill-defined under arbitrary precision, or are functions using C-FFI and/or LLVM intrinsics.

Member

I'm confused. Why would we do it this way if it has these problems? This seems like an argument against doing it this way. I feel like there's some meaning here that I am missing.


# How We Teach This

For functionality additions, ample documentation and usage examples within the expanded `math` library should be sufficient.

I am proposing moving `Fibonacci` and `IsPrime`; however, nothing about existing functionality should change, so teaching users about those updates should follow the pattern of pairing existing example code with modified example code that shows how to pick up the new locations of existing functionality.

A new chapter should be added as Tutorial > Packages > Math which walks through the layout and usage of Pony `math`.
Adding ample documentation to these new numerics should suffice to teach Pony users how to leverage these types in their programs. I do not think any additions to the Pony Tutorial are needed; however, if additions are desired, then [Arithmetic](https://tutorial.ponylang.io/expressions/arithmetic.html) may be the most sensible location.

# How We Test This

I recommend using `pony_check` to test all reversible operation pairs (`x+y-y == x`, `x*y/y == x`, etc), precision persistence (`Rational[U8](where numerator=x, denominator=y) * y == x`), and overflow/underflow protection (`Rational[U8](255, 1) + 1 => error`).

Testing `math` should not affect any other parts of Pony and as such standard CI should suffice.
Testing these numerics should not affect any other parts of Pony and as such standard CI should suffice.
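
As a rough sketch of the reversible-operation checks mentioned above, the following property test exercises `x + y - y == x`. It uses `U64` as a stand-in, since the proposed types do not exist yet, and the class name `_AddSubRoundTrip` is hypothetical. It assumes the `pony_test` and `pony_check` packages as shipped with recent ponyc releases.

```pony
use "pony_test"
use "pony_check"

// Hypothetical property: checks x + y - y == x for U64, standing in for
// the new numeric types proposed by this RFC.
class iso _AddSubRoundTrip is Property1[(U64, U64)]
  fun name(): String => "numerics/add_sub_round_trip"

  // Generate independent pairs of U64 values.
  fun gen(): Generator[(U64, U64)] =>
    Generators.zip2[U64, U64](Generators.u64(), Generators.u64())

  fun property(arg1: (U64, U64), h: PropertyHelper) =>
    (let x, let y) = arg1
    // U64 `+` and `-` wrap on overflow, so the round trip holds for all inputs.
    h.assert_eq[U64](x, (x + y) - y)

actor Main is TestList
  new create(env: Env) => PonyTest(env, this)
  new make() => None

  fun tag tests(test: PonyTest) =>
    test(Property1UnitTest[(U64, U64)](_AddSubRoundTrip))
```

A version for `Rational` or `BigInt` would need its own generator, and the `x*y/y == x` pair would also need to exclude `y == 0`.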

# Drawbacks

The major drawback is additional maintenance cost as well as immediate and continued disagreement around implementation and feature details.
+ Additional maintenance cost
+ May break existing code if methods must be removed from existing numeric traits to match the suggested hierarchy placements

# Alternatives

The number of subpackages is large and could be reduced to a single `math` package if we so choose, as none of the proposed additions clash at this time.
Alternatively, we can introduce these types in `math` as opposed to `builtin` and/or only introduce some of the proposed new numeric types.

# Unresolved questions

+ How expansive should the `math` library become (whether that is one package or multiple subpackages)?
+ Do we need all of `Decimal`, `Rational`, and `Fractional` types? I am unaware of their distinction, so I believe adding a `Rational` (numerator and denominator) type suffices for current needs.
+ Do we want the types here to be additions to `math` or, since they are really more numeric types, additions alongside other numeric types within `builtin`?
+ Should these types be introduced in `builtin` or in `math`?
+ Does `Rational` make sense as the "fractional type" or would we prefer `Fractional` to avoid confusion?
+ Do we want to also include a `Decimal` type?
+ How should we handle the stated "Methods of Concern"?