Add size-limited strings and varying bit-width integer Value_Types to in-memory backend and check for ArithmeticOverflow in LongStorage #7557

Merged
merged 40 commits into from
Aug 22, 2023
Changes from 39 commits
a41953f
add out of bounds error
radeusgd Aug 8, 2023
e5377f5
making LongStorage variable type - WIP
radeusgd Aug 8, 2023
71653e3
fixes, pass bits in long_fetcher
radeusgd Aug 9, 2023
6ae9a7f
Add docs about overflow (TODO: implement! test!)
radeusgd Aug 9, 2023
4b74f28
Parametrize String{Builder,Storage} by TextType
radeusgd Aug 9, 2023
801e751
I think this import was unused?
radeusgd Aug 9, 2023
ea879c9
fix, text_fetcher use precise type info
radeusgd Aug 9, 2023
4ff9951
patch the type mapping
radeusgd Aug 9, 2023
abfeb10
new test and explore upload test
radeusgd Aug 9, 2023
33009df
Fixing a piece of logic for Union, adding tests there
radeusgd Aug 9, 2023
71cb460
enable/add cast tests
radeusgd Aug 10, 2023
ba16725
generalize LongStorage in cast
radeusgd Aug 10, 2023
52f752b
Implement integer casts
radeusgd Aug 10, 2023
31f968b
skeleton for overflow tests
radeusgd Aug 10, 2023
72928fe
Add tests for `Arithmetic_Overflow`
radeusgd Aug 10, 2023
22533f8
Adapt tests to new behaviour
radeusgd Aug 11, 2023
b77414c
Even modulus is promoted to 64-bit - much simpler to implement
radeusgd Aug 11, 2023
2378692
Add a test for #7565
radeusgd Aug 11, 2023
b09df2e
Add more tests for issues related to the issue from #7565
radeusgd Aug 11, 2023
73d63e5
fixes related to #7565
radeusgd Aug 11, 2023
e6df231
fix tests
radeusgd Aug 11, 2023
4433cdf
more tests
radeusgd Aug 11, 2023
f42d26a
more tests, fixing some
radeusgd Aug 11, 2023
3bab806
Factor out checks to separate class
radeusgd Aug 11, 2023
13efa62
Implement checking and reporting 64-bit integer overflow
radeusgd Aug 11, 2023
5cb6fac
remove TODO, clean
radeusgd Aug 11, 2023
164a442
remove TODO
radeusgd Aug 11, 2023
3db976c
remove unrelated test
radeusgd Aug 11, 2023
389eab3
javafmt
radeusgd Aug 11, 2023
4ca4f29
CHANGELOG: nice palindrome PR number :)
radeusgd Aug 11, 2023
cd9d526
CR1: doc comment
radeusgd Aug 21, 2023
5c06319
fixes after rebase
radeusgd Aug 21, 2023
3e07308
fixes pt. 2
radeusgd Aug 21, 2023
d4ebdf7
add benchmarks
radeusgd Aug 21, 2023
9dd67ff
more warmup
radeusgd Aug 21, 2023
2659f97
add Arithmetic to be run with other benchmarks
radeusgd Aug 21, 2023
39356b6
Merge branch 'develop' into wip/radeusgd/5159-new-inmemory-value-types
radeusgd Aug 22, 2023
7203172
javafmt
radeusgd Aug 22, 2023
16c5aa2
Merge branch 'develop' into wip/radeusgd/5159-new-inmemory-value-types
radeusgd Aug 22, 2023
c2b9d8c
add reference to #7635 in TODO
radeusgd Aug 22, 2023
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -553,6 +553,8 @@
- [Retire `Column_Selector` and allow regex based selection of columns.][7295]
- [`Text.parse_to_table` can take a `Regex`.][7297]
- [Expose `Text.normalize`.][7425]
- [Implemented new value types (various sizes of `Integer` type, fixed-length
and length-limited `Char` type) for the in-memory `Table` backend.][7557]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -786,6 +788,7 @@
[7295]: https://github.com/enso-org/enso/pull/7295
[7297]: https://github.com/enso-org/enso/pull/7297
[7425]: https://github.com/enso-org/enso/pull/7425
[7557]: https://github.com/enso-org/enso/pull/7557

#### Enso Compiler

Original file line number Diff line number Diff line change
@@ -369,6 +369,15 @@ type Column
Returns a column containing the result of adding `other` to each element
of `self`. If `other` is a column, the operation is performed pairwise
between corresponding elements of `self` and `other`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.
+ : Column | Any -> Column
+ self other =
op = case Value_Type_Helpers.resolve_addition_kind self other of
@@ -388,6 +397,15 @@ type Column
Returns a column containing the result of subtracting `other` from each
element of `self`. If `other` is a column, the operation is performed
pairwise between corresponding elements of `self` and `other`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.
- : Column | Any -> Column
- self other =
case Value_Type_Helpers.resolve_subtraction_kind self other of
@@ -405,6 +423,15 @@ type Column
Returns a column containing the result of multiplying `other` by each
element of `self`. If `other` is a column, the operation is performed
pairwise between corresponding elements of `self` and `other`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.
* : Column | Any -> Column
* self other =
Value_Type_Helpers.check_binary_numeric_op self other <|
@@ -426,6 +453,15 @@ type Column
- If division by zero occurs, an `Arithmetic_Error` warning is attached
to the result.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Divide the elements of one column by the elements of another.

@@ -493,6 +529,15 @@ type Column
Returns a column containing the result of raising each element of `self`
to the power of `other`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Squares the elements of one column.
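
The `Arithmetic Overflow` notes in the hunks above describe a policy in which an overflowing result becomes a missing value and a warning is reported. The following is a minimal Java sketch of that idea, assuming 64-bit addition only. The names `OverflowSketch`, `addChecked` and `overflowCount` are hypothetical and do not come from this PR; the actual `LongStorage` code may be organised quite differently.

```java
import java.util.Arrays;

/** Hypothetical illustration; not the LongStorage implementation from this PR. */
final class OverflowSketch {
    /** Number of results replaced by null in the most recent addChecked call. */
    int overflowCount = 0;

    /** Adds two columns element-wise; missing inputs and overflowing results become null. */
    Long[] addChecked(Long[] left, Long[] right) {
        if (left.length != right.length) {
            throw new IllegalArgumentException("Column lengths differ");
        }
        overflowCount = 0;
        Long[] result = new Long[left.length];
        for (int i = 0; i < left.length; i++) {
            if (left[i] == null || right[i] == null) {
                continue; // a missing input yields a missing output, without a warning
            }
            long a = left[i];
            long b = right[i];
            try {
                // Math.addExact throws ArithmeticException when the sum leaves the 64-bit range.
                result[i] = Math.addExact(a, b);
            } catch (ArithmeticException e) {
                overflowCount++; // the slot stays null; the caller can report a warning
            }
        }
        return result;
    }

    public static void main(String[] args) {
        OverflowSketch op = new OverflowSketch();
        Long[] sums = op.addChecked(new Long[] {1L, Long.MAX_VALUE, null}, new Long[] {2L, 1L, 5L});
        System.out.println(Arrays.toString(sums) + ", overflows: " + op.overflowCount);
    }
}
```

Counting the overflows rather than throwing on the first one matches the documented behaviour of attaching a warning to an otherwise usable result column.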

Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from Standard.Base import all

import Standard.Table.Data.Column.Column as Materialized_Column
import Standard.Table.Data.Type.Value_Type.Bits
import Standard.Table.Data.Type.Value_Type.Value_Type
import Standard.Table.Internal.Java_Exports

@@ -69,27 +70,27 @@ double_fetcher =
Column_Fetcher.Value fetch_value make_builder

## PRIVATE
long_fetcher : Column_Fetcher
long_fetcher =
long_fetcher : Bits -> Column_Fetcher
long_fetcher bits =
fetch_value rs i =
l = rs.getLong i
if rs.wasNull then Nothing else l
make_builder initial_size =
java_builder = Java_Exports.make_long_builder initial_size
java_builder = Java_Exports.make_long_builder initial_size bits=bits
append v =
if v.is_nothing then java_builder.appendNulls 1 else
java_builder.appendLong v
Builder.Value append (seal_java_builder java_builder)
Column_Fetcher.Value fetch_value make_builder

## PRIVATE
text_fetcher : Column_Fetcher
text_fetcher =
text_fetcher : Value_Type -> Column_Fetcher
text_fetcher value_type =
fetch_value rs i =
t = rs.getString i
if rs.wasNull then Nothing else t
make_builder initial_size =
java_builder = Java_Exports.make_string_builder initial_size
java_builder = Java_Exports.make_string_builder initial_size value_type=value_type
make_builder_from_java_object_builder java_builder
Column_Fetcher.Value fetch_value make_builder

@@ -137,11 +138,9 @@ date_time_fetcher =
default_fetcher_for_value_type : Value_Type -> Column_Fetcher
default_fetcher_for_value_type value_type =
case value_type of
## TODO [RW] once we support varying bit-width in storages, we should specify it
Revisit in #5159.
Value_Type.Integer _ -> long_fetcher
Value_Type.Integer bits -> long_fetcher bits
Value_Type.Float _ -> double_fetcher
Value_Type.Char _ _ -> text_fetcher
Value_Type.Char _ _ -> text_fetcher value_type
Value_Type.Boolean -> boolean_fetcher
Value_Type.Time -> time_fetcher
# We currently don't distinguish timestamps without a timezone on the Enso value side.
48 changes: 46 additions & 2 deletions distribution/lib/Standard/Table/0.0.0-dev/src/Data/Column.enso
@@ -30,7 +30,6 @@ from project.Internal.Column_Format import all
from project.Internal.Java_Exports import make_date_builder_adapter, make_double_builder, make_long_builder, make_string_builder

polyglot java import org.enso.base.Time_Utils
polyglot java import org.enso.table.data.column.builder.StringBuilder
polyglot java import org.enso.table.data.column.operation.map.MapOperationProblemBuilder
polyglot java import org.enso.table.data.column.storage.Storage as Java_Storage
polyglot java import org.enso.table.data.table.Column as Java_Column
@@ -388,6 +387,15 @@ type Column
Returns a column with results of adding `other` to each element of
`self`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Add two columns to each other.

@@ -418,6 +426,15 @@ type Column
Returns a column with results of subtracting `other` from each element of
`self`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Subtract one column from another.

@@ -461,6 +478,15 @@ type Column
Returns a column containing the result of multiplying each element of
`self` by `other`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Multiply the elements of two columns together.

@@ -495,6 +521,15 @@ type Column
- If division by zero occurs, an `Arithmetic_Error` warning is attached
to the result.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Divide the elements of one column by the elements of another.

@@ -560,6 +595,15 @@ type Column
Returns a column containing the result of raising each element of `self`
to the power of `other`.

? Arithmetic Overflow

For integer columns, the operation may yield results that do not fit
into the range supported by the column. In such a case, the in-memory
backend will replace such results with `Nothing` and report an
`Arithmetic_Overflow` warning. The behaviour in Database backends is
not specified and will depend on the particular database: it may
cause a hard error, or the value may be truncated or wrap around.

> Example
Squares the elements of one column.

@@ -1229,7 +1273,7 @@ type Column
length = self.length
storage = self.java_column.getStorage

builder = StringBuilder.new length
builder = make_string_builder length
0.up_to length . each i->
replaced = do_replace i (storage.getItem i)
builder.append replaced
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from Standard.Base import all
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument

import project.Data.Type.Storage
import project.Internal.Java_Problems
import project.Internal.Parse_Values_Helper
from project.Data.Type.Value_Type import Auto, Bits, Value_Type
@@ -173,9 +174,10 @@ type Data_Formatter
WhitespaceStrippingParser.new base_parser

## PRIVATE
make_integer_parser self auto_mode=False =
make_integer_parser self auto_mode=False target_type=Value_Type.Integer =
separator = if self.thousand_separator.is_empty then Nothing else self.thousand_separator
NumberParser.createIntegerParser auto_mode.not (auto_mode.not || self.allow_leading_zeros) self.trim_values separator
storage_type = Storage.from_value_type_strict target_type
NumberParser.createIntegerParser storage_type auto_mode.not (auto_mode.not || self.allow_leading_zeros) self.trim_values separator

## PRIVATE
make_decimal_parser self auto_mode=False =
@@ -221,8 +223,8 @@ type Data_Formatter

## PRIVATE
make_value_type_parser self value_type = case value_type of
# TODO once we implement #5159 we will need to add checks for bounds here and support 16/32-bit ints
Value_Type.Integer Bits.Bits_64 -> self.make_integer_parser
Value_Type.Integer _ ->
self.make_integer_parser target_type=value_type
# TODO once we implement #6109 we can support 32-bit floats
Value_Type.Float Bits.Bits_64 -> self.make_decimal_parser
Value_Type.Boolean -> self.make_boolean_parser
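
The hunks above thread a target integer type into the integer parser, so that 16- and 32-bit targets can be supported. As a rough illustration of what bounds-aware parsing can look like, here is a minimal Java sketch. The class `BoundedIntegerParserSketch` and its methods are hypothetical. How the real `NumberParser.createIntegerParser` treats out-of-range input is not shown in this diff; returning `null` is an assumption made purely for the example.

```java
/** Hypothetical illustration; not the NumberParser from this PR. */
final class BoundedIntegerParserSketch {
    /** Checks whether value fits a signed two's-complement integer of the given width. */
    static boolean fits(long value, int bits) {
        if (bits >= 64) {
            return true; // every long fits into 64 bits
        }
        long min = -(1L << (bits - 1));
        long max = (1L << (bits - 1)) - 1;
        return value >= min && value <= max;
    }

    /**
     * Parses text as an integer of the given width.
     * Returns null when the text is not an integer or the value is out of range
     * (an assumption of this sketch, not documented behaviour).
     */
    static Long parse(String text, int bits) {
        long value;
        try {
            value = Long.parseLong(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
        if (!fits(value, bits)) {
            return null;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("123", 16));
        System.out.println(parse("40000", 16));
        System.out.println(parse("40000", 32));
    }
}
```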
Original file line number Diff line number Diff line change
@@ -9,6 +9,7 @@ from Standard.Table.Errors import Inexact_Type_Coercion

polyglot java import org.enso.table.data.column.builder.Builder
polyglot java import org.enso.table.data.column.storage.type.AnyObjectType
polyglot java import org.enso.table.data.column.storage.type.Bits as Java_Bits
polyglot java import org.enso.table.data.column.storage.type.BooleanType
polyglot java import org.enso.table.data.column.storage.type.DateTimeType
polyglot java import org.enso.table.data.column.storage.type.DateType
@@ -24,9 +25,9 @@ to_value_type : StorageType -> Value_Type
to_value_type storage_type = case storage_type of
i : IntegerType -> case i.bits.toInteger of
8 -> Value_Type.Byte
b -> Value_Type.Integer (Bits.from_bits b)
b -> Value_Type.Integer (Bits.from_integer b)
f : FloatType ->
bits = Bits.from_bits f.bits.toInteger
bits = Bits.from_integer f.bits.toInteger
Value_Type.Float bits
_ : BooleanType -> Value_Type.Boolean
s : TextType ->
@@ -40,12 +41,18 @@ to_value_type storage_type = case storage_type of

## PRIVATE
closest_storage_type value_type = case value_type of
# TODO we will want builders and storages with bounds checking, but for now we approximate
Value_Type.Byte -> IntegerType.INT_64
Value_Type.Integer _ -> IntegerType.INT_64
Value_Type.Byte -> IntegerType.INT_8
Value_Type.Integer bits ->
java_bits = Java_Bits.fromInteger bits.to_integer
IntegerType.create java_bits
Value_Type.Float _ -> FloatType.FLOAT_64
Value_Type.Boolean -> BooleanType.INSTANCE
Value_Type.Char _ _ -> TextType.VARIABLE_LENGTH
Value_Type.Char Nothing True -> TextType.VARIABLE_LENGTH
Value_Type.Char Nothing False ->
Error.throw (Illegal_Argument.Error "Value_Type.Char with fixed length must have a non-nothing size")
Value_Type.Char max_length variable_length ->
fixed_length = variable_length.not
TextType.new max_length fixed_length
Value_Type.Date -> DateType.INSTANCE
# We currently will not support storing dates without timezones in in-memory mode.
Value_Type.Date_Time _ -> DateTimeType.INSTANCE
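
The new `Value_Type.Char` branches in `closest_storage_type` distinguish three cases: no size with variable length, no size with fixed length (rejected), and an explicit size. A minimal Java sketch of the same decision follows. `TextTypeSketch` and `fromChar` are hypothetical stand-ins for illustration and are not the `TextType` class used by the PR; the `-1` sentinel for "no limit" is likewise an assumption.

```java
/** Hypothetical illustration of the Char-to-storage-type mapping; not the PR's TextType. */
final class TextTypeSketch {
    /** Sentinel used by this sketch for "no length limit". */
    static final int NO_LIMIT = -1;

    static final TextTypeSketch VARIABLE_LENGTH = new TextTypeSketch(NO_LIMIT, false);

    final int maxLength;
    final boolean fixedLength;

    private TextTypeSketch(int maxLength, boolean fixedLength) {
        this.maxLength = maxLength;
        this.fixedLength = fixedLength;
    }

    /** Mirrors the three Char cases: size may be null to mean "no limit". */
    static TextTypeSketch fromChar(Integer size, boolean variableLength) {
        if (size == null) {
            if (variableLength) {
                return VARIABLE_LENGTH;
            }
            // A fixed-length type without a size is meaningless, so it is rejected.
            throw new IllegalArgumentException(
                "Value_Type.Char with fixed length must have a non-nothing size");
        }
        return new TextTypeSketch(size, !variableLength);
    }

    public static void main(String[] args) {
        TextTypeSketch limited = fromChar(10, true);
        System.out.println(limited.maxLength + " fixed=" + limited.fixedLength);
    }
}
```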
Original file line number Diff line number Diff line change
@@ -16,15 +16,15 @@ type Bits
Bits_64

## PRIVATE
to_bits : Integer
to_bits self = case self of
to_integer : Integer
to_integer self = case self of
Bits.Bits_16 -> 16
Bits.Bits_32 -> 32
Bits.Bits_64 -> 64

## PRIVATE
from_bits : Integer -> Bits
from_bits bits = case bits of
from_integer : Integer -> Bits
from_integer bits = case bits of
16 -> Bits.Bits_16
32 -> Bits.Bits_32
64 -> Bits.Bits_64
@@ -33,17 +33,17 @@ type Bits
## PRIVATE
Provides the text representation of the bit-size.
to_text : Text
to_text self = self.to_bits.to_text + " bits"
to_text self = self.to_integer.to_text + " bits"

## PRIVATE
type Bits_Comparator
## PRIVATE
compare : Bits -> Bits -> Ordering
compare x y = Comparable.from x.to_bits . compare x.to_bits y.to_bits
compare x y = Comparable.from x.to_integer . compare x.to_integer y.to_integer

## PRIVATE
hash : Bits -> Integer
hash x = Comparable.from x.to_bits . hash x.to_bits
hash x = Comparable.from x.to_integer . hash x.to_integer

Comparable.from (_:Bits) = Bits_Comparator
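
The renamed `to_integer` / `from_integer` pair and the comparator above amount to a small enum with a numeric view. A rough Java equivalent is sketched below. `BitsSketch` is a hypothetical name used only for this example. The diff cuts off before the fallback branch of `from_integer`, so throwing on an unknown width is an assumption.

```java
import java.util.Comparator;

/** Hypothetical Java analogue of the Enso Bits type; not the PR's Java Bits class. */
enum BitsSketch {
    BITS_16(16),
    BITS_32(32),
    BITS_64(64);

    /** Orders bit-widths by their numeric size, like Bits_Comparator.compare. */
    static final Comparator<BitsSketch> BY_WIDTH = Comparator.comparingInt(BitsSketch::toInteger);

    private final int width;

    BitsSketch(int width) {
        this.width = width;
    }

    int toInteger() {
        return width;
    }

    static BitsSketch fromInteger(int bits) {
        for (BitsSketch candidate : values()) {
            if (candidate.width == bits) {
                return candidate;
            }
        }
        // Assumption: unknown widths are rejected; the diff does not show this branch.
        throw new IllegalArgumentException("Unsupported bit-width: " + bits);
    }

    @Override
    public String toString() {
        return width + " bits";
    }
}
```

Keeping a single numeric view means the text form, the ordering and the hash can all be derived from it, which is what the Enso code does via `to_integer`.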

@@ -96,8 +96,9 @@ type Value_Type

Arguments:
- size: the maximum number of characters that can be stored in the
column.
column. It can be nothing to indicate no limit.
- variable_length: whether the size is a maximum or a fixed length.
A fixed length string must have a non-nothing size.
Char size:(Integer|Nothing)=Nothing variable_length:Boolean=True

## Date
@@ -389,9 +390,9 @@ type Value_Type
constructor_name = Meta.meta self . constructor . name
additional_fields = case self of
Value_Type.Integer size ->
[["bits", size.to_bits]]
[["bits", size.to_integer]]
Value_Type.Float size ->
[["bits", size.to_bits]]
[["bits", size.to_integer]]
Value_Type.Decimal precision scale ->
[["precision", precision], ["scale", scale]]
Value_Type.Char size variable_length ->