diff --git a/LICENSE b/LICENSE index 09fc9181..e39f7d27 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ BSD 3-Clause License -Copyright (c) 2020-2022, UW Interactive Data Lab +Copyright (c) 2020-2024, UW Interactive Data Lab All rights reserved. Redistribution and use in source and binary forms, with or without diff --git a/README.md b/README.md index 8d506a5f..689a5559 100644 --- a/README.md +++ b/README.md @@ -93,13 +93,9 @@ Arquero uses modern JavaScript features, and so will not work with some outdated ### In Node.js or Application Bundles -First install `arquero` as a dependency, for example via `npm install arquero --save`. Arquero assumes Node version 12 or higher. - -Import using CommonJS module syntax: - -```js -const aq = require('arquero'); -``` +First install `arquero` as a dependency, for example via `npm install arquero --save`. +Arquero assumes Node version 18 or higher. +As of Arquero version 6, the library uses type `module` and should be loaded using ES module syntax. Import using ES module syntax, import all exports into a single object: @@ -113,6 +109,12 @@ Import using ES module syntax, with targeted imports: import { op, table } from 'arquero'; ``` +Dynamic import (e.g., within a Node.js REPL): + +```js +aq = await import('arquero'); +``` + ## Build Instructions To build and develop Arquero locally: diff --git a/docs/api/expressions.md b/docs/api/expressions.md index ea0d0d5a..cd819764 100644 --- a/docs/api/expressions.md +++ b/docs/api/expressions.md @@ -142,14 +142,10 @@ So why do we do this? Here are a few reasons: * **Performance**. After parsing an expression, Arquero performs code generation, often creating more performant code in the process. This level of indirection also allows us to generate optimized expressions for certain inputs, such as Apache Arrow data. -* **Flexibility**. Providing our own parsing also allows us to introduce new kinds of backing data without changing the API. 
For example, we could add support for different underlying data formats and storage layouts. - -* **Portability**. While a common use case of Arquero is to query data directly in the same JavaScript runtime, Arquero verbs can also be [*serialized as queries*](./#queries): one can specify verbs in one environment, but then send them to another environment for processing. For example, the [arquero-worker](https://github.com/uwdata/arquero-worker) package sends queries to a worker thread, while the [arquero-sql](https://github.com/chanwutk/arquero-sql) package sends them to a backing database server. As custom methods may not be defined in those environments, Arquero is designed to make this translation between environments possible and easier to reason about. - -* **Safety**. Arquero table expressions do not let you call methods defined on input data values. For example, to trim a string you must call `op.trim(str)`, not `str.trim()`. Again, this aids portability: otherwise unsupported methods defined on input data elements might "sneak" in to the processing. Invoking arbitrary methods may also lead to security vulnerabilities when allowing untrusted third parties to submit queries into a system. +* **Flexibility**. Providing our own parsing also allows us to introduce new kinds of backing data without changing the API. For example, we could add support for different underlying data formats and storage layouts. More importantly, it also allows us to analyze expressions and incorporate aggregate and window functions in otherwise "normal" JavaScript expressions. * **Discoverability**. Defining all functions on a single object provides a single catalog of all available operations. In most IDEs, you can simply type `op.` (and perhaps hit the tab key) to see a list of all available functions and benefit from auto-complete! -Of course, one might wish to make different trade-offs. 
Arquero is designed to support common use cases while also being applicable to more complex production setups. This goal comes with the cost of more rigid management of functions. However, Arquero can be extended with custom variables, functions, and even new table methods or verbs! As starting points, see the [params](table#params), [addFunction](extensibility#addFunction), and [addTableMethod](extensibility#addTableMethod) functions to introduce external variables, register new `op` functions, or extend tables with new methods. +Of course, one might wish to make different trade-offs. Arquero is designed to support common use cases while also being applicable to more complex production setups. This goal comes with the cost of more rigid management of functions. However, Arquero can be extended with custom variables, functions, and even new table methods or verbs! As starting points, see the [params](table#params) and [addFunction](extensibility#addFunction) methods to introduce external variables or register new `op` functions. -All that being said, not all use cases require portability, safety, etc. For such cases Arquero provides an escape hatch: use the [`escape()` expression helper](./#escape) to apply a standard JavaScript function *as-is*, skipping any internal parsing and code generation. \ No newline at end of file +All that being said, Arquero provides an escape hatch: use the [`escape()` expression helper](./#escape) to apply a standard JavaScript function *as-is*, skipping any internal parsing and code generation. As a result, escaped functions do *not* support aggregation and window operations, as these depend on Arquero's internal parsing and code generation. 
diff --git a/docs/api/extensibility.md b/docs/api/extensibility.md index 30ab6a3a..0308c139 100644 --- a/docs/api/extensibility.md +++ b/docs/api/extensibility.md @@ -14,6 +14,7 @@ title: Extensibility \| Arquero API Reference * [addVerb](#addVerb) * [Package Bundles](#packages) * [addPackage](#addPackage) +* [Table Methods](#table-methods)
@@ -123,158 +124,21 @@ aq.table({ x: [4, 3, 2, 1] }) ## Table Methods -Add new table-level methods or verbs. The [addTableMethod](#addTableMethod) function registers a new function as an instance method of tables only. The [addVerb](#addVerb) method registers a new transformation verb with both tables and serializable [queries](./#query). - -
# -aq.addTableMethod(name, method[, options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/register.js) - -Register a custom table method, adding a new method with the given *name* to all table instances. The provided *method* must take a table as its first argument, followed by any additional arguments. - -This method throws an error if the *name* argument is not a legal string value. -To protect Arquero internals, the *name* can not start with an underscore (`_`) character. If a custom method with the same name is already registered, the override option must be specified to overwrite it. In no case may a built-in method be overridden. - -* *name*: The name to use for the table method. -* *method*: A function implementing the table method. This function should accept a table as its first argument, followed by any additional arguments. -* *options*: Function registration options. - * *override*: Boolean flag (default `false`) indicating if the added method is allowed to override an existing method with the same name. Built-in table methods can **not** be overridden; this flag applies only to methods previously added using the extensibility API. - -*Examples* - -```js -// add a table method named size, returning an array of row and column counts -aq.addTableMethod('size', table => [table.numRows(), table.numCols()]); -aq.table({ a: [1,2,3], b: [4,5,6] }).size() // [3, 2] -``` - -
# -aq.addVerb(name, method, params[, options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/register.js) - -Register a custom transformation verb with the given *name*, adding both a table method and serializable [query](./#query) support. The provided *method* must take a table as its first argument, followed by any additional arguments. The required *params* argument describes the parameters the verb accepts. If you wish to add a verb to tables but do not require query serialization support, use [addTableMethod](#addTableMethod). - -This method throws an error if the *name* argument is not a legal string value. -To protect Arquero internals, the *name* can not start with an underscore (`_`) character. If a custom method with the same name is already registered, the override option must be specified to overwrite it. In no case may a built-in method be overridden. - -* *name*: The name to use for the table method. -* *method*: A function implementing the table method. This function should accept a table as its first argument, followed by any additional arguments. -* *params*: An array of schema descriptions for the verb parameters. These descriptors are needed to support query serialization. Each descriptor is an object with *name* (string-valued parameter name) and *type* properties (string-valued parameter type, see below). If a parameter has type `"Options"`, the descriptor can include an additional object-valued *props* property to describe any non-literal values, for which the keys are property names and the values are parameter types. -* *options*: Function registration options. - * *override*: Boolean flag (default `false`) indicating if the added method is allowed to override an existing method with the same name. Built-in verbs can **not** be overridden; this flag applies only to methods previously added using the extensibility API. - -*Parameter Types*. 
The supported parameter types are: - -* `"Expr"`: A single table expression, such as the input to [`filter()`](verbs/#filter). -* `"ExprList"`: A list of column references or expressions, such as the input to [`groupby()`](verbs/#groupby). -* `"ExprNumber"`: A number literal or numeric table expression, such as the *weight* option of [`sample()`](verbs/#sample). -* `"ExprObject"`: An object containing a set of expressions, such as the input to [`rollup()`](verbs/#rollup). -* `"JoinKeys"`: Input join keys, as in [`join()`](verbs/#join). -* `"JoinValues"`: Output join values, as in [`join()`](verbs/#join). -* `"Options"`: An options object of key-value pairs. If any of the option values are column references or table expressions, the descriptor should include a *props* property with property names as keys and parameter types as values. -* `"OrderKeys"`: A list of ordering criteria, as in [`orderby`](verbs/#orderby). -* `"SelectionList"`: A set of columns to select and potentially rename, as in [`select`](verbs/#select). -* `"TableRef"`: A reference to an additional input table, as in [`join()`](verbs/#join). -* `"TableRefList"`: A list of one or more additional input tables, as in [`concat()`](verbs/#concat). - -*Examples* - -```js -// add a bootstrapped confidence interval verb that -// accepts an aggregate expression plus options -aq.addVerb( - 'bootstrap_ci', - (table, expr, options = {}) => table - .params({ frac: options.frac || 1000 }) - .sample((d, $) => op.round($.frac * op.count()), { replace: true }) - .derive({ id: (d, $) => op.row_number() % $.frac }) - .groupby('id') - .rollup({ bs: expr }) - .rollup({ - lo: op.quantile('bs', options.lo || 0.025), - hi: op.quantile('bs', options.hi || 0.975) - }), - [ - { name: 'expr', type: 'Expr' }, - { name: 'options', type: 'Options' } - ] -); - -// apply the new verb -aq.table({ x: [1, 2, 3, 4, 6, 8, 9, 10] }) - .bootstrap_ci(op.mean('x')) -``` - -
- -## Package Bundles - -Extend Arquero with a bundle of functions, table methods, and/or verbs. - -
# -aq.addPackage(bundle[, options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/register.js) - -Register a *bundle* of extensions, which may include standard functions, aggregate functions, window functions, table methods, and verbs. If the input *bundle* has a key named `"arquero_package"`, the value of that property is used; otherwise the *bundle* object is used directly. This method is particularly useful for publishing separate packages of Arquero extensions and then installing them with a single method call. - -A package bundle has the following structure: - -```js -const bundle = { - functions: { ... }, - aggregateFunctions: { ... }, - windowFunctions: { ... }, - tableMethods: { ... }, - verbs: { ... } -}; -``` - -All keys are optional. For example, `functions` or `verbs` may be omitted. Each sub-bundle is an object of key-value pairs, where the key is the name of the function and the value is the function to add. - -The lone exception is the `verbs` bundle, which instead uses an object format with *method* and *params* keys, corresponding to the *method* and *params* arguments of [addVerb](#addVerb): - -```js -const bundle = { - verbs: { - name: { - method: (table, expr) => { ... }, - params: [ { name: 'expr': type: 'Expr' } ] - } - } -}; -``` - -The package method performs validation prior to adding any package content. The method will throw an error if any of the package items fail validation. See the [addFunction](#addFunction), [addAggregateFunction](#addAggregateFunction), [addWindowFunction](#windowFunction), [addTableMethod](#addTableMethod), and [addVerb](#addVerb) methods for specific validation criteria. The *options* argument can be used to specify if method overriding is permitted, as supported by each of the aforementioned methods. - -* *bundle*: The package bundle of extensions. -* *options*: Function registration options. 
- * *override*: Boolean flag (default `false`) indicating if the added method is allowed to override an existing method with the same name. Built-in table methods or verbs can **not** be overridden; for table methods and verbs this flag applies only to methods previously added using the extensibility API. +To add new table-level methods, including transformation verbs, simply assign new methods to the `ColumnTable` class prototype. *Examples* ```js -// add a package -aq.addPackage({ - functions: { - square: x => x * x, - }, - tableMethods: { - size: table => [table.numRows(), table.numCols()] - } -}); -``` - -```js -// add a package, ignores any content outside of "arquero_package" -aq.addPackage({ - arquero_package: { - functions: { - square: x => x * x, - }, - tableMethods: { - size: table => [table.numRows(), table.numCols()] +import { ColumnTable, op } from 'arquero'; + +// add a sum verb, which returns a new table containing summed +// values (potentially grouped) for a given column name +Object.assign( + ColumnTable.prototype, + { + sum(column, { as = 'sum' } = {}) { + return this.rollup({ [as]: op.sum(column) }); } } -}); +); ``` - -```js -// add a package from a separate library -aq.addPackage(require('arquero-arrow')); -``` \ No newline at end of file diff --git a/docs/api/index.md b/docs/api/index.md index 2fb03053..6d128398 100644 --- a/docs/api/index.md +++ b/docs/api/index.md @@ -10,7 +10,7 @@ title: Arquero API Reference * [Table Input](#input) * [load](#load), [loadArrow](#loadArrow), [loadCSV](#loadCSV), [loadFixed](#loadFixed), [loadJSON](#loadJSON) * [Table Output](#output) - * [toArrow](#toArrow) + * [toArrow](#toArrow), [toArrowIPC](#toArrowIPC) * [Expression Helpers](#expression-helpers) * [op](#op), [agg](#agg), [escape](#escape) * [bin](#bin), [desc](#desc), [frac](#frac), [rolling](#rolling), [seed](#seed) @@ -18,8 +18,6 @@ title: Arquero API Reference * [all](#all), [not](#not), [range](#range) * [matches](#matches), 
[startswith](#startswith), [endswith](#endswith) * [names](#names) -* [Queries](#queries) - * [query](#query), [queryFrom](#queryFrom)
@@ -102,6 +100,11 @@ This method performs parsing only. To both load and parse an Arrow file, use [lo * *arrowTable*: An [Apache Arrow](https://arrow.apache.org/docs/js/) data table or a byte array (e.g., [ArrayBuffer](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer) or [Uint8Array](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Uint8Array)) in the Arrow IPC format. * *options*: An Arrow import options object: * *columns*: An ordered set of columns to import. The input may consist of: column name strings, column integer indices, objects with current column names as keys and new column names as values (for renaming), or a selection helper function such as [all](#all), [not](#not), or [range](#range). + * *convertDate*: Boolean flag (default `true`) to convert Arrow date values to JavaScript Date objects. If false, defaults to what the Arrow implementation provides, typically timestamps as number values. + * *convertDecimal*: Boolean flag (default `true`) to convert Arrow fixed point decimal values to JavaScript numbers. If false, defaults to what the Arrow implementation provides, typically byte arrays. The conversion will be lossy if the decimal can not be exactly represented as a double-precision floating point number. + * *convertTimestamp*: Boolean flag (default `true`) to convert Arrow timestamp values to JavaScript Date objects. If false, defaults to what the Arrow implementation provides, typically timestamps as number values. + * *convertBigInt*: Boolean flag (default `false`) to convert Arrow integers with bit widths of 64 bits or higher to JavaScript numbers. If false, defaults to what the Arrow implementation provides, typically `BigInt` values. The conversion will be lossy if the integer is so large it can not be exactly represented as a double-precision floating point number. + * *memoize*: Boolean hint (default `true`) to enable memoization of expensive conversions. 
If true, memoization is applied for string and nested (list, struct) types, caching extracted values to enable faster access. Memoization is also applied to converted Date values, in part to ensure exact object equality. This hint is ignored for dictionary columns, whose values are always memoized. *Examples* @@ -405,10 +408,10 @@ const dt = await aq.loadJSON('data/table.json', { autoType: false }) ## Table Output -Methods for writing table data to an output format. Most output methods are defined as [table methods](table#output), not in the top level namespace. +Methods for writing data to an output format. Most output methods are available as [table methods](table#output), in addition to the top level namespace.
# -aq.toArrow(data[, options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/arrow/encode/index.js) +aq.toArrow(data[, options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/arrow/to-arrow.js) Create an [Apache Arrow](https://arrow.apache.org/docs/js/) table for the input *data*. The input data can be either an [Arquero table](#table) or an array of standard JavaScript objects. This method will throw an error if type inference fails or if the generated columns have differing lengths. For Arquero tables, this method can instead be invoked as [table.toArrow()](table#toArrow). @@ -477,6 +480,34 @@ const at = toArrow([ ]); ``` +
# +aq.toArrowIPC(data[, options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/arrow/to-arrow-ipc.js) + +Format input data in the binary [Apache Arrow](https://arrow.apache.org/docs/js/) IPC format. The input data can be either an [Arquero table](#table) or an array of standard JavaScript objects. This method will throw an error if type inference fails or if the generated columns have differing lengths. For Arquero tables, this method can instead be invoked as [table.toArrowIPC()](table#toArrowIPC). + +The resulting binary data may be saved to disk or passed between processes or tools. For example, when using [Web Workers](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers), the output of this method can be passed directly between threads (no data copy) as a [Transferable](https://developer.mozilla.org/en-US/docs/Web/API/Transferable) object. Additionally, Arrow binary data can be loaded in other language environments such as [Python](https://arrow.apache.org/docs/python/) or [R](https://arrow.apache.org/docs/r/). + +* *options*: Options for Arrow encoding, same as [toArrow](#toArrow) but with an additional *format* option. + * *format*: The Arrow IPC byte format to use. One of `'stream'` (default) or `'file'`. + +*Examples* + +Encode Arrow data from an input Arquero table: + +```js +import { table, toArrowIPC } from 'arquero'; + +const dt = table({ + x: [1, 2, 3, 4, 5], + y: [3.4, 1.6, 5.4, 7.1, 2.9] +}); + +// encode table as a transferable Arrow byte buffer +// here, infers Uint8 for 'x' and Float64 for 'y' +const bytes = toArrowIPC(dt); +```
@@ -776,58 +807,3 @@ table.rename(aq.names(['a', 'b', 'c'])) // select and rename the first three columns, all other columns are dropped table.select(aq.names(['a', 'b', 'c'])) ``` - - -
- - -## Queries - -Queries allow deferred processing. Rather than process a sequence of verbs immediately, they can be stored as a query. The query can then be *serialized* to be stored or transferred, or later *evaluated* against an Arquero table. - -
# -aq.query([tableName]) · [Source](https://github.com/uwdata/arquero/blob/master/src/query/query.js) - -Create a new query builder instance. The optional *tableName* string argument indicates the default name of a table the query should process, and is used only when evaluating a query against a catalog of tables. The resulting query builder includes the same [verb](verbs) methods as a normal Arquero table. However, rather than evaluating verbs immediately, they are stored as a list of verbs to be evaluated later. - -The method *query.evaluate(table, catalog)* will evaluate the query against an Arquero table. If provided, the optional *catalog* argument should be a function that takes a table name string as input and returns a corresponding Arquero table instance. The catalog will be used to lookup tables referenced by name for multi-table operations such as joins, or to lookup the primary table to process when the *table* argument to evaluate is `null` or `undefined`. - -Use the query *toObject()* method to serialize a query to a JSON-compatible object. Use the top-level [queryFrom](#queryFrom) method to parse a serialized query and return a new "live" query instance. - -*Examples* - -```js -// create a query, then evaluate it on an input table -const q = aq.query() - .derive({ add1: d => d.value + 1 }) - .filter(d => d.add1 > 5 ); - -const t = q.evaluate(table); -``` - -```js -// serialize a query to a JSON-compatible object -// the query can be reconstructed using aq.queryFrom -aq.query() - .derive({ add1: d => d.value + 1 }) - .filter(d => d.add1 > 5 ) - .toObject(); -``` - - -
# -aq.queryFrom(object) · [Source](https://github.com/uwdata/arquero/blob/master/src/query/query.js) - -Parse a serialized query *object* and return a new query instance. The input *object* should be a serialized query representation, such as those generated by the query *toObject()* method. - -*Examples* - -```js -// round-trip a query to a serialized form and back again -aq.queryFrom( - aq.query() - .derive({ add1: d => d.value + 1 }) - .filter(d => d.add1 > 5 ) - .toObject() -) -``` diff --git a/docs/api/op.md b/docs/api/op.md index 2f5c8627..d2387605 100644 --- a/docs/api/op.md +++ b/docs/api/op.md @@ -54,14 +54,6 @@ Merges two or more arrays in sequence, returning a new array. * *values*: The arrays to merge. -
# -op.join(array[, delimiter]) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) - -Creates and returns a new string by concatenating all of the elements in an *array* (or an array-like object), separated by commas or a specified *delimiter* string. If the *array* has only one item, then that item will be returned without using the delimiter. - -* *array*: The input array value. -* *join*: The delimiter string (default `','`). -
# op.includes(array, value[, index]) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) @@ -79,6 +71,14 @@ Returns the first index at which a given *value* can be found in the *sequence* * *sequence*: The input array or string value. * *value*: The value to search for. +
# +op.join(array[, delimiter]) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) + +Creates and returns a new string by concatenating all of the elements in an *array* (or an array-like object), separated by commas or a specified *delimiter* string. If the *array* has only one item, then that item will be returned without using the delimiter. + +* *array*: The input array value. +* *delimiter*: The delimiter string (default `','`). +
# op.lastindexof(sequence, value) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) @@ -102,21 +102,12 @@ Returns a new array in which the given *property* has been extracted for each el * *array*: The input array value. * *property*: The property name string to extract. Nested properties are not supported: the input `"a.b"` indicates a property with that exact name, *not* a nested property `"b"` of the object `"a"`. -
# -op.slice(sequence[, start, end]) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) - -Returns a copy of a portion of the input *sequence* (array or string) selected from *start* to *end* (*end* not included) where *start* and *end* represent the index of items in the sequence. - -* *sequence*: The input array or string value. -* *start*: The starting integer index to copy from (inclusive, default `0`). -* *end*: The ending integer index to copy from (exclusive, default `sequence.length`). -
# -op.reverse(array) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) +op.reverse(sequence) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) -Returns a new array with the element order reversed: the first *array* element becomes the last, and the last *array* element becomes the first. The input *array* is unchanged. +Returns a new array or string with the element order reversed: the first *sequence* element becomes the last, and the last *sequence* element becomes the first. The input *sequence* is unchanged. -* *array*: The input array value. +* *sequence*: The input array or string value.
# op.sequence([start,] stop[, step]) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/sequence.js) @@ -127,6 +118,14 @@ Returns an array containing an arithmetic sequence from the *start* value to the * *stop*: The stopping value of the sequence. The stop value is exclusive; it is not included in the result. * *step*: The step increment between sequence values (default `1`). +
# +op.slice(sequence[, start, end]) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/array.js) + +Returns a copy of a portion of the input *sequence* (array or string) selected from *start* to *end* (*end* not included) where *start* and *end* represent the index of items in the sequence. + +* *sequence*: The input array or string value. +* *start*: The starting integer index to copy from (inclusive, default `0`). +* *end*: The ending integer index to copy from (exclusive, default `sequence.length`).
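The sequence semantics documented above for `op.reverse` and `op.slice` can be sketched in plain JavaScript. The `reverseSeq` and `sliceSeq` helpers below are hypothetical stand-ins written for illustration, not Arquero's implementation; they only mirror the documented behavior for array and string inputs.

```js
// Hypothetical stand-in mirroring the documented op.reverse behavior:
// returns a new reversed array or string, leaving the input unchanged.
function reverseSeq(sequence) {
  return typeof sequence === 'string'
    ? Array.from(sequence).reverse().join('')
    : Array.from(sequence).reverse();
}

// Hypothetical stand-in mirroring the documented op.slice behavior:
// copies from start (inclusive, default 0) to end (exclusive, default length).
function sliceSeq(sequence, start = 0, end = sequence.length) {
  return sequence.slice(start, end);
}

const letters = ['a', 'b', 'c'];
const reversed = reverseSeq(letters);   // ['c', 'b', 'a']
const word = reverseSeq('abc');         // 'cba'
const part = sliceSeq('arquero', 2, 5); // 'que'
```

Within a table expression the corresponding calls would be `op.reverse(...)` and `op.slice(...)`, applied to column values rather than to standalone variables.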
@@ -683,7 +682,7 @@ Compare two values for equality, using join semantics in which `null !== null`. Returns a boolean indicating whether the *object* has the specified *key* as its own property (as opposed to inheriting it). If the *object* is a [Map](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map) or [Set](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set) instance, the `has` method will be invoked directly on the object, otherwise `Object.hasOwnProperty` is used. * *object*: The object, Map, or Set to test for property membership. -* *property*: The string property name to test for. +* *key*: The string key (property name) to test for.
# op.keys(object) · [Source](https://github.com/uwdata/arquero/blob/master/src/op/functions/object.js) @@ -811,6 +810,7 @@ If specified, the *index* looks up a value of the resulting match. If *index* is * *value*: The input string value. * *regexp*: The [regular expression](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions) to match against. +* *index*: The index into the match result array or capture group. *Examples* diff --git a/docs/api/table.md b/docs/api/table.md index d5c7088d..182f2b83 100644 --- a/docs/api/table.md +++ b/docs/api/table.md @@ -14,7 +14,7 @@ title: Table \| Arquero API Reference * [assign](#assign) * [transform](#transform) * [Table Columns](#columns) - * [column](#column), [columnAt](#columnAt), [columnArray](#columnArray) + * [column](#column), [columnAt](#columnAt) * [columnIndex](#columnIndex), [columnName](#columnName), [columnNames](#columnNames) * [Table Values](#table-values) * [array](#array), [values](#values) @@ -23,7 +23,7 @@ title: Table \| Arquero API Reference * [Table Output](#output) * [objects](#objects), [object](#object), [Symbol.iterator](#@@iterator) * [print](#print), [toHTML](#toHTML), [toMarkdown](#toMarkdown) - * [toArrow](#toArrow), [toArrowBuffer](#toArrowBuffer), [toCSV](#toCSV), [toJSON](#toJSON) + * [toArrow](#toArrow), [toArrowIPC](#toArrowIPC), [toCSV](#toCSV), [toJSON](#toJSON)
@@ -235,7 +235,7 @@ aq.table({ a: [1, 2], b: [3, 4] }) Get the column instance with the given *name*, or `undefined` if it does not exist. The returned column object provides a lightweight abstraction over the column storage (such as a backing array), providing a *length* property and *get(row)* method. -A column instance may be used across multiple tables and so does _not_ track a table's filter or orderby criteria. To access filtered or ordered values, use the table [get](#get), [getter](#getter), or [columnArray](#columnArray) methods. +A column instance may be used across multiple tables and so does _not_ track a table's filter or orderby criteria. To access filtered or ordered values, use the table [get](#get), [getter](#getter), or [array](#array) methods. * *name*: The column name. @@ -260,16 +260,6 @@ const dt = aq.table({ a: [1, 2, 3], b: [4, 5, 6] }) dt.columnAt(1).get(1) // 5 ``` -
# -table.columnArray(name[, constructor]) · [Source](https://github.com/uwdata/arquero/blob/master/src/table/table.js) - -_This method is a deprecated alias for the table [array()](#array) method. Please use [array()](#array) instead._ - -Get an array of values contained in the column with the given *name*. Unlike direct access through the table [column](#column) method, the array returned by this method respects any table filter or orderby criteria. By default, a standard [Array](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array) is returned; use the *constructor* argument to specify a [typed array](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray). - -* *name*: The column name. -* *constructor*: An optional array constructor (default [`Array`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/Array)) to use to instantiate the output array. Note that errors or truncated values may occur when assigning to a typed array with an incompatible type. -
# table.columnIndex(name) · [Source](https://github.com/uwdata/arquero/blob/master/src/table/table.js) @@ -362,14 +352,14 @@ for (const value of table.values('colA')) { ``` ```js -// slightly less efficient version of table.columnArray('colA') +// slightly less efficient version of table.array('colA') const colValues = Array.from(table.values('colA')); ```
# table.data() · [Source](https://github.com/uwdata/arquero/blob/master/src/table/table.js) -Returns the internal table storage data structure. +Returns the internal table storage data structure: an object with column names for keys and column arrays for values. This method returns the same structure used by the Table (not a copy) and its contents should not be modified.
# table.get(name[, row]) · [Source](https://github.com/uwdata/arquero/blob/master/src/table/column-table.js) @@ -438,7 +428,7 @@ Perform a table scan, invoking the provided *callback* function for each row of * *callback*: Function invoked for each row of the table. The callback is invoked with the following arguments: * *row*: The table row index. - * *data*: The backing table data store. + * *data*: The backing table data store (as returned by table [`data`](#data) method). * *stop*: A function to stop the scan early. The callback can invoke *stop()* to prevent future scan calls. * *order*: A boolean flag (default `false`), indicating if the table should be scanned in the order determined by [orderby](verbs#orderby). This argument has no effect if the table is unordered. @@ -629,14 +619,15 @@ const at2 = dt.toArrow({ }); ``` -
# +
# table.toArrowBuffer([options]) · [Source](https://github.com/uwdata/arquero/blob/master/src/arrow/encode/index.js) Format this table as binary data in the [Apache Arrow](https://arrow.apache.org/docs/js/) IPC format. The binary data may be saved to disk or passed between processes or tools. For example, when using [Web Workers](https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers), the output of this method can be passed directly between threads (no data copy) as a [Transferable](https://developer.mozilla.org/en-US/docs/Web/API/Transferable) object. Additionally, Arrow binary data can be loaded in other language environments such as [Python](https://arrow.apache.org/docs/python/) or [R](https://arrow.apache.org/docs/r/). -This method will throw an error if type inference fails or if the generated columns have differing lengths. This method is a shorthand for `table.toArrow().serialize()`. +This method will throw an error if type inference fails or if the generated columns have differing lengths. -* *options*: Options for Arrow encoding, same as [toArrow](#toArrow). +* *options*: Options for Arrow encoding, same as [toArrow](#toArrow) but with an additional *format* option. + * *format*: The Arrow IPC byte format to use. One of `'stream'` (default) or `'file'`. *Examples* @@ -652,7 +643,7 @@ const dt = table({ // encode table as a transferable Arrow byte buffer // here, infers Uint8 for 'x' and Float64 for 'y' -const bytes = dt.toArrowBuffer(); +const bytes = dt.toArrowIPC(); ```
# diff --git a/docs/index.md b/docs/index.md index 2a86f6f9..19315224 100644 --- a/docs/index.md +++ b/docs/index.md @@ -93,13 +93,9 @@ Arquero uses modern JavaScript features, and so will not work with some outdated ### In Node.js or Application Bundles -First install `arquero` as a dependency, for example via `npm install arquero --save`. Arquero assumes Node version 12 or higher. - -Import using CommonJS module syntax: - -```js -const aq = require('arquero'); -``` +First install `arquero` as a dependency, for example via `npm install arquero --save`. +Arquero assumes Node version 18 or higher. +As of Arquero version 6, the library uses type `module` and should be loaded using ES module syntax. Import using ES module syntax, import all exports into a single object: @@ -113,6 +109,12 @@ Import using ES module syntax, with targeted imports: import { op, table } from 'arquero'; ``` +Dynamic import (e.g., within a Node.js REPL): + +```js +aq = await import('arquero'); +``` + ## Build Instructions To build and develop Arquero locally: diff --git a/eslint.config.mjs b/eslint.config.js similarity index 90% rename from eslint.config.mjs rename to eslint.config.js index 200ae8e1..b0fe3aeb 100644 --- a/eslint.config.mjs +++ b/eslint.config.js @@ -6,10 +6,11 @@ export default [ js.configs.recommended, { languageOptions: { - ecmaVersion: 2020, - sourceType: 'module', + ecmaVersion: 2022, + sourceType: "module", globals: { ...globals.browser, + ...globals.mocha, ...globals.node, ...globals.es6, globalThis: false diff --git a/jsconfig.json b/jsconfig.json new file mode 100644 index 00000000..c1651dcc --- /dev/null +++ b/jsconfig.json @@ -0,0 +1,12 @@ +{ + "include": ["src/**/*"], + "compilerOptions": { + "allowJs": true, + "checkJs": true, + "noEmit": true, + "module": "node16", + "moduleResolution": "node16", + "target": "es2022", + "skipLibCheck": true + } +} diff --git a/package-lock.json b/package-lock.json index e219bee0..72bbbdcb 100644 --- a/package-lock.json +++ 
b/package-lock.json @@ -1,29 +1,28 @@ { "name": "arquero", - "version": "5.4.0", + "version": "6.0.0", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "arquero", - "version": "5.4.0", + "version": "6.0.0", "license": "BSD-3-Clause", "dependencies": { - "acorn": "^8.12.0", - "apache-arrow": "^15.0.2", - "node-fetch": "^2.7.0" + "acorn": "^8.12.1", + "apache-arrow": "^17.0.0", + "node-fetch": "^3.3.2" }, "devDependencies": { - "@rollup/plugin-json": "^6.1.0", "@rollup/plugin-node-resolve": "^15.2.3", "@rollup/plugin-terser": "^0.4.4", - "eslint": "^9.5.0", - "esm": "^3.2.25", - "rimraf": "^5.0.7", - "rollup": "^4.18.0", + "eslint": "^9.7.0", + "mocha": "^10.6.0", + "rimraf": "^6.0.1", + "rollup": "^4.18.1", "rollup-plugin-bundle-size": "^1.0.3", "tape": "^5.8.1", - "typescript": "^5.5.2" + "typescript": "^5.5.3" } }, "node_modules/@75lb/deep-merge": { @@ -60,23 +59,24 @@ } }, "node_modules/@eslint-community/regexpp": { - "version": "4.10.0", + "version": "4.11.0", + "resolved": "https://registry.npmjs.org/@eslint-community/regexpp/-/regexpp-4.11.0.tgz", + "integrity": "sha512-G/M/tIiMrTAxEWRfLfQJMmGNX28IxBg4PBz8XqQhqUHLFI6TL2htpIB1iQCj144V5ee/JaKyT9/WZ0MGZWfA7A==", "dev": true, - "license": "MIT", "engines": { "node": "^12.0.0 || ^14.0.0 || >=16.0.0" } }, "node_modules/@eslint/config-array": { - "version": "0.16.0", - "resolved": "https://registry.npmjs.org/@eslint/config-array/-/config-array-0.16.0.tgz", - "integrity": "sha512-/jmuSd74i4Czf1XXn7wGRWZCuyaUZ330NH1Bek0Pplatt4Sy1S5haN21SCLLdbeKslQ+S0wEJ+++v5YibSi+Lg==", + "version": "0.17.0", + "resolved": "https://registry.npmjs.org/@eslint/config-array/-/config-array-0.17.0.tgz", + "integrity": "sha512-A68TBu6/1mHHuc5YJL0U0VVeGNiklLAL6rRmhTCP2B5XjWLMnrX+HkO+IAXyHvks5cyyY1jjK5ITPQ1HGS2EVA==", "dev": true, "license": "Apache-2.0", "dependencies": { "@eslint/object-schema": "^2.1.4", "debug": "^4.3.1", - "minimatch": "^3.0.5" + "minimatch": "^3.1.2" }, "engines": { "node": "^18.18.0 || ^20.9.0 || 
>=21.1.0" @@ -107,11 +107,10 @@ } }, "node_modules/@eslint/js": { - "version": "9.5.0", - "resolved": "https://registry.npmjs.org/@eslint/js/-/js-9.5.0.tgz", - "integrity": "sha512-A7+AOT2ICkodvtsWnxZP4Xxk3NbZ3VMHd8oihydLRGrJgqqdEz1qSeEgXYyT/Cu8h1TWWsQRejIx48mtjZ5y1w==", + "version": "9.7.0", + "resolved": "https://registry.npmjs.org/@eslint/js/-/js-9.7.0.tgz", + "integrity": "sha512-ChuWDQenef8OSFnvuxv0TCVxEwmu3+hPNKvM9B34qpM0rDRbjL8t5QkQeHHeAfsKQjuH9wS82WeCi1J/owatng==", "dev": true, - "license": "MIT", "engines": { "node": "^18.18.0 || ^20.9.0 || >=21.1.0" } @@ -154,6 +153,8 @@ }, "node_modules/@isaacs/cliui": { "version": "8.0.2", + "resolved": "https://registry.npmjs.org/@isaacs/cliui/-/cliui-8.0.2.tgz", + "integrity": "sha512-O8jcjabXaleOG9DQ0+ARXWZBTfnP4WNAqzuiJK7ll44AmxGKv/J2M4TPjxjY3znBCfvBXFzucm1twdyFybFqEA==", "dev": true, "license": "ISC", "dependencies": { @@ -170,6 +171,8 @@ }, "node_modules/@isaacs/cliui/node_modules/ansi-regex": { "version": "6.0.1", + "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.0.1.tgz", + "integrity": "sha512-n5M855fKb2SsfMIiFFoVrABHJC8QtHwVx+mHWP3QcEqBHYienj5dHSgjbxtC0WEZXYt4wcD6zrQElDPhFuZgfA==", "dev": true, "license": "MIT", "engines": { @@ -181,6 +184,8 @@ }, "node_modules/@isaacs/cliui/node_modules/strip-ansi": { "version": "7.1.0", + "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.1.0.tgz", + "integrity": "sha512-iq6eVVI64nQQTRYq2KtEg2d2uU7LElhTJwsH4YzIHZshxlgZms/wIc4VoDQTlG/IvVIrBKG06CrZnp0qv7hkcQ==", "dev": true, "license": "MIT", "dependencies": { @@ -250,6 +255,7 @@ "resolved": "https://registry.npmjs.org/@ljharb/resumer/-/resumer-0.1.3.tgz", "integrity": "sha512-d+tsDgfkj9X5QTriqM4lKesCkMMJC3IrbPKHvayP00ELx2axdXvDfWkqjxrLXIzGcQzmj7VAUT1wopqARTvafw==", "dev": true, + "license": "MIT", "dependencies": { "@ljharb/through": "^2.3.13", "call-bind": "^1.0.7" @@ -263,6 +269,7 @@ "resolved": "https://registry.npmjs.org/@ljharb/through/-/through-2.3.13.tgz", "integrity": 
"sha512-/gKJun8NNiWGZJkGzI/Ragc53cOdcLNdzjLaIa+GEjguQs0ulsurx8WN0jijdK9yPqDvziX995sMRLyLt1uZMQ==", "dev": true, + "license": "MIT", "dependencies": { "call-bind": "^1.0.7" }, @@ -304,6 +311,8 @@ }, "node_modules/@pkgjs/parseargs": { "version": "0.11.0", + "resolved": "https://registry.npmjs.org/@pkgjs/parseargs/-/parseargs-0.11.0.tgz", + "integrity": "sha512-+1VkjdD0QBLPodGrJUeqarH8VAIvQODIbwh9XpP5Syisf7YoQgsJKPNFoqqLQlu+VQ/tVSshMR6loPMn8U+dPg==", "dev": true, "license": "MIT", "optional": true, @@ -311,26 +320,6 @@ "node": ">=14" } }, - "node_modules/@rollup/plugin-json": { - "version": "6.1.0", - "resolved": "https://registry.npmjs.org/@rollup/plugin-json/-/plugin-json-6.1.0.tgz", - "integrity": "sha512-EGI2te5ENk1coGeADSIwZ7G2Q8CJS2sF120T7jLw4xFw9n7wIOXHo+kIYRAoVpJAN+kmqZSoO3Fp4JtoNF4ReA==", - "dev": true, - "dependencies": { - "@rollup/pluginutils": "^5.1.0" - }, - "engines": { - "node": ">=14.0.0" - }, - "peerDependencies": { - "rollup": "^1.20.0||^2.0.0||^3.0.0||^4.0.0" - }, - "peerDependenciesMeta": { - "rollup": { - "optional": true - } - } - }, "node_modules/@rollup/plugin-node-resolve": { "version": "15.2.3", "dev": true, @@ -399,9 +388,9 @@ } }, "node_modules/@rollup/rollup-android-arm-eabi": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.18.0.tgz", - "integrity": "sha512-Tya6xypR10giZV1XzxmH5wr25VcZSncG0pZIjfePT0OVBvqNEurzValetGNarVrGiq66EBVAFn15iYX4w6FKgQ==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.18.1.tgz", + "integrity": "sha512-lncuC4aHicncmbORnx+dUaAgzee9cm/PbIqgWz1PpXuwc+sa1Ct83tnqUDy/GFKleLiN7ZIeytM6KJ4cAn1SxA==", "cpu": [ "arm" ], @@ -413,9 +402,9 @@ ] }, "node_modules/@rollup/rollup-android-arm64": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.18.0.tgz", - "integrity": 
"sha512-avCea0RAP03lTsDhEyfy+hpfr85KfyTctMADqHVhLAF3MlIkq83CP8UfAHUssgXTYd+6er6PaAhx/QGv4L1EiA==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.18.1.tgz", + "integrity": "sha512-F/tkdw0WSs4ojqz5Ovrw5r9odqzFjb5LIgHdHZG65dFI1lWTWRVy32KDJLKRISHgJvqUeUhdIvy43fX41znyDg==", "cpu": [ "arm64" ], @@ -427,9 +416,9 @@ ] }, "node_modules/@rollup/rollup-darwin-arm64": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.18.0.tgz", - "integrity": "sha512-IWfdwU7KDSm07Ty0PuA/W2JYoZ4iTj3TUQjkVsO/6U+4I1jN5lcR71ZEvRh52sDOERdnNhhHU57UITXz5jC1/w==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.18.1.tgz", + "integrity": "sha512-vk+ma8iC1ebje/ahpxpnrfVQJibTMyHdWpOGZ3JpQ7Mgn/3QNHmPq7YwjZbIE7km73dH5M1e6MRRsnEBW7v5CQ==", "cpu": [ "arm64" ], @@ -441,9 +430,9 @@ ] }, "node_modules/@rollup/rollup-darwin-x64": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.18.0.tgz", - "integrity": "sha512-n2LMsUz7Ynu7DoQrSQkBf8iNrjOGyPLrdSg802vk6XT3FtsgX6JbE8IHRvposskFm9SNxzkLYGSq9QdpLYpRNA==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.18.1.tgz", + "integrity": "sha512-IgpzXKauRe1Tafcej9STjSSuG0Ghu/xGYH+qG6JwsAUxXrnkvNHcq/NL6nz1+jzvWAnQkuAJ4uIwGB48K9OCGA==", "cpu": [ "x64" ], @@ -455,9 +444,9 @@ ] }, "node_modules/@rollup/rollup-linux-arm-gnueabihf": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.18.0.tgz", - "integrity": "sha512-C/zbRYRXFjWvz9Z4haRxcTdnkPt1BtCkz+7RtBSuNmKzMzp3ZxdM28Mpccn6pt28/UWUCTXa+b0Mx1k3g6NOMA==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.18.1.tgz", + "integrity": 
"sha512-P9bSiAUnSSM7EmyRK+e5wgpqai86QOSv8BwvkGjLwYuOpaeomiZWifEos517CwbG+aZl1T4clSE1YqqH2JRs+g==", "cpu": [ "arm" ], @@ -469,9 +458,9 @@ ] }, "node_modules/@rollup/rollup-linux-arm-musleabihf": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.18.0.tgz", - "integrity": "sha512-l3m9ewPgjQSXrUMHg93vt0hYCGnrMOcUpTz6FLtbwljo2HluS4zTXFy2571YQbisTnfTKPZ01u/ukJdQTLGh9A==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.18.1.tgz", + "integrity": "sha512-5RnjpACoxtS+aWOI1dURKno11d7krfpGDEn19jI8BuWmSBbUC4ytIADfROM1FZrFhQPSoP+KEa3NlEScznBTyQ==", "cpu": [ "arm" ], @@ -483,9 +472,9 @@ ] }, "node_modules/@rollup/rollup-linux-arm64-gnu": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.18.0.tgz", - "integrity": "sha512-rJ5D47d8WD7J+7STKdCUAgmQk49xuFrRi9pZkWoRD1UeSMakbcepWXPF8ycChBoAqs1pb2wzvbY6Q33WmN2ftw==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.18.1.tgz", + "integrity": "sha512-8mwmGD668m8WaGbthrEYZ9CBmPug2QPGWxhJxh/vCgBjro5o96gL04WLlg5BA233OCWLqERy4YUzX3bJGXaJgQ==", "cpu": [ "arm64" ], @@ -497,9 +486,9 @@ ] }, "node_modules/@rollup/rollup-linux-arm64-musl": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.18.0.tgz", - "integrity": "sha512-be6Yx37b24ZwxQ+wOQXXLZqpq4jTckJhtGlWGZs68TgdKXJgw54lUUoFYrg6Zs/kjzAQwEwYbp8JxZVzZLRepQ==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.18.1.tgz", + "integrity": "sha512-dJX9u4r4bqInMGOAQoGYdwDP8lQiisWb9et+T84l2WXk41yEej8v2iGKodmdKimT8cTAYt0jFb+UEBxnPkbXEQ==", "cpu": [ "arm64" ], @@ -511,9 +500,9 @@ ] }, "node_modules/@rollup/rollup-linux-powerpc64le-gnu": { - 
"version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.18.0.tgz", - "integrity": "sha512-hNVMQK+qrA9Todu9+wqrXOHxFiD5YmdEi3paj6vP02Kx1hjd2LLYR2eaN7DsEshg09+9uzWi2W18MJDlG0cxJA==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-powerpc64le-gnu/-/rollup-linux-powerpc64le-gnu-4.18.1.tgz", + "integrity": "sha512-V72cXdTl4EI0x6FNmho4D502sy7ed+LuVW6Ym8aI6DRQ9hQZdp5sj0a2usYOlqvFBNKQnLQGwmYnujo2HvjCxQ==", "cpu": [ "ppc64" ], @@ -525,9 +514,9 @@ ] }, "node_modules/@rollup/rollup-linux-riscv64-gnu": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.18.0.tgz", - "integrity": "sha512-ROCM7i+m1NfdrsmvwSzoxp9HFtmKGHEqu5NNDiZWQtXLA8S5HBCkVvKAxJ8U+CVctHwV2Gb5VUaK7UAkzhDjlg==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.18.1.tgz", + "integrity": "sha512-f+pJih7sxoKmbjghrM2RkWo2WHUW8UbfxIQiWo5yeCaCM0TveMEuAzKJte4QskBp1TIinpnRcxkquY+4WuY/tg==", "cpu": [ "riscv64" ], @@ -539,9 +528,9 @@ ] }, "node_modules/@rollup/rollup-linux-s390x-gnu": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.18.0.tgz", - "integrity": "sha512-0UyyRHyDN42QL+NbqevXIIUnKA47A+45WyasO+y2bGJ1mhQrfrtXUpTxCOrfxCR4esV3/RLYyucGVPiUsO8xjg==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.18.1.tgz", + "integrity": "sha512-qb1hMMT3Fr/Qz1OKovCuUM11MUNLUuHeBC2DPPAWUYYUAOFWaxInaTwTQmc7Fl5La7DShTEpmYwgdt2hG+4TEg==", "cpu": [ "s390x" ], @@ -553,9 +542,9 @@ ] }, "node_modules/@rollup/rollup-linux-x64-gnu": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.18.0.tgz", - "integrity": 
"sha512-xuglR2rBVHA5UsI8h8UbX4VJ470PtGCf5Vpswh7p2ukaqBGFTnsfzxUBetoWBWymHMxbIG0Cmx7Y9qDZzr648w==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.18.1.tgz", + "integrity": "sha512-7O5u/p6oKUFYjRbZkL2FLbwsyoJAjyeXHCU3O4ndvzg2OFO2GinFPSJFGbiwFDaCFc+k7gs9CF243PwdPQFh5g==", "cpu": [ "x64" ], @@ -567,9 +556,9 @@ ] }, "node_modules/@rollup/rollup-linux-x64-musl": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.18.0.tgz", - "integrity": "sha512-LKaqQL9osY/ir2geuLVvRRs+utWUNilzdE90TpyoX0eNqPzWjRm14oMEE+YLve4k/NAqCdPkGYDaDF5Sw+xBfg==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.18.1.tgz", + "integrity": "sha512-pDLkYITdYrH/9Cv/Vlj8HppDuLMDUBmgsM0+N+xLtFd18aXgM9Nyqupb/Uw+HeidhfYg2lD6CXvz6CjoVOaKjQ==", "cpu": [ "x64" ], @@ -581,9 +570,9 @@ ] }, "node_modules/@rollup/rollup-win32-arm64-msvc": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.18.0.tgz", - "integrity": "sha512-7J6TkZQFGo9qBKH0pk2cEVSRhJbL6MtfWxth7Y5YmZs57Pi+4x6c2dStAUvaQkHQLnEQv1jzBUW43GvZW8OFqA==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.18.1.tgz", + "integrity": "sha512-W2ZNI323O/8pJdBGil1oCauuCzmVd9lDmWBBqxYZcOqWD6aWqJtVBQ1dFrF4dYpZPks6F+xCZHfzG5hYlSHZ6g==", "cpu": [ "arm64" ], @@ -595,9 +584,9 @@ ] }, "node_modules/@rollup/rollup-win32-ia32-msvc": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.18.0.tgz", - "integrity": "sha512-Txjh+IxBPbkUB9+SXZMpv+b/vnTEtFyfWZgJ6iyCmt2tdx0OF5WhFowLmnh8ENGNpfUlUZkdI//4IEmhwPieNg==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.18.1.tgz", + 
"integrity": "sha512-ELfEX1/+eGZYMaCIbK4jqLxO1gyTSOIlZr6pbC4SRYFaSIDVKOnZNMdoZ+ON0mrFDp4+H5MhwNC1H/AhE3zQLg==", "cpu": [ "ia32" ], @@ -609,9 +598,9 @@ ] }, "node_modules/@rollup/rollup-win32-x64-msvc": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.18.0.tgz", - "integrity": "sha512-UOo5FdvOL0+eIVTgS4tIdbW+TtnBLWg1YBCcU2KWM7nuNwRz9bksDX1bekJJCpu25N1DVWaCwnT39dVQxzqS8g==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.18.1.tgz", + "integrity": "sha512-yjk2MAkQmoaPYCSu35RLJ62+dz358nE83VfTePJRp8CG7aMg25mEJYpXFiD+NcevhX8LxD5OP5tktPXnXN7GDw==", "cpu": [ "x64" ], @@ -663,9 +652,9 @@ "license": "MIT" }, "node_modules/acorn": { - "version": "8.12.0", - "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.12.0.tgz", - "integrity": "sha512-RTvkC4w+KNXrM39/lWCUaG0IbRkWdCv7W/IOW9oU6SawyxulvkQy5HQPVTKxEjczcUvapcrw3cFx/60VN/NRNw==", + "version": "8.12.1", + "resolved": "https://registry.npmjs.org/acorn/-/acorn-8.12.1.tgz", + "integrity": "sha512-tcpGyI9zbizT9JbV6oYE477V6mTlXvvi0T0G3SNIYE2apm/G5huBa1+K89VGeovbg+jycCrfhl3ADxErOuO6Jg==", "license": "MIT", "bin": { "acorn": "bin/acorn" @@ -701,6 +690,16 @@ "url": "https://github.com/sponsors/epoberezkin" } }, + "node_modules/ansi-colors": { + "version": "4.1.3", + "resolved": "https://registry.npmjs.org/ansi-colors/-/ansi-colors-4.1.3.tgz", + "integrity": "sha512-/6w/C21Pm1A7aZitlI5Ni/2J6FFQN8i1Cvz3kHABAAbw93v/NlvKdVOqz7CCWz/3iv/JplRSEEZ83XION15ovw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, "node_modules/ansi-regex": { "version": "2.1.1", "dev": true, @@ -722,19 +721,32 @@ "url": "https://github.com/chalk/ansi-styles?sponsor=1" } }, - "node_modules/apache-arrow": { - "version": "15.0.2", - "resolved": "https://registry.npmjs.org/apache-arrow/-/apache-arrow-15.0.2.tgz", - "integrity": 
"sha512-RvwlFxLRpO405PLGffx4N2PYLiF7FD86Q1hHl6J2XCWiq+tTCzpb9ngFw0apFDcXZBMpCzMuwAvA7hjyL1/73A==", - "license": "Apache-2.0", + "node_modules/anymatch": { + "version": "3.1.3", + "resolved": "https://registry.npmjs.org/anymatch/-/anymatch-3.1.3.tgz", + "integrity": "sha512-KMReFUr0B4t+D+OBkjR3KYqvocp2XaSzO55UcB6mgQMd3KbcE+mWTyvVV7D/zsdEbNnV6acZUutkiHQXvTr1Rw==", + "dev": true, + "license": "ISC", "dependencies": { - "@swc/helpers": "^0.5.2", - "@types/command-line-args": "^5.2.1", - "@types/command-line-usage": "^5.0.2", - "@types/node": "^20.6.0", + "normalize-path": "^3.0.0", + "picomatch": "^2.0.4" + }, + "engines": { + "node": ">= 8" + } + }, + "node_modules/apache-arrow": { + "version": "17.0.0", + "resolved": "https://registry.npmjs.org/apache-arrow/-/apache-arrow-17.0.0.tgz", + "integrity": "sha512-X0p7auzdnGuhYMVKYINdQssS4EcKec9TCXyez/qtJt32DrIMGbzqiaMiQ0X6fQlQpw8Fl0Qygcv4dfRAr5Gu9Q==", + "dependencies": { + "@swc/helpers": "^0.5.11", + "@types/command-line-args": "^5.2.3", + "@types/command-line-usage": "^5.0.4", + "@types/node": "^20.13.0", "command-line-args": "^5.2.1", "command-line-usage": "^7.0.1", - "flatbuffers": "^23.5.26", + "flatbuffers": "^24.3.25", "json-bignum": "^0.0.3", "tslib": "^2.6.2" }, @@ -837,6 +849,19 @@ "dev": true, "license": "MIT" }, + "node_modules/binary-extensions": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/binary-extensions/-/binary-extensions-2.3.0.tgz", + "integrity": "sha512-Ceh+7ox5qe7LJuLHoY0feh3pHuUDHAcRUeyL2VYghZwfpkNIy/+8Ocg0a3UuSoYzavmylwuLWQOf3hl0jjMMIw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/brace-expansion": { "version": "1.1.11", "dev": true, @@ -846,6 +871,26 @@ "concat-map": "0.0.1" } }, + "node_modules/braces": { + "version": "3.0.3", + "resolved": "https://registry.npmjs.org/braces/-/braces-3.0.3.tgz", + "integrity": 
"sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA==", + "dev": true, + "license": "MIT", + "dependencies": { + "fill-range": "^7.1.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/browser-stdout": { + "version": "1.3.1", + "resolved": "https://registry.npmjs.org/browser-stdout/-/browser-stdout-1.3.1.tgz", + "integrity": "sha512-qhAVI1+Av2X7qelOfAIYwXONood6XlZE/fXaBSmW/T5SzLAmCgzi+eiWE7fUvbHaeNBQH13UftjpXxsfLkMpgw==", + "dev": true, + "license": "ISC" + }, "node_modules/buffer-from": { "version": "1.1.2", "dev": true, @@ -867,6 +912,7 @@ "resolved": "https://registry.npmjs.org/call-bind/-/call-bind-1.0.7.tgz", "integrity": "sha512-GHTSNSYICQ7scH7sZ+M2rFopRoLh8t2bLSW6BbgrtLsahOIB5iyAVJf9GjWK3cYTDaMj4XdBpM1cA6pIS0Kv2w==", "dev": true, + "license": "MIT", "dependencies": { "es-define-property": "^1.0.0", "es-errors": "^1.3.0", @@ -891,6 +937,19 @@ "node": ">=6" } }, + "node_modules/camelcase": { + "version": "6.3.0", + "resolved": "https://registry.npmjs.org/camelcase/-/camelcase-6.3.0.tgz", + "integrity": "sha512-Gmy6FhYlCY7uOElZUSbxo2UCDH8owEk996gkbrpsgGtrJLM3J7jGxl9Ic7Qwwj4ivOE5AWZWRMecDdF7hqGjFA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/chalk": { "version": "1.1.3", "dev": true, @@ -962,6 +1021,89 @@ "node": ">=0.10.0" } }, + "node_modules/chokidar": { + "version": "3.6.0", + "resolved": "https://registry.npmjs.org/chokidar/-/chokidar-3.6.0.tgz", + "integrity": "sha512-7VT13fmjotKpGipCW9JEQAusEPE+Ei8nl6/g4FBAmIm0GOOLMua9NDDo/DWp0ZAxCr3cPq5ZpBqmPAQgDda2Pw==", + "dev": true, + "license": "MIT", + "dependencies": { + "anymatch": "~3.1.2", + "braces": "~3.0.2", + "glob-parent": "~5.1.2", + "is-binary-path": "~2.1.0", + "is-glob": "~4.0.1", + "normalize-path": "~3.0.0", + "readdirp": "~3.6.0" + }, + "engines": { + "node": ">= 8.10.0" + }, + "funding": { + "url": 
"https://paulmillr.com/funding/" + }, + "optionalDependencies": { + "fsevents": "~2.3.2" + } + }, + "node_modules/chokidar/node_modules/glob-parent": { + "version": "5.1.2", + "resolved": "https://registry.npmjs.org/glob-parent/-/glob-parent-5.1.2.tgz", + "integrity": "sha512-AOIgSQCepiJYwP3ARnGx+5VnTu2HBYdzbGP45eLw1vr3zB3vZLeyed1sC9hnbcOc9/SrMyM5RPQrkGz4aS9Zow==", + "dev": true, + "license": "ISC", + "dependencies": { + "is-glob": "^4.0.1" + }, + "engines": { + "node": ">= 6" + } + }, + "node_modules/cliui": { + "version": "7.0.4", + "resolved": "https://registry.npmjs.org/cliui/-/cliui-7.0.4.tgz", + "integrity": "sha512-OcRE68cOsVMXp1Yvonl/fzkQOyjLSu/8bhPDfQt0e0/Eb283TKP20Fs2MqoPsr9SwA595rRCA+QMzYc9nBP+JQ==", + "dev": true, + "license": "ISC", + "dependencies": { + "string-width": "^4.2.0", + "strip-ansi": "^6.0.0", + "wrap-ansi": "^7.0.0" + } + }, + "node_modules/cliui/node_modules/string-width": { + "version": "4.2.3", + "resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz", + "integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==", + "dev": true, + "license": "MIT", + "dependencies": { + "emoji-regex": "^8.0.0", + "is-fullwidth-code-point": "^3.0.0", + "strip-ansi": "^6.0.1" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/cliui/node_modules/wrap-ansi": { + "version": "7.0.0", + "resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-7.0.0.tgz", + "integrity": "sha512-YVGIj2kamLSTxw6NsZjoBxfSwsn0ycdesmc4p+Q21c5zPuZ1pl+NfxVdxPtdHvmNVOQ6XSYG4AUtyt/Fi7D16Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-styles": "^4.0.0", + "string-width": "^4.1.0", + "strip-ansi": "^6.0.0" + }, + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/chalk/wrap-ansi?sponsor=1" + } + }, "node_modules/color-convert": { "version": "2.0.1", "license": "MIT", @@ -1039,6 +1181,15 @@ "node": ">= 8" } }, + "node_modules/data-uri-to-buffer": { + 
"version": "4.0.1", + "resolved": "https://registry.npmjs.org/data-uri-to-buffer/-/data-uri-to-buffer-4.0.1.tgz", + "integrity": "sha512-0R9ikRb668HB7QDxT1vkpuUBtqc53YyAwMwGeUFKRojY/NWKvdZ+9UYtRfGmhqNbRkTSVpMbmyhXipFFv2cb/A==", + "license": "MIT", + "engines": { + "node": ">= 12" + } + }, "node_modules/data-view-buffer": { "version": "1.0.1", "resolved": "https://registry.npmjs.org/data-view-buffer/-/data-view-buffer-1.0.1.tgz", @@ -1111,8 +1262,23 @@ } } }, + "node_modules/decamelize": { + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/decamelize/-/decamelize-4.0.0.tgz", + "integrity": "sha512-9iE1PgSik9HeIIw2JO94IidnE3eBoQrFJ3w7sFuzSX4DpmZ3v5sZpUiV5Swcf6mQEF+Y0ru8Neo+p+nyh2J+hQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/deep-equal": { "version": "2.2.3", + "resolved": "https://registry.npmjs.org/deep-equal/-/deep-equal-2.2.3.tgz", + "integrity": "sha512-ZIwpnevOurS8bpT4192sqAowWM76JDKSHYzMLty3BZGSswgq6pBaH3DhCSW5xVAZICZyKdOBPjwww5wfgT/6PA==", "dev": true, "license": "MIT", "dependencies": { @@ -1160,6 +1326,7 @@ "resolved": "https://registry.npmjs.org/define-data-property/-/define-data-property-1.1.4.tgz", "integrity": "sha512-rBMvIzlpA8v6E+SJZoo++HAYqsLrkg7MSfIinMPFhmkorw7X+dOXVJQs+QT69zGkzMyfDnIMN2Wid1+NbL3T+A==", "dev": true, + "license": "MIT", "dependencies": { "es-define-property": "^1.0.0", "es-errors": "^1.3.0", @@ -1174,6 +1341,8 @@ }, "node_modules/define-properties": { "version": "1.2.1", + "resolved": "https://registry.npmjs.org/define-properties/-/define-properties-1.2.1.tgz", + "integrity": "sha512-8QmQKqEASLd5nx0U1B1okLElbUuuttJ/AnYmRXbbbGDWh6uS208EjD4Xqq/I9wK7u0v6O08XhTWnt5XtEbR6Dg==", "dev": true, "license": "MIT", "dependencies": { @@ -1190,14 +1359,28 @@ }, "node_modules/defined": { "version": "1.0.1", + "resolved": "https://registry.npmjs.org/defined/-/defined-1.0.1.tgz", + "integrity": 
"sha512-hsBd2qSVCRE+5PmNdHt1uzyrFu5d3RwmFDKzyNZMFq/EwDNJF7Ee5+D5oEKF0hU6LhtoUF1macFvOe4AskQC1Q==", "dev": true, "license": "MIT", "funding": { "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/diff": { + "version": "5.2.0", + "resolved": "https://registry.npmjs.org/diff/-/diff-5.2.0.tgz", + "integrity": "sha512-uIFDxqpRZGZ6ThOk84hEfqWoHx2devRFvpTZcTHur85vImfaxUbTW9Ryh4CpCuDnToOP1CEtXKIgytHBPVff5A==", + "dev": true, + "license": "BSD-3-Clause", + "engines": { + "node": ">=0.3.1" + } + }, "node_modules/dotignore": { "version": "0.1.2", + "resolved": "https://registry.npmjs.org/dotignore/-/dotignore-0.1.2.tgz", + "integrity": "sha512-UGGGWfSauusaVJC+8fgV+NVvBXkCTmVv7sk6nojDZZvuOUNGUy0Zk4UpHQD6EDjS0jpBwcACvH4eofvyzBcRDw==", "dev": true, "license": "MIT", "dependencies": { @@ -1214,6 +1397,8 @@ }, "node_modules/eastasianwidth": { "version": "0.2.0", + "resolved": "https://registry.npmjs.org/eastasianwidth/-/eastasianwidth-0.2.0.tgz", + "integrity": "sha512-I88TYZWc9XiYHRQ4/3c5rjjfgkjhLyW2luGIheGERbNQ6OY7yTybanSpDXZa8y7VUP9YmDcYa+eyq4ca7iLqWA==", "dev": true, "license": "MIT" }, @@ -1288,6 +1473,7 @@ "resolved": "https://registry.npmjs.org/es-define-property/-/es-define-property-1.0.0.tgz", "integrity": "sha512-jxayLKShrEqqzJ0eumQbVhTYQM27CfT1T35+gCgDFoL82JLsXqTJ76zv6A0YLOgEnLUMvLzsDsGIrl8NFpT2gQ==", "dev": true, + "license": "MIT", "dependencies": { "get-intrinsic": "^1.2.4" }, @@ -1300,12 +1486,15 @@ "resolved": "https://registry.npmjs.org/es-errors/-/es-errors-1.3.0.tgz", "integrity": "sha512-Zf5H2Kxt2xjTvbJvP2ZWLEICxA6j+hAmMzIlypy4xcBg1vKVnx89Wy0GbS+kf5cwCVFFzdCFh2XSCFNULS6csw==", "dev": true, + "license": "MIT", "engines": { "node": ">= 0.4" } }, "node_modules/es-get-iterator": { "version": "1.1.3", + "resolved": "https://registry.npmjs.org/es-get-iterator/-/es-get-iterator-1.1.3.tgz", + "integrity": "sha512-sPZmqHBe6JIiTfN5q2pEi//TwxmAFHwj/XEuYjTuse78i8KxaqMTTzxPoFKuzRpDpTJ+0NAbpfenkmH2rePtuw==", "dev": true, "license": "MIT", "dependencies": { @@ 
-1369,6 +1558,16 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/escalade": { + "version": "3.1.2", + "resolved": "https://registry.npmjs.org/escalade/-/escalade-3.1.2.tgz", + "integrity": "sha512-ErCHMCae19vR8vQGe50xIsVomy19rg6gFu3+r3jkEO46suLMWBksvVyoGgQV+jOfl84ZSOSlmv6Gxa89PmTGmA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, "node_modules/escape-string-regexp": { "version": "1.0.5", "dev": true, @@ -1378,17 +1577,16 @@ } }, "node_modules/eslint": { - "version": "9.5.0", - "resolved": "https://registry.npmjs.org/eslint/-/eslint-9.5.0.tgz", - "integrity": "sha512-+NAOZFrW/jFTS3dASCGBxX1pkFD0/fsO+hfAkJ4TyYKwgsXZbqzrw+seCYFCcPCYXvnD67tAnglU7GQTz6kcVw==", + "version": "9.7.0", + "resolved": "https://registry.npmjs.org/eslint/-/eslint-9.7.0.tgz", + "integrity": "sha512-FzJ9D/0nGiCGBf8UXO/IGLTgLVzIxze1zpfA8Ton2mjLovXdAPlYDv+MQDcqj3TmrhAGYfOpz9RfR+ent0AgAw==", "dev": true, - "license": "MIT", "dependencies": { "@eslint-community/eslint-utils": "^4.2.0", - "@eslint-community/regexpp": "^4.6.1", - "@eslint/config-array": "^0.16.0", + "@eslint-community/regexpp": "^4.11.0", + "@eslint/config-array": "^0.17.0", "@eslint/eslintrc": "^3.1.0", - "@eslint/js": "9.5.0", + "@eslint/js": "9.7.0", "@humanwhocodes/module-importer": "^1.0.1", "@humanwhocodes/retry": "^0.3.0", "@nodelib/fs.walk": "^1.2.8", @@ -1397,9 +1595,9 @@ "cross-spawn": "^7.0.2", "debug": "^4.3.2", "escape-string-regexp": "^4.0.0", - "eslint-scope": "^8.0.1", + "eslint-scope": "^8.0.2", "eslint-visitor-keys": "^4.0.0", - "espree": "^10.0.1", + "espree": "^10.1.0", "esquery": "^1.5.0", "esutils": "^2.0.2", "fast-deep-equal": "^3.1.3", @@ -1430,11 +1628,10 @@ } }, "node_modules/eslint-scope": { - "version": "8.0.1", - "resolved": "https://registry.npmjs.org/eslint-scope/-/eslint-scope-8.0.1.tgz", - "integrity": "sha512-pL8XjgP4ZOmmwfFE8mEhSxA7ZY4C+LWyqjQ3o4yWkkmD0qcMT9kkW3zWHOczhWcjTSgqycYAgwSlXvZltv65og==", + "version": "8.0.2", + "resolved": 
"https://registry.npmjs.org/eslint-scope/-/eslint-scope-8.0.2.tgz", + "integrity": "sha512-6E4xmrTw5wtxnLA5wYL3WDfhZ/1bUBGOXV0zQvVRDOtrR8D0p6W7fs3JweNYhwRYeGvd/1CKX2se0/2s7Q/nJA==", "dev": true, - "license": "BSD-2-Clause", "dependencies": { "esrecurse": "^4.3.0", "estraverse": "^5.2.0" @@ -1507,14 +1704,6 @@ "node": ">=8" } }, - "node_modules/esm": { - "version": "3.2.25", - "dev": true, - "license": "MIT", - "engines": { - "node": ">=6" - } - }, "node_modules/espree": { "version": "10.1.0", "resolved": "https://registry.npmjs.org/espree/-/espree-10.1.0.tgz", @@ -1562,7 +1751,6 @@ "resolved": "https://registry.npmjs.org/esrecurse/-/esrecurse-4.3.0.tgz", "integrity": "sha512-KmfKL3b6G+RXvP8N1vr3Tq1kL/oCFgn2NYXEtqP8/L3pKapUA4G8cFVaoF3SU323CD4XypR/ffioHmkti6/Tag==", "dev": true, - "license": "BSD-2-Clause", "dependencies": { "estraverse": "^5.2.0" }, @@ -1621,6 +1809,29 @@ "reusify": "^1.0.4" } }, + "node_modules/fetch-blob": { + "version": "3.2.0", + "resolved": "https://registry.npmjs.org/fetch-blob/-/fetch-blob-3.2.0.tgz", + "integrity": "sha512-7yAQpD2UMJzLi1Dqv7qFYnPbaPx7ZfFK6PiIxQ4PfkGPyNyl2Ugx+a/umUonmKqjhM4DnfbMvdX6otXq83soQQ==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/jimmywarting" + }, + { + "type": "paypal", + "url": "https://paypal.me/jimmywarting" + } + ], + "license": "MIT", + "dependencies": { + "node-domexception": "^1.0.0", + "web-streams-polyfill": "^3.0.3" + }, + "engines": { + "node": "^12.20 || >= 14.13" + } + }, "node_modules/figures": { "version": "1.7.0", "dev": true, @@ -1646,6 +1857,19 @@ "node": ">=16.0.0" } }, + "node_modules/fill-range": { + "version": "7.1.1", + "resolved": "https://registry.npmjs.org/fill-range/-/fill-range-7.1.1.tgz", + "integrity": "sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg==", + "dev": true, + "license": "MIT", + "dependencies": { + "to-regex-range": "^5.0.1" + }, + "engines": { + "node": ">=8" + } + }, 
"node_modules/find-replace": { "version": "3.0.0", "license": "MIT", @@ -1671,6 +1895,16 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/flat": { + "version": "5.0.2", + "resolved": "https://registry.npmjs.org/flat/-/flat-5.0.2.tgz", + "integrity": "sha512-b6suED+5/3rTpUBdG1gupIl8MPFCAMA0QXwmljLhvCUKcUvdE4gWky9zpuGCcXHOsz4J9wPGNWq6OKpmIzz3hQ==", + "dev": true, + "license": "BSD-3-Clause", + "bin": { + "flat": "cli.js" + } + }, "node_modules/flat-cache": { "version": "4.0.1", "resolved": "https://registry.npmjs.org/flat-cache/-/flat-cache-4.0.1.tgz", @@ -1686,10 +1920,10 @@ } }, "node_modules/flatbuffers": { - "version": "23.5.26", - "resolved": "https://registry.npmjs.org/flatbuffers/-/flatbuffers-23.5.26.tgz", - "integrity": "sha512-vE+SI9vrJDwi1oETtTIFldC/o9GsVKRM+s6EL0nQgxXlYV1Vc4Tk30hj4xGICftInKQKj1F3up2n8UbIVobISQ==", - "license": "SEE LICENSE IN LICENSE" + "version": "24.3.25", + "resolved": "https://registry.npmjs.org/flatbuffers/-/flatbuffers-24.3.25.tgz", + "integrity": "sha512-3HDgPbgiwWMI9zVB7VYBHaMrbOO7Gm0v+yD2FV/sCKj+9NDeVL7BOBYUuhWAQGKWOzBo8S9WdMvV0eixO233XQ==", + "license": "Apache-2.0" }, "node_modules/flatted": { "version": "3.3.1", @@ -1700,6 +1934,8 @@ }, "node_modules/for-each": { "version": "0.3.3", + "resolved": "https://registry.npmjs.org/for-each/-/for-each-0.3.3.tgz", + "integrity": "sha512-jqYfLp7mo9vIyQf8ykW2v7A+2N4QjeCeI5+Dz9XraiO1ign81wjiH7Fb9vSOWvQfNtmSa4H2RoQTrrXivdUZmw==", "dev": true, "license": "MIT", "dependencies": { @@ -1707,7 +1943,9 @@ } }, "node_modules/foreground-child": { - "version": "3.1.1", + "version": "3.2.1", + "resolved": "https://registry.npmjs.org/foreground-child/-/foreground-child-3.2.1.tgz", + "integrity": "sha512-PXUUyLqrR2XCWICfv6ukppP96sdFwWbNEnfEMt7jNsISjMsvaLNinAHNDYyvkyU+SZG2BTSbT5NjG+vZslfGTA==", "dev": true, "license": "ISC", "dependencies": { @@ -1721,6 +1959,18 @@ "url": "https://github.com/sponsors/isaacs" } }, + "node_modules/formdata-polyfill": { + "version": "4.0.10", + 
"resolved": "https://registry.npmjs.org/formdata-polyfill/-/formdata-polyfill-4.0.10.tgz", + "integrity": "sha512-buewHzMvYL29jdeQTVILecSaZKnt/RJWjoZCF5OW60Z67/GmSLBkOFM7qh1PI3zFNtJbaZL5eQu1vLfazOwj4g==", + "license": "MIT", + "dependencies": { + "fetch-blob": "^3.1.2" + }, + "engines": { + "node": ">=12.20.0" + } + }, "node_modules/fs.realpath": { "version": "1.0.0", "dev": true, @@ -1767,17 +2017,30 @@ }, "node_modules/functions-have-names": { "version": "1.2.3", + "resolved": "https://registry.npmjs.org/functions-have-names/-/functions-have-names-1.2.3.tgz", + "integrity": "sha512-xckBUXyTIqT97tq2x2AMb+g163b5JFysYk0x4qxNFwbfQkmNZoiRHb6sPzI9/QV33WeuvVYBUIiD4NzNIyqaRQ==", "dev": true, "license": "MIT", "funding": { "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/get-caller-file": { + "version": "2.0.5", + "resolved": "https://registry.npmjs.org/get-caller-file/-/get-caller-file-2.0.5.tgz", + "integrity": "sha512-DyFP3BM/3YHTQOCUL/w0OZHR0lpKeGrxotcHWcqNEdnltqFwXVfhEBQ94eIo34AfQpo0rGki4cyIiftY06h2Fg==", + "dev": true, + "license": "ISC", + "engines": { + "node": "6.* || 8.* || >= 10.*" + } + }, "node_modules/get-intrinsic": { "version": "1.2.4", "resolved": "https://registry.npmjs.org/get-intrinsic/-/get-intrinsic-1.2.4.tgz", "integrity": "sha512-5uYhsJH8VJBTv7oslg4BznJYhDoRI6waYCxMmCdnTrcCrHA/fCFKoTFz2JKKE0HdDFUF7/oQuhzumXJK7paBRQ==", "dev": true, + "license": "MIT", "dependencies": { "es-errors": "^1.3.0", "function-bind": "^1.1.2", @@ -1794,6 +2057,8 @@ }, "node_modules/get-package-type": { "version": "0.1.0", + "resolved": "https://registry.npmjs.org/get-package-type/-/get-package-type-0.1.0.tgz", + "integrity": "sha512-pjzuKtY64GYfWizNAJ0fr9VqttZkNiK2iS430LtIHzjBEr6bX8Am2zm4sW4Ro5wjWW5cAlRL1qAMTcXbjNAO2Q==", "dev": true, "license": "MIT", "engines": { @@ -1820,6 +2085,9 @@ }, "node_modules/glob": { "version": "7.2.3", + "resolved": "https://registry.npmjs.org/glob/-/glob-7.2.3.tgz", + "integrity": 
"sha512-nFR0zLpU2YCaRxwoCJvL6UvCH2JFyFVIvwTLsIf21AuHlMskA1hhTdk+LlYJtOlYt9v6dvszD2BGRqBL+iQK9Q==", + "deprecated": "Glob versions prior to v9 are no longer supported", "dev": true, "license": "ISC", "dependencies": { @@ -1880,6 +2148,8 @@ }, "node_modules/gopd": { "version": "1.0.1", + "resolved": "https://registry.npmjs.org/gopd/-/gopd-1.0.1.tgz", + "integrity": "sha512-d65bNlIadxvpb/A2abVdlqKqV563juRnZ1Wtk6s1sIR8uNsXR70xqIzVqxVf1eTqDunwT2MkczEeaezCKTZhwA==", "dev": true, "license": "MIT", "dependencies": { @@ -1913,6 +2183,8 @@ }, "node_modules/has-bigints": { "version": "1.0.2", + "resolved": "https://registry.npmjs.org/has-bigints/-/has-bigints-1.0.2.tgz", + "integrity": "sha512-tSvCKtBr9lkF0Ex0aQiP9N+OpV4zi2r/Nee5VkRDbaqv35RLYMzbwQfFSZZH0kR+Rd6302UJZ2p/bJCEoR3VoQ==", "dev": true, "license": "MIT", "funding": { @@ -1924,6 +2196,7 @@ "resolved": "https://registry.npmjs.org/has-dynamic-import/-/has-dynamic-import-2.1.0.tgz", "integrity": "sha512-su0anMkNEnJKZ/rB99jn3y6lV/J8Ro96hBJ28YAeVzj5rWxH+YL/AdCyiYYA1HDLV9YhmvqpWSJJj2KLo1MX6g==", "dev": true, + "license": "MIT", "dependencies": { "call-bind": "^1.0.5", "get-intrinsic": "^1.2.2" @@ -1947,6 +2220,7 @@ "resolved": "https://registry.npmjs.org/has-property-descriptors/-/has-property-descriptors-1.0.2.tgz", "integrity": "sha512-55JNKuIW+vq4Ke1BjOTjM2YctQIvCT7GFzHwmfZPGo5wnrgkid0YQtnAleFSqumZm4az3n2BS+erby5ipJdgrg==", "dev": true, + "license": "MIT", "dependencies": { "es-define-property": "^1.0.0" }, @@ -1969,6 +2243,8 @@ }, "node_modules/has-symbols": { "version": "1.0.3", + "resolved": "https://registry.npmjs.org/has-symbols/-/has-symbols-1.0.3.tgz", + "integrity": "sha512-l3LCuF6MgDNwTDKkdYGEihYjt5pRPbEg46rtlmnSPlUbgmB8LOIrKJbYYFBSbnPaJexMKtiPO8hmeRjRz2Td+A==", "dev": true, "license": "MIT", "engines": { @@ -2006,6 +2282,16 @@ "node": ">= 0.4" } }, + "node_modules/he": { + "version": "1.2.0", + "resolved": "https://registry.npmjs.org/he/-/he-1.2.0.tgz", + "integrity": 
"sha512-F/1DnUGPopORZi0ni+CvrCgHQ5FyEAHRLSApuYWMmrbSwoN2Mn/7k+Gl38gJnR7yyDZk6WLXwiGod1JOWNDKGw==", + "dev": true, + "license": "MIT", + "bin": { + "he": "bin/he" + } + }, "node_modules/ignore": { "version": "5.3.1", "resolved": "https://registry.npmjs.org/ignore/-/ignore-5.3.1.tgz", @@ -2072,6 +2358,8 @@ }, "node_modules/is-arguments": { "version": "1.1.1", + "resolved": "https://registry.npmjs.org/is-arguments/-/is-arguments-1.1.1.tgz", + "integrity": "sha512-8Q7EARjzEnKpt/PCD7e1cgUS0a6X8u5tdSiMqXhojOdoV9TsMsiO+9VLC5vAmO8N7/GmXn7yjR8qnA6bVAEzfA==", "dev": true, "license": "MIT", "dependencies": { @@ -2104,6 +2392,8 @@ }, "node_modules/is-bigint": { "version": "1.0.4", + "resolved": "https://registry.npmjs.org/is-bigint/-/is-bigint-1.0.4.tgz", + "integrity": "sha512-zB9CruMamjym81i2JZ3UMn54PKGsQzsJeo6xvN3HJJ4CAsQNB6iRutp2To77OfCNuoxspsIhzaPoO1zyCEhFOg==", "dev": true, "license": "MIT", "dependencies": { @@ -2113,8 +2403,23 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/is-binary-path": { + "version": "2.1.0", + "resolved": "https://registry.npmjs.org/is-binary-path/-/is-binary-path-2.1.0.tgz", + "integrity": "sha512-ZMERYes6pDydyuGidse7OsHxtbI7WVeUEozgR/g7rd0xUimYNlvZRE/K2MgZTjWy725IfelLeVcEM97mmtRGXw==", + "dev": true, + "license": "MIT", + "dependencies": { + "binary-extensions": "^2.0.0" + }, + "engines": { + "node": ">=8" + } + }, "node_modules/is-boolean-object": { "version": "1.1.2", + "resolved": "https://registry.npmjs.org/is-boolean-object/-/is-boolean-object-1.1.2.tgz", + "integrity": "sha512-gDYaKHJmnj4aWxyj6YHyXVpdQawtVLHU5cb+eztPGczf6cjuTdwve5ZIEfgXqH4e57An1D1AKf8CZ3kYrQRqYA==", "dev": true, "license": "MIT", "dependencies": { @@ -2144,6 +2449,8 @@ }, "node_modules/is-callable": { "version": "1.2.7", + "resolved": "https://registry.npmjs.org/is-callable/-/is-callable-1.2.7.tgz", + "integrity": "sha512-1BC0BVFhS/p0qtw6enp8e+8OD0UrK0oFLztSjNzhcKA3WDuJxxAPXzPuPtKkjEY9UUoEWlX/8fgKeu2S8i9JTA==", "dev": true, "license": "MIT", "engines": 
{ @@ -2182,6 +2489,8 @@ }, "node_modules/is-date-object": { "version": "1.0.5", + "resolved": "https://registry.npmjs.org/is-date-object/-/is-date-object-1.0.5.tgz", + "integrity": "sha512-9YQaSxsAiSwcvS33MBk3wTCVnWK+HhF8VZR2jRxehM16QcVOdHqPn4VPHmRK4lSr38n9JriurInLcP90xsYNfQ==", "dev": true, "license": "MIT", "dependencies": { @@ -2222,9 +2531,14 @@ } }, "node_modules/is-map": { - "version": "2.0.2", + "version": "2.0.3", + "resolved": "https://registry.npmjs.org/is-map/-/is-map-2.0.3.tgz", + "integrity": "sha512-1Qed0/Hr2m+YqxnM09CjA2d/i6YZNfF6R2oRAOj36eUdS6qIV/huPJNSEpKbupewFs+ZsJlxsjjPbc0/afW6Lw==", "dev": true, "license": "MIT", + "engines": { + "node": ">= 0.4" + }, "funding": { "url": "https://github.com/sponsors/ljharb" } @@ -2247,8 +2561,20 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/is-number": { + "version": "7.0.0", + "resolved": "https://registry.npmjs.org/is-number/-/is-number-7.0.0.tgz", + "integrity": "sha512-41Cifkg6e8TylSpdtTpeLVMqvSBEVzTttHvERD741+pnZ8ANv0004MRL43QKPDlK9cGvNp6NZWZUBlbGXYxxng==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.12.0" + } + }, "node_modules/is-number-object": { "version": "1.0.7", + "resolved": "https://registry.npmjs.org/is-number-object/-/is-number-object-1.0.7.tgz", + "integrity": "sha512-k1U0IRzLMo7ZlYIfzRu23Oh6MiIFasgpb9X76eqfFZAqwH44UI4KTBvBYIZ1dSL9ZzChTB9ShHfLkR4pdW5krQ==", "dev": true, "license": "MIT", "dependencies": { @@ -2269,8 +2595,20 @@ "node": ">=8" } }, + "node_modules/is-plain-obj": { + "version": "2.1.0", + "resolved": "https://registry.npmjs.org/is-plain-obj/-/is-plain-obj-2.1.0.tgz", + "integrity": "sha512-YWnfyRwxL/+SsrWYfOpUtz5b3YD+nyfkHvjbcanzk8zgyO4ASD67uVMRt8k5bM4lLMDnXfriRhOpemw+NfT1eA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, "node_modules/is-regex": { "version": "1.1.4", + "resolved": "https://registry.npmjs.org/is-regex/-/is-regex-1.1.4.tgz", + "integrity": 
"sha512-kvRdxDsxZjhzUX07ZnLydzS1TU/TJlTUHHY4YLL87e37oUA49DfkLqgy+VjFocowy29cKvcSiu+kIv728jTTVg==", "dev": true, "license": "MIT", "dependencies": { @@ -2285,9 +2623,14 @@ } }, "node_modules/is-set": { - "version": "2.0.2", + "version": "2.0.3", + "resolved": "https://registry.npmjs.org/is-set/-/is-set-2.0.3.tgz", + "integrity": "sha512-iPAjerrse27/ygGLxw+EBR9agv9Y6uLeYVJMu+QNCoouJ1/1ri0mGrcWpfCqFZuzzx3WjtwxG098X+n4OuRkPg==", "dev": true, "license": "MIT", + "engines": { + "node": ">= 0.4" + }, "funding": { "url": "https://github.com/sponsors/ljharb" } @@ -2310,6 +2653,8 @@ }, "node_modules/is-string": { "version": "1.0.7", + "resolved": "https://registry.npmjs.org/is-string/-/is-string-1.0.7.tgz", + "integrity": "sha512-tE2UXzivje6ofPW7l23cjDOMa09gb7xlAqG6jG5ej6uPV32TlWP3NKPigtaGeHNu9fohccRYvIiZMfOOnOYUtg==", "dev": true, "license": "MIT", "dependencies": { @@ -2324,6 +2669,8 @@ }, "node_modules/is-symbol": { "version": "1.0.4", + "resolved": "https://registry.npmjs.org/is-symbol/-/is-symbol-1.0.4.tgz", + "integrity": "sha512-C/CPBqKWnvdcxqIARxyOh4v1UUEOCHpgDa0WYgpKDFMszcrPcffg5uhwSgPCLD2WWxmq6isisz87tzT01tuGhg==", "dev": true, "license": "MIT", "dependencies": { @@ -2352,15 +2699,33 @@ "url": "https://github.com/sponsors/ljharb" } }, - "node_modules/is-weakmap": { - "version": "2.0.1", + "node_modules/is-unicode-supported": { + "version": "0.1.0", + "resolved": "https://registry.npmjs.org/is-unicode-supported/-/is-unicode-supported-0.1.0.tgz", + "integrity": "sha512-knxG2q4UC3u8stRGyAVJCOdxFmv5DZiRcdlIaAQXAbSfJya+OhopNotLQrstBhququ4ZpuKbDc/8S6mgXgPFPw==", "dev": true, "license": "MIT", + "engines": { + "node": ">=10" + }, "funding": { - "url": "https://github.com/sponsors/ljharb" + "url": "https://github.com/sponsors/sindresorhus" } }, - "node_modules/is-weakref": { + "node_modules/is-weakmap": { + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/is-weakmap/-/is-weakmap-2.0.2.tgz", + "integrity": 
"sha512-K5pXYOm9wqY1RgjpL3YTkF39tni1XajUIkawTLUo9EZEVUFga5gSQJF8nNS7ZwJQ02y+1YCNYcMh+HIf1ZqE+w==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 0.4" + }, + "funding": { + "url": "https://github.com/sponsors/ljharb" + } + }, + "node_modules/is-weakref": { "version": "1.0.2", "resolved": "https://registry.npmjs.org/is-weakref/-/is-weakref-1.0.2.tgz", "integrity": "sha512-qctsuLZmIQ0+vSSMfoVvyFe2+GSEvnmZ2ezTup1SBse9+twCCeial6EEi3Nc2KFcf6+qz2FBPnjXsk8xhKSaPQ==", @@ -2374,12 +2739,17 @@ } }, "node_modules/is-weakset": { - "version": "2.0.2", + "version": "2.0.3", + "resolved": "https://registry.npmjs.org/is-weakset/-/is-weakset-2.0.3.tgz", + "integrity": "sha512-LvIm3/KWzS9oRFHugab7d+M/GcBXuXX5xZkzPmN+NxihdQlZUQ4dWuSV1xR/sq6upL1TJEDrfBgRepHFdBtSNQ==", "dev": true, "license": "MIT", "dependencies": { - "call-bind": "^1.0.2", - "get-intrinsic": "^1.1.1" + "call-bind": "^1.0.7", + "get-intrinsic": "^1.2.4" + }, + "engines": { + "node": ">= 0.4" }, "funding": { "url": "https://github.com/sponsors/ljharb" @@ -2387,6 +2757,8 @@ }, "node_modules/isarray": { "version": "2.0.5", + "resolved": "https://registry.npmjs.org/isarray/-/isarray-2.0.5.tgz", + "integrity": "sha512-xHjhDr3cNBK0BzdUJSPXZntQUx/mwMS5Rw4A7lPJ90XGAO6ISP/ePDNuo0vhqOZU+UD5JoodwCAAoZQd3FeAKw==", "dev": true, "license": "MIT" }, @@ -2396,14 +2768,16 @@ "license": "ISC" }, "node_modules/jackspeak": { - "version": "2.3.6", + "version": "4.0.1", + "resolved": "https://registry.npmjs.org/jackspeak/-/jackspeak-4.0.1.tgz", + "integrity": "sha512-cub8rahkh0Q/bw1+GxP7aeSe29hHHn2V4m29nnDlvCdlgU+3UGxkZp7Z53jLUdpX3jdTO0nJZUDl3xvbWc2Xog==", "dev": true, "license": "BlueOak-1.0.0", "dependencies": { "@isaacs/cliui": "^8.0.2" }, "engines": { - "node": ">=14" + "node": "20 || >=22" }, "funding": { "url": "https://github.com/sponsors/isaacs" @@ -2499,15 +2873,61 @@ "dev": true, "license": "MIT" }, - "node_modules/lru-cache": { - "version": "10.0.2", + "node_modules/log-symbols": { + "version": "4.1.0", + 
"resolved": "https://registry.npmjs.org/log-symbols/-/log-symbols-4.1.0.tgz", + "integrity": "sha512-8XPvpAA8uyhfteu8pIvQxpJZ7SYYdpUivZpGy6sFsBuKRY/7rQGavedeB8aK+Zkyq6upMFVL/9AW6vOYzfRyLg==", "dev": true, - "license": "ISC", + "license": "MIT", "dependencies": { - "semver": "^7.3.5" + "chalk": "^4.1.0", + "is-unicode-supported": "^0.1.0" }, "engines": { - "node": "14 || >=16.14" + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/log-symbols/node_modules/chalk": { + "version": "4.1.2", + "resolved": "https://registry.npmjs.org/chalk/-/chalk-4.1.2.tgz", + "integrity": "sha512-oKnbhFyRIXpUuez8iBMmyEa4nbj4IOQyuhc/wy9kY7/WVPcwIO9VA668Pu8RkO7+0G76SLROeyw9CpQ061i4mA==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-styles": "^4.1.0", + "supports-color": "^7.1.0" + }, + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/chalk/chalk?sponsor=1" + } + }, + "node_modules/log-symbols/node_modules/supports-color": { + "version": "7.2.0", + "resolved": "https://registry.npmjs.org/supports-color/-/supports-color-7.2.0.tgz", + "integrity": "sha512-qpCAvRl9stuOHveKsn7HncJRvv501qIacKzQlO/+Lwxc9+0q2wLyv4Dfvt80/DPn2pqOBsJdDiogXGR9+OvwRw==", + "dev": true, + "license": "MIT", + "dependencies": { + "has-flag": "^4.0.0" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/lru-cache": { + "version": "11.0.0", + "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-11.0.0.tgz", + "integrity": "sha512-Qv32eSV1RSCfhY3fpPE2GNZ8jgM9X7rdAfemLWqTUxwiyIC4jJ6Sy0fZ8H+oLWevO6i4/bizg7c8d8i6bxrzbA==", + "dev": true, + "license": "ISC", + "engines": { + "node": "20 || >=22" } }, "node_modules/maxmin": { @@ -2537,6 +2957,8 @@ }, "node_modules/minimist": { "version": "1.2.8", + "resolved": "https://registry.npmjs.org/minimist/-/minimist-1.2.8.tgz", + "integrity": "sha512-2yyAR8qBkN3YuheJanUpWC5U3bb5osDywNB8RzDVlDwDHbocAJveqqj1u8+SVD7jkWT4yvsHCpWqqWqAxb0zCA==", "dev": true, "license": 
"MIT", "funding": { @@ -2544,15 +2966,135 @@ } }, "node_modules/minipass": { - "version": "7.0.4", + "version": "7.1.2", + "resolved": "https://registry.npmjs.org/minipass/-/minipass-7.1.2.tgz", + "integrity": "sha512-qOOzS1cBTWYF4BH8fVePDBOO9iptMnGUEZwNc/cMWnTV2nVLZ7VoNWEPHkYczZA0pdoA7dl6e7FL659nX9S2aw==", "dev": true, "license": "ISC", "engines": { "node": ">=16 || 14 >=14.17" } }, + "node_modules/mocha": { + "version": "10.6.0", + "resolved": "https://registry.npmjs.org/mocha/-/mocha-10.6.0.tgz", + "integrity": "sha512-hxjt4+EEB0SA0ZDygSS015t65lJw/I2yRCS3Ae+SJ5FrbzrXgfYwJr96f0OvIXdj7h4lv/vLCrH3rkiuizFSvw==", + "dev": true, + "license": "MIT", + "dependencies": { + "ansi-colors": "^4.1.3", + "browser-stdout": "^1.3.1", + "chokidar": "^3.5.3", + "debug": "^4.3.5", + "diff": "^5.2.0", + "escape-string-regexp": "^4.0.0", + "find-up": "^5.0.0", + "glob": "^8.1.0", + "he": "^1.2.0", + "js-yaml": "^4.1.0", + "log-symbols": "^4.1.0", + "minimatch": "^5.1.6", + "ms": "^2.1.3", + "serialize-javascript": "^6.0.2", + "strip-json-comments": "^3.1.1", + "supports-color": "^8.1.1", + "workerpool": "^6.5.1", + "yargs": "^16.2.0", + "yargs-parser": "^20.2.9", + "yargs-unparser": "^2.0.0" + }, + "bin": { + "_mocha": "bin/_mocha", + "mocha": "bin/mocha.js" + }, + "engines": { + "node": ">= 14.0.0" + } + }, + "node_modules/mocha/node_modules/brace-expansion": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.1.tgz", + "integrity": "sha512-XnAIvQ8eM+kC6aULx6wuQiwVsnzsi9d3WxzV3FpWTGA19F621kwdbsAcFKXgKUHZWsy+mY6iL1sHTxWEFCytDA==", + "dev": true, + "license": "MIT", + "dependencies": { + "balanced-match": "^1.0.0" + } + }, + "node_modules/mocha/node_modules/escape-string-regexp": { + "version": "4.0.0", + "resolved": "https://registry.npmjs.org/escape-string-regexp/-/escape-string-regexp-4.0.0.tgz", + "integrity": "sha512-TtpcNJ3XAzx3Gq8sWRzJaVajRs0uVxA2YAkdb1jm2YkPz4G6egUFAyA3n5vtEIZefPk5Wa4UXbKuS5fKkJWdgA==", + "dev": true, + 
"license": "MIT", + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/mocha/node_modules/glob": { + "version": "8.1.0", + "resolved": "https://registry.npmjs.org/glob/-/glob-8.1.0.tgz", + "integrity": "sha512-r8hpEjiQEYlF2QU0df3dS+nxxSIreXQS1qRhMJM0Q5NDdR386C7jb7Hwwod8Fgiuex+k0GFjgft18yvxm5XoCQ==", + "deprecated": "Glob versions prior to v9 are no longer supported", + "dev": true, + "license": "ISC", + "dependencies": { + "fs.realpath": "^1.0.0", + "inflight": "^1.0.4", + "inherits": "2", + "minimatch": "^5.0.1", + "once": "^1.3.0" + }, + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/isaacs" + } + }, + "node_modules/mocha/node_modules/minimatch": { + "version": "5.1.6", + "resolved": "https://registry.npmjs.org/minimatch/-/minimatch-5.1.6.tgz", + "integrity": "sha512-lKwV/1brpG6mBUFHtb7NUmtABCb2WZZmm2wNiOA5hAb8VdCS4B3dtMWyvcoViccwAW/COERjXLt0zP1zXUN26g==", + "dev": true, + "license": "ISC", + "dependencies": { + "brace-expansion": "^2.0.1" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/mocha/node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "dev": true, + "license": "MIT" + }, + "node_modules/mocha/node_modules/supports-color": { + "version": "8.1.1", + "resolved": "https://registry.npmjs.org/supports-color/-/supports-color-8.1.1.tgz", + "integrity": "sha512-MpUEN2OodtUzxvKQl72cUF7RQ5EiHsGvSsVG0ia9c5RbWGL2CI4C7EpPS8UTBIplnlzZiNuV56w+FuNxy3ty2Q==", + "dev": true, + "license": "MIT", + "dependencies": { + "has-flag": "^4.0.0" + }, + "engines": { + "node": ">=10" + }, + "funding": { + "url": "https://github.com/chalk/supports-color?sponsor=1" + } + }, "node_modules/mock-property": { "version": "1.0.3", + "resolved": 
"https://registry.npmjs.org/mock-property/-/mock-property-1.0.3.tgz", + "integrity": "sha512-2emPTb1reeLLYwHxyVx993iYyCHEiRRO+y8NFXFPL5kl5q14sgTK76cXyEKkeKCHeRw35SfdkUJ10Q1KfHuiIQ==", "dev": true, "license": "MIT", "dependencies": { @@ -2582,24 +3124,51 @@ "dev": true, "license": "MIT" }, + "node_modules/node-domexception": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/node-domexception/-/node-domexception-1.0.0.tgz", + "integrity": "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ==", + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/jimmywarting" + }, + { + "type": "github", + "url": "https://paypal.me/jimmywarting" + } + ], + "license": "MIT", + "engines": { + "node": ">=10.5.0" + } + }, "node_modules/node-fetch": { - "version": "2.7.0", - "resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-2.7.0.tgz", - "integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==", + "version": "3.3.2", + "resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-3.3.2.tgz", + "integrity": "sha512-dRB78srN/l6gqWulah9SrxeYnxeddIG30+GOqK/9OlLVyLg3HPnr6SqOWTWOXKRwC2eGYCkZ59NNuSgvSrpgOA==", "license": "MIT", "dependencies": { - "whatwg-url": "^5.0.0" + "data-uri-to-buffer": "^4.0.0", + "fetch-blob": "^3.1.4", + "formdata-polyfill": "^4.0.10" }, "engines": { - "node": "4.x || >=6.0.0" + "node": "^12.20.0 || ^14.13.1 || >=16.0.0" }, - "peerDependencies": { - "encoding": "^0.1.0" - }, - "peerDependenciesMeta": { - "encoding": { - "optional": true - } + "funding": { + "type": "opencollective", + "url": "https://opencollective.com/node-fetch" + } + }, + "node_modules/normalize-path": { + "version": "3.0.0", + "resolved": "https://registry.npmjs.org/normalize-path/-/normalize-path-3.0.0.tgz", + "integrity": "sha512-6eZs5Ls3WtCisHWp9S2GUy8dqkpGi4BVSz3GaqiE6ezub0512ESztXUwUB6C6IKbQkY2Pnb/mD4WYojCRwcwLA==", + "dev": true, + "license": 
"MIT", + "engines": { + "node": ">=0.10.0" } }, "node_modules/number-is-nan": { @@ -2619,9 +3188,14 @@ } }, "node_modules/object-inspect": { - "version": "1.13.1", + "version": "1.13.2", + "resolved": "https://registry.npmjs.org/object-inspect/-/object-inspect-1.13.2.tgz", + "integrity": "sha512-IRZSRuzJiynemAXPYtPe5BoI/RESNYR7TYm50MC5Mqbd3Jmw5y790sErYw3V6SryFJD64b74qQQs9wn5Bg/k3g==", "dev": true, "license": "MIT", + "engines": { + "node": ">= 0.4" + }, "funding": { "url": "https://github.com/sponsors/ljharb" } @@ -2645,6 +3219,8 @@ }, "node_modules/object-keys": { "version": "1.1.1", + "resolved": "https://registry.npmjs.org/object-keys/-/object-keys-1.1.1.tgz", + "integrity": "sha512-NuAESUOUMrlIXOfHKzD6bpPu3tYt3xvjNdRIQ+FeT0lNb4K8WR70CaDxhuNguS2XG+GjkyMwOzsN5ZktImfhLA==", "dev": true, "license": "MIT", "engines": { @@ -2656,6 +3232,7 @@ "resolved": "https://registry.npmjs.org/object.assign/-/object.assign-4.1.5.tgz", "integrity": "sha512-byy+U7gp+FVwmyzKPYhW2h5l3crpmGsxl7X2s8y43IgxvG4g3QZ6CffDtsNQy1WsmZpQbO+ybo0AlW7TY6DcBQ==", "dev": true, + "license": "MIT", "dependencies": { "call-bind": "^1.0.5", "define-properties": "^1.2.1", @@ -2721,6 +3298,13 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/package-json-from-dist": { + "version": "1.0.0", + "resolved": "https://registry.npmjs.org/package-json-from-dist/-/package-json-from-dist-1.0.0.tgz", + "integrity": "sha512-dATvCeZN/8wQsGywez1mzHtTlP22H8OEfPrVMLNr4/eGa+ijtLn/6M5f0dY8UKNrC2O9UCU6SSoG3qRKnt7STw==", + "dev": true, + "license": "BlueOak-1.0.0" + }, "node_modules/parent-module": { "version": "1.0.1", "resolved": "https://registry.npmjs.org/parent-module/-/parent-module-1.0.1.tgz", @@ -2744,6 +3328,8 @@ }, "node_modules/path-is-absolute": { "version": "1.0.1", + "resolved": "https://registry.npmjs.org/path-is-absolute/-/path-is-absolute-1.0.1.tgz", + "integrity": "sha512-AVbw3UJ2e9bq64vSaS9Am0fje1Pa8pbGqTTsmXfaIiMpnr5DlDhfJOuLj9Sf95ZPVDAUerDfEk88MPmPe7UCQg==", "dev": true, 
"license": "MIT", "engines": { @@ -2764,15 +3350,17 @@ "license": "MIT" }, "node_modules/path-scurry": { - "version": "1.10.1", + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/path-scurry/-/path-scurry-2.0.0.tgz", + "integrity": "sha512-ypGJsmGtdXUOeM5u93TyeIEfEhM6s+ljAhrk5vAvSx8uyY/02OvrZnA0YNGUrPXfpJMgI1ODd3nwz8Npx4O4cg==", "dev": true, "license": "BlueOak-1.0.0", "dependencies": { - "lru-cache": "^9.1.1 || ^10.0.0", - "minipass": "^5.0.0 || ^6.0.2 || ^7.0.0" + "lru-cache": "^11.0.0", + "minipass": "^7.1.2" }, "engines": { - "node": ">=16 || 14 >=14.17" + "node": "20 || >=22" }, "funding": { "url": "https://github.com/sponsors/isaacs" @@ -2856,6 +3444,19 @@ "safe-buffer": "^5.1.0" } }, + "node_modules/readdirp": { + "version": "3.6.0", + "resolved": "https://registry.npmjs.org/readdirp/-/readdirp-3.6.0.tgz", + "integrity": "sha512-hOS089on8RduqdbhvQ5Z37A0ESjsqz6qnRcffsMU3495FuTdqSm+7bhJ29JvIOsBDEEnan5DPu9t3To9VRlMzA==", + "dev": true, + "license": "MIT", + "dependencies": { + "picomatch": "^2.2.1" + }, + "engines": { + "node": ">=8.10.0" + } + }, "node_modules/regexp.prototype.flags": { "version": "1.5.2", "resolved": "https://registry.npmjs.org/regexp.prototype.flags/-/regexp.prototype.flags-1.5.2.tgz", @@ -2875,6 +3476,16 @@ "url": "https://github.com/sponsors/ljharb" } }, + "node_modules/require-directory": { + "version": "2.1.1", + "resolved": "https://registry.npmjs.org/require-directory/-/require-directory-2.1.1.tgz", + "integrity": "sha512-fGxEI7+wsG9xrvdjsrlmL22OMTTiHRwAMroiEeMgq8gzoLC/PQr7RsRDSTLUg/bZAZtF+TVIkHc6/4RIKrui+Q==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=0.10.0" + } + }, "node_modules/resolve": { "version": "1.22.2", "dev": true, @@ -2911,19 +3522,20 @@ } }, "node_modules/rimraf": { - "version": "5.0.7", - "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-5.0.7.tgz", - "integrity": "sha512-nV6YcJo5wbLW77m+8KjH8aB/7/rxQy9SZ0HY5shnwULfS+9nmTtVXAJET5NdZmCzA4fPI/Hm1wo/Po/4mopOdg==", + "version": 
"6.0.1", + "resolved": "https://registry.npmjs.org/rimraf/-/rimraf-6.0.1.tgz", + "integrity": "sha512-9dkvaxAsk/xNXSJzMgFqqMCuFgt2+KsOFek3TMLfo8NCPfWpBmqwyNn5Y+NX56QUYfCtsyhF3ayiboEoUmJk/A==", "dev": true, "license": "ISC", "dependencies": { - "glob": "^10.3.7" + "glob": "^11.0.0", + "package-json-from-dist": "^1.0.0" }, "bin": { "rimraf": "dist/esm/bin.mjs" }, "engines": { - "node": ">=14.18" + "node": "20 || >=22" }, "funding": { "url": "https://github.com/sponsors/isaacs" @@ -2931,6 +3543,8 @@ }, "node_modules/rimraf/node_modules/brace-expansion": { "version": "2.0.1", + "resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-2.0.1.tgz", + "integrity": "sha512-XnAIvQ8eM+kC6aULx6wuQiwVsnzsi9d3WxzV3FpWTGA19F621kwdbsAcFKXgKUHZWsy+mY6iL1sHTxWEFCytDA==", "dev": true, "license": "MIT", "dependencies": { @@ -2938,44 +3552,49 @@ } }, "node_modules/rimraf/node_modules/glob": { - "version": "10.3.10", + "version": "11.0.0", + "resolved": "https://registry.npmjs.org/glob/-/glob-11.0.0.tgz", + "integrity": "sha512-9UiX/Bl6J2yaBbxKoEBRm4Cipxgok8kQYcOPEhScPwebu2I0HoQOuYdIO6S3hLuWoZgpDpwQZMzTFxgpkyT76g==", "dev": true, "license": "ISC", "dependencies": { "foreground-child": "^3.1.0", - "jackspeak": "^2.3.5", - "minimatch": "^9.0.1", - "minipass": "^5.0.0 || ^6.0.2 || ^7.0.0", - "path-scurry": "^1.10.1" + "jackspeak": "^4.0.1", + "minimatch": "^10.0.0", + "minipass": "^7.1.2", + "package-json-from-dist": "^1.0.0", + "path-scurry": "^2.0.0" }, "bin": { "glob": "dist/esm/bin.mjs" }, "engines": { - "node": ">=16 || 14 >=14.17" + "node": "20 || >=22" }, "funding": { "url": "https://github.com/sponsors/isaacs" } }, "node_modules/rimraf/node_modules/minimatch": { - "version": "9.0.3", + "version": "10.0.1", + "resolved": "https://registry.npmjs.org/minimatch/-/minimatch-10.0.1.tgz", + "integrity": "sha512-ethXTt3SGGR+95gudmqJ1eNhRO7eGEGIgYA9vnPatK4/etz2MEVDno5GMCibdMTuBMyElzIlgxMna3K94XDIDQ==", "dev": true, "license": "ISC", "dependencies": { "brace-expansion": 
"^2.0.1" }, "engines": { - "node": ">=16 || 14 >=14.17" + "node": "20 || >=22" }, "funding": { "url": "https://github.com/sponsors/isaacs" } }, "node_modules/rollup": { - "version": "4.18.0", - "resolved": "https://registry.npmjs.org/rollup/-/rollup-4.18.0.tgz", - "integrity": "sha512-QmJz14PX3rzbJCN1SG4Xe/bAAX2a6NpCP8ab2vfu2GiUr8AQcr2nCV/oEO3yneFarB67zk8ShlIyWb2LGTb3Sg==", + "version": "4.18.1", + "resolved": "https://registry.npmjs.org/rollup/-/rollup-4.18.1.tgz", + "integrity": "sha512-Elx2UT8lzxxOXMpy5HWQGZqkrQOtrVDDa/bm9l10+U4rQnVzbL/LgZ4NOM1MPIDyHk69W4InuYDF5dzRh4Kw1A==", "dev": true, "license": "MIT", "dependencies": { @@ -2989,22 +3608,22 @@ "npm": ">=8.0.0" }, "optionalDependencies": { - "@rollup/rollup-android-arm-eabi": "4.18.0", - "@rollup/rollup-android-arm64": "4.18.0", - "@rollup/rollup-darwin-arm64": "4.18.0", - "@rollup/rollup-darwin-x64": "4.18.0", - "@rollup/rollup-linux-arm-gnueabihf": "4.18.0", - "@rollup/rollup-linux-arm-musleabihf": "4.18.0", - "@rollup/rollup-linux-arm64-gnu": "4.18.0", - "@rollup/rollup-linux-arm64-musl": "4.18.0", - "@rollup/rollup-linux-powerpc64le-gnu": "4.18.0", - "@rollup/rollup-linux-riscv64-gnu": "4.18.0", - "@rollup/rollup-linux-s390x-gnu": "4.18.0", - "@rollup/rollup-linux-x64-gnu": "4.18.0", - "@rollup/rollup-linux-x64-musl": "4.18.0", - "@rollup/rollup-win32-arm64-msvc": "4.18.0", - "@rollup/rollup-win32-ia32-msvc": "4.18.0", - "@rollup/rollup-win32-x64-msvc": "4.18.0", + "@rollup/rollup-android-arm-eabi": "4.18.1", + "@rollup/rollup-android-arm64": "4.18.1", + "@rollup/rollup-darwin-arm64": "4.18.1", + "@rollup/rollup-darwin-x64": "4.18.1", + "@rollup/rollup-linux-arm-gnueabihf": "4.18.1", + "@rollup/rollup-linux-arm-musleabihf": "4.18.1", + "@rollup/rollup-linux-arm64-gnu": "4.18.1", + "@rollup/rollup-linux-arm64-musl": "4.18.1", + "@rollup/rollup-linux-powerpc64le-gnu": "4.18.1", + "@rollup/rollup-linux-riscv64-gnu": "4.18.1", + "@rollup/rollup-linux-s390x-gnu": "4.18.1", + "@rollup/rollup-linux-x64-gnu": 
"4.18.1", + "@rollup/rollup-linux-x64-musl": "4.18.1", + "@rollup/rollup-win32-arm64-msvc": "4.18.1", + "@rollup/rollup-win32-ia32-msvc": "4.18.1", + "@rollup/rollup-win32-x64-msvc": "4.18.1", "fsevents": "~2.3.2" } }, @@ -3095,33 +3714,10 @@ "url": "https://github.com/sponsors/ljharb" } }, - "node_modules/semver": { - "version": "7.5.4", - "dev": true, - "license": "ISC", - "dependencies": { - "lru-cache": "^6.0.0" - }, - "bin": { - "semver": "bin/semver.js" - }, - "engines": { - "node": ">=10" - } - }, - "node_modules/semver/node_modules/lru-cache": { - "version": "6.0.0", - "dev": true, - "license": "ISC", - "dependencies": { - "yallist": "^4.0.0" - }, - "engines": { - "node": ">=10" - } - }, "node_modules/serialize-javascript": { - "version": "6.0.1", + "version": "6.0.2", + "resolved": "https://registry.npmjs.org/serialize-javascript/-/serialize-javascript-6.0.2.tgz", + "integrity": "sha512-Saa1xPByTTq2gdeFZYLLo+RFE35NHZkAbqZeWNd3BpzppeVisAqpDjcp8dyf6uIvEqJRd46jemmyA4iFIeVk8g==", "dev": true, "license": "BSD-3-Clause", "dependencies": { @@ -3133,6 +3729,7 @@ "resolved": "https://registry.npmjs.org/set-function-length/-/set-function-length-1.2.2.tgz", "integrity": "sha512-pgRc4hJ4/sNjWCSS9AmnS40x3bNMDTknHgL5UaMBTMyJnU90EgWh1Rz+MC9eFu4BuN/UwZjKQuY/1v3rM7HMfg==", "dev": true, + "license": "MIT", "dependencies": { "define-data-property": "^1.1.4", "es-errors": "^1.3.0", @@ -3146,13 +3743,16 @@ } }, "node_modules/set-function-name": { - "version": "2.0.1", + "version": "2.0.2", + "resolved": "https://registry.npmjs.org/set-function-name/-/set-function-name-2.0.2.tgz", + "integrity": "sha512-7PGFlmtwsEADb0WYyvCMa1t+yke6daIG4Wirafur5kcf+MhUnPms1UeR0CKQdTZD81yESwMHbtn+TR+dMviakQ==", "dev": true, "license": "MIT", "dependencies": { - "define-data-property": "^1.0.1", + "define-data-property": "^1.1.4", + "es-errors": "^1.3.0", "functions-have-names": "^1.2.3", - "has-property-descriptors": "^1.0.0" + "has-property-descriptors": "^1.0.2" }, "engines": { "node": ">= 0.4" 
@@ -3178,20 +3778,28 @@ } }, "node_modules/side-channel": { - "version": "1.0.4", + "version": "1.0.6", + "resolved": "https://registry.npmjs.org/side-channel/-/side-channel-1.0.6.tgz", + "integrity": "sha512-fDW/EZ6Q9RiO8eFG8Hj+7u/oW+XrPTIChwCOM2+th2A6OblDtYYIpve9m+KvI9Z4C9qSEXlaGR6bTEYHReuglA==", "dev": true, "license": "MIT", "dependencies": { - "call-bind": "^1.0.0", - "get-intrinsic": "^1.0.2", - "object-inspect": "^1.9.0" + "call-bind": "^1.0.7", + "es-errors": "^1.3.0", + "get-intrinsic": "^1.2.4", + "object-inspect": "^1.13.1" + }, + "engines": { + "node": ">= 0.4" }, "funding": { "url": "https://github.com/sponsors/ljharb" } }, "node_modules/signal-exit": { - "version": "4.0.1", + "version": "4.1.0", + "resolved": "https://registry.npmjs.org/signal-exit/-/signal-exit-4.1.0.tgz", + "integrity": "sha512-bzyZ1e88w9O1iNJbKnOlvYTrWPDl46O1bG0D3XInv+9tkPrxrN8jUUTiFlDkkmKWgn1M6CfIA13SuGqOa9Korw==", "dev": true, "license": "ISC", "engines": { @@ -3225,6 +3833,8 @@ }, "node_modules/stop-iteration-iterator": { "version": "1.0.0", + "resolved": "https://registry.npmjs.org/stop-iteration-iterator/-/stop-iteration-iterator-1.0.0.tgz", + "integrity": "sha512-iCGQj+0l0HOdZ2AEeBADlsRC+vsnDsZsbdSiH1yNSjcfKM7fdpCMfqAL/dwF5BLiw/XhRft/Wax6zQbhq2BcjQ==", "dev": true, "license": "MIT", "dependencies": { @@ -3243,6 +3853,8 @@ }, "node_modules/string-width": { "version": "5.1.2", + "resolved": "https://registry.npmjs.org/string-width/-/string-width-5.1.2.tgz", + "integrity": "sha512-HnLOCR3vjcY8beoNLtcjZ5/nxn2afmME6lhrDrebokqMap+XbeW8n9TXpPDOqdGK5qcI3oT0GKTW6wC7EMiVqA==", "dev": true, "license": "MIT", "dependencies": { @@ -3260,6 +3872,8 @@ "node_modules/string-width-cjs": { "name": "string-width", "version": "4.2.3", + "resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz", + "integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==", "dev": true, "license": "MIT", "dependencies": { @@ -3273,6 +3887,8 @@ 
}, "node_modules/string-width/node_modules/ansi-regex": { "version": "6.0.1", + "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.0.1.tgz", + "integrity": "sha512-n5M855fKb2SsfMIiFFoVrABHJC8QtHwVx+mHWP3QcEqBHYienj5dHSgjbxtC0WEZXYt4wcD6zrQElDPhFuZgfA==", "dev": true, "license": "MIT", "engines": { @@ -3284,11 +3900,15 @@ }, "node_modules/string-width/node_modules/emoji-regex": { "version": "9.2.2", + "resolved": "https://registry.npmjs.org/emoji-regex/-/emoji-regex-9.2.2.tgz", + "integrity": "sha512-L18DaJsXSUk2+42pv8mLs5jJT2hqFkFE4j21wOmgbUqsZ2hL72NsUU785g9RXgo3s0ZNgVl42TiHp3ZtOv/Vyg==", "dev": true, "license": "MIT" }, "node_modules/string-width/node_modules/strip-ansi": { "version": "7.1.0", + "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.1.0.tgz", + "integrity": "sha512-iq6eVVI64nQQTRYq2KtEg2d2uU7LElhTJwsH4YzIHZshxlgZms/wIc4VoDQTlG/IvVIrBKG06CrZnp0qv7hkcQ==", "dev": true, "license": "MIT", "dependencies": { @@ -3367,6 +3987,8 @@ "node_modules/strip-ansi-cjs": { "name": "strip-ansi", "version": "6.0.1", + "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-6.0.1.tgz", + "integrity": "sha512-Y38VPSHcqkFrCpFnQ9vuSXmquuv5oXOKpGeT6aGrr3o3Gc9AlVa6JBfUSOCnbxGGZF+/0ooI7KrPuUSztUdU5A==", "dev": true, "license": "MIT", "dependencies": { @@ -3378,6 +4000,8 @@ }, "node_modules/strip-ansi-cjs/node_modules/ansi-regex": { "version": "5.0.1", + "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-5.0.1.tgz", + "integrity": "sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ==", "dev": true, "license": "MIT", "engines": { @@ -3489,6 +4113,8 @@ }, "node_modules/tape/node_modules/resolve": { "version": "2.0.0-next.5", + "resolved": "https://registry.npmjs.org/resolve/-/resolve-2.0.0-next.5.tgz", + "integrity": "sha512-U7WjGVG9sH8tvjW5SmGbQuui75FiyjAX72HX15DwBBwF9dNiQZRQAg9nnPhYy+TUnE0+VcrttuvNI8oSxZcocA==", "dev": true, "license": "MIT", "dependencies": { @@ -3525,9 +4151,18 @@ 
"dev": true, "license": "MIT" }, - "node_modules/tr46": { - "version": "0.0.3", - "license": "MIT" + "node_modules/to-regex-range": { + "version": "5.0.1", + "resolved": "https://registry.npmjs.org/to-regex-range/-/to-regex-range-5.0.1.tgz", + "integrity": "sha512-65P7iz6X5yEr1cwcgvQxbbIw7Uk3gOy5dIdtZ4rDveLqhrdJP+Li/Hx6tyK0NEb+2GCyneCMJiGqrADCSNk8sQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "is-number": "^7.0.0" + }, + "engines": { + "node": ">=8.0" + } }, "node_modules/tslib": { "version": "2.6.3", @@ -3624,9 +4259,9 @@ } }, "node_modules/typescript": { - "version": "5.5.2", - "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.5.2.tgz", - "integrity": "sha512-NcRtPEOsPFFWjobJEtfihkLCZCXZt/os3zf8nTxjVH3RvTSxjrCamJpbExGvYOF+tFHc3pA65qpdwPbzjohhew==", + "version": "5.5.3", + "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.5.3.tgz", + "integrity": "sha512-/hreyEujaB0w76zKo6717l3L0o/qEUtRgdvUBvlkhoWeOVMjMuHNHk0BRBzikzuGDqNmPQbg5ifMEqsHLiIUcQ==", "dev": true, "license": "Apache-2.0", "bin": { @@ -3676,16 +4311,13 @@ "punycode": "^2.1.0" } }, - "node_modules/webidl-conversions": { - "version": "3.0.1", - "license": "BSD-2-Clause" - }, - "node_modules/whatwg-url": { - "version": "5.0.0", + "node_modules/web-streams-polyfill": { + "version": "3.3.3", + "resolved": "https://registry.npmjs.org/web-streams-polyfill/-/web-streams-polyfill-3.3.3.tgz", + "integrity": "sha512-d2JWLCivmZYTSIoge9MsgFCZrt571BikcWGYkjC1khllbTeDlGqZ2D8vD8E/lJa8WGWbb7Plm8/XJYV7IJHZZw==", "license": "MIT", - "dependencies": { - "tr46": "~0.0.3", - "webidl-conversions": "^3.0.0" + "engines": { + "node": ">= 8" } }, "node_modules/which": { @@ -3704,6 +4336,8 @@ }, "node_modules/which-boxed-primitive": { "version": "1.0.2", + "resolved": "https://registry.npmjs.org/which-boxed-primitive/-/which-boxed-primitive-1.0.2.tgz", + "integrity": "sha512-bwZdv0AKLpplFY2KZRX6TvyuN7ojjr7lwkg6ml0roIy9YeuSr7JS372qlNW18UQYzgYK9ziGcerWqZOmEn9VNg==", "dev": true, 
"license": "MIT", "dependencies": { @@ -3718,14 +4352,19 @@ } }, "node_modules/which-collection": { - "version": "1.0.1", + "version": "1.0.2", + "resolved": "https://registry.npmjs.org/which-collection/-/which-collection-1.0.2.tgz", + "integrity": "sha512-K4jVyjnBdgvc86Y6BkaLZEN933SwYOuBFkdmBu9ZfkcAbdVbpITnDmjvZ/aQjRXQrv5EPkTnD1s39GiiqbngCw==", "dev": true, "license": "MIT", "dependencies": { - "is-map": "^2.0.1", - "is-set": "^2.0.1", - "is-weakmap": "^2.0.1", - "is-weakset": "^2.0.1" + "is-map": "^2.0.3", + "is-set": "^2.0.3", + "is-weakmap": "^2.0.2", + "is-weakset": "^2.0.3" + }, + "engines": { + "node": ">= 0.4" }, "funding": { "url": "https://github.com/sponsors/ljharb" @@ -3758,8 +4397,17 @@ "node": ">=12.17" } }, + "node_modules/workerpool": { + "version": "6.5.1", + "resolved": "https://registry.npmjs.org/workerpool/-/workerpool-6.5.1.tgz", + "integrity": "sha512-Fs4dNYcsdpYSAfVxhnl1L5zTksjvOJxtC5hzMNl+1t9B8hTJTdKDyZ5ju7ztgPy+ft9tBFXoOlDNiOT9WUXZlA==", + "dev": true, + "license": "Apache-2.0" + }, "node_modules/wrap-ansi": { "version": "8.1.0", + "resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-8.1.0.tgz", + "integrity": "sha512-si7QWI6zUMq56bESFvagtmzMdGOtoxfR+Sez11Mobfc7tm+VkUckk9bW2UeffTGVUbOksxmSw0AA2gs8g71NCQ==", "dev": true, "license": "MIT", "dependencies": { @@ -3777,6 +4425,8 @@ "node_modules/wrap-ansi-cjs": { "name": "wrap-ansi", "version": "7.0.0", + "resolved": "https://registry.npmjs.org/wrap-ansi/-/wrap-ansi-7.0.0.tgz", + "integrity": "sha512-YVGIj2kamLSTxw6NsZjoBxfSwsn0ycdesmc4p+Q21c5zPuZ1pl+NfxVdxPtdHvmNVOQ6XSYG4AUtyt/Fi7D16Q==", "dev": true, "license": "MIT", "dependencies": { @@ -3793,6 +4443,8 @@ }, "node_modules/wrap-ansi-cjs/node_modules/string-width": { "version": "4.2.3", + "resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz", + "integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==", "dev": true, "license": "MIT", "dependencies": { @@ 
-3806,6 +4458,8 @@ }, "node_modules/wrap-ansi/node_modules/ansi-regex": { "version": "6.0.1", + "resolved": "https://registry.npmjs.org/ansi-regex/-/ansi-regex-6.0.1.tgz", + "integrity": "sha512-n5M855fKb2SsfMIiFFoVrABHJC8QtHwVx+mHWP3QcEqBHYienj5dHSgjbxtC0WEZXYt4wcD6zrQElDPhFuZgfA==", "dev": true, "license": "MIT", "engines": { @@ -3817,6 +4471,8 @@ }, "node_modules/wrap-ansi/node_modules/ansi-styles": { "version": "6.2.1", + "resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-6.2.1.tgz", + "integrity": "sha512-bN798gFfQX+viw3R7yrGWRqnrN2oRkEkUjjl4JNn4E8GxxbjtG3FbrEIIY3l8/hrwUwIeCZvi4QuOTP4MErVug==", "dev": true, "license": "MIT", "engines": { @@ -3828,6 +4484,8 @@ }, "node_modules/wrap-ansi/node_modules/strip-ansi": { "version": "7.1.0", + "resolved": "https://registry.npmjs.org/strip-ansi/-/strip-ansi-7.1.0.tgz", + "integrity": "sha512-iq6eVVI64nQQTRYq2KtEg2d2uU7LElhTJwsH4YzIHZshxlgZms/wIc4VoDQTlG/IvVIrBKG06CrZnp0qv7hkcQ==", "dev": true, "license": "MIT", "dependencies": { @@ -3845,10 +4503,75 @@ "dev": true, "license": "ISC" }, - "node_modules/yallist": { - "version": "4.0.0", + "node_modules/y18n": { + "version": "5.0.8", + "resolved": "https://registry.npmjs.org/y18n/-/y18n-5.0.8.tgz", + "integrity": "sha512-0pfFzegeDWJHJIAmTLRP2DwHjdF5s7jo9tuztdQxAhINCdvS+3nGINqPd00AphqJR/0LhANUS6/+7SCb98YOfA==", "dev": true, - "license": "ISC" + "license": "ISC", + "engines": { + "node": ">=10" + } + }, + "node_modules/yargs": { + "version": "16.2.0", + "resolved": "https://registry.npmjs.org/yargs/-/yargs-16.2.0.tgz", + "integrity": "sha512-D1mvvtDG0L5ft/jGWkLpG1+m0eQxOfaBvTNELraWj22wSVUMWxZUvYgJYcKh6jGGIkJFhH4IZPQhR4TKpc8mBw==", + "dev": true, + "license": "MIT", + "dependencies": { + "cliui": "^7.0.2", + "escalade": "^3.1.1", + "get-caller-file": "^2.0.5", + "require-directory": "^2.1.1", + "string-width": "^4.2.0", + "y18n": "^5.0.5", + "yargs-parser": "^20.2.2" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/yargs-parser": { + "version": 
"20.2.9", + "resolved": "https://registry.npmjs.org/yargs-parser/-/yargs-parser-20.2.9.tgz", + "integrity": "sha512-y11nGElTIV+CT3Zv9t7VKl+Q3hTQoT9a1Qzezhhl6Rp21gJ/IVTW7Z3y9EWXhuUBC2Shnf+DX0antecpAwSP8w==", + "dev": true, + "license": "ISC", + "engines": { + "node": ">=10" + } + }, + "node_modules/yargs-unparser": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/yargs-unparser/-/yargs-unparser-2.0.0.tgz", + "integrity": "sha512-7pRTIA9Qc1caZ0bZ6RYRGbHJthJWuakf+WmHK0rVeLkNrrGhfoabBNdue6kdINI6r4if7ocq9aD/n7xwKOdzOA==", + "dev": true, + "license": "MIT", + "dependencies": { + "camelcase": "^6.0.0", + "decamelize": "^4.0.0", + "flat": "^5.0.2", + "is-plain-obj": "^2.1.0" + }, + "engines": { + "node": ">=10" + } + }, + "node_modules/yargs/node_modules/string-width": { + "version": "4.2.3", + "resolved": "https://registry.npmjs.org/string-width/-/string-width-4.2.3.tgz", + "integrity": "sha512-wKyQRQpjJ0sIp62ErSZdGsjMJWsap5oRNihHhu6G7JVO/9jIB6UyevL+tXuOqrng8j/cxKTWyWUwvSTriiZz/g==", + "dev": true, + "license": "MIT", + "dependencies": { + "emoji-regex": "^8.0.0", + "is-fullwidth-code-point": "^3.0.0", + "strip-ansi": "^6.0.1" + }, + "engines": { + "node": ">=8" + } }, "node_modules/yocto-queue": { "version": "0.1.0", diff --git a/package.json b/package.json index 1592f053..7b259e89 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,7 @@ { "name": "arquero", - "version": "5.4.1", + "type": "module", + "version": "6.0.0", "description": "Query processing and transformation of array-backed data tables.", "keywords": [ "data", @@ -13,14 +14,12 @@ ], "license": "BSD-3-Clause", "author": "Jeffrey Heer (http://idl.cs.washington.edu)", - "main": "dist/arquero.node.js", - "module": "src/index-node.js", + "exports": "./src/index.js", "unpkg": "dist/arquero.min.js", "jsdelivr": "dist/arquero.min.js", "types": "dist/types/index.d.ts", "browser": { - "./dist/arquero.node.js": "./dist/arquero.min.js", - "./src/index-node.js": "./src/index.js" + "./src/index.js": 
"./src/index-browser.js" }, "repository": { "type": "git", @@ -28,33 +27,28 @@ }, "scripts": { "prebuild": "rimraf dist && mkdir dist", - "build": "rollup -c rollup.config.mjs", + "build": "rollup -c rollup.config.js", "postbuild": "tsc", - "preperf": "npm run build", "perf": "TZ=America/Los_Angeles tape 'perf/**/*-perf.js'", "lint": "eslint src test", - "test": "TZ=America/Los_Angeles tape 'test/**/*-test.js' --require esm", - "prepublishOnly": "npm test && npm run lint && npm run build" + "test": "TZ=America/Los_Angeles mocha 'test/**/*-test.js' --timeout 5000", + "posttest": "npm run lint && tsc --project jsconfig.json", + "prepublishOnly": "npm test && npm run build" }, "dependencies": { - "acorn": "^8.12.0", - "apache-arrow": "^15.0.2", - "node-fetch": "^2.7.0" + "acorn": "^8.12.1", + "apache-arrow": "^17.0.0", + "node-fetch": "^3.3.2" }, "devDependencies": { - "@rollup/plugin-json": "^6.1.0", "@rollup/plugin-node-resolve": "^15.2.3", "@rollup/plugin-terser": "^0.4.4", - "eslint": "^9.5.0", - "esm": "^3.2.25", - "rimraf": "^5.0.7", - "rollup": "^4.18.0", + "eslint": "^9.7.0", + "mocha": "^10.6.0", + "rimraf": "^6.0.1", + "rollup": "^4.18.1", "rollup-plugin-bundle-size": "^1.0.3", "tape": "^5.8.1", - "typescript": "^5.5.2" - }, - "esm": { - "force": true, - "mainFields": ["module", "main"] + "typescript": "^5.5.3" } } diff --git a/perf/arrow-perf.js b/perf/arrow-perf.js index 929c0981..9f9c2f53 100644 --- a/perf/arrow-perf.js +++ b/perf/arrow-perf.js @@ -1,11 +1,11 @@ -const tape = require('tape'); -const time = require('./time'); -const { bools, floats, ints, sample, strings } = require('./data-gen'); -const { fromArrow, table } = require('..'); -const { +import tape from 'tape'; +import { time } from './time.js'; +import { bools, floats, ints, sample, strings } from './data-gen.js'; +import { fromArrow, table, toArrow } from '../src/index.js'; +import { Bool, Dictionary, Float64, Int32, Table, Uint32, Utf8, - vectorFromArray, tableToIPC -} = 
require('apache-arrow'); + tableToIPC, vectorFromArray +} from 'apache-arrow'; function process(N, nulls, msg) { const vectors = { @@ -76,7 +76,7 @@ function encode(name, type, values) { const dt = table({ values }); // measure encoding times - const qt = time(() => tableToIPC(dt.toArrow({ types: { values: type } }))); + const qt = time(() => tableToIPC(toArrow(dt, { types: { values: type } }))); const at = time( () => tableToIPC(new Table({ values: vectorFromArray(values, type) })) ); @@ -86,7 +86,7 @@ function encode(name, type, values) { const ab = tableToIPC(new Table({ values: vectorFromArray(values, type) })).length; - const qb = tableToIPC(dt.toArrow({ types: { values: type }})).length; + const qb = tableToIPC(toArrow(dt, { types: { values: type }})).length; const jb = (new TextEncoder().encode(JSON.stringify(values))).length; // check that arrow and arquero produce the same result @@ -111,4 +111,4 @@ process(5e6, 0.05, '5M values, 5% nulls'); // run arrow serialization benchmarks serialize(1e6, 0, '1M values'); -serialize(1e6, 0.05, '1M values, 5% nulls'); \ No newline at end of file +serialize(1e6, 0.05, '1M values, 5% nulls'); diff --git a/perf/csv-perf.js b/perf/csv-perf.js index 65f59a07..fc024d97 100644 --- a/perf/csv-perf.js +++ b/perf/csv-perf.js @@ -1,11 +1,11 @@ -const tape = require('tape'); -const time = require('./time'); -const { bools, dates, floats, ints, sample, strings } = require('./data-gen'); -const { fromCSV, table } = require('..'); +import tape from 'tape'; +import { time } from './time.js'; +import { bools, dates, floats, ints, sample, strings } from './data-gen.js'; +import { toCSV as _toCSV, fromCSV, table } from '../src/index.js'; function toCSV(...values) { const cols = values.map((v, i) => [`col${i}`, v]); - return table(cols).toCSV(); + return _toCSV(table(cols)); } function parse(csv, opt) { @@ -37,4 +37,4 @@ function run(N, nulls, msg) { } run(1e5, 0, '100k values'); -run(1e5, 0.05, '100k values, 5% nulls'); \ No newline at 
end of file +run(1e5, 0.05, '100k values, 5% nulls'); diff --git a/perf/data-gen.js b/perf/data-gen.js index 26b0570c..7f353285 100644 --- a/perf/data-gen.js +++ b/perf/data-gen.js @@ -1,4 +1,4 @@ -function rint(min, max) { +export function rint(min, max) { let delta = min; if (max === undefined) { min = 0; @@ -8,7 +8,7 @@ function rint(min, max) { return (min + delta * Math.random()) | 0; } -function ints(n, min, max, nullf) { +export function ints(n, min, max, nullf) { const data = []; for (let i = 0; i < n; ++i) { const v = nullf && Math.random() < nullf ? null : rint(min, max); @@ -17,7 +17,7 @@ function ints(n, min, max, nullf) { return data; } -function floats(n, min, max, nullf) { +export function floats(n, min, max, nullf) { const data = []; const delta = max - min; for (let i = 0; i < n; ++i) { @@ -29,7 +29,7 @@ function floats(n, min, max, nullf) { return data; } -function dates(n, nullf) { +export function dates(n, nullf) { const data = []; for (let i = 0; i < n; ++i) { const v = nullf && Math.random() < nullf @@ -40,7 +40,7 @@ function dates(n, nullf) { return data; } -function strings(n) { +export function strings(n) { const c = 'bcdfghjlmpqrstvwxyz'; const v = 'aeiou'; const cn = c.length; @@ -57,7 +57,7 @@ function strings(n) { return data; } -function bools(n, nullf) { +export function bools(n, nullf) { const data = []; for (let i = 0; i < n; ++i) { const v = nullf && Math.random() < nullf ? 
null : (Math.random() < 0.5); @@ -66,7 +66,7 @@ function bools(n, nullf) { return data; } -function sample(n, values, nullf) { +export function sample(n, values, nullf) { const data = []; for (let i = 0; i < n; ++i) { const v = nullf && Math.random() < nullf @@ -76,13 +76,3 @@ function sample(n, values, nullf) { } return data; } - -module.exports = { - rint, - ints, - floats, - dates, - strings, - bools, - sample -}; \ No newline at end of file diff --git a/perf/derive-perf.js b/perf/derive-perf.js index 1a0fb185..f8a853a3 100644 --- a/perf/derive-perf.js +++ b/perf/derive-perf.js @@ -1,7 +1,7 @@ -const tape = require('tape'); -const time = require('./time'); -const { floats, sample, strings } = require('./data-gen'); -const { table } = require('..'); +import tape from 'tape'; +import { time } from './time.js'; +import { floats, sample, strings } from './data-gen.js'; +import { table } from '../src/index.js'; function run(N, nulls, msg) { const dt = table({ @@ -39,4 +39,4 @@ function run(N, nulls, msg) { } run(1e6, 0, '1M values'); -run(1e6, 0.05, '1M values, 5% nulls'); \ No newline at end of file +run(1e6, 0.05, '1M values, 5% nulls'); diff --git a/perf/escape-perf.js b/perf/escape-perf.js index 7f19142e..ed4e323e 100644 --- a/perf/escape-perf.js +++ b/perf/escape-perf.js @@ -1,7 +1,7 @@ -const tape = require('tape'); -const time = require('./time'); -const { floats, sample, strings } = require('./data-gen'); -const aq = require('..'); +import tape from 'tape'; +import { time } from './time.js'; +import { floats, sample, strings } from './data-gen.js'; +import * as aq from '../src/index.js'; function run(N, nulls, msg) { const off = 1; @@ -25,4 +25,4 @@ function run(N, nulls, msg) { } run(1e6, 0, '1M values'); -run(1e6, 0.05, '1M values, 5% nulls'); \ No newline at end of file +run(1e6, 0.05, '1M values, 5% nulls'); diff --git a/perf/filter-perf.js b/perf/filter-perf.js index 66151a7a..5584ccd6 100644 --- a/perf/filter-perf.js +++ b/perf/filter-perf.js @@ -1,7 
+1,7 @@ -const tape = require('tape'); -const time = require('./time'); -const { floats, ints, sample, strings } = require('./data-gen'); -const { table } = require('..'); +import tape from 'tape'; +import { time } from './time.js'; +import { floats, ints, sample, strings } from './data-gen.js'; +import { table } from '../src/index.js'; function run(N, nulls, msg) { const dt = table({ @@ -19,21 +19,21 @@ function run(N, nulls, msg) { table: time(() => dt.filter('d.a > 0')), reify: time(() => dt.filter('d.a > 0').reify()), object: time(a => a.filter(d => d.a > 0), dt.objects()), - array: time(a => a.filter(v => v > 0), dt.column('a').data) + array: time(a => a.filter(v => v > 0), dt.column('a')) }, { type: 'float', table: time(() => dt.filter('d.b > 0')), reify: time(() => dt.filter('d.b > 0').reify()), object: time(a => a.filter(d => d.b > 0), dt.objects()), - array: time(a => a.filter(v => v > 0), dt.column('b').data) + array: time(a => a.filter(v => v > 0), dt.column('b')) }, { type: 'string', table: time(() => dt.filter(`d.c === '${str}'`)), reify: time(() => dt.filter(`d.c === '${str}'`).reify()), object: time(a => a.filter(d => d.c === str), dt.objects()), - array: time(a => a.filter(v => v === str), dt.column('c').data) + array: time(a => a.filter(v => v === str), dt.column('c')) } ]); t.end(); @@ -41,4 +41,4 @@ function run(N, nulls, msg) { } run(1e6, 0, '1M values'); -run(1e6, 0.05, '1M values, 5% nulls'); \ No newline at end of file +run(1e6, 0.05, '1M values, 5% nulls'); diff --git a/perf/rollup-perf.js b/perf/rollup-perf.js index 3a2aa96b..8d5e7732 100644 --- a/perf/rollup-perf.js +++ b/perf/rollup-perf.js @@ -1,7 +1,7 @@ -const tape = require('tape'); -const time = require('./time'); -const { floats, sample, strings } = require('./data-gen'); -const { table, op } = require('..'); +import tape from 'tape'; +import { time } from './time.js'; +import { floats, sample, strings } from './data-gen.js'; +import { op, table } from '../src/index.js'; function 
run(N, nulls, msg) { const dt = table({ @@ -67,4 +67,4 @@ function run(N, nulls, msg) { } run(1e6, 0, '1M values'); -run(1e6, 0.05, '1M values, 5% nulls'); \ No newline at end of file +run(1e6, 0.05, '1M values, 5% nulls'); diff --git a/perf/table-perf.js b/perf/table-perf.js index cc47a0d6..cb1b63db 100644 --- a/perf/table-perf.js +++ b/perf/table-perf.js @@ -1,7 +1,7 @@ -const tape = require('tape'); -const time = require('./time'); -const { floats, ints, sample, strings } = require('./data-gen'); -const { from, table } = require('..'); +import tape from 'tape'; +import { time } from './time.js'; +import { floats, ints, sample, strings } from './data-gen.js'; +import { from, table } from '../src/index.js'; function run(N, nulls, msg) { const dt = table({ @@ -31,4 +31,4 @@ function run(N, nulls, msg) { } run(1e5, 0, '100k values'); -run(1e5, 0.05, '100k values, 5% nulls'); \ No newline at end of file +run(1e5, 0.05, '100k values, 5% nulls'); diff --git a/perf/time.js b/perf/time.js index eada5eb8..11bc39f6 100644 --- a/perf/time.js +++ b/perf/time.js @@ -1,7 +1,7 @@ -const { performance } = require('perf_hooks'); +import { performance } from 'perf_hooks'; -module.exports = function time(fn, ...args) { +export function time(fn, ...args) { const t0 = performance.now(); fn(...args); return Math.round(performance.now() - t0); -}; \ No newline at end of file +}; diff --git a/rollup.config.mjs b/rollup.config.js similarity index 57% rename from rollup.config.mjs rename to rollup.config.js index 4a1199e2..45d59c88 100644 --- a/rollup.config.mjs +++ b/rollup.config.js @@ -1,42 +1,20 @@ -import json from '@rollup/plugin-json'; import bundleSize from 'rollup-plugin-bundle-size'; import { nodeResolve } from '@rollup/plugin-node-resolve'; import terser from '@rollup/plugin-terser'; -function onwarn(warning, defaultHandler) { - if (warning.code !== 'CIRCULAR_DEPENDENCY') { - defaultHandler(warning); - } -} - const name = 'aq'; -const external = [ 'apache-arrow', 'node-fetch' 
]; +const external = [ 'apache-arrow' ]; const globals = { 'apache-arrow': 'Arrow' }; const plugins = [ - json(), bundleSize(), nodeResolve({ modulesOnly: true }) ]; export default [ { - input: 'src/index-node.js', - external: ['acorn'].concat(external), - plugins, - onwarn, - output: [ - { - file: 'dist/arquero.node.js', - format: 'cjs', - name - } - ] - }, - { - input: 'src/index.js', + input: 'src/index-browser.js', external, plugins, - onwarn, output: [ { file: 'dist/arquero.js', @@ -54,4 +32,4 @@ export default [ } ] } -]; \ No newline at end of file +]; diff --git a/src/api.js b/src/api.js new file mode 100644 index 00000000..9960a0b8 --- /dev/null +++ b/src/api.js @@ -0,0 +1,32 @@ +// export internal class and method definitions +export { BitSet } from './table/BitSet.js'; +export { Table } from './table/Table.js'; +export { ColumnTable } from './table/ColumnTable.js'; +export { default as Reducer } from './verbs/reduce/reducer.js'; +export { default as parse } from './expression/parse.js'; +export { default as walk_ast } from './expression/ast/walk.js'; + +// public API +export { seed } from './util/random.js'; +export { default as fromArrow } from './arrow/from-arrow.js'; +export { default as fromCSV } from './format/from-csv.js'; +export { default as fromFixed } from './format/from-fixed.js'; +export { default as fromJSON } from './format/from-json.js'; +export { default as toArrow } from './arrow/to-arrow.js'; +export { default as toArrowIPC } from './arrow/to-arrow-ipc.js'; +export { default as toCSV } from './format/to-csv.js'; +export { default as toHTML } from './format/to-html.js'; +export { default as toJSON } from './format/to-json.js'; +export { default as toMarkdown } from './format/to-markdown.js'; +export { default as bin } from './helpers/bin.js'; +export { default as escape } from './helpers/escape.js'; +export { default as desc } from './helpers/desc.js'; +export { default as field } from './helpers/field.js'; +export { default as frac } 
from './helpers/frac.js'; +export { default as names } from './helpers/names.js'; +export { default as rolling } from './helpers/rolling.js'; +export { all, endswith, matches, not, range, startswith } from './helpers/selection.js'; +export { default as agg } from './verbs/helpers/agg.js'; +export { default as op } from './op/op-api.js'; +export { addAggregateFunction, addFunction, addWindowFunction } from './op/register.js'; +export { table, from } from './table/index.js'; diff --git a/src/arrow/arrow-column.js b/src/arrow/arrow-column.js index 2bb6e484..a8b43f10 100644 --- a/src/arrow/arrow-column.js +++ b/src/arrow/arrow-column.js @@ -1,68 +1,277 @@ -import arrowDictionary from './arrow-dictionary'; -import error from '../util/error'; -import repeat from '../util/repeat'; -import toString from '../util/to-string'; -import unroll from '../util/unroll'; -import { isDict, isFixedSizeList, isList, isStruct, isUtf8 } from './arrow-types'; +import sequence from '../op/functions/sequence.js'; +import error from '../util/error.js'; +import isFunction from '../util/is-function.js'; +import repeat from '../util/repeat.js'; +import toString from '../util/to-string.js'; +import unroll from '../util/unroll.js'; -const isListType = type => isList(type) || isFixedSizeList(type); +// Hardwire Arrow type ids to sidestep hard dependency +// https://github.com/apache/arrow/blob/master/js/src/enum.ts +const isDict = ({ typeId }) => typeId === -1; +const isInt = ({ typeId }) => typeId === 2; +const isUtf8 = ({ typeId }) => typeId === 5; +const isDecimal = ({ typeId }) => typeId === 7; +const isDate = ({ typeId }) => typeId === 8; +const isTimestamp = ({ typeId }) => typeId === 10; +const isStruct = ({ typeId }) => typeId === 13; +const isLargeUtf8 = ({ typeId }) => typeId === 20; +const isListType = ({ typeId }) => typeId === 12 || typeId === 16; /** * Create an Arquero column that proxies access to an Arrow column. - * @param {object} arrow An Apache Arrow column. 
- * @return {import('../table/column').ColumnType} An Arquero-compatible column. + * @param {import('apache-arrow').Vector} vector An Apache Arrow column. + * @param {import('./types.js').ArrowColumnOptions} [options] + * Arrow conversion options. + * @return {import('../table/types.js').ColumnType} + * An Arquero-compatible column. */ -export default function arrowColumn(vector, nested) { +export default function arrowColumn(vector, options) { + return isDict(vector.type) + ? dictionaryColumn(vector) + : proxyColumn(vector, options); +} + +/** + * Internal method for Arquero column generation for Apache Arrow data + * @param {import('apache-arrow').Vector} vector An Apache Arrow column. + * @param {import('./types.js').ArrowColumnOptions} [options] + * Arrow conversion options. + * @return {import('../table/types.js').ColumnType} + * An Arquero-compatible column. + */ +function proxyColumn(vector, options = {}) { const { type, length, numChildren } = vector; - if (isDict(type)) return arrowDictionary(vector); + const { + convertDate = true, + convertDecimal = true, + convertTimestamp = true, + convertBigInt = false, + memoize = true + } = options; - const get = numChildren && nested ? getNested(vector) - : numChildren ? memoize(getNested(vector)) - : isUtf8(type) ? memoize(row => vector.get(row)) - : null; + // create a getter method for retrieving values + let get; + if (numChildren) { + // extract lists/structs to JS objects, possibly memoized + get = getNested(vector, options); + if (memoize) get = memoized(length, get); + } else if (memoize && (isUtf8(type) || isLargeUtf8(type))) { + // memoize string extraction + get = memoized(length, row => vector.get(row)); + } else if ((convertDate && isDate(type)) + || (convertTimestamp && isTimestamp(type))) { + // convert to Date type, memoized for object equality + get = memoized(length, row => { + const v = vector.get(row); + return v == null ? 
null : new Date(vector.get(row)); + }); + } else if (convertDecimal && isDecimal(type)) { + // map decimal to number + const scale = 1 / Math.pow(10, type.scale); + get = row => { + const v = vector.get(row); + return v == null ? null : decimalToNumber(v, scale); + }; + } else if (convertBigInt && isInt(type) && type.bitWidth >= 64) { + // map bigint to number + get = row => { + const v = vector.get(row); + return v == null ? null : Number(v); + }; + } else if (!isFunction(vector.at)) { + // backwards compatibility with older arrow versions + // the vector `at` method was added in Arrow v16 + get = row => vector.get(row); + } else { + // use the arrow column directly + return vector; + } - return get - ? { vector, length, get, [Symbol.iterator]: () => iterator(length, get) } - : vector; + // return a column proxy object using custom getter + return { + length, + at: get, + [Symbol.iterator]: () => (function* () { + for (let i = 0; i < length; ++i) { + yield get(i); + } + })() + }; } -function memoize(get) { - const values = []; +/** + * Memoize expensive getter calls by caching retrieved values. + */ +function memoized(length, get) { + const values = Array(length); return row => { const v = values[row]; return v !== undefined ? v : (values[row] = get(row)); }; } -function* iterator(n, get) { - for (let i = 0; i < n; ++i) { - yield get(i); +// generate base values for big integers represented as a Uint32Array +const BASE32 = Array.from( + { length: 8 }, + (_, i) => Math.pow(2, i * 32) +); + +/** + * Convert a fixed point decimal value to a double precision number. + * Note: if the value is sufficiently large the conversion may be lossy! 
+ * @param {Uint32Array & { signed: boolean }} v a fixed point decimal value + * @param {number} scale a scale factor, corresponding to the + * number of fractional decimal digits in the fixed point value + * @return {number} the resulting number + */ +function decimalToNumber(v, scale) { + const n = v.length; + let x = 0; + if (v.signed && (v[n - 1] | 0) < 0) { + for (let i = 0; i < n; ++i) { + x += ~v[i] * BASE32[i]; + } + x = -(x + 1); + } else { + for (let i = 0; i < n; ++i) { + x += v[i] * BASE32[i]; + } } + return x * scale; } -const arrayFrom = vector => vector.numChildren - ? repeat(vector.length, getNested(vector)) - : vector.nullCount ? [...vector] - : vector.toArray(); +// get an array for a given vector +function arrayFrom(vector, options) { + return vector.numChildren ? repeat(vector.length, getNested(vector, options)) + : vector.nullCount ? [...vector] + : vector.toArray(); +} -const getNested = vector => isListType(vector.type) ? getList(vector) - : isStruct(vector.type) ? getStruct(vector) - : error(`Unsupported Arrow type: ${toString(vector.VectorName)}`); +// generate a getter for a nested data type +function getNested(vector, options) { + return isListType(vector.type) ? getList(vector, options) + : isStruct(vector.type) ? getStruct(vector, options) + : error(`Unsupported Arrow type: ${toString(vector.VectorName)}`); +} -const getList = vector => vector.nullCount - ? row => vector.isValid(row) ? arrayFrom(vector.get(row)) : null - : row => arrayFrom(vector.get(row)); +// generate a getter for a list data type +function getList(vector, options) { + return vector.nullCount + ? row => vector.isValid(row) + ? 
arrayFrom(vector.get(row), options) + : null + : row => arrayFrom(vector.get(row), options); +} -function getStruct(vector) { +// generate a getter for a struct (object) data type +function getStruct(vector, options) { + // disable memoization for nested columns as we extract JS objects + const opt = { ...options, memoize: false }; const props = []; const code = []; vector.type.children.forEach((field, i) => { - props.push(arrowColumn(vector.getChildAt(i), true)); - code.push(`${toString(field.name)}:_${i}.get(row)`); + props.push(arrowColumn(vector.getChildAt(i), opt)); + code.push(`${toString(field.name)}:_${i}.at(row)`); }); const get = unroll('row', '({' + code + '})', props); return vector.nullCount ? row => vector.isValid(row) ? get(row) : null : get; -} \ No newline at end of file +} + +/** + * Create a new Arquero column that proxies access to an + * Apache Arrow dictionary column. + * @param {import('apache-arrow').Vector} vector + * An Apache Arrow dictionary column. + */ +function dictionaryColumn(vector) { + const { data, length, nullCount } = vector; + const dictionary = data[data.length - 1].dictionary; + const size = dictionary.length; + const keys = dictKeys(data || [vector], length, nullCount, size); + const get = memoized(size, + k => k == null || k < 0 || k >= size ? null : dictionary.get(k) + ); + + return { + vector, + length, + at: row => get(keys[row]), + key: row => keys[row], + keyFor(value) { + if (value === null) return nullCount ? size : -1; + for (let i = 0; i < size; ++i) { + if (get(i) === value) return i; + } + return -1; + }, + groups(names) { + const s = size + (nullCount ? 1 : 0); + return { keys, get: [get], names, rows: sequence(0, s), size: s }; + }, + [Symbol.iterator]() { + return vector[Symbol.iterator](); + } + }; +} + +/** + * Generate a dictionary key array. 
+ * @param {readonly any[]} chunks Arrow column chunks + * @param {number} length The length of the Arrow column + * @param {number} nulls The count of column null values + * @param {number} size The backing dictionary size + */ +function dictKeys(chunks, length, nulls, size) { + const v = chunks.length > 1 || nulls + ? flatten(chunks, length, chunks[0].type.indices) + : chunks[0].values; + return nulls ? nullKeys(chunks, v, size) : v; +} + +/** + * Flatten Arrow column chunks into a single array. + */ +function flatten(chunks, length, type) { + const array = new type.ArrayType(length); + const n = chunks.length; + for (let i = 0, idx = 0, len; i < n; ++i) { + len = chunks[i].length; + array.set(chunks[i].values.subarray(0, len), idx); + idx += len; + } + return array; +} + +/** + * Encode null values as an additional dictionary key. + * Returns a new key array with null values added. + * TODO: safeguard against integer overflow? + */ +function nullKeys(chunks, keys, key) { + // iterate over null bitmaps, encode null values as key + const n = chunks.length; + for (let i = 0, idx = 0, m, base, bits, byte; i < n; ++i) { + bits = chunks[i].nullBitmap; + m = chunks[i].length >> 3; + if (bits && bits.length) { + for (let j = 0; j <= m; ++j) { + if ((byte = bits[j]) !== 255) { + base = idx + (j << 3); + if ((byte & (1 << 0)) === 0) keys[base + 0] = key; + if ((byte & (1 << 1)) === 0) keys[base + 1] = key; + if ((byte & (1 << 2)) === 0) keys[base + 2] = key; + if ((byte & (1 << 3)) === 0) keys[base + 3] = key; + if ((byte & (1 << 4)) === 0) keys[base + 4] = key; + if ((byte & (1 << 5)) === 0) keys[base + 5] = key; + if ((byte & (1 << 6)) === 0) keys[base + 6] = key; + if ((byte & (1 << 7)) === 0) keys[base + 7] = key; + } + } + } + idx += chunks[i].length; + } + return keys; +} diff --git a/src/arrow/arrow-dictionary.js b/src/arrow/arrow-dictionary.js deleted file mode 100644 index 95d98ff2..00000000 --- a/src/arrow/arrow-dictionary.js +++ /dev/null @@ -1,104 +0,0 @@ 
-import sequence from '../op/functions/sequence'; - -/** - * Create a new Arquero column that proxies access to an - * Apache Arrow dictionary column. - * @param {object} vector An Apache Arrow dictionary column. - */ -export default function(vector) { - const { data, length, nullCount } = vector; - const dictionary = data[data.length - 1].dictionary; - const size = dictionary.length; - const keys = dictKeys(data || [vector], length, nullCount, size); - const values = Array(size); - - const value = k => k == null || k < 0 || k >= size ? null - : values[k] !== undefined ? values[k] - : (values[k] = dictionary.get(k)); - - return { - vector, - length, - - get: row => value(keys[row]), - - key: row => keys[row], - - keyFor(value) { - if (value === null) return nullCount ? size : -1; - for (let i = 0; i < size; ++i) { - if (values[i] === undefined) values[i] = dictionary.get(i); - if (values[i] === value) return i; - } - return -1; - }, - - groups(names) { - const s = size + (nullCount ? 1 : 0); - return { keys, get: [value], names, rows: sequence(0, s), size: s }; - }, - - [Symbol.iterator]() { - return vector[Symbol.iterator](); - } - }; -} - -/** - * Generate a dictionary key array - * @param {object[]} chunks Arrow column chunks - * @param {number} length The length of the Arrow column - * @param {number} nulls The count of column null values - * @param {number} size The backing dictionary size - */ -function dictKeys(chunks, length, nulls, size) { - const v = chunks.length > 1 || nulls - ? flatten(chunks, length, chunks[0].type.indices) - : chunks[0].values; - return nulls ? nullKeys(chunks, v, size) : v; -} - -/** - * Flatten Arrow column chunks into a single array. 
- */ -function flatten(chunks, length, type) { - const array = new type.ArrayType(length); - const n = chunks.length; - for (let i = 0, idx = 0, len; i < n; ++i) { - len = chunks[i].length; - array.set(chunks[i].values.subarray(0, len), idx); - idx += len; - } - return array; -} - -/** - * Encode null values as an additional dictionary key. - * Returns a new key array with null values added. - * TODO: safeguard against integer overflow? - */ -function nullKeys(chunks, keys, key) { - // iterate over null bitmaps, encode null values as key - const n = chunks.length; - for (let i = 0, idx = 0, m, base, bits, byte; i < n; ++i) { - bits = chunks[i].nullBitmap; - m = chunks[i].length >> 3; - if (bits && bits.length) { - for (let j = 0; j <= m; ++j) { - if ((byte = bits[j]) !== 255) { - base = idx + (j << 3); - if ((byte & (1 << 0)) === 0) keys[base + 0] = key; - if ((byte & (1 << 1)) === 0) keys[base + 1] = key; - if ((byte & (1 << 2)) === 0) keys[base + 2] = key; - if ((byte & (1 << 3)) === 0) keys[base + 3] = key; - if ((byte & (1 << 4)) === 0) keys[base + 4] = key; - if ((byte & (1 << 5)) === 0) keys[base + 5] = key; - if ((byte & (1 << 6)) === 0) keys[base + 6] = key; - if ((byte & (1 << 7)) === 0) keys[base + 7] = key; - } - } - } - idx += chunks[i].length; - } - return keys; -} \ No newline at end of file diff --git a/src/arrow/arrow-table.js b/src/arrow/arrow-table.js index fec65fbf..de93d647 100644 --- a/src/arrow/arrow-table.js +++ b/src/arrow/arrow-table.js @@ -1,27 +1,38 @@ -import { Table, tableFromIPC } from 'apache-arrow'; -import error from '../util/error'; +import { Table, tableFromIPC, tableToIPC } from 'apache-arrow'; +import error from '../util/error.js'; -const fail = () => error( +const fail = (cause) => error( 'Apache Arrow not imported, ' + - 'see https://github.com/uwdata/arquero#usage' + 'see https://github.com/uwdata/arquero#usage', + cause ); -export function table() { +export function arrowTable(...args) { // trap access to provide a helpful 
message // when Apache Arrow has not been imported try { - return Table; - } catch (err) { // eslint-disable-line no-unused-vars - fail(); + return new Table(...args); + } catch (err) { + fail(err); } } -export function fromIPC() { +export function arrowTableFromIPC(bytes) { // trap access to provide a helpful message // when Apache Arrow has not been imported try { - return tableFromIPC; - } catch (err) { // eslint-disable-line no-unused-vars - fail(); + return tableFromIPC(bytes); + } catch (err) { + fail(err); } -} \ No newline at end of file +} + +export function arrowTableToIPC(table, format) { + // trap access to provide a helpful message + // when Apache Arrow has not been imported + try { + return tableToIPC(table, format); + } catch (err) { + fail(err); + } +} diff --git a/src/arrow/arrow-types.js b/src/arrow/arrow-types.js deleted file mode 100644 index 6cba8a81..00000000 --- a/src/arrow/arrow-types.js +++ /dev/null @@ -1,8 +0,0 @@ -// Hardwire Arrow type ids to sidestep dependency -// https://github.com/apache/arrow/blob/master/js/src/enum.ts - -export const isDict = ({ typeId }) => typeId === -1; -export const isUtf8 = ({ typeId }) => typeId === 5; -export const isList = ({ typeId }) => typeId === 12; -export const isStruct = ({ typeId }) => typeId === 13; -export const isFixedSizeList = ({ typeId }) => typeId === 16; \ No newline at end of file diff --git a/src/arrow/builder/array-builder.js b/src/arrow/builder/array-builder.js index c4e99910..c9de3a49 100644 --- a/src/arrow/builder/array-builder.js +++ b/src/arrow/builder/array-builder.js @@ -1,4 +1,4 @@ -import { array } from './util'; +import { array } from './util.js'; export default function(type, length) { const data = array(type.ArrayType, length); @@ -6,4 +6,4 @@ export default function(type, length) { set(value, index) { data[index] = value; }, data: () => ({ type, length, buffers: [null, data] }) }; -} \ No newline at end of file +} diff --git a/src/arrow/builder/bool-builder.js 
b/src/arrow/builder/bool-builder.js index c12a159a..a327ff32 100644 --- a/src/arrow/builder/bool-builder.js +++ b/src/arrow/builder/bool-builder.js @@ -1,4 +1,4 @@ -import { array } from './util'; +import { array } from './util.js'; export default function(type, length) { const data = array(type.ArrayType, length / 8); @@ -8,4 +8,4 @@ export default function(type, length) { }, data: () => ({ type, length, buffers: [null, data] }) }; -} \ No newline at end of file +} diff --git a/src/arrow/builder/date-day-builder.js b/src/arrow/builder/date-day-builder.js index 6a7cd6f6..d7bb0129 100644 --- a/src/arrow/builder/date-day-builder.js +++ b/src/arrow/builder/date-day-builder.js @@ -1,4 +1,4 @@ -import { array } from './util'; +import { array } from './util.js'; export default function(type, length) { const data = array(type.ArrayType, length); @@ -6,4 +6,4 @@ export default function(type, length) { set(value, index) { data[index] = (value / 86400000) | 0; }, data: () => ({ type, length, buffers: [null, data] }) }; -} \ No newline at end of file +} diff --git a/src/arrow/builder/date-millis-builder.js b/src/arrow/builder/date-millis-builder.js index a7f031b5..1bcdac1a 100644 --- a/src/arrow/builder/date-millis-builder.js +++ b/src/arrow/builder/date-millis-builder.js @@ -1,13 +1,9 @@ -import { array } from './util'; +import { array } from './util.js'; export default function(type, length) { - const data = array(type.ArrayType, length << 1); + const data = array(type.ArrayType, length); return { - set(value, index) { - const i = index << 1; - data[ i] = (value % 4294967296) | 0; - data[i+1] = (value / 4294967296) | 0; - }, + set(value, index) { data[index] = BigInt(value); }, data: () => ({ type, length, buffers: [null, data] }) }; -} \ No newline at end of file +} diff --git a/src/arrow/builder/default-builder.js b/src/arrow/builder/default-builder.js index d70f5a64..d0301ec9 100644 --- a/src/arrow/builder/default-builder.js +++ b/src/arrow/builder/default-builder.js @@ 
-9,4 +9,4 @@ export default function(type) { set(value, index) { b.set(index, value); }, data: () => b.finish().flush() }; -} \ No newline at end of file +} diff --git a/src/arrow/builder/dictionary-builder.js b/src/arrow/builder/dictionary-builder.js index 70058995..c301a7ae 100644 --- a/src/arrow/builder/dictionary-builder.js +++ b/src/arrow/builder/dictionary-builder.js @@ -1,5 +1,5 @@ -import utf8Builder from './utf8-builder'; -import { array, arrowVector } from './util'; +import utf8Builder from './utf8-builder.js'; +import { array, arrowVector } from './util.js'; export default function(type, length) { const values = []; @@ -33,4 +33,4 @@ function dictionary(type, values, strlen) { const b = utf8Builder(type, values.length, strlen); values.forEach(b.set); return arrowVector(b.data()); -} \ No newline at end of file +} diff --git a/src/arrow/builder/index.js b/src/arrow/builder/index.js index 65a5423a..2e9dde71 100644 --- a/src/arrow/builder/index.js +++ b/src/arrow/builder/index.js @@ -1,11 +1,11 @@ import { Type } from 'apache-arrow'; -import arrayBuilder from './array-builder'; -import boolBuilder from './bool-builder'; -import dateDayBuilder from './date-day-builder'; -import dateMillisBuilder from './date-millis-builder'; -import defaultBuilder from './default-builder'; -import dictionaryBuilder from './dictionary-builder'; -import validBuilder from './valid-builder'; +import arrayBuilder from './array-builder.js'; +import boolBuilder from './bool-builder.js'; +import dateDayBuilder from './date-day-builder.js'; +import dateMillisBuilder from './date-millis-builder.js'; +import defaultBuilder from './default-builder.js'; +import dictionaryBuilder from './dictionary-builder.js'; +import validBuilder from './valid-builder.js'; export default function(type, nrows, nullable = true) { let method; @@ -37,4 +37,4 @@ export default function(type, nrows, nullable = true) { return method == null ? defaultBuilder(type) : nullable ? 
validBuilder(method(type, nrows), nrows) : method(type, nrows); -} \ No newline at end of file +} diff --git a/src/arrow/builder/resolve-type.js b/src/arrow/builder/resolve-type.js index c300dc6b..14eb3415 100644 --- a/src/arrow/builder/resolve-type.js +++ b/src/arrow/builder/resolve-type.js @@ -26,8 +26,8 @@ import { Uint8, Utf8 } from 'apache-arrow'; -import error from '../../util/error'; -import toString from '../../util/to-string'; +import error from '../../util/error.js'; +import toString from '../../util/to-string.js'; export default function(type) { if (type instanceof DataType || type == null) { @@ -94,4 +94,4 @@ export default function(type) { 'Use a data type constructor instead?' ); } -} \ No newline at end of file +} diff --git a/src/arrow/builder/utf8-builder.js b/src/arrow/builder/utf8-builder.js index 8a43dd5b..b8fb35d9 100644 --- a/src/arrow/builder/utf8-builder.js +++ b/src/arrow/builder/utf8-builder.js @@ -1,4 +1,4 @@ -import { array, ceil64Bytes, writeUtf8 } from './util'; +import { array, ceil64Bytes, writeUtf8 } from './util.js'; export default function(type, length, strlen) { const offset = array(Int32Array, length + 1); @@ -18,4 +18,4 @@ export default function(type, length, strlen) { return { type, length, buffers: [offset, data] }; } }; -} \ No newline at end of file +} diff --git a/src/arrow/builder/util.js b/src/arrow/builder/util.js index 2f9f0ca0..c3cc25f8 100644 --- a/src/arrow/builder/util.js +++ b/src/arrow/builder/util.js @@ -30,4 +30,4 @@ export function encodeInto(data, idx, str) { return encoder.encodeInto(str, data.subarray(idx)).written; } -export const writeUtf8 = encoder.encodeInto ? encodeInto : encode; \ No newline at end of file +export const writeUtf8 = encoder.encodeInto ? 
encodeInto : encode; diff --git a/src/arrow/builder/valid-builder.js b/src/arrow/builder/valid-builder.js index d2f04188..ee21daec 100644 --- a/src/arrow/builder/valid-builder.js +++ b/src/arrow/builder/valid-builder.js @@ -1,4 +1,4 @@ -import { array } from './util'; +import { array } from './util.js'; export default function(builder, length) { const valid = array(Uint8Array, length / 8); @@ -22,4 +22,4 @@ export default function(builder, length) { return d; } }; -} \ No newline at end of file +} diff --git a/src/arrow/encode/data-from-objects.js b/src/arrow/encode/data-from-objects.js index 334de681..af913308 100644 --- a/src/arrow/encode/data-from-objects.js +++ b/src/arrow/encode/data-from-objects.js @@ -1,6 +1,6 @@ -import { dataFromScan } from './data-from'; -import { profile } from './profiler'; -import resolveType from '../builder/resolve-type'; +import { dataFromScan } from './data-from.js'; +import { profile } from './profiler.js'; +import resolveType from '../builder/resolve-type.js'; export default function(data, name, nrows, scan, type, nullable = true) { type = resolveType(type); @@ -13,4 +13,4 @@ export default function(data, name, nrows, scan, type, nullable = true) { } return dataFromScan(nrows, scan, name, type, nullable); -} \ No newline at end of file +} diff --git a/src/arrow/encode/data-from-table.js b/src/arrow/encode/data-from-table.js index 1e3324d5..85fe155e 100644 --- a/src/arrow/encode/data-from-table.js +++ b/src/arrow/encode/data-from-table.js @@ -3,10 +3,10 @@ import { Int16, Int32, Int64, Int8, Uint16, Uint32, Uint64, Uint8, Vector } from 'apache-arrow'; -import { dataFromArray, dataFromScan } from './data-from'; -import { profile } from './profiler'; -import resolveType from '../builder/resolve-type'; -import isTypedArray from '../../util/is-typed-array'; +import { dataFromArray, dataFromScan } from './data-from.js'; +import { profile } from './profiler.js'; +import resolveType from '../builder/resolve-type.js'; +import isTypedArray 
from '../../util/is-typed-array.js'; export default function(table, name, nrows, scan, type, nullable = true) { type = resolveType(type); @@ -66,4 +66,4 @@ function typeFromArray(data) { function typeCompatible(a, b) { return !a || !b ? true : a.compareTo(b); -} \ No newline at end of file +} diff --git a/src/arrow/encode/data-from.js b/src/arrow/encode/data-from.js index 06564da5..8b0a0ecd 100644 --- a/src/arrow/encode/data-from.js +++ b/src/arrow/encode/data-from.js @@ -1,5 +1,5 @@ -import builder from '../builder'; -import { arrowData, ceil64Bytes } from '../builder/util'; +import builder from '../builder/index.js'; +import { arrowData, ceil64Bytes } from '../builder/util.js'; export function dataFromArray(array, type) { const length = array.length; @@ -18,4 +18,4 @@ export function dataFromScan(nrows, scan, column, type, nullable = true) { const b = builder(type, nrows, nullable); scan(column, b.set); return arrowData(b.data()); -} \ No newline at end of file +} diff --git a/src/arrow/encode/index.js b/src/arrow/encode/index.js deleted file mode 100644 index 8bdd2255..00000000 --- a/src/arrow/encode/index.js +++ /dev/null @@ -1,77 +0,0 @@ -import { Table } from 'apache-arrow'; // eslint-disable-line no-unused-vars - -import dataFromObjects from './data-from-objects'; -import dataFromTable from './data-from-table'; -import { scanArray, scanTable } from './scan'; -import { table } from '../arrow-table'; -import error from '../../util/error'; -import isArray from '../../util/is-array'; -import isFunction from '../../util/is-function'; - -/** - * Options for Arrow encoding. - * @typedef {object} ArrowFormatOptions - * @property {number} [limit=Infinity] The maximum number of rows to include. - * @property {number} [offset=0] The row offset indicating how many initial - * rows to skip. - * @property {string[]|(data: object) => string[]} [columns] Ordered list of - * column names to include. 
If function-valued, the function should accept - * a dataset as input and return an array of column name strings. - * @property {object} [types] The Arrow data types to use. If specified, - * the input should be an object with column names for keys and Arrow data - * types for values. If a column type is not explicitly provided, type - * inference will be performed to guess an appropriate type. - */ - -/** - * Create an Apache Arrow table for an input dataset. - * @param {Array|object} data An input dataset to convert to Arrow format. - * If array-valued, the data should consist of an array of objects where - * each entry represents a row and named properties represent columns. - * Otherwise, the input data should be an Arquero table. - * @param {ArrowFormatOptions} [options] Encoding options, including - * column data types. - * @return {Table} An Apache Arrow Table instance. - */ -export default function(data, options = {}) { - const { types = {} } = options; - const { dataFrom, names, nrows, scan } = init(data, options); - const cols = {}; - names.forEach(name => { - const col = dataFrom(data, name, nrows, scan, types[name]); - if (col.length !== nrows) { - error('Column length mismatch'); - } - cols[name] = col; - }); - const T = table(); - return new T(cols); -} - -function init(data, options) { - const { columns, limit = Infinity, offset = 0 } = options; - const names = isFunction(columns) ? columns(data) - : isArray(columns) ? 
columns - : null; - if (isArray(data)) { - return { - dataFrom: dataFromObjects, - names: names || Object.keys(data[0]), - nrows: Math.min(limit, data.length - offset), - scan: scanArray(data, limit, offset) - }; - } else if (isTable(data)) { - return { - dataFrom: dataFromTable, - names: names || data.columnNames(), - nrows: Math.min(limit, data.numRows() - offset), - scan: scanTable(data, limit, offset) - }; - } else { - error('Unsupported input data type'); - } -} - -function isTable(data) { - return data && isFunction(data.reify); -} \ No newline at end of file diff --git a/src/arrow/encode/profiler.js b/src/arrow/encode/profiler.js index 29c3883f..16d1b7ff 100644 --- a/src/arrow/encode/profiler.js +++ b/src/arrow/encode/profiler.js @@ -1,9 +1,9 @@ import { Field, FixedSizeList, List, Struct, Type } from 'apache-arrow'; -import resolveType from '../builder/resolve-type'; -import error from '../../util/error'; -import isArrayType from '../../util/is-array-type'; -import isDate from '../../util/is-date'; -import isExactUTCDate from '../../util/is-exact-utc-date'; +import resolveType from '../builder/resolve-type.js'; +import error from '../../util/error.js'; +import isArrayType from '../../util/is-array-type.js'; +import isDate from '../../util/is-date.js'; +import isExactUTCDate from '../../util/is-exact-utc-date.js'; export function profile(scan, column) { const p = profiler(); @@ -100,6 +100,7 @@ function infer(p) { return Type.Float64; } else if (p.bigints === valid) { + // @ts-ignore const v = -p.min > p.max ? -p.min - 1n : p.max; return p.min < 0 ? v < 2 ** 63 ? 
Type.Int64 @@ -134,4 +135,4 @@ function infer(p) { else { error('Type inference failure'); } -} \ No newline at end of file +} diff --git a/src/arrow/encode/scan.js b/src/arrow/encode/scan.js index 6d1991f6..b12a5886 100644 --- a/src/arrow/encode/scan.js +++ b/src/arrow/encode/scan.js @@ -1,4 +1,4 @@ -import isArrayType from '../../util/is-array-type'; +import isArrayType from '../../util/is-array-type.js'; export function scanArray(data, limit, offset) { const n = Math.min(data.length, offset + limit); @@ -14,12 +14,16 @@ export function scanTable(table, limit, offset) { && !table.isFiltered() && !table.isOrdered(); return (column, visit) => { + const isArray = isArrayType(column); let i = -1; - scanAll && isArrayType(column.data) - ? column.data.forEach(visit) + scanAll && isArray + ? column.forEach(visit) : table.scan( - row => visit(column.get(row), ++i), + // optimize column value access + isArray + ? row => visit(column[row], ++i) + : row => visit(column.at(row), ++i), true, limit, offset ); }; -} \ No newline at end of file +} diff --git a/src/arrow/from-arrow.js b/src/arrow/from-arrow.js new file mode 100644 index 00000000..cd9a0602 --- /dev/null +++ b/src/arrow/from-arrow.js @@ -0,0 +1,39 @@ +import { arrowTableFromIPC } from './arrow-table.js'; +import arrowColumn from './arrow-column.js'; +import resolve, { all } from '../helpers/selection.js'; +import { columnSet } from '../table/ColumnSet.js'; +import { ColumnTable } from '../table/ColumnTable.js'; + +/** + * Create a new table backed by an Apache Arrow table instance. + * @param {import('./types.js').ArrowInput} arrow + * An Apache Arrow data table or Arrow IPC byte buffer. + * @param {import('./types.js').ArrowOptions} [options] + * Options for Arrow import. + * @return {ColumnTable} A new table containing the imported values. 
+ */ +export default function(arrow, options) { + if (arrow instanceof ArrayBuffer || ArrayBuffer.isView(arrow)) { + arrow = arrowTableFromIPC(arrow); + } + + const { + columns = all(), + ...columnOptions + } = options || {}; + + // resolve column selection + const fields = arrow.schema.fields.map(f => f.name); + const sel = resolve({ + columnNames: test => test ? fields.filter(test) : fields.slice(), + columnIndex: name => fields.indexOf(name) + }, columns); + + // build Arquero columns for backing Arrow columns + const cols = columnSet(); + sel.forEach((name, key) => { + cols.add(name, arrowColumn(arrow.getChild(key), columnOptions)); + }); + + return new ColumnTable(cols.data, cols.names); +} diff --git a/src/arrow/to-arrow-ipc.js b/src/arrow/to-arrow-ipc.js new file mode 100644 index 00000000..9c079ec6 --- /dev/null +++ b/src/arrow/to-arrow-ipc.js @@ -0,0 +1,18 @@ +import { arrowTableToIPC } from './arrow-table.js'; +import toArrow from './to-arrow.js'; + +/** + * Format a table as binary data in the Apache Arrow IPC format. + * @param {object[]|import('../table/Table.js').Table} data The table data + * @param {import('./types.js').ArrowIPCFormatOptions} [options] + * The Arrow IPC formatting options. Set the *format* option to `'stream'` + * or `'file'` to specify the IPC format. + * @return {Uint8Array} A new Uint8Array of Arrow-encoded binary data. 
+ */ +export default function(data, options = {}) { + const { format = 'stream', ...toArrowOptions } = options; + if (!['stream', 'file'].includes(format)) { + throw Error('Unrecognised Arrow IPC output format'); + } + return arrowTableToIPC(toArrow(data, toArrowOptions), format); +} diff --git a/src/arrow/to-arrow.js b/src/arrow/to-arrow.js new file mode 100644 index 00000000..e4adb132 --- /dev/null +++ b/src/arrow/to-arrow.js @@ -0,0 +1,59 @@ +import { arrowTable } from './arrow-table.js'; +import dataFromObjects from './encode/data-from-objects.js'; +import dataFromTable from './encode/data-from-table.js'; +import { scanArray, scanTable } from './encode/scan.js'; +import error from '../util/error.js'; +import isArray from '../util/is-array.js'; +import isFunction from '../util/is-function.js'; + +/** + * Create an Apache Arrow table for an input dataset. + * @param {object[]|import('../table/Table.js').Table} data An input dataset + * to convert to Arrow format. If array-valued, the data should consist of an + * array of objects where each entry represents a row and named properties + * represent columns. Otherwise, the input data should be an Arquero table. + * @param {import('./types.js').ArrowFormatOptions} [options] + * Encoding options, including column data types. + * @return {import('apache-arrow').Table} An Apache Arrow Table instance. + */ +export default function(data, options = {}) { + const { types = {} } = options; + const { dataFrom, names, nrows, scan } = init(data, options); + const cols = {}; + names.forEach(name => { + const col = dataFrom(data, name, nrows, scan, types[name]); + if (col.length !== nrows) { + error('Column length mismatch'); + } + cols[name] = col; + }); + return arrowTable(cols); +} + +function init(data, options) { + const { columns, limit = Infinity, offset = 0 } = options; + const names = isFunction(columns) ? columns(data) + : isArray(columns) ? 
columns + : null; + if (isArray(data)) { + return { + dataFrom: dataFromObjects, + names: names || Object.keys(data[0]), + nrows: Math.min(limit, data.length - offset), + scan: scanArray(data, limit, offset) + }; + } else if (isTable(data)) { + return { + dataFrom: dataFromTable, + names: names || data.columnNames(), + nrows: Math.min(limit, data.numRows() - offset), + scan: scanTable(data, limit, offset) + }; + } else { + error('Unsupported input data type'); + } +} + +function isTable(data) { + return data && isFunction(data.reify); +} diff --git a/src/arrow/types.ts b/src/arrow/types.ts new file mode 100644 index 00000000..f3cf8f10 --- /dev/null +++ b/src/arrow/types.ts @@ -0,0 +1,91 @@ +import { DataType, Table } from 'apache-arrow'; +import type { Select, TypedArray } from '../table/types.js'; + +/** Arrow input data as bytes or loaded table. */ +export type ArrowInput = + | ArrayBuffer + | TypedArray + | Table; + +/** Options for Apache Arrow column conversion. */ +export interface ArrowColumnOptions { + /** + * Flag (default `true`) to convert Arrow date values to JavaScript Date + * objects. If false, defaults to what the Arrow implementation provides, + * typically timestamps as number values. + */ + convertDate?: boolean; + /** + * Flag (default `true`) to convert Arrow fixed point decimal values to + * JavaScript numbers. If false, defaults to what the Arrow implementation + * provides, typically byte arrays. The conversion will be lossy if the + * decimal can not be exactly represented as a double-precision floating + * point number. + */ + convertDecimal?: boolean; + /** + * Flag (default `true`) to convert Arrow timestamp values to JavaScript + * Date objects. If false, defaults to what the Arrow implementation + * provides, typically timestamps as number values. + */ + convertTimestamp?: boolean; + /** + * Flag (default `false`) to convert Arrow integers with bit widths of 64 + * bits or higher to JavaScript numbers. 
If false, defaults to what the + * Arrow implementation provides, typically `BigInt` values. The conversion + * will be lossy if the integer is so large it can not be exactly + * represented as a double-precision floating point number. + */ + convertBigInt?: boolean; + /** + * A hint (default `true`) to enable memoization of expensive conversions. + * If true, memoization is applied for string and nested (list, struct) + * types, caching extracted values to enable faster access. Memoization + * is also applied to converted Date values, in part to ensure exact object + * equality. This hint is ignored for dictionary columns, whose values are + * always memoized. + */ + memoize?: boolean; +} + +/** Options for Apache Arrow import. */ +export interface ArrowOptions extends ArrowColumnOptions { + /** + * An ordered set of columns to import. The input may consist of column name + * strings, column integer indices, objects with current column names as + * keys and new column names as values (for renaming), or selection helper + * functions such as *all*, *not*, or *range*. + */ + columns?: Select; +} + +/** Options for Arrow encoding. */ +export interface ArrowFormatOptions { + /** The maximum number of rows to include (default `Infinity`). */ + limit?: number; + /** + * The row offset (default `0`) indicating how many initial rows to skip. + */ + offset?: number; + /** + * Ordered list of column names to include. If function-valued, the + * function should accept a dataset as input and return an array of + * column name strings. If unspecified all columns are included. + */ + columns?: string[] | ((data: any) => string[]); + /** + * The Arrow data types to use. If specified, the input should be an + * object with column names for keys and Arrow data types for values. + * If a column type is not explicitly provided, type inference will be + * performed to guess an appropriate type. + */ + types?: Record<string, DataType>; +} + +/** Options for Arrow IPC encoding. 
*/ +export interface ArrowIPCFormatOptions extends ArrowFormatOptions { + /** + * The Arrow IPC byte format to use. One of `'stream'` (default) or `'file'`. + */ + format?: 'stream' | 'file'; +} diff --git a/src/engine/derive.js b/src/engine/derive.js deleted file mode 100644 index dc05121a..00000000 --- a/src/engine/derive.js +++ /dev/null @@ -1,78 +0,0 @@ -import { window } from './window/window'; -import { aggregate } from './reduce/util'; -import { hasWindow } from '../op'; -import columnSet from '../table/column-set'; -import repeat from '../util/repeat'; - -function isWindowed(op) { - return hasWindow(op.name) || - op.frame && ( - Number.isFinite(op.frame[0]) || - Number.isFinite(op.frame[1]) - ); -} - -export default function(table, { names, exprs, ops }, options = {}) { - // instantiate output data - const total = table.totalRows(); - const cols = columnSet(options.drop ? null : table); - const data = names.map(name => cols.add(name, Array(total))); - - // analyze operations, compute non-windowed aggregates - const [ aggOps, winOps ] = segmentOps(ops); - - const size = table.isGrouped() ? table.groups().size : 1; - const result = aggregate( - table, aggOps, - repeat(ops.length, () => Array(size)) - ); - - // perform table scans to generate output values - winOps.length - ? window(table, data, exprs, result, winOps) - : output(table, data, exprs, result); - - return table.create(cols); -} - -function segmentOps(ops) { - const aggOps = []; - const winOps = []; - const n = ops.length; - - for (let i = 0; i < n; ++i) { - const op = ops[i]; - op.id = i; - (isWindowed(op) ? winOps : aggOps).push(op); - } - - return [aggOps, winOps]; -} - -function output(table, cols, exprs, result) { - const bits = table.mask(); - const data = table.data(); - const { keys } = table.groups() || {}; - const op = keys - ? 
(id, row) => result[id][keys[row]] - : id => result[id][0]; - - const m = cols.length; - for (let j = 0; j < m; ++j) { - const get = exprs[j]; - const col = cols[j]; - - // inline the following for performance: - // table.scan((i, data) => col[i] = get(i, data, op)); - if (bits) { - for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { - col[i] = get(i, data, op); - } - } else { - const n = table.totalRows(); - for (let i = 0; i < n; ++i) { - col[i] = get(i, data, op); - } - } - } -} \ No newline at end of file diff --git a/src/engine/filter.js b/src/engine/filter.js deleted file mode 100644 index 26777821..00000000 --- a/src/engine/filter.js +++ /dev/null @@ -1,22 +0,0 @@ -import BitSet from '../table/bit-set'; - -export default function(table, predicate) { - const n = table.totalRows(); - const bits = table.mask(); - const data = table.data(); - const filter = new BitSet(n); - - // inline the following for performance: - // table.scan((row, data) => { if (predicate(row, data)) filter.set(row); }); - if (bits) { - for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { - if (predicate(i, data)) filter.set(i); - } - } else { - for (let i = 0; i < n; ++i) { - if (predicate(i, data)) filter.set(i); - } - } - - return table.create({ filter }); -} \ No newline at end of file diff --git a/src/engine/fold.js b/src/engine/fold.js deleted file mode 100644 index 865f396c..00000000 --- a/src/engine/fold.js +++ /dev/null @@ -1,18 +0,0 @@ -import unroll from './unroll'; -import { aggregateGet } from './reduce/util'; - -export default function(table, { names = [], exprs = [], ops = [] }, options = {}) { - if (names.length === 0) return table; - - const [k = 'key', v = 'value'] = options.as || []; - const vals = aggregateGet(table, ops, exprs); - - return unroll( - table, - { - names: [k, v], - exprs: [() => names, (row, data) => vals.map(fn => fn(row, data))] - }, - { ...options, drop: names } - ); -} \ No newline at end of file diff --git a/src/engine/groupby.js 
b/src/engine/groupby.js deleted file mode 100644 index 3d5914c7..00000000 --- a/src/engine/groupby.js +++ /dev/null @@ -1,51 +0,0 @@ -import { aggregateGet } from './reduce/util'; -import keyFunction from '../util/key-function'; - -export default function(table, exprs) { - return table.create({ - groups: createGroups(table, exprs) - }); -} - -function createGroups(table, { names = [], exprs = [], ops = [] }) { - const n = names.length; - if (n === 0) return null; - - // check for optimized path when grouping by a single field - // use pre-calculated groups if available - if (n === 1 && !table.isFiltered() && exprs[0].field) { - const col = table.column(exprs[0].field); - if (col.groups) return col.groups(names); - } - - let get = aggregateGet(table, ops, exprs); - const getKey = keyFunction(get); - const nrows = table.totalRows(); - const keys = new Uint32Array(nrows); - const index = {}; - const rows = []; - - // inline table scan for performance - const data = table.data(); - const bits = table.mask(); - if (bits) { - for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { - const key = getKey(i, data) + ''; - const val = index[key]; - keys[i] = val != null ? val : (index[key] = rows.push(i) - 1); - } - } else { - for (let i = 0; i < nrows; ++i) { - const key = getKey(i, data) + ''; - const val = index[key]; - keys[i] = val != null ? 
val : (index[key] = rows.push(i) - 1); - } - } - - if (!ops.length) { - // capture data in closure, so no interaction with select - get = get.map(f => row => f(row, data)); - } - - return { keys, get, names, rows, size: rows.length }; -} \ No newline at end of file diff --git a/src/engine/impute.js b/src/engine/impute.js deleted file mode 100644 index d4d40026..00000000 --- a/src/engine/impute.js +++ /dev/null @@ -1,134 +0,0 @@ -import { aggregateGet } from './reduce/util'; -import columnSet from '../table/column-set'; -import isValid from '../util/is-valid'; -import keyFunction from '../util/key-function'; -import unroll from '../util/unroll'; - -export default function(table, values, keys, arrays) { - const write = keys && keys.length; - return impute( - write ? expand(table, keys, arrays) : table, - values, - write - ); -} - -function impute(table, { names, exprs, ops }, write) { - const gets = aggregateGet(table, ops, exprs); - const cols = write ? null : columnSet(table); - const rows = table.totalRows(); - - names.forEach((name, i) => { - const col = table.column(name); - const out = write ? col.data : cols.add(name, Array(rows)); - const get = gets[i]; - - table.scan(idx => { - const v = col.get(idx); - out[idx] = !isValid(v) ? get(idx) : v; - }); - }); - - return write ? table : table.create(cols); -} - -function expand(table, keys, values) { - const groups = table.groups(); - const data = table.data(); - - // expansion keys and accessors - const keyNames = (groups ? groups.names : []).concat(keys); - const keyGet = (groups ? 
groups.get : []) - .concat(keys.map(key => table.getter(key))); - - // build hash of existing rows - const hash = new Set(); - const keyTable = keyFunction(keyGet); - table.scan((idx, data) => hash.add(keyTable(idx, data))); - - // initialize output table data - const names = table.columnNames(); - const cols = columnSet(); - const out = names.map(name => cols.add(name, [])); - names.forEach((name, i) => { - const old = data[name]; - const col = out[i]; - table.scan(row => col.push(old.get(row))); - }); - - // enumerate expanded value sets and augment output table - const keyEnum = keyFunction(keyGet.map((k, i) => a => a[i])); - const set = unroll( - 'v', - '{' + out.map((_, i) => `_${i}.push(v[$${i}]);`).join('') + '}', - out, names.map(name => keyNames.indexOf(name)) - ); - - if (groups) { - let row = groups.keys.length; - const prod = values.reduce((p, a) => p * a.length, groups.size); - const keys = new Uint32Array(prod + (row - hash.size)); - keys.set(groups.keys); - enumerate(groups, values, (vec, idx) => { - if (!hash.has(keyEnum(vec))) { - set(vec); - keys[row++] = idx[0]; - } - }); - cols.groupby({ ...groups, keys }); - } else { - enumerate(groups, values, vec => { - if (!hash.has(keyEnum(vec))) set(vec); - }); - } - - return table.create(cols.new()); -} - -function enumerate(groups, values, callback) { - const offset = groups ? groups.get.length : 0; - const pad = groups ? 
1 : 0; - const len = pad + values.length; - const lens = new Int32Array(len); - const idxs = new Int32Array(len); - const set = []; - - if (groups) { - const { get, rows, size } = groups; - lens[0] = size; - set.push((vec, idx) => { - const row = rows[idx]; - for (let i = 0; i < offset; ++i) { - vec[i] = get[i](row); - } - }); - } - - values.forEach((a, i) => { - const j = i + offset; - lens[i + pad] = a.length; - set.push((vec, idx) => vec[j] = a[idx]); - }); - - const vec = Array(offset + values.length); - - // initialize value vector - for (let i = 0; i < len; ++i) { - set[i](vec, 0); - } - callback(vec, idxs); - - // enumerate all combinations of values - for (let i = len - 1; i >= 0;) { - const idx = ++idxs[i]; - if (idx < lens[i]) { - set[i](vec, idx); - callback(vec, idxs); - i = len - 1; - } else { - idxs[i] = 0; - set[i](vec, 0); - --i; - } - } -} \ No newline at end of file diff --git a/src/engine/join-filter.js b/src/engine/join-filter.js deleted file mode 100644 index e71ca35e..00000000 --- a/src/engine/join-filter.js +++ /dev/null @@ -1,100 +0,0 @@ -import { rowLookup } from './join/lookup'; -import BitSet from '../table/bit-set'; -import isArray from '../util/is-array'; - -export default function(tableL, tableR, predicate, options = {}) { - // calculate semi-join filter mask - const filter = new BitSet(tableL.totalRows()); - const join = isArray(predicate) ? 
hashSemiJoin : loopSemiJoin; - join(filter, tableL, tableR, predicate); - - // if anti-join, negate the filter - if (options.anti) { - filter.not().and(tableL.mask()); - } - - return tableL.create({ filter }); -} - -function hashSemiJoin(filter, tableL, tableR, [keyL, keyR]) { - // build lookup table - const lut = rowLookup(tableR, keyR); - - // scan table, update filter with matches - tableL.scan((rowL, data) => { - const rowR = lut.get(keyL(rowL, data)); - if (rowR >= 0) filter.set(rowL); - }); -} - -function loopSemiJoin(filter, tableL, tableR, predicate) { - const nL = tableL.numRows(); - const nR = tableR.numRows(); - const dataL = tableL.data(); - const dataR = tableR.data(); - - if (tableL.isFiltered() || tableR.isFiltered()) { - // use indices as at least one table is filtered - const idxL = tableL.indices(false); - const idxR = tableR.indices(false); - for (let i = 0; i < nL; ++i) { - const rowL = idxL[i]; - for (let j = 0; j < nR; ++j) { - if (predicate(rowL, dataL, idxR[j], dataR)) { - filter.set(rowL); - break; - } - } - } - } else { - // no filters, enumerate row indices directly - for (let i = 0; i < nL; ++i) { - for (let j = 0; j < nR; ++j) { - if (predicate(i, dataL, j, dataR)) { - filter.set(i); - break; - } - } - } - } -} - -// export default function(tableL, tableR, predicate, options = {}) { -// const filter = new BitSet(tableL.totalRows()); -// const nL = tableL.numRows(); -// const nR = tableR.numRows(); -// const dataL = tableL.data(); -// const dataR = tableR.data(); - -// if (tableL.isFiltered() || tableR.isFiltered()) { -// // use indices as at least one table is filtered -// const idxL = tableL.indices(false); -// const idxR = tableR.indices(false); -// for (let i = 0; i < nL; ++i) { -// const rowL = idxL[i]; -// for (let j = 0; j < nR; ++j) { -// if (predicate(rowL, dataL, idxR[j], dataR)) { -// filter.set(rowL); -// break; -// } -// } -// } -// } else { -// // no filters, enumerate row indices directly -// for (let i = 0; i < nL; ++i) { 
-// for (let j = 0; j < nR; ++j) { -// if (predicate(i, dataL, j, dataR)) { -// filter.set(i); -// break; -// } -// } -// } -// } - -// // if anti-join, negate the filter -// if (options.anti) { -// filter.not().and(tableL.mask()); -// } - -// return tableL.create({ filter }); -// } \ No newline at end of file diff --git a/src/engine/join.js b/src/engine/join.js deleted file mode 100644 index e242402f..00000000 --- a/src/engine/join.js +++ /dev/null @@ -1,110 +0,0 @@ -import { indexLookup } from './join/lookup'; -import columnSet from '../table/column-set'; -import concat from '../util/concat'; -import isArray from '../util/is-array'; -import unroll from '../util/unroll'; - -function emitter(columns, getters) { - const args = ['i', 'a', 'j', 'b']; - return unroll( - args, - '{' + concat(columns, (_, i) => `_${i}.push($${i}(${args}));`) + '}', - columns, getters - ); -} - -export default function(tableL, tableR, predicate, { names, exprs }, options = {}) { - // initialize data for left table - const dataL = tableL.data(); - const idxL = tableL.indices(false); - const nL = idxL.length; - const hitL = new Int32Array(nL); - - // initialize data for right table - const dataR = tableR.data(); - const idxR = tableR.indices(false); - const nR = idxR.length; - const hitR = new Int32Array(nR); - - // initialize output data - const ncols = names.length; - const cols = columnSet(); - const columns = Array(ncols); - const getters = Array(ncols); - for (let i = 0; i < names.length; ++i) { - columns[i] = cols.add(names[i], []); - getters[i] = exprs[i]; - } - const emit = emitter(columns, getters); - - // perform join - const join = isArray(predicate) ? 
hashJoin : loopJoin; - join(emit, predicate, dataL, dataR, idxL, idxR, hitL, hitR, nL, nR); - - if (options.left) { - for (let i = 0; i < nL; ++i) { - if (!hitL[i]) { - emit(idxL[i], dataL, -1, dataR); - } - } - } - - if (options.right) { - for (let j = 0; j < nR; ++j) { - if (!hitR[j]) { - emit(-1, dataL, idxR[j], dataR); - } - } - } - - return tableL.create(cols.new()); -} - -function loopJoin(emit, predicate, dataL, dataR, idxL, idxR, hitL, hitR, nL, nR) { - // perform nested-loops join - for (let i = 0; i < nL; ++i) { - const rowL = idxL[i]; - for (let j = 0; j < nR; ++j) { - const rowR = idxR[j]; - if (predicate(rowL, dataL, rowR, dataR)) { - emit(rowL, dataL, rowR, dataR); - hitL[i] = 1; - hitR[j] = 1; - } - } - } -} - -function hashJoin(emit, [keyL, keyR], dataL, dataR, idxL, idxR, hitL, hitR, nL, nR) { - // determine which table to hash - let dataScan, keyScan, hitScan, idxScan; - let dataHash, keyHash, hitHash, idxHash; - let emitScan = emit; - if (nL >= nR) { - dataScan = dataL; keyScan = keyL; hitScan = hitL; idxScan = idxL; - dataHash = dataR; keyHash = keyR; hitHash = hitR; idxHash = idxR; - } else { - dataScan = dataR; keyScan = keyR; hitScan = hitR; idxScan = idxR; - dataHash = dataL; keyHash = keyL; hitHash = hitL; idxHash = idxL; - emitScan = (i, a, j, b) => emit(j, b, i, a); - } - - // build lookup table - const lut = indexLookup(idxHash, dataHash, keyHash); - - // scan other table - const m = idxScan.length; - for (let j = 0; j < m; ++j) { - const rowScan = idxScan[j]; - const list = lut.get(keyScan(rowScan, dataScan)); - if (list) { - const n = list.length; - for (let k = 0; k < n; ++k) { - const i = list[k]; - emitScan(rowScan, dataScan, idxHash[i], dataHash); - hitHash[i] = 1; - } - hitScan[j] = 1; - } - } -} \ No newline at end of file diff --git a/src/engine/lookup.js b/src/engine/lookup.js deleted file mode 100644 index ee435580..00000000 --- a/src/engine/lookup.js +++ /dev/null @@ -1,33 +0,0 @@ -import { rowLookup } from './join/lookup'; 
-import { aggregateGet } from './reduce/util'; -import columnSet from '../table/column-set'; -import NULL from '../util/null'; -import concat from '../util/concat'; -import unroll from '../util/unroll'; - -export default function(tableL, tableR, [keyL, keyR], { names, exprs, ops }) { - // instantiate output data - const cols = columnSet(tableL); - const total = tableL.totalRows(); - names.forEach(name => cols.add(name, Array(total).fill(NULL))); - - // build lookup table - const lut = rowLookup(tableR, keyR); - - // generate setter function for lookup match - const set = unroll( - ['lr', 'rr', 'data'], - '{' + concat(names, (_, i) => `_[${i}][lr] = $[${i}](rr, data);`) + '}', - names.map(name => cols.data[name]), - aggregateGet(tableR, ops, exprs) - ); - - // find matching rows, set values on match - const dataR = tableR.data(); - tableL.scan((lrow, data) => { - const rrow = lut.get(keyL(lrow, data)); - if (rrow >= 0) set(lrow, rrow, dataR); - }); - - return tableL.create(cols); -} \ No newline at end of file diff --git a/src/engine/orderby.js b/src/engine/orderby.js deleted file mode 100644 index 23adbdda..00000000 --- a/src/engine/orderby.js +++ /dev/null @@ -1,3 +0,0 @@ -export default function(table, comparator) { - return table.create({ order: comparator }); -} diff --git a/src/engine/pivot.js b/src/engine/pivot.js deleted file mode 100644 index bb2e1562..00000000 --- a/src/engine/pivot.js +++ /dev/null @@ -1,109 +0,0 @@ -import { aggregate, aggregateGet, groupOutput } from './reduce/util'; -import columnSet from '../table/column-set'; - -const opt = (value, defaultValue) => value != null ? value : defaultValue; - -export default function(table, on, values, options = {}) { - const { keys, keyColumn } = pivotKeys(table, on, options); - const vsep = opt(options.valueSeparator, '_'); - const namefn = values.names.length > 1 - ? 
(i, name) => name + vsep + keys[i] - : i => keys[i]; - - // perform separate aggregate operations for each key - // if keys do not match, emit NaN so aggregate skips it - // use custom toString method for proper field resolution - const results = keys.map( - k => aggregate(table, values.ops.map(op => { - if (op.name === 'count') { // fix #273 - const fn = r => k === keyColumn[r] ? 1 : NaN; - fn.toString = () => k + ':1'; - return { ...op, name: 'sum', fields: [fn] }; - } - const fields = op.fields.map(f => { - const fn = (r, d) => k === keyColumn[r] ? f(r, d) : NaN; - fn.toString = () => k + ':' + f; - return fn; - }); - return { ...op, fields }; - })) - ); - - return table.create(output(values, namefn, table.groups(), results)); -} - -function pivotKeys(table, on, options) { - const limit = options.limit > 0 ? +options.limit : Infinity; - const sort = opt(options.sort, true); - const ksep = opt(options.keySeparator, '_'); - - // construct key accessor function - const get = aggregateGet(table, on.ops, on.exprs); - const key = get.length === 1 - ? get[0] - : (row, data) => get.map(fn => fn(row, data)).join(ksep); - - // generate vector of per-row key values - const kcol = Array(table.totalRows()); - table.scan((row, data) => kcol[row] = key(row, data)); - - // collect unique key values - const uniq = aggregate( - table.ungroup(), - [ { - id: 0, - name: 'array_agg_distinct', - fields: [(row => kcol[row])], params: [] - } ] - )[0][0]; - - // get ordered set of unique key values - const keys = sort ? uniq.sort() : uniq; - - // return key values - return { - keys: Number.isFinite(limit) ? keys.slice(0, limit) : keys, - keyColumn: kcol - }; -} - -function output({ names, exprs }, namefn, groups, results) { - const size = groups ? 
groups.size : 1; - const cols = columnSet(); - const m = results.length; - const n = names.length; - - let result; - const op = (id, row) => result[id][row]; - - // write groupby fields to output - if (groups) groupOutput(cols, groups); - - // write pivot values to output - for (let i = 0; i < n; ++i) { - const get = exprs[i]; - if (get.field != null) { - // if expression is op only, use aggregates directly - for (let j = 0; j < m; ++j) { - cols.add(namefn(j, names[i]), results[j][get.field]); - } - } else if (size > 1) { - // if multiple groups, evaluate expression for each - for (let j = 0; j < m; ++j) { - result = results[j]; - const col = cols.add(namefn(j, names[i]), Array(size)); - for (let k = 0; k < size; ++k) { - col[k] = get(k, null, op); - } - } - } else { - // if only one group, no need to loop - for (let j = 0; j < m; ++j) { - result = results[j]; - cols.add(namefn(j, names[i]), [ get(0, null, op) ]); - } - } - } - - return cols.new(); -} \ No newline at end of file diff --git a/src/engine/rollup.js b/src/engine/rollup.js deleted file mode 100644 index f439673b..00000000 --- a/src/engine/rollup.js +++ /dev/null @@ -1,41 +0,0 @@ -import { aggregate, groupOutput } from './reduce/util'; -import columnSet from '../table/column-set'; - -export default function(table, { names, exprs, ops }) { - // output data - const cols = columnSet(); - const groups = table.groups(); - - // write groupby fields to output - if (groups) groupOutput(cols, groups); - - // compute and write aggregate output - output(names, exprs, groups, aggregate(table, ops), cols); - - // return output table - return table.create(cols.new()); -} - -function output(names, exprs, groups, result = [], cols) { - if (!exprs.length) return; - const size = groups ? 
groups.size : 1; - const op = (id, row) => result[id][row]; - const n = names.length; - - for (let i = 0; i < n; ++i) { - const get = exprs[i]; - if (get.field != null) { - // if expression is op only, use aggregates directly - cols.add(names[i], result[get.field]); - } else if (size > 1) { - // if multiple groups, evaluate expression for each - const col = cols.add(names[i], Array(size)); - for (let j = 0; j < size; ++j) { - col[j] = get(j, null, op); - } - } else { - // if only one group, no need to loop - cols.add(names[i], [ get(0, null, op) ]); - } - } -} \ No newline at end of file diff --git a/src/engine/sample.js b/src/engine/sample.js deleted file mode 100644 index 64f1f590..00000000 --- a/src/engine/sample.js +++ /dev/null @@ -1,38 +0,0 @@ -import sample from '../util/sample'; -import _shuffle from '../util/shuffle'; - -export default function(table, size, weight, options = {}) { - const { replace, shuffle } = options; - const parts = table.partitions(false); - - let total = 0; - size = parts.map((idx, group) => { - let s = size(group); - total += (s = (replace ? 
s : Math.min(idx.length, s))); - return s; - }); - - const samples = new Uint32Array(total); - let curr = 0; - - parts.forEach((idx, group) => { - const sz = size[group]; - const buf = samples.subarray(curr, curr += sz); - - if (!replace && sz === idx.length) { - // sample size === data size, no replacement - // no need to sample, just copy indices - buf.set(idx); - } else { - sample(buf, replace, idx, weight); - } - }); - - if (shuffle !== false && (parts.length > 1 || !replace)) { - // sampling with replacement methods shuffle, so in - // that case a single partition is already good to go - _shuffle(samples); - } - - return table.reify(samples); -} \ No newline at end of file diff --git a/src/engine/select.js b/src/engine/select.js deleted file mode 100644 index aa0c5c11..00000000 --- a/src/engine/select.js +++ /dev/null @@ -1,17 +0,0 @@ -import columnSet from '../table/column-set'; -import error from '../util/error'; -import isString from '../util/is-string'; - -export default function(table, columns) { - const cols = columnSet(); - - columns.forEach((value, curr) => { - const next = isString(value) ? value : curr; - if (next) { - const col = table.column(curr) || error(`Unrecognized column: ${curr}`); - cols.add(next, col); - } - }); - - return table.create(cols); -} \ No newline at end of file diff --git a/src/engine/spread.js b/src/engine/spread.js deleted file mode 100644 index f12d1e78..00000000 --- a/src/engine/spread.js +++ /dev/null @@ -1,59 +0,0 @@ -import { aggregateGet } from './reduce/util'; -import columnSet from '../table/column-set'; -import NULL from '../util/null'; -import toArray from '../util/to-array'; - -export default function(table, { names, exprs, ops = [] }, options = {}) { - if (names.length === 0) return table; - - // ignore 'as' if there are multiple field names - const as = (names.length === 1 && options.as) || []; - const drop = options.drop == null ? true : !!options.drop; - const limit = options.limit == null - ? 
as.length || Infinity - : Math.max(1, +options.limit || 1); - - const get = aggregateGet(table, ops, exprs); - const cols = columnSet(); - const map = names.reduce((map, name, i) => map.set(name, i), new Map()); - - const add = (index, name) => { - const columns = spread(table, get[index], limit); - const n = columns.length; - for (let i = 0; i < n; ++i) { - cols.add(as[i] || `${name}_${i + 1}`, columns[i]); - } - }; - - table.columnNames().forEach(name => { - if (map.has(name)) { - if (!drop) cols.add(name, table.column(name)); - add(map.get(name), name); - map.delete(name); - } else { - cols.add(name, table.column(name)); - } - }); - - map.forEach(add); - - return table.create(cols); -} - -function spread(table, get, limit) { - const nrows = table.totalRows(); - const columns = []; - - table.scan((row, data) => { - const values = toArray(get(row, data)); - const n = Math.min(values.length, limit); - while (columns.length < n) { - columns.push(Array(nrows).fill(NULL)); - } - for (let i = 0; i < n; ++i) { - columns[i][row] = values[i]; - } - }); - - return columns; -} \ No newline at end of file diff --git a/src/engine/unroll.js b/src/engine/unroll.js deleted file mode 100644 index c8c40845..00000000 --- a/src/engine/unroll.js +++ /dev/null @@ -1,117 +0,0 @@ -import { aggregateGet } from './reduce/util'; -import columnSet from '../table/column-set'; -import toArray from '../util/to-array'; - -export default function(table, { names = [], exprs = [], ops = [] }, options = {}) { - if (!names.length) return table; - - const limit = options.limit > 0 ? +options.limit : Infinity; - const index = options.index - ? options.index === true ? 
'index' : options.index + '' - : null; - const drop = new Set(options.drop); - const get = aggregateGet(table, ops, exprs); - - // initialize output columns - const cols = columnSet(); - const nset = new Set(names); - const priors = []; - const copies = []; - const unroll = []; - - // original and copied columns - table.columnNames().forEach(name => { - if (!drop.has(name)) { - const col = cols.add(name, []); - if (!nset.has(name)) { - priors.push(table.column(name)); - copies.push(col); - } - } - }); - - // unrolled output columns - names.forEach(name => { - if (!drop.has(name)) { - if (!cols.has(name)) cols.add(name, []); - unroll.push(cols.data[name]); - } - }); - - // index column, if requested - const icol = index ? cols.add(index, []) : null; - - let start = 0; - const m = priors.length; - const n = unroll.length; - - const copy = (row, maxlen) => { - for (let i = 0; i < m; ++i) { - copies[i].length = start + maxlen; - copies[i].fill(priors[i].get(row), start, start + maxlen); - } - }; - - const indices = icol - ? 
(row, maxlen) => { - for (let i = 0; i < maxlen; ++i) { - icol[row + i] = i; - } - } - : () => {}; - - if (n === 1) { - // optimize common case of one array-valued column - const fn = get[0]; - const col = unroll[0]; - - table.scan((row, data) => { - // extract array data - const array = toArray(fn(row, data)); - const maxlen = Math.min(array.length, limit); - - // copy original table data - copy(row, maxlen); - - // copy unrolled array data - for (let j = 0; j < maxlen; ++j) { - col[start + j] = array[j]; - } - - // fill in array indices - indices(start, maxlen); - - start += maxlen; - }); - } else { - table.scan((row, data) => { - let maxlen = 0; - - // extract parallel array data - const arrays = get.map(fn => { - const value = toArray(fn(row, data)); - maxlen = Math.min(Math.max(maxlen, value.length), limit); - return value; - }); - - // copy original table data - copy(row, maxlen); - - // copy unrolled array data - for (let i = 0; i < n; ++i) { - const col = unroll[i]; - const arr = arrays[i]; - for (let j = 0; j < maxlen; ++j) { - col[start + j] = arr[j]; - } - } - - // fill in array indices - indices(start, maxlen); - - start += maxlen; - }); - } - - return table.create(cols.new()); -} \ No newline at end of file diff --git a/src/expression/ast/clean.js b/src/expression/ast/clean.js index 0b41c31c..f28f1065 100644 --- a/src/expression/ast/clean.js +++ b/src/expression/ast/clean.js @@ -1,4 +1,4 @@ -import walk from './walk'; +import walk from './walk.js'; function strip(node) { delete node.start; @@ -21,4 +21,4 @@ export default function(ast) { Default: strip }); return ast; -} \ No newline at end of file +} diff --git a/src/expression/ast/constants.js b/src/expression/ast/constants.js index b2c8eadf..35105b7c 100644 --- a/src/expression/ast/constants.js +++ b/src/expression/ast/constants.js @@ -13,4 +13,4 @@ export const Constant = 'Constant'; export const Dictionary = 'Dictionary'; export const Function = 'Function'; export const Parameter = 'Parameter'; 
-export const Op = 'Op'; \ No newline at end of file +export const Op = 'Op'; diff --git a/src/expression/ast/util.js b/src/expression/ast/util.js index fc6b4f72..a94aad6f 100644 --- a/src/expression/ast/util.js +++ b/src/expression/ast/util.js @@ -1,4 +1,4 @@ -import { ArrowFunctionExpression, FunctionExpression } from './constants'; +import { ArrowFunctionExpression, FunctionExpression } from './constants.js'; export function is(type, node) { return node && node.type === type; @@ -7,4 +7,4 @@ export function is(type, node) { export function isFunctionExpression(node) { return is(FunctionExpression, node) || is(ArrowFunctionExpression, node); -} \ No newline at end of file +} diff --git a/src/expression/ast/walk.js b/src/expression/ast/walk.js index 198768ee..8a3428d2 100644 --- a/src/expression/ast/walk.js +++ b/src/expression/ast/walk.js @@ -113,4 +113,4 @@ const walkers = { Program: (node, ctx, visitors) => { walk(node.body[0], ctx, visitors, node); } -}; \ No newline at end of file +}; diff --git a/src/expression/codegen.js b/src/expression/codegen.js index 9121ebbc..0c16ee45 100644 --- a/src/expression/codegen.js +++ b/src/expression/codegen.js @@ -1,5 +1,5 @@ -import error from '../util/error'; -import toString from '../util/to-string'; +import error from '../util/error.js'; +import toString from '../util/to-string.js'; const visit = (node, opt) => { const f = visitors[node.type]; @@ -33,9 +33,14 @@ const ref = (node, opt, method) => { return `data${table}${name(node)}.${method}(${opt.index}${table})`; }; +const get = (node, opt) => { + const table = node.table || ''; + return `data${table}${name(node)}[${opt.index}${table}]`; +}; + const visitors = { Constant: node => node.raw, - Column: (node, opt) => ref(node, opt, 'get'), + Column: (node, opt) => node.array ? 
get(node, opt) : ref(node, opt, 'at'), Dictionary: (node, opt) => ref(node, opt, 'key'), Function: node => `fn.${node.name}`, Parameter: node => `$${name(node)}`, @@ -135,4 +140,4 @@ const visitors = { export default function(node, opt = { index: 'row' }) { return visit(node, opt); -} \ No newline at end of file +} diff --git a/src/expression/compare.js b/src/expression/compare.js index 30c349c9..399bd6a7 100644 --- a/src/expression/compare.js +++ b/src/expression/compare.js @@ -1,6 +1,6 @@ -import codegen from './codegen'; -import parse from './parse'; -import { aggregate } from '../engine/reduce/util'; +import codegen from './codegen.js'; +import parse from './parse.js'; +import { aggregate } from '../verbs/reduce/util.js'; // generate code to compare a single field const _compare = (u, v, lt, gt) => @@ -58,4 +58,4 @@ export default function(table, fields) { // instantiate and return comparator function return Function('op', 'keys', 'fn', 'data', code)(op, keys, fn, table.data()); -} \ No newline at end of file +} diff --git a/src/expression/compile.js b/src/expression/compile.js index 2bb767fb..4c36a5c5 100644 --- a/src/expression/compile.js +++ b/src/expression/compile.js @@ -1,4 +1,4 @@ -import { functions as fn } from '../op'; +import { functions as fn } from '../op/index.js'; function compile(code, fn, params) { code = `"use strict"; return ${code};`; @@ -11,4 +11,4 @@ export default { expr2: (code, params) => compile(`(row0,data0,row,data)=>${code}`, fn, params), join: (code, params) => compile(`(row1,data1,row2,data2)=>${code}`, fn, params), param: (code, params) => compile(code, fn, params) -}; \ No newline at end of file +}; diff --git a/src/expression/constants.js b/src/expression/constants.js index 0aff57cb..d56dda9c 100644 --- a/src/expression/constants.js +++ b/src/expression/constants.js @@ -10,4 +10,4 @@ export default { PI: 'Math.PI', SQRT1_2: 'Math.SQRT1_2', SQRT2: 'Math.SQRT2' -}; \ No newline at end of file +}; diff --git 
a/src/expression/parse-escape.js b/src/expression/parse-escape.js index e483b4ec..22233c93 100644 --- a/src/expression/parse-escape.js +++ b/src/expression/parse-escape.js @@ -1,7 +1,7 @@ -import compile from './compile'; -import { rowObjectCode } from './row-object'; -import error from '../util/error'; -import toFunction from '../util/to-function'; +import compile from './compile.js'; +import { rowObjectCode } from './row-object.js'; +import error from '../util/error.js'; +import toFunction from '../util/to-function.js'; const ERROR_ESC_AGGRONLY = 'Escaped functions are not valid as rollup or pivot values.'; @@ -9,9 +9,7 @@ export default function(ctx, spec, params) { if (ctx.aggronly) error(ERROR_ESC_AGGRONLY); // generate escaped function invocation code - const code = '(row,data)=>fn(' - + rowObjectCode(ctx.table.columnNames()) - + ',$)'; + const code = `(row,data)=>fn(${rowObjectCode(ctx.table)},$)`; return { escape: compile.escape(code, toFunction(spec.expr), params) }; -} \ No newline at end of file +} diff --git a/src/expression/parse-expression.js b/src/expression/parse-expression.js index 1805df38..71d196f3 100644 --- a/src/expression/parse-expression.js +++ b/src/expression/parse-expression.js @@ -10,22 +10,22 @@ import { Op, Parameter, Property -} from './ast/constants'; -import { is, isFunctionExpression } from './ast/util'; -import walk from './ast/walk'; -import constants from './constants'; -import rewrite from './rewrite'; -import { ROW_OBJECT, rowObjectExpression } from './row-object'; +} from './ast/constants.js'; +import { is, isFunctionExpression } from './ast/util.js'; +import walk from './ast/walk.js'; +import constants from './constants.js'; +import rewrite from './rewrite.js'; +import { ROW_OBJECT, rowObjectExpression } from './row-object.js'; import { getAggregate, getWindow, hasAggregate, hasFunction, hasWindow -} from '../op'; +} from '../op/index.js'; -import error from '../util/error'; -import has from '../util/has'; -import isArray 
from '../util/is-array'; -import isNumber from '../util/is-number'; -import toString from '../util/to-string'; +import error from '../util/error.js'; +import has from '../util/has.js'; +import isArray from '../util/is-array.js'; +import isNumber from '../util/is-number.js'; +import toString from '../util/to-string.js'; const PARSER_OPT = { ecmaVersion: 11 }; const DEFAULT_PARAM_ID = '$'; @@ -93,9 +93,10 @@ function parseAST(expr) { const code = expr.field ? fieldRef(expr) : isArray(expr) ? toString(expr) : expr; + // @ts-ignore return parse(`expr=(${code})`, PARSER_OPT).body[0].expression.right; - } catch (err) { - error(`Expression parse error: ${expr+''}`, err); + } catch (err) { // eslint-disable-line no-unused-vars + error(`Expression parse error: ${expr+''}`); } } @@ -378,7 +379,7 @@ function updateFunctionNode(node, name, ctx) { if (name === ROW_OBJECT) { const t = ctx.table; if (!t) ctx.error(node, ERROR_ROW_OBJECT); - rowObjectExpression(node, + rowObjectExpression(node, t, node.arguments.length ? 
node.arguments.map(node => { const col = ctx.param(node); @@ -403,4 +404,4 @@ function handleDeclaration(node, ctx) { } else { ctx.error(node.id, ERROR_DECLARATION); } -} \ No newline at end of file +} diff --git a/src/expression/parse.js b/src/expression/parse.js index fecf5afa..9fd73462 100644 --- a/src/expression/parse.js +++ b/src/expression/parse.js @@ -1,14 +1,14 @@ -import { Column, Literal, Op } from './ast/constants'; -import clean from './ast/clean'; -import { is } from './ast/util'; -import codegen from './codegen'; -import compile from './compile'; -import entries from '../util/entries'; -import error from '../util/error'; -import isFunction from '../util/is-function'; -import isObject from '../util/is-object'; -import parseEscape from './parse-escape'; -import parseExpression from './parse-expression'; +import { Column, Literal, Op } from './ast/constants.js'; +import clean from './ast/clean.js'; +import { is } from './ast/util.js'; +import codegen from './codegen.js'; +import compile from './compile.js'; +import entries from '../util/entries.js'; +import error from '../util/error.js'; +import isFunction from '../util/is-function.js'; +import isObject from '../util/is-object.js'; +import parseEscape from './parse-escape.js'; +import parseExpression from './parse-expression.js'; const ANNOTATE = { [Column]: 1, [Op]: 1 }; @@ -115,4 +115,4 @@ function getParams(opt) { function getTableParams(table) { return table && isFunction(table.params) ? 
table.params() : {}; -} \ No newline at end of file +} diff --git a/src/expression/rewrite.js b/src/expression/rewrite.js index f9046cbb..87d63c33 100644 --- a/src/expression/rewrite.js +++ b/src/expression/rewrite.js @@ -1,5 +1,6 @@ -import { Column, Dictionary, Literal } from './ast/constants'; -import isFunction from '../util/is-function'; +import { Column, Dictionary, Literal } from './ast/constants.js'; +import isArrayType from '../util/is-array-type.js'; +import isFunction from '../util/is-function.js'; const dictOps = { '==': 1, @@ -13,15 +14,20 @@ const dictOps = { * Additionally optimizes dictionary column operations. * @param {object} ref AST node to rewrite to a column reference. * @param {string} name The name of the column. - * @param {number} index The table index of the column. - * @param {object} col The actual table column instance. - * @param {object} op Parent AST node operating on the column reference. + * @param {number} [index] The table index of the column. + * @param {object} [col] The actual table column instance. + * @param {object} [op] Parent AST node operating on the column reference. 
*/ -export default function(ref, name, index = 0, col, op) { +export default function(ref, name, index = 0, col = undefined, op = undefined) { ref.type = Column; ref.name = name; ref.table = index; + // annotate arrays as such for optimized access + if (isArrayType(col)) { + ref.array = true; + } + // proceed only if has parent op and is a dictionary column if (op && col && isFunction(col.keyFor)) { // get other arg if op is an optimizeable operation @@ -56,4 +62,4 @@ function rewriteDictionary(op, ref, lit, key) { } return true; -} \ No newline at end of file +} diff --git a/src/expression/row-object.js b/src/expression/row-object.js index 6071328e..b12d0ff7 100644 --- a/src/expression/row-object.js +++ b/src/expression/row-object.js @@ -1,14 +1,18 @@ -import { Literal, ObjectExpression, Property } from './ast/constants'; -import codegen from './codegen'; -import compile from './compile'; -import rewrite from './rewrite'; -import entries from '../util/entries'; -import isArray from '../util/is-array'; -import toString from '../util/to-string'; +import { Literal, ObjectExpression, Property } from './ast/constants.js'; +import codegen from './codegen.js'; +import compile from './compile.js'; +import rewrite from './rewrite.js'; +import entries from '../util/entries.js'; +import isArray from '../util/is-array.js'; +import toString from '../util/to-string.js'; export const ROW_OBJECT = 'row_object'; -export function rowObjectExpression(node, props) { +export function rowObjectExpression( + node, + table, + props = table.columnNames()) +{ node.type = ObjectExpression; const p = node.properties = []; @@ -17,17 +21,17 @@ export function rowObjectExpression(node, props) { p.push({ type: Property, key: { type: Literal, raw: toString(key) }, - value: rewrite({ computed: true }, name) + value: rewrite({ computed: true }, name, 0, table.column(name)) }); } return node; } -export function rowObjectCode(props) { - return codegen(rowObjectExpression({}, props)); +export function 
rowObjectCode(table, props) { + return codegen(rowObjectExpression({}, table, props)); } -export function rowObjectBuilder(props) { - return compile.expr(rowObjectCode(props)); -} \ No newline at end of file +export function rowObjectBuilder(table, props) { + return compile.expr(rowObjectCode(table, props)); +} diff --git a/src/format/from-arrow.js b/src/format/from-arrow.js deleted file mode 100644 index 0a35d53b..00000000 --- a/src/format/from-arrow.js +++ /dev/null @@ -1,42 +0,0 @@ -import { fromIPC } from '../arrow/arrow-table'; -import arrowColumn from '../arrow/arrow-column'; -import resolve, { all } from '../helpers/selection'; -import columnSet from '../table/column-set'; -import ColumnTable from '../table/column-table'; - -/** - * Options for Apache Arrow import. - * @typedef {object} ArrowOptions - * @property {import('../table/transformable').Select} columns - * An ordered set of columns to import. The input may consist of column name - * strings, column integer indices, objects with current column names as keys - * and new column names as values (for renaming), or selection helper - * functions such as {@link all}, {@link not}, or {@link range}. - */ - -/** - * Create a new table backed by an Apache Arrow table instance. - * @param {object} arrow An Apache Arrow data table or byte buffer. - * @param {ArrowOptions} options Options for Arrow import. - * @return {ColumnTable} A new table containing the imported values. - */ -export default function(arrow, options = {}) { - if (arrow && !arrow.batches) { - arrow = fromIPC()(arrow); - } - - // resolve column selection - const fields = arrow.schema.fields.map(f => f.name); - const sel = resolve({ - columnNames: test => test ? 
fields.filter(test) : fields.slice(), - columnIndex: name => fields.indexOf(name) - }, options.columns || all()); - - // build Arquero columns for backing Arrow columns - const cols = columnSet(); - sel.forEach((name, key) => { - cols.add(name, arrowColumn(arrow.getChild(key))); - }); - - return new ColumnTable(cols.data, cols.names); -} \ No newline at end of file diff --git a/src/format/from-csv.js b/src/format/from-csv.js index 490eae96..044fd1eb 100644 --- a/src/format/from-csv.js +++ b/src/format/from-csv.js @@ -1,7 +1,5 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import fromTextRows from './from-text-rows'; -import parseDelimited from './parse/parse-delimited'; +import fromTextRows from './from-text-rows.js'; +import parseDelimited from './parse/parse-delimited.js'; /** * Options for CSV parsing. @@ -34,8 +32,9 @@ import parseDelimited from './parse/parse-delimited'; * behavior, set the autoType option to false. To perform custom parsing * of input column values, use the parse option. * @param {string} text A string in a delimited-value format. - * @param {CSVParseOptions} options The formatting options. - * @return {ColumnTable} A new table containing the parsed values. + * @param {CSVParseOptions} [options] The formatting options. + * @return {import('../table/ColumnTable.js').ColumnTable} A new table + * containing the parsed values. */ export default function(text, options = {}) { const next = parseDelimited(text, options); @@ -44,4 +43,4 @@ export default function(text, options = {}) { options.header !== false ? 
next() : options.names, options ); -} \ No newline at end of file +} diff --git a/src/format/from-fixed.js b/src/format/from-fixed.js index 958c8029..ccfecacb 100644 --- a/src/format/from-fixed.js +++ b/src/format/from-fixed.js @@ -1,8 +1,6 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import fromTextRows from './from-text-rows'; -import parseLines from './parse/parse-lines'; -import error from '../util/error'; +import fromTextRows from './from-text-rows.js'; +import parseLines from './parse/parse-lines.js'; +import error from '../util/error.js'; /** * Options for fixed width file parsing. @@ -34,7 +32,8 @@ import error from '../util/error'; * parsing of input column values, use the parse option. * @param {string} text A string in a fixed-width file format. * @param {FixedParseOptions} options The formatting options. - * @return {ColumnTable} A new table containing the parsed values. + * @return {import('../table/ColumnTable.js').ColumnTable} A new table + * containing the parsed values. 
*/ export default function(text, options = {}) { const read = parseLines(text, options); @@ -51,10 +50,10 @@ export default function(text, options = {}) { ); } -function positions({ positions, widths }) { +function positions({ positions = undefined, widths = undefined }) { if (!positions && !widths) { error('Fixed width files require a "positions" or "widths" option'); } let i = 0; return positions || widths.map(w => [i, i += w]); -} \ No newline at end of file +} diff --git a/src/format/from-json.js b/src/format/from-json.js index a3441ba3..75279588 100644 --- a/src/format/from-json.js +++ b/src/format/from-json.js @@ -1,10 +1,10 @@ -import ColumnTable from '../table/column-table'; -import defaultTrue from '../util/default-true'; -import isArrayType from '../util/is-array-type'; -import isDigitString from '../util/is-digit-string'; -import isISODateString from '../util/is-iso-date-string'; -import isObject from '../util/is-object'; -import isString from '../util/is-string'; +import { ColumnTable } from '../table/ColumnTable.js'; +import defaultTrue from '../util/default-true.js'; +import isArrayType from '../util/is-array-type.js'; +import isDigitString from '../util/is-digit-string.js'; +import isISODateString from '../util/is-iso-date-string.js'; +import isObject from '../util/is-object.js'; +import isString from '../util/is-string.js'; /** * Options for JSON parsing. @@ -27,7 +27,7 @@ import isString from '../util/is-string'; * The data payload can also be provided as the "data" property of an * enclosing object, with an optional "schema" property containing table * metadata such as a "fields" array of ordered column information. - * @param {string|object} data A string in JSON format, or pre-parsed object. + * @param {string|object} json A string in JSON format, or pre-parsed object. * @param {JSONParseOptions} options The formatting options. * @return {ColumnTable} A new table containing the parsed values. 
*/ @@ -73,4 +73,4 @@ export default function(json, options = {}) { } return new ColumnTable(data, names); -} \ No newline at end of file +} diff --git a/src/format/from-text-rows.js b/src/format/from-text-rows.js index 6ea4bae5..bdcda246 100644 --- a/src/format/from-text-rows.js +++ b/src/format/from-text-rows.js @@ -1,8 +1,8 @@ -import ColumnTable from '../table/column-table'; -import identity from '../util/identity'; -import isFunction from '../util/is-function'; -import repeat from '../util/repeat'; -import valueParser from '../util/parse-values'; +import { ColumnTable } from '../table/ColumnTable.js'; +import identity from '../util/identity.js'; +import isFunction from '../util/is-function.js'; +import repeat from '../util/repeat.js'; +import valueParser from '../util/parse-values.js'; function defaultNames(n, off = 0) { return repeat(n - off, i => `col${i + off + 1}`); @@ -44,6 +44,7 @@ export default function(next, names, options) { } } + /** @type {import('../table/types.js').ColumnData} */ const columns = {}; names.forEach((name, i) => columns[name] = values[i]); return new ColumnTable(columns, names); @@ -58,4 +59,4 @@ function getParsers(names, values, options) { : noParse ? 
identity : valueParser(values[i], options) ); -} \ No newline at end of file +} diff --git a/src/format/infer.js b/src/format/infer.js index 2492e45c..be203e98 100644 --- a/src/format/infer.js +++ b/src/format/infer.js @@ -1,4 +1,4 @@ -import isDate from '../util/is-date'; +import isDate from '../util/is-date.js'; function isExactDateUTC(d) { return d.getUTCHours() === 0 @@ -47,4 +47,4 @@ export default function(scan, options = {}) { digits: Math.min(digits, options.maxdigits || 6) } }; -} \ No newline at end of file +} diff --git a/src/format/load-file.js b/src/format/load-file.js index 8341c211..db5975a0 100644 --- a/src/format/load-file.js +++ b/src/format/load-file.js @@ -1,14 +1,15 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import fromArrow from './from-arrow'; -import fromCSV from './from-csv'; -import fromFixed from './from-fixed'; -import fromJSON from './from-json'; -import { from } from '../table'; -import isArray from '../util/is-array'; - import fetch from 'node-fetch'; import { readFile } from 'fs'; +import fromCSV from './from-csv.js'; +import fromFixed from './from-fixed.js'; +import fromJSON from './from-json.js'; +import fromArrow from '../arrow/from-arrow.js'; +import { from } from '../table/index.js'; +import isArray from '../util/is-array.js'; + +/** + * @typedef {import('../table/ColumnTable.js').ColumnTable} ColumnTable + */ /** * Options for file loading. @@ -28,7 +29,7 @@ import { readFile } from 'fs'; * otherwise CSV format is assumed. The options to this method are * passed as the second argument to the format parser. * @param {string} path The URL or file path to load. - * @param {LoadOptions & object} options The loading and formatting options. + * @param {LoadOptions & object} [options] The loading and formatting options. * @return {Promise} A Promise for an Arquero table. 
* @example aq.load('data/table.csv') * @example aq.load('data/table.json', { using: aq.fromJSON }) @@ -66,7 +67,8 @@ function loadFile(file, options, parse) { /** * Load an Arrow file from a URL and return a Promise for an Arquero table. * @param {string} path The URL or file path to load. - * @param {LoadOptions & import('./from-arrow').ArrowOptions} options Arrow format options. + * @param {LoadOptions & import('../arrow/types.js').ArrowOptions} [options] + * Arrow format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadArrow('data/table.arrow') */ @@ -77,7 +79,8 @@ export function loadArrow(path, options) { /** * Load a CSV file from a URL and return a Promise for an Arquero table. * @param {string} path The URL or file path to load. - * @param {LoadOptions & import('./from-csv').CSVParseOptions} options CSV format options. + * @param {LoadOptions & import('./from-csv.js').CSVParseOptions} [options] + * CSV format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadCSV('data/table.csv') * @example aq.loadTSV('data/table.tsv', { delimiter: '\t' }) @@ -89,7 +92,8 @@ export function loadCSV(path, options) { /** * Load a fixed width file from a URL and return a Promise for an Arquero table. * @param {string} path The URL or file path to load. - * @param {LoadOptions & import('./from-fixed').FixedParseOptions} options Fixed width format options. + * @param {LoadOptions & import('./from-fixed.js').FixedParseOptions} [options] + * Fixed width format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadFixedWidth('data/table.txt', { names: ['name', 'city', state'], widths: [10, 20, 2] }) */ @@ -103,7 +107,8 @@ export function loadCSV(path, options) { * and the aq.from method is used to construct the table. Otherwise, a * column object format is assumed and aq.fromJSON is applied. * @param {string} path The URL or file path to load. 
- * @param {LoadOptions & import('./from-json').JSONParseOptions} options JSON format options. + * @param {LoadOptions & import('./from-json.js').JSONParseOptions} [options] + * JSON format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadJSON('data/table.json') */ @@ -113,4 +118,4 @@ export function loadJSON(path, options) { function parseJSON(data, options) { return isArray(data) ? from(data) : fromJSON(data, options); -} \ No newline at end of file +} diff --git a/src/format/load-url.js b/src/format/load-url.js index e2054e5b..aded0fd4 100644 --- a/src/format/load-url.js +++ b/src/format/load-url.js @@ -1,11 +1,13 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars +import fromArrow from '../arrow/from-arrow.js'; +import fromCSV from './from-csv.js'; +import fromFixed from './from-fixed.js'; +import fromJSON from './from-json.js'; +import { from } from '../table/index.js'; +import isArray from '../util/is-array.js'; -import fromArrow from './from-arrow'; -import fromCSV from './from-csv'; -import fromFixed from './from-fixed'; -import fromJSON from './from-json'; -import { from } from '../table'; -import isArray from '../util/is-array'; +/** + * @typedef {import('../table/ColumnTable.js').ColumnTable} ColumnTable + */ /** * Options for file loading. @@ -41,7 +43,8 @@ export function load(url, options = {}) { /** * Load an Arrow file from a URL and return a Promise for an Arquero table. * @param {string} url The URL to load. - * @param {LoadOptions & import('./from-arrow').ArrowOptions} options Arrow format options. + * @param {LoadOptions & import('../arrow/types.js').ArrowOptions} [options] + * Arrow format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadArrow('data/table.arrow') */ @@ -52,7 +55,8 @@ export function loadArrow(url, options) { /** * Load a CSV file from a URL and return a Promise for an Arquero table. * @param {string} url The URL to load. 
- * @param {LoadOptions & import('./from-csv').CSVParseOptions} options CSV format options. + * @param {LoadOptions & import('./from-csv.js').CSVParseOptions} [options] + * CSV format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadCSV('data/table.csv') * @example aq.loadTSV('data/table.tsv', { delimiter: '\t' }) @@ -64,7 +68,8 @@ export function loadCSV(url, options) { /** * Load a fixed width file from a URL and return a Promise for an Arquero table. * @param {string} url The URL to load. - * @param {LoadOptions & import('./from-fixed').FixedParseOptions} options Fixed width format options. + * @param {LoadOptions & import('./from-fixed.js').FixedParseOptions} [options] + * Fixed width format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadFixedWidth('data/table.txt', { names: ['name', 'city', state'], widths: [10, 20, 2] }) */ @@ -78,7 +83,8 @@ export function loadCSV(url, options) { * and the aq.from method is used to construct the table. Otherwise, a * column object format is assumed and aq.fromJSON is applied. * @param {string} url The URL to load. - * @param {LoadOptions & import('./from-json').JSONParseOptions} options JSON format options. + * @param {LoadOptions & import('./from-json.js').JSONParseOptions} [options] + * JSON format options. * @return {Promise} A Promise for an Arquero table. * @example aq.loadJSON('data/table.json') */ @@ -88,4 +94,4 @@ export function loadJSON(url, options) { function parseJSON(data, options) { return isArray(data) ? 
from(data) : fromJSON(data, options); -} \ No newline at end of file +} diff --git a/src/format/parse/constants.js b/src/format/parse/constants.js index 303d7aaa..9c5bfc4d 100644 --- a/src/format/parse/constants.js +++ b/src/format/parse/constants.js @@ -2,4 +2,4 @@ export const EOL = {}; export const EOF = {}; export const QUOTE = 34; export const NEWLINE = 10; -export const RETURN = 13; \ No newline at end of file +export const RETURN = 13; diff --git a/src/format/parse/parse-delimited.js b/src/format/parse/parse-delimited.js index 340b59e2..ae622841 100644 --- a/src/format/parse/parse-delimited.js +++ b/src/format/parse/parse-delimited.js @@ -1,6 +1,6 @@ -import { EOF, EOL, NEWLINE, QUOTE, RETURN } from './constants'; -import filter from './text-filter'; -import error from '../../util/error'; +import { EOF, EOL, NEWLINE, QUOTE, RETURN } from './constants.js'; +import filter from './text-filter.js'; +import error from '../../util/error.js'; // Adapted from d3-dsv: https://github.com/d3/d3-dsv/blob/master/src/dsv.js // Copyright 2013-2016 Mike Bostock @@ -26,7 +26,11 @@ import error from '../../util/error'; // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
-export default function(text, { delimiter = ',', skip, comment }) { +export default function(text, { + delimiter = ',', + skip = 0, + comment = undefined +}) { if (delimiter.length !== 1) { error(`Text "delimiter" should be a single character, found "${delimiter}"`); } @@ -81,4 +85,4 @@ export default function(text, { delimiter = ',', skip, comment }) { next, skip, comment && (x => (x && x[0] || '').startsWith(comment)) ); -} \ No newline at end of file +} diff --git a/src/format/parse/parse-lines.js b/src/format/parse/parse-lines.js index 7764c608..4a1d2eca 100644 --- a/src/format/parse/parse-lines.js +++ b/src/format/parse/parse-lines.js @@ -1,7 +1,7 @@ -import { NEWLINE, RETURN } from './constants'; -import filter from './text-filter'; +import { NEWLINE, RETURN } from './constants.js'; +import filter from './text-filter.js'; -export default function(text, { skip, comment }) { +export default function(text, { skip = 0, comment = undefined }) { let N = text.length; let I = 0; // current character index @@ -31,4 +31,4 @@ export default function(text, { skip, comment }) { read, skip, comment && (x => (x || '').startsWith(comment)) ); -} \ No newline at end of file +} diff --git a/src/format/to-arrow.js b/src/format/to-arrow.js deleted file mode 100644 index 875a3827..00000000 --- a/src/format/to-arrow.js +++ /dev/null @@ -1,13 +0,0 @@ -import toArrow from '../arrow/encode'; -import { tableToIPC } from 'apache-arrow'; - -export default toArrow; - -export function toArrowIPC(table, options = {}) { - const { format: format, ...toArrowOptions } = options; - const outputFormat = format ? 
format : 'stream'; - if (!['stream', 'file'].includes(outputFormat)) { - throw Error('Unrecognised output format'); - } - return tableToIPC(toArrow(table, toArrowOptions), format); -} diff --git a/src/format/to-csv.js b/src/format/to-csv.js index 3948be74..db7ff675 100644 --- a/src/format/to-csv.js +++ b/src/format/to-csv.js @@ -1,8 +1,6 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import { columns, scan } from './util'; -import { formatUTCDate } from '../util/format-date'; -import isDate from '../util/is-date'; +import { columns, scan } from './util.js'; +import { formatUTCDate } from '../util/format-date.js'; +import isDate from '../util/is-date.js'; /** * Options for CSV formatting. @@ -10,7 +8,7 @@ import isDate from '../util/is-date'; * @property {string} [delimiter=','] The delimiter between values. * @property {number} [limit=Infinity] The maximum number of rows to print. * @property {number} [offset=0] The row offset indicating how many initial rows to skip. - * @property {import('./util').ColumnSelectOptions} [columns] Ordered list + * @property {import('./util.js').ColumnSelectOptions} [columns] Ordered list * of column names to include. If function-valued, the function should * accept a table as input and return an array of column name strings. * @property {Object. any>} [format] Object of column @@ -23,7 +21,7 @@ import isDate from '../util/is-date'; * Format a table as a comma-separated values (CSV) string. Other * delimiters, such as tabs or pipes ('|'), can be specified using * the options argument. - * @param {ColumnTable} table The table to format. + * @param {import('../table/Table.js').Table} table The table to format. * @param {CSVFormatOptions} options The formatting options. * @return {string} A delimited-value format string. 
*/ @@ -51,4 +49,4 @@ export default function(table, options = {}) { }); return text + vals.join(delim); -} \ No newline at end of file +} diff --git a/src/format/to-html.js b/src/format/to-html.js index 82fd1c8f..54b7b264 100644 --- a/src/format/to-html.js +++ b/src/format/to-html.js @@ -1,9 +1,7 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import formatValue from './value'; -import { columns, formats, scan } from './util'; -import isFunction from '../util/is-function'; -import mapObject from '../util/map-object'; +import formatValue from './value.js'; +import { columns, formats, scan } from './util.js'; +import isFunction from '../util/is-function.js'; +import mapObject from '../util/map-object.js'; /** * Null format function. @@ -30,14 +28,14 @@ import mapObject from '../util/map-object'; * @typedef {object} HTMLFormatOptions * @property {number} [limit=Infinity] The maximum number of rows to print. * @property {number} [offset=0] The row offset indicating how many initial rows to skip. - * @property {import('./util').ColumnSelectOptions} [columns] Ordered list + * @property {import('./util.js').ColumnSelectOptions} [columns] Ordered list * of column names to include. If function-valued, the function should * accept a table as input and return an array of column name strings. - * @property {import('./util').ColumnAlignOptions} [align] Object of column + * @property {import('./util.js').ColumnAlignOptions} [align] Object of column * alignment options. The object keys should be column names. The object * values should be aligment strings, one of 'l' (left), 'c' (center), or * 'r' (right). If specified, these override automatically inferred options. - * @property {import('./util').ColumnFormatOptions} [format] Object of column + * @property {import('./util.js').ColumnFormatOptions} [format] Object of column * format options. The object keys should be column names. 
The object values * should be formatting functions or specification objects. If specified, * these override automatically inferred options. @@ -57,7 +55,7 @@ import mapObject from '../util/map-object'; /** * Format a table as an HTML table string. - * @param {ColumnTable} table The table to format. + * @param {import('../table/Table.js').Table} table The table to format. * @param {HTMLFormatOptions} options The formatting options. * @return {string} An HTML table string. */ @@ -113,4 +111,4 @@ function styles(options) { options.style, value => isFunction(value) ? value : () => value ); -} \ No newline at end of file +} diff --git a/src/format/to-json.js b/src/format/to-json.js index 09283012..e9e51075 100644 --- a/src/format/to-json.js +++ b/src/format/to-json.js @@ -1,9 +1,7 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import { columns } from './util'; -import { formatUTCDate } from '../util/format-date'; -import defaultTrue from '../util/default-true'; -import isDate from '../util/is-date'; +import { columns } from './util.js'; +import { formatUTCDate } from '../util/format-date.js'; +import defaultTrue from '../util/default-true.js'; +import isDate from '../util/is-date.js'; /** * Options for JSON formatting. @@ -14,7 +12,7 @@ import isDate from '../util/is-date'; * @property {boolean} [schema=true] Flag indicating if table schema metadata * should be included in the JSON output. If false, only the data payload * is included. - * @property {import('./util').ColumnSelectOptions} [columns] Ordered list + * @property {import('./util.js').ColumnSelectOptions} [columns] Ordered list * of column names to include. If function-valued, the function should * accept a table as input and return an array of column name strings. * @property {Object. any>} [format] Object of column @@ -29,7 +27,7 @@ const defaultFormatter = value => isDate(value) /** * Format a table as a JavaScript Object Notation (JSON) string. 
- * @param {ColumnTable} table The table to format. + * @param {import('../table/Table.js').Table} table The table to format. * @param {JSONFormatOptions} options The formatting options. * @return {string} A JSON string. */ @@ -52,7 +50,7 @@ export default function(table, options = {}) { const formatter = format[name] || defaultFormatter; let r = -1; table.scan(row => { - const value = column.get(row); + const value = column.at(row); text += (++r ? ',' : '') + JSON.stringify(formatter(value)); }, true, options.limit, options.offset); @@ -60,4 +58,4 @@ export default function(table, options = {}) { }); return text + '}' + (schema ? '}' : ''); -} \ No newline at end of file +} diff --git a/src/format/to-markdown.js b/src/format/to-markdown.js index d7cb5fd8..70d5c251 100644 --- a/src/format/to-markdown.js +++ b/src/format/to-markdown.js @@ -1,21 +1,20 @@ -import ColumnTable from '../table/column-table'; // eslint-disable-line no-unused-vars - -import formatValue from './value'; -import { columns, formats, scan } from './util'; +import formatValue from './value.js'; +import { columns, formats, scan } from './util.js'; /** * Options for Markdown formatting. * @typedef {object} MarkdownFormatOptions * @property {number} [limit=Infinity] The maximum number of rows to print. - * @property {number} [offset=0] The row offset indicating how many initial rows to skip. - * @property {import('./util').ColumnSelectOptions} [columns] Ordered list + * @property {number} [offset=0] The row offset indicating how many initial + * rows to skip. + * @property {import('./util.js').ColumnSelectOptions} [columns] Ordered list * of column names to include. If function-valued, the function should * accept a table as input and return an array of column name strings. - * @property {import('./util').ColumnAlignOptions} [align] Object of column + * @property {import('./util.js').ColumnAlignOptions} [align] Object of column * alignment options. The object keys should be column names. 
The object * values should be aligment strings, one of 'l' (left), 'c' (center), or * 'r' (right). If specified, these override automatically inferred options. - * @property {import('./util').ColumnFormatOptions} [format] Object of column + * @property {import('./util.js').ColumnFormatOptions} [format] Object of column * format options. The object keys should be column names. The object values * should be formatting functions or specification objects. If specified, * these override automatically inferred options. @@ -26,7 +25,7 @@ import { columns, formats, scan } from './util'; /** * Format a table as a GitHub-Flavored Markdown table string. - * @param {ColumnTable} table The table to format. + * @param {import('../table/Table.js').Table} table The table to format. * @param {MarkdownFormatOptions} options The formatting options. * @return {string} A GitHub-Flavored Markdown table string. */ @@ -53,4 +52,4 @@ export default function(table, options = {}) { }); return text + '\n'; -} \ No newline at end of file +} diff --git a/src/format/util.js b/src/format/util.js index 1d855c37..082fa599 100644 --- a/src/format/util.js +++ b/src/format/util.js @@ -1,11 +1,9 @@ -import Table from '../table/table'; // eslint-disable-line no-unused-vars - -import inferFormat from './infer'; -import isFunction from '../util/is-function'; +import inferFormat from './infer.js'; +import isFunction from '../util/is-function.js'; /** * Column selection function. - * @typedef {(table: Table) => string[]} ColumnSelectFunction + * @typedef {(table: import('../table/Table.js').Table) => string[]} ColumnSelectFunction */ /** @@ -17,7 +15,7 @@ import isFunction from '../util/is-function'; * Column format options. The object keys should be column names. * The object values should be formatting functions or objects. * If specified, these override any automatically inferred options. 
- * @typedef {Object.} ColumnFormatOptions */ /** @@ -51,7 +49,7 @@ export function formats(table, names, options) { function values(table, columnName) { const column = table.column(columnName); - return fn => table.scan(row => fn(column.get(row))); + return fn => table.scan(row => fn(column.at(row))); } export function scan(table, names, limit = 100, offset, ctx) { @@ -61,7 +59,7 @@ export function scan(table, names, limit = 100, offset, ctx) { ctx.row(row); for (let i = 0; i < n; ++i) { const name = names[i]; - ctx.cell(data[names[i]].get(row), name, i); + ctx.cell(data[names[i]].at(row), name, i); } }, true, limit, offset); -} \ No newline at end of file +} diff --git a/src/format/value.js b/src/format/value.js index 10adea64..a3a180c9 100644 --- a/src/format/value.js +++ b/src/format/value.js @@ -1,7 +1,7 @@ -import { formatDate, formatUTCDate } from '../util/format-date'; -import isDate from '../util/is-date'; -import isFunction from '../util/is-function'; -import isTypedArray from '../util/is-typed-array'; +import { formatDate, formatUTCDate } from '../util/format-date.js'; +import isDate from '../util/is-date.js'; +import isFunction from '../util/is-function.js'; +import isTypedArray from '../util/is-typed-array.js'; /** * Column format object. @@ -32,6 +32,7 @@ import isTypedArray from '../util/is-typed-array'; */ export default function(v, options = {}) { if (isFunction(options)) { + // @ts-ignore return options(v) + ''; } @@ -39,18 +40,22 @@ export default function(v, options = {}) { if (type === 'object') { if (isDate(v)) { + // @ts-ignore return options.utc ? formatUTCDate(v) : formatDate(v); } else { const s = JSON.stringify( v, + // @ts-ignore (k, v) => isTypedArray(v) ? Array.from(v) : v ); + // @ts-ignore const maxlen = options.maxlen || 30; return s.length > maxlen ? s.slice(0, 28) + '\u2026' + (s[0] === '[' ? 
']' : '}') : s; } } else if (type === 'number') { + // @ts-ignore const digits = options.digits || 0; let a; return v !== 0 && ((a = Math.abs(v)) >= 1e18 || a < Math.pow(10, -digits)) @@ -59,4 +64,4 @@ export default function(v, options = {}) { } else { return v + ''; } -} \ No newline at end of file +} diff --git a/src/helpers/bin.js b/src/helpers/bin.js index 5b62f567..1f476719 100644 --- a/src/helpers/bin.js +++ b/src/helpers/bin.js @@ -31,4 +31,4 @@ export default function(name, options = {}) { const a = args.length ? ', ' + args.map(a => a + '').join(', ') : ''; return `d => op.bin(${field}, ...op.bins(${field}${a}), ${offset || 0})`; -} \ No newline at end of file +} diff --git a/src/helpers/desc.js b/src/helpers/desc.js index 08a57de2..956057b9 100644 --- a/src/helpers/desc.js +++ b/src/helpers/desc.js @@ -1,4 +1,4 @@ -import wrap from './wrap'; +import wrap from './wrap.js'; /** * Annotate a table expression to indicate descending sort order. @@ -9,4 +9,4 @@ import wrap from './wrap'; */ export default function(expr) { return wrap(expr, { desc: true }); -} \ No newline at end of file +} diff --git a/src/helpers/escape.js b/src/helpers/escape.js index 030f45e7..690f8ecc 100644 --- a/src/helpers/escape.js +++ b/src/helpers/escape.js @@ -1,5 +1,5 @@ -import wrap from './wrap'; -import error from '../util/error'; +import wrap from './wrap.js'; +import error from '../util/error.js'; /** * Escape a function or value to prevent it from being parsed and recompiled. @@ -17,4 +17,4 @@ export default function(value) { escape: true, toString() { error('Escaped values can not be serialized.'); } }); -} \ No newline at end of file +} diff --git a/src/helpers/field.js b/src/helpers/field.js index 7a06d40b..101dd6ba 100644 --- a/src/helpers/field.js +++ b/src/helpers/field.js @@ -1,4 +1,4 @@ -import wrap from './wrap'; +import wrap from './wrap.js'; /** * Annotate an expression to indicate it is a string field reference. 
@@ -17,4 +17,4 @@ export default function(expr, name, table = 0) { expr, name ? { expr: name, ...props } : props ); -} \ No newline at end of file +} diff --git a/src/helpers/frac.js b/src/helpers/frac.js index 808c333c..507aa5a5 100644 --- a/src/helpers/frac.js +++ b/src/helpers/frac.js @@ -8,4 +8,4 @@ */ export default function(fraction) { return `() => op.round(${+fraction} * op.count())`; -} \ No newline at end of file +} diff --git a/src/helpers/names.js b/src/helpers/names.js index 1f1eae32..4167d321 100644 --- a/src/helpers/names.js +++ b/src/helpers/names.js @@ -21,4 +21,4 @@ export default function(...names) { } return m; }; -} \ No newline at end of file +} diff --git a/src/helpers/rolling.js b/src/helpers/rolling.js index 627fb3db..213f055e 100644 --- a/src/helpers/rolling.js +++ b/src/helpers/rolling.js @@ -1,4 +1,4 @@ -import wrap from './wrap'; +import wrap from './wrap.js'; /** * Annotate a table expression to compute rolling aggregate or window @@ -28,4 +28,4 @@ export default function(expr, frame, includePeers) { peers: !!includePeers } }); -} \ No newline at end of file +} diff --git a/src/helpers/selection.js b/src/helpers/selection.js index 9732af5a..7032cf25 100644 --- a/src/helpers/selection.js +++ b/src/helpers/selection.js @@ -1,12 +1,12 @@ -import assign from '../util/assign'; -import error from '../util/error'; -import escapeRegExp from '../util/escape-regexp'; -import isArray from '../util/is-array'; -import isFunction from '../util/is-function'; -import isNumber from '../util/is-number'; -import isObject from '../util/is-object'; -import isString from '../util/is-string'; -import toString from '../util/to-string'; +import assign from '../util/assign.js'; +import error from '../util/error.js'; +import escapeRegExp from '../util/escape-regexp.js'; +import isArray from '../util/is-array.js'; +import isFunction from '../util/is-function.js'; +import isNumber from '../util/is-number.js'; +import isObject from '../util/is-object.js'; +import 
isString from '../util/is-string.js'; +import toString from '../util/to-string.js'; export default function resolve(table, sel, map = new Map()) { sel = isNumber(sel) ? table.columnName(sel) : sel; @@ -39,7 +39,7 @@ function toObject(value) { /** * Proxy type for SelectHelper function. - * @typedef {import('../table/transformable').SelectHelper} SelectHelper + * @typedef {import('../table/types.js').SelectHelper} SelectHelper */ /** @@ -98,7 +98,9 @@ export function range(start, end) { export function matches(pattern) { if (isString(pattern)) pattern = RegExp(escapeRegExp(pattern)); return decorate( + // @ts-ignore table => table.columnNames(name => pattern.test(name)), + // @ts-ignore () => ({ matches: [pattern.source, pattern.flags] }) ); } @@ -119,4 +121,4 @@ export function startswith(string) { */ export function endswith(string) { return matches(RegExp(escapeRegExp(string) + '$')); -} \ No newline at end of file +} diff --git a/src/helpers/slice.js b/src/helpers/slice.js index 23fd7807..0285940b 100644 --- a/src/helpers/slice.js +++ b/src/helpers/slice.js @@ -18,4 +18,4 @@ export default function(start = 0, end = Infinity) { function prep(index) { return index < 0 ? `count() + ${index}` : index; -} \ No newline at end of file +} diff --git a/src/helpers/wrap.js b/src/helpers/wrap.js index eb52a5a2..24ff1276 100644 --- a/src/helpers/wrap.js +++ b/src/helpers/wrap.js @@ -1,4 +1,4 @@ -import isFunction from '../util/is-function'; +import isFunction from '../util/is-function.js'; /** * Annotate an expression in an object wrapper. @@ -27,4 +27,4 @@ class Wrapper { ...(isFunction(this.expr) ? 
{ func: true } : {}) }; } -} \ No newline at end of file +} diff --git a/src/index-node.js b/src/index-browser.js similarity index 57% rename from src/index-node.js rename to src/index-browser.js index f9409583..7dcd8f9c 100644 --- a/src/index-node.js +++ b/src/index-browser.js @@ -1,2 +1,2 @@ -export * from './index'; -export { load, loadArrow, loadCSV, loadFixed, loadJSON } from './format/load-file'; \ No newline at end of file +export * from './api.js'; +export { load, loadArrow, loadCSV, loadFixed, loadJSON } from './format/load-url.js'; diff --git a/src/index.js b/src/index.js index f89313b3..275a5a8c 100644 --- a/src/index.js +++ b/src/index.js @@ -1,47 +1,2 @@ -// export internal class definitions -import Table from './table/table'; -import { columnFactory } from './table/column'; -import ColumnTable from './table/column-table'; -import Transformable from './table/transformable'; -import Reducer from './engine/reduce/reducer'; -import parse from './expression/parse'; -import walk_ast from './expression/ast/walk'; -import Query from './query/query'; -import { Verb, Verbs } from './query/verb'; - -export const internal = { - Table, - ColumnTable, - Transformable, - Query, - Reducer, - Verb, - Verbs, - columnFactory, - parse, - walk_ast -}; - -// export public API -import pkg from '../package.json'; -export const version = pkg.version; -export { seed } from './util/random'; -export { default as fromArrow } from './format/from-arrow'; -export { default as fromCSV } from './format/from-csv'; -export { default as fromFixed } from './format/from-fixed'; -export { default as fromJSON } from './format/from-json'; -export { load, loadArrow, loadCSV, loadFixed, loadJSON } from './format/load-url'; -export { default as toArrow } from './arrow/encode'; -export { default as bin } from './helpers/bin'; -export { default as escape } from './helpers/escape'; -export { default as desc } from './helpers/desc'; -export { default as field } from './helpers/field'; -export { 
default as frac } from './helpers/frac'; -export { default as names } from './helpers/names'; -export { default as rolling } from './helpers/rolling'; -export { all, endswith, matches, not, range, startswith } from './helpers/selection'; -export { default as agg } from './verbs/helpers/agg'; -export { default as op } from './op/op-api'; -export { query, queryFrom } from './query/query'; -export * from './register'; -export * from './table'; +export * from './api.js'; +export { load, loadArrow, loadCSV, loadFixed, loadJSON } from './format/load-file.js'; diff --git a/src/op/aggregate-functions.js b/src/op/aggregate-functions.js index e9341486..e122fe46 100644 --- a/src/op/aggregate-functions.js +++ b/src/op/aggregate-functions.js @@ -1,9 +1,9 @@ -import bins from '../util/bins'; -import distinctMap from '../util/distinct-map'; -import isBigInt from '../util/is-bigint'; -import noop from '../util/no-op'; -import NULL from '../util/null'; -import product from '../util/product'; +import bins from '../util/bins.js'; +import distinctMap from '../util/distinct-map.js'; +import isBigInt from '../util/is-bigint.js'; +import noop from '../util/no-op.js'; +import NULL from '../util/null.js'; +import product from '../util/product.js'; /** * Initialize an aggregate operator. @@ -54,8 +54,8 @@ function initProduct(s, value) { * An operator instance for an aggregate function. * @typedef {object} AggregateOperator * @property {AggregateInit} init Initialize the operator. - * @property {AggregateAdd} add Add a value to the operator state. - * @property {AggregateRem} rem Remove a value from the operator state. + * @property {AggregateAdd} [add] Add a value to the operator state. + * @property {AggregateRem} [rem] Remove a value from the operator state. * @property {AggregateValue} value Retrieve an output value. 
*/ @@ -390,4 +390,4 @@ export default { param: [1, 4], req: ['min', 'max'] } -}; \ No newline at end of file +}; diff --git a/src/op/functions/array.js b/src/op/functions/array.js index 3b5335ac..14e3167e 100644 --- a/src/op/functions/array.js +++ b/src/op/functions/array.js @@ -1,25 +1,128 @@ -import NULL from '../../util/null'; -import isArrayType from '../../util/is-array-type'; -import isString from '../../util/is-string'; -import isValid from '../../util/is-valid'; +import NULL from '../../util/null.js'; +import isArrayType from '../../util/is-array-type.js'; +import isString from '../../util/is-string.js'; +import isValid from '../../util/is-valid.js'; const isSeq = (seq) => isArrayType(seq) || isString(seq); export default { - compact: (arr) => isArrayType(arr) ? arr.filter(v => isValid(v)) : arr, - concat: (...values) => [].concat(...values), - includes: (seq, value, index) => isSeq(seq) - ? seq.includes(value, index) - : false, - indexof: (seq, value) => isSeq(seq) ? seq.indexOf(value) : -1, - join: (arr, delim) => isArrayType(arr) ? arr.join(delim) : NULL, - lastindexof: (seq, value) => isSeq(seq) ? seq.lastIndexOf(value) : -1, - length: (seq) => isSeq(seq) ? seq.length : 0, - pluck: (arr, prop) => isArrayType(arr) - ? arr.map(v => isValid(v) ? v[prop] : NULL) - : NULL, - reverse: (seq) => isArrayType(seq) ? seq.slice().reverse() - : isString(seq) ? seq.split('').reverse().join('') - : NULL, - slice: (seq, start, end) => isSeq(seq) ? seq.slice(start, end) : NULL + /** + * Returns a new compacted array with invalid values + * (`null`, `undefined`, `NaN`) removed. + * @template T + * @param {T[]} array The input array. + * @return {T[]} A compacted array. + */ + compact: (array) => isArrayType(array) + ? array.filter(v => isValid(v)) + : array, + + /** + * Merges two or more arrays in sequence, returning a new array. + * @template T + * @param {...(T|T[])} values The arrays to merge. + * @return {T[]} The merged array. 
+ */ + concat: (...values) => [].concat(...values), + + /** + * Determines whether an *array* includes a certain *value* among its + * entries, returning `true` or `false` as appropriate. + * @template T + * @param {T[]} sequence The input array value. + * @param {T} value The value to search for. + * @param {number} [index=0] The integer index to start searching + * from (default `0`). + * @return {boolean} True if the value is included, false otherwise. + */ + includes: (sequence, value, index) => isSeq(sequence) + ? sequence.includes(value, index) + : false, + + /** + * Returns the first index at which a given *value* can be found in the + * *sequence* (array or string), or -1 if it is not present. + * @template T + * @param {T[]|string} sequence The input array or string value. + * @param {T} value The value to search for. + * @return {number} The index of the value, or -1 if not present. + */ + indexof: (sequence, value) => isSeq(sequence) + // @ts-ignore + ? sequence.indexOf(value) + : -1, + + /** + * Creates and returns a new string by concatenating all of the elements + * in an *array* (or an array-like object), separated by commas or a + * specified *delimiter* string. If the *array* has only one item, then + * that item will be returned without using the delimiter. + * @template T + * @param {T[]} array The input array value. + * @param {string} delim The delimiter string (default `','`). + * @return {string} The joined string. + */ + join: (array, delim) => isArrayType(array) ? array.join(delim) : NULL, + + /** + * Returns the last index at which a given *value* can be found in the + * *sequence* (array or string), or -1 if it is not present. + * @template T + * @param {T[]|string} sequence The input array or string value. + * @param {T} value The value to search for. + * @return {number} The last index of the value, or -1 if not present. + */ + lastindexof: (sequence, value) => isSeq(sequence) + // @ts-ignore + ? 
sequence.lastIndexOf(value) + : -1, + + /** + * Returns the length of the input *sequence* (array or string). + * @param {Array|string} sequence The input array or string value. + * @return {number} The length of the sequence. + */ + length: (sequence) => isSeq(sequence) ? sequence.length : 0, + + /** + * Returns a new array in which the given *property* has been extracted + * for each element in the input *array*. + * @param {Array} array The input array value. + * @param {string} property The property name string to extract. Nested + * properties are not supported: the input `"a.b"` indicates a + * property with that exact name, *not* a nested property `"b"` of + * the object `"a"`. + * @return {Array} An array of plucked properties. + */ + pluck: (array, property) => isArrayType(array) + ? array.map(v => isValid(v) ? v[property] : NULL) + : NULL, + + /** + * Returns a new array or string with the element order reversed: the first + * *sequence* element becomes the last, and the last *sequence* element + * becomes the first. The input *sequence* is unchanged. + * @template T + * @param {T[]|string} sequence The input array or string value. + * @return {T[]|string} The reversed sequence. + */ + reverse: (sequence) => isArrayType(sequence) ? sequence.slice().reverse() + : isString(sequence) ? sequence.split('').reverse().join('') + : NULL, + + /** + * Returns a copy of a portion of the input *sequence* (array or string) + * selected from *start* to *end* (*end* not included) where *start* and + * *end* represent the index of items in the sequence. + * @template T + * @param {T[]|string} sequence The input array or string value. + * @param {number} [start=0] The starting integer index to copy from + * (inclusive, default `0`). + * @param {number} [end] The ending integer index to copy from (exclusive, + * default `sequence.length`). + * @return {T[]|string} The sliced sequence. + */ + slice: (sequence, start, end) => isSeq(sequence) + ?
sequence.slice(start, end) + : NULL }; diff --git a/src/op/functions/bin.js b/src/op/functions/bin.js index 1c243d71..7206343c 100644 --- a/src/op/functions/bin.js +++ b/src/op/functions/bin.js @@ -3,11 +3,11 @@ * Useful for creating equal-width histograms. * Values outside the [min, max] range will be mapped to * -Infinity (< min) or +Infinity (> max). - * @param {number} value - The value to bin. - * @param {number} min - The minimum bin boundary. - * @param {number} max - The maximum bin boundary. - * @param {number} step - The step size between bin boundaries. - * @param {number} [offset=0] - Offset in steps by which to adjust + * @param {number} value The value to bin. + * @param {number} min The minimum bin boundary. + * @param {number} max The maximum bin boundary. + * @param {number} step The step size between bin boundaries. + * @param {number} [offset=0] Offset in steps by which to adjust * the bin value. An offset of 1 will return the next boundary. */ export default function(value, min, max, step, offset) { @@ -18,4 +18,4 @@ export default function(value, min, max, step, offset) { value = Math.max(min, Math.min(value, max)), min + step * Math.floor(1e-14 + (value - min) / step + (offset || 0)) ); -} \ No newline at end of file +} diff --git a/src/op/functions/date.js b/src/op/functions/date.js index 51ac4d7b..1b7971b2 100644 --- a/src/op/functions/date.js +++ b/src/op/functions/date.js @@ -1,5 +1,5 @@ -import { formatDate, formatUTCDate } from '../../util/format-date'; -import parseIsoDate from '../../util/parse-iso-date'; +import { formatDate, formatUTCDate } from '../../util/format-date.js'; +import parseIsoDate from '../../util/parse-iso-date.js'; const msMinute = 6e4; const msDay = 864e5; @@ -22,7 +22,7 @@ const t = d => ( * @param {number} [minutes=0] The minute within the hour. * @param {number} [seconds=0] The second within the minute. * @param {number} [milliseconds=0] The milliseconds within the second. 
- * @return {date} The resuting Date value. + * @return {Date} The resulting Date value. */ function datetime(year, month, date, hours, minutes, seconds, milliseconds) { return !arguments.length @@ -48,7 +48,7 @@ function datetime(year, month, date, hours, minutes, seconds, milliseconds) { * @param {number} [minutes=0] The minute within the hour. * @param {number} [seconds=0] The second within the minute. * @param {number} [milliseconds=0] The milliseconds within the second. - * @return {date} The resuting Date value. + * @return {Date} The resulting Date value. */ function utcdatetime(year, month, date, hours, minutes, seconds, milliseconds) { return !arguments.length @@ -64,6 +64,12 @@ function utcdatetime(year, month, date, hours, minutes, seconds, milliseconds) { )); } +/** + * Return the current day of the year in local time as a number + * between 1 and 366. + * @param {Date|number} date A date or timestamp. + * @return {number} The day of the year in local time. + */ function dayofyear(date) { t1.setTime(+date); t1.setHours(0, 0, 0, 0); @@ -71,16 +77,28 @@ function dayofyear(date) { t0.setMonth(0); t0.setDate(1); const tz = (t1.getTimezoneOffset() - t0.getTimezoneOffset()) * msMinute; - return Math.floor(1 + ((t1 - t0) - tz) / msDay); + return Math.floor(1 + ((+t1 - +t0) - tz) / msDay); } +/** + * Return the current day of the year in UTC time as a number + * between 1 and 366. + * @param {Date|number} date A date or timestamp. + * @return {number} The day of the year in UTC time. + */ function utcdayofyear(date) { t1.setTime(+date); t1.setUTCHours(0, 0, 0, 0); const t0 = Date.UTC(t1.getUTCFullYear(), 0, 1); - return Math.floor(1 + (t1 - t0) / msDay); + return Math.floor(1 + (+t1 - t0) / msDay); } +/** + * Return the current week of the year in local time as a number + * between 0 and 53. + * @param {Date|number} date A date or timestamp. + * @return {number} The week of the year in local time.
+ */ function week(date, firstday) { const i = firstday || 0; t1.setTime(+date); @@ -92,9 +110,15 @@ function week(date, firstday) { t0.setDate(1 - (t0.getDay() + 7 - i) % 7); t0.setHours(0, 0, 0, 0); const tz = (t1.getTimezoneOffset() - t0.getTimezoneOffset()) * msMinute; - return Math.floor((1 + (t1 - t0) - tz) / msWeek); + return Math.floor((1 + (+t1 - +t0) - tz) / msWeek); } +/** + * Return the current week of the year in UTC time as a number + * between 0 and 53. + * @param {Date|number} date A date or timestamp. + * @return {number} The week of the year in UTC time. + */ function utcweek(date, firstday) { const i = firstday || 0; t1.setTime(+date); @@ -105,36 +129,263 @@ function utcweek(date, firstday) { t0.setUTCDate(1); t0.setUTCDate(1 - (t0.getUTCDay() + 7 - i) % 7); t0.setUTCHours(0, 0, 0, 0); - return Math.floor((1 + (t1 - t0)) / msWeek); + return Math.floor((1 + (+t1 - +t0)) / msWeek); } export default { - format_date: (date, shorten) => formatDate(t(date), !shorten), - format_utcdate: (date, shorten) => formatUTCDate(t(date), !shorten), - timestamp: (date) => +t(date), - year: (date) => t(date).getFullYear(), - quarter: (date) => Math.floor(t(date).getMonth() / 3), - month: (date) => t(date).getMonth(), - date: (date) => t(date).getDate(), - dayofweek: (date) => t(date).getDay(), - hours: (date) => t(date).getHours(), - minutes: (date) => t(date).getMinutes(), - seconds: (date) => t(date).getSeconds(), - milliseconds: (date) => t(date).getMilliseconds(), - utcyear: (date) => t(date).getUTCFullYear(), - utcquarter: (date) => Math.floor(t(date).getUTCMonth() / 3), - utcmonth: (date) => t(date).getUTCMonth(), - utcdate: (date) => t(date).getUTCDate(), - utcdayofweek: (date) => t(date).getUTCDay(), - utchours: (date) => t(date).getUTCHours(), - utcminutes: (date) => t(date).getUTCMinutes(), - utcseconds: (date) => t(date).getUTCSeconds(), - utcmilliseconds: (date) => t(date).getUTCMilliseconds(), + /** + * Returns an [ISO
8601](https://en.wikipedia.org/wiki/ISO_8601) formatted + * string for the given *date* in local timezone. The resulting string is + * compatible with *parse_date* and JavaScript's built-in *Date.parse*. + * @param {Date | number} date The input Date or timestamp value. + * @param {boolean} [shorten=false] A boolean flag (default `false`) + * indicating if the formatted string should be shortened if possible. + * For example, the local date `2001-01-01` will shorten from + * `"2001-01-01T00:00:00.000"` to `"2001-01-01T00:00"`. + * @return {string} The formatted date string in local time. + */ + format_date: (date, shorten) => formatDate(t(date), !shorten), + + /** + * Returns an [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) formatted + * string for the given *date* in Coordinated Universal Time (UTC). The + * resulting string is compatible with *parse_date* and JavaScript's + * built-in *Date.parse*. + * @param {Date | number} date The input Date or timestamp value. + * @param {boolean} [shorten=false] A boolean flag (default `false`) + * indicating if the formatted string should be shortened if possible. + * For example, the UTC date `2001-01-01` will shorten from + * `"2001-01-01T00:00:00.000Z"` to `"2001-01-01"`. + * @return {string} The formatted date string in UTC time. + */ + format_utcdate: (date, shorten) => formatUTCDate(t(date), !shorten), + + /** + * Returns the number of milliseconds elapsed since midnight, January 1, + * 1970 Coordinated Universal Time (UTC). + * @return {number} The timestamp for now. + */ + now: Date.now, + + /** + * Returns the timestamp for a *date* as the number of milliseconds elapsed + * since January 1, 1970 00:00:00 UTC. + * @param {Date | number} date The input Date value. + * @return {number} The timestamp value. + */ + timestamp: (date) => +t(date), + + /** + * Creates and returns a new Date value. If no arguments are provided, + * the current date and time are used. + * @param {number} [year] The year.
+ * @param {number} [month=0] The (zero-based) month. + * @param {number} [date=1] The date within the month. + * @param {number} [hours=0] The hour within the day. + * @param {number} [minutes=0] The minute within the hour. + * @param {number} [seconds=0] The second within the minute. + * @param {number} [milliseconds=0] The milliseconds within the second. + * @return {Date} The Date value. + */ datetime, - dayofyear, + + /** + * Returns the year of the specified *date* according to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The year value in local time. + */ + year: (date) => t(date).getFullYear(), + + /** + * Returns the zero-based quarter of the specified *date* according to + * local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The quarter value in local time. + */ + quarter: (date) => Math.floor(t(date).getMonth() / 3), + + /** + * Returns the zero-based month of the specified *date* according to local + * time. A value of `0` indicates January, `1` indicates February, and so on. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The month value in local time. + */ + month: (date) => t(date).getMonth(), + + /** + * Returns the week number of the year (0-53) for the specified *date* + * according to local time. By default, Sunday is used as the first day + * of the week. All days in a new year preceding the first Sunday are + * considered to be in week 0. + * @param {Date | number} date The input Date or timestamp value. + * @param {number} firstday The number of first day of the week (default + * `0` for Sunday, `1` for Monday and so on). + * @return {number} The week of the year in local time. + */ week, + + /** + * Returns the date (day of month) of the specified *date* according + * to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The date (day of month) value. 
+ */ + date: (date) => t(date).getDate(), + + /** + * Returns the day of the year (1-366) of the specified *date* according + * to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The day of the year in local time. + */ + dayofyear, + + /** + * Returns the Sunday-based day of the week (0-6) of the specified *date* + * according to local time. A value of `0` indicates Sunday, `1` indicates + * Monday, and so on. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The day of the week value in local time. + */ + dayofweek: (date) => t(date).getDay(), + + /** + * Returns the hour of the day for the specified *date* according + * to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The hour value in local time. + */ + hours: (date) => t(date).getHours(), + + /** + * Returns the minute of the hour for the specified *date* according + * to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The minutes value in local time. + */ + minutes: (date) => t(date).getMinutes(), + + /** + * Returns the seconds of the minute for the specified *date* according + * to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The seconds value in local time. + */ + seconds: (date) => t(date).getSeconds(), + + /** + * Returns the milliseconds of the second for the specified *date* according + * to local time. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The milliseconds value in local time. + */ + milliseconds: (date) => t(date).getMilliseconds(), + + /** + * Creates and returns a new Date value using Coordinated Universal Time + * (UTC). If no arguments are provided, the current date and time are used. + * @param {number} [year] The year. + * @param {number} [month=0] The (zero-based) month. 
+ * @param {number} [date=1] The date within the month. + * @param {number} [hours=0] The hour within the day. + * @param {number} [minutes=0] The minute within the hour. + * @param {number} [seconds=0] The second within the minute. + * @param {number} [milliseconds=0] The milliseconds within the second. + * @return {Date} The Date value. + */ utcdatetime, - utcdayofyear, + + /** + * Returns the year of the specified *date* according to Coordinated + * Universal Time (UTC). + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The year value in UTC time. + */ + utcyear: (date) => t(date).getUTCFullYear(), + + /** + * Returns the zero-based quarter of the specified *date* according to + * Coordinated Universal Time (UTC) + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The quarter value in UTC time. + */ + utcquarter: (date) => Math.floor(t(date).getUTCMonth() / 3), + + /** + * Returns the zero-based month of the specified *date* according to + * Coordinated Universal Time (UTC). A value of `0` indicates January, + * `1` indicates February, and so on. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The month value in UTC time. + */ + utcmonth: (date) => t(date).getUTCMonth(), + + /** + * Returns the week number of the year (0-53) for the specified *date* + * according to Coordinated Universal Time (UTC). By default, Sunday is + * used as the first day of the week. All days in a new year preceding the + * first Sunday are considered to be in week 0. + * @param {Date | number} date The input Date or timestamp value. + * @param {number} firstday The number of first day of the week (default + * `0` for Sunday, `1` for Monday and so on). + * @return {number} The week of the year in UTC time. 
+ */ utcweek, - now: Date.now -}; \ No newline at end of file + + /** + * Returns the date (day of month) of the specified *date* according to + * Coordinated Universal Time (UTC). + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The date (day of month) value in UTC time. + */ + utcdate: (date) => t(date).getUTCDate(), + + /** + * Returns the day of the year (1-366) of the specified *date* according + * to Coordinated Universal Time (UTC). + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The day of the year in UTC time. + */ + utcdayofyear, + + /** + * Returns the Sunday-based day of the week (0-6) of the specified *date* + * according to Coordinated Universal Time (UTC). A value of `0` indicates + * Sunday, `1` indicates Monday, and so on. + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The day of the week in UTC time. + */ + utcdayofweek: (date) => t(date).getUTCDay(), + + /** + * Returns the hour of the day for the specified *date* according to + * Coordinated Universal Time (UTC). + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The hours value in UTC time. + */ + utchours: (date) => t(date).getUTCHours(), + + /** + * Returns the minute of the hour for the specified *date* according to + * Coordinated Universal Time (UTC). + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The minutes value in UTC time. + */ + utcminutes: (date) => t(date).getUTCMinutes(), + + /** + * Returns the seconds of the minute for the specified *date* according to + * Coordinated Universal Time (UTC). + * @param {Date | number} date The input Date or timestamp value. + * @return {number} The seconds value in UTC time. + */ + utcseconds: (date) => t(date).getUTCSeconds(), + + /** + * Returns the milliseconds of the second for the specified *date* according to + * Coordinated Universal Time (UTC). 
+ * @param {Date | number} date The input Date or timestamp value. + * @return {number} The milliseconds value in UTC time. + */ + utcmilliseconds: (date) => t(date).getUTCMilliseconds() +}; diff --git a/src/op/functions/equal.js b/src/op/functions/equal.js index 94422087..7aae8c21 100644 --- a/src/op/functions/equal.js +++ b/src/op/functions/equal.js @@ -1,6 +1,6 @@ -import isDate from '../../util/is-date'; -import isRegExp from '../../util/is-regexp'; -import isObject from '../../util/is-object'; +import isDate from '../../util/is-date.js'; +import isRegExp from '../../util/is-regexp.js'; +import isObject from '../../util/is-object.js'; /** * Compare two values for equality, using join semantics in which null @@ -62,4 +62,4 @@ function arrayEqual(a, b, test = equal) { } return true; -} \ No newline at end of file +} diff --git a/src/op/functions/index.js b/src/op/functions/index.js index e59ad3c2..7eefe153 100644 --- a/src/op/functions/index.js +++ b/src/op/functions/index.js @@ -1,13 +1,13 @@ -import array from './array'; -import bin from './bin'; -import date from './date'; -import equal from './equal'; -import json from './json'; -import math from './math'; -import object from './object'; -import recode from './recode'; -import sequence from './sequence'; -import string from './string'; +import array from './array.js'; +import bin from './bin.js'; +import date from './date.js'; +import equal from './equal.js'; +import json from './json.js'; +import math from './math.js'; +import object from './object.js'; +import recode from './recode.js'; +import sequence from './sequence.js'; +import string from './string.js'; export default { bin, @@ -20,4 +20,4 @@ export default { ...math, ...object, ...string -}; \ No newline at end of file +}; diff --git a/src/op/functions/json.js b/src/op/functions/json.js index 99c5ba81..d0e4e6ef 100644 --- a/src/op/functions/json.js +++ b/src/op/functions/json.js @@ -1,4 +1,16 @@ export default { - parse_json: (str) => 
JSON.parse(str), - to_json: (val) => JSON.stringify(val) -}; \ No newline at end of file + /** + * Parses a string *value* in JSON format, constructing the JavaScript + * value or object described by the string. + * @param {string} value The input string value. + * @return {any} The parsed JSON. + */ + parse_json: (value) => JSON.parse(value), + + /** + * Converts a JavaScript object or value to a JSON string. + * @param {*} value The value to convert to a JSON string. + * @return {string} The JSON string. + */ + to_json: (value) => JSON.stringify(value) +}; diff --git a/src/op/functions/math.js b/src/op/functions/math.js index 4faddf51..b7064fff 100644 --- a/src/op/functions/math.js +++ b/src/op/functions/math.js @@ -1,43 +1,300 @@ -import { random } from '../../util/random'; +import { random } from '../../util/random.js'; export default { + /** + * Return a random floating point number between 0 (inclusive) and 1 + * (exclusive). By default uses *Math.random*. Use the *seed* method + * to instead use a seeded random number generator. + * @return {number} A pseudorandom number between 0 and 1. + */ random, - is_nan: Number.isNaN, + + /** + * Tests if the input *value* is not a number (`NaN`); equivalent + * to *Number.isNaN*. + * @param {*} value The value to test. + * @return {boolean} True if the value is not a number, false otherwise. + */ + is_nan: Number.isNaN, + + /** + * Tests if the input *value* is finite; equivalent to *Number.isFinite*. + * @param {*} value The value to test. + * @return {boolean} True if the value is finite, false otherwise. + */ is_finite: Number.isFinite, - abs: Math.abs, - cbrt: Math.cbrt, - ceil: Math.ceil, - clz32: Math.clz32, - exp: Math.exp, - expm1: Math.expm1, - floor: Math.floor, - fround: Math.fround, + /** + * Returns the absolute value of the input *value*; equivalent to *Math.abs*. + * @param {number} value The input number value. + * @return {number} The absolute value. 
+ */ + abs: Math.abs, + + /** + * Returns the cube root value of the input *value*; equivalent to + * *Math.cbrt*. + * @param {number} value The input number value. + * @return {number} The cube root value. + */ + cbrt: Math.cbrt, + + /** + * Returns the ceiling of the input *value*, the nearest integer equal to + * or greater than the input; equivalent to *Math.ceil*. + * @param {number} value The input number value. + * @return {number} The ceiling value. + */ + ceil: Math.ceil, + + /** + * Returns the number of leading zero bits in the 32-bit binary + * representation of a number *value*; equivalent to *Math.clz32*. + * @param {number} value The input number value. + * @return {number} The leading zero bits value. + */ + clz32: Math.clz32, + + /** + * Returns *e<sup>value</sup>*, where *e* is Euler's number, the base of the + * natural logarithm; equivalent to *Math.exp*. + * @param {number} value The input number value. + * @return {number} The base-e exponentiated value. + */ + exp: Math.exp, + + /** + * Returns *e<sup>value</sup> - 1*, where *e* is Euler's number, the base of + * the natural logarithm; equivalent to *Math.expm1*. + * @param {number} value The input number value. + * @return {number} The base-e exponentiated value minus 1. + */ + expm1: Math.expm1, + + /** + * Returns the floor of the input *value*, the nearest integer equal to or + * less than the input; equivalent to *Math.floor*. + * @param {number} value The input number value. + * @return {number} The floor value. + */ + floor: Math.floor, + + /** + * Returns the nearest 32-bit single precision float representation of the + * input number *value*; equivalent to *Math.fround*. Useful for translating + * between 64-bit `Number` values and values from a `Float32Array`. + * @param {number} value The input number value. + * @return {number} The rounded value. + */ + fround: Math.fround, + + /** + * Returns the greatest (maximum) value among the input *values*; equivalent + * to *Math.max*. 
This is _not_ an aggregate function, see *op.max* to + * compute a maximum value across multiple rows. + * @param {...number} values The input number values. + * @return {number} The greatest (maximum) value among the inputs. + */ greatest: Math.max, - least: Math.min, - log: Math.log, - log10: Math.log10, - log1p: Math.log1p, - log2: Math.log2, - pow: Math.pow, - round: Math.round, - sign: Math.sign, - sqrt: Math.sqrt, - trunc: Math.trunc, - - degrees: (rad) => 180 * rad / Math.PI, - radians: (deg) => Math.PI * deg / 180, - acos: Math.acos, - acosh: Math.acosh, - asin: Math.asin, - asinh: Math.asinh, - atan: Math.atan, - atan2: Math.atan2, - atanh: Math.atanh, - cos: Math.cos, - cosh: Math.cosh, - sin: Math.sin, - sinh: Math.sinh, - tan: Math.tan, - tanh: Math.tanh -}; \ No newline at end of file + + /** + * Returns the least (minimum) value among the input *values*; equivalent + * to *Math.min*. This is _not_ an aggregate function, see *op.min* to + * compute a minimum value across multiple rows. + * @param {...number} values The input number values. + * @return {number} The least (minimum) value among the inputs. + */ + least: Math.min, + + /** + * Returns the natural logarithm (base *e*) of a number *value*; equivalent + * to *Math.log*. + * @param {number} value The input number value. + * @return {number} The base-e log value. + */ + log: Math.log, + + /** + * Returns the base 10 logarithm of a number *value*; equivalent + * to *Math.log10*. + * @param {number} value The input number value. + * @return {number} The base-10 log value. + */ + log10: Math.log10, + + /** + * Returns the natural logarithm (base *e*) of 1 + a number *value*; + * equivalent to *Math.log1p*. + * @param {number} value The input number value. + * @return {number} The base-e log of value + 1. + */ + log1p: Math.log1p, + + /** + * Returns the base 2 logarithm of a number *value*; equivalent + * to *Math.log2*. + * @param {number} value The input number value. 
+ * @return {number} The base-2 log value. + */ + log2: Math.log2, + + /** + * Returns the *base* raised to the *exponent* power, that is, + * *base*<sup>*exponent*</sup>; equivalent to *Math.pow*. + * @param {number} base The base number value. + * @param {number} exponent The exponent number value. + * @return {number} The exponentiated value. + */ + pow: Math.pow, + + /** + * Returns the value of a number rounded to the nearest integer; + * equivalent to *Math.round*. + * @param {number} value The input number value. + * @return {number} The rounded value. + */ + round: Math.round, + + /** + * Returns either a positive or negative +/- 1, indicating the sign of the + * input *value*; equivalent to *Math.sign*. + * @param {number} value The input number value. + * @return {number} The sign of the value. + */ + sign: Math.sign, + + /** + * Returns the square root of the input *value*; equivalent to *Math.sqrt*. + * @param {number} value The input number value. + * @return {number} The square root value. + */ + sqrt: Math.sqrt, + + /** + * Returns the integer part of a number by removing any fractional digits; + * equivalent to *Math.trunc*. + * @param {number} value The input number value. + * @return {number} The truncated value. + */ + trunc: Math.trunc, + + /** + * Converts the input *radians* value to degrees. + * @param {number} radians The input radians value. + * @return {number} The value in degrees. + */ + degrees: (radians) => 180 * radians / Math.PI, + + /** + * Converts the input *degrees* value to radians. + * @param {number} degrees The input degrees value. + * @return {number} The value in radians. + */ + radians: (degrees) => Math.PI * degrees / 180, + + /** + * Returns the arc-cosine (in radians) of a number *value*; + * equivalent to *Math.acos*. + * @param {number} value The input number value. + * @return {number} The arc-cosine value. + */ + acos: Math.acos, + + /** + * Returns the hyperbolic arc-cosine of a number *value*; + * equivalent to *Math.acosh*. 
+ * @param {number} value The input number value. + * @return {number} The hyperbolic arc-cosine value. + */ + acosh: Math.acosh, + + /** + * Returns the arc-sine (in radians) of a number *value*; + * equivalent to *Math.asin*. + * @param {number} value The input number value. + * @return {number} The arc-sine value. + */ + asin: Math.asin, + + /** + * Returns the hyperbolic arc-sine of a number *value*; + * equivalent to *Math.asinh*. + * @param {number} value The input number value. + * @return {number} The hyperbolic arc-sine value. + */ + asinh: Math.asinh, + + /** + * Returns the arc-tangent (in radians) of a number *value*; + * equivalent to *Math.atan*. + * @param {number} value The input number value. + * @return {number} The arc-tangent value. + */ + atan: Math.atan, + + /** + * Returns the angle in the plane (in radians) between the positive x-axis + * and the ray from (0, 0) to the point (*x*, *y*); + * equivalent to *Math.atan2*. + * @param {number} y The y coordinate of the point. + * @param {number} x The x coordinate of the point. + * @return {number} The arc-tangent angle. + */ + atan2: Math.atan2, + + /** + * Returns the hyperbolic arc-tangent of a number *value*; + * equivalent to *Math.atanh*. + * @param {number} value The input number value. + * @return {number} The hyperbolic arc-tangent value. + */ + atanh: Math.atanh, + + /** + * Returns the cosine (in radians) of a number *value*; + * equivalent to *Math.cos*. + * @param {number} value The input number value. + * @return {number} The cosine value. + */ + cos: Math.cos, + + /** + * Returns the hyperbolic cosine (in radians) of a number *value*; + * equivalent to *Math.cosh*. + * @param {number} value The input number value. + * @return {number} The hyperbolic cosine value. + */ + cosh: Math.cosh, + + /** + * Returns the sine (in radians) of a number *value*; + * equivalent to *Math.sin*. + * @param {number} value The input number value. + * @return {number} The sine value. 
+ */ + sin: Math.sin, + + /** + * Returns the hyperbolic sine (in radians) of a number *value*; + * equivalent to *Math.sinh*. + * @param {number} value The input number value. + * @return {number} The hyperbolic sine value. + */ + sinh: Math.sinh, + + /** + * Returns the tangent (in radians) of a number *value*; + * equivalent to *Math.tan*. + * @param {number} value The input number value. + * @return {number} The tangent value. + */ + tan: Math.tan, + + /** + * Returns the hyperbolic tangent (in radians) of a number *value*; + * equivalent to *Math.tanh*. + * @param {number} value The input number value. + * @return {number} The hyperbolic tangent value. + */ + tanh: Math.tanh +}; diff --git a/src/op/functions/object.js b/src/op/functions/object.js index 7c1d693b..a00f7d7b 100644 --- a/src/op/functions/object.js +++ b/src/op/functions/object.js @@ -1,24 +1,74 @@ -import NULL from '../../util/null'; -import has from '../../util/has'; -import isMap from '../../util/is-map'; -import isMapOrSet from '../../util/is-map-or-set'; +import NULL from '../../util/null.js'; +import has from '../../util/has.js'; +import isMap from '../../util/is-map.js'; +import isMapOrSet from '../../util/is-map-or-set.js'; function array(iter) { return Array.from(iter); } export default { - has: (obj, key) => isMapOrSet(obj) ? obj.has(key) - : obj != null ? has(obj, key) - : false, - keys: (obj) => isMap(obj) ? array(obj.keys()) - : obj != null ? Object.keys(obj) - : [], - values: (obj) => isMapOrSet(obj) ? array(obj.values()) - : obj != null ? Object.values(obj) - : [], - entries: (obj) => isMapOrSet(obj) ? array(obj.entries()) - : obj != null ? Object.entries(obj) - : [], - object: (entries) => entries ? Object.fromEntries(entries) : NULL -}; \ No newline at end of file + /** + * Returns a boolean indicating whether the *object* has the specified *key* + * as its own property (as opposed to inheriting it). 
If the *object* is a + * *Map* or *Set* instance, the *has* method will be invoked directly on the + * object, otherwise *Object.hasOwnProperty* is used. + * @template K, V + * @param {Map|Set|Record} object The object, Map, or Set to + * test for property membership. + * @param {K} key The property key to test for. + * @return {boolean} True if the object has the given key, false otherwise. + */ + has: (object, key) => isMapOrSet(object) ? object.has(key) + : object != null ? has(object, `${key}`) + : false, + + /** + * Returns an array of a given *object*'s own enumerable property names. If + * the *object* is a *Map* instance, the *keys* method will be invoked + * directly on the object, otherwise *Object.keys* is used. + * @template K, V + * @param {Map|Record} object The input object or Map value. + * @return {K[]} An array of property key name strings. + */ + keys: (object) => isMap(object) ? array(object.keys()) + : object != null ? Object.keys(object) + : [], + + /** + * Returns an array of a given *object*'s own enumerable property values. If + * the *object* is a *Map* or *Set* instance, the *values* method will be + * invoked directly on the object, otherwise *Object.values* is used. + * @template K, V + * @param {Map|Set|Record} object The input + * object, Map, or Set value. + * @return {V[]} An array of property values. + */ + values: (object) => isMapOrSet(object) ? array(object.values()) + : object != null ? Object.values(object) + : [], + + /** + * Returns an array of a given *object*'s own enumerable keyed property + * `[key, value]` pairs. If the *object* is a *Map* or *Set* instance, the + * *entries* method will be invoked directly on the object, otherwise + * *Object.entries* is used. + * @template K, V + * @param {Map|Set|Record} object The input + * object, Map, or Set value. + * @return {[K, V][]} An array of property values. + */ + entries: (object) => isMapOrSet(object) ? array(object.entries()) + : object != null ? 
Object.entries(object) + : [], + + /** + * Returns a new object given iterable *entries* of `[key, value]` pairs. + * This method is Arquero's version of the *Object.fromEntries* method. + * @template K, V + * @param {Iterable<[K, V]>} entries An iterable collection of `[key, value]` + * pairs, such as an array of two-element arrays or a *Map*. + * @return {Record} An object of consolidated key-value pairs. + */ + object: (entries) => entries ? Object.fromEntries(entries) : NULL +}; diff --git a/src/op/functions/recode.js b/src/op/functions/recode.js index 80bcb48d..34b9b5eb 100644 --- a/src/op/functions/recode.js +++ b/src/op/functions/recode.js @@ -1,24 +1,26 @@ -import has from '../../util/has'; +import has from '../../util/has.js'; /** * Recodes an input value to an alternative value, based on a provided * value map. If a fallback value is specified, it will be returned when * a matching value is not found in the map; otherwise, the input value * is returned unchanged. - * @param {*} value The value to recode. The value must be safely + * @template T + * @param {T} value The value to recode. The value must be safely * coercible to a string for lookup against the value map. - * @param {object|Map} map An object or Map with input values for keys and - * output recoded values as values. If a non-Map object, only the object's - * own properties will be considered. - * @param {*} [fallback] A default fallback value to use if the input + * @param {Map|Record} map An object or Map with input values + * for keys and output recoded values as values. If a non-Map object, only + * the object's own properties will be considered. + * @param {T} [fallback] A default fallback value to use if the input * value is not found in the value map. - * @return {*} The recoded value. + * @return {T} The recoded value. 
*/ export default function(value, map, fallback) { if (map instanceof Map) { if (map.has(value)) return map.get(value); - } else if (has(map, value)) { - return map[value]; + } else { + const key = `${value}`; + if (has(map, key)) return map[key]; } return fallback !== undefined ? fallback : value; -} \ No newline at end of file +} diff --git a/src/op/functions/sequence.js b/src/op/functions/sequence.js index 2b36f973..e9a5b8f9 100644 --- a/src/op/functions/sequence.js +++ b/src/op/functions/sequence.js @@ -27,4 +27,4 @@ export default function(start, stop, step) { } return seq; -} \ No newline at end of file +} diff --git a/src/op/functions/string.js b/src/op/functions/string.js index f1faf379..581f795f 100644 --- a/src/op/functions/string.js +++ b/src/op/functions/string.js @@ -1,36 +1,222 @@ export default { - parse_date: (str) => str == null ? str : new Date(str), - parse_float: (str) => str == null ? str : Number.parseFloat(str), - parse_int: (str, radix) => str == null ? str : Number.parseInt(str, radix), - endswith: (str, search, length) => str == null ? false - : String(str).endsWith(search, length), - match: (str, regexp, index) => { - const m = str == null ? str : String(str).match(regexp); - return index == null || m == null ? m - : typeof index === 'number' ? m[index] - : m.groups ? m.groups[index] - : null; - }, - normalize: (str, form) => str == null ? str - : String(str).normalize(form), - padend: (str, len, fill) => str == null ? str - : String(str).padEnd(len, fill), - padstart: (str, len, fill) => str == null ? str - : String(str).padStart(len, fill), - upper: (str) => str == null ? str - : String(str).toUpperCase(), - lower: (str) => str == null ? str - : String(str).toLowerCase(), - repeat: (str, num) => str == null ? str - : String(str).repeat(num), - replace: (str, pattern, replacement) => str == null ? str - : String(str).replace(pattern, String(replacement)), - substring: (str, start, end) => str == null ? 
str - : String(str).substring(start, end), - split: (str, separator, limit) => str == null ? [] - : String(str).split(separator, limit), - startswith: (str, search, length) => str == null ? false - : String(str).startsWith(search, length), - trim: (str) => str == null ? str - : String(str).trim() -}; \ No newline at end of file + /** + * Parses a string *value* and returns a Date instance. Beware: this method + * uses JavaScript's *Date.parse()* functionality, which is inconsistently + * implemented across browsers. That said, + * [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) formatted strings such + * as those produced by *op.format_date* and *op.format_utcdate* should be + * supported across platforms. Note that "bare" ISO date strings such as + * `"2001-01-01"` are interpreted by JavaScript as indicating midnight of + * that day in Coordinated Universal Time (UTC), *not* local time. To + * indicate the local timezone, an ISO string can include additional time + * components and no `Z` suffix: `"2001-01-01T00:00"`. + * @param {*} value The input value. + * @return {Date} The parsed date value. + */ + parse_date: (value) => value == null ? value : new Date(value), + + /** + * Parses a string *value* and returns a floating point number. + * @param {*} value The input value. + * @return {number} The parsed number value. + */ + parse_float: (value) => value == null ? value : Number.parseFloat(value), + + /** + * Parses a string *value* and returns an integer of the specified radix + * (the base in mathematical numeral systems). + * @param {*} value The input value. + * @param {number} [radix] An integer between 2 and 36 that represents the + * radix (the base in mathematical numeral systems) of the string. Be + * careful: this does not default to 10! 
If *radix* is `undefined`, `0`, + * or unspecified, JavaScript assumes the following: If the input string + * begins with `"0x"` or `"0X"` (a zero, followed by lowercase or + * uppercase X), the radix is assumed to be 16 and the rest of the string + * is parsed as a hexadecimal number. If the input string begins with `"0"` + * (a zero), the radix is assumed to be 8 (octal) or 10 (decimal). Exactly + * which radix is chosen is implementation-dependent. If the input string + * begins with any other value, the radix is 10 (decimal). + * @return {number} The parsed integer value. + */ + parse_int: (value, radix) => value == null ? value + : Number.parseInt(value, radix), + + /** + * Determines whether a string *value* ends with the characters of a + * specified *search* string, returning `true` or `false` as appropriate. + * @param {any} value The input string value. + * @param {string} search The search string to test for. + * @param {number} [length] If provided, used as the length of *value* + * (default `value.length`). + * @return {boolean} True if the value ends with the search string, + * false otherwise. + */ + endswith: (value, search, length) => value == null ? false + : String(value).endsWith(search, length), + + /** + * Retrieves the result of matching a string *value* against a regular + * expression *regexp*. If no *index* is specified, returns an array + * whose contents depend on the presence or absence of the regular + * expression global (`g`) flag, or `null` if no matches are found. If the + * `g` flag is used, all results matching the complete regular expression + * will be returned, but capturing groups will not. If the `g` flag is not + * used, only the first complete match and its related capturing groups are + * returned. + * + * If specified, the *index* looks up a value of the resulting match. If + * *index* is a number, the corresponding index of the result array is + * returned. 
If *index* is a string, the value of the corresponding + * named capture group is returned, or `null` if there is no such group. + * @param {*} value The input string value. + * @param {*} regexp The regular expression to match against. + * @param {number|string} index The index into the match result array + * or capture group. + * @return {string|string[]} The match result. + */ + match: (value, regexp, index) => { + const m = value == null ? value : String(value).match(regexp); + return index == null || m == null ? m + : typeof index === 'number' ? m[index] + : m.groups ? m.groups[index] + : null; + }, + + /** + * Returns the Unicode normalization form of the string *value*. + * @param {*} value The input value to normalize. + * @param {string} form The Unicode normalization form, one of + * `'NFC'` (default, canonical decomposition, followed by canonical + * composition), `'NFD'` (canonical decomposition), `'NFKC'` (compatibility + * decomposition, followed by canonical composition), + * or `'NFKD'` (compatibility decomposition). + * @return {string} The normalized string value. + */ + normalize: (value, form) => value == null ? value + : String(value).normalize(form), + + /** + * Pad a string *value* with a given *fill* string (applied from the end of + * *value* and repeated, if needed) so that the resulting string reaches a + * given *length*. + * @param {*} value The input value to pad. + * @param {number} length The length of the resulting string once the + * *value* string has been padded. If the length is lower than + * `value.length`, the *value* string will be returned as-is. + * @param {string} [fill] The string to pad the *value* string with + * (default `''`). If *fill* is too long to stay within the target + * *length*, it will be truncated: for left-to-right languages the + * left-most part and for right-to-left languages the right-most will + * be applied. + * @return {string} The padded string. 
+ */ + padend: (value, length, fill) => value == null ? value + : String(value).padEnd(length, fill), + + /** + * Pad a string *value* with a given *fill* string (applied from the start + * of *value* and repeated, if needed) so that the resulting string reaches + * a given *length*. + * @param {*} value The input value to pad. + * @param {number} length The length of the resulting string once the + * *value* string has been padded. If the length is lower than + * `value.length`, the *value* string will be returned as-is. + * @param {string} [fill] The string to pad the *value* string with + * (default `''`). If *fill* is too long to stay within the target + * *length*, it will be truncated: for left-to-right languages the + * left-most part and for right-to-left languages the right-most will + * be applied. + * @return {string} The padded string. + */ + padstart: (value, length, fill) => value == null ? value + : String(value).padStart(length, fill), + + /** + * Returns the string *value* converted to upper case. + * @param {*} value The input string value. + * @return {string} The upper case string. + */ + upper: (value) => value == null ? value : String(value).toUpperCase(), + + /** + * Returns the string *value* converted to lower case. + * @param {*} value The input string value. + * @return {string} The lower case string. + */ + lower: (value) => value == null ? value : String(value).toLowerCase(), + + /** + * Returns a new string which contains the specified *number* of copies of + * the *value* string concatenated together. + * @param {*} value The input string to repeat. + * @param {*} number An integer between `0` and `+Infinity`, indicating the + * number of times to repeat the string. + * @return {string} The repeated string. + */ + repeat: (value, number) => value == null ? value + : String(value).repeat(number), + + /** + * Returns a new string with some or all matches of a *pattern* replaced by + * a *replacement*. 
The *pattern* can be a string or a regular expression, + * and the *replacement* must be a string. If *pattern* is a string, only + * the first occurrence will be replaced; to make multiple replacements, use + * a regular expression *pattern* with a `g` (global) flag. + * @param {*} value The input string value. + * @param {*} pattern The pattern string or regular expression to replace. + * @param {*} replacement The replacement string to use. + * @return {string} The string with patterns replaced. + */ + replace: (value, pattern, replacement) => value == null ? value + : String(value).replace(pattern, String(replacement)), + + /** + * Divides a string *value* into an ordered list of substrings based on a + * *separator* pattern, puts these substrings into an array, and returns the + * array. + * @param {*} value The input string value. + * @param {*} separator A string or regular expression pattern describing + * where each split should occur. + * @param {number} [limit] An integer specifying a limit on the number of + * substrings to be included in the array. + * @return {string[]} + */ + split: (value, separator, limit) => value == null ? [] + : String(value).split(separator, limit), + + /** + * Determines whether a string *value* starts with the characters of a + * specified *search* string, returning `true` or `false` as appropriate. + * @param {*} value The input string value. + * @param {string} search The search string to test for. + * @param {number} [position=0] The position in the *value* string at which + * to begin searching (default `0`). + * @return {boolean} True if the string starts with the search pattern, + * false otherwise. + */ + startswith: (value, search, position) => value == null ? false + : String(value).startsWith(search, position), + + /** + * Returns the part of the string *value* between the *start* and *end* + * indexes, or to the end of the string. + * @param {*} value The input string value. 
+ * @param {number} [start=0] The index of the first character to include in + * the returned substring (default `0`). + * @param {number} [end] The index of the first character to exclude from + * the returned substring (default `value.length`). + * @return {string} The substring. + */ + substring: (value, start, end) => value == null ? value + : String(value).substring(start, end), + + /** + * Returns a new string with whitespace removed from both ends of the input + * *value* string. Whitespace in this context is all the whitespace + * characters (space, tab, no-break space, etc.) and all the line terminator + * characters (LF, CR, etc.). + * @param {*} value The input string value to trim. + * @return {string} The trimmed string. + */ + trim: (value) => value == null ? value : String(value).trim() +}; diff --git a/src/op/index.js b/src/op/index.js index a5a0f83e..b4c032b6 100644 --- a/src/op/index.js +++ b/src/op/index.js @@ -1,13 +1,11 @@ -import aggregateFunctions from './aggregate-functions'; -import windowFunctions from './window-functions'; -import functions from './functions'; -import has from '../util/has'; +import aggregateFunctions from './aggregate-functions.js'; +import windowFunctions from './window-functions.js'; +import functions from './functions/index.js'; +import has from '../util/has.js'; -export { - functions, - aggregateFunctions, - windowFunctions -}; +export { default as aggregateFunctions } from './aggregate-functions.js'; +export { default as windowFunctions } from './window-functions.js'; +export { default as functions } from './functions/index.js'; /** * Check if an aggregate function with the given name exists. @@ -39,8 +37,8 @@ export function hasWindow(name) { /** * Get an aggregate function definition. * @param {string} name The name of the aggregate function. - * @return {AggregateDef} The aggregate function definition, - * or undefined if not found. 
+ * @return {import('./aggregate-functions.js').AggregateDef} + * The aggregate function definition, or undefined if not found. */ export function getAggregate(name) { return hasAggregate(name) && aggregateFunctions[name]; @@ -49,8 +47,8 @@ export function getAggregate(name) { /** * Get a window function definition. * @param {string} name The name of the window function. - * @return {WindowDef} The window function definition, - * or undefined if not found. + * @return {import('./window-functions.js').WindowDef} + * The window function definition, or undefined if not found. */ export function getWindow(name) { return hasWindow(name) && windowFunctions[name]; @@ -63,4 +61,4 @@ export function getWindow(name) { */ export function getFunction(name) { return hasFunction(name) && functions[name]; -} \ No newline at end of file +} diff --git a/src/op/op-api.js b/src/op/op-api.js index e5c7ca6b..534846b6 100644 --- a/src/op/op-api.js +++ b/src/op/op-api.js @@ -1,5 +1,33 @@ -import functions from './functions'; -import op from './op'; +import functions from './functions/index.js'; +import toArray from '../util/to-array.js'; +import toString from '../util/to-string.js'; + +export class Op { + constructor(name, fields, params) { + this.name = name; + this.fields = fields; + this.params = params; + } + toString() { + const args = [ + ...this.fields.map(f => `d[${toString(f)}]`), + ...this.params.map(toString) + ]; + return `d => op.${this.name}(${args})`; + } + toObject() { + return { expr: this.toString(), func: true }; + } +} + +/** + * @param {string} name + * @param {any | any[]} [fields] + * @param {any | any[]} [params] + */ +export function op(name, fields = [], params = []) { + return new Op(name, toArray(fields), toArray(params)); +} export const any = (field) => op('any', field); export const count = () => op('count'); @@ -10,7 +38,7 @@ export const object_agg = (key, value) => op('object_agg', [key, value]); export const entries_agg = (key, value) => 
op('entries_agg', [key, value]); /** - * @typedef {import('../table/transformable').Struct} Struct + * @typedef {import('../table/types.js').Struct} Struct */ /** @@ -36,47 +64,53 @@ export default { /** * Aggregate function returning an arbitrary observed value. - * @param {*} field The data field. - * @return {*} An arbitrary observed value. + * @template T + * @param {T} field The data field. + * @return {T} An arbitrary observed value. */ any, /** * Aggregate function to collect an array of values. - * @param {*} field The data field. - * @return {Array} A list of values. + * @template T + * @param {T} field The data field. + * @return {Array} A list of values. */ array_agg, /** * Aggregate function to collect an array of distinct (unique) values. - * @param {*} field The data field. - * @return {Array} An array of unique values. + * @template T + * @param {T} field The data field. + * @return {Array} An array of unique values. */ array_agg_distinct, /** * Aggregate function to create an object given input key and value fields. - * @param {*} key The object key field. - * @param {*} value The object value field. - * @return {Struct} An object of key-value pairs. + * @template K, V + * @param {K} key The object key field. + * @param {V} value The object value field. + * @return {Record} An object of key-value pairs. */ object_agg, /** * Aggregate function to create a Map given input key and value fields. - * @param {*} key The object key field. - * @param {*} value The object value field. - * @return {Map} A Map of key-value pairs. + * @template K, V + * @param {K} key The object key field. + * @param {V} value The object value field. + * @return {Map} A Map of key-value pairs. */ map_agg, /** * Aggregate function to create an array in the style of Object.entries() * given input key and value fields. - * @param {*} key The object key field. - * @param {*} value The object value field. - * @return {[[any, any]]} An array of [key, value] arrays. 
+ * @template K, V + * @param {K} key The object key field. + * @param {V} value The object value field. + * @return {[K, V][]} An array of [key, value] arrays. */ entries_agg, @@ -86,6 +120,7 @@ export default { * @param {*} field The data field. * @return {number} The count of valid values. */ + // @ts-ignore valid: (field) => op('valid', field), /** @@ -94,6 +129,7 @@ export default { * @param {*} field The data field. * @return {number} The count of invalid values. */ + // @ts-ignore invalid: (field) => op('invalid', field), /** @@ -101,20 +137,24 @@ export default { * @param {*} field The data field. * @return {number} The count of distinct values. */ + // @ts-ignore distinct: (field) => op('distinct', field), /** * Aggregate function to determine the mode (most frequent) value. - * @param {*} field The data field. - * @return {number} The mode value. + * @template T + * @param {T} field The data field. + * @return {T} The mode value. */ + // @ts-ignore mode: (field) => op('mode', field), /** * Aggregate function to sum values. - * @param {string} field The data field. + * @param {*} field The data field. * @return {number} The sum of the values. */ + // @ts-ignore sum: (field) => op('sum', field), /** @@ -122,6 +162,7 @@ export default { * @param {*} field The data field. * @return {number} The product of the values. */ + // @ts-ignore product: (field) => op('product', field), /** @@ -129,6 +170,7 @@ export default { * @param {*} field The data field. * @return {number} The mean (average) of the values. */ + // @ts-ignore mean: (field) => op('mean', field), /** @@ -136,6 +178,7 @@ export default { * @param {*} field The data field. * @return {number} The average (mean) of the values. */ + // @ts-ignore average: (field) => op('average', field), /** @@ -143,6 +186,7 @@ export default { * @param {*} field The data field. * @return {number} The sample variance of the values. 
*/ + // @ts-ignore variance: (field) => op('variance', field), /** @@ -150,6 +194,7 @@ export default { * @param {*} field The data field. * @return {number} The population variance of the values. */ + // @ts-ignore variancep: (field) => op('variancep', field), /** @@ -157,6 +202,7 @@ export default { * @param {*} field The data field. * @return {number} The sample standard deviation of the values. */ + // @ts-ignore stdev: (field) => op('stdev', field), /** @@ -164,20 +210,25 @@ export default { * @param {*} field The data field. * @return {number} The population standard deviation of the values. */ + // @ts-ignore stdevp: (field) => op('stdevp', field), /** * Aggregate function for the minimum value. - * @param {*} field The data field. - * @return {number} The minimum value. + * @template T + * @param {T} field The data field. + * @return {T} The minimum value. */ + // @ts-ignore min: (field) => op('min', field), /** * Aggregate function for the maximum value. - * @param {*} field The data field. - * @return {number} The maximum value. + * @template T + * @param {T} field The data field. + * @return {T} The maximum value. */ + // @ts-ignore max: (field) => op('max', field), /** @@ -187,6 +238,7 @@ export default { * @param {number} p The probability threshold. * @return {number} The quantile value. */ + // @ts-ignore quantile: (field, p) => op('quantile', field, p), /** @@ -195,6 +247,7 @@ export default { * @param {*} field The data field. * @return {number} The median value. */ + // @ts-ignore median: (field) => op('median', field), /** @@ -203,6 +256,7 @@ export default { * @param {*} field2 The second data field. * @return {number} The sample covariance of the values. */ + // @ts-ignore covariance: (field1, field2) => op('covariance', [field1, field2]), /** @@ -211,6 +265,7 @@ export default { * @param {*} field2 The second data field. * @return {number} The population covariance of the values. 
*/ + // @ts-ignore covariancep: (field1, field2) => op('covariancep', [field1, field2]), /** @@ -221,6 +276,7 @@ export default { * @param {*} field2 The second data field. * @return {number} The correlation between the field values. */ + // @ts-ignore corr: (field1, field2) => op('corr', [field1, field2]), /** @@ -235,13 +291,18 @@ export default { * If specified, the maxbins and minstep arguments are ignored. * @return {[number, number, number]} The bin [min, max, and step] values. */ - bins: (field, maxbins, nice, minstep) => - op('bins', field, [maxbins, nice, minstep]), + // @ts-ignore + bins: (field, maxbins, nice, minstep, step) => op( + 'bins', + field, + [maxbins, nice, minstep, step] + ), /** * Window function to assign consecutive row numbers, starting from 1. * @return {number} The row number value. */ + // @ts-ignore row_number: () => op('row_number'), /** @@ -251,6 +312,7 @@ export default { * rank 1, the third value is assigned rank 3. * @return {number} The rank value. */ + // @ts-ignore rank: () => op('rank'), /** @@ -259,6 +321,7 @@ export default { * indices: if the first two values tie, both will be assigned rank 1.5. * @return {number} The peer-averaged rank value. */ + // @ts-ignore avg_rank: () => op('avg_rank'), /** @@ -268,6 +331,7 @@ export default { * values tie for rank 1, the third value is assigned rank 2. * @return {number} The dense rank value. */ + // @ts-ignore dense_rank: () => op('dense_rank'), /** @@ -275,6 +339,7 @@ export default { * The percent is calculated as (rank - 1) / (group_size - 1). * @return {number} The percentage rank value. */ + // @ts-ignore percent_rank: () => op('percent_rank'), /** @@ -282,6 +347,7 @@ export default { * to each value in a group. * @return {number} The cumulative distribution value. */ + // @ts-ignore cume_dist: () => op('cume_dist'), /** @@ -291,68 +357,83 @@ export default { * @param {number} num The number of buckets for ntile calculation. * @return {number} The quantile value. 
*/ + // @ts-ignore ntile: (num) => op('ntile', null, num), /** * Window function to assign a value that precedes the current value by * a specified number of positions. If no such value exists, returns a * default value instead. - * @param {*} field The data field. + * @template T + * @param {T} field The data field. * @param {number} [offset=1] The lag offset from the current value. - * @param {*} [defaultValue=undefined] The default value. - * @return {*} The lagging value. + * @param {T} [defaultValue=undefined] The default value. + * @return {T} The lagging value. */ + // @ts-ignore lag: (field, offset, defaultValue) => op('lag', field, [offset, defaultValue]), /** * Window function to assign a value that follows the current value by * a specified number of positions. If no such value exists, returns a * default value instead. - * @param {*} field The data field. + * @template T + * @param {T} field The data field. * @param {number} [offset=1] The lead offset from the current value. - * @param {*} [defaultValue=undefined] The default value. - * @return {*} The leading value. + * @param {T} [defaultValue=undefined] The default value. + * @return {T} The leading value. */ + // @ts-ignore lead: (field, offset, defaultValue) => op('lead', field, [offset, defaultValue]), /** * Window function to assign the first value in a sliding window frame. - * @param {*} field The data field. - * @return {*} The first value in the current frame. + * @template T + * @param {T} field The data field. + * @return {T} The first value in the current frame. */ + // @ts-ignore first_value: (field) => op('first_value', field), /** * Window function to assign the last value in a sliding window frame. - * @param {*} field The data field. - * @return {*} The last value in the current frame. + * @template T + * @param {T} field The data field. + * @return {T} The last value in the current frame. 
*/ + // @ts-ignore last_value: (field) => op('last_value', field), /** * Window function to assign the nth value in a sliding window frame * (counting from 1), or undefined if no such value exists. - * @param {*} field The data field. + * @template T + * @param {T} field The data field. * @param {number} nth The nth position, starting from 1. - * @return {*} The nth value in the current frame. + * @return {T} The nth value in the current frame. */ + // @ts-ignore nth_value: (field, nth) => op('nth_value', field, nth), /** * Window function to fill in missing values with preceding values. - * @param {*} field The data field. - * @param {*} [defaultValue=undefined] The default value. - * @return {*} The current value if valid, otherwise the first preceding + * @template T + * @param {T} field The data field. + * @param {T} [defaultValue=undefined] The default value. + * @return {T} The current value if valid, otherwise the first preceding * valid value. If no such value exists, returns the default value. */ + // @ts-ignore fill_down: (field, defaultValue) => op('fill_down', field, defaultValue), /** * Window function to fill in missing values with subsequent values. - * @param {*} field The data field. - * @param {*} [defaultValue=undefined] The default value. - * @return {*} The current value if valid, otherwise the first subsequent + * @template T + * @param {T} field The data field. + * @param {T} [defaultValue=undefined] The default value. + * @return {T} The current value if valid, otherwise the first subsequent * valid value. If no such value exists, returns the default value. 
*/ + // @ts-ignore fill_up: (field, defaultValue) => op('fill_up', field, defaultValue) -}; \ No newline at end of file +}; diff --git a/src/op/op.js b/src/op/op.js deleted file mode 100644 index d31986e8..00000000 --- a/src/op/op.js +++ /dev/null @@ -1,24 +0,0 @@ -import toArray from '../util/to-array'; -import toString from '../util/to-string'; - -export default function(name, fields = [], params = []) { - return new Op(name, toArray(fields), toArray(params)); -} - -export class Op { - constructor(name, fields, params) { - this.name = name; - this.fields = fields; - this.params = params; - } - toString() { - const args = [ - ...this.fields.map(f => `d[${toString(f)}]`), - ...this.params.map(toString) - ]; - return `d => op.${this.name}(${args})`; - } - toObject() { - return { expr: this.toString(), func: true }; - } -} \ No newline at end of file diff --git a/src/op/register.js b/src/op/register.js new file mode 100644 index 00000000..560ed116 --- /dev/null +++ b/src/op/register.js @@ -0,0 +1,112 @@ +import aggregateFunctions from './aggregate-functions.js'; +import windowFunctions from './window-functions.js'; +import functions from './functions/index.js'; +import ops, { op } from './op-api.js'; +import { ROW_OBJECT } from '../expression/row-object.js'; +import error from '../util/error.js'; +import has from '../util/has.js'; +import toString from '../util/to-string.js'; + +const onIllegal = (name, type) => + error(`Illegal ${type} name: ${toString(name)}`); + +const onDefined = (name, type) => + error(`The ${type} ${toString(name)} is already defined. 
Use override option?`); + +const onReserve = (name, type) => + error(`The ${type} name ${toString(name)} is reserved and can not be overridden.`); + +function check(name, options, obj = ops, type = 'function') { + if (!name) onIllegal(name, type); + if (!options.override && has(obj, name)) onDefined(name, type); +} + +function verifyFunction(name, def, object, options) { + return object[name] === def || check(name, options); +} + +/** + * Register an aggregate or window operation. + * @param {string} name The name of the operation + * @param {AggregateDef|WindowDef} def The operation definition. + * @param {object} object The registry object to add the definition to. + * @param {RegisterOptions} [options] Registration options. + */ +function addOp(name, def, object, options = {}) { + if (verifyFunction(name, def, object, options)) return; + const [nf = 0, np = 0] = def.param; // num fields, num params + object[name] = def; + ops[name] = (...params) => op( + name, + params.slice(0, nf), + params.slice(nf, nf + np) + ); +} + +/** + * Register a custom aggregate function. + * @param {string} name The name to use for the aggregate function. + * @param {AggregateDef} def The aggregate operator definition. + * @param {RegisterOptions} [options] Function registration options. + * @throws If a function with the same name is already registered and + * the override option is not specified. + */ +export function addAggregateFunction(name, def, options) { + addOp(name, def, aggregateFunctions, options); +} + +/** + * Register a custom window function. + * @param {string} name The name to use for the window function. + * @param {WindowDef} def The window operator definition. + * @param {RegisterOptions} [options] Function registration options. + * @throws If a function with the same name is already registered and + * the override option is not specified. 
+ */ +export function addWindowFunction(name, def, options) { + addOp(name, def, windowFunctions, options); +} + +/** + * Register a function for use within table expressions. + * If only a single argument is provided, it will be assumed to be a + * function and the system will try to extract its name. + * @param {string} name The name to use for the function. + * @param {Function} fn A standard JavaScript function. + * @param {RegisterOptions} [options] Function registration options. + * @throws If a function with the same name is already registered and + * the override option is not specified, or if no name is provided + * and the input function is anonymous. + */ +export function addFunction(name, fn, options = {}) { + if (arguments.length === 1) { + // @ts-ignore + fn = name; + name = fn.name; + if (name === '' || name === 'anonymous') { + error('Anonymous function provided, please include a name argument.'); + } else if (name === ROW_OBJECT) { + onReserve(ROW_OBJECT, 'function'); + } + } + if (verifyFunction(name, fn, functions, options)) return; + functions[name] = fn; + ops[name] = fn; +} + +/** + * Aggregate function definition. + * @typedef {import('./aggregate-functions.js').AggregateDef} AggregateDef + */ + +/** + * Window function definition. + * @typedef {import('./window-functions.js').WindowDef} WindowDef + */ + +/** + * Options for registering new functions. + * @typedef {object} RegisterOptions + * @property {boolean} [override=false] Flag indicating if the added + * function can override an existing function with the same name. 
+ */ diff --git a/src/op/window-functions.js b/src/op/window-functions.js index b861538f..60710ba4 100644 --- a/src/op/window-functions.js +++ b/src/op/window-functions.js @@ -1,7 +1,7 @@ -import error from '../util/error'; -import isValid from '../util/is-valid'; -import noop from '../util/no-op'; -import NULL from '../util/null'; +import error from '../util/error.js'; +import isValid from '../util/is-valid.js'; +import noop from '../util/no-op.js'; +import NULL from '../util/null.js'; /** * Initialize a window operator. @@ -11,7 +11,7 @@ import NULL from '../util/null'; /** * A storage object for the state of the window. - * @typedef {import('../engine/window/window-state').default} WindowState + * @typedef {import('../verbs/window/window-state.js').default} WindowState */ /** @@ -23,12 +23,12 @@ import NULL from '../util/null'; /** * Initialize an aggregate operator. - * @typedef {import('./aggregate-functions').AggregateInit} AggregateInit + * @typedef {import('./aggregate-functions.js').AggregateInit} AggregateInit */ /** * Retrive an output value from an aggregate operator. - * @typedef {import('./aggregate-functions').AggregateValue} AggregateValue + * @typedef {import('./aggregate-functions.js').AggregateValue} AggregateValue */ /** @@ -47,7 +47,7 @@ import NULL from '../util/null'; /** * Create a new aggregate operator instance. 
- * @typedef {import('./aggregate-functions').AggregateCreate} AggregateCreate + * @typedef {import('./aggregate-functions.js').AggregateCreate} AggregateCreate */ /** diff --git a/src/query/constants.js b/src/query/constants.js deleted file mode 100644 index fd7591b4..00000000 --- a/src/query/constants.js +++ /dev/null @@ -1,17 +0,0 @@ -export const Expr = 'Expr'; -export const ExprList = 'ExprList'; -export const ExprNumber = 'ExprNumber'; -export const ExprObject = 'ExprObject'; -export const JoinKeys = 'JoinKeys'; -export const JoinValues = 'JoinValues'; -export const Options = 'Options'; -export const OrderbyKeys = 'OrderKeys'; -export const SelectionList = 'SelectionList'; -export const TableRef = 'TableRef'; -export const TableRefList = 'TableRefList'; - -export const Descending = 'Descending'; -export const Query = 'Query'; -export const Selection = 'Selection'; -export const Verb = 'Verb'; -export const Window = 'Window'; \ No newline at end of file diff --git a/src/query/query.js b/src/query/query.js deleted file mode 100644 index c1b08a96..00000000 --- a/src/query/query.js +++ /dev/null @@ -1,190 +0,0 @@ -import Transformable from '../table/transformable'; -import { Query as QueryType } from './constants'; -import { Verb, Verbs } from './verb'; - -/** - * Create a new query instance. The query interface provides - * a table-like verb API to construct a query that can be - * serialized or evaluated against Arquero tables. - * @param {string} [tableName] The name of the table to query. If - * provided, will be used as the default input table to pull from - * a provided catalog to run the query against. - * @return {Query} A new builder instance. - */ -export function query(tableName) { - return new Query(null, null, tableName); -} - -/** - * Create a new query instance from a serialized object. - * @param {object} object A serialized query representation, such as - * those generated by query(...).toObject(). 
- * @returns {Query} The instantiated query instance. - */ -export function queryFrom(object) { - return Query.from(object); -} - -/** - * Model a query as a collection of serializble verbs. - * Provides a table-like interface for constructing queries. - */ -export default class Query extends Transformable { - - /** - * Construct a new query instance. - * @param {Verb[]} verbs An array of verb instances. - * @param {object} [params] Optional query parameters, corresponding - * to parameter references in table expressions. - * @param {string} [table] Optional name of the table to query. - */ - constructor(verbs, params, table) { - super(params); - this._verbs = verbs || []; - this._table = table; - } - - /** - * Create a new query instance from the given serialized object. - * @param {QueryObject} object A serialized query representation, such as - * those generated by Query.toObject. - * @returns {Query} The instantiated query. - */ - static from({ verbs, table, params }) { - return new Query(verbs.map(Verb.from), params, table); - } - - /** - * Provide an informative object string tag. - */ - get [Symbol.toStringTag]() { - if (!this._verbs) return 'Object'; // bail if called on prototype - const ns = this._verbs.length; - return `Query: ${ns} verbs` + (this._table ? ` on '${this._table}'` : ''); - } - - /** - * Return the number of verbs in this query. - */ - get length() { - return this._verbs.length; - } - - /** - * Return the name of the table this query applies to. - * @return {string} The name of the source table, or undefined. - */ - get tableName() { - return this._table; - } - - /** - * Get or set table expression parameter values. - * If called with no arguments, returns the current parameter values - * as an object. Otherwise, adds the provided parameters to this - * query's parameter set and returns the table. Any prior parameters - * with names matching the input parameters are overridden. - * @param {object} values The parameter values. 
- * @return {Query|object} The current parameter values (if called - * with no arguments) or this query. - */ - params(values) { - if (arguments.length) { - this._params = { ...this._params, ...values }; - return this; - } else { - return this._params; - } - } - - /** - * Evaluate this query against a given table and catalog. - * @param {Table} table The Arquero table to process. - * @param {Function} catalog A table lookup function that accepts a table - * name string as input and returns a corresponding Arquero table. - * @returns {Table} The resulting Arquero table. - */ - evaluate(table, catalog) { - table = table || catalog(this._table); - for (const verb of this._verbs) { - table = verb.evaluate(table.params(this._params), catalog); - } - return table; - } - - /** - * Serialize this query as a JSON-compatible object. The resulting - * object can be passed to Query.from to re-instantiate this query. - * @returns {object} A JSON-compatible object representing this query. - */ - toObject() { - return serialize(this, 'toObject'); - } - - /** - * Serialize this query as a JSON-compatible object. The resulting - * object can be passed to Query.from to re-instantiate this query. - * This method simply returns the result of toObject, but is provided - * as a separate method to allow later customization of JSON export. - * @returns {object} A JSON-compatible object representing this query. - */ - toJSON() { - return this.toObject(); - } - - /** - * Serialize this query to a JSON-compatible abstract syntax tree. - * All table expressions will be parsed and represented as AST instances - * using a modified form of the Mozilla JavaScript AST format. - * This method can be used to output parsed and serialized representations - * to translate Arquero queries to alternative data processing platforms. - * @returns {object} A JSON-compatible abstract syntax tree object. 
- */ - toAST() { - return serialize(this, 'toAST', { type: QueryType }); - } -} - -/** - * Abstract class representing a data table. - * @typedef {import('../table/table').default} Table - */ - -/** - * Serialized object representation of a query. - * @typedef {object} QueryObject - * @property {object[]} verbs An array of verb definitions. - * @property {object} [params] An object of parameter values. - * @property {string} [table] The name of the table to query. - */ - -function serialize(query, method, props) { - return { - ...props, - verbs: query._verbs.map(verb => verb[method]()), - ...(query._params ? { params: query._params } : null), - ...(query._table ? { table: query._table } : null) - }; -} - -function append(qb, verb) { - return new Query( - qb._verbs.concat(verb), - qb._params, - qb._table - ); -} - -export function addQueryVerb(name, verb) { - Query.prototype[name] = function(...args) { - return append(this, verb(...args)); - }; -} - -// Internal verb handlers -for (const name in Verbs) { - const verb = Verbs[name]; - Query.prototype['__' + name] = function(qb, ...args) { - return append(qb, verb(...args)); - }; -} \ No newline at end of file diff --git a/src/query/to-ast.js b/src/query/to-ast.js deleted file mode 100644 index 6b143e93..00000000 --- a/src/query/to-ast.js +++ /dev/null @@ -1,160 +0,0 @@ -import error from '../util/error'; -import isArray from '../util/is-array'; -import isFunction from '../util/is-function'; -import isNumber from '../util/is-number'; -import isObject from '../util/is-object'; -import isString from '../util/is-string'; -import toArray from '../util/to-array'; -import parse from '../expression/parse'; -import { isSelection, toObject } from './util'; - -import { Column } from '../expression/ast/constants'; -import { - Descending, - Expr, - ExprList, - ExprNumber, - ExprObject, - JoinKeys, - JoinValues, - Options, - OrderbyKeys, - Selection, - SelectionList, - TableRef, - TableRefList, - Window -} from './constants'; - 
-const Methods = { - [Expr]: astExpr, - [ExprList]: astExprList, - [ExprNumber]: astExprNumber, - [ExprObject]: astExprObject, - [JoinKeys]: astJoinKeys, - [JoinValues]: astJoinValues, - [OrderbyKeys]: astExprList, - [SelectionList]: astSelectionList -}; - -export default function(value, type, propTypes) { - return type === TableRef ? astTableRef(value) - : type === TableRefList ? value.map(astTableRef) - : ast(toObject(value), type, propTypes); -} - -function ast(value, type, propTypes) { - return type === Options - ? (value ? astOptions(value, propTypes) : value) - : Methods[type](value); -} - -function astOptions(value, types = {}) { - const output = {}; - for (const key in value) { - const prop = value[key]; - output[key] = types[key] ? ast(prop, types[key]) : prop; - } - return output; -} - -function astParse(expr, opt) { - return parse({ expr }, { ...opt, ast: true }).exprs[0]; -} - -function astColumn(name) { - return { type: Column, name }; -} - -function astColumnIndex(index) { - return { type: Column, index }; -} - -function astExprObject(obj, opt) { - if (isString(obj)) { - return astParse(obj, opt); - } - - if (obj.expr) { - let ast; - if (obj.field === true) { - ast = astColumn(obj.expr); - } else if (obj.func === true) { - ast = astExprObject(obj.expr, opt); - } - if (ast) { - if (obj.desc) { - ast = { type: Descending, expr: ast }; - } - if (obj.window) { - ast = { type: Window, expr: ast, ...obj.window }; - } - return ast; - } - } - - return Object.keys(obj) - .map(key => ({ - ...astExprObject(obj[key], opt), - as: key - })); -} - -function astSelection(sel) { - const type = Selection; - return sel.all ? { type, operator: 'all' } - : sel.not ? { type, operator: 'not', arguments: astExprList(sel.not) } - : sel.range ? { type, operator: 'range', arguments: astExprList(sel.range) } - : sel.matches ? 
{ type, operator: 'matches', arguments: sel.matches } - : error('Invalid input'); -} - -function astSelectionList(arr) { - return toArray(arr).map(astSelectionItem).flat(); -} - -function astSelectionItem(val) { - return isSelection(val) ? astSelection(val) - : isNumber(val) ? astColumnIndex(val) - : isString(val) ? astColumn(val) - : isObject(val) ? Object.keys(val) - .map(name => ({ type: Column, name, as: val[name] })) - : error('Invalid input'); -} - -function astExpr(val) { - return isSelection(val) ? astSelection(val) - : isNumber(val) ? astColumnIndex(val) - : isString(val) ? astColumn(val) - : isObject(val) ? astExprObject(val) - : error('Invalid input'); -} - -function astExprList(arr) { - return toArray(arr).map(astExpr).flat(); -} - -function astExprNumber(val) { - return isNumber(val) ? val : astExprObject(val); -} - -function astJoinKeys(val) { - return isArray(val) - ? val.map(astExprList) - : astExprObject(val, { join: true }); -} - -function astJoinValues(val) { - return isArray(val) - ? val.map((v, i) => i < 2 - ? astExprList(v) - : astExprObject(v, { join: true }) - ) - : astExprObject(val, { join: true }); -} - -function astTableRef(value) { - return value && isFunction(value.toAST) - ? 
value.toAST() - : value; -} \ No newline at end of file diff --git a/src/query/util.js b/src/query/util.js deleted file mode 100644 index 878c4967..00000000 --- a/src/query/util.js +++ /dev/null @@ -1,130 +0,0 @@ -import desc from '../helpers/desc'; -import field from '../helpers/field'; -import rolling from '../helpers/rolling'; -import { all, matches, not, range } from '../helpers/selection'; -import Query from './query'; -import error from '../util/error'; -import isArray from '../util/is-array'; -import isFunction from '../util/is-function'; -import isNumber from '../util/is-number'; -import isObject from '../util/is-object'; -import isString from '../util/is-string'; -import map from '../util/map-object'; -import toArray from '../util/to-array'; - -function func(expr) { - const f = d => d; - f.toString = () => expr; - return f; -} - -export function getTable(catalog, ref) { - ref = ref && isFunction(ref.query) ? ref.query() : ref; - return ref && isFunction(ref.evaluate) - ? ref.evaluate(null, catalog) - : catalog(ref); -} - -export function isSelection(value) { - return isObject(value) && ( - isArray(value.all) || - isArray(value.matches) || - isArray(value.not) || - isArray(value.range) - ); -} - -export function toObject(value) { - return value && isFunction(value.toObject) ? value.toObject() - : isFunction(value) ? { expr: String(value), func: true } - : isArray(value) ? value.map(toObject) - : isObject(value) ? map(value, _ => toObject(_)) - : value; -} - -export function fromObject(value) { - return isArray(value) ? value.map(fromObject) - : !isObject(value) ? value - : isArray(value.verbs) ? Query.from(value) - : isArray(value.all) ? all() - : isArray(value.range) ? range(...value.range) - : isArray(value.match) ? matches(RegExp(...value.match)) - : isArray(value.not) ? 
not(value.not.map(toObject)) - : fromExprObject(value); -} - -function fromExprObject(value) { - let output = value; - let expr = value.expr; - - if (expr != null) { - if (value.field === true) { - output = expr = field(expr); - } else if (value.func === true) { - output = expr = func(expr); - } - - if (isObject(value.window)) { - const { frame, peers } = value.window; - output = expr = rolling(expr, frame, peers); - } - - if (value.desc === true) { - output = desc(expr); - } - } - - return value === output - ? map(value, _ => fromObject(_)) - : output; -} - -export function joinKeys(keys) { - return isArray(keys) ? keys.map(parseJoinKeys) - : keys; -} - -function parseJoinKeys(keys) { - const list = []; - - toArray(keys).forEach(param => { - isNumber(param) ? list.push(param) - : isString(param) ? list.push(field(param, null)) - : isObject(param) && param.expr ? list.push(param) - : isFunction(param) ? list.push(param) - : error(`Invalid key value: ${param+''}`); - }); - - return list; -} - -export function joinValues(values) { - return isArray(values) - ? values.map(parseJoinValues) - : values; -} - -function parseJoinValues(values, index) { - return index < 2 ? toArray(values) : values; -} - -export function orderbyKeys(keys) { - const list = []; - - keys.forEach(param => { - const expr = param.expr != null ? param.expr : param; - if (isObject(expr) && !isFunction(expr)) { - for (const key in expr) { - list.push(expr[key]); - } - } else { - param = isNumber(expr) ? expr - : isString(expr) ? field(param) - : isFunction(expr) ? 
param - : error(`Invalid orderby field: ${param+''}`); - list.push(param); - } - }); - - return list; -} \ No newline at end of file diff --git a/src/query/verb.js b/src/query/verb.js deleted file mode 100644 index e2321401..00000000 --- a/src/query/verb.js +++ /dev/null @@ -1,248 +0,0 @@ -import { Verb as VerbType } from './constants'; - -import { - fromObject, - getTable, - joinKeys, - joinValues, - orderbyKeys, - toObject -} from './util'; - -import { - Expr, - ExprList, - ExprNumber, - ExprObject, - JoinKeys, - JoinValues, - Options, - OrderbyKeys, - SelectionList, - TableRef, - TableRefList -} from './constants'; - -import toAST from './to-ast'; - -/** - * Model an Arquero verb as a serializable object. - */ -export class Verb { - - /** - * Construct a new verb instance. - * @param {string} verb The verb name. - * @param {object[]} schema Schema describing verb parameters. - * @param {any[]} params Array of parameter values. - */ - constructor(verb, schema = [], params = []) { - this.verb = verb; - this.schema = schema; - schema.forEach((s, index) => { - const type = s.type; - const param = params[index]; - const value = type === JoinKeys ? joinKeys(param) - : type === JoinValues ? joinValues(param) - : type === OrderbyKeys ? orderbyKeys(param) - : param; - this[s.name] = value !== undefined ? value : s.default; - }); - } - - /** - * Create new verb instance from the given serialized object. - * @param {object} object A serialized verb representation, such as - * those generated by Verb.toObject. - * @returns {Verb} The instantiated verb. - */ - static from(object) { - const verb = Verbs[object.verb]; - const params = (verb.schema || []) - .map(({ name }) => fromObject(object[name])); - return verb(...params); - } - - /** - * Evaluate this verb against a given table and catalog. - * @param {Table} table The Arquero table to process. 
- * @param {Function} catalog A table lookup function that accepts a table - * name string as input and returns a corresponding Arquero table. - * @returns {Table} The resulting Arquero table. - */ - evaluate(table, catalog) { - const params = this.schema.map(({ name, type }) => { - const value = this[name]; - return type === TableRef ? getTable(catalog, value) - : type === TableRefList ? value.map(t => getTable(catalog, t)) - : value; - }); - return table[this.verb](...params); - } - - /** - * Serialize this verb as a JSON-compatible object. The resulting - * object can be passed to Verb.from to re-instantiate this verb. - * @returns {object} A JSON-compatible object representing this verb. - */ - toObject() { - const obj = { verb: this.verb }; - this.schema.forEach(({ name }) => { - obj[name] = toObject(this[name]); - }); - return obj; - } - - /** - * Serialize this verb to a JSON-compatible abstract syntax tree. - * All table expressions will be parsed and represented as AST instances - * using a modified form of the Mozilla JavaScript AST format. - * This method can be used to output parsed and serialized representations - * to translate Arquero verbs to alternative data processing platforms. - * @returns {object} A JSON-compatible abstract syntax tree object. - */ - toAST() { - const obj = { type: VerbType, verb: this.verb }; - this.schema.forEach(({ name, type, props }) => { - obj[name] = toAST(this[name], type, props); - }); - return obj; - } -} - -/** - * Verb parameter type. - * @typedef {Expr|ExprList|ExprNumber|ExprObject|JoinKeys|JoinValues|Options|OrderbyKeys|SelectionList|TableRef|TableRefList} ParamType - */ - -/** - * Verb parameter schema. - * @typedef {object} ParamDef - * @property {string} name The name of the parameter. - * @property {ParamType} type The type of the parameter. - * @property {{ [key: string]: ParamType }} [props] Types for non-literal properties. - */ - -/** - * Create a new constructors. 
- * @param {string} name The name of the verb. - * @param {ParamDef[]} schema The verb parameter schema. - * @return {Function} A verb constructor function. - */ -export function createVerb(name, schema) { - return Object.assign( - (...params) => new Verb(name, schema, params), - { schema } - ); -} - -/** - * A lookup table of verb classes. - */ -export const Verbs = { - count: createVerb('count', [ - { name: 'options', type: Options } - ]), - derive: createVerb('derive', [ - { name: 'values', type: ExprObject }, - { name: 'options', type: Options, - props: { before: SelectionList, after: SelectionList } - } - ]), - filter: createVerb('filter', [ - { name: 'criteria', type: ExprObject } - ]), - groupby: createVerb('groupby', [ - { name: 'keys', type: ExprList } - ]), - orderby: createVerb('orderby', [ - { name: 'keys', type: OrderbyKeys } - ]), - relocate: createVerb('relocate', [ - { name: 'columns', type: SelectionList }, - { name: 'options', type: Options, - props: { before: SelectionList, after: SelectionList } - } - ]), - rename: createVerb('rename', [ - { name: 'columns', type: SelectionList } - ]), - rollup: createVerb('rollup', [ - { name: 'values', type: ExprObject } - ]), - sample: createVerb('sample', [ - { name: 'size', type: ExprNumber }, - { name: 'options', type: Options, props: { weight: Expr } } - ]), - select: createVerb('select', [ - { name: 'columns', type: SelectionList } - ]), - ungroup: createVerb('ungroup'), - unorder: createVerb('unorder'), - reify: createVerb('reify'), - dedupe: createVerb('dedupe', [ - { name: 'keys', type: ExprList, default: [] } - ]), - impute: createVerb('impute', [ - { name: 'values', type: ExprObject }, - { name: 'options', type: Options, props: { expand: ExprList } } - ]), - fold: createVerb('fold', [ - { name: 'values', type: ExprList }, - { name: 'options', type: Options } - ]), - pivot: createVerb('pivot', [ - { name: 'keys', type: ExprList }, - { name: 'values', type: ExprList }, - { name: 'options', type: 
Options } - ]), - spread: createVerb('spread', [ - { name: 'values', type: ExprList }, - { name: 'options', type: Options } - ]), - unroll: createVerb('unroll', [ - { name: 'values', type: ExprList }, - { name: 'options', type: Options, props: { drop: ExprList } } - ]), - lookup: createVerb('lookup', [ - { name: 'table', type: TableRef }, - { name: 'on', type: JoinKeys }, - { name: 'values', type: ExprList } - ]), - join: createVerb('join', [ - { name: 'table', type: TableRef }, - { name: 'on', type: JoinKeys }, - { name: 'values', type: JoinValues }, - { name: 'options', type: Options } - ]), - cross: createVerb('cross', [ - { name: 'table', type: TableRef }, - { name: 'values', type: JoinValues }, - { name: 'options', type: Options } - ]), - semijoin: createVerb('semijoin', [ - { name: 'table', type: TableRef }, - { name: 'on', type: JoinKeys } - ]), - antijoin: createVerb('antijoin', [ - { name: 'table', type: TableRef }, - { name: 'on', type: JoinKeys } - ]), - concat: createVerb('concat', [ - { name: 'tables', type: TableRefList } - ]), - union: createVerb('union', [ - { name: 'tables', type: TableRefList } - ]), - intersect: createVerb('intersect', [ - { name: 'tables', type: TableRefList } - ]), - except: createVerb('except', [ - { name: 'tables', type: TableRefList } - ]) -}; - -/** - * Abstract class representing a data table. 
- * @typedef {import('../table/table').default} Table - */ diff --git a/src/register.js b/src/register.js deleted file mode 100644 index 1c6f725c..00000000 --- a/src/register.js +++ /dev/null @@ -1,260 +0,0 @@ -import ColumnTable from './table/column-table'; -import aggregateFunctions from './op/aggregate-functions'; -import windowFunctions from './op/window-functions'; -import functions from './op/functions'; -import op from './op/op'; -import ops from './op/op-api'; -import Query, { addQueryVerb } from './query/query'; -import { Verbs, createVerb } from './query/verb'; -import { ROW_OBJECT } from './expression/row-object'; -import error from './util/error'; -import has from './util/has'; -import toString from './util/to-string'; - -const onIllegal = (name, type) => - error(`Illegal ${type} name: ${toString(name)}`); - -const onDefined = (name, type) => - error(`The ${type} ${toString(name)} is already defined. Use override option?`); - -const onReserve = (name, type) => - error(`The ${type} name ${toString(name)} is reserved and can not be overridden.`); - -function check(name, options, obj = ops, type = 'function') { - if (!name) onIllegal(name, type); - if (!options.override && has(obj, name)) onDefined(name, type); -} - -// -- Op Functions -------------------------------------------------- - -function verifyFunction(name, def, object, options) { - return object[name] === def || check(name, options); -} - -/** - * Register an aggregate or window operation. - * @param {string} name The name of the operation - * @param {AggregateDef|WindowDef} def The operation definition. - * @param {object} object The registry object to add the definition to. - * @param {RegisterOptions} [options] Registration options. 
- */ -function addOp(name, def, object, options = {}) { - if (verifyFunction(name, def, object, options)) return; - const [nf = 0, np = 0] = def.param; - object[name] = def; - ops[name] = (...params) => op( - name, - params.slice(0, nf), - params.slice(nf, nf + np) - ); -} - -/** - * Register a custom aggregate function. - * @param {string} name The name to use for the aggregate function. - * @param {AggregateDef} def The aggregate operator definition. - * @param {RegisterOptions} [options] Function registration options. - * @throws If a function with the same name is already registered and - * the override option is not specified. - */ -export function addAggregateFunction(name, def, options) { - addOp(name, def, aggregateFunctions, options); -} - -/** - * Register a custom window function. - * @param {string} name The name to use for the window function. - * @param {WindowDef} def The window operator definition. - * @param {RegisterOptions} [options] Function registration options. - * @throws If a function with the same name is already registered and - * the override option is not specified. - */ -export function addWindowFunction(name, def, options) { - addOp(name, def, windowFunctions, options); -} - -/** - * Register a function for use within table expressions. - * If only a single argument is provided, it will be assumed to be a - * function and the system will try to extract its name. - * @param {string} name The name to use for the function. - * @param {Function} fn A standard JavaScript function. - * @param {RegisterOptions} [options] Function registration options. - * @throws If a function with the same name is already registered and - * the override option is not specified, or if no name is provided - * and the input function is anonymous. 
- */ -export function addFunction(name, fn, options = {}) { - if (arguments.length === 1) { - fn = name; - name = fn.name; - if (name === '' || name === 'anonymous') { - error('Anonymous function provided, please include a name argument.'); - } else if (name === ROW_OBJECT) { - onReserve(ROW_OBJECT, 'function'); - } - } - if (verifyFunction(name, fn, functions, options)) return; - functions[name] = fn; - ops[name] = fn; -} - -// -- Table Methods and Verbs --------------------------------------- - -const proto = ColumnTable.prototype; - -/** - * Reserved table/query methods that must not be overwritten. - */ -let RESERVED; - -function addReserved(obj) { - for (; obj; obj = Object.getPrototypeOf(obj)) { - Object.getOwnPropertyNames(obj).forEach(name => RESERVED[name] = 1); - } -} - -function verifyTableMethod(name, fn, options) { - const type = 'method'; - - // exit early if duplicate re-assignment - if (proto[name] && proto[name].fn === fn) return true; - - // initialize reserved properties to avoid overriding internals - if (!RESERVED) { - RESERVED = {}; - addReserved(proto); - addReserved(Query.prototype); - } - - // perform name checks - if (RESERVED[name]) onReserve(name, type); - if ((name + '')[0] === '_') onIllegal(name, type); - check(name, options, proto, type); -} - -/** - * Register a new table method. A new method will be added to the column - * table prototype. When invoked from a table, the registered method will - * be invoked with the table as the first argument, followed by all the - * provided arguments. - * @param {string} name The name of the table method. - * @param {Function} method The table method. - * @param {RegisterOptions} options - */ -export function addTableMethod(name, method, options = {}) { - if (verifyTableMethod(name, method, options)) return; - proto[name] = function(...args) { return method(this, ...args); }; - proto[name].fn = method; -} - -/** - * Register a new transformation verb. 
- * @param {string} name The name of the verb. - * @param {Function} method The verb implementation. - * @param {ParamDef[]} params The verb parameter schema. - * @param {RegisterOptions} options Function registration options. - */ -export function addVerb(name, method, params, options = {}) { - // register table method first - // if that doesn't throw, add serializable verb entry - addTableMethod(name, method, options); - addQueryVerb(name, Verbs[name] = createVerb(name, params)); -} - -// -- Package Bundles ----------------------------------------------- - -const PACKAGE = 'arquero_package'; - -/** - * Add an extension package of functions, table methods, and/or verbs. - * @param {Package|PackageBundle} bundle The package of extensions. - * @throws If package validation fails. - */ -export function addPackage(bundle, options = {}) { - const pkg = bundle && bundle[PACKAGE] || bundle; - const parts = { - functions: [ - (name, def, opt) => verifyFunction(name, def, functions, opt), - addFunction - ], - aggregateFunctions: [ - (name, def, opt) => verifyFunction(name, def, aggregateFunctions, opt), - addAggregateFunction - ], - windowFunctions: [ - (name, def, opt) => verifyFunction(name, def, windowFunctions, opt), - addWindowFunction - ], - tableMethods: [ - verifyTableMethod, - addTableMethod - ], - verbs: [ - (name, obj, opt) => verifyTableMethod(name, obj.method, opt), - (name, obj, opt) => addVerb(name, obj.method, obj.params, opt) - ] - }; - - function scan(index) { - for (const key in parts) { - const part = parts[key]; - const p = pkg[key]; - for (const name in p) part[index](name, p[name], options); - } - } - scan(0); // first validate package, throw if validation fails - scan(1); // then add package content -} - -/** - * Aggregate function definition. - * @typedef {import('./op/aggregate-functions').AggregateDef} AggregateDef - */ - -/** - * Window function definition. 
- * @typedef {import('./op/window-functions').WindowDef} WindowDef - */ - -/** - * Verb parameter definition. - * @typedef {import('./query/verb').ParamDef} ParamDef - */ - -/** - * Verb definition. - * @typedef {object} VerbDef - * @property {Function} method A function implementing the verb. - * @property {ParamDef[]} params The verb parameter schema. - */ - -/** - * Verb parameter definition. - * @typedef {object} ParamDef - * @property {string} name The verb parameter name. - * @property {ParamType} type The verb parameter type. - */ - -/** - * A package of op function and table method definitions. - * @typedef {object} Package - * @property {{[name: string]: Function}} [functions] Standard function entries. - * @property {{[name: string]: AggregateDef}} [aggregateFunctions] Aggregate function entries. - * @property {{[name: string]: WindowDef}} [windowFunctions] Window function entries. - * @property {{[name: string]: Function}} [tableMethods] Table method entries. - * @property {{[name: string]: VerbDef}} [verbs] Verb entries. - */ - -/** - * An object containing an extension package. - * @typedef {object} PackageBundle - * @property {Package} arquero.package The package bundle. - */ - -/** - * Options for registering new functions. - * @typedef {object} RegisterOptions - * @property {boolean} [override=false] Flag indicating if the added - * function can override an existing function with the same name. - */ \ No newline at end of file diff --git a/src/table/bit-set.js b/src/table/BitSet.js similarity index 99% rename from src/table/bit-set.js rename to src/table/BitSet.js index 71d6823e..7cf546f7 100644 --- a/src/table/bit-set.js +++ b/src/table/BitSet.js @@ -4,7 +4,7 @@ const ALL = 0xFFFFFFFF; /** * Represent an indexable set of bits. */ -export default class BitSet { +export class BitSet { /** * Instantiate a new BitSet instance. * @param {number} size The number of bits. 
@@ -162,4 +162,4 @@ export default class BitSet { } return this; } -} \ No newline at end of file +} diff --git a/src/table/ColumnSet.js b/src/table/ColumnSet.js new file mode 100644 index 00000000..1b1d49c2 --- /dev/null +++ b/src/table/ColumnSet.js @@ -0,0 +1,83 @@ +import has from '../util/has.js'; + +/** + * Return a new column set instance. + * @param {import('./Table.js').Table} [table] A base table whose columns + * should populate the returned set. If unspecified, create an empty set. + * @return {ColumnSet} The column set. + */ +export function columnSet(table) { + return table + ? new ColumnSet({ ...table.data() }, table.columnNames()) + : new ColumnSet(); +} + +/** An editable collection of named columns. */ +export class ColumnSet { + /** + * Create a new column set instance. + * @param {import('./types.js').ColumnData} [data] Initial column data. + * @param {string[]} [names] Initial column names. + */ + constructor(data, names) { + this.data = data || {}; + this.names = names || []; + } + + /** + * Add a new column to this set and return the column values. + * @template {import('./types.js').ColumnType} T + * @param {string} name The column name. + * @param {T} values The column values. + * @return {T} The provided column values. + */ + add(name, values) { + if (!this.has(name)) this.names.push(name + ''); + return this.data[name] = values; + } + + /** + * Test if this column set has a column with the given name. + * @param {string} name A column name + * @return {boolean} True if this set contains a column with the given name, + * false otherwise. + */ + has(name) { + return has(this.data, name); + } + + /** + * Add a groupby specification to this column set. + * @param {import('./types.js').GroupBySpec} groups A groupby specification. + * @return {this} This column set. 
+ */ + groupby(groups) { + this.groups = groups; + return this; + } + + /** + * Create a new table with the contents of this column set, using the same + * type as a given prototype table. The new table does not inherit the + * filter, groupby, or orderby state of the prototype. + * @template {import('./Table.js').Table} T + * @param {T} proto A prototype table + * @return {T} The new table. + */ + new(proto) { + const { data, names, groups = null } = this; + return proto.create({ data, names, groups, filter: null, order: null }); + } + + /** + * Create a derived table with the contents of this column set, using the same + * type as a given prototype table. The new table will inherit the filter, + * groupby, and orderby state of the prototype. + * @template {import('./Table.js').Table} T + * @param {T} proto A prototype table + * @return {T} The new table. + */ + derive(proto) { + return proto.create(this); + } +} diff --git a/src/table/ColumnTable.js b/src/table/ColumnTable.js new file mode 100644 index 00000000..43320650 --- /dev/null +++ b/src/table/ColumnTable.js @@ -0,0 +1,848 @@ +import { Table } from './Table.js'; +import { + antijoin, + assign, + concat, + cross, + dedupe, + derive, + except, + filter, + fold, + groupby, + impute, + intersect, + join, + lookup, + orderby, + pivot, + reduce, + relocate, + rename, + rollup, + sample, + select, + semijoin, + slice, + spread, + ungroup, + union, + unorder, + unroll +} from '../verbs/index.js'; +import { count } from '../op/op-api.js'; +import toArrow from '../arrow/to-arrow.js'; +import toArrowIPC from '../arrow/to-arrow-ipc.js'; +import toCSV from '../format/to-csv.js'; +import toHTML from '../format/to-html.js'; +import toJSON from '../format/to-json.js'; +import toMarkdown from '../format/to-markdown.js'; +import toArray from '../util/to-array.js'; + +/** + * A data table with transformation verbs. 
+ */ +export class ColumnTable extends Table { + /** + * Create a new table with additional columns drawn from one or more input + * tables. All tables must have the same number of rows and are reified + * prior to assignment. In the case of repeated column names, input table + * columns overwrite existing columns. + * @param {...(Table|import('./types.js').ColumnData)} tables + * The tables to merge with this table. + * @return {this} A new table with merged columns. + * @example table.assign(table1, table2) + */ + assign(...tables) { + return assign(this, ...tables); + } + + /** + * Count the number of values in a group. This method is a shorthand + * for *rollup* with a count aggregate function. + * @param {import('./types.js').CountOptions} [options] + * Options for the count. + * @return {this} A new table with groupby and count columns. + * @example table.groupby('colA').count() + * @example table.groupby('colA').count({ as: 'num' }) + */ + count(options = {}) { + const { as = 'count' } = options; + return rollup(this, { [as]: count() }); + } + + /** + * Derive new column values based on the provided expressions. By default, + * new columns are added after (higher indices than) existing columns. Use + * the before or after options to place new columns elsewhere. + * @param {import('./types.js').ExprObject} values + * Object of name-value pairs defining the columns to derive. The input + * object should have output column names for keys and table expressions + * for values. + * @param {import('./types.js').DeriveOptions} [options] + * Options for dropping or relocating derived columns. Use either a before + * or after property to indicate where to place derived columns. Specifying + * both before and after is an error. Unlike the *relocate* verb, this + * option affects only new columns; updated columns with existing names + * are excluded from relocation. + * @return {this} A new table with derived columns added. 
+ * @example table.derive({ sumXY: d => d.x + d.y }) + * @example table.derive({ z: d => d.x * d.y }, { before: 'x' }) + */ + derive(values, options) { + return derive(this, values, options); + } + + /** + * Filter a table to a subset of rows based on the input criteria. + * The resulting table provides a filtered view over the original data; no + * data copy is made. To create a table that copies only filtered data to + * new data structures, call *reify* on the output table. + * @param {import('./types.js').TableExpr} criteria + * Filter criteria as a table expression. Both aggregate and window + * functions are permitted, taking into account *groupby* or *orderby* + * settings. + * @return {this} A new table with filtered rows. + * @example table.filter(d => abs(d.value) < 5) + */ + filter(criteria) { + return filter(this, criteria); + } + + /** + * Extract rows with indices from start to end (end not included), where + * start and end represent per-group ordered row numbers in the table. + * @param {number} [start] Zero-based index at which to start extraction. + * A negative index indicates an offset from the end of the group. + * If start is undefined, slice starts from the index 0. + * @param {number} [end] Zero-based index before which to end extraction. + * A negative index indicates an offset from the end of the group. + * If end is omitted, slice extracts through the end of the group. + * @return {this} A new table with sliced rows. + * @example table.slice(1, -1) + */ + slice(start, end) { + return slice(this, start, end); + } + + /** + * Group table rows based on a set of column values. + * Subsequent operations that are sensitive to grouping (such as + * aggregate functions) will operate over the grouped rows. + * To undo grouping, use *ungroup*. + * @param {...import('./types.js').ExprList} keys + * Key column values to group by. 
The keys may be specified using column + * name strings, column index numbers, value objects with output column + * names for keys and table expressions for values, or selection helper + * functions. + * @return {this} A new table with grouped rows. + * @example table.groupby('colA', 'colB') + * @example table.groupby({ key: d => d.colA + d.colB }) + */ + groupby(...keys) { + return groupby(this, ...keys); + } + + /** + * Order table rows based on a set of column values. Subsequent operations + * sensitive to ordering (such as window functions) will operate over sorted + * values. The resulting table provides a view over the original data, + * without any copying. To create a table with sorted data copied to new + * data structures, call *reify* on the result of this method. To undo + * ordering, use *unorder*. + * @param {...import('./types.js').OrderKeys} keys + * Key values to sort by, in precedence order. + * By default, sorting is done in ascending order. + * To sort in descending order, wrap values using *desc*. + * If a string, order by the column with that name. + * If a number, order by the column with that index. + * If a function, must be a valid table expression; aggregate functions + * are permitted, but window functions are not. + * If an object, object values must be valid values parameters + * with output column names for keys and table expressions + * for values (the output names will be ignored). + * If an array, array values must be valid key parameters. + * @return {this} A new ordered table. + * @example table.orderby('a', desc('b')) + * @example table.orderby({ a: 'a', b: desc('b') }) + * @example table.orderby(desc(d => d.a)) + */ + orderby(...keys) { + return orderby(this, ...keys); + } + + /** + * Relocate a subset of columns to change their positions, also + * potentially renaming them. + * @param {import('./types.js').Select} columns + * An ordered selection of columns to relocate. 
+ * The input may consist of column name strings, column integer indices, + * rename objects with current column names as keys and new column names + * as values, or functions that take a table as input and return a valid + * selection parameter (typically the output of selection helper functions + * such as *all*, *not*, or *range*). + * @param {import('./types.js').RelocateOptions} options + * Options for relocating. Must include either the before or after property + * to indicate where to place the relocated columns. Specifying both before + * and after is an error. + * @return {this} A new table with relocated columns. + * @example table.relocate(['colY', 'colZ'], { after: 'colX' }) + * @example table.relocate(not('colB', 'colC'), { before: 'colA' }) + * @example table.relocate({ colA: 'newA', colB: 'newB' }, { after: 'colC' }) + */ + relocate(columns, options) { + return relocate(this, toArray(columns), options); + } + + /** + * Rename one or more columns, preserving column order. + * @param {...import('./types.js').Select} columns + * One or more rename objects with current column names as keys and new + * column names as values. + * @return {this} A new table with renamed columns. + * @example table.rename({ oldName: 'newName' }) + * @example table.rename({ a: 'a2', b: 'b2' }) + */ + rename(...columns) { + return rename(this, ...columns); + } + + /** + * Reduce a table, processing all rows to produce a new table. + * To produce standard aggregate summaries, use the rollup verb. + * This method allows the use of custom reducer implementations, + * for example to produce multiple rows for an aggregate. + * @param {import('../verbs/reduce/reducer.js').default} reducer + * The reducer to apply. + * @return {this} A new table of reducer outputs. + */ + reduce(reducer) { + return reduce(this, reducer); + } + + /** + * Rollup a table to produce an aggregate summary. + * Often used in conjunction with *groupby*. + * To produce counts only, *count* is a shortcut. 
+ * @param {import('./types.js').ExprObject} [values] + * Object of name-value pairs defining aggregate output columns. The input + * object should have output column names for keys and table expressions + * for values. The expressions must be valid aggregate expressions: window + * functions are not allowed and column references must be arguments to + * aggregate functions. + * @return {this} A new table of aggregate summary values. + * @example table.groupby('colA').rollup({ mean: d => mean(d.colB) }) + * @example table.groupby('colA').rollup({ mean: op.median('colB') }) + */ + rollup(values) { + return rollup(this, values); + } + + /** + * Generate a table from a random sample of rows. + * If the table is grouped, performs a stratified sample by + * sampling from each group separately. + * @param {number | import('./types.js').TableExpr} size + * The number of samples to draw per group. + * If number-valued, the same sample size is used for each group. + * If function-valued, the input should be an aggregate table + * expression compatible with *rollup*. + * @param {import('./types.js').SampleOptions} [options] + * Options for sampling. + * @return {this} A new table with sampled rows. + * @example table.sample(50) + * @example table.sample(100, { replace: true }) + * @example table.groupby('colA').sample(() => op.floor(0.5 * op.count())) + */ + sample(size, options) { + return sample(this, size, options); + } + + /** + * Select a subset of columns into a new table, potentially renaming them. + * @param {...import('./types.js').Select} columns + * An ordered selection of columns. + * The input may consist of column name strings, column integer indices, + * rename objects with current column names as keys and new column names + * as values, or functions that take a table as input and returns a valid + * selection parameter (typically the output of selection helper functions + * such as *all*, *not*, or *range*.). 
+ * @return {this} A new table of selected columns. + * @example table.select('colA', 'colB') + * @example table.select(not('colB', 'colC')) + * @example table.select({ colA: 'newA', colB: 'newB' }) + */ + select(...columns) { + return select(this, ...columns); + } + + /** + * Ungroup a table, removing any grouping criteria. + * Undoes the effects of *groupby*. + * @return {this} A new ungrouped table, or this table if not grouped. + * @example table.ungroup() + */ + ungroup() { + return ungroup(this); + } + + /** + * Unorder a table, removing any sorting criteria. + * Undoes the effects of *orderby*. + * @return {this} A new unordered table, or this table if not ordered. + * @example table.unorder() + */ + unorder() { + return unorder(this); + } + + // -- Cleaning Verbs ------------------------------------------------------ + + /** + * De-duplicate table rows by removing repeated row values. + * @param {...import('./types.js').ExprList} keys + * Key columns to check for duplicates. + * Two rows are considered duplicates if they have matching values for + * all keys. If keys are unspecified, all columns are used. + * The keys may be specified using column name strings, column index + * numbers, value objects with output column names for keys and table + * expressions for values, or selection helper functions. + * @return {this} A new de-duplicated table. + * @example table.dedupe() + * @example table.dedupe('a', 'b') + * @example table.dedupe({ abs: d => op.abs(d.a) }) + */ + dedupe(...keys) { + return dedupe(this, ...keys); + } + + /** + * Impute missing values or rows. Accepts a set of column-expression pairs + * and evaluates the expressions to replace any missing (null, undefined, + * or NaN) values in the original column. + * If the expand option is specified, imputes new rows for missing + * combinations of values. All combinations of key values (a full cross + * product) are considered for each level of grouping (specified by + * *groupby*). 
New rows will be added for any combination + * of key and groupby values not already contained in the table. For all + * non-key and non-group columns the new rows are populated with imputation + * values (first argument) if specified, otherwise undefined. + * If the expand option is specified, any filter or orderby settings are + * removed from the output table, but groupby settings persist. + * @param {import('./types.js').ExprObject} values + * Object of name-value pairs for the column values to impute. The input + * object should have existing column names for keys and table expressions + * for values. The expressions will be evaluated to determine replacements + * for any missing values. + * @param {import('./types.js').ImputeOptions} [options] Imputation options. + * The expand property specifies a set of column values to consider for + * imputing missing rows. All combinations of expanded values are + * considered, and new rows are added for each combination that does not + * appear in the input table. + * @return {this} A new table with imputed values and/or rows. + * @example table.impute({ v: () => 0 }) + * @example table.impute({ v: d => op.mean(d.v) }) + * @example table.impute({ v: () => 0 }, { expand: ['x', 'y'] }) + */ + impute(values, options) { + return impute(this, values, options); + } + + // -- Reshaping Verbs ----------------------------------------------------- + + /** + * Fold one or more columns into two key-value pair columns. + * The fold transform is an inverse of the *pivot* transform. + * The resulting table has two new columns, one containing the column + * names (named "key") and the other the column values (named "value"). + * The number of output rows equals the original row count multiplied + * by the number of folded columns. + * @param {import('./types.js').ExprList} values The columns to fold. 
+ * The columns may be specified using column name strings, column index + * numbers, value objects with output column names for keys and table + * expressions for values, or selection helper functions. + * @param {import('./types.js').FoldOptions} [options] Options for folding. + * @return {this} A new folded table. + * @example table.fold('colA') + * @example table.fold(['colA', 'colB']) + * @example table.fold(range(5, 8)) + */ + fold(values, options) { + return fold(this, values, options); + } + + /** + * Pivot columns into a cross-tabulation. + * The pivot transform is an inverse of the *fold* transform. + * The resulting table has new columns for each unique combination + * of the provided *keys*, populated with the provided *values*. + * The provided *values* must be aggregates, as a single set of keys may + * include more than one row. If string-valued, the *any* aggregate is used. + * If only one *values* column is defined, the new pivoted columns will + * be named using key values directly. Otherwise, input value column names + * will be included as a component of the output column names. + * @param {import('./types.js').ExprList} keys + * Key values to map to new column names. The keys may be specified using + * column name strings, column index numbers, value objects with output + * column names for keys and table expressions for values, or selection + * helper functions. + * @param {import('./types.js').ExprList} values Output values for pivoted + * columns. Column references will be wrapped in an *any* aggregate. If + * object-valued, the input object should have output value names for keys + * and aggregate table expressions for values. + * @param {import('./types.js').PivotOptions} [options] + * Options for pivoting. + * @return {this} A new pivoted table. 
+ * @example table.pivot('key', 'value') + * @example table.pivot(['keyA', 'keyB'], ['valueA', 'valueB']) + * @example table.pivot({ key: d => d.key }, { value: d => op.sum(d.value) }) + */ + pivot(keys, values, options) { + return pivot(this, keys, values, options); + } + + /** + * Spread array elements into a set of new columns. + * Output columns are named based on the value key and array index. + * @param {import('./types.js').ExprList} values + * The column values to spread. The values may be specified using column + * name strings, column index numbers, value objects with output column + * names for keys and table expressions for values, or selection helper + * functions. + * @param {import('./types.js').SpreadOptions } [options] + * Options for spreading. + * @return {this} A new table with the spread columns added. + * @example table.spread({ a: d => op.split(d.text, '') }) + * @example table.spread('arrayCol', { limit: 100 }) + */ + spread(values, options) { + return spread(this, values, options); + } + + /** + * Unroll one or more array-valued columns into new rows. + * If more than one array value is used, the number of new rows + * is the smaller of the limit and the largest length. + * Values for all other columns are copied over. + * @param {import('./types.js').ExprList} values + * The column values to unroll. The values may be specified using column + * name strings, column index numbers, value objects with output column + * names for keys and table expressions for values, or selection helper + * functions. + * @param {import('./types.js').UnrollOptions} [options] + * Options for unrolling. + * @return {this} A new unrolled table. + * @example table.unroll('colA', { limit: 1000 }) + */ + unroll(values, options) { + return unroll(this, values, options); + } + + // -- Joins --------------------------------------------------------------- + + /** + * Lookup values from a secondary table and add them as new columns. 
+ * A lookup occurs upon matching key values for rows in both tables. + * If the secondary table has multiple rows with the same key, only + * the last observed instance will be considered in the lookup. + * Lookup is similar to *join_left*, but with a simpler + * syntax and the added constraint of allowing at most one match only. + * @param {import('./types.js').TableRef} other + * The secondary table to look up values from. + * @param {import('./types.js').JoinKeys} [on] + * Lookup keys (column name strings or table expressions) for this table + * and the secondary table, respectively. + * @param {...import('./types.js').ExprList} values + * The column values to add from the secondary table. Can be column name + * strings or objects with column names as keys and table expressions as + * values. + * @return {this} A new table with lookup values added. + * @example table.lookup(other, ['key1', 'key2'], 'value1', 'value2') + */ + lookup(other, on, ...values) { + return lookup(this, other, on, ...values); + } + + /** + * Join two tables, extending the columns of one table with + * values from the other table. The current table is considered + * the "left" table in the join, and the new table input is + * considered the "right" table in the join. By default an inner + * join is performed, removing all rows that do not match the + * join criteria. To perform left, right, or full outer joins, use + * the *join_left*, *join_right*, or *join_full* methods, or provide + * an options argument. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinPredicate} [on] + * The join criteria for matching table rows. If unspecified, the values of + * all columns with matching names are compared. + * If array-valued, a two-element array should be provided, containing + * the columns to compare for the left and right tables, respectively. 
+ * If a one-element array or a string value is provided, the same + * column names will be drawn from both tables. + * If function-valued, should be a two-table table expression that + * returns a boolean value. When providing a custom predicate, note that + * join key values can be arrays or objects, and that normal join + * semantics do not consider null or undefined values to be equal (that is, + * null !== null). Use the op.equal function to handle these cases. + * @param {import('./types.js').JoinValues} [values] + * The columns to include in the join output. + * If unspecified, all columns from both tables are included; paired + * join keys sharing the same column name are included only once. + * If array-valued, a two element array should be provided, containing + * the columns to include for the left and right tables, respectively. + * Array input may consist of column name strings, objects with output + * names as keys and single-table table expressions as values, or the + * selection helper functions *all*, *not*, or *range*. + * If object-valued, specifies the key-value pairs for each output, + * defined using two-table table expressions. + * @param {import('./types.js').JoinOptions} [options] + * Options for the join. + * @return {this} A new joined table. + * @example table.join(other, ['keyL', 'keyR']) + * @example table.join(other, (a, b) => op.equal(a.keyL, b.keyR)) + */ + join(other, on, values, options) { + return join(this, other, on, values, options); + } + + /** + * Perform a left outer join on two tables. Rows in the left table + * that do not match a row in the right table will be preserved. + * This is a convenience method with fixed options for *join*. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinPredicate} [on] + * The join criteria for matching table rows. + * If unspecified, the values of all columns with matching names + * are compared. 
+ * If array-valued, a two-element array should be provided, containing + * the columns to compare for the left and right tables, respectively. + * If a one-element array or a string value is provided, the same + * column names will be drawn from both tables. + * If function-valued, should be a two-table table expression that + * returns a boolean value. When providing a custom predicate, note that + * join key values can be arrays or objects, and that normal join + * semantics do not consider null or undefined values to be equal (that is, + * null !== null). Use the op.equal function to handle these cases. + * @param {import('./types.js').JoinValues} [values] + * The columns to include in the join output. + * If unspecified, all columns from both tables are included; paired + * join keys sharing the same column name are included only once. + * If array-valued, a two element array should be provided, containing + * the columns to include for the left and right tables, respectively. + * Array input may consist of column name strings, objects with output + * names as keys and single-table table expressions as values, or the + * selection helper functions *all*, *not*, or *range*. + * If object-valued, specifies the key-value pairs for each output, + * defined using two-table table expressions. + * @param {import('./types.js').JoinOptions} [options] + * Options for the join. With this method, any options will be + * overridden with `{left: true, right: false}`. + * @return {this} A new joined table. + * @example table.join_left(other, ['keyL', 'keyR']) + * @example table.join_left(other, (a, b) => op.equal(a.keyL, b.keyR)) + */ + join_left(other, on, values, options) { + const opt = { ...options, left: true, right: false }; + return join(this, other, on, values, opt); + } + + /** + * Perform a right outer join on two tables. Rows in the right table + * that do not match a row in the left table will be preserved.
+ * This is a convenience method with fixed options for *join*. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinPredicate} [on] + * The join criteria for matching table rows. + * If unspecified, the values of all columns with matching names + * are compared. + * If array-valued, a two-element array should be provided, containing + * the columns to compare for the left and right tables, respectively. + * If a one-element array or a string value is provided, the same + * column names will be drawn from both tables. + * If function-valued, should be a two-table table expression that + * returns a boolean value. When providing a custom predicate, note that + * join key values can be arrays or objects, and that normal join + * semantics do not consider null or undefined values to be equal (that is, + * null !== null). Use the op.equal function to handle these cases. + * @param {import('./types.js').JoinValues} [values] + * The columns to include in the join output. + * If unspecified, all columns from both tables are included; paired + * join keys sharing the same column name are included only once. + * If array-valued, a two element array should be provided, containing + * the columns to include for the left and right tables, respectively. + * Array input may consist of column name strings, objects with output + * names as keys and single-table table expressions as values, or the + * selection helper functions *all*, *not*, or *range*. + * If object-valued, specifies the key-value pairs for each output, + * defined using two-table table expressions. + * @param {import('./types.js').JoinOptions} [options] + * Options for the join. With this method, any options will be overridden + * with `{left: false, right: true}`. + * @return {this} A new joined table. 
+ * @example table.join_right(other, ['keyL', 'keyR']) + * @example table.join_right(other, (a, b) => op.equal(a.keyL, b.keyR)) + */ + join_right(other, on, values, options) { + const opt = { ...options, left: false, right: true }; + return join(this, other, on, values, opt); + } + + /** + * Perform a full outer join on two tables. Rows in either the left or + * right table that do not match a row in the other will be preserved. + * This is a convenience method with fixed options for *join*. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinPredicate} [on] + * The join criteria for matching table rows. + * If unspecified, the values of all columns with matching names + * are compared. + * If array-valued, a two-element array should be provided, containing + * the columns to compare for the left and right tables, respectively. + * If a one-element array or a string value is provided, the same + * column names will be drawn from both tables. + * If function-valued, should be a two-table table expression that + * returns a boolean value. When providing a custom predicate, note that + * join key values can be arrays or objects, and that normal join + * semantics do not consider null or undefined values to be equal (that is, + * null !== null). Use the op.equal function to handle these cases. + * @param {import('./types.js').JoinValues} [values] + * The columns to include in the join output. + * If unspecified, all columns from both tables are included; paired + * join keys sharing the same column name are included only once. + * If array-valued, a two element array should be provided, containing + * the columns to include for the left and right tables, respectively. + * Array input may consist of column name strings, objects with output + * names as keys and single-table table expressions as values, or the + * selection helper functions *all*, *not*, or *range*. 
+ * If object-valued, specifies the key-value pairs for each output, + * defined using two-table table expressions. + * @param {import('./types.js').JoinOptions} [options] + * Options for the join. With this method, any options will be overridden + * with `{left: true, right: true}`. + * @return {this} A new joined table. + * @example table.join_full(other, ['keyL', 'keyR']) + * @example table.join_full(other, (a, b) => op.equal(a.keyL, b.keyR)) + */ + join_full(other, on, values, options) { + const opt = { ...options, left: true, right: true }; + return join(this, other, on, values, opt); + } + + /** + * Produce the Cartesian cross product of two tables. The output table + * has one row for every pair of input table rows. Beware that outputs + * may be quite large, as the number of output rows is the product of + * the input row counts. + * This is a convenience method for *join* in which the + * join criteria is always true. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinValues} [values] + * The columns to include in the output. + * If unspecified, all columns from both tables are included. + * If array-valued, a two element array should be provided, containing + * the columns to include for the left and right tables, respectively. + * Array input may consist of column name strings, objects with output + * names as keys and single-table table expressions as values, or the + * selection helper functions *all*, *not*, or *range*. + * If object-valued, specifies the key-value pairs for each output, + * defined using two-table table expressions. + * @param {import('./types.js').JoinOptions} [options] + * Options for the join. + * @return {this} A new joined table. 
+ * @example table.cross(other) + * @example table.cross(other, [['leftKey', 'leftVal'], ['rightVal']]) + */ + cross(other, values, options) { + return cross(this, other, values, options); + } + + /** + * Perform a semi-join, filtering the left table to only rows that + * match a row in the right table. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinPredicate} [on] + * The join criteria for matching table rows. + * If unspecified, the values of all columns with matching names + * are compared. + * If array-valued, a two-element array should be provided, containing + * the columns to compare for the left and right tables, respectively. + * If a one-element array or a string value is provided, the same + * column names will be drawn from both tables. + * If function-valued, should be a two-table table expression that + * returns a boolean value. When providing a custom predicate, note that + * join key values can be arrays or objects, and that normal join + * semantics do not consider null or undefined values to be equal (that is, + * null !== null). Use the op.equal function to handle these cases. + * @return {this} A new filtered table. + * @example table.semijoin(other) + * @example table.semijoin(other, ['keyL', 'keyR']) + * @example table.semijoin(other, (a, b) => op.equal(a.keyL, b.keyR)) + */ + semijoin(other, on) { + return semijoin(this, other, on); + } + + /** + * Perform an anti-join, filtering the left table to only rows that + * do *not* match a row in the right table. + * @param {import('./types.js').TableRef} other + * The other (right) table to join with. + * @param {import('./types.js').JoinPredicate} [on] + * The join criteria for matching table rows. + * If unspecified, the values of all columns with matching names + * are compared. 
+ * If array-valued, a two-element array should be provided, containing + * the columns to compare for the left and right tables, respectively. + * If a one-element array or a string value is provided, the same + * column names will be drawn from both tables. + * If function-valued, should be a two-table table expression that + * returns a boolean value. When providing a custom predicate, note that + * join key values can be arrays or objects, and that normal join + * semantics do not consider null or undefined values to be equal (that is, + * null !== null). Use the op.equal function to handle these cases. + * @return {this} A new filtered table. + * @example table.antijoin(other) + * @example table.antijoin(other, ['keyL', 'keyR']) + * @example table.antijoin(other, (a, b) => op.equal(a.keyL, b.keyR)) + */ + antijoin(other, on) { + return antijoin(this, other, on); + } + + // -- Set Operations ------------------------------------------------------ + + /** + * Concatenate multiple tables into a single table, preserving all rows. + * This transformation mirrors the UNION_ALL operation in SQL. + * Only named columns in this table are included in the output. + * @param {...import('./types.js').TableRefList} tables + * A list of tables to concatenate. + * @return {this} A new concatenated table. + * @example table.concat(other) + * @example table.concat(other1, other2) + * @example table.concat([other1, other2]) + */ + concat(...tables) { + return concat(this, ...tables); + } + + /** + * Union multiple tables into a single table, deduplicating all rows. + * This transformation mirrors the UNION operation in SQL. It is + * similar to *concat* but suppresses duplicate rows with + * values identical to another row. + * Only named columns in this table are included in the output. + * @param {...import('./types.js').TableRefList} tables + * A list of tables to union. + * @return {this} A new unioned table. 
+ * @example table.union(other) + * @example table.union(other1, other2) + * @example table.union([other1, other2]) + */ + union(...tables) { + return union(this, ...tables); + } + + /** + * Intersect multiple tables, keeping only rows with identical + * values for all columns in all tables, and deduplicating the rows. + * This transformation is similar to a series of *semijoin* + * calls, but additionally suppresses duplicate rows. + * @param {...import('./types.js').TableRefList} tables + * A list of tables to intersect. + * @return {this} A new filtered table. + * @example table.intersect(other) + * @example table.intersect(other1, other2) + * @example table.intersect([other1, other2]) + */ + intersect(...tables) { + return intersect(this, ...tables); + } + + /** + * Compute the set difference with multiple tables, keeping only rows in + * this table whose values do not occur in the other tables. + * This transformation is similar to a series of *antijoin* + * calls, but additionally suppresses duplicate rows. + * @param {...import('./types.js').TableRefList} tables + * A list of tables to difference. + * @return {this} A new filtered table. + * @example table.except(other) + * @example table.except(other1, other2) + * @example table.except([other1, other2]) + */ + except(...tables) { + return except(this, ...tables); + } + + // -- Table Output Formats ------------------------------------------------ + + /** + * Format this table as an Apache Arrow table. + * @param {import('../arrow/types.js').ArrowFormatOptions} [options] + * The Arrow formatting options. + * @return {import('apache-arrow').Table} An Apache Arrow table. + */ + toArrow(options) { + return toArrow(this, options); + } + + /** + * Format this table as binary data in the Apache Arrow IPC format. + * @param {import('../arrow/types.js').ArrowIPCFormatOptions} [options] + * The Arrow IPC formatting options. + * @return {Uint8Array} A new Uint8Array of Arrow-encoded binary data.
+ */ + toArrowIPC(options) { + return toArrowIPC(this, options); + } + + /** + * Format this table as a comma-separated values (CSV) string. Other + * delimiters, such as tabs or pipes ('|'), can be specified using + * the options argument. + * @param {import('../format/to-csv.js').CSVFormatOptions} [options] + * The CSV formatting options. + * @return {string} A delimited value string. + */ + toCSV(options) { + return toCSV(this, options); + } + + /** + * Format this table as an HTML table string. + * @param {import('../format/to-html.js').HTMLFormatOptions} [options] + * The HTML formatting options. + * @return {string} An HTML table string. + */ + toHTML(options) { + return toHTML(this, options); + } + + /** + * Format this table as a JavaScript Object Notation (JSON) string. + * @param {import('../format/to-json.js').JSONFormatOptions} [options] + * The JSON formatting options. + * @return {string} A JSON string. + */ + toJSON(options) { + return toJSON(this, options); + } + + /** + * Format this table as a GitHub-Flavored Markdown table string. + * @param {import('../format/to-markdown.js').MarkdownFormatOptions} [options] + * The Markdown formatting options. + * @return {string} A GitHub-Flavored Markdown table string. + */ + toMarkdown(options) { + return toMarkdown(this, options); + } +} diff --git a/src/table/Table.js b/src/table/Table.js new file mode 100644 index 00000000..734f8e15 --- /dev/null +++ b/src/table/Table.js @@ -0,0 +1,656 @@ +import { nest, regroup, reindex } from './regroup.js'; +import { rowObjectBuilder } from '../expression/row-object.js'; +import resolve, { all } from '../helpers/selection.js'; +import arrayType from '../util/array-type.js'; +import error from '../util/error.js'; +import isArrayType from '../util/is-array-type.js'; +import isNumber from '../util/is-number.js'; +import repeat from '../util/repeat.js'; + +/** + * Base class representing a column-oriented data table. 
+ */ +export class Table { + /** + * Instantiate a Table instance. + * @param {import('./types.js').ColumnData} columns + * An object mapping column names to values. + * @param {string[]} [names] + * An ordered list of column names. + * @param {import('./BitSet.js').BitSet} [filter] + * A filtering BitSet. + * @param {import('./types.js').GroupBySpec} [group] + * A groupby specification. + * @param {import('./types.js').RowComparator} [order] + * A row comparator function. + * @param {import('./types.js').Params} [params] + * An object mapping parameter names to values. + */ + constructor(columns, names, filter, group, order, params) { + const data = Object.freeze({ ...columns }); + names = names?.slice() ?? Object.keys(data); + const nrows = names.length ? data[names[0]].length : 0; + /** + * @private + * @type {readonly string[]} + */ + this._names = Object.freeze(names); + /** + * @private + * @type {import('./types.js').ColumnData} + */ + this._data = data; + /** + * @private + * @type {number} + */ + this._total = nrows; + /** + * @private + * @type {number} + */ + this._nrows = filter?.count() ?? nrows; + /** + * @private + * @type {import('./BitSet.js').BitSet} + */ + this._mask = filter ?? null; + /** + * @private + * @type {import('./types.js').GroupBySpec} + */ + this._group = group ?? null; + /** + * @private + * @type {import('./types.js').RowComparator} + */ + this._order = order ?? null; + /** + * @private + * @type {import('./types.js').Params} + */ + this._params = params; + /** + * @private + * @type {Uint32Array} + */ + this._index = null; + /** + * @private + * @type {number[][] | Uint32Array[]} + */ + this._partitions = null; + } + + /** + * Create a new table with the same type as this table. + * The new table may have different data, filter, grouping, or ordering + * based on the values of the optional configuration argument. If a + * setting is not specified, it is inherited from the current table. 
+ * @param {import('./types.js').CreateOptions} [options] + * Creation options for the new table. + * @return {this} A newly created table. + */ + create({ + data = undefined, + names = undefined, + filter = undefined, + groups = undefined, + order = undefined + } = {}) { + const f = filter !== undefined ? filter : this.mask(); + // @ts-ignore + return new this.constructor( + data || this._data, + names || (!data ? this._names : null), + f, + groups !== undefined ? groups : regroup(this._group, filter && f), + order !== undefined ? order : this._order, + this._params + ); + } + + /** + * Get or set table expression parameter values. + * If called with no arguments, returns the current parameter values + * as an object. Otherwise, adds the provided parameters to this + * table's parameter set and returns the table. Any prior parameters + * with names matching the input parameters are overridden. + * @param {import('./types.js').Params} [values] + * The parameter values. + * @return {this|import('./types.js').Params} + * The current parameter values (if called with no arguments) or this table. + */ + params(values) { + if (arguments.length) { + if (values) { + this._params = { ...this._params, ...values }; + } + return this; + } else { + return this._params; + } + } + + /** + * Provide an informative object string tag. + */ + get [Symbol.toStringTag]() { + if (!this._names) return 'Object'; // bail if called on prototype + const nr = this.numRows(); + const nc = this.numCols(); + const plural = v => v !== 1 ? 's' : ''; + return `Table: ${nc} col${plural(nc)} x ${nr} row${plural(nr)}` + + (this.isFiltered() ? ` (${this.totalRows()} backing)` : '') + + (this.isGrouped() ? `, ${this._group.size} groups` : '') + + (this.isOrdered() ? ', ordered' : ''); + } + + /** + * Indicates if the table has a filter applied. + * @return {boolean} True if filtered, false otherwise. 
+ */ + isFiltered() { + return !!this._mask; + } + + /** + * Indicates if the table has a groupby specification. + * @return {boolean} True if grouped, false otherwise. + */ + isGrouped() { + return !!this._group; + } + + /** + * Indicates if the table has a row order comparator. + * @return {boolean} True if ordered, false otherwise. + */ + isOrdered() { + return !!this._order; + } + + /** + * Get the backing column data for this table. + * @return {import('./types.js').ColumnData} + * Object of named column instances. + */ + data() { + return this._data; + } + + /** + * Returns the filter bitset mask, if defined. + * @return {import('./BitSet.js').BitSet} The filter bitset mask. + */ + mask() { + return this._mask; + } + + /** + * Returns the groupby specification, if defined. + * @return {import('./types.js').GroupBySpec} The groupby specification. + */ + groups() { + return this._group; + } + + /** + * Returns the row order comparator function, if specified. + * @return {import('./types.js').RowComparator} + * The row order comparator function. + */ + comparator() { + return this._order; + } + + /** + * The total number of rows in this table, counting both + * filtered and unfiltered rows. + * @return {number} The number of total rows. + */ + totalRows() { + return this._total; + } + + /** + * The number of active rows in this table. This number may be + * less than the *totalRows* if the table has been filtered. + * @return {number} The number of rows. + */ + numRows() { + return this._nrows; + } + + /** + * The number of active rows in this table. This number may be + * less than the *totalRows* if the table has been filtered. + * @return {number} The number of rows. + */ + get size() { + return this._nrows; + } + + /** + * The number of columns in this table. + * @return {number} The number of columns. + */ + numCols() { + return this._names.length; + } + + /** + * Filter function invoked for each column name. 
+ * @callback NameFilter + * @param {string} name The column name. + * @param {number} index The column index. + * @param {string[]} array The array of names. + * @return {boolean} Returns true to retain the column name. + */ + + /** + * The table column names, optionally filtered. + * @param {NameFilter} [filter] An optional filter function. + * If unspecified, all column names are returned. + * @return {string[]} An array of matching column names. + */ + columnNames(filter) { + return filter ? this._names.filter(filter) : this._names.slice(); + } + + /** + * The column name at the given index. + * @param {number} index The column index. + * @return {string} The column name, + * or undefined if the index is out of range. + */ + columnName(index) { + return this._names[index]; + } + + /** + * The column index for the given name. + * @param {string} name The column name. + * @return {number} The column index, or -1 if the name is not found. + */ + columnIndex(name) { + return this._names.indexOf(name); + } + + /** + * Get the column instance with the given name. + * @param {string} name The column name. + * @return {import('./types.js').ColumnType | undefined} + * The named column, or undefined if it does not exist. + */ + column(name) { + return this._data[name]; + } + + /** + * Get the column instance at the given index position. + * @param {number} index The zero-based column index. + * @return {import('./types.js').ColumnType | undefined} + * The column, or undefined if it does not exist. + */ + columnAt(index) { + return this._data[this._names[index]]; + } + + /** + * Get an array of values contained in a column. The resulting array + * respects any table filter or orderby criteria. + * @param {string} name The column name. + * @param {ArrayConstructor | import('./types.js').TypedArrayConstructor} [constructor=Array] + * The array constructor for instantiating the output array. 
+ * @return {import('./types.js').DataValue[] | import('./types.js').TypedArray} + * The array of column values. + */ + array(name, constructor = Array) { + const column = this.column(name); + const array = new constructor(this.numRows()); + let idx = -1; + this.scan(row => array[++idx] = column.at(row), true); + return array; + } + + /** + * Get the value for the given column and row. + * @param {string} name The column name. + * @param {number} [row=0] The row index, defaults to zero if not specified. + * @return {import('./types.js').DataValue} The table value at (column, row). + */ + get(name, row = 0) { + const column = this.column(name); + return this.isFiltered() || this.isOrdered() + ? column.at(this.indices()[row]) + : column.at(row); + } + + /** + * Returns an accessor ("getter") function for a column. The returned + * function takes a row index as its single argument and returns the + * corresponding column value. + * @param {string} name The column name. + * @return {import('./types.js').ColumnGetter} The column getter function. + */ + getter(name) { + const column = this.column(name); + const indices = this.isFiltered() || this.isOrdered() ? this.indices() : null; + if (indices) { + return row => column.at(indices[row]); + } else if (column) { + return row => column.at(row); + } else { + error(`Unrecognized column: ${name}`); + } + } + + /** + * Returns an object representing a table row. + * @param {number} [row=0] The row index, defaults to zero if not specified. + * @return {object} A row object with named properties for each column. + */ + object(row = 0) { + return objectBuilder(this)(row); + } + + /** + * Returns an array of objects representing table rows. + * @param {import('./types.js').ObjectsOptions} [options] + * The options for row object generation. + * @return {object[]} An array of row objects. 
+ */ + objects(options = {}) { + const { grouped, limit, offset } = options; + + // generate array of row objects + const names = resolve(this, options.columns || all()); + const createRow = rowObjectBuilder(this, names); + const obj = []; + this.scan( + (row, data) => obj.push(createRow(row, data)), + true, limit, offset + ); + + // produce nested output as requested + if (grouped && this.isGrouped()) { + const idx = []; + this.scan(row => idx.push(row), true, limit, offset); + return nest(this, idx, obj, grouped); + } + + return obj; + } + + /** + * Returns an iterator over objects representing table rows. + * @return {Iterator} An iterator over row objects. + */ + *[Symbol.iterator]() { + const createRow = objectBuilder(this); + const n = this.numRows(); + for (let i = 0; i < n; ++i) { + yield createRow(i); + } + } + + /** + * Returns an iterator over column values. + * @param {string} name The column name. + * @return {Iterator} An iterator over column values. + */ + *values(name) { + const get = this.getter(name); + const n = this.numRows(); + for (let i = 0; i < n; ++i) { + yield get(i); + } + } + + /** + * Print the contents of this table using the console.table() method. + * @param {import('./types.js').PrintOptions|number} options + * The options for row object generation, determining which rows and + * columns are printed. If number-valued, specifies the row limit. + * @return {this} The table instance. + */ + print(options = {}) { + const opt = isNumber(options) + ? { limit: +options } + // @ts-ignore + : { limit: 10, ...options }; + + const obj = this.objects({ ...opt, grouped: false }); + const msg = `${this[Symbol.toStringTag]}. Showing ${obj.length} rows.`; + + console.log(msg); // eslint-disable-line no-console + console.table(obj); // eslint-disable-line no-console + return this; + } + + /** + * Returns an array of indices for all rows passing the table filter. + * @param {boolean} [order=true] A flag indicating if the returned + * indices should be sorted if this table is ordered.
If false, the + * returned indices may or may not be sorted. + * @return {Uint32Array} An array of row indices. + */ + indices(order = true) { + if (this._index) return this._index; + + const n = this.numRows(); + const index = new Uint32Array(n); + const ordered = this.isOrdered(); + const bits = this.mask(); + let row = -1; + + // inline the following for performance: + // this.scan(row => index[++i] = row); + if (bits) { + for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { + index[++row] = i; + } + } else { + for (let i = 0; i < n; ++i) { + index[++row] = i; + } + } + + // sort index vector + if (order && ordered) { + const { _order, _data } = this; + index.sort((a, b) => _order(a, b, _data)); + } + + // save indices if they reflect table metadata + if (order || !ordered) { + this._index = index; + } + + return index; + } + + /** + * Returns an array of indices for each group in the table. + * If the table is not grouped, the result is the same as + * the *indices* method, but wrapped within an array. + * @param {boolean} [order=true] A flag indicating if the returned + * indices should be sorted if this table is ordered. If false, the + * returned indices may or may not be sorted. + * @return {number[][] | Uint32Array[]} An array of row index arrays, one + * per group. The indices will be filtered if the table is filtered. 
+ */ + partitions(order = true) { + // return partitions if already generated + if (this._partitions) { + return this._partitions; + } + + // if not grouped, return a single partition + if (!this.isGrouped()) { + return [ this.indices(order) ]; + } + + // generate partitions + const { keys, size } = this._group; + const part = repeat(size, () => []); + + // populate partitions, don't sort if indices don't exist + // inline the following for performance: + // this.scan(row => part[keys[row]].push(row), sort); + const sort = this._index; + const bits = this.mask(); + const n = this.numRows(); + if (sort && this.isOrdered()) { + for (let i = 0, r; i < n; ++i) { + r = sort[i]; + part[keys[r]].push(r); + } + } else if (bits) { + for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { + part[keys[i]].push(i); + } + } else { + for (let i = 0; i < n; ++i) { + part[keys[i]].push(i); + } + } + + // if ordered but not yet sorted, sort partitions directly + if (order && !sort && this.isOrdered()) { + const compare = this._order; + const data = this._data; + for (let i = 0; i < size; ++i) { + part[i].sort((a, b) => compare(a, b, data)); + } + } + + // save partitions if they reflect table metadata + if (order || !this.isOrdered()) { + this._partitions = part; + } + + return part; + } + + /** + * Create a new fully-materialized instance of this table. + * All filter and orderby settings are removed from the new table. + * Instead, the backing data itself is filtered and ordered as needed. + * @param {number[]} [indices] Ordered row indices to materialize. + * If unspecified, all rows passing the table filter are used. + * @return {this} A reified table. + */ + reify(indices) { + const nrows = indices ? 
indices.length : this.numRows(); + const names = this._names; + let data, groups; + + if (!indices && !this.isOrdered()) { + if (!this.isFiltered()) { + return this; // data already reified + } else if (nrows === this.totalRows()) { + data = this.data(); // all rows pass filter, skip copy + } + } + + if (!data) { + const scan = indices ? f => indices.forEach(f) : f => this.scan(f, true); + const ncols = names.length; + data = {}; + + for (let i = 0; i < ncols; ++i) { + const name = names[i]; + const prev = this.column(name); + const curr = data[name] = new (arrayType(prev))(nrows); + let r = -1; + // optimize array access + isArrayType(prev) + ? scan(row => curr[++r] = prev[row]) + : scan(row => curr[++r] = prev.at(row)); + } + + if (this.isGrouped()) { + groups = reindex(this.groups(), scan, !!indices, nrows); + } + } + + return this.create({ data, names, groups, filter: null, order: null }); + } + + /** + * Callback function to cancel a table scan. + * @callback ScanStop + * @return {void} + */ + + /** + * Callback function invoked for each row of a table scan. + * @callback ScanVisitor + * @param {number} [row] The table row index. + * @param {import('./types.js').ColumnData} [data] + * The backing table data store. + * @param {ScanStop} [stop] Function to stop the scan early. + * Callees can invoke this function to prevent future calls. + * @return {void} + */ + + /** + * Perform a table scan, visiting each row of the table. + * If this table is filtered, only rows passing the filter are visited. + * @param {ScanVisitor} fn Callback invoked for each row of the table. + * @param {boolean} [order=false] Indicates if the table should be + * scanned in the order determined by *orderby*. This + * argument has no effect if the table is unordered. + * @property {number} [limit=Infinity] The maximum number of + * objects to create. + * @property {number} [offset=0] The row offset indicating how many + * initial rows to skip. 
+ */ + scan(fn, order, limit = Infinity, offset = 0) { + const filter = this._mask; + const nrows = this._nrows; + const data = this._data; + + let i = offset || 0; + if (i > nrows) return; + + const n = Math.min(nrows, i + limit); + const stop = () => i = this._total; + + if (order && this.isOrdered() || filter && this._index) { + const index = this.indices(); + const data = this._data; + for (; i < n; ++i) { + fn(index[i], data, stop); + } + } else if (filter) { + let c = n - i + 1; + for (i = filter.nth(i); --c && i > -1; i = filter.next(i + 1)) { + fn(i, data, stop); + } + } else { + for (; i < n; ++i) { + fn(i, data, stop); + } + } + } +} + +function objectBuilder(table) { + let b = table._builder; + + if (!b) { + const createRow = rowObjectBuilder(table); + const data = table.data(); + if (table.isOrdered() || table.isFiltered()) { + const indices = table.indices(); + b = row => createRow(indices[row], data); + } else { + b = row => createRow(row, data); + } + table._builder = b; + } + + return b; +} diff --git a/src/table/column-set.js b/src/table/column-set.js deleted file mode 100644 index 3f49b5c0..00000000 --- a/src/table/column-set.js +++ /dev/null @@ -1,35 +0,0 @@ -import has from '../util/has'; - -export default function(table) { - return table - ? 
new ColumnSet({ ...table.data() }, table.columnNames()) - : new ColumnSet(); -} - -class ColumnSet { - constructor(data, names) { - this.data = data || {}; - this.names = names || []; - } - - add(name, values) { - if (!this.has(name)) this.names.push(name + ''); - return this.data[name] = values; - } - - has(name) { - return has(this.data, name); - } - - new() { - this.filter = null; - this.groups = this.groups || null; - this.order = null; - return this; - } - - groupby(groups) { - this.groups = groups; - return this; - } -} \ No newline at end of file diff --git a/src/table/column-table.js b/src/table/column-table.js deleted file mode 100644 index afe1d4b5..00000000 --- a/src/table/column-table.js +++ /dev/null @@ -1,456 +0,0 @@ -import { defaultColumnFactory } from './column'; -import columnsFrom from './columns-from'; -import columnSet from './column-set'; -import Table from './table'; -import { nest, regroup, reindex } from './regroup'; -import { rowObjectBuilder } from '../expression/row-object'; -import { default as toArrow, toArrowIPC } from '../format/to-arrow'; -import toCSV from '../format/to-csv'; -import toHTML from '../format/to-html'; -import toJSON from '../format/to-json'; -import toMarkdown from '../format/to-markdown'; -import resolve, { all } from '../helpers/selection'; -import arrayType from '../util/array-type'; -import entries from '../util/entries'; -import error from '../util/error'; -import mapObject from '../util/map-object'; - -/** - * Class representing a table backed by a named set of columns. - */ -export default class ColumnTable extends Table { - - /** - * Create a new ColumnTable from existing input data. - * @param {object[]|Iterable|object|Map} values The backing table data values. - * If array-valued, should be a list of JavaScript objects with - * key-value properties for each column value. - * If object- or Map-valued, a table with two columns (one for keys, - * one for values) will be created. 
- * @param {string[]} [names] The named columns to include. - * @return {ColumnTable} A new ColumnTable instance. - */ - static from(values, names) { - return new ColumnTable(columnsFrom(values, names), names); - } - - /** - * Create a new table for a set of named columns. - * @param {object|Map} columns - * The set of named column arrays. Keys are column names. - * The enumeration order of the keys determines the column indices, - * unless the names parameter is specified. - * Values must be arrays (or array-like values) of identical length. - * @param {string[]} [names] Ordered list of column names. If specified, - * this array determines the column indices. If not specified, the - * key enumeration order of the columns object is used. - * @return {ColumnTable} the instantiated ColumnTable instance. - */ - static new(columns, names) { - if (columns instanceof ColumnTable) return columns; - const data = {}; - const keys = []; - for (const [key, value] of entries(columns)) { - data[key] = value; - keys.push(key); - } - return new ColumnTable(data, names || keys); - } - - /** - * Instantiate a new ColumnTable instance. - * @param {object} columns An object mapping column names to values. - * @param {string[]} [names] An ordered list of column names. - * @param {BitSet} [filter] A filtering BitSet. - * @param {GroupBySpec} [group] A groupby specification. - * @param {RowComparator} [order] A row comparator function. - * @param {Params} [params] An object mapping parameter names to values. - */ - constructor(columns, names, filter, group, order, params) { - mapObject(columns, defaultColumnFactory, columns); - names = names || Object.keys(columns); - const nrows = names.length ? columns[names[0]].length : 0; - super(names, nrows, columns, filter, group, order, params); - } - - /** - * Create a new table with the same type as this table. 
- * The new table may have different data, filter, grouping, or ordering - * based on the values of the optional configuration argument. If a - * setting is not specified, it is inherited from the current table. - * @param {CreateOptions} [options] Creation options for the new table. - * @return {this} A newly created table. - */ - create({ data, names, filter, groups, order }) { - const f = filter !== undefined ? filter : this.mask(); - - return new ColumnTable( - data || this._data, - names || (!data ? this._names : null), - f, - groups !== undefined ? groups : regroup(this._group, filter && f), - order !== undefined ? order : this._order, - this._params - ); - } - - /** - * Create a new table with additional columns drawn from one or more input - * tables. All tables must have the same numer of rows and are reified - * prior to assignment. In the case of repeated column names, input table - * columns overwrite existing columns. - * @param {...ColumnTable} tables The tables to merge with this table. - * @return {ColumnTable} A new table with merged columns. - * @example table.assign(table1, table2) - */ - assign(...tables) { - const nrows = this.numRows(); - const base = this.reify(); - const cset = columnSet(base).groupby(base.groups()); - tables.forEach(input => { - input = ColumnTable.new(input); - if (input.numRows() !== nrows) error('Assign row counts do not match'); - input = input.reify(); - input.columnNames(name => cset.add(name, input.column(name))); - }); - return this.create(cset.new()); - } - - /** - * Get the backing set of columns for this table. - * @return {ColumnData} Object of named column instances. - */ - columns() { - return this._data; - } - - /** - * Get the column instance with the given name. - * @param {string} name The column name. - * @return {ColumnType | undefined} The named column, or undefined if it does not exist. - */ - column(name) { - return this._data[name]; - } - - /** - * Get the column instance at the given index position. 
- * @param {number} index The zero-based column index. - * @return {ColumnType | undefined} The column, or undefined if it does not exist. - */ - columnAt(index) { - return this._data[this._names[index]]; - } - - /** - * Get an array of values contained in a column. The resulting array - * respects any table filter or orderby criteria. - * @param {string} name The column name. - * @param {ArrayConstructor|TypedArrayConstructor} [constructor=Array] - * The array constructor for instantiating the output array. - * @return {DataValue[]|TypedArray} The array of column values. - */ - array(name, constructor = Array) { - const column = this.column(name); - const array = new constructor(this.numRows()); - let idx = -1; - this.scan(row => array[++idx] = column.get(row), true); - return array; - } - - /** - * Get the value for the given column and row. - * @param {string} name The column name. - * @param {number} [row=0] The row index, defaults to zero if not specified. - * @return {DataValue} The table value at (column, row). - */ - get(name, row = 0) { - const column = this.column(name); - return this.isFiltered() || this.isOrdered() - ? column.get(this.indices()[row]) - : column.get(row); - } - - /** - * Returns an accessor ("getter") function for a column. The returned - * function takes a row index as its single argument and returns the - * corresponding column value. - * @param {string} name The column name. - * @return {ColumnGetter} The column getter function. - */ - getter(name) { - const column = this.column(name); - const indices = this.isFiltered() || this.isOrdered() ? this.indices() : null; - return indices ? row => column.get(indices[row]) - : column ? row => column.get(row) - : error(`Unrecognized column: ${name}`); - } - - /** - * Returns an object representing a table row. - * @param {number} [row=0] The row index, defaults to zero if not specified. - * @return {object} A row object with named properties for each column. 
- */ - object(row = 0) { - return objectBuilder(this)(row); - } - - /** - * Returns an array of objects representing table rows. - * @param {ObjectsOptions} [options] The options for row object generation. - * @return {object[]} An array of row objects. - */ - objects(options = {}) { - const { grouped, limit, offset } = options; - - // generate array of row objects - const names = resolve(this, options.columns || all()); - const create = rowObjectBuilder(names); - const obj = []; - this.scan( - (row, data) => obj.push(create(row, data)), - true, limit, offset - ); - - // produce nested output as requested - if (grouped && this.isGrouped()) { - const idx = []; - this.scan(row => idx.push(row), true, limit, offset); - return nest(this, idx, obj, grouped); - } - - return obj; - } - - /** - * Returns an iterator over objects representing table rows. - * @return {Iterator} An iterator over row objects. - */ - *[Symbol.iterator]() { - const create = objectBuilder(this); - const n = this.numRows(); - for (let i = 0; i < n; ++i) { - yield create(i); - } - } - - /** - * Create a new fully-materialized instance of this table. - * All filter and orderby settings are removed from the new table. - * Instead, the backing data itself is filtered and ordered as needed. - * @param {number[]} [indices] Ordered row indices to materialize. - * If unspecified, all rows passing the table filter are used. - * @return {this} A reified table. - */ - reify(indices) { - const nrows = indices ? indices.length : this.numRows(); - const names = this._names; - let data, groups; - - if (!indices && !this.isOrdered()) { - if (!this.isFiltered()) { - return this; // data already reified - } else if (nrows === this.totalRows()) { - data = this.data(); // all rows pass filter, skip copy - } - } - - if (!data) { - const scan = indices ? 
f => indices.forEach(f) : f => this.scan(f, true); - const ncols = names.length; - data = {}; - - for (let i = 0; i < ncols; ++i) { - const name = names[i]; - const prev = this.column(name); - const curr = data[name] = new (arrayType(prev))(nrows); - let r = -1; - scan(row => curr[++r] = prev.get(row)); - } - - if (this.isGrouped()) { - groups = reindex(this.groups(), scan, !!indices, nrows); - } - } - - return this.create({ data, names, groups, filter: null, order: null }); - } - - /** - * Apply a sequence of transformations to this table. The output - * of each transform is passed as input to the next transform, and - * the output of the last transform is then returned. - * @param {...(Transform|Transform[])} transforms Transformation - * functions to apply to the table in sequence. Each function should - * take a single table as input and return a table as output. - * @return {ColumnTable} The output of the last transform. - */ - transform(...transforms) { - return transforms.flat().reduce((t, f) => f(t), this); - } - - /** - * Format this table as an Apache Arrow table. - * @param {ArrowFormatOptions} [options] The formatting options. - * @return {import('apache-arrow').Table} An Apache Arrow table. - */ - toArrow(options) { - return toArrow(this, options); - } - - /** - * Format this table as binary data in the Apache Arrow IPC format. - * @param {ArrowFormatOptions} [options] The formatting options. Set {format: 'stream'} - * or {format:"file"} for specific IPC format - * @return {Uint8Array} A new Uint8Array of Arrow-encoded binary data. - */ - toArrowBuffer(options) { - return toArrowIPC(this, options); - } - - /** - * Format this table as a comma-separated values (CSV) string. Other - * delimiters, such as tabs or pipes ('|'), can be specified using - * the options argument. - * @param {CSVFormatOptions} [options] The formatting options. - * @return {string} A delimited value string. 
- */ - toCSV(options) { - return toCSV(this, options); - } - - /** - * Format this table as an HTML table string. - * @param {HTMLFormatOptions} [options] The formatting options. - * @return {string} An HTML table string. - */ - toHTML(options) { - return toHTML(this, options); - } - - /** - * Format this table as a JavaScript Object Notation (JSON) string. - * @param {JSONFormatOptions} [options] The formatting options. - * @return {string} A JSON string. - */ - toJSON(options) { - return toJSON(this, options); - } - - /** - * Format this table as a GitHub-Flavored Markdown table string. - * @param {MarkdownFormatOptions} [options] The formatting options. - * @return {string} A GitHub-Flavored Markdown table string. - */ - toMarkdown(options) { - return toMarkdown(this, options); - } -} - -function objectBuilder(table) { - let b = table._builder; - - if (!b) { - const create = rowObjectBuilder(table.columnNames()); - const data = table.data(); - if (table.isOrdered() || table.isFiltered()) { - const indices = table.indices(); - b = row => create(indices[row], data); - } else { - b = row => create(row, data); - } - table._builder = b; - } - - return b; -} - -/** - * Options for derived table creation. - * @typedef {import('./table').CreateOptions} CreateOptions - */ - -/** - * A typed array constructor. - * @typedef {import('./table').TypedArrayConstructor} TypedArrayConstructor - */ - -/** - * A typed array instance. - * @typedef {import('./table').TypedArray} TypedArray - */ - -/** - * Table value. - * @typedef {import('./table').DataValue} DataValue - */ - -/** - * Column value accessor. - * @typedef {import('./table').ColumnGetter} ColumnGetter - */ - -/** - * Options for generating row objects. - * @typedef {import('./table').ObjectsOptions} ObjectsOptions - */ - -/** - * A table transformation. - * @typedef {(table: ColumnTable) => ColumnTable} Transform - */ - -/** - * Proxy type for BitSet class. 
- * @typedef {import('./table').BitSet} BitSet - */ - -/** - * Proxy type for ColumnType interface. - * @typedef {import('./column').ColumnType} ColumnType - */ - -/** - * A named collection of columns. - * @typedef {{[key: string]: ColumnType}} ColumnData - */ - -/** - * Proxy type for GroupBySpec. - * @typedef {import('./table').GroupBySpec} GroupBySpec - */ - -/** - * Proxy type for RowComparator. - * @typedef {import('./table').RowComparator} RowComparator - */ - -/** - * Proxy type for Params. - * @typedef {import('./table').Params} Params - */ - -/** - * Options for Arrow formatting. - * @typedef {import('../arrow/encode').ArrowFormatOptions} ArrowFormatOptions - */ - -/** - * Options for CSV formatting. - * @typedef {import('../format/to-csv').CSVFormatOptions} CSVFormatOptions - */ - -/** - * Options for HTML formatting. - * @typedef {import('../format/to-html').HTMLFormatOptions} HTMLFormatOptions - */ - -/** - * Options for JSON formatting. - * @typedef {import('../format/to-json').JSONFormatOptions} JSONFormatOptions - */ - -/** - * Options for Markdown formatting. - * @typedef {import('../format/to-markdown').MarkdownFormatOptions} MarkdownFormatOptions - */ diff --git a/src/table/column.js b/src/table/column.js deleted file mode 100644 index 2d63d487..00000000 --- a/src/table/column.js +++ /dev/null @@ -1,79 +0,0 @@ -import isFunction from '../util/is-function'; - -/** - * Class representing an array-backed data column. - */ -export default class Column { - /** - * Create a new column instance. - * @param {Array} data The backing array (or array-like object) - * containing the column data. - */ - constructor(data) { - this.data = data; - } - - /** - * Get the length (number of rows) of the column. - * @return {number} The length of the column array. - */ - get length() { - return this.data.length; - } - - /** - * Get the column value at the given row index. - * @param {number} row The row index of the value to retrieve. 
- * @return {import('./table').DataValue} The column value. - */ - get(row) { - return this.data[row]; - } - - /** - * Returns an iterator over the column values. - * @return {Iterator} An iterator over column values. - */ - [Symbol.iterator]() { - return this.data[Symbol.iterator](); - } -} - -/** - * Column interface. Any object that adheres to this interface - * can be used as a data column within a {@link ColumnTable}. - * @typedef {object} ColumnType - * @property {number} length - * The length (number of rows) of the column. - * @property {import('./table').ColumnGetter} get - * Column value getter. - */ - -/** - * Column factory function interface. - * @callback ColumnFactory - * @param {*} data The input column data. - * @return {ColumnType} A column instance. - */ - -/** - * Create a new column from the given input data. - * @param {any} data The backing column data. If the value conforms to - * the Column interface it is returned directly. If the value is an - * array, it will be wrapped in a new Column instance. - * @return {ColumnType} A compatible column instance. - */ -export let defaultColumnFactory = function(data) { - return data && isFunction(data.get) ? data : new Column(data); -}; - -/** - * Get or set the default factory function for instantiating table columns. - * @param {ColumnFactory} [factory] The new default factory. - * @return {ColumnFactory} The current default column factory. - */ -export function columnFactory(factory) { - return arguments.length - ? 
(defaultColumnFactory = factory) - : defaultColumnFactory; -} \ No newline at end of file diff --git a/src/table/columns-from.js b/src/table/columns-from.js index 696158c7..8fe317dd 100644 --- a/src/table/columns-from.js +++ b/src/table/columns-from.js @@ -1,13 +1,17 @@ -import error from '../util/error'; -import isArray from '../util/is-array'; -import isDate from '../util/is-date'; -import isFunction from '../util/is-function'; -import isObject from '../util/is-object'; -import isRegExp from '../util/is-regexp'; -import isString from '../util/is-string'; +import error from '../util/error.js'; +import isArray from '../util/is-array.js'; +import isDate from '../util/is-date.js'; +import isFunction from '../util/is-function.js'; +import isObject from '../util/is-object.js'; +import isRegExp from '../util/is-regexp.js'; +import isString from '../util/is-string.js'; -export default function(values, names) { +/** + * @return {import('./types.js').ColumnData} + */ +export function columnsFrom(values, names) { const raise = type => error(`Illegal argument type: ${type || typeof values}`); + // @ts-ignore return values instanceof Map ? fromKeyValuePairs(values.entries(), names) : isDate(values) ? raise('Date') : isRegExp(values) ? raise('RegExp') @@ -77,4 +81,4 @@ function fromIterable(values, names) { } return columns; -} \ No newline at end of file +} diff --git a/src/table/index.js b/src/table/index.js index ed334ee5..ff558c29 100644 --- a/src/table/index.js +++ b/src/table/index.js @@ -1,8 +1,6 @@ -import ColumnTable from './column-table'; -import verbs from '../verbs'; - -// Add verb implementations to ColumnTable prototype -Object.assign(ColumnTable.prototype, verbs); +import entries from '../util/entries.js'; +import { ColumnTable } from './ColumnTable.js'; +import { columnsFrom } from './columns-from.js'; /** * Create a new table for a set of named columns. 
@@ -18,7 +16,15 @@ Object.assign(ColumnTable.prototype, verbs); * @example table({ colA: ['a', 'b', 'c'], colB: [3, 4, 5] }) */ export function table(columns, names) { - return ColumnTable.new(columns, names); + if (columns instanceof ColumnTable) return columns; + /** @type {import('./types.js').ColumnData} */ + const data = {}; + const keys = []; + for (const [key, value] of entries(columns)) { + data[key] = value; + keys.push(key); + } + return new ColumnTable(data, names || keys); } /** @@ -36,5 +42,5 @@ export function table(columns, names) { * @example from([ { colA: 1, colB: 2 }, { colA: 3, colB: 4 } ]) */ export function from(values, names) { - return ColumnTable.from(values, names); -} \ No newline at end of file + return new ColumnTable(columnsFrom(values, names), names); +} diff --git a/src/table/regroup.js b/src/table/regroup.js index 063ce985..f540152e 100644 --- a/src/table/regroup.js +++ b/src/table/regroup.js @@ -1,18 +1,21 @@ -import { array_agg, entries_agg, map_agg, object_agg } from '../op/op-api'; -import error from '../util/error'; -import uniqueName from '../util/unique-name'; +import { array_agg, entries_agg, map_agg, object_agg } from '../op/op-api.js'; +import error from '../util/error.js'; +import uniqueName from '../util/unique-name.js'; +import { groupby } from '../verbs/groupby.js'; +import { rollup } from '../verbs/rollup.js'; +import { select } from '../verbs/select.js'; /** * Regroup table rows in response to a BitSet filter. - * @param {GroupBySpec} groups The current groupby specification. - * @param {BitSet} filter The filter to apply. + * @param {import('./types.js').GroupBySpec} groups The current groupby specification. + * @param {import('./BitSet.js').BitSet} filter The filter to apply. 
*/ export function regroup(groups, filter) { if (!groups || !filter) return groups; // check for presence of rows for each group const { keys, rows, size } = groups; - const map = new Int32Array(size); + const map = new Uint32Array(size); filter.scan(row => map[keys[row]] = 1); // check sum, exit early if all groups occur @@ -36,7 +39,8 @@ export function regroup(groups, filter) { /** * Regroup table rows in response to a re-indexing. * This operation may or may not involve filtering of rows. - * @param {GroupBySpec} groups The current groupby specification. + * @param {import('./types.js').GroupBySpec} groups + * The current groupby specification. * @param {Function} scan Function to scan new row indices. * @param {boolean} filter Flag indicating if filtering may occur. * @param {number} nrows The number of rows in the new table. @@ -86,19 +90,18 @@ export function nest(table, idx, obj, type) { // create table with one column of row objects // then aggregate into per-group arrays - let t = table - .select() - .reify(idx) - .create({ data: { [col]: obj } }) - .rollup({ [col]: array_agg(col) }); + let t = select(table, {}).reify(idx).create({ data: { [col]: obj } }); + t = rollup(t, { [col]: array_agg(col) }); // create nested structures for each level of grouping for (let i = names.length; --i >= 0;) { - t = t - .groupby(names.slice(0, i)) - .rollup({ [col]: agg(names[i], col) }); + t = rollup( + groupby(t, names.slice(0, i)), + // @ts-ignore + { [col]: agg(names[i], col) } + ); } // return the final aggregated structure return t.get(col); -} \ No newline at end of file +} diff --git a/src/table/table.js b/src/table/table.js deleted file mode 100644 index 737ab8f8..00000000 --- a/src/table/table.js +++ /dev/null @@ -1,607 +0,0 @@ -import Transformable from './transformable'; -import error from '../util/error'; -import isNumber from '../util/is-number'; -import repeat from '../util/repeat'; - -/** - * Abstract class representing a data table. 
- */ -export default class Table extends Transformable { - - /** - * Instantiate a new Table instance. - * @param {string[]} names An ordered list of column names. - * @param {number} nrows The number of rows. - * @param {TableData} data The backing data, which can vary by implementation. - * @param {BitSet} [filter] A bit mask for which rows to include. - * @param {GroupBySpec} [groups] A groupby specification for grouping ows. - * @param {RowComparator} [order] A comparator function for sorting rows. - * @param {Params} [params] Parameter values for table expressions. - */ - constructor(names, nrows, data, filter, groups, order, params) { - super(params); - this._names = Object.freeze(names); - this._data = data; - this._total = nrows; - this._nrows = filter ? filter.count() : nrows; - this._mask = (nrows !== this._nrows && filter) || null; - this._group = groups || null; - this._order = order || null; - } - - /** - * Create a new table with the same type as this table. - * The new table may have different data, filter, grouping, or ordering - * based on the values of the optional configuration argument. If a - * setting is not specified, it is inherited from the current table. - * @param {CreateOptions} [options] Creation options for the new table. - * @return {this} A newly created table. - */ - create(options) { // eslint-disable-line no-unused-vars - error('Not implemented'); - } - - /** - * Provide an informative object string tag. - */ - get [Symbol.toStringTag]() { - if (!this._names) return 'Object'; // bail if called on prototype - const nr = this.numRows() + ' row' + (this.numRows() !== 1 ? 's' : ''); - const nc = this.numCols() + ' col' + (this.numCols() !== 1 ? 's' : ''); - return `Table: ${nc} x ${nr}` - + (this.isFiltered() ? ` (${this.totalRows()} backing)` : '') - + (this.isGrouped() ? `, ${this._group.size} groups` : '') - + (this.isOrdered() ? ', ordered' : ''); - } - - /** - * Indicates if the table has a filter applied. 
- * @return {boolean} True if filtered, false otherwise. - */ - isFiltered() { - return !!this._mask; - } - - /** - * Indicates if the table has a groupby specification. - * @return {boolean} True if grouped, false otherwise. - */ - isGrouped() { - return !!this._group; - } - - /** - * Indicates if the table has a row order comparator. - * @return {boolean} True if ordered, false otherwise. - */ - isOrdered() { - return !!this._order; - } - - /** - * Returns the internal table storage data structure. - * @return {TableData} The backing table storage data structure. - */ - data() { - return this._data; - } - - /** - * Returns the filter bitset mask, if defined. - * @return {BitSet} The filter bitset mask. - */ - mask() { - return this._mask; - } - - /** - * Returns the groupby specification, if defined. - * @return {GroupBySpec} The groupby specification. - */ - groups() { - return this._group; - } - - /** - * Returns the row order comparator function, if specified. - * @return {RowComparator} The row order comparator function. - */ - comparator() { - return this._order; - } - - /** - * The total number of rows in this table, counting both - * filtered and unfiltered rows. - * @return {number} The number of total rows. - */ - totalRows() { - return this._total; - } - - /** - * The number of active rows in this table. This number may be - * less than the total rows if the table has been filtered. - * @see Table.totalRows - * @return {number} The number of rows. - */ - numRows() { - return this._nrows; - } - - /** - * The number of active rows in this table. This number may be - * less than the total rows if the table has been filtered. - * @see Table.totalRows - * @return {number} The number of rows. - */ - get size() { - return this._nrows; - } - - /** - * The number of columns in this table. - * @return {number} The number of columns. - */ - numCols() { - return this._names.length; - } - - /** - * Filter function invoked for each column name. 
- * @callback NameFilter - * @param {string} name The column name. - * @param {number} index The column index. - * @param {string[]} array The array of names. - * @return {boolean} Returns true to retain the column name. - */ - - /** - * The table column names, optionally filtered. - * @param {NameFilter} [filter] An optional filter function. - * If unspecified, all column names are returned. - * @return {string[]} An array of matching column names. - */ - columnNames(filter) { - return filter ? this._names.filter(filter) : this._names.slice(); - } - - /** - * The column name at the given index. - * @param {number} index The column index. - * @return {string} The column name, - * or undefined if the index is out of range. - */ - columnName(index) { - return this._names[index]; - } - - /** - * The column index for the given name. - * @param {string} name The column name. - * @return {number} The column index, or -1 if the name is not found. - */ - columnIndex(name) { - return this._names.indexOf(name); - } - - /** - * Deprecated alias for the table array() method: use table.array() - * instead. Get an array of values contained in a column. The resulting - * array respects any table filter or orderby criteria. - * @param {string} name The column name. - * @param {ArrayConstructor|TypedArrayConstructor} [constructor=Array] - * The array constructor for instantiating the output array. - * @return {DataValue[]|TypedArray} The array of column values. - */ - columnArray(name, constructor) { - return this.array(name, constructor); - } - - /** - * Get an array of values contained in a column. The resulting array - * respects any table filter or orderby criteria. - * @param {string} name The column name. - * @param {ArrayConstructor|TypedArrayConstructor} [constructor=Array] - * The array constructor for instantiating the output array. - * @return {DataValue[]|TypedArray} The array of column values. 
- */ - array(name, constructor) { // eslint-disable-line no-unused-vars - error('Not implemented'); - } - - /** - * Returns an iterator over column values. - * @param {string} name The column name. - * @return {Iterator} An iterator over column values. - */ - *values(name) { - const get = this.getter(name); - const n = this.numRows(); - for (let i = 0; i < n; ++i) { - yield get(i); - } - } - - /** - * Get the value for the given column and row. - * @param {string} name The column name. - * @param {number} [row=0] The row index, defaults to zero if not specified. - * @return {DataValue} The data value at (column, row). - */ - get(name, row = 0) { // eslint-disable-line no-unused-vars - error('Not implemented'); - } - - /** - * Returns an accessor ("getter") function for a column. The returned - * function takes a row index as its single argument and returns the - * corresponding column value. - * @param {string} name The column name. - * @return {ColumnGetter} The column getter function. - */ - getter(name) { // eslint-disable-line no-unused-vars - error('Not implemented'); - } - - /** - * Returns an array of objects representing table rows. - * @param {ObjectsOptions} [options] The options for row object generation. - * @return {RowObject[]} An array of row objects. - */ - objects(options) { // eslint-disable-line no-unused-vars - error('Not implemented'); - } - - /** - * Returns an object representing a table row. - * @param {number} [row=0] The row index, defaults to zero if not specified. - * @return {object} A row object with named properties for each column. - */ - object(row) { // eslint-disable-line no-unused-vars - error('Not implemented'); - } - - /** - * Returns an iterator over objects representing table rows. - * @return {Iterator} An iterator over row objects. - */ - [Symbol.iterator]() { - error('Not implemented'); - } - - /** - * Print the contents of this table using the console.table() method. 
- * @param {PrintOptions|number} options The options for row object - * generation, determining which rows and columns are printed. If - * number-valued, specifies the row limit. - * @return {this} The table instance. - */ - print(options = {}) { - if (isNumber(options)) { - options = { limit: options }; - } else if (options.limit == null) { - options.limit = 10; - } - - const obj = this.objects({ ...options, grouped: false }); - const msg = `${this[Symbol.toStringTag]}. Showing ${obj.length} rows.`; - - console.log(msg); // eslint-disable-line no-console - console.table(obj); // eslint-disable-line no-console - return this; - } - - /** - * Returns an array of indices for all rows passing the table filter. - * @param {boolean} [order=true] A flag indicating if the returned - * indices should be sorted if this table is ordered. If false, the - * returned indices may or may not be sorted. - * @return {Uint32Array} An array of row indices. - */ - indices(order = true) { - if (this._index) return this._index; - - const n = this.numRows(); - const index = new Uint32Array(n); - const ordered = this.isOrdered(); - const bits = this.mask(); - let row = -1; - - // inline the following for performance: - // this.scan(row => index[++i] = row); - if (bits) { - for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { - index[++row] = i; - } - } else { - for (let i = 0; i < n; ++i) { - index[++row] = i; - } - } - - // sort index vector - if (order && ordered) { - const compare = this._order; - const data = this._data; - index.sort((a, b) => compare(a, b, data)); - } - - // save indices if they reflect table metadata - if (order || !ordered) { - this._index = index; - } - - return index; - } - - /** - * Returns an array of indices for each group in the table. - * If the table is not grouped, the result is the same as - * {@link indices}, but wrapped within an array. 
- * @param {boolean} [order=true] A flag indicating if the returned - * indices should be sorted if this table is ordered. If false, the - * returned indices may or may not be sorted. - * @return {number[][]} An array of row index arrays, one per group. - * The indices will be filtered if the table is filtered. - */ - partitions(order = true) { - // return partitions if already generated - if (this._partitions) { - return this._partitions; - } - - // if not grouped, return a single partition - if (!this.isGrouped()) { - return [ this.indices(order) ]; - } - - // generate partitions - const { keys, size } = this._group; - const part = repeat(size, () => []); - - // populate partitions, don't sort if indices don't exist - // inline the following for performance: - // this.scan(row => part[keys[row]].push(row), sort); - const sort = this._index; - const bits = this.mask(); - const n = this.numRows(); - if (sort && this.isOrdered()) { - for (let i = 0, r; i < n; ++i) { - r = sort[i]; - part[keys[r]].push(r); - } - } else if (bits) { - for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { - part[keys[i]].push(i); - } - } else { - for (let i = 0; i < n; ++i) { - part[keys[i]].push(i); - } - } - - // if ordered but not yet sorted, sort partitions directly - if (order && !sort && this.isOrdered()) { - const compare = this._order; - const data = this._data; - for (let i = 0; i < size; ++i) { - part[i].sort((a, b) => compare(a, b, data)); - } - } - - // save partitions if they reflect table metadata - if (order || !this.isOrdered()) { - this._partitions = part; - } - - return part; - } - - /** - * Callback function to cancel a table scan. - * @callback ScanStop - * @return {void} - */ - - /** - * Callback function invoked for each row of a table scan. - * @callback ScanVisitor - * @param {number} [row] The table row index. - * @param {TableData} [data] The backing table data store. - * @param {ScanStop} [stop] Function to stop the scan early. 
- * Callees can invoke this function to prevent future calls. - * @return {void} - */ - - /** - * Perform a table scan, visiting each row of the table. - * If this table is filtered, only rows passing the filter are visited. - * @param {ScanVisitor} fn Callback invoked for each row of the table. - * @param {boolean} [order=false] Indicates if the table should be - * scanned in the order determined by {@link Table#orderby}. This - * argument has no effect if the table is unordered. - * @property {number} [limit=Infinity] The maximum number of objects to create. - * @property {number} [offset=0] The row offset indicating how many initial rows to skip. - */ - scan(fn, order, limit = Infinity, offset = 0) { - const filter = this._mask; - const nrows = this._nrows; - const data = this._data; - - let i = offset || 0; - if (i > nrows) return; - - const n = Math.min(nrows, i + limit); - const stop = () => i = this._total; - - if (order && this.isOrdered() || filter && this._index) { - const index = this.indices(); - const data = this._data; - for (; i < n; ++i) { - fn(index[i], data, stop); - } - } else if (filter) { - let c = n - i + 1; - for (i = filter.nth(i); --c && i > -1; i = filter.next(i + 1)) { - fn(i, data, stop); - } - } else { - for (; i < n; ++i) { - fn(i, data, stop); - } - } - } - - /** - * Extract rows with indices from start to end (end not included), where - * start and end represent per-group ordered row numbers in the table. - * @param {number} [start] Zero-based index at which to start extraction. - * A negative index indicates an offset from the end of the group. - * If start is undefined, slice starts from the index 0. - * @param {number} [end] Zero-based index before which to end extraction. - * A negative index indicates an offset from the end of the group. - * If end is omitted, slice extracts through the end of the group. - * @return {this} A new table with sliced rows. 
- * @example table.slice(1, -1) - */ - slice(start = 0, end = Infinity) { - if (this.isGrouped()) return super.slice(start, end); - - // if not grouped, scan table directly - const indices = []; - const nrows = this.numRows(); - start = Math.max(0, start + (start < 0 ? nrows : 0)); - end = Math.min(nrows, Math.max(0, end + (end < 0 ? nrows : 0))); - this.scan(row => indices.push(row), true, end - start, start); - return this.reify(indices); - } - - /** - * Reduce a table, processing all rows to produce a new table. - * To produce standard aggregate summaries, use {@link rollup}. - * This method allows the use of custom reducer implementations, - * for example to produce multiple rows for an aggregate. - * @param {Reducer} reducer The reducer to apply. - * @return {Table} A new table of reducer outputs. - */ - reduce(reducer) { - return this.__reduce(this, reducer); - } -} - -/** - * A typed array constructor. - * @typedef {Uint8ArrayConstructor|Uint16ArrayConstructor|Uint32ArrayConstructor|BigUint64ArrayConstructor|Int8ArrayConstructor|Int16ArrayConstructor|Int32ArrayConstructor|BigInt64ArrayConstructor|Float32ArrayConstructor|Float64ArrayConstructor} TypedArrayConstructor - */ - -/** - * A typed array instance. - * @typedef {Uint8Array|Uint16Array|Uint32Array|BigUint64Array|Int8Array|Int16Array|Int32Array|BigInt64Array|Float32Array|Float64Array} TypedArray - */ - -/** - * Backing table data. - * @typedef {object|Array} TableData - */ - -/** - * Table value. - * @typedef {*} DataValue - */ - -/** - * Table row object. - * @typedef {Object.<string, DataValue>} RowObject - */ - -/** - * Table expression parameters. - * @typedef {import('./transformable').Params} Params - */ - -/** - * Proxy type for BitSet class. - * @typedef {import('./bit-set').default} BitSet - */ - -/** - * Abstract class for custom aggregation operations. - * @typedef {import('../engine/reduce/reducer').default} Reducer - */ - -/** - * A table groupby specification. 
- * @typedef {object} GroupBySpec - * @property {number} size The number of groups. - * @property {string[]} names Column names for each group. - * @property {RowExpression[]} get Value accessor functions for each group. - * @property {number[]} rows Indices of an example table row for each group. - * @property {number[]} keys Per-row group indices, length is total rows of table. - */ - -/** - * Column value accessor. - * @callback ColumnGetter - * @param {number} [row] The table row. - * @return {DataValue} - */ - -/** - * An expression evaluated over a table row. - * @callback RowExpression - * @param {number} [row] The table row. - * @param {TableData} [data] The backing table data store. - * @return {DataValue} - */ - -/** - * Comparator function for sorting table rows. - * @callback RowComparator - * @param {number} rowA The table row index for the first row. - * @param {number} rowB The table row index for the second row. - * @param {TableData} data The backing table data store. - * @return {number} Negative if rowA < rowB, positive if - * rowA > rowB, otherwise zero. - */ - -/** - * Options for derived table creation. - * @typedef {object} CreateOptions - * @property {TableData} [data] The backing column data. - * @property {string[]} [names] An ordered list of column names. - * @property {BitSet} [filter] An additional filter BitSet to apply. - * @property {GroupBySpec} [groups] The groupby specification to use, or null for no groups. - * @property {RowComparator} [order] The orderby comparator function to use, or null for no order. - */ - -/** - * Options for generating row objects. - * @typedef {object} PrintOptions - * @property {number} [limit=Infinity] The maximum number of objects to create. - * @property {number} [offset=0] The row offset indicating how many initial rows to skip. - * @property {import('../table/transformable').Select} [columns] - * An ordered set of columns to include. 
The input may consist of column name - * strings, column integer indices, objects with current column names as keys - * and new column names as values (for renaming), or selection helper - * functions such as {@link all}, {@link not}, or {@link range}. - */ - -/** - * Options for generating row objects. - * @typedef {object} ObjectsOptions - * @property {number} [limit=Infinity] The maximum number of objects to create. - * @property {number} [offset=0] The row offset indicating how many initial rows to skip. - * @property {import('../table/transformable').Select} [columns] - * An ordered set of columns to include. The input may consist of column name - * strings, column integer indices, objects with current column names as keys - * and new column names as values (for renaming), or selection helper - * functions such as {@link all}, {@link not}, or {@link range}. - * @property {'map'|'entries'|'object'|boolean} [grouped=false] - * The export format for groups of rows. The default (false) is to ignore - * groups, returning a flat array of objects. The valid values are 'map' or - * true (for Map instances), 'object' (for standard objects), or 'entries' - * (for arrays in the style of Object.entries). For the 'object' format, - * groupby keys are coerced to strings to use as object property names; note - * that this can lead to undesirable behavior if the groupby keys are object - * values. The 'map' and 'entries' options preserve the groupby key values. - */ diff --git a/src/table/transformable.js b/src/table/transformable.js deleted file mode 100644 index b481573c..00000000 --- a/src/table/transformable.js +++ /dev/null @@ -1,978 +0,0 @@ -import toArray from '../util/to-array'; -import slice from '../helpers/slice'; - -/** - * Abstract base class for transforming data. - */ -export default class Transformable { - - /** - * Instantiate a new Transformable instance. - * @param {Params} [params] The parameter values. 
- */ - constructor(params) { - if (params) this._params = params; - } - - /** - * Get or set table expression parameter values. - * If called with no arguments, returns the current parameter values - * as an object. Otherwise, adds the provided parameters to this - * table's parameter set and returns the table. Any prior parameters - * with names matching the input parameters are overridden. - * @param {Params} [values] The parameter values. - * @return {this|Params} The current parameters values (if called with - * no arguments) or this table. - */ - params(values) { - if (arguments.length) { - if (values) { - this._params = { ...this._params, ...values }; - } - return this; - } else { - return this._params; - } - } - - /** - * Create a new fully-materialized instance of this table. - * All filter and orderby settings are removed from the new table. - * Instead, the backing data itself is filtered and ordered as needed. - * @param {number[]} [indices] Ordered row indices to materialize. - * If unspecified, all rows passing the table filter are used. - * @return {this} A reified table. - */ - reify(indices) { - return this.__reify(this, indices); - } - - // -- Transformation Verbs ------------------------------------------------ - - /** - * Count the number of values in a group. This method is a shorthand - * for {@link Transformable#rollup} with a count aggregate function. - * @param {CountOptions} [options] Options for the count. - * @return {this} A new table with groupby and count columns. - * @example table.groupby('colA').count() - * @example table.groupby('colA').count({ as: 'num' }) - */ - count(options) { - return this.__count(this, options); - } - - /** - * Derive new column values based on the provided expressions. By default, - * new columns are added after (higher indices than) existing columns. Use - * the before or after options to place new columns elsewhere. 
- * @param {ExprObject} values Object of name-value pairs defining the - * columns to derive. The input object should have output column - * names for keys and table expressions for values. - * @param {DeriveOptions} [options] Options for dropping or relocating - * derived columns. Use either a before or after property to indicate - * where to place derived columns. Specifying both before and after is an - * error. Unlike the relocate verb, this option affects only new columns; - * updated columns with existing names are excluded from relocation. - * @return {this} A new table with derived columns added. - * @example table.derive({ sumXY: d => d.x + d.y }) - * @example table.derive({ z: d => d.x * d.y }, { before: 'x' }) - */ - derive(values, options) { - return this.__derive(this, values, options); - } - - /** - * Filter a table to a subset of rows based on the input criteria. - * The resulting table provides a filtered view over the original data; no - * data copy is made. To create a table that copies only filtered data to - * new data structures, call {@link Transformable#reify} on the output table. - * @param {TableExpr} criteria Filter criteria as a table expression. - * Both aggregate and window functions are permitted, taking into account - * {@link Transformable#groupby} or {@link Transformable#orderby} settings. - * @return {this} A new table with filtered rows. - * @example table.filter(d => abs(d.value) < 5) - */ - filter(criteria) { - return this.__filter(this, criteria); - } - - /** - * Extract rows with indices from start to end (end not included), where - * start and end represent per-group ordered row numbers in the table. - * @param {number} [start] Zero-based index at which to start extraction. - * A negative index indicates an offset from the end of the group. - * If start is undefined, slice starts from the index 0. - * @param {number} [end] Zero-based index before which to end extraction. 
- * A negative index indicates an offset from the end of the group. - * If end is omitted, slice extracts through the end of the group. - * @return {this} A new table with sliced rows. - * @example table.slice(1, -1) - */ - slice(start, end) { - return this.filter(slice(start, end)).reify(); - } - - /** - * Group table rows based on a set of column values. - * Subsequent operations that are sensitive to grouping (such as - * aggregate functions) will operate over the grouped rows. - * To undo grouping, use {@link Transformable#ungroup}. - * @param {...ExprList} keys Key column values to group by. - * The keys may be specified using column name strings, column index - * numbers, value objects with output column names for keys and table - * expressions for values, or selection helper functions. - * @return {this} A new table with grouped rows. - * @example table.groupby('colA', 'colB') - * @example table.groupby({ key: d => d.colA + d.colB }) - */ - groupby(...keys) { - return this.__groupby(this, keys.flat()); - } - - /** - * Order table rows based on a set of column values. - * Subsequent operations sensitive to ordering (such as window functions) - * will operate over sorted values. - * The resulting table provides a view over the original data, without - * any copying. To create a table with sorted data copied to new data - * structures, call {@link Transformable#reify} on the result of this method. - * To undo ordering, use {@link Transformable#unorder}. - * @param {...OrderKeys} keys Key values to sort by, in precedence order. - * By default, sorting is done in ascending order. - * To sort in descending order, wrap values using {@link desc}. - * If a string, order by the column with that name. - * If a number, order by the column with that index. - * If a function, must be a valid table expression; aggregate functions - * are permitted, but window functions are not. 
- * If an object, object values must be valid values parameters - * with output column names for keys and table expressions - * for values (the output names will be ignored). - * If an array, array values must be valid key parameters. - * @return {this} A new ordered table. - * @example table.orderby('a', desc('b')) - * @example table.orderby({ a: 'a', b: desc('b') }) - * @example table.orderby(desc(d => d.a)) - */ - orderby(...keys) { - return this.__orderby(this, keys.flat()); - } - - /** - * Relocate a subset of columns to change their positions, also - * potentially renaming them. - * @param {Selection} columns An ordered selection of columns to relocate. - * The input may consist of column name strings, column integer indices, - * rename objects with current column names as keys and new column names - * as values, or functions that take a table as input and returns a valid - * selection parameter (typically the output of selection helper functions - * such as {@link all}, {@link not}, or {@link range}). - * @param {RelocateOptions} options Options for relocating. Must include - * either the before or after property to indicate where to place the - * relocated columns. Specifying both before and after is an error. - * @return {this} A new table with relocated columns. - * @example table.relocate(['colY', 'colZ'], { after: 'colX' }) - * @example table.relocate(not('colB', 'colC'), { before: 'colA' }) - * @example table.relocate({ colA: 'newA', colB: 'newB' }, { after: 'colC' }) - */ - relocate(columns, options) { - return this.__relocate(this, toArray(columns), options); - } - - /** - * Rename one or more columns, preserving column order. - * @param {...Select} columns One or more rename objects with current - * column names as keys and new column names as values. - * @return {this} A new table with renamed columns. 
- * @example table.rename({ oldName: 'newName' }) - * @example table.rename({ a: 'a2', b: 'b2' }) - */ - rename(...columns) { - return this.__rename(this, columns.flat()); - } - - /** - * Rollup a table to produce an aggregate summary. - * Often used in conjunction with {@link Transformable#groupby}. - * To produce counts only, {@link Transformable#count} is a shortcut. - * @param {ExprObject} [values] Object of name-value pairs defining aggregate - * output columns. The input object should have output column names for - * keys and table expressions for values. The expressions must be valid - * aggregate expressions: window functions are not allowed and column - * references must be arguments to aggregate functions. - * @return {this} A new table of aggregate summary values. - * @example table.groupby('colA').rollup({ mean: d => mean(d.colB) }) - * @example table.groupby('colA').rollup({ mean: op.median('colB') }) - */ - rollup(values) { - return this.__rollup(this, values); - } - - /** - * Generate a table from a random sample of rows. - * If the table is grouped, performs a stratified sample by - * sampling from each group separately. - * @param {number|TableExpr} size The number of samples to draw per group. - * If number-valued, the same sample size is used for each group. - * If function-valued, the input should be an aggregate table - * expression compatible with {@link Transformable#rollup}. - * @param {SampleOptions} [options] Options for sampling. - * @return {this} A new table with sampled rows. - * @example table.sample(50) - * @example table.sample(100, { replace: true }) - * @example table.groupby('colA').sample(() => op.floor(0.5 * op.count())) - */ - sample(size, options) { - return this.__sample(this, size, options); - } - - /** - * Select a subset of columns into a new table, potentially renaming them. - * @param {...Select} columns An ordered selection of columns. 
- * The input may consist of column name strings, column integer indices, - * rename objects with current column names as keys and new column names - * as values, or functions that take a table as input and returns a valid - * selection parameter (typically the output of selection helper functions - * such as {@link all}, {@link not}, or {@link range}). - * @return {this} A new table of selected columns. - * @example table.select('colA', 'colB') - * @example table.select(not('colB', 'colC')) - * @example table.select({ colA: 'newA', colB: 'newB' }) - */ - select(...columns) { - return this.__select(this, columns.flat()); - } - - /** - * Ungroup a table, removing any grouping criteria. - * Undoes the effects of {@link Transformable#groupby}. - * @return {this} A new ungrouped table, or this table if not grouped. - * @example table.ungroup() - */ - ungroup() { - return this.__ungroup(this); - } - - /** - * Unorder a table, removing any sorting criteria. - * Undoes the effects of {@link Transformable#orderby}. - * @return {this} A new unordered table, or this table if not ordered. - * @example table.unorder() - */ - unorder() { - return this.__unorder(this); - } - - // -- Cleaning Verbs ------------------------------------------------------ - - /** - * De-duplicate table rows by removing repeated row values. - * @param {...ExprList} keys Key columns to check for duplicates. - * Two rows are considered duplicates if they have matching values for - * all keys. If keys are unspecified, all columns are used. - * The keys may be specified using column name strings, column index - * numbers, value objects with output column names for keys and table - * expressions for values, or selection helper functions. - * @return {this} A new de-duplicated table. 
- * @example table.dedupe() - * @example table.dedupe('a', 'b') - * @example table.dedupe({ abs: d => op.abs(d.a) }) - */ - dedupe(...keys) { - return this.__dedupe(this, keys.flat()); - } - - /** - * Impute missing values or rows. Accepts a set of column-expression pairs - * and evaluates the expressions to replace any missing (null, undefined, - * or NaN) values in the original column. - * If the expand option is specified, imputes new rows for missing - * combinations of values. All combinations of key values (a full cross - * product) are considered for each level of grouping (specified by - * {@link Transformable#groupby}). New rows will be added for any combination - * of key and groupby values not already contained in the table. For all - * non-key and non-group columns the new rows are populated with imputation - * values (first argument) if specified, otherwise undefined. - * If the expand option is specified, any filter or orderby settings are - * removed from the output table, but groupby settings persist. - * @param {ExprObject} values Object of name-value pairs for the column values - * to impute. The input object should have existing column names for keys - * and table expressions for values. The expressions will be evaluated to - * determine replacements for any missing values. - * @param {ImputeOptions} [options] Imputation options. The expand - * property specifies a set of column values to consider for imputing - * missing rows. All combinations of expanded values are considered, and - * new rows are added for each combination that does not appear in the - * input table. - * @return {this} A new table with imputed values and/or rows. 
- * @example table.impute({ v: () => 0 }) - * @example table.impute({ v: d => op.mean(d.v) }) - * @example table.impute({ v: () => 0 }, { expand: ['x', 'y'] }) - */ - impute(values, options) { - return this.__impute(this, values, options); - } - - // -- Reshaping Verbs ----------------------------------------------------- - - /** - * Fold one or more columns into two key-value pair columns. - * The fold transform is an inverse of the {@link Transformable#pivot} transform. - * The resulting table has two new columns, one containing the column - * names (named "key") and the other the column values (named "value"). - * The number of output rows equals the original row count multiplied - * by the number of folded columns. - * @param {ExprList} values The columns to fold. - * The columns may be specified using column name strings, column index - * numbers, value objects with output column names for keys and table - * expressions for values, or selection helper functions. - * @param {FoldOptions} [options] Options for folding. - * @return {this} A new folded table. - * @example table.fold('colA') - * @example table.fold(['colA', 'colB']) - * @example table.fold(range(5, 8)) - */ - fold(values, options) { - return this.__fold(this, values, options); - } - - /** - * Pivot columns into a cross-tabulation. - * The pivot transform is an inverse of the {@link Transformable#fold} transform. - * The resulting table has new columns for each unique combination - * of the provided *keys*, populated with the provided *values*. - * The provided *values* must be aggregates, as a single set of keys may - * include more than one row. If string-valued, the *any* aggregate is used. - * If only one *values* column is defined, the new pivoted columns will - * be named using key values directly. Otherwise, input value column names - * will be included as a component of the output column names. - * @param {ExprList} keys Key values to map to new column names. 
- * The keys may be specified using column name strings, column index - * numbers, value objects with output column names for keys and table - * expressions for values, or selection helper functions. - * @param {ExprList} values Output values for pivoted columns. - * Column references will be wrapped in an *any* aggregate. - * If object-valued, the input object should have output value - * names for keys and aggregate table expressions for values. - * @param {PivotOptions} [options] Options for pivoting. - * @return {this} A new pivoted table. - * @example table.pivot('key', 'value') - * @example table.pivot(['keyA', 'keyB'], ['valueA', 'valueB']) - * @example table.pivot({ key: d => d.key }, { value: d => sum(d.value) }) - */ - pivot(keys, values, options) { - return this.__pivot(this, keys, values, options); - } - - /** - * Spread array elements into a set of new columns. - * Output columns are named based on the value key and array index. - * @param {ExprList} values The column values to spread. - * The values may be specified using column name strings, column index - * numbers, value objects with output column names for keys and table - * expressions for values, or selection helper functions. - * @param {SpreadOptions} [options] Options for spreading. - * @return {this} A new table with the spread columns added. - * @example table.spread({ a: split(d.text, '') }) - * @example table.spread('arrayCol', { limit: 100 }) - */ - spread(values, options) { - return this.__spread(this, values, options); - } - - /** - * Unroll one or more array-valued columns into new rows. - * If more than one array value is used, the number of new rows - * is the smaller of the limit and the largest length. - * Values for all other columns are copied over. - * @param {ExprList} values The column values to unroll. 
- * The values may be specified using column name strings, column index - * numbers, value objects with output column names for keys and table - * expressions for values, or selection helper functions. - * @param {UnrollOptions} [options] Options for unrolling. - * @return {this} A new unrolled table. - * @example table.unroll('colA', { limit: 1000 }) - */ - unroll(values, options) { - return this.__unroll(this, values, options); - } - - // -- Joins --------------------------------------------------------------- - - /** - * Lookup values from a secondary table and add them as new columns. - * A lookup occurs upon matching key values for rows in both tables. - * If the secondary table has multiple rows with the same key, only - * the last observed instance will be considered in the lookup. - * Lookup is similar to {@link Transformable#join_left}, but with a simpler - * syntax and the added constraint of allowing at most one match only. - * @param {TableRef} other The secondary table to look up values from. - * @param {JoinKeys} [on] Lookup keys (column name strings or table - * expressions) for this table and the secondary table, respectively. - * @param {...ExprList} values The column values to add from the - * secondary table. Can be column name strings or objects with column - * names as keys and table expressions as values. - * @return {this} A new table with lookup values added. - * @example table.lookup(other, ['key1', 'key2'], 'value1', 'value2') - */ - lookup(other, on, ...values) { - return this.__lookup(this, other, on, values.flat()); - } - - /** - * Join two tables, extending the columns of one table with - * values from the other table. The current table is considered - * the "left" table in the join, and the new table input is - * considered the "right" table in the join. By default an inner - * join is performed, removing all rows that do not match the - * join criteria. 
To perform left, right, or full outer joins, use - * the {@link Transformable#join_left}, {@link Transformable#join_right}, or - * {@link Transformable#join_full} methods, or provide an options argument. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinPredicate} [on] The join criteria for matching table rows. - * If unspecified, the values of all columns with matching names - * are compared. - * If array-valued, a two-element array should be provided, containing - * the columns to compare for the left and right tables, respectively. - * If a one-element array or a string value is provided, the same - * column names will be drawn from both tables. - * If function-valued, should be a two-table table expression that - * returns a boolean value. When providing a custom predicate, note that - * join key values can be arrays or objects, and that normal join - * semantics do not consider null or undefined values to be equal (that is, - * null !== null). Use the op.equal function to handle these cases. - * @param {JoinValues} [values] The columns to include in the join output. - * If unspecified, all columns from both tables are included; paired - * join keys sharing the same column name are included only once. - * If array-valued, a two element array should be provided, containing - * the columns to include for the left and right tables, respectively. - * Array input may consist of column name strings, objects with output - * names as keys and single-table table expressions as values, or the - * selection helper functions {@link all}, {@link not}, or {@link range}. - * If object-valued, specifies the key-value pairs for each output, - * defined using two-table table expressions. - * @param {JoinOptions} [options] Options for the join. - * @return {this} A new joined table. 
- * @example table.join(other, ['keyL', 'keyR']) - * @example table.join(other, (a, b) => equal(a.keyL, b.keyR)) - */ - join(other, on, values, options) { - return this.__join(this, other, on, values, options); - } - - /** - * Perform a left outer join on two tables. Rows in the left table - * that do not match a row in the right table will be preserved. - * This is a convenience method with fixed options for {@link Transformable#join}. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinPredicate} [on] The join criteria for matching table rows. - * If unspecified, the values of all columns with matching names - * are compared. - * If array-valued, a two-element array should be provided, containing - * the columns to compare for the left and right tables, respectively. - * If a one-element array or a string value is provided, the same - * column names will be drawn from both tables. - * If function-valued, should be a two-table table expression that - * returns a boolean value. When providing a custom predicate, note that - * join key values can be arrays or objects, and that normal join - * semantics do not consider null or undefined values to be equal (that is, - * null !== null). Use the op.equal function to handle these cases. - * @param {JoinValues} [values] The columns to include in the join output. - * If unspecified, all columns from both tables are included; paired - * join keys sharing the same column name are included only once. - * If array-valued, a two element array should be provided, containing - * the columns to include for the left and right tables, respectively. - * Array input may consist of column name strings, objects with output - * names as keys and single-table table expressions as values, or the - * selection helper functions {@link all}, {@link not}, or {@link range}. - * If object-valued, specifies the key-value pairs for each output, - * defined using two-table table expressions. 
- * @param {JoinOptions} [options] Options for the join. With this method, - * any options will be overridden with {left: true, right: false}. - * @return {this} A new joined table. - * @example table.join_left(other, ['keyL', 'keyR']) - * @example table.join_left(other, (a, b) => equal(a.keyL, b.keyR)) - */ - join_left(other, on, values, options) { - const opt = { ...options, left: true, right: false }; - return this.__join(this, other, on, values, opt); - } - - /** - * Perform a right outer join on two tables. Rows in the right table - * that do not match a row in the left table will be preserved. - * This is a convenience method with fixed options for {@link Transformable#join}. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinPredicate} [on] The join criteria for matching table rows. - * If unspecified, the values of all columns with matching names - * are compared. - * If array-valued, a two-element array should be provided, containing - * the columns to compare for the left and right tables, respectively. - * If a one-element array or a string value is provided, the same - * column names will be drawn from both tables. - * If function-valued, should be a two-table table expression that - * returns a boolean value. When providing a custom predicate, note that - * join key values can be arrays or objects, and that normal join - * semantics do not consider null or undefined values to be equal (that is, - * null !== null). Use the op.equal function to handle these cases. - * @param {JoinValues} [values] The columns to include in the join output. - * If unspecified, all columns from both tables are included; paired - * join keys sharing the same column name are included only once. - * If array-valued, a two element array should be provided, containing - * the columns to include for the left and right tables, respectively. 
- * Array input may consist of column name strings, objects with output - * names as keys and single-table table expressions as values, or the - * selection helper functions {@link all}, {@link not}, or {@link range}. - * If object-valued, specifies the key-value pairs for each output, - * defined using two-table table expressions. - * @param {JoinOptions} [options] Options for the join. With this method, - * any options will be overridden with {left: false, right: true}. - * @return {this} A new joined table. - * @example table.join_right(other, ['keyL', 'keyR']) - * @example table.join_right(other, (a, b) => equal(a.keyL, b.keyR)) - */ - join_right(other, on, values, options) { - const opt = { ...options, left: false, right: true }; - return this.__join(this, other, on, values, opt); - } - - /** - * Perform a full outer join on two tables. Rows in either the left or - * right table that do not match a row in the other will be preserved. - * This is a convenience method with fixed options for {@link Transformable#join}. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinPredicate} [on] The join criteria for matching table rows. - * If unspecified, the values of all columns with matching names - * are compared. - * If array-valued, a two-element array should be provided, containing - * the columns to compare for the left and right tables, respectively. - * If a one-element array or a string value is provided, the same - * column names will be drawn from both tables. - * If function-valued, should be a two-table table expression that - * returns a boolean value. When providing a custom predicate, note that - * join key values can be arrays or objects, and that normal join - * semantics do not consider null or undefined values to be equal (that is, - * null !== null). Use the op.equal function to handle these cases. - * @param {JoinValues} [values] The columns to include in the join output. 
- * If unspecified, all columns from both tables are included; paired - * join keys sharing the same column name are included only once. - * If array-valued, a two element array should be provided, containing - * the columns to include for the left and right tables, respectively. - * Array input may consist of column name strings, objects with output - * names as keys and single-table table expressions as values, or the - * selection helper functions {@link all}, {@link not}, or {@link range}. - * If object-valued, specifies the key-value pairs for each output, - * defined using two-table table expressions. - * @param {JoinOptions} [options] Options for the join. With this method, - * any options will be overridden with {left: true, right: true}. - * @return {this} A new joined table. - * @example table.join_full(other, ['keyL', 'keyR']) - * @example table.join_full(other, (a, b) => equal(a.keyL, b.keyR)) - */ - join_full(other, on, values, options) { - const opt = { ...options, left: true, right: true }; - return this.__join(this, other, on, values, opt); - } - - /** - * Produce the Cartesian cross product of two tables. The output table - * has one row for every pair of input table rows. Beware that outputs - * may be quite large, as the number of output rows is the product of - * the input row counts. - * This is a convenience method for {@link Transformable#join} in which the - * join criteria is always true. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinValues} [values] The columns to include in the output. - * If unspecified, all columns from both tables are included. - * If array-valued, a two element array should be provided, containing - * the columns to include for the left and right tables, respectively. - * Array input may consist of column name strings, objects with output - * names as keys and single-table table expressions as values, or the - * selection helper functions {@link all}, {@link not}, or {@link range}. 
- * If object-valued, specifies the key-value pairs for each output, - * defined using two-table table expressions. - * @param {JoinOptions} [options] Options for the join. - * @return {this} A new joined table. - * @example table.cross(other) - * @example table.cross(other, [['leftKey', 'leftVal'], ['rightVal']]) - */ - cross(other, values, options) { - return this.__cross(this, other, values, options); - } - - /** - * Perform a semi-join, filtering the left table to only rows that - * match a row in the right table. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinPredicate} [on] The join criteria for matching table rows. - * If unspecified, the values of all columns with matching names - * are compared. - * If array-valued, a two-element array should be provided, containing - * the columns to compare for the left and right tables, respectively. - * If a one-element array or a string value is provided, the same - * column names will be drawn from both tables. - * If function-valued, should be a two-table table expression that - * returns a boolean value. When providing a custom predicate, note that - * join key values can be arrays or objects, and that normal join - * semantics do not consider null or undefined values to be equal (that is, - * null !== null). Use the op.equal function to handle these cases. - * @return {this} A new filtered table. - * @example table.semijoin(other) - * @example table.semijoin(other, ['keyL', 'keyR']) - * @example table.semijoin(other, (a, b) => equal(a.keyL, b.keyR)) - */ - semijoin(other, on) { - return this.__semijoin(this, other, on); - } - - /** - * Perform an anti-join, filtering the left table to only rows that - * do *not* match a row in the right table. - * @param {TableRef} other The other (right) table to join with. - * @param {JoinPredicate} [on] The join criteria for matching table rows. - * If unspecified, the values of all columns with matching names - * are compared. 
- * If array-valued, a two-element array should be provided, containing - * the columns to compare for the left and right tables, respectively. - * If a one-element array or a string value is provided, the same - * column names will be drawn from both tables. - * If function-valued, should be a two-table table expression that - * returns a boolean value. When providing a custom predicate, note that - * join key values can be arrays or objects, and that normal join - * semantics do not consider null or undefined values to be equal (that is, - * null !== null). Use the op.equal function to handle these cases. - * @return {this} A new filtered table. - * @example table.antijoin(other) - * @example table.antijoin(other, ['keyL', 'keyR']) - * @example table.antijoin(other, (a, b) => equal(a.keyL, b.keyR)) - */ - antijoin(other, on) { - return this.__antijoin(this, other, on); - } - - // -- Set Operations ------------------------------------------------------ - - /** - * Concatenate multiple tables into a single table, preserving all rows. - * This transformation mirrors the UNION_ALL operation in SQL. - * Only named columns in this table are included in the output. - * @see Transformable#union - * @param {...TableRef} tables A list of tables to concatenate. - * @return {this} A new concatenated table. - * @example table.concat(other) - * @example table.concat(other1, other2) - * @example table.concat([other1, other2]) - */ - concat(...tables) { - return this.__concat(this, tables.flat()); - } - - /** - * Union multiple tables into a single table, deduplicating all rows. - * This transformation mirrors the UNION operation in SQL. It is - * similar to {@link Transformable#concat} but suppresses duplicate rows with - * values identical to another row. - * Only named columns in this table are included in the output. - * @see Transformable#concat - * @param {...TableRef} tables A list of tables to union. - * @return {this} A new unioned table. 
- * @example table.union(other) - * @example table.union(other1, other2) - * @example table.union([other1, other2]) - */ - union(...tables) { - return this.__union(this, tables.flat()); - } - - /** - * Intersect multiple tables, keeping only rows whose with identical - * values for all columns in all tables, and deduplicates the rows. - * This transformation is similar to a series of {@link Transformable#semijoin} - * calls, but additionally suppresses duplicate rows. - * @see Transformable#semijoin - * @param {...TableRef} tables A list of tables to intersect. - * @return {this} A new filtered table. - * @example table.intersect(other) - * @example table.intersect(other1, other2) - * @example table.intersect([other1, other2]) - */ - intersect(...tables) { - return this.__intersect(this, tables.flat()); - } - - /** - * Compute the set difference with multiple tables, keeping only rows in - * this table that whose values do not occur in the other tables. - * This transformation is similar to a series of {@link Transformable#antijoin} - * calls, but additionally suppresses duplicate rows. - * @see Transformable#antijoin - * @param {...TableRef} tables A list of tables to difference. - * @return {this} A new filtered table. - * @example table.except(other) - * @example table.except(other1, other2) - * @example table.except([other1, other2]) - */ - except(...tables) { - return this.__except(this, tables.flat()); - } -} - -// -- Parameter Types ------------------------------------------------------- - -/** - * Table expression parameters. - * @typedef {Object.} Params - */ - -/** - * A reference to a column by string name or integer index. - * @typedef {string|number} ColumnRef - */ - -/** - * A value that can be coerced to a string. - * @typedef {object} Stringable - * @property {() => string} toString String coercion method. - */ - -/** - * A table expression provided as a string or string-coercible value. 
- * @typedef {string|Stringable} TableExprString - */ - -/** - * A struct object with arbitraty named properties. - * @typedef {Object.} Struct - */ - -/** - * A function defined over a table row. - * @typedef {(d?: Struct, $?: Params) => any} TableExprFunc - */ - -/** - * A table expression defined over a single table. - * @typedef {TableExprFunc|TableExprString} TableExpr - */ - -/** - * A function defined over rows from two tables. - * @typedef {(a?: Struct, b?: Struct, $?: Params) => any} TableExprFunc2 - */ - -/** - * A table expression defined over two tables. - * @typedef {TableExprFunc2|TableExprString} TableExpr2 - */ - -/** - * An object that maps current column names to new column names. - * @typedef {{ [name: string]: string }} RenameMap - */ - -/** - * A selection helper function. - * @typedef {(table: any) => string[]} SelectHelper - */ - -/** - * One or more column selections, potentially with renaming. - * The input may consist of a column name string, column integer index, a - * rename map object with current column names as keys and new column names - * as values, or a select helper function that takes a table as input and - * returns a valid selection parameter. - * @typedef {ColumnRef|RenameMap|SelectHelper} SelectEntry - */ - -/** - * An ordered set of column selections, potentially with renaming. - * @typedef {SelectEntry|SelectEntry[]} Select - */ - -/** - * An object of column name / table expression pairs. - * @typedef {{ [name: string]: TableExpr }} ExprObject - */ - -/** - * An object of column name / two-table expression pairs. - * @typedef {{ [name: string]: TableExpr2 }} Expr2Object - */ - -/** - * An ordered set of one or more column values. - * @typedef {ColumnRef|SelectHelper|ExprObject} ListEntry - */ - -/** - * An ordered set of column values. - * Entries may be column name strings, column index numbers, value objects - * with output column names for keys and table expressions for values, - * or a selection helper function. 
- * @typedef {ListEntry|ListEntry[]} ExprList - */ - -/** - * A reference to a data table or transformable instance. - * @typedef {Transformable|string} TableRef - */ - -/** - * One or more orderby sort criteria. - * If a string, order by the column with that name. - * If a number, order by the column with that index. - * If a function, must be a valid table expression; aggregate functions - * are permitted, but window functions are not. - * If an object, object values must be valid values parameters - * with output column names for keys and table expressions - * for values. The output name keys will subsequently be ignored. - * @typedef {ColumnRef|TableExpr|ExprObject} OrderKey - */ - -/** - * An ordered set of orderby sort criteria, in precedence order. - * @typedef {OrderKey|OrderKey[]} OrderKeys - */ - -/** - * Column values to use as a join key. - * @typedef {ColumnRef|TableExprFunc} JoinKey - */ - -/** - * An ordered set of join keys. - * @typedef {JoinKey|[JoinKey[]]|[JoinKey[], JoinKey[]]} JoinKeys - */ - -/** - * A predicate specification for joining two tables. - * @typedef {JoinKeys|TableExprFunc2|null} JoinPredicate - */ - -/** - * An array of per-table join values to extract. - * @typedef {[ExprList]|[ExprList, ExprList]|[ExprList, ExprList, Expr2Object]} JoinList - */ - -/** - * A specification of join values to extract. - * @typedef {JoinList|Expr2Object} JoinValues - */ - -// -- Transform Options ----------------------------------------------------- - -/** - * Options for count transformations. - * @typedef {object} CountOptions - * @property {string} [as='count'] The name of the output count column. - */ - -/** - * Options for derive transformations. - * @typedef {object} DeriveOptions - * @property {boolean} [drop=false] A flag indicating if the original - * columns should be dropped, leaving only the derived columns. If true, - * the before and after options are ignored. 
- * @property {Select} [before] - * An anchor column that relocated columns should be placed before. - * The value can be any legal column selection. If multiple columns are - * selected, only the first column will be used as an anchor. - * It is an error to specify both before and after options. - * @property {Select} [after] - * An anchor column that relocated columns should be placed after. - * The value can be any legal column selection. If multiple columns are - * selected, only the last column will be used as an anchor. - * It is an error to specify both before and after options. - */ - -/** - * Options for relocate transformations. - * @typedef {object} RelocateOptions - * @property {Selection} [before] - * An anchor column that relocated columns should be placed before. - * The value can be any legal column selection. If multiple columns are - * selected, only the first column will be used as an anchor. - * It is an error to specify both before and after options. - * @property {Selection} [after] - * An anchor column that relocated columns should be placed after. - * The value can be any legal column selection. If multiple columns are - * selected, only the last column will be used as an anchor. - * It is an error to specify both before and after options. - */ - -/** - * Options for sample transformations. - * @typedef {object} SampleOptions - * @property {boolean} [replace=false] Flag for sampling with replacement. - * @property {boolean} [shuffle=true] Flag to ensure randomly ordered rows. - * @property {string|TableExprFunc} [weight] Column values to use as weights - * for sampling. Rows will be sampled with probability proportional to - * their relative weight. The input should be a column name string or - * a table expression compatible with {@link Transformable#derive}. - */ - -/** - * Options for impute transformations. - * @typedef {object} ImputeOptions - * @property {ExprList} [expand] Column values to combine to impute missing - * rows. 
For column names and indices, all unique column values are - * considered. Otherwise, each entry should be an object of name-expresion - * pairs, with valid table expressions for {@link Transformable#rollup}. - * All combinations of values are checked for each set of unique groupby - * values. - */ - -/** - * Options for fold transformations. - * @typedef {object} FoldOptions - * @property {string[]} [as=['key', 'value']] An array indicating the - * output column names to use for the key and value columns, respectively. - */ - -/** - * Options for pivot transformations. - * @typedef {object} PivotOptions - * @property {number} [limit=Infinity] The maximum number of new columns to generate. - * @property {string} [keySeparator='_'] A string to place between multiple key names. - * @property {string} [valueSeparator='_'] A string to place between key and value names. - * @property {boolean} [sort=true] Flag for alphabetical sorting of new column names. - */ - -/** - * Options for spread transformations. - * @typedef {object} SpreadOptions - * @property {boolean} [drop=true] Flag indicating if input columns to the - * spread operation should be dropped in the output table. - * @property {number} [limit=Infinity] The maximum number of new columns to - * generate. - * @property {string[]} [as] Output column names to use. This option only - * applies when a single column is spread. If the given array of names is - * shorter than the number of generated columns and no limit option is - * specified, the additional generated columns will be dropped. - */ - -/** - * Options for unroll transformations. - * @typedef {object} UnrollOptions - * @property {number} [limit=Infinity] The maximum number of new rows - * to generate per array value. - * @property {boolean|string} [index=false] Flag or column name for adding - * zero-based array index values as an output column. If true, a new column - * named "index" will be included. 
If string-valued, a new column with - * the given name will be added. - * @property {Select} [drop] Columns to drop from the output. The input may - * consist of column name strings, column integer indices, objects with - * column names as keys, or functions that take a table as input and - * return a valid selection parameter (typically the output of selection - * helper functions such as {@link all}, {@link not}, or {@link range}). - */ - -/** - * Options for join transformations. - * @typedef {object} JoinOptions - * @property {boolean} [left=false] Flag indicating a left outer join. - * If both the *left* and *right* are true, indicates a full outer join. - * @property {boolean} [right=false] Flag indicating a right outer join. - * If both the *left* and *right* are true, indicates a full outer join. - * @property {string[]} [suffix=['_1', '_2']] Column name suffixes to - * append if two columns with the same name are produced by the join. - */ \ No newline at end of file diff --git a/src/table/types.ts b/src/table/types.ts new file mode 100644 index 00000000..17dfba77 --- /dev/null +++ b/src/table/types.ts @@ -0,0 +1,407 @@ +import { Table } from './Table.js'; +import { BitSet } from './BitSet.js'; + +/** A table column value. */ +export type DataValue = any; + +/** Interface for table columns. */ +export interface ColumnType<T> { + /** The number of rows in the column. */ + length: number; + /** Retrieve the values at the given row index. */ + at(row: number): T; + /** Return a column value iterator. */ + [Symbol.iterator]() : Iterator<T>; +} + +/** A named collection of columns. */ +export type ColumnData = Record<string, ColumnType<DataValue>>; + +/** Table expression parameters. */ +export type Params = Record<string, any>; + +/** A typed array constructor.
*/ +export type TypedArrayConstructor = + | Uint8ArrayConstructor + | Uint16ArrayConstructor + | Uint32ArrayConstructor + | BigUint64ArrayConstructor + | Int8ArrayConstructor + | Int16ArrayConstructor + | Int32ArrayConstructor + | BigInt64ArrayConstructor + | Float32ArrayConstructor + | Float64ArrayConstructor; + +/** A typed array instance. */ +export type TypedArray = + | Uint8Array + | Uint16Array + | Uint32Array + | BigUint64Array + | Int8Array + | Int16Array + | Int32Array + | BigInt64Array + | Float32Array + | Float64Array; + +/** Table row object. */ +export type RowObject = Record<string, DataValue>; + +/** A table groupby specification. */ +export interface GroupBySpec { + /** The number of groups. */ + size: number; + /** Column names for each group. */ + names: string[]; + /** Value accessor functions for each group. */ + get: RowExpression[]; + /** Indices of an example table row for each group. */ + rows: number[] | Uint32Array; + /** Per-row group indices, length is total rows of table. */ + keys: number[] | Uint32Array; +} + +/** An expression evaluated over a table row. */ +export type RowExpression = ( + /** The table row. */ + row: number, + /** The backing table data store. */ + data: ColumnData +) => DataValue; + +/** Column value accessor. */ +export type ColumnGetter = ( + /** The table row. */ + row: number +) => DataValue; + +/** + * Comparator function for sorting table rows. Returns a negative value + * if rowA < rowB, positive if rowA > rowB, otherwise zero. + */ +export type RowComparator = ( + /** The table row index for the first row. */ + rowA: number, + /** The table row index for the second row. */ + rowB: number, + /** The backing table data store. */ + data: ColumnData +) => number; + +/** Options for derived table creation. */ +export interface CreateOptions { + /** The backing column data. */ + data?: ColumnData; + /** An ordered list of column names. */ + names?: readonly string[]; + /** An additional filter BitSet to apply.
*/ + filter?: BitSet; + /** The groupby specification to use, or null for no groups. */ + groups?: GroupBySpec; + /** The orderby comparator function to use, or null for no order. */ + order?: RowComparator +} + +/** Options for generating row objects. */ +export interface PrintOptions { + /** The maximum number of objects to create, default `Infinity`. */ + limit?: number; + /** The row offset indicating how many initial rows to skip, default `0`. */ + offset?: number; + /** + * An ordered set of columns to include. The input may consist of column + * name strings, column integer indices, objects with current column names + * as keys and new column names as values (for renaming), or selection + * helper functions such as *all*, *not*, or *range*. + */ + columns?: Select; +} + +/** Options for generating row objects. */ +export interface ObjectsOptions { + /** The maximum number of objects to create, default `Infinity`. */ + limit?: number; + /** The row offset indicating how many initial rows to skip, default `0`. */ + offset?: number; + /** + * An ordered set of columns to include. The input may consist of column + * name strings, column integer indices, objects with current column names + * as keys and new column names as values (for renaming), or selection + * helper functions such as *all*, *not*, or *range*. + */ + columns?: Select; + /** + * The export format for groups of rows. The default (false) is to ignore + * groups, returning a flat array of objects. The valid values are 'map' or + * true (for Map instances), 'object' (for standard objects), or 'entries' + * (for arrays in the style of Object.entries). For the 'object' format, + * groupby keys are coerced to strings to use as object property names; note + * that this can lead to undesirable behavior if the groupby keys are object + * values. The 'map' and 'entries' options preserve the groupby key values. 
+ */ + grouped?: 'map' | 'entries' | 'object' | boolean; +} + +/** A reference to a column by string name or integer index. */ +export type ColumnRef = string | number; + +/** A value that can be coerced to a string. */ +export interface Stringable { + /** String coercion method. */ + toString(): string; +} + +/** A table expression provided as a string or string-coercible value. */ +export type TableExprString = string | Stringable; + +/** A struct object with arbitrary named properties. */ +export type Struct = Record<string, any>; + +/** A function defined over a table row. */ +export type TableExprFunc = (d?: Struct, $?: Params) => any; + +/** A table expression defined over a single table. */ +export type TableExpr = TableExprFunc | TableExprString; + +/** A function defined over rows from two tables. */ +export type TableExprFunc2 = (a?: Struct, b?: Struct, $?: Params) => any; + +/** A table expression defined over two tables. */ +export type TableExpr2 = TableExprFunc2 | TableExprString; + +/** An object that maps current column names to new column names. */ +export type RenameMap = Record<string, string>; + +/** A selection helper function. */ +export type SelectHelper = (table: Table) => string[]; + +/** + * One or more column selections, potentially with renaming. + * The input may consist of a column name string, column integer index, a + * rename map object with current column names as keys and new column names + * as values, or a select helper function that takes a table as input and + * returns a valid selection parameter. + */ +export type SelectEntry = ColumnRef | RenameMap | SelectHelper; + +/** An ordered set of column selections, potentially with renaming. */ +export type Select = SelectEntry | SelectEntry[]; + +/** An object of column name / table expression pairs. */ +export type ExprObject = Record<string, TableExpr>; + +/** An object of column name / two-table expression pairs. */ +export type Expr2Object = Record<string, TableExpr2>; + +/** An ordered set of one or more column values.
*/ +export type ListEntry = ColumnRef | SelectHelper | ExprObject; + +/** + * An ordered set of column values. + * Entries may be column name strings, column index numbers, value objects + * with output column names for keys and table expressions for values, + * or a selection helper function. + */ +export type ExprList = ListEntry | ListEntry[]; + +/** A reference to a data table instance. */ +export type TableRef = Table | string; + +/** A list of one or more table references. */ +export type TableRefList = TableRef | TableRef[]; + +/** + * One or more orderby sort criteria. + * If a string, order by the column with that name. + * If a number, order by the column with that index. + * If a function, must be a valid table expression; aggregate functions + * are permitted, but window functions are not. + * If an object, object values must be valid values parameters + * with output column names for keys and table expressions + * for values. The output name keys will subsequently be ignored. + */ +export type OrderKey = ColumnRef | TableExpr | ExprObject; + +/** An ordered set of orderby sort criteria, in precedence order. */ +export type OrderKeys = OrderKey | OrderKey[]; + +/** Column values to use as a join key. */ +export type JoinKey = ColumnRef | TableExprFunc; + +/** An ordered set of join keys. */ +export type JoinKeys = + | JoinKey + | [JoinKey[]] + | [JoinKey, JoinKey] + | [JoinKey[], JoinKey[]]; + +/** A predicate specification for joining two tables. */ +export type JoinPredicate = JoinKeys | TableExprFunc2 | null; + +/** An array of per-table join values to extract. */ +export type JoinList = + | [ExprList] + | [ExprList, ExprList] + | [ExprList, ExprList, Expr2Object]; + +/** A specification of join values to extract. */ +export type JoinValues = JoinList | Expr2Object; + +// -- Transform Options ----------------------------------------------------- + +/** Options for count transformations. 
*/ +export interface CountOptions { + /** The name of the output count column, default `count`. */ + as?: string; +} + +/** Options for derive transformations. */ +export interface DeriveOptions { + /** + * A flag (default `false`) indicating if the original columns should be + * dropped, leaving only the derived columns. If true, the before and after + * options are ignored. + */ + drop?: boolean; + /** + * An anchor column that relocated columns should be placed before. + * The value can be any legal column selection. If multiple columns are + * selected, only the first column will be used as an anchor. + * It is an error to specify both before and after options. + */ + before?: Select; + /** + * An anchor column that relocated columns should be placed after. + * The value can be any legal column selection. If multiple columns are + * selected, only the last column will be used as an anchor. + * It is an error to specify both before and after options. + */ + after?: Select; +} + +/** Options for relocate transformations. */ +export interface RelocateOptions { + /** + * An anchor column that relocated columns should be placed before. + * The value can be any legal column selection. If multiple columns are + * selected, only the first column will be used as an anchor. + * It is an error to specify both before and after options. + */ + before?: Select; + /** + * An anchor column that relocated columns should be placed after. + * The value can be any legal column selection. If multiple columns are + * selected, only the last column will be used as an anchor. + * It is an error to specify both before and after options. + */ + after?: Select; +} + +/** Options for sample transformations. */ +export interface SampleOptions { + /** Flag for sampling with replacement (default `false`). */ + replace?: boolean; + /** Flag to ensure randomly ordered rows (default `true`). */ + shuffle?: boolean; + /** + * Column values to use as weights for sampling. 
Rows will be sampled with + * probability proportional to their relative weight. The input should be a + * column name string or a table expression compatible with *derive*. + */ + weight?: string | TableExprFunc; +} + +/** Options for impute transformations. */ +export interface ImputeOptions { + /** + * Column values to combine to impute missing rows. For column names and + * indices, all unique column values are considered. Otherwise, each entry + * should be an object of name-expression pairs, with valid table expressions + * for *rollup*. All combinations of values are checked for each set of + * unique groupby values. + */ + expand?: ExprList; +} + +/** Options for fold transformations. */ +export interface FoldOptions { + /** + * An array indicating the output column names to use for the key and value + * columns, respectively. The default is `['key', 'value']`. + */ + as?: string[]; +} + +/** Options for pivot transformations. */ +export interface PivotOptions { + /** The maximum number of new columns to generate (default `Infinity`). */ + limit?: number; + /** A string to place between multiple key names (default `_`). */ + keySeparator?: string; + /** A string to place between key and value names (default `_`). */ + valueSeparator?: string; + /** Flag for alphabetical sorting of new column names (default `true`). */ + sort?: boolean; +} + +/** Options for spread transformations. */ +export interface SpreadOptions { + /** + * Flag (default `true`) indicating if input columns to the + * spread operation should be dropped in the output table. + */ + drop?: boolean; + /** The maximum number of new columns to generate (default `Infinity`). */ + limit?: number; + /** + * Output column names to use. This option only applies when a single + * column is spread. If the given array of names is shorter than the + * number of generated columns and no limit option is specified, the + * additional generated columns will be dropped.
+ */ + as?: string[]; +} + +/** Options for unroll transformations. */ +export interface UnrollOptions { + /** + * The maximum number of new rows to generate per array value + * (default `Infinity`). + */ + limit?: number; + /** + * Flag or column name to add zero-based array index values as an output + * column (default `false`). If true, a column named "index" will be + * included. If string-valued, a column with the given name will be added. + */ + index?: boolean | string; + /** + * Columns to drop from the output. The input may consist of column name + * strings, column integer indices, objects with column names as keys, or + * functions that take a table as input and return a valid selection + * parameter (typically the output of selection helper functions such as + * *all*, *not*, or *range*). + */ + drop?: Select; +} + +/** Options for join transformations. */ +export interface JoinOptions { + /** + * Flag indicating a left outer join (default `false`). If both the + * *left* and *right* flags are true, indicates a full outer join. + */ + left?: boolean; + /** + * Flag indicating a right outer join (default `false`). If both the + * *left* and *right* flags are true, indicates a full outer join. + */ + right?: boolean; + /** + * Column name suffixes to append if two columns with the same name are + * produced by the join. The default is `['_1', '_2']`. + */ + suffix?: string[]; +} diff --git a/src/util/array-type.js b/src/util/array-type.js index 43f01d75..e7dadf13 100644 --- a/src/util/array-type.js +++ b/src/util/array-type.js @@ -1,5 +1,10 @@ -import isTypedArray from './is-typed-array'; +import isTypedArray from './is-typed-array.js'; +/** + * @param {*} column + * @returns {ArrayConstructor | import('../table/types.js').TypedArrayConstructor} + */ export default function(column) { - return isTypedArray(column.data) ? column.data.constructor : Array; -} \ No newline at end of file + // @ts-ignore + return isTypedArray(column) ?
column.constructor : Array; +} diff --git a/src/util/ascending.js b/src/util/ascending.js index c6600e2a..21a4b8f5 100644 --- a/src/util/ascending.js +++ b/src/util/ascending.js @@ -1,3 +1,3 @@ export default function(a, b) { return a < b ? -1 : a > b ? 1 : a >= b ? 0 : NaN; -} \ No newline at end of file +} diff --git a/src/util/assign.js b/src/util/assign.js index 3f6d4e30..fe25b0a2 100644 --- a/src/util/assign.js +++ b/src/util/assign.js @@ -1,8 +1,8 @@ -import entries from './entries'; +import entries from './entries.js'; export default function(map, pairs) { for (const [key, value] of entries(pairs)) { map.set(key, value); } return map; -} \ No newline at end of file +} diff --git a/src/util/auto-type.js b/src/util/auto-type.js index a91b1c75..d398ddbb 100644 --- a/src/util/auto-type.js +++ b/src/util/auto-type.js @@ -1,4 +1,4 @@ -import parseIsoDate from './parse-iso-date'; +import parseIsoDate from './parse-iso-date.js'; export default function(input) { const value = input.trim(); @@ -9,6 +9,7 @@ export default function(input) { : value === 'false' ? false : value === 'NaN' ? NaN : !isNaN(parsed = +value) ? parsed + // @ts-ignore : (parsed = parseIsoDate(value, d => new Date(d))) !== value ? parsed : input; -} \ No newline at end of file +} diff --git a/src/util/bins.js b/src/util/bins.js index 7faefd8a..d28244fa 100644 --- a/src/util/bins.js +++ b/src/util/bins.js @@ -42,4 +42,4 @@ export default function(min, max, maxbins = 15, nice = true, minstep = 0, step) max === min ? 
min + step : max, step ]; -} \ No newline at end of file +} diff --git a/src/util/concat.js b/src/util/concat.js index 4c776840..447b1c21 100644 --- a/src/util/concat.js +++ b/src/util/concat.js @@ -1,4 +1,5 @@ -export default function(list, fn = (x => x), delim = '') { +// eslint-disable-next-line no-unused-vars +export default function(list, fn = ((x, i) => x), delim = '') { const n = list.length; if (!n) return ''; @@ -8,4 +9,4 @@ export default function(list, fn = (x => x), delim = '') { } return s; -} \ No newline at end of file +} diff --git a/src/util/default-true.js b/src/util/default-true.js index 59cdb40c..4a22345d 100644 --- a/src/util/default-true.js +++ b/src/util/default-true.js @@ -1,3 +1,3 @@ export default function(value, trueValue = true, falseValue = false) { return (value === undefined || value) ? trueValue : falseValue; -} \ No newline at end of file +} diff --git a/src/util/descending.js b/src/util/descending.js index 82e4d7c7..a4e2d7fb 100644 --- a/src/util/descending.js +++ b/src/util/descending.js @@ -1,3 +1,3 @@ export default function(a, b) { return b < a ? -1 : b > a ? 1 : b >= a ? 0 : NaN; -} \ No newline at end of file +} diff --git a/src/util/distinct-map.js b/src/util/distinct-map.js index 1a248acd..1d27c1f3 100644 --- a/src/util/distinct-map.js +++ b/src/util/distinct-map.js @@ -1,4 +1,4 @@ -import { key } from './key-function'; +import { key } from './key-function.js'; export default function() { const map = new Map(); @@ -23,4 +23,4 @@ export default function() { map.forEach(({ v, n }) => fn(v, n)); } }; -} \ No newline at end of file +} diff --git a/src/util/entries.js b/src/util/entries.js index 6bf4a2b8..ce948bc3 100644 --- a/src/util/entries.js +++ b/src/util/entries.js @@ -1,9 +1,9 @@ -import isArray from './is-array'; -import isMap from './is-map'; +import isArray from './is-array.js'; +import isMap from './is-map.js'; export default function(value) { return isArray(value) ? value : isMap(value) ? value.entries() : value ? 
Object.entries(value) : []; -} \ No newline at end of file +} diff --git a/src/util/error.js b/src/util/error.js index 03797070..80e4372c 100644 --- a/src/util/error.js +++ b/src/util/error.js @@ -1,3 +1,4 @@ -export default function(message) { - throw Error(message); -} \ No newline at end of file +export default function(message, cause) { + // @ts-ignore + throw Error(message, { cause }); +} diff --git a/src/util/escape-regexp.js b/src/util/escape-regexp.js index f1b8a71a..c0501e2c 100644 --- a/src/util/escape-regexp.js +++ b/src/util/escape-regexp.js @@ -1,3 +1,3 @@ export default function(str) { return str.replace(/[.*+\-?^${}()|[\]\\]/g, '\\$&'); -} \ No newline at end of file +} diff --git a/src/util/format-date.js b/src/util/format-date.js index 672b0323..4f05ef91 100644 --- a/src/util/format-date.js +++ b/src/util/format-date.js @@ -1,4 +1,4 @@ -import pad from './pad'; +import pad from './pad.js'; const pad2 = v => (v < 10 ? '0' : '') + v; @@ -44,4 +44,4 @@ export function formatUTCDate(d, short) { d.getUTCMilliseconds(), true, short ); -} \ No newline at end of file +} diff --git a/src/util/has.js b/src/util/has.js index 2bc5e1df..12ee46fa 100644 --- a/src/util/has.js +++ b/src/util/has.js @@ -1,5 +1 @@ -const { hasOwnProperty } = Object.prototype; - -export default function(object, property) { - return hasOwnProperty.call(object, property); -} \ No newline at end of file +export default Object.hasOwn; diff --git a/src/util/identity.js b/src/util/identity.js index 1afdd419..9a800599 100644 --- a/src/util/identity.js +++ b/src/util/identity.js @@ -1 +1 @@ -export default x => x; \ No newline at end of file +export default x => x; diff --git a/src/util/intersect.js b/src/util/intersect.js index c3570aeb..ee53fd14 100644 --- a/src/util/intersect.js +++ b/src/util/intersect.js @@ -1,4 +1,4 @@ export default function intersect(a, b) { const set = new Set(b); return a.filter(x => set.has(x)); -} \ No newline at end of file +} diff --git 
a/src/util/is-array-type.js b/src/util/is-array-type.js index dbf32290..5fa76dda 100644 --- a/src/util/is-array-type.js +++ b/src/util/is-array-type.js @@ -1,6 +1,10 @@ -import isArray from './is-array'; -import isTypedArray from './is-typed-array'; +import isArray from './is-array.js'; +import isTypedArray from './is-typed-array.js'; +/** + * @param {*} value + * @return {value is (any[] | import('../table/types.js').TypedArray)} + */ export default function isArrayType(value) { return isArray(value) || isTypedArray(value); -} \ No newline at end of file +} diff --git a/src/util/is-bigint.js b/src/util/is-bigint.js index 445be604..e3ad5d0e 100644 --- a/src/util/is-bigint.js +++ b/src/util/is-bigint.js @@ -1,3 +1,3 @@ export default function(value) { return typeof value === 'bigint'; -} \ No newline at end of file +} diff --git a/src/util/is-date.js b/src/util/is-date.js index 2c023a41..b4a8f836 100644 --- a/src/util/is-date.js +++ b/src/util/is-date.js @@ -1,3 +1,3 @@ export default function(value) { return value instanceof Date; -} \ No newline at end of file +} diff --git a/src/util/is-digit-string.js b/src/util/is-digit-string.js index ae89d578..09b27e2f 100644 --- a/src/util/is-digit-string.js +++ b/src/util/is-digit-string.js @@ -5,4 +5,4 @@ export default function(value) { if (c < 48 || c > 57) return false; } return true; -} \ No newline at end of file +} diff --git a/src/util/is-exact-utc-date.js b/src/util/is-exact-utc-date.js index f3f532e8..88232fd2 100644 --- a/src/util/is-exact-utc-date.js +++ b/src/util/is-exact-utc-date.js @@ -3,4 +3,4 @@ export default function(d) { && d.getUTCMinutes() === 0 && d.getUTCSeconds() === 0 && d.getUTCMilliseconds() === 0; -} \ No newline at end of file +} diff --git a/src/util/is-function.js b/src/util/is-function.js index 62baa6f2..5249120d 100644 --- a/src/util/is-function.js +++ b/src/util/is-function.js @@ -1,3 +1,3 @@ export default function(value) { return typeof value === 'function'; -} \ No newline at end of 
file +} diff --git a/src/util/is-iso-date-string.js b/src/util/is-iso-date-string.js index e8f6ba5c..8d94a5b6 100644 --- a/src/util/is-iso-date-string.js +++ b/src/util/is-iso-date-string.js @@ -2,4 +2,4 @@ const iso_re = /^([-+]\d{2})?\d{4}(-\d{2}(-\d{2})?)?(T\d{2}:\d{2}(:\d{2}(\.\d{3} export default function(value) { return value.match(iso_re) && !isNaN(Date.parse(value)); -} \ No newline at end of file +} diff --git a/src/util/is-map-or-set.js b/src/util/is-map-or-set.js index 9a1abe6b..b29edb82 100644 --- a/src/util/is-map-or-set.js +++ b/src/util/is-map-or-set.js @@ -1,6 +1,10 @@ -import isMap from './is-map'; -import isSet from './is-set'; +import isMap from './is-map.js'; +import isSet from './is-set.js'; +/** + * @param {*} value + * @return {value is Map | Set} + */ export default function(value) { return isMap(value) || isSet(value); -} \ No newline at end of file +} diff --git a/src/util/is-map.js b/src/util/is-map.js index 43a9e434..c7be6098 100644 --- a/src/util/is-map.js +++ b/src/util/is-map.js @@ -1,3 +1,7 @@ +/** + * @param {*} value + * @return {value is Map} + */ export default function(value) { return value instanceof Map; -} \ No newline at end of file +} diff --git a/src/util/is-number.js b/src/util/is-number.js index 3304d1fd..f28f0e3c 100644 --- a/src/util/is-number.js +++ b/src/util/is-number.js @@ -1,3 +1,3 @@ export default function(value) { return typeof value === 'number'; -} \ No newline at end of file +} diff --git a/src/util/is-object.js b/src/util/is-object.js index 2b4bcc70..a26b8679 100644 --- a/src/util/is-object.js +++ b/src/util/is-object.js @@ -1,3 +1,3 @@ export default function(value) { return value === Object(value); -} \ No newline at end of file +} diff --git a/src/util/is-regexp.js b/src/util/is-regexp.js index 2af4b671..5c511621 100644 --- a/src/util/is-regexp.js +++ b/src/util/is-regexp.js @@ -1,3 +1,3 @@ export default function(value) { return value instanceof RegExp; -} \ No newline at end of file +} diff --git 
a/src/util/is-set.js b/src/util/is-set.js index 30888676..3b26a434 100644 --- a/src/util/is-set.js +++ b/src/util/is-set.js @@ -1,3 +1,7 @@ +/** + * @param {*} value + * @return {value is Set} + */ export default function(value) { return value instanceof Set; -} \ No newline at end of file +} diff --git a/src/util/is-string.js b/src/util/is-string.js index 653c8a56..7944070e 100644 --- a/src/util/is-string.js +++ b/src/util/is-string.js @@ -1,3 +1,7 @@ +/** + * @param {*} value + * @return {value is String} + */ export default function(value) { return typeof value === 'string'; } diff --git a/src/util/is-typed-array.js b/src/util/is-typed-array.js index 32cc0f0f..065ba907 100644 --- a/src/util/is-typed-array.js +++ b/src/util/is-typed-array.js @@ -1,5 +1,9 @@ const TypedArray = Object.getPrototypeOf(Int8Array); +/** + * @param {*} value + * @return {value is import("../table/types.js").TypedArray} + */ export default function(value) { return value instanceof TypedArray; -} \ No newline at end of file +} diff --git a/src/util/key-function.js b/src/util/key-function.js index d3cbecb9..463f7021 100644 --- a/src/util/key-function.js +++ b/src/util/key-function.js @@ -1,7 +1,7 @@ -import isArray from './is-array'; -import isDate from './is-date'; -import isRegExp from './is-regexp'; -import isTypedArray from './is-typed-array'; +import isArray from './is-array.js'; +import isDate from './is-date.js'; +import isRegExp from './is-regexp.js'; +import isTypedArray from './is-typed-array.js'; export function key(value) { const type = typeof value; @@ -38,4 +38,4 @@ export default function(get, nulls) { } return s; }; -} \ No newline at end of file +} diff --git a/src/util/map-object.js b/src/util/map-object.js index 2f8451fa..e80b7529 100644 --- a/src/util/map-object.js +++ b/src/util/map-object.js @@ -3,4 +3,4 @@ export default function(obj, fn, output = {}) { output[key] = fn(obj[key], key); } return output; -} \ No newline at end of file +} diff --git a/src/util/max.js 
b/src/util/max.js index 3e9dd8c8..91bf0867 100644 --- a/src/util/max.js +++ b/src/util/max.js @@ -1,4 +1,4 @@ -import NULL from './null'; +import NULL from './null.js'; export default function(values, start = 0, stop = values.length) { let max = stop ? values[start++] : NULL; @@ -10,4 +10,4 @@ export default function(values, start = 0, stop = values.length) { } return max; -} \ No newline at end of file +} diff --git a/src/util/min.js b/src/util/min.js index 414b18e4..02a1ddbe 100644 --- a/src/util/min.js +++ b/src/util/min.js @@ -1,4 +1,4 @@ -import NULL from './null'; +import NULL from './null.js'; export default function(values, start = 0, stop = values.length) { let min = stop ? values[start++] : NULL; @@ -10,4 +10,4 @@ export default function(values, start = 0, stop = values.length) { } return min; -} \ No newline at end of file +} diff --git a/src/util/no-op.js b/src/util/no-op.js index 421474fe..6ab80bc8 100644 --- a/src/util/no-op.js +++ b/src/util/no-op.js @@ -1 +1 @@ -export default function() {} \ No newline at end of file +export default function() {} diff --git a/src/util/null.js b/src/util/null.js index 6affaebd..ff1cd12a 100644 --- a/src/util/null.js +++ b/src/util/null.js @@ -1,4 +1,4 @@ /** * Default NULL (missing) value to use. */ -export default undefined; \ No newline at end of file +export default undefined; diff --git a/src/util/pad.js b/src/util/pad.js index 2175a974..9e8a3400 100644 --- a/src/util/pad.js +++ b/src/util/pad.js @@ -2,4 +2,4 @@ export default function(value, width, char = '0') { const s = value + ''; const len = s.length; return len < width ? 
Array(width - len + 1).join(char) + s : s; -} \ No newline at end of file +} diff --git a/src/util/parse-dsv.js b/src/util/parse-dsv.js index 185cdfbe..9d5d1961 100644 --- a/src/util/parse-dsv.js +++ b/src/util/parse-dsv.js @@ -1,4 +1,4 @@ -import error from './error'; +import error from './error.js'; const EOL = {}; const EOF = {}; @@ -31,7 +31,7 @@ const RETURN = 13; // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. export default function( text, - { delimiter = ',', comment, skip } = {} + { delimiter = ',', comment = undefined, skip = 0 } = {} ) { if (delimiter.length !== 1) { error(`Text delimiter should be a single character: "${delimiter}"`); @@ -99,4 +99,4 @@ export default function( } } }; -} \ No newline at end of file +} diff --git a/src/util/parse-iso-date.js b/src/util/parse-iso-date.js index c3f9e9fc..486fc4fc 100644 --- a/src/util/parse-iso-date.js +++ b/src/util/parse-iso-date.js @@ -1,5 +1,5 @@ -import isISODateString from './is-iso-date-string'; +import isISODateString from './is-iso-date-string.js'; export default function(value, parse = Date.parse) { return isISODateString(value) ? 
parse(value) : value; -} \ No newline at end of file +} diff --git a/src/util/parse-values.js b/src/util/parse-values.js index 44dc9562..635c9d9b 100644 --- a/src/util/parse-values.js +++ b/src/util/parse-values.js @@ -1,5 +1,5 @@ -import identity from './identity'; -import isISODateString from './is-iso-date-string'; +import identity from './identity.js'; +import isISODateString from './is-iso-date-string.js'; const parseBoolean = [ // boolean v => (v === 'true') || (v === 'false'), @@ -44,4 +44,4 @@ function check(values, test) { } } return true; -} \ No newline at end of file +} diff --git a/src/util/product.js b/src/util/product.js index 45c2f0de..e4a6050c 100644 --- a/src/util/product.js +++ b/src/util/product.js @@ -6,4 +6,4 @@ export default function(values, start = 0, stop = values.length) { } return prod; -} \ No newline at end of file +} diff --git a/src/util/quantile.js b/src/util/quantile.js index 5c476abc..887df136 100644 --- a/src/util/quantile.js +++ b/src/util/quantile.js @@ -1,6 +1,6 @@ -import isBigInt from './is-bigint'; -import NULL from './null'; -import toNumeric from './to-numeric'; +import isBigInt from './is-bigint.js'; +import NULL from './null.js'; +import toNumeric from './to-numeric.js'; export default function quantile(values, p) { const n = values.length; @@ -14,5 +14,6 @@ export default function quantile(values, p) { const v0 = toNumeric(values[i0]); return isBigInt(v0) ? 
v0 + // @ts-ignore : v0 + (toNumeric(values[i0 + 1]) - v0) * (i - i0); -} \ No newline at end of file +} diff --git a/src/util/random.js b/src/util/random.js index 70c5da16..e3d1cdbc 100644 --- a/src/util/random.js +++ b/src/util/random.js @@ -1,4 +1,4 @@ -import isValid from './is-valid'; +import isValid from './is-valid.js'; let source = Math.random; @@ -28,4 +28,4 @@ function lcg(seed) { // Random numbers using a Linear Congruential Generator with seed value // https://en.wikipedia.org/wiki/Linear_congruential_generator return () => (seed = a * seed + c | 0, m * (seed >>> 0)); -} \ No newline at end of file +} diff --git a/src/util/repeat.js b/src/util/repeat.js index 318dff1e..088d2fc6 100644 --- a/src/util/repeat.js +++ b/src/util/repeat.js @@ -1,4 +1,4 @@ -import isFunction from './is-function'; +import isFunction from './is-function.js'; export default function(reps, value) { const result = Array(reps); @@ -10,4 +10,4 @@ export default function(reps, value) { result.fill(value); } return result; -} \ No newline at end of file +} diff --git a/src/util/sample.js b/src/util/sample.js index f07dd695..d2c8558b 100644 --- a/src/util/sample.js +++ b/src/util/sample.js @@ -1,6 +1,6 @@ -import ascending from './ascending'; -import bisector from './bisector'; -import { random } from './random'; +import ascending from './ascending.js'; +import bisector from './bisector.js'; +import { random } from './random.js'; export default function(buffer, replace, index, weight) { return ( @@ -78,4 +78,4 @@ function sampleNW(size, buffer, index, weight) { buffer[i] = index[k[i]]; } return buffer; -} \ No newline at end of file +} diff --git a/src/util/shuffle.js b/src/util/shuffle.js index cbf09c07..f393cbec 100644 --- a/src/util/shuffle.js +++ b/src/util/shuffle.js @@ -1,4 +1,4 @@ -import { random } from './random'; +import { random } from './random.js'; export default function(array, lo = 0, hi = array.length) { let n = hi - (lo = +lo); @@ -11,4 +11,4 @@ export default 
function(array, lo = 0, hi = array.length) { } return array; -} \ No newline at end of file +} diff --git a/src/util/to-array.js b/src/util/to-array.js index 20f9744e..dcd2014f 100644 --- a/src/util/to-array.js +++ b/src/util/to-array.js @@ -1,7 +1,7 @@ -import isArray from './is-array'; +import isArray from './is-array.js'; export default function(value) { return value != null ? (isArray(value) ? value : [value]) : []; -} \ No newline at end of file +} diff --git a/src/util/to-function.js b/src/util/to-function.js index a0d9b004..939809c4 100644 --- a/src/util/to-function.js +++ b/src/util/to-function.js @@ -1,5 +1,5 @@ -import isFunction from './is-function'; +import isFunction from './is-function.js'; export default function(value) { return isFunction(value) ? value : () => value; -} \ No newline at end of file +} diff --git a/src/util/to-numeric.js b/src/util/to-numeric.js index 86d5fdc5..467ac34c 100644 --- a/src/util/to-numeric.js +++ b/src/util/to-numeric.js @@ -1,5 +1,5 @@ -import isBigInt from './is-bigint'; +import isBigInt from './is-bigint.js'; export default function(value) { return isBigInt(value) ? value : +value; -} \ No newline at end of file +} diff --git a/src/util/to-string.js b/src/util/to-string.js index 34e9fd5e..70207b11 100644 --- a/src/util/to-string.js +++ b/src/util/to-string.js @@ -1,7 +1,7 @@ -import isBigInt from './is-bigint'; +import isBigInt from './is-bigint.js'; export default function(v) { return v === undefined ? v + '' : isBigInt(v) ? v + 'n' : JSON.stringify(v); -} \ No newline at end of file +} diff --git a/src/util/unique-name.js b/src/util/unique-name.js index f182e084..211ff332 100644 --- a/src/util/unique-name.js +++ b/src/util/unique-name.js @@ -1,4 +1,4 @@ -import isMapOrSet from './is-map-or-set'; +import isMapOrSet from './is-map-or-set.js'; export default function(names, name) { names = isMapOrSet(names) ? 
names : new Set(names); @@ -10,4 +10,4 @@ export default function(names, name) { } return uname; -} \ No newline at end of file +} diff --git a/src/util/unroll.js b/src/util/unroll.js index c53f81c7..45dca6f2 100644 --- a/src/util/unroll.js +++ b/src/util/unroll.js @@ -8,4 +8,4 @@ export default function(args, code, ...lists) { + `; return (${args}) => ${code};` ); return Function(...a)(...lists); -} \ No newline at end of file +} diff --git a/src/util/value-list.js b/src/util/value-list.js index 8c018e2f..f0401bc3 100644 --- a/src/util/value-list.js +++ b/src/util/value-list.js @@ -1,7 +1,7 @@ -import ascending from './ascending'; -import min from './min'; -import max from './max'; -import quantile from './quantile'; +import ascending from './ascending.js'; +import max from './max.js'; +import min from './min.js'; +import quantile from './quantile.js'; export default class ValueList { constructor(values) { @@ -49,4 +49,4 @@ export default class ValueList { } return quantile(this._sorted, p); } -} \ No newline at end of file +} diff --git a/src/verbs/assign.js b/src/verbs/assign.js new file mode 100644 index 00000000..56d14ab5 --- /dev/null +++ b/src/verbs/assign.js @@ -0,0 +1,17 @@ +import { columnSet } from '../table/ColumnSet.js'; +import { Table } from '../table/Table.js'; +import error from '../util/error.js'; + +export function assign(table, ...others) { + others = others.flat(); + const nrows = table.numRows(); + const base = table.reify(); + const cols = columnSet(base).groupby(base.groups()); + others.forEach(input => { + input = input instanceof Table ? 
input : new Table(input); + if (input.numRows() !== nrows) error('Assign row counts do not match'); + input = input.reify(); + input.columnNames(name => cols.add(name, input.column(name))); + }); + return cols.new(table); +} diff --git a/src/engine/concat.js b/src/verbs/concat.js similarity index 55% rename from src/engine/concat.js rename to src/verbs/concat.js index 7be683e8..c5f7942b 100644 --- a/src/engine/concat.js +++ b/src/verbs/concat.js @@ -1,7 +1,8 @@ -import columnSet from '../table/column-set'; -import NULL from '../util/null'; +import { columnSet } from '../table/ColumnSet.js'; +import NULL from '../util/null.js'; -export default function(table, others) { +export function concat(table, ...others) { + others = others.flat(); const trows = table.numRows(); const nrows = trows + others.reduce((n, t) => n + t.numRows(), 0); if (trows === nrows) return table; @@ -13,11 +14,11 @@ export default function(table, others) { const arr = Array(nrows); let row = 0; tables.forEach(table => { - const col = table.column(name) || { get: () => NULL }; - table.scan(trow => arr[row++] = col.get(trow)); + const col = table.column(name) || { at: () => NULL }; + table.scan(trow => arr[row++] = col.at(trow)); }); cols.add(name, arr); }); - return table.create(cols.new()); -} \ No newline at end of file + return cols.new(table); +} diff --git a/src/verbs/dedupe.js b/src/verbs/dedupe.js index 2a110083..c68c916a 100644 --- a/src/verbs/dedupe.js +++ b/src/verbs/dedupe.js @@ -1,7 +1,8 @@ -export default function(table, keys = []) { - return table - .groupby(keys.length ? keys : table.columnNames()) - .filter('row_number() === 1') - .ungroup() - .reify(); -} \ No newline at end of file +import { groupby } from './groupby.js'; +import { filter } from './filter.js'; + +export function dedupe(table, ...keys) { + keys = keys.flat(); + const gt = groupby(table, keys.length ? 
keys : table.columnNames()); + return filter(gt, 'row_number() === 1').ungroup().reify(); +} diff --git a/src/verbs/derive.js b/src/verbs/derive.js index 29d066f1..1fbc71cc 100644 --- a/src/verbs/derive.js +++ b/src/verbs/derive.js @@ -1,14 +1,92 @@ -import relocate from './relocate'; -import _derive from '../engine/derive'; -import parse from '../expression/parse'; +import { relocate } from './relocate.js'; +import { aggregate } from './reduce/util.js'; +import { window } from './window/window.js'; +import parse from '../expression/parse.js'; +import { hasWindow } from '../op/index.js'; +import { columnSet } from '../table/ColumnSet.js'; +import repeat from '../util/repeat.js'; -export default function(table, values, options = {}) { +function isWindowed(op) { + return hasWindow(op.name) || + op.frame && ( + Number.isFinite(op.frame[0]) || + Number.isFinite(op.frame[1]) + ); +} + +export function derive(table, values, options = {}) { const dt = _derive(table, parse(values, { table }), options); return options.drop || (options.before == null && options.after == null) ? dt - : relocate(dt, + : relocate( + dt, Object.keys(values).filter(name => !table.column(name)), options ); -} \ No newline at end of file +} + +export function _derive(table, { names, exprs, ops = [] }, options = {}) { + // instantiate output data + const total = table.totalRows(); + const cols = columnSet(options.drop ? null : table); + const data = names.map(name => cols.add(name, Array(total))); + + // analyze operations, compute non-windowed aggregates + const [ aggOps, winOps ] = segmentOps(ops); + + const size = table.isGrouped() ? table.groups().size : 1; + const result = aggregate( + table, aggOps, + repeat(ops.length, () => Array(size)) + ); + + // perform table scans to generate output values + winOps.length + ? 
window(table, data, exprs, result, winOps) + : output(table, data, exprs, result); + + return cols.derive(table); +} + +function segmentOps(ops) { + const aggOps = []; + const winOps = []; + const n = ops.length; + + for (let i = 0; i < n; ++i) { + const op = ops[i]; + op.id = i; + (isWindowed(op) ? winOps : aggOps).push(op); + } + + return [aggOps, winOps]; +} + +function output(table, cols, exprs, result) { + const bits = table.mask(); + const data = table.data(); + const { keys } = table.groups() || {}; + const op = keys + ? (id, row) => result[id][keys[row]] + : id => result[id][0]; + + const m = cols.length; + for (let j = 0; j < m; ++j) { + const get = exprs[j]; + const col = cols[j]; + + // inline the following for performance: + // table.scan((i, data) => col[i] = get(i, data, op)); + if (bits) { + for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { + col[i] = get(i, data, op); + } + } else { + const n = table.totalRows(); + for (let i = 0; i < n; ++i) { + col[i] = get(i, data, op); + } + } + } +} diff --git a/src/verbs/except.js b/src/verbs/except.js index c5316b75..972ea26e 100644 --- a/src/verbs/except.js +++ b/src/verbs/except.js @@ -1,5 +1,9 @@ -export default function(table, others) { +import { dedupe } from './dedupe.js'; +import { antijoin } from './join-filter.js'; + +export function except(table, ...others) { + others = others.flat(); if (others.length === 0) return table; const names = table.columnNames(); - return others.reduce((a, b) => a.antijoin(b.select(names)), table).dedupe(); -} \ No newline at end of file + return dedupe(others.reduce((a, b) => antijoin(a, b.select(names)), table)); +} diff --git a/src/verbs/filter.js b/src/verbs/filter.js index 12e84a4a..fe4ca1b8 100644 --- a/src/verbs/filter.js +++ b/src/verbs/filter.js @@ -1,13 +1,34 @@ -import _derive from '../engine/derive'; -import _filter from '../engine/filter'; -import parse from '../expression/parse'; +import { _derive } from './derive.js'; +import parse from 
'../expression/parse.js'; +import { BitSet } from '../table/BitSet.js'; -export default function(table, criteria) { +export function filter(table, criteria) { const test = parse({ p: criteria }, { table }); let predicate = test.exprs[0]; if (test.ops.length) { - const { data } = _derive(table, test, { drop: true }).column('p'); - predicate = row => data[row]; + const data = _derive(table, test, { drop: true }).column('p'); + predicate = row => data.at(row); } return _filter(table, predicate); -} \ No newline at end of file +} + +export function _filter(table, predicate) { + const n = table.totalRows(); + const bits = table.mask(); + const data = table.data(); + const filter = new BitSet(n); + + // inline the following for performance: + // table.scan((row, data) => { if (predicate(row, data)) filter.set(row); }); + if (bits) { + for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { + if (predicate(i, data)) filter.set(i); + } + } else { + for (let i = 0; i < n; ++i) { + if (predicate(i, data)) filter.set(i); + } + } + + return table.create({ filter }); +} diff --git a/src/verbs/fold.js b/src/verbs/fold.js index ebf98e9b..ac127041 100644 --- a/src/verbs/fold.js +++ b/src/verbs/fold.js @@ -1,6 +1,23 @@ -import _fold from '../engine/fold'; -import parse from './util/parse'; +import { aggregateGet } from './reduce/util.js'; +import { _unroll } from './unroll.js'; +import parse from './util/parse.js'; -export default function(table, values, options) { +export function fold(table, values, options) { return _fold(table, parse('fold', table, values), options); -} \ No newline at end of file +} + +export function _fold(table, { names = [], exprs = [], ops = [] }, options = {}) { + if (names.length === 0) return table; + + const [k = 'key', v = 'value'] = options.as || []; + const vals = aggregateGet(table, ops, exprs); + + return _unroll( + table, + { + names: [k, v], + exprs: [() => names, (row, data) => vals.map(fn => fn(row, data))] + }, + { ...options, drop: names 
} + ); +} diff --git a/src/verbs/groupby.js b/src/verbs/groupby.js index ebac90c1..dbf5c92f 100644 --- a/src/verbs/groupby.js +++ b/src/verbs/groupby.js @@ -1,6 +1,54 @@ -import _groupby from '../engine/groupby'; -import parse from './util/parse'; +import { aggregateGet } from './reduce/util.js'; +import parse from './util/parse.js'; +import keyFunction from '../util/key-function.js'; -export default function(table, values) { - return _groupby(table, parse('groupby', table, values)); -} \ No newline at end of file +export function groupby(table, ...values) { + return _groupby(table, parse('groupby', table, values.flat())); +} + +export function _groupby(table, exprs) { + return table.create({ + groups: createGroups(table, exprs) + }); +} + +function createGroups(table, { names = [], exprs = [], ops = [] }) { + const n = names.length; + if (n === 0) return null; + + // check for optimized path when grouping by a single field + // use pre-calculated groups if available + if (n === 1 && !table.isFiltered() && exprs[0].field) { + const col = table.column(exprs[0].field); + if (col.groups) return col.groups(names); + } + + let get = aggregateGet(table, ops, exprs); + const getKey = keyFunction(get); + const nrows = table.totalRows(); + const keys = new Uint32Array(nrows); + const index = {}; + const rows = []; + + // inline table scan for performance + const data = table.data(); + const bits = table.mask(); + if (bits) { + for (let i = bits.next(0); i >= 0; i = bits.next(i + 1)) { + const key = getKey(i, data) + ''; + keys[i] = (index[key] ??= rows.push(i) - 1); + } + } else { + for (let i = 0; i < nrows; ++i) { + const key = getKey(i, data) + ''; + keys[i] = (index[key] ??= rows.push(i) - 1); + } + } + + if (!ops.length) { + // capture data in closure, so no interaction with select + get = get.map(f => row => f(row, data)); + } + + return { keys, get, names, rows, size: rows.length }; +} diff --git a/src/verbs/helpers/agg.js b/src/verbs/helpers/agg.js index 
2d213508..0cff02f5 100644 --- a/src/verbs/helpers/agg.js +++ b/src/verbs/helpers/agg.js @@ -1,16 +1,17 @@ -import Table from '../../table/table'; // eslint-disable-line no-unused-vars +import { rollup } from '../rollup.js'; +import { ungroup } from '../ungroup.js'; /** * Convenience function for computing a single aggregate value for * a table. Equivalent to ungrouping a table, applying a rollup verb * for a single aggregate, and extracting the resulting value. - * @param {Table} table A table instance. - * @param {import('../../table/transformable').TableExpr} expr An + * @param {import('../../table/Table.js').Table} table A table instance. + * @param {import('../../table/types.js').TableExpr} expr An * aggregate table expression to evaluate. - * @return {import('../../table/table').DataValue} The aggregate value. + * @return {import('../../table/types.js').DataValue} The aggregate value. * @example agg(table, op.max('colA')) * @example agg(table, d => [op.min('colA'), op.max('colA')]) */ export default function agg(table, expr) { - return table.ungroup().rollup({ _: expr }).get('_'); -} \ No newline at end of file + return rollup(ungroup(table), { _: expr }).get('_'); +} diff --git a/src/verbs/impute.js b/src/verbs/impute.js index fcad1ac0..66fb7f3f 100644 --- a/src/verbs/impute.js +++ b/src/verbs/impute.js @@ -1,12 +1,17 @@ -import _impute from '../engine/impute'; -import _rollup from '../engine/rollup'; -import parse from '../expression/parse'; -import parseValues from './util/parse'; -import { array_agg_distinct } from '../op/op-api'; -import error from '../util/error'; -import toString from '../util/to-string'; - -export default function(table, values, options = {}) { +import { aggregateGet } from './reduce/util.js'; +import { _rollup } from './rollup.js'; +import { ungroup } from './ungroup.js'; +import parseValues from './util/parse.js'; +import parse from '../expression/parse.js'; +import { array_agg_distinct } from '../op/op-api.js'; +import { columnSet } 
from '../table/ColumnSet.js'; +import error from '../util/error.js'; +import isValid from '../util/is-valid.js'; +import keyFunction from '../util/key-function.js'; +import toString from '../util/to-string.js'; +import unroll from '../util/unroll.js'; + +export function impute(table, values, options = {}) { values = parse(values, { table }); values.names.forEach(name => @@ -14,9 +19,9 @@ export default function(table, values, options = {}) { ); if (options.expand) { - const opt = { preparse, aggronly: true }; + const opt = { preparse, window: false, aggronly: true }; const params = parseValues('impute', table, options.expand, opt); - const result = _rollup(table.ungroup(), params); + const result = _rollup(ungroup(table), params); return _impute( table, values, params.names, params.names.map(name => result.get(name, 0)) @@ -31,4 +36,127 @@ function preparse(map) { map.forEach((value, key) => value.field ? map.set(key, array_agg_distinct(value + '')) : 0 ); -} \ No newline at end of file +} + +export function _impute(table, values, keys, arrays) { + const write = keys && keys.length; + table = write ? expand(table, keys, arrays) : table; + const { names, exprs, ops } = values; + const gets = aggregateGet(table, ops, exprs); + const cols = write ? null : columnSet(table); + const rows = table.totalRows(); + + names.forEach((name, i) => { + const col = table.column(name); + const out = write ? col : cols.add(name, Array(rows)); + const get = gets[i]; + + table.scan(idx => { + const v = col.at(idx); + out[idx] = !isValid(v) ? get(idx) : v; + }); + }); + + return write ? table : table.create(cols); +} + +function expand(table, keys, values) { + const groups = table.groups(); + const data = table.data(); + + // expansion keys and accessors + const keyNames = (groups ? groups.names : []).concat(keys); + const keyGet = (groups ? 
groups.get : []) + .concat(keys.map(key => table.getter(key))); + + // build hash of existing rows + const hash = new Set(); + const keyTable = keyFunction(keyGet); + table.scan((idx, data) => hash.add(keyTable(idx, data))); + + // initialize output table data + const names = table.columnNames(); + const cols = columnSet(); + const out = names.map(name => cols.add(name, [])); + names.forEach((name, i) => { + const old = data[name]; + const col = out[i]; + table.scan(row => col.push(old.at(row))); + }); + + // enumerate expanded value sets and augment output table + const keyEnum = keyFunction(keyGet.map((k, i) => a => a[i])); + const set = unroll( + 'v', + '{' + out.map((_, i) => `_${i}.push(v[$${i}]);`).join('') + '}', + out, names.map(name => keyNames.indexOf(name)) + ); + + if (groups) { + let row = groups.keys.length; + const prod = values.reduce((p, a) => p * a.length, groups.size); + const keys = new Uint32Array(prod + (row - hash.size)); + keys.set(groups.keys); + enumerate(groups, values, (vec, idx) => { + if (!hash.has(keyEnum(vec))) { + set(vec); + keys[row++] = idx[0]; + } + }); + cols.groupby({ ...groups, keys }); + } else { + enumerate(groups, values, vec => { + if (!hash.has(keyEnum(vec))) set(vec); + }); + } + + return cols.new(table); +} + +function enumerate(groups, values, callback) { + const offset = groups ? groups.get.length : 0; + const pad = groups ? 
1 : 0; + const len = pad + values.length; + const lens = new Int32Array(len); + const idxs = new Int32Array(len); + const set = []; + + if (groups) { + const { get, rows, size } = groups; + lens[0] = size; + set.push((vec, idx) => { + const row = rows[idx]; + for (let i = 0; i < offset; ++i) { + vec[i] = get[i](row); + } + }); + } + + values.forEach((a, i) => { + const j = i + offset; + lens[i + pad] = a.length; + set.push((vec, idx) => vec[j] = a[idx]); + }); + + const vec = Array(offset + values.length); + + // initialize value vector + for (let i = 0; i < len; ++i) { + set[i](vec, 0); + } + callback(vec, idxs); + + // enumerate all combinations of values + for (let i = len - 1; i >= 0;) { + const idx = ++idxs[i]; + if (idx < lens[i]) { + set[i](vec, idx); + callback(vec, idxs); + i = len - 1; + } else { + idxs[i] = 0; + set[i](vec, 0); + --i; + } + } +} diff --git a/src/verbs/index.js b/src/verbs/index.js index e401bfb3..5fbfd517 100644 --- a/src/verbs/index.js +++ b/src/verbs/index.js @@ -1,64 +1,27 @@ -import __dedupe from './dedupe'; -import __derive from './derive'; -import __except from './except'; -import __filter from './filter'; -import __fold from './fold'; -import __impute from './impute'; -import __intersect from './intersect'; -import __join from './join'; -import __semijoin from './join-filter'; -import __lookup from './lookup'; -import __pivot from './pivot'; -import __relocate from './relocate'; -import __rename from './rename'; -import __rollup from './rollup'; -import __sample from './sample'; -import __select from './select'; -import __spread from './spread'; -import __union from './union'; -import __unroll from './unroll'; -import __groupby from './groupby'; -import __orderby from './orderby'; - -import __concat from '../engine/concat'; -import __reduce from '../engine/reduce'; -import __ungroup from '../engine/ungroup'; -import __unorder from '../engine/unorder'; - -import { count } from '../op/op-api'; - -export default { - __antijoin: 
(table, other, on) => - __semijoin(table, other, on, { anti: true }), - __count: (table, options = {}) => - __rollup(table, { [options.as || 'count']: count() }), - __cross: (table, other, values, options) => - __join(table, other, () => true, values, { - ...options, left: true, right: true - }), - __concat, - __dedupe, - __derive, - __except, - __filter, - __fold, - __impute, - __intersect, - __join, - __lookup, - __pivot, - __relocate, - __rename, - __rollup, - __sample, - __select, - __semijoin, - __spread, - __union, - __unroll, - __groupby, - __orderby, - __ungroup, - __unorder, - __reduce -}; \ No newline at end of file +export { assign } from './assign.js'; +export { concat } from './concat.js'; +export { dedupe } from './dedupe.js'; +export { derive } from './derive.js'; +export { except } from './except.js'; +export { filter } from './filter.js'; +export { fold } from './fold.js'; +export { groupby } from './groupby.js'; +export { impute } from './impute.js'; +export { intersect } from './intersect.js'; +export { cross, join } from './join.js'; +export { antijoin, semijoin } from './join-filter.js'; +export { lookup } from './lookup.js'; +export { orderby } from './orderby.js'; +export { pivot } from './pivot.js'; +export { reduce } from './reduce.js'; +export { relocate } from './relocate.js'; +export { rename } from './rename.js'; +export { rollup } from './rollup.js'; +export { sample } from './sample.js'; +export { select } from './select.js'; +export { slice } from './slice.js'; +export { spread } from './spread.js'; +export { ungroup } from './ungroup.js'; +export { union } from './union.js'; +export { unorder } from './unorder.js'; +export { unroll } from './unroll.js'; diff --git a/src/verbs/intersect.js b/src/verbs/intersect.js index d4671418..fd90e184 100644 --- a/src/verbs/intersect.js +++ b/src/verbs/intersect.js @@ -1,6 +1,10 @@ -export default function(table, others) { +import { dedupe } from './dedupe.js'; +import { semijoin } from 
'./join-filter.js'; + +export function intersect(table, ...others) { + others = others.flat(); const names = table.columnNames(); return others.length - ? others.reduce((a, b) => a.semijoin(b.select(names)), table).dedupe() + ? dedupe(others.reduce((a, b) => semijoin(a, b.select(names)), table)) : table.reify([]); -} \ No newline at end of file +} diff --git a/src/verbs/join-filter.js b/src/verbs/join-filter.js index 408b08e5..148235bc 100644 --- a/src/verbs/join-filter.js +++ b/src/verbs/join-filter.js @@ -1,10 +1,19 @@ -import _join_filter from '../engine/join-filter'; -import { inferKeys, keyPredicate } from './util/join-keys'; -import parse from '../expression/parse'; -import isArray from '../util/is-array'; -import toArray from '../util/to-array'; +import { rowLookup } from './join/lookup.js'; +import { inferKeys, keyPredicate } from './util/join-keys.js'; +import parse from '../expression/parse.js'; +import { BitSet } from '../table/BitSet.js'; +import isArray from '../util/is-array.js'; +import toArray from '../util/to-array.js'; -export default function(tableL, tableR, on, options) { +export function semijoin(tableL, tableR, on) { + return join_filter(tableL, tableR, on, { anti: false }); +} + +export function antijoin(tableL, tableR, on) { + return join_filter(tableL, tableR, on, { anti: true }); +} + +export function join_filter(tableL, tableR, on, options) { on = inferKeys(tableL, tableR, on); const predicate = isArray(on) @@ -12,4 +21,61 @@ export default function(tableL, tableR, on, options) { : parse({ on }, { join: [tableL, tableR] }).exprs[0]; return _join_filter(tableL, tableR, predicate, options); -} \ No newline at end of file +} + +export function _join_filter(tableL, tableR, predicate, options = {}) { + // calculate semi-join filter mask + const filter = new BitSet(tableL.totalRows()); + const join = isArray(predicate) ? 
hashSemiJoin : loopSemiJoin; + join(filter, tableL, tableR, predicate); + + // if anti-join, negate the filter + if (options.anti) { + filter.not().and(tableL.mask()); + } + + return tableL.create({ filter }); +} + +function hashSemiJoin(filter, tableL, tableR, [keyL, keyR]) { + // build lookup table + const lut = rowLookup(tableR, keyR); + + // scan table, update filter with matches + tableL.scan((rowL, data) => { + const rowR = lut.get(keyL(rowL, data)); + if (rowR >= 0) filter.set(rowL); + }); +} + +function loopSemiJoin(filter, tableL, tableR, predicate) { + const nL = tableL.numRows(); + const nR = tableR.numRows(); + const dataL = tableL.data(); + const dataR = tableR.data(); + + if (tableL.isFiltered() || tableR.isFiltered()) { + // use indices as at least one table is filtered + const idxL = tableL.indices(false); + const idxR = tableR.indices(false); + for (let i = 0; i < nL; ++i) { + const rowL = idxL[i]; + for (let j = 0; j < nR; ++j) { + if (predicate(rowL, dataL, idxR[j], dataR)) { + filter.set(rowL); + break; + } + } + } + } else { + // no filters, enumerate row indices directly + for (let i = 0; i < nL; ++i) { + for (let j = 0; j < nR; ++j) { + if (predicate(i, dataL, j, dataR)) { + filter.set(i); + break; + } + } + } + } +} diff --git a/src/verbs/join.js b/src/verbs/join.js index 19df9b24..ccfa813c 100644 --- a/src/verbs/join.js +++ b/src/verbs/join.js @@ -1,17 +1,31 @@ -import _join from '../engine/join'; -import { inferKeys, keyPredicate } from './util/join-keys'; -import parseValue from './util/parse'; -import parse from '../expression/parse'; -import { all, not } from '../helpers/selection'; -import isArray from '../util/is-array'; -import isString from '../util/is-string'; -import toArray from '../util/to-array'; -import toString from '../util/to-string'; +import { indexLookup } from './join/lookup.js'; +import { inferKeys, keyPredicate } from './util/join-keys.js'; +import parseValue from './util/parse.js'; +import parse from 
'../expression/parse.js'; +import { all, not } from '../helpers/selection.js'; +import { columnSet } from '../table/ColumnSet.js'; +import concat from '../util/concat.js'; +import isArray from '../util/is-array.js'; +import isString from '../util/is-string.js'; +import toArray from '../util/to-array.js'; +import toString from '../util/to-string.js'; +import unroll from '../util/unroll.js'; const OPT_L = { aggregate: false, window: false }; const OPT_R = { ...OPT_L, index: 1 }; +const NONE = -Infinity; -export default function(tableL, tableR, on, values, options = {}) { +export function cross(table, other, values, options) { + return join( + table, + other, + () => true, + values, + { ...options, left: true, right: true } + ); +} + +export function join(tableL, tableR, on, values, options = {}) { on = inferKeys(tableL, tableR, on); const optParse = { join: [tableL, tableR] }; let predicate; @@ -104,3 +118,108 @@ function rekey(names, rename, suffix) { ? (names[i] = name + suffix) : 0); } + +function emitter(columns, getters) { + const args = ['i', 'a', 'j', 'b']; + return unroll( + args, + '{' + concat(columns, (_, i) => `_${i}.push($${i}(${args}));`) + '}', + columns, getters + ); +} + +export function _join(tableL, tableR, predicate, { names, exprs }, options = {}) { + // initialize data for left table + const dataL = tableL.data(); + const idxL = tableL.indices(false); + const nL = idxL.length; + const hitL = new Int32Array(nL); + + // initialize data for right table + const dataR = tableR.data(); + const idxR = tableR.indices(false); + const nR = idxR.length; + const hitR = new Int32Array(nR); + + // initialize output data + const ncols = names.length; + const cols = columnSet(); + const columns = Array(ncols); + const getters = Array(ncols); + for (let i = 0; i < names.length; ++i) { + columns[i] = cols.add(names[i], []); + getters[i] = exprs[i]; + } + const emit = emitter(columns, getters); + + // perform join + const join = isArray(predicate) ? 
hashJoin : loopJoin; + join(emit, predicate, dataL, dataR, idxL, idxR, hitL, hitR, nL, nR); + + if (options.left) { + for (let i = 0; i < nL; ++i) { + if (!hitL[i]) { + emit(idxL[i], dataL, NONE, dataR); + } + } + } + + if (options.right) { + for (let j = 0; j < nR; ++j) { + if (!hitR[j]) { + emit(NONE, dataL, idxR[j], dataR); + } + } + } + + return cols.new(tableL); +} + +function loopJoin(emit, predicate, dataL, dataR, idxL, idxR, hitL, hitR, nL, nR) { + // perform nested-loops join + for (let i = 0; i < nL; ++i) { + const rowL = idxL[i]; + for (let j = 0; j < nR; ++j) { + const rowR = idxR[j]; + if (predicate(rowL, dataL, rowR, dataR)) { + emit(rowL, dataL, rowR, dataR); + hitL[i] = 1; + hitR[j] = 1; + } + } + } +} + +function hashJoin(emit, [keyL, keyR], dataL, dataR, idxL, idxR, hitL, hitR, nL, nR) { + // determine which table to hash + let dataScan, keyScan, hitScan, idxScan; + let dataHash, keyHash, hitHash, idxHash; + let emitScan = emit; + if (nL >= nR) { + dataScan = dataL; keyScan = keyL; hitScan = hitL; idxScan = idxL; + dataHash = dataR; keyHash = keyR; hitHash = hitR; idxHash = idxR; + } else { + dataScan = dataR; keyScan = keyR; hitScan = hitR; idxScan = idxR; + dataHash = dataL; keyHash = keyL; hitHash = hitL; idxHash = idxL; + emitScan = (i, a, j, b) => emit(j, b, i, a); + } + + // build lookup table + const lut = indexLookup(idxHash, dataHash, keyHash); + + // scan other table + const m = idxScan.length; + for (let j = 0; j < m; ++j) { + const rowScan = idxScan[j]; + const list = lut.get(keyScan(rowScan, dataScan)); + if (list) { + const n = list.length; + for (let k = 0; k < n; ++k) { + const i = list[k]; + emitScan(rowScan, dataScan, idxHash[i], dataHash); + hitHash[i] = 1; + } + hitScan[j] = 1; + } + } +} diff --git a/src/engine/join/lookup.js b/src/verbs/join/lookup.js similarity index 99% rename from src/engine/join/lookup.js rename to src/verbs/join/lookup.js index 745a4be0..52695e4a 100644 --- a/src/engine/join/lookup.js +++ 
b/src/verbs/join/lookup.js @@ -22,4 +22,4 @@ export function indexLookup(idx, data, hash) { } } return lut; -} \ No newline at end of file +} diff --git a/src/verbs/lookup.js b/src/verbs/lookup.js index ecddf285..3ddf656a 100644 --- a/src/verbs/lookup.js +++ b/src/verbs/lookup.js @@ -1,14 +1,46 @@ -import _lookup from '../engine/lookup'; -import { inferKeys } from './util/join-keys'; -import parseKey from './util/parse-key'; -import parseValues from './util/parse'; +import { rowLookup } from './join/lookup.js'; +import { aggregateGet } from './reduce/util.js'; +import { inferKeys } from './util/join-keys.js'; +import parseKey from './util/parse-key.js'; +import parseValues from './util/parse.js'; +import { columnSet } from '../table/ColumnSet.js'; +import NULL from '../util/null.js'; +import concat from '../util/concat.js'; +import unroll from '../util/unroll.js'; -export default function(tableL, tableR, on, values) { +export function lookup(tableL, tableR, on, ...values) { on = inferKeys(tableL, tableR, on); return _lookup( tableL, tableR, [ parseKey('lookup', tableL, on[0]), parseKey('lookup', tableR, on[1]) ], - parseValues('lookup', tableR, values) + parseValues('lookup', tableR, values.flat()) ); -} \ No newline at end of file +} + +export function _lookup(tableL, tableR, [keyL, keyR], { names, exprs, ops = [] }) { + // instantiate output data + const cols = columnSet(tableL); + const total = tableL.totalRows(); + names.forEach(name => cols.add(name, Array(total).fill(NULL))); + + // build lookup table + const lut = rowLookup(tableR, keyR); + + // generate setter function for lookup match + const set = unroll( + ['lr', 'rr', 'data'], + '{' + concat(names, (_, i) => `_[${i}][lr] = $[${i}](rr, data);`) + '}', + names.map(name => cols.data[name]), + aggregateGet(tableR, ops, exprs) + ); + + // find matching rows, set values on match + const dataR = tableR.data(); + tableL.scan((lrow, data) => { + const rrow = lut.get(keyL(lrow, data)); + if (rrow >= 0) set(lrow, 
rrow, dataR); + }); + + return cols.derive(tableL); +} diff --git a/src/verbs/orderby.js b/src/verbs/orderby.js index 6a07035a..f119a58d 100644 --- a/src/verbs/orderby.js +++ b/src/verbs/orderby.js @@ -1,14 +1,13 @@ -import _orderby from '../engine/orderby'; -import parse from '../expression/compare'; -import field from '../helpers/field'; -import error from '../util/error'; -import isFunction from '../util/is-function'; -import isObject from '../util/is-object'; -import isNumber from '../util/is-number'; -import isString from '../util/is-string'; +import parse from '../expression/compare.js'; +import field from '../helpers/field.js'; +import error from '../util/error.js'; +import isFunction from '../util/is-function.js'; +import isObject from '../util/is-object.js'; +import isNumber from '../util/is-number.js'; +import isString from '../util/is-string.js'; -export default function(table, values) { - return _orderby(table, parseValues(table, values)); +export function orderby(table, ...values) { + return _orderby(table, parseValues(table, values.flat())); } function parseValues(table, params) { @@ -32,4 +31,8 @@ function parseValues(table, params) { }); return parse(table, exprs); -} \ No newline at end of file +} + +export function _orderby(table, comparator) { + return table.create({ order: comparator }); +} diff --git a/src/verbs/pivot.js b/src/verbs/pivot.js index 47daf383..182ab6bd 100644 --- a/src/verbs/pivot.js +++ b/src/verbs/pivot.js @@ -1,13 +1,15 @@ -import _pivot from '../engine/pivot'; -import { any } from '../op/op-api'; -import parse from './util/parse'; +import { aggregate, aggregateGet, groupOutput } from './reduce/util.js'; +import parse from './util/parse.js'; +import { ungroup } from './ungroup.js'; +import { any } from '../op/op-api.js'; +import { columnSet } from '../table/ColumnSet.js'; // TODO: enforce aggregates only (no output changes) for values -export default function(table, on, values, options) { +export function pivot(table, on, 
values, options) { return _pivot( table, parse('fold', table, on), - parse('fold', table, values, { preparse, aggronly: true }), + parse('fold', table, values, { preparse, window: false, aggronly: true }), options ); } @@ -17,4 +19,111 @@ function preparse(map) { map.forEach((value, key) => value.field ? map.set(key, any(value + '')) : 0 ); -} \ No newline at end of file +} + +const opt = (value, defaultValue) => value != null ? value : defaultValue; + +export function _pivot(table, on, values, options = {}) { + const { keys, keyColumn } = pivotKeys(table, on, options); + const vsep = opt(options.valueSeparator, '_'); + const namefn = values.names.length > 1 + ? (i, name) => name + vsep + keys[i] + : i => keys[i]; + + // perform separate aggregate operations for each key + // if keys do not match, emit NaN so aggregate skips it + // use custom toString method for proper field resolution + const results = keys.map( + k => aggregate(table, values.ops.map(op => { + if (op.name === 'count') { // fix #273 + const fn = r => k === keyColumn[r] ? 1 : NaN; + fn.toString = () => k + ':1'; + return { ...op, name: 'sum', fields: [fn] }; + } + const fields = op.fields.map(f => { + const fn = (r, d) => k === keyColumn[r] ? f(r, d) : NaN; + fn.toString = () => k + ':' + f; + return fn; + }); + return { ...op, fields }; + })) + ); + + return output(values, namefn, table.groups(), results).new(table); +} + +function pivotKeys(table, on, options) { + const limit = options.limit > 0 ? +options.limit : Infinity; + const sort = opt(options.sort, true); + const ksep = opt(options.keySeparator, '_'); + + // construct key accessor function + const get = aggregateGet(table, on.ops, on.exprs); + const key = get.length === 1 + ? 
get[0] + : (row, data) => get.map(fn => fn(row, data)).join(ksep); + + // generate vector of per-row key values + const kcol = Array(table.totalRows()); + table.scan((row, data) => kcol[row] = key(row, data)); + + // collect unique key values + const uniq = aggregate( + ungroup(table), + [ { + id: 0, + name: 'array_agg_distinct', + fields: [(row => kcol[row])], params: [] + } ] + )[0][0]; + + // get ordered set of unique key values + const keys = sort ? uniq.sort() : uniq; + + // return key values + return { + keys: Number.isFinite(limit) ? keys.slice(0, limit) : keys, + keyColumn: kcol + }; +} + +function output({ names, exprs }, namefn, groups, results) { + const size = groups ? groups.size : 1; + const cols = columnSet(); + const m = results.length; + const n = names.length; + + let result; + const op = (id, row) => result[id][row]; + + // write groupby fields to output + if (groups) groupOutput(cols, groups); + + // write pivot values to output + for (let i = 0; i < n; ++i) { + const get = exprs[i]; + if (get.field != null) { + // if expression is op only, use aggregates directly + for (let j = 0; j < m; ++j) { + cols.add(namefn(j, names[i]), results[j][get.field]); + } + } else if (size > 1) { + // if multiple groups, evaluate expression for each + for (let j = 0; j < m; ++j) { + result = results[j]; + const col = cols.add(namefn(j, names[i]), Array(size)); + for (let k = 0; k < size; ++k) { + col[k] = get(k, null, op); + } + } + } else { + // if only one group, no need to loop + for (let j = 0; j < m; ++j) { + result = results[j]; + cols.add(namefn(j, names[i]), [ get(0, null, op) ]); + } + } + } + + return cols; +} diff --git a/src/engine/reduce.js b/src/verbs/reduce.js similarity index 84% rename from src/engine/reduce.js rename to src/verbs/reduce.js index 6e738f47..72135b90 100644 --- a/src/engine/reduce.js +++ b/src/verbs/reduce.js @@ -1,7 +1,7 @@ -import { reduceFlat, reduceGroups } from './reduce/util'; -import columnSet from '../table/column-set'; 
+import { reduceFlat, reduceGroups } from './reduce/util.js'; +import { columnSet } from '../table/ColumnSet.js'; -export default function(table, reducer) { +export function reduce(table, reducer) { const cols = columnSet(); const groups = table.groups(); @@ -37,5 +37,5 @@ export default function(table, reducer) { }); } - return table.create(cols.new()); -} \ No newline at end of file + return cols.new(table); +} diff --git a/src/engine/reduce/count-pattern.js b/src/verbs/reduce/count-pattern.js similarity index 91% rename from src/engine/reduce/count-pattern.js rename to src/verbs/reduce/count-pattern.js index 243f1676..1ddc359e 100644 --- a/src/engine/reduce/count-pattern.js +++ b/src/verbs/reduce/count-pattern.js @@ -1,12 +1,12 @@ -import Reducer from './reducer'; -import toArray from '../../util/to-array'; +import Reducer from './reducer.js'; +import toArray from '../../util/to-array.js'; export default function(fields, as, pattern) { return new CountPattern(fields, as, pattern); } function columnGetter(column) { - return (row, data) => data[column].get(row); + return (row, data) => data[column].at(row); } export class CountPattern extends Reducer { @@ -62,4 +62,4 @@ export class CountPattern extends Reducer { } return offset - index; } -} \ No newline at end of file +} diff --git a/src/engine/reduce/field-reducer.js b/src/verbs/reduce/field-reducer.js similarity index 91% rename from src/engine/reduce/field-reducer.js rename to src/verbs/reduce/field-reducer.js index 56841dad..a1af271c 100644 --- a/src/engine/reduce/field-reducer.js +++ b/src/verbs/reduce/field-reducer.js @@ -1,10 +1,10 @@ -import Reducer from './reducer'; -import { getAggregate } from '../../op'; -import concat from '../../util/concat'; -import error from '../../util/error'; -import isValid from '../../util/is-valid'; -import unroll from '../../util/unroll'; -import ValueList from '../../util/value-list'; +import Reducer from './reducer.js'; +import { getAggregate } from '../../op/index.js'; 
+import concat from '../../util/concat.js'; +import error from '../../util/error.js'; +import isValid from '../../util/is-valid.js'; +import unroll from '../../util/unroll.js'; +import ValueList from '../../util/value-list.js'; const update = (ops, args, fn) => unroll( args, @@ -20,6 +20,7 @@ export default function(oplist, stream) { : n === 1 ? Field1Reducer : n === 2 ? Field2Reducer : error('Unsupported field count: ' + n); + // @ts-ignore return new cls(fields, ops, output, stream); } @@ -166,4 +167,4 @@ class Field2Reducer extends FieldReducer { this._rem(state, value1, value2); } } -} \ No newline at end of file +} diff --git a/src/engine/reduce/reducer.js b/src/verbs/reduce/reducer.js similarity index 55% rename from src/engine/reduce/reducer.js rename to src/verbs/reduce/reducer.js index 77415041..fdb359ed 100644 --- a/src/engine/reduce/reducer.js +++ b/src/verbs/reduce/reducer.js @@ -14,18 +14,22 @@ export default class Reducer { return this._outputs; } - init(/* columns */) { + // eslint-disable-next-line no-unused-vars + init(columns) { return {}; } - add(/* state, row, data */) { + // eslint-disable-next-line no-unused-vars + add(state, row, data) { // no-op, subclasses should override } - rem(/* state, row, data */) { + // eslint-disable-next-line no-unused-vars + rem(state, row, data) { // no-op, subclasses should override } - write(/* state, values, index */) { + // eslint-disable-next-line no-unused-vars + write(state, values, index) { } -} \ No newline at end of file +} diff --git a/src/engine/reduce/util.js b/src/verbs/reduce/util.js similarity index 97% rename from src/engine/reduce/util.js rename to src/verbs/reduce/util.js index 76274220..b6b687ca 100644 --- a/src/engine/reduce/util.js +++ b/src/verbs/reduce/util.js @@ -1,5 +1,5 @@ -import fieldReducer from './field-reducer'; -import repeat from '../../util/repeat'; +import fieldReducer from './field-reducer.js'; +import repeat from '../../util/repeat.js'; export function aggregateGet(table, 
ops, get) { if (ops.length) { @@ -134,4 +134,4 @@ export function groupOutput(cols, groups) { col[i] = val(rows[i]); } } -} \ No newline at end of file +} diff --git a/src/verbs/relocate.js b/src/verbs/relocate.js index 731e28f0..73b0ea80 100644 --- a/src/verbs/relocate.js +++ b/src/verbs/relocate.js @@ -1,8 +1,11 @@ -import _select from '../engine/select'; -import resolve from '../helpers/selection'; -import error from '../util/error'; +import { _select } from './select.js'; +import resolve from '../helpers/selection.js'; +import error from '../util/error.js'; -export default function(table, columns, { before, after } = {}) { +export function relocate(table, columns, { + before = undefined, + after = undefined +} = {}) { const bef = before != null; const aft = after != null; @@ -36,4 +39,4 @@ export default function(table, columns, { before, after } = {}) { }); return _select(table, select); -} \ No newline at end of file +} diff --git a/src/verbs/rename.js b/src/verbs/rename.js index 1cbc2cad..c22f1864 100644 --- a/src/verbs/rename.js +++ b/src/verbs/rename.js @@ -1,8 +1,8 @@ -import _select from '../engine/select'; -import resolve from '../helpers/selection'; +import { _select } from './select.js'; +import resolve from '../helpers/selection.js'; -export default function(table, columns) { +export function rename(table, ...columns) { const map = new Map(); table.columnNames(x => (map.set(x, x), 0)); - return _select(table, resolve(table, columns, map)); -} \ No newline at end of file + return _select(table, resolve(table, columns.flat(), map)); +} diff --git a/src/verbs/rollup.js b/src/verbs/rollup.js index 27a82029..997e617c 100644 --- a/src/verbs/rollup.js +++ b/src/verbs/rollup.js @@ -1,6 +1,46 @@ -import _rollup from '../engine/rollup'; -import parse from '../expression/parse'; +import { aggregate, groupOutput } from './reduce/util.js'; +import parse from '../expression/parse.js'; +import { columnSet } from '../table/ColumnSet.js'; -export default 
function(table, values) { +export function rollup(table, values) { return _rollup(table, parse(values, { table, aggronly: true, window: false })); -} \ No newline at end of file +} + +export function _rollup(table, { names, exprs, ops = [] }) { + // output data + const cols = columnSet(); + const groups = table.groups(); + + // write groupby fields to output + if (groups) groupOutput(cols, groups); + + // compute and write aggregate output + output(names, exprs, groups, aggregate(table, ops), cols); + + // return output table + return cols.new(table); +} + +function output(names, exprs, groups, result = [], cols) { + if (!exprs.length) return; + const size = groups ? groups.size : 1; + const op = (id, row) => result[id][row]; + const n = names.length; + + for (let i = 0; i < n; ++i) { + const get = exprs[i]; + if (get.field != null) { + // if expression is op only, use aggregates directly + cols.add(names[i], result[get.field]); + } else if (size > 1) { + // if multiple groups, evaluate expression for each + const col = cols.add(names[i], Array(size)); + for (let j = 0; j < size; ++j) { + col[j] = get(j, null, op); + } + } else { + // if only one group, no need to loop + cols.add(names[i], [ get(0, null, op) ]); + } + } +} diff --git a/src/verbs/sample.js b/src/verbs/sample.js index 090d42f4..f2cc8f77 100644 --- a/src/verbs/sample.js +++ b/src/verbs/sample.js @@ -1,11 +1,12 @@ -import _derive from '../engine/derive'; -import _rollup from '../engine/rollup'; -import _sample from '../engine/sample'; -import parse from '../expression/parse'; -import isNumber from '../util/is-number'; -import isString from '../util/is-string'; - -export default function(table, size, options = {}) { +import { _derive } from './derive.js'; +import { _rollup } from './rollup.js'; +import parse from '../expression/parse.js'; +import isNumber from '../util/is-number.js'; +import isString from '../util/is-string.js'; +import sampleIndices from '../util/sample.js'; +import shuffleIndices from 
'../util/shuffle.js'; + +export function sample(table, size, options = {}) { return _sample( table, parseSize(table, size), @@ -14,7 +15,7 @@ export default function(table, size, options = {}) { ); } -const get = col => row => col.get(row) || 0; +const get = col => row => col.at(row) || 0; function parseSize(table, size) { return isNumber(size) @@ -30,4 +31,40 @@ function parseWeight(table, w) { ? table.column(w) : _derive(table, parse({ w }, { table }), { drop: true }).column('w') ); -} \ No newline at end of file +} + +export function _sample(table, size, weight, options = {}) { + const { replace, shuffle } = options; + const parts = table.partitions(false); + + let total = 0; + size = parts.map((idx, group) => { + let s = size(group); + total += (s = (replace ? s : Math.min(idx.length, s))); + return s; + }); + + const samples = new Uint32Array(total); + let curr = 0; + + parts.forEach((idx, group) => { + const sz = size[group]; + const buf = samples.subarray(curr, curr += sz); + + if (!replace && sz === idx.length) { + // sample size === data size, no replacement + // no need to sample, just copy indices + buf.set(idx); + } else { + sampleIndices(buf, replace, idx, weight); + } + }); + + if (shuffle !== false && (parts.length > 1 || !replace)) { + // sampling with replacement methods shuffle, so in + // that case a single partition is already good to go + shuffleIndices(samples); + } + + return table.reify(samples); +} diff --git a/src/verbs/select.js b/src/verbs/select.js index 0169057f..241aaa2a 100644 --- a/src/verbs/select.js +++ b/src/verbs/select.js @@ -1,6 +1,22 @@ -import _select from '../engine/select'; -import resolve from '../helpers/selection'; +import resolve from '../helpers/selection.js'; +import { columnSet } from '../table/ColumnSet.js'; +import error from '../util/error.js'; +import isString from '../util/is-string.js'; -export default function(table, columns) { - return _select(table, resolve(table, columns)); -} \ No newline at end of file 
+export function select(table, ...columns) { + return _select(table, resolve(table, columns.flat())); +} + +export function _select(table, columns) { + const cols = columnSet(); + + columns.forEach((value, curr) => { + const next = isString(value) ? value : curr; + if (next) { + const col = table.column(curr) || error(`Unrecognized column: ${curr}`); + cols.add(next, col); + } + }); + + return cols.derive(table); +} diff --git a/src/verbs/slice.js b/src/verbs/slice.js new file mode 100644 index 00000000..4cd1a5eb --- /dev/null +++ b/src/verbs/slice.js @@ -0,0 +1,16 @@ +import { filter } from './filter.js'; +import _slice from '../helpers/slice.js'; + +export function slice(table, start = 0, end = Infinity) { + if (table.isGrouped()) { + return filter(table, _slice(start, end)).reify(); + } + + // if not grouped, scan table directly + const indices = []; + const nrows = table.numRows(); + start = Math.max(0, start + (start < 0 ? nrows : 0)); + end = Math.min(nrows, Math.max(0, end + (end < 0 ? 
nrows : 0))); + table.scan(row => indices.push(row), true, end - start, start); + return table.reify(indices); +} diff --git a/src/verbs/spread.js b/src/verbs/spread.js index fadcbff5..b7773d1f 100644 --- a/src/verbs/spread.js +++ b/src/verbs/spread.js @@ -1,6 +1,64 @@ -import _spread from '../engine/spread'; -import parse from './util/parse'; +import { aggregateGet } from './reduce/util.js'; +import parse from './util/parse.js'; +import { columnSet } from '../table/ColumnSet.js'; +import NULL from '../util/null.js'; +import toArray from '../util/to-array.js'; -export default function(table, values, options) { +export function spread(table, values, options) { return _spread(table, parse('spread', table, values), options); -} \ No newline at end of file +} + +export function _spread(table, { names, exprs, ops = [] }, options = {}) { + if (names.length === 0) return table; + + // ignore 'as' if there are multiple field names + const as = (names.length === 1 && options.as) || []; + const drop = options.drop == null ? true : !!options.drop; + const limit = options.limit == null + ? 
as.length || Infinity + : Math.max(1, +options.limit || 1); + + const get = aggregateGet(table, ops, exprs); + const cols = columnSet(); + const map = names.reduce((map, name, i) => map.set(name, i), new Map()); + + const add = (index, name) => { + const columns = spreadCols(table, get[index], limit); + const n = columns.length; + for (let i = 0; i < n; ++i) { + cols.add(as[i] || `${name}_${i + 1}`, columns[i]); + } + }; + + table.columnNames().forEach(name => { + if (map.has(name)) { + if (!drop) cols.add(name, table.column(name)); + add(map.get(name), name); + map.delete(name); + } else { + cols.add(name, table.column(name)); + } + }); + + map.forEach(add); + + return cols.derive(table); +} + +function spreadCols(table, get, limit) { + const nrows = table.totalRows(); + const columns = []; + + table.scan((row, data) => { + const values = toArray(get(row, data)); + const n = Math.min(values.length, limit); + while (columns.length < n) { + columns.push(Array(nrows).fill(NULL)); + } + for (let i = 0; i < n; ++i) { + columns[i][row] = values[i]; + } + }); + + return columns; +} diff --git a/src/engine/ungroup.js b/src/verbs/ungroup.js similarity index 68% rename from src/engine/ungroup.js rename to src/verbs/ungroup.js index 0f16ffdb..6ef68d48 100644 --- a/src/engine/ungroup.js +++ b/src/verbs/ungroup.js @@ -1,5 +1,5 @@ -export default function(table) { +export function ungroup(table) { return table.isGrouped() ? 
table.create({ groups: null }) : table; -} \ No newline at end of file +} diff --git a/src/verbs/union.js b/src/verbs/union.js index 60f63b58..2e9afdef 100644 --- a/src/verbs/union.js +++ b/src/verbs/union.js @@ -1,3 +1,6 @@ -export default function(table, others) { - return table.concat(others).dedupe(); -} \ No newline at end of file +import { concat } from './concat.js'; +import { dedupe } from './dedupe.js'; + +export function union(table, ...others) { + return dedupe(concat(table, others.flat())); +} diff --git a/src/engine/unorder.js b/src/verbs/unorder.js similarity index 68% rename from src/engine/unorder.js rename to src/verbs/unorder.js index 0a8ae3bb..b96e4b4c 100644 --- a/src/engine/unorder.js +++ b/src/verbs/unorder.js @@ -1,5 +1,5 @@ -export default function(table) { +export function unorder(table) { return table.isOrdered() ? table.create({ order: null }) : table; -} \ No newline at end of file +} diff --git a/src/verbs/unroll.js b/src/verbs/unroll.js index a5b6a8c3..cc980560 100644 --- a/src/verbs/unroll.js +++ b/src/verbs/unroll.js @@ -1,7 +1,9 @@ -import _unroll from '../engine/unroll'; -import parse from './util/parse'; +import { aggregateGet } from './reduce/util.js'; +import parse from './util/parse.js'; +import { columnSet } from '../table/ColumnSet.js'; +import toArray from '../util/to-array.js'; -export default function(table, values, options) { +export function unroll(table, values, options) { return _unroll( table, parse('unroll', table, values), @@ -9,4 +11,118 @@ export default function(table, values, options) { ? { ...options, drop: parse('unroll', table, options.drop).names } : options ); -} \ No newline at end of file +} + +export function _unroll(table, { names = [], exprs = [], ops = [] }, options = {}) { + if (!names.length) return table; + + const limit = options.limit > 0 ? +options.limit : Infinity; + const index = options.index + ? options.index === true ? 
'index' : options.index + '' + : null; + const drop = new Set(options.drop); + const get = aggregateGet(table, ops, exprs); + + // initialize output columns + const cols = columnSet(); + const nset = new Set(names); + const priors = []; + const copies = []; + const unroll = []; + + // original and copied columns + table.columnNames().forEach(name => { + if (!drop.has(name)) { + const col = cols.add(name, []); + if (!nset.has(name)) { + priors.push(table.column(name)); + copies.push(col); + } + } + }); + + // unrolled output columns + names.forEach(name => { + if (!drop.has(name)) { + if (!cols.has(name)) cols.add(name, []); + unroll.push(cols.data[name]); + } + }); + + // index column, if requested + const icol = index ? cols.add(index, []) : null; + + let start = 0; + const m = priors.length; + const n = unroll.length; + + const copy = (row, maxlen) => { + for (let i = 0; i < m; ++i) { + copies[i].length = start + maxlen; + copies[i].fill(priors[i].at(row), start, start + maxlen); + } + }; + + const indices = icol + ? 
(row, maxlen) => { + for (let i = 0; i < maxlen; ++i) { + icol[row + i] = i; + } + } + : () => {}; + + if (n === 1) { + // optimize common case of one array-valued column + const fn = get[0]; + const col = unroll[0]; + + table.scan((row, data) => { + // extract array data + const array = toArray(fn(row, data)); + const maxlen = Math.min(array.length, limit); + + // copy original table data + copy(row, maxlen); + + // copy unrolled array data + for (let j = 0; j < maxlen; ++j) { + col[start + j] = array[j]; + } + + // fill in array indices + indices(start, maxlen); + + start += maxlen; + }); + } else { + table.scan((row, data) => { + let maxlen = 0; + + // extract parallel array data + const arrays = get.map(fn => { + const value = toArray(fn(row, data)); + maxlen = Math.min(Math.max(maxlen, value.length), limit); + return value; + }); + + // copy original table data + copy(row, maxlen); + + // copy unrolled array data + for (let i = 0; i < n; ++i) { + const col = unroll[i]; + const arr = arrays[i]; + for (let j = 0; j < maxlen; ++j) { + col[start + j] = arr[j]; + } + } + + // fill in array indices + indices(start, maxlen); + + start += maxlen; + }); + } + + return cols.new(table); +} diff --git a/src/verbs/util/join-keys.js b/src/verbs/util/join-keys.js index dd5414bf..1c80bc7c 100644 --- a/src/verbs/util/join-keys.js +++ b/src/verbs/util/join-keys.js @@ -1,8 +1,8 @@ -import parseKey from './parse-key'; -import error from '../../util/error'; -import intersect from '../../util/intersect'; -import isArray from '../../util/is-array'; -import isString from '../../util/is-string'; +import parseKey from './parse-key.js'; +import error from '../../util/error.js'; +import intersect from '../../util/intersect.js'; +import isArray from '../../util/is-array.js'; +import isString from '../../util/is-string.js'; export function inferKeys(tableL, tableR, on) { if (!on) { @@ -27,4 +27,4 @@ export function keyPredicate(tableL, tableR, onL, onR) { parseKey('join', tableL, onL), 
parseKey('join', tableR, onR) ]; -} \ No newline at end of file +} diff --git a/src/verbs/util/parse-key.js b/src/verbs/util/parse-key.js index 62c78a76..f8fcb0ab 100644 --- a/src/verbs/util/parse-key.js +++ b/src/verbs/util/parse-key.js @@ -1,12 +1,12 @@ -import parse from '../../expression/parse'; -import field from '../../helpers/field'; -import error from '../../util/error'; -import isFunction from '../../util/is-function'; -import isNumber from '../../util/is-number'; -import isObject from '../../util/is-object'; -import isString from '../../util/is-string'; -import keyFunction from '../../util/key-function'; -import toArray from '../../util/to-array'; +import parse from '../../expression/parse.js'; +import field from '../../helpers/field.js'; +import error from '../../util/error.js'; +import isFunction from '../../util/is-function.js'; +import isNumber from '../../util/is-number.js'; +import isObject from '../../util/is-object.js'; +import isString from '../../util/is-string.js'; +import keyFunction from '../../util/key-function.js'; +import toArray from '../../util/to-array.js'; export default function(name, table, params) { const exprs = new Map(); @@ -20,4 +20,4 @@ export default function(name, table, params) { const fn = parse(exprs, { table, aggregate: false, window: false }); return keyFunction(fn.exprs, true); -} \ No newline at end of file +} diff --git a/src/verbs/util/parse.js b/src/verbs/util/parse.js index 0d78b0f6..ee53b3cd 100644 --- a/src/verbs/util/parse.js +++ b/src/verbs/util/parse.js @@ -1,13 +1,13 @@ -import parse from '../../expression/parse'; -import field from '../../helpers/field'; -import resolve from '../../helpers/selection'; -import assign from '../../util/assign'; -import error from '../../util/error'; -import isNumber from '../../util/is-number'; -import isObject from '../../util/is-object'; -import isString from '../../util/is-string'; -import isFunction from '../../util/is-function'; -import toArray from '../../util/to-array'; 
+import parse from '../../expression/parse.js'; +import field from '../../helpers/field.js'; +import resolve from '../../helpers/selection.js'; +import assign from '../../util/assign.js'; +import error from '../../util/error.js'; +import isNumber from '../../util/is-number.js'; +import isObject from '../../util/is-object.js'; +import isString from '../../util/is-string.js'; +import isFunction from '../../util/is-function.js'; +import toArray from '../../util/to-array.js'; export default function(name, table, params, options = { window: false }) { const exprs = new Map(); @@ -27,4 +27,4 @@ export default function(name, table, params, options = { window: false }) { } return parse(exprs, { table, ...options }); -} \ No newline at end of file +} diff --git a/src/engine/window/window-state.js b/src/verbs/window/window-state.js similarity index 91% rename from src/engine/window/window-state.js rename to src/verbs/window/window-state.js index 0b54e5ef..9f990669 100644 --- a/src/engine/window/window-state.js +++ b/src/verbs/window/window-state.js @@ -1,7 +1,7 @@ -import ascending from '../../util/ascending'; -import bisector from '../../util/bisector'; -import concat from '../../util/concat'; -import unroll from '../../util/unroll'; +import ascending from '../../util/ascending.js'; +import bisector from '../../util/bisector.js'; +import concat from '../../util/concat.js'; +import unroll from '../../util/unroll.js'; const bisect = bisector(ascending); @@ -89,4 +89,4 @@ export default function(data, frame, adjust, ops, aggrs) { }; return w; -} \ No newline at end of file +} diff --git a/src/engine/window/window.js b/src/verbs/window/window.js similarity index 86% rename from src/engine/window/window.js rename to src/verbs/window/window.js index 184f734b..979770c1 100644 --- a/src/engine/window/window.js +++ b/src/verbs/window/window.js @@ -1,8 +1,8 @@ -import { reducers } from '../reduce/util'; -import { getWindow, hasAggregate } from '../../op'; -import concat from 
'../../util/concat'; -import unroll from '../../util/unroll'; -import windowState from './window-state'; +import { reducers } from '../reduce/util.js'; +import { getWindow, hasAggregate } from '../../op/index.js'; +import concat from '../../util/concat.js'; +import unroll from '../../util/unroll.js'; +import windowState from './window-state.js'; const frameValue = op => (op.frame || [null, null]).map(v => Number.isFinite(v) ? Math.abs(v) : null); @@ -11,10 +11,11 @@ const peersValue = op => !!op.peers; function windowOp(spec) { const { id, name, fields = [], params = [] } = spec; - const op = getWindow(name).create(...params); - if (fields.length) op.get = fields[0]; - op.id = id; - return op; + return { + ...getWindow(name).create(...params), + get: fields.length ? fields[0] : null, + id + }; } export function window(table, cols, exprs, result = {}, ops) { @@ -91,4 +92,4 @@ function windowPeers(table, rows) { // no sort, no peers: reuse row indices as peer ids return rows; } -} \ No newline at end of file +} diff --git a/test/arrow/arrow-column-test.js b/test/arrow/arrow-column-test.js new file mode 100644 index 00000000..d9824377 --- /dev/null +++ b/test/arrow/arrow-column-test.js @@ -0,0 +1,112 @@ +import assert from 'node:assert'; +import arrowColumn from '../../src/arrow/arrow-column.js'; +import { + DateDay, DateMillisecond, Int64, tableFromIPC, vectorFromArray +} from 'apache-arrow'; + +describe('arrowColumn', () => { + it('converts date day data', () => { + const date = (y, m = 0, d = 1) => new Date(Date.UTC(y, m, d)); + const values = [ + date(2000, 0, 1), + date(2004, 10, 12), + date(2007, 3, 14), + date(2009, 6, 26), + date(2000, 0, 1), + date(2004, 10, 12), + date(2007, 3, 14), + date(2009, 6, 26), + date(2000, 0, 1), + date(2004, 10, 12) + ]; + const vec = vectorFromArray(values, new DateDay()); + const proxy = arrowColumn(vec); + + assert.deepStrictEqual( + Array.from(proxy), + values, + 'date day converted' + ); + assert.deepStrictEqual( + 
Array.from(arrowColumn(vec, { convertDate: false })), + values.map(v => +v), + 'date day unconverted' + ); + assert.ok(proxy.at(0) === proxy.at(0), 'date day object equality'); + }); + + it('converts date millisecond data', () => { + const date = (y, m = 0, d = 1) => new Date(Date.UTC(y, m, d)); + const values = [ + date(2000, 0, 1), + date(2004, 10, 12), + date(2007, 3, 14), + date(2009, 6, 26), + date(2000, 0, 1), + date(2004, 10, 12), + date(2007, 3, 14), + date(2009, 6, 26), + date(2000, 0, 1), + date(2004, 10, 12) + ]; + const vec = vectorFromArray(values, new DateMillisecond()); + const proxy = arrowColumn(vec); + + assert.deepStrictEqual( + Array.from(proxy), + values, + 'date millisecond converted' + ); + assert.deepStrictEqual( + Array.from(arrowColumn(vec, { convertDate: false })), + values.map(v => +v), + 'date millisecond unconverted' + ); + assert.ok(proxy.at(0) === proxy.at(0), 'date millisecond object equality'); + }); + + it('converts bigint data', () => { + const values = [0n, 1n, 2n, 3n, 10n, 1000n]; + const vec = vectorFromArray(values, new Int64()); + + assert.deepStrictEqual( + Array.from(arrowColumn(vec, { convertBigInt: true })), + values.map(v => Number(v)), + 'bigint converted' + ); + assert.deepStrictEqual( + Array.from(arrowColumn(vec)), + values, + 'bigint unconverted' + ); + }); + + it('converts decimal data', () => { + // encoded externally to sidestep arrow JS lib bugs: + // import pyarrow as pa + // v = pa.array([1, 12, 34], type=pa.decimal128(18, 3)) + // batch = pa.record_batch([v], names=['d']) + // sink = pa.BufferOutputStream() + // with pa.ipc.new_stream(sink, batch.schema) as writer: + // writer.write_batch(batch) + // sink.getvalue().hex() + const hex = 
'FFFFFFFF780000001000000000000A000C000600050008000A000000000104000C000000080008000000040008000000040000000100000014000000100014000800060007000C00000010001000000000000107100000001C0000000400000000000000010000006400000008000C0004000800080000001200000003000000FFFFFFFF8800000014000000000000000C0016000600050008000C000C0000000003040018000000300000000000000000000A0018000C00040008000A0000003C00000010000000030000000000000000000000020000000000000000000000000000000000000000000000000000003000000000000000000000000100000003000000000000000000000000000000E8030000000000000000000000000000E02E0000000000000000000000000000D0840000000000000000000000000000FFFFFFFF00000000'; + const bytes = Uint8Array.from(hex.match(/.{1,2}/g).map(s => parseInt(s, 16))); + const vec = tableFromIPC(bytes).getChild('d'); + + assert.deepStrictEqual( + Array.from(arrowColumn(vec, { convertDecimal: true })), + [1, 12, 34], + 'decimal converted' + ); + assert.deepEqual( + Array.from(arrowColumn(vec, { convertDecimal: false })), + [ + Uint32Array.from([1000, 0, 0, 0]), + Uint32Array.from([12000, 0, 0, 0]), + Uint32Array.from([34000, 0, 0, 0 ]) + ], + 'decimal unconverted' + ); + }); +}); diff --git a/test/arrow/data-from-test.js b/test/arrow/data-from-test.js index f4325f05..a320cd90 100644 --- a/test/arrow/data-from-test.js +++ b/test/arrow/data-from-test.js @@ -1,12 +1,12 @@ -import tape from 'tape'; +import assert from 'node:assert'; import { Bool, DateDay, DateMillisecond, Dictionary, Field, FixedSizeList, Float32, Float64, Int16, Int32, Int64, Int8, List, Struct, Table, Uint16, Uint32, Uint64, Uint8, Utf8, tableToIPC, vectorFromArray } from 'apache-arrow'; -import { dataFromScan } from '../../src/arrow/encode/data-from'; -import { scanTable } from '../../src/arrow/encode/scan'; -import { table } from '../../src/table'; +import { dataFromScan } from '../../src/arrow/encode/data-from.js'; +import { scanTable } from '../../src/arrow/encode/scan.js'; +import { table } from '../../src/index.js'; function 
dataFromTable(table, column, type, nullable) { const nrows = table.numRows(); @@ -14,25 +14,25 @@ function dataFromTable(table, column, type, nullable) { return dataFromScan(nrows, scan, column, type, nullable); } -function integerTest(t, type) { +function integerTest(type) { const values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); } -function floatTest(t, type) { +function floatTest(type) { const values = [0, NaN, 1/3, Math.PI, 7, Infinity, -Infinity]; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); } -function bigintTest(t, type) { +function bigintTest(type) { const values = [0n, 1n, 10n, 100n, 1000n, 10n ** 10n]; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); } -function dateTest(t, type) { +function dateTest(type) { const date = (y, m = 0, d = 1) => new Date(Date.UTC(y, m, d)); const values = [ date(2000, 0, 1), @@ -46,122 +46,108 @@ function dateTest(t, type) { date(2000, 0, 1), date(2004, 10, 12) ]; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); } -function valueTest(t, type, values, msg) { +function valueTest(type, values, msg) { const dt = table({ values }); const u = dataFromTable(dt, dt.column('values'), type); const v = vectorFromArray(values, type); const tu = new Table({ values: u }); const tv = new Table({ values: v }); - t.equal( + + 
assert.equal( tableToIPC(tu).join(' '), tableToIPC(tv).join(' '), 'serialized data matches' + msg ); } -tape('dataFrom encodes dictionary data', t => { - const type = new Dictionary(new Utf8(), new Uint32(), 0); - const values = ['a', 'b', 'FOO', 'b', 'a']; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); - t.end(); -}); - -tape('dataFrom encodes boolean data', t => { - const type = new Bool(); - const values = [true, false, false, true, false]; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); - t.end(); -}); - -tape('dataFrom encodes date millis data', t => { - dateTest(t, new DateMillisecond()); - t.end(); -}); - -tape('dataFrom encodes date day data', t => { - dateTest(t, new DateDay()); - t.end(); -}); - -tape('dataFrom encodes int8 data', t => { - integerTest(t, new Int8()); - t.end(); -}); - -tape('dataFrom encodes int16 data', t => { - integerTest(t, new Int16()); - t.end(); -}); - -tape('dataFrom encodes int32 data', t => { - integerTest(t, new Int32()); - t.end(); -}); - -tape('dataFrom encodes int64 data', t => { - bigintTest(t, new Int64()); - t.end(); -}); - -tape('dataFrom encodes uint8 data', t => { - integerTest(t, new Uint8()); - t.end(); -}); - -tape('dataFrom encodes uint16 data', t => { - integerTest(t, new Uint16()); - t.end(); -}); - -tape('dataFrom encodes uint32 data', t => { - integerTest(t, new Uint32()); - t.end(); -}); - -tape('dataFrom encodes uint64 data', t => { - bigintTest(t, new Uint64()); - t.end(); -}); - -tape('dataFrom encodes float32 data', t => { - floatTest(t, new Float32()); - t.end(); -}); - -tape('dataFrom encodes float64 data', t => { - floatTest(t, new Float64()); - t.end(); -}); - -tape('dataFrom encodes list data', t => { - const field = Field.new({ name: 'value', type: new Int32() }); - const type = new List(field); - const values = [[1, 2], [3], [4, 5, 6], [7]]; - valueTest(t, type, values, ', 
without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); - t.end(); -}); - -tape('dataFrom encodes fixed size list data', t => { - const field = Field.new({ name: 'value', type: new Int32() }); - const type = new FixedSizeList(1, field); - const values = [[1], [2], [3], [4], [5], [6]]; - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); - t.end(); +describe('dataFrom', () => { + it('encodes dictionary data', () => { + const type = new Dictionary(new Utf8(), new Uint32(), 0); + const values = ['a', 'b', 'FOO', 'b', 'a']; + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); + }); + + it('encodes boolean data', () => { + const type = new Bool(); + const values = [true, false, false, true, false]; + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); + }); + + it('encodes date millis data', () => { + dateTest(new DateMillisecond()); + }); + + it('encodes date day data', () => { + dateTest(new DateDay()); + }); + + it('encodes int8 data', () => { + integerTest(new Int8()); + }); + + it('encodes int16 data', () => { + integerTest(new Int16()); + }); + + it('encodes int32 data', () => { + integerTest(new Int32()); + }); + + it('encodes int64 data', () => { + bigintTest(new Int64()); + }); + + it('encodes uint8 data', () => { + integerTest(new Uint8()); + }); + + it('encodes uint16 data', () => { + integerTest(new Uint16()); + }); + + it('encodes uint32 data', () => { + integerTest(new Uint32()); + }); + + it('encodes uint64 data', () => { + bigintTest(new Uint64()); + }); + + it('encodes float32 data', () => { + floatTest(new Float32()); + }); + + it('encodes float64 data', () => { + floatTest(new Float64()); + }); + + it('encodes list data', () => { + const field = Field.new({ name: 'value', type: new Int32() }); + const type = new List(field); + const values = [[1, 2], [3], [4, 5, 6], 
[7]]; + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); + }); + + it('encodes fixed size list data', () => { + const field = Field.new({ name: 'value', type: new Int32() }); + const type = new FixedSizeList(1, field); + const values = [[1], [2], [3], [4], [5], [6]]; + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); + }); + + it('encodes struct data', () => { + const key = Field.new({ name: 'key', type: new Int32() }); + const type = new Struct([key]); + const values = [1, 2, 3, null, 5, 6].map(key => ({ key })); + valueTest(type, values, ', without nulls'); + valueTest(type, [null, ...values, null], ', with nulls'); + }); }); - -tape('dataFrom encodes struct data', t => { - const key = Field.new({ name: 'key', type: new Int32() }); - const type = new Struct([key]); - const values = [1, 2, 3, null, 5, 6].map(key => ({ key })); - valueTest(t, type, values, ', without nulls'); - valueTest(t, type, [null, ...values, null], ', with nulls'); - t.end(); -}); \ No newline at end of file diff --git a/test/arrow/from-arrow-test.js b/test/arrow/from-arrow-test.js new file mode 100644 index 00000000..63fc054f --- /dev/null +++ b/test/arrow/from-arrow-test.js @@ -0,0 +1,104 @@ +import assert from 'node:assert'; +import { Utf8 } from 'apache-arrow'; +import tableEqual from '../table-equal.js'; +import fromArrow from '../../src/arrow/from-arrow.js'; +import toArrow from '../../src/arrow/to-arrow.js'; +import { not } from '../../src/helpers/selection.js'; +import { table } from '../../src/index-browser.js'; + +function arrowTable(data, types) { + return toArrow(table(data), { types }); +} + +describe('fromArrow', () => { + it('imports Apache Arrow tables', () => { + const data = { + u: [1, 2, 3, 4, 5], + v: ['a', 'b', null, 'd', 'e'] + }; + const at = arrowTable(data); + + tableEqual(fromArrow(at), data, 'arrow data'); + }); + + it('can unpack Apache Arrow tables', () => 
{ + const data = { + u: [1, 2, 3, 4, 5], + v: ['a', 'b', null, 'd', 'e'], + x: ['cc', 'dd', 'cc', 'dd', 'cc'], + y: ['aa', 'aa', null, 'bb', 'bb'] + }; + const at = arrowTable(data, { v: new Utf8() }); + const dt = fromArrow(at); + + tableEqual(dt, data, 'arrow data'); + assert.ok(dt.column('x').keyFor, 'create dictionary column without nulls'); + assert.ok(dt.column('y').keyFor, 'create dictionary column with nulls'); + }); + + it('can select Apache Arrow columns', () => { + const data = { + u: [1, 2, 3, 4, 5], + v: ['a', 'b', null, 'd', 'e'], + x: ['cc', 'dd', 'cc', 'dd', 'cc'], + y: ['aa', 'aa', null, 'bb', 'bb'] + }; + const at = arrowTable(data); + + const s1 = fromArrow(at, { columns: 'x' }); + assert.deepEqual(s1.columnNames(), ['x'], 'select by column name'); + tableEqual(s1, { x: data.x }, 'correct columns selected'); + + const s2 = fromArrow(at, { columns: ['u', 'y'] }); + assert.deepEqual(s2.columnNames(), ['u', 'y'], 'select by column names'); + tableEqual(s2, { u: data.u, y: data.y }, 'correct columns selected'); + + const s3 = fromArrow(at, { columns: not('u', 'y') }); + assert.deepEqual(s3.columnNames(), ['v', 'x'], 'select by helper'); + tableEqual(s3, { v: data.v, x: data.x }, 'correct columns selected'); + + const s4 = fromArrow(at, { columns: { u: 'a', x: 'b'} }); + assert.deepEqual(s4.columnNames(), ['a', 'b'], 'select by helper'); + tableEqual(s4, { a: data.u, b: data.x }, 'correct columns selected'); + }); + + it('can read Apache Arrow lists', () => { + const l = [[1, 2, 3], null, [4, 5]]; + const at = arrowTable({ l }); + + if (at.getChild('l').type.typeId !== 12) { + assert.fail('Arrow column should have List type'); + } + tableEqual(fromArrow(at), { l }, 'extract Arrow list'); + }); + + it('can read Apache Arrow fixed-size lists', () => { + const l = [[1, 2], null, [4, 5]]; + const at = arrowTable({ l }); + + if (at.getChild('l').type.typeId !== 16) { + assert.fail('Arrow column should have FixedSizeList type'); + } + 
tableEqual(fromArrow(at), { l }, 'extract Arrow list'); + }); + + it('can read Apache Arrow structs', () => { + const s = [{ foo: 1, bar: [2, 3] }, null, { foo: 2, bar: [4] }]; + const at = arrowTable({ s }); + + if (at.getChild('s').type.typeId !== 13) { + assert.fail('Arrow column should have Struct type'); + } + tableEqual(fromArrow(at), { s }, 'extract Arrow struct'); + }); + + it('can read nested Apache Arrow structs', () => { + const s = [{ foo: 1, bar: { bop: 2 } }, { foo: 2, bar: { bop: 3 } }]; + const at = arrowTable({ s }); + + if (at.getChild('s').type.typeId !== 13) { + assert.fail('Arrow column should have Struct type'); + } + tableEqual(fromArrow(at), { s }, 'extract nested Arrow struct'); + }); +}); diff --git a/test/arrow/profiler-test.js b/test/arrow/profiler-test.js index 1305cae0..900e0ed8 100644 --- a/test/arrow/profiler-test.js +++ b/test/arrow/profiler-test.js @@ -1,9 +1,9 @@ -import tape from 'tape'; -import { profiler } from '../../src/arrow/encode/profiler'; +import assert from 'node:assert'; import { Float64, Int16, Int32, Int64, Int8, Uint16, Uint32, Uint64, Uint8, util } from 'apache-arrow'; +import { profiler } from '../../src/arrow/encode/profiler.js'; function profile(array) { const p = profiler(); @@ -15,71 +15,69 @@ function typeCompare(a, b) { return util.compareTypes(a, b); } -tape('profiler infers integer types', t => { - const types = { - uint8: new Uint8(), - uint16: new Uint16(), - uint32: new Uint32(), - int8: new Int8(), - int16: new Int16(), - int32: new Int32() - }; +describe('profiler', () => { + it('infers integer types', () => { + const types = { + uint8: new Uint8(), + uint16: new Uint16(), + uint32: new Uint32(), + int8: new Int8(), + int16: new Int16(), + int32: new Int32() + }; - const dt = { - uint8: [0, 1 << 7, 1 << 8 - 1], - uint16: [0, 1 << 15, 1 << 16 - 1], - uint32: [0, 2 ** 31 - 1, 2 ** 32 - 1], - int8: [-(1 << 7), 0, (1 << 7) - 1], - int16: [-(1 << 15), 0, (1 << 15) - 1], - int32: [(1 << 31), 0, 2 ** 31 - 1] 
- }; + const dt = { + uint8: [0, 1 << 7, 1 << 8 - 1], + uint16: [0, 1 << 15, 1 << 16 - 1], + uint32: [0, 2 ** 31 - 1, 2 ** 32 - 1], + int8: [-(1 << 7), 0, (1 << 7) - 1], + int16: [-(1 << 15), 0, (1 << 15) - 1], + int32: [(1 << 31), 0, 2 ** 31 - 1] + }; - Object.keys(dt).forEach(name => { - const type = profile(dt[name]).type(); - t.ok(typeCompare(types[name], type), `${name} type`); - }); + Object.keys(dt).forEach(name => { + const type = profile(dt[name]).type(); + assert.ok(typeCompare(types[name], type), `${name} type`); + }); - const float = new Float64(); - t.ok( - typeCompare(float, profile([0, 1, 2 ** 32]).type()), - 'overflow to float64 type' - ); - t.ok( - typeCompare(float, profile([(1 << 31), 0, 2 ** 32 - 1]).type()), - 'overflow to float64 type' - ); - t.ok( - typeCompare(float, profile([(1 << 31) - 1, 0, 1]).type()), - 'underflow to float64 type' - ); + const float = new Float64(); + assert.ok( + typeCompare(float, profile([0, 1, 2 ** 32]).type()), + 'overflow to float64 type' + ); + assert.ok( + typeCompare(float, profile([(1 << 31), 0, 2 ** 32 - 1]).type()), + 'overflow to float64 type' + ); + assert.ok( + typeCompare(float, profile([(1 << 31) - 1, 0, 1]).type()), + 'underflow to float64 type' + ); + }); - t.end(); -}); + it('infers bigint types', () => { + const types = { + int64: new Int64(), + uint64: new Uint64() + }; -tape('profiler infers bigint types', t => { - const types = { - int64: new Int64(), - uint64: new Uint64() - }; + const dt = { + int64: [-(2n ** 63n), 0n, (2n ** 63n) - 1n], + uint64: [0n, 1n, 2n ** 64n - 1n] + }; - const dt = { - int64: [-(2n ** 63n), 0n, (2n ** 63n) - 1n], - uint64: [0n, 1n, 2n ** 64n - 1n] - }; + Object.keys(dt).forEach(name => { + const type = profile(dt[name]).type(); + assert.ok(typeCompare(types[name], type), `${name} type`); + }); - Object.keys(dt).forEach(name => { - const type = profile(dt[name]).type(); - t.ok(typeCompare(types[name], type), `${name} type`); + assert.throws( + () => profile([0n, 1n, 2n 
** 64n]).type(), + 'throws on overflow' + ); + assert.throws( + () => profile([-(2n ** 63n), 0n, 2n ** 63n]).type(), + 'throws on underflow' + ); }); - - t.throws( - () => profile([0n, 1n, 2n ** 64n]).type(), - 'throws on overflow' - ); - t.throws( - () => profile([-(2n ** 63n), 0n, 2n ** 63n]).type(), - 'throws on underflow' - ); - - t.end(); -}); \ No newline at end of file +}); diff --git a/test/arrow/to-arrow-test.js b/test/arrow/to-arrow-test.js new file mode 100644 index 00000000..1858f4d0 --- /dev/null +++ b/test/arrow/to-arrow-test.js @@ -0,0 +1,285 @@ +import assert from 'node:assert'; +import { readFileSync } from 'node:fs'; +import { + Int8, Type, tableFromIPC, tableToIPC, vectorFromArray +} from 'apache-arrow'; +import { + fromArrow, fromCSV, fromJSON, table, toArrow, toArrowIPC, toJSON +} from '../../src/index.js'; + +function date(year, month=0, date=1, hours=0, minutes=0, seconds=0, ms=0) { + return new Date(year, month, date, hours, minutes, seconds, ms); +} + +function utc(year, month=0, date=1, hours=0, minutes=0, seconds=0, ms=0) { + return new Date(Date.UTC(year, month, date, hours, minutes, seconds, ms)); +} + +function Int8Vector(data) { + return vectorFromArray(data, new Int8); +} + +function isArrayType(value) { + return Array.isArray(value) + || (value && value.map === Int8Array.prototype.map); +} + +function compareTables(aqt, art) { + const err = aqt.columnNames() + .map(name => compareColumns(name, aqt, art)) + .filter(a => a.length); + return err.length; +} + +function compareColumns(name, aqt, art) { + const normalize = v => v === undefined ? null : v instanceof Date ? 
+v : v; + const idx = aqt.indices(); + const aqc = aqt.column(name); + const arc = art.getChild(name); + const err = []; + for (let i = 0; i < idx.length; ++i) { + let v1 = normalize(aqc.at(idx[i])); + let v2 = normalize(arc.at(i)); + if (isArrayType(v1)) { + v1 = v1.join(); + v2 = [...v2].join(); + } else if (typeof v1 === 'object') { + v1 = JSON.stringify(v1); + v2 = JSON.stringify(v2); + } + if (v1 !== v2) { + err.push({ name, index: i, v1, v2 }); + } + } + return err; +} + +describe('toArrow', () => { + it('produces Arrow data for an input table', () => { + const dt = table({ + i: [1, 2, 3, undefined, 4, 5], + f: Float32Array.from([1.2, 2.3, 3.0, 3.4, null, 4.5]), + n: [4.5, 4.4, 3.4, 3.0, 2.3, 1.2], + b: [true, true, false, true, null, false], + s: ['foo', null, 'bar', 'baz', 'baz', 'bar'], + d: [date(2000,0,1), date(2000,1,2), null, date(2010,6,9), date(2018,0,1), date(2020,10,3)], + u: [utc(2000,0,1), utc(2000,1,2), null, utc(2010,6,9), utc(2018,0,1), utc(2020,10,3)], + e: [null, null, null, null, null, null], + v: Int8Vector([10, 9, 8, 7, 6, 5]), + a: [[1, null, 3], [4, 5], null, [6, 7], [8, 9], []], + l: [[1], [2], [3], [4], [5], [6]], + o: [1, 2, 3, null, 5, 6].map(v => v ? 
{ key: v } : null) + }); + + const at = toArrow(dt); + + assert.equal( + compareTables(dt, at), 0, + 'arquero and arrow tables match' + ); + + assert.equal( + compareTables(dt, toArrow(dt.objects())), 0, + 'object array and arrow tables match' + ); + + const buffer = tableToIPC(at); + const bt = tableFromIPC(buffer); + + assert.equal( + compareTables(dt, bt), 0, + 'arquero and serialized arrow tables match' + ); + + assert.equal( + compareTables(fromArrow(bt), at), 0, + 'serialized arquero and arrow tables match' + ); + }); + + it('produces Arrow data for an input CSV', async () => { + const dt = fromCSV(readFileSync('test/format/data/beers.csv', 'utf8')); + const st = dt.derive({ name: d => d.name + '' }); + const at = toArrow(dt); + + assert.equal( + compareTables(st, at), 0, + 'arquero and arrow tables match' + ); + + assert.equal( + compareTables(st, toArrow(st.objects())), 0, + 'object array and arrow tables match' + ); + + const buffer = tableToIPC(at); + + assert.equal( + compareTables(st, tableFromIPC(buffer)), 0, + 'arquero and serialized arrow tables match' + ); + + assert.equal( + compareTables(fromArrow(tableFromIPC(buffer)), at), 0, + 'serialized arquero and arrow tables match' + ); + }); + + it('handles ambiguously typed data', async () => { + const at = toArrow(table({ x: [1, 2, 3, 'foo'] })); + assert.deepEqual( + [...at.getChild('x')], + ['1', '2', '3', 'foo'], + 'fallback to string type if a string is observed' + ); + + assert.throws( + () => toArrow(table({ x: [1, 2, 3, true] })), + 'fail on mixed types' + ); + }); + + it('result produces serialized arrow data', () => { + const dt = fromCSV(readFileSync('test/format/data/beers.csv', 'utf8')) + .derive({ name: d => d.name + '' }); + + const json = toJSON(dt); + const jt = fromJSON(json); + + const bytes = tableToIPC(toArrow(dt)); + const bt = fromArrow(tableFromIPC(bytes)); + + assert.deepEqual( + [toJSON(bt), toJSON(jt)], + [json, json], + 'arrow and json round trips match' + ); + }); + + 
it('respects columns option', () => { + const dt = table({ + w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + + const at = toArrow(dt, { columns: ['w', 'y'] }); + + assert.deepEqual( + at.schema.fields.map(f => f.name), + ['w', 'y'], + 'column subset' + ); + }); + + it('respects limit and offset options', () => { + const dt = table({ + w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + + assert.equal( + JSON.stringify([...toArrow(dt, { limit: 2 })]), + '[{"w":"a","x":1,"y":1.6181,"z":true},{"w":"b","x":2,"y":2.7182,"z":true}]', + 'limit' + ); + assert.equal( + JSON.stringify([...toArrow(dt, { offset: 1 })]), + '[{"w":"b","x":2,"y":2.7182,"z":true},{"w":"a","x":3,"y":3.1415,"z":false}]', + 'offset' + ); + assert.equal( + JSON.stringify([...toArrow(dt, { offset: 1, limit: 1 })]), + '[{"w":"b","x":2,"y":2.7182,"z":true}]', + 'limit and offset' + ); + }); + + it('respects limit and types option', () => { + const dt = table({ + w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + + const at = toArrow(dt, { + types: { w: Type.Utf8, x: Type.Int32, y: Type.Float32 } + }); + + const types = ['w', 'x', 'y', 'z'].map(name => at.getChild(name).type); + + assert.deepEqual( + types.map(t => t.typeId), + [Type.Utf8, Type.Int, Type.Float, Type.Bool], + 'type ids match' + ); + assert.equal(types[1].bitWidth, 32, 'int32'); + assert.equal(types[2].precision, 1, 'float32'); + }); +}); + +describe('toArrowIPC', () => { + it('generates the correct output for file option', () => { + const dt = table({ + w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + + const buffer = toArrowIPC(dt, { format: 'file' }); + + assert.deepEqual( + buffer.slice(0, 8), + new Uint8Array([65, 82, 82, 79, 87, 49, 0, 0]) + ); + }); + + it('generates the correct output for stream option', () => { + const dt = table({ + 
w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + + const buffer = toArrowIPC(dt, { format: 'stream' }); + + assert.deepEqual( + buffer.slice(0, 8), + new Uint8Array([255, 255, 255, 255, 88, 1, 0, 0]) + ); + }); + + it('defaults to using stream option', () => { + const dt = table({ + w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + + const buffer = toArrowIPC(dt); + + assert.deepEqual( + buffer.slice(0, 8), + new Uint8Array([255, 255, 255, 255, 88, 1, 0, 0]) + ); + }); + + it('throws an error if the format is not stream or file', () => { + assert.throws(() => { + const dt = table({ + w: ['a', 'b', 'a'], + x: [1, 2, 3], + y: [1.6181, 2.7182, 3.1415], + z: [true, true, false] + }); + toArrowIPC(dt, { format: 'nonsense' }); + }, 'Unrecognized output format'); + }); +}); diff --git a/test/expression/params-test.js b/test/expression/params-test.js index 65353109..acb7b578 100644 --- a/test/expression/params-test.js +++ b/test/expression/params-test.js @@ -1,137 +1,128 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import op from '../../src/op/op-api'; -import { table } from '../../src/table'; - -tape('parse supports table expression with parameter arg', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const ft = table(cols) - .params({ lo: 1, hi: 7 }) - .filter((d, $) => $.lo < d.a && d.a < $.hi) - .reify(); - - tableEqual(t, ft, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); - t.deepEqual(ft.params(), { lo: 1, hi: 7 }); - t.end(); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { op, table} from '../../src/index.js'; + +describe('parse with params', () => { + it('supports table expression with parameter arg', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const ft = table(cols) + .params({ lo: 1, hi: 7 }) + .filter((d, $) => $.lo < d.a && d.a < $.hi) + 
.reify(); + + tableEqual(ft, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); + assert.deepEqual(ft.params(), { lo: 1, hi: 7 }); + }); + + it('supports table expression with renamed parameter arg', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const t1 = table(cols) + .params({ lo: 1, hi: 7 }) + .filter((d, _) => _.lo < d.a && d.a < _.hi) + .reify(); + tableEqual(t1, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); + assert.deepEqual(t1.params(), { lo: 1, hi: 7 }); + + const t2 = table(cols) + .params({ lo: 1, hi: 7 }) + .filter((d, params) => op.equal(params.lo, d.a)) + .reify(); + tableEqual(t2, { a: [1], b: [2] }, 'parameter filtered data'); + assert.deepEqual(t2.params(), { lo: 1, hi: 7 }); + + const t3 = table(cols) + .params({ column: 'a' }) + .filter((d, p) => d[p.column] > 3) + .reify(); + tableEqual(t3, { a: [5, 7], b: [6, 8] }, 'parameter filtered data'); + assert.deepEqual(t3.params(), { column: 'a' }); + }); + + it('supports table expression with object pattern parameter arg', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const ft = table(cols) + .params({ lo: 1, hi: 7 }) + .filter((d, { lo, hi }) => lo < d.a && d.a < hi) + .reify(); + + tableEqual(ft, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); + assert.deepEqual(ft.params(), { lo: 1, hi: 7 }); + }); + + it('throws on table expression with nested object pattern parameter arg', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + assert.throws(() => { + table(cols) + .params({ thresh: {lo: 1, hi: 7} }) + .filter((d, { thresh: { lo, hi } }) => lo < d.a && d.a < hi); + }, 'throws on nested argument destructuring'); + }); + + it('supports table expression without parameter arg', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const lo = 1; + const hi = 7; + + const ft = table(cols) + .params({ lo, hi }) + .filter(d => lo < d.a && d.a < hi) + .reify(); + + tableEqual(ft, { a: [3, 5], b: [4, 6] }, 
'parameter filtered data'); + assert.deepEqual(ft.params(), { lo: 1, hi: 7 }); + }); + + it('supports table expression with object-valued parameter', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const arr = [1, 7]; + const at = table(cols) + .params({ arr }) + .filter(d => arr[0] < d.a && d.a < arr[1]) + .reify(); + tableEqual(at, { a: [3, 5], b: [4, 6] }, 'array parameter filtered data'); + assert.deepEqual(at.params(), { arr }); + + const obj = { lo: 1, hi: 7 }; + const ot = table(cols) + .params({ obj }) + .filter(d => obj.lo < d.a && d.a < obj.hi) + .reify(); + tableEqual(ot, { a: [3, 5], b: [4, 6] }, 'object parameter filtered data'); + assert.deepEqual(ot.params(), { obj }); + }); + + it('throws on invalid parameter', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + assert.throws( + () => { table(cols).filter((d, $) => $.lo < d.a && d.a < $.hi); }, + 'throws on undefined parameter' + ); + }); }); - -tape('parse supports table expression with renamed parameter arg', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const t1 = table(cols) - .params({ lo: 1, hi: 7 }) - .filter((d, _) => _.lo < d.a && d.a < _.hi) - .reify(); - tableEqual(t, t1, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); - t.deepEqual(t1.params(), { lo: 1, hi: 7 }); - - const t2 = table(cols) - .params({ lo: 1, hi: 7 }) - .filter((d, params) => op.equal(params.lo, d.a)) - .reify(); - tableEqual(t, t2, { a: [1], b: [2] }, 'parameter filtered data'); - t.deepEqual(t2.params(), { lo: 1, hi: 7 }); - - const t3 = table(cols) - .params({ column: 'a' }) - .filter((d, p) => d[p.column] > 3) - .reify(); - tableEqual(t, t3, { a: [5, 7], b: [6, 8] }, 'parameter filtered data'); - t.deepEqual(t3.params(), { column: 'a' }); - - t.end(); -}); - -tape('parse supports table expression with object pattern parameter arg', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const ft = table(cols) - .params({ lo: 1, hi: 7 
}) - .filter((d, { lo, hi }) => lo < d.a && d.a < hi) - .reify(); - - tableEqual(t, ft, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); - t.deepEqual(ft.params(), { lo: 1, hi: 7 }); - t.end(); -}); - -tape('parse throws on table expression with nested object pattern parameter arg', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - t.throws(() => { - table(cols) - .params({ thresh: {lo: 1, hi: 7} }) - .filter((d, { thresh: { lo, hi } }) => lo < d.a && d.a < hi); - }, 'throws on nested argument destructuring'); - - t.end(); -}); - -tape('parse supports table expression without parameter arg', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const lo = 1; - const hi = 7; - - const ft = table(cols) - .params({ lo, hi }) - .filter(d => lo < d.a && d.a < hi) - .reify(); - - tableEqual(t, ft, { a: [3, 5], b: [4, 6] }, 'parameter filtered data'); - t.deepEqual(ft.params(), { lo: 1, hi: 7 }); - t.end(); -}); - -tape('parse supports table expression with object-valued parameter', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const arr = [1, 7]; - const at = table(cols) - .params({ arr }) - .filter(d => arr[0] < d.a && d.a < arr[1]) - .reify(); - tableEqual(t, at, { a: [3, 5], b: [4, 6] }, 'array parameter filtered data'); - t.deepEqual(at.params(), { arr }); - - const obj = { lo: 1, hi: 7 }; - const ot = table(cols) - .params({ obj }) - .filter(d => obj.lo < d.a && d.a < obj.hi) - .reify(); - tableEqual(t, ot, { a: [3, 5], b: [4, 6] }, 'object parameter filtered data'); - t.deepEqual(ot.params(), { obj }); - - t.end(); -}); - -tape('parse throws on invalid parameter', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - t.throws( - () => { table(cols).filter((d, $) => $.lo < d.a && d.a < $.hi); }, - 'throws on undefined parameter' - ); - t.end(); -}); \ No newline at end of file diff --git a/test/expression/parse-test.js b/test/expression/parse-test.js index 4dac5ba9..76686eb2 100644 --- 
a/test/expression/parse-test.js +++ b/test/expression/parse-test.js @@ -1,24 +1,22 @@ -import tape from 'tape'; -import parse from '../../src/expression/parse'; -import op from '../../src/op/op-api'; -import rolling from '../../src/helpers/rolling'; +import assert from 'node:assert'; +import { op, parse, rolling } from '../../src/index.js'; // pass code through for testing const compiler = { param: x => x, expr: x => x }; -function test(t, input) { +function test(input) { const { ops, names, exprs } = parse(input, { compiler }); - t.deepEqual(ops, [ - { name: 'mean', fields: ['data.a.get(row)'], params: [], id: 0 }, - { name: 'corr', fields: ['data.a.get(row)', 'data.b.get(row)'], params: [], id: 1}, - { name: 'quantile', fields: ['(-data.bar.get(row))'], params: ['(0.5 / 2)'], id: 2}, - { name: 'lag', fields: ['data.value.get(row)'], params: [2], id: 3 }, - { name: 'mean', fields: ['data.value.get(row)'], params: [], frame: [-3, 3], peers: false, id: 4 }, + assert.deepEqual(ops, [ + { name: 'mean', fields: ['data.a.at(row)'], params: [], id: 0 }, + { name: 'corr', fields: ['data.a.at(row)', 'data.b.at(row)'], params: [], id: 1}, + { name: 'quantile', fields: ['(-data.bar.at(row))'], params: ['(0.5 / 2)'], id: 2}, + { name: 'lag', fields: ['data.value.at(row)'], params: [2], id: 3 }, + { name: 'mean', fields: ['data.value.at(row)'], params: [], frame: [-3, 3], peers: false, id: 4 }, { name: 'count', fields: [], params: [], frame: [-3, 3], peers: true, id: 5 } ], 'parsed operators'); - t.deepEqual(names, [ + assert.deepEqual(names, [ 'constant', 'column', 'agg1', @@ -29,709 +27,672 @@ function test(t, input) { 'win3' ], 'parsed output names'); - t.deepEqual(exprs, [ + assert.deepEqual(exprs, [ '(1 + 1)', - '(data.a.get(row) * data.b.get(row))', + '(data.a.at(row) * data.b.at(row))', 'op(0,row)', 'op(1,row)', '(1 + op(2,row))', - '(data.value.get(row) - op(3,row))', + '(data.value.at(row) - op(3,row))', 'op(4,row)', 'op(5,row)' ], 'parsed output expressions'); } 
-tape('parse parses expressions with global operator names', t => { - /* eslint-disable no-undef */ - test(t, { - constant: () => 1 + 1, - column: d => d.a * d.b, - agg1: d => mean(d.a), - agg2: d => corr(d.a, d.b), - agg3: d => 1 + quantile(-d.bar, 0.5/2), - win1: d => d.value - lag(d.value, 2), - win2: rolling(d => mean(d.value), [-3, 3]), - win3: rolling(() => count(), [-3, 3], true) +describe('parse', () => { + it('parses expressions with global operator names', () => { + /* eslint-disable no-undef */ + test({ + constant: () => 1 + 1, + column: d => d.a * d.b, + agg1: d => mean(d.a), + agg2: d => corr(d.a, d.b), + agg3: d => 1 + quantile(-d.bar, 0.5/2), + win1: d => d.value - lag(d.value, 2), + win2: rolling(d => mean(d.value), [-3, 3]), + win3: rolling(() => count(), [-3, 3], true) + }); + /* eslint-enable */ }); - /* eslint-enable */ - t.end(); -}); - -tape('parse parses expressions with operator object', t => { - test(t, { - constant: () => 1 + 1, - column: d => d.a * d.b, - agg1: d => op.mean(d.a), - agg2: d => op.corr(d.a, d.b), - agg3: d => 1 + op.quantile(-d.bar, 0.5/2), - win1: d => d.value - op.lag(d.value, 2), - win2: rolling(d => op.mean(d.value), [-3, 3]), - win3: rolling(() => op.count(), [-3, 3], true) + it('parses expressions with operator object', () => { + test({ + constant: () => 1 + 1, + column: d => d.a * d.b, + agg1: d => op.mean(d.a), + agg2: d => op.corr(d.a, d.b), + agg3: d => 1 + op.quantile(-d.bar, 0.5/2), + win1: d => d.value - op.lag(d.value, 2), + win2: rolling(d => op.mean(d.value), [-3, 3]), + win3: rolling(() => op.count(), [-3, 3], true) + }); }); - t.end(); -}); - -tape('parse parses expressions with nested operator object', t => { - const aq = { op }; - - test(t, { - constant: () => 1 + 1, - column: d => d.a * d.b, - agg1: d => aq.op.mean(d.a), - agg2: d => aq.op.corr(d.a, d.b), - agg3: d => 1 + aq.op.quantile(-d.bar, 0.5/2), - win1: d => d.value - aq.op.lag(d.value, 2), - win2: rolling(d => aq.op.mean(d.value), [-3, 3]), - 
win3: rolling(() => aq.op.count(), [-3, 3], true) + it('parses expressions with nested operator object', () => { + const aq = { op }; + test({ + constant: () => 1 + 1, + column: d => d.a * d.b, + agg1: d => aq.op.mean(d.a), + agg2: d => aq.op.corr(d.a, d.b), + agg3: d => 1 + aq.op.quantile(-d.bar, 0.5/2), + win1: d => d.value - aq.op.lag(d.value, 2), + win2: rolling(d => aq.op.mean(d.value), [-3, 3]), + win3: rolling(() => aq.op.count(), [-3, 3], true) + }); }); - t.end(); -}); - -tape('parse parses expressions with Math object', t => { - t.equal( - parse({ f: d => Math.sqrt(d.x) }).exprs[0] + '', - '(row,data,op)=>fn.sqrt(data.x.get(row))', - 'parse Math.sqrt' - ); - - t.equal( - parse({ f: d => Math.max(d.x) }).exprs[0] + '', - '(row,data,op)=>fn.greatest(data.x.get(row))', - 'parse Math.max, rewrite as greatest' - ); - - t.equal( - parse({ f: d => Math.min(d.x) }).exprs[0] + '', - '(row,data,op)=>fn.least(data.x.get(row))', - 'parse Math.min, rewrite as least' - ); - - t.end(); -}); - -tape('parse parses expressions with constant values', t => { - function constant(string, result) { - const { exprs } = parse({ f: `d => ${string}` }); - t.equal( - exprs[0] + '', - `(row,data,op)=>${result}`, - `parsed ${string} constant` + it('parses expressions with Math object', () => { + assert.equal( + parse({ f: d => Math.sqrt(d.x) }).exprs[0] + '', + '(row,data,op)=>fn.sqrt(data.x.at(row))', + 'parse Math.sqrt' ); - } - - constant('undefined', 'void(0)'); - constant('Infinity', 'Number.POSITIVE_INFINITY'); - constant('NaN', 'Number.NaN'); - constant('E', 'Math.E'); - constant('LN2', 'Math.LN2'); - constant('LN10', 'Math.LN10'); - constant('LOG2E', 'Math.LOG2E'); - constant('LOG10E', 'Math.LOG10E'); - constant('PI', 'Math.PI'); - constant('SQRT1_2', 'Math.SQRT1_2'); - constant('SQRT2', 'Math.SQRT2'); - - constant('Math.E', 'Math.E'); - constant('Math.LN2', 'Math.LN2'); - constant('Math.LN10', 'Math.LN10'); - constant('Math.LOG2E', 'Math.LOG2E'); - constant('Math.LOG10E', 
'Math.LOG10E'); - constant('Math.PI', 'Math.PI'); - constant('Math.SQRT1_2', 'Math.SQRT1_2'); - constant('Math.SQRT2', 'Math.SQRT2'); - - t.throws(() => constant('Object'), 'throws on constant Object'); - t.throws(() => constant('Object.keys'), 'throws on constant Object.keys'); - t.throws(() => constant('Number.NaN'), 'throws on constant Number.NaN'); - - t.end(); -}); -tape('parse parses expressions with literal values', t => { - function literal(string, result) { - const { exprs } = parse({ f: `d => ${string}` }); - t.equal( - exprs[0] + '', - `(row,data,op)=>${result}`, - `parsed ${string} literal` + assert.equal( + parse({ f: d => Math.max(d.x) }).exprs[0] + '', + '(row,data,op)=>fn.greatest(data.x.at(row))', + 'parse Math.max, rewrite as greatest' ); - } - - literal('1', '1'); - literal('1e-5', '1e-5'); - literal('true', 'true'); - literal('false', 'false'); - literal('"foo"', '"foo"'); - literal('[1,2,3]', '[1,2,3]'); - literal('({a:1})', '({a:1})'); - literal('({"b":2})', '({"b":2})'); - t.end(); -}); -tape('parse parses column references with nested properties', t => { - t.equal( - parse({ f: d => d.x.y }).exprs[0] + '', - '(row,data,op)=>data.x.get(row).y', - 'parsed nested members' - ); - - t.equal( - parse({ f: d => d['x'].y }).exprs[0] + '', - '(row,data,op)=>data["x"].get(row).y', - 'parsed nested members' - ); - - t.equal( - parse({ f: d => d['x']['y'] }).exprs[0] + '', - '(row,data,op)=>data["x"].get(row)[\'y\']', - 'parsed nested members' - ); - - t.end(); -}); + assert.equal( + parse({ f: d => Math.min(d.x) }).exprs[0] + '', + '(row,data,op)=>fn.least(data.x.at(row))', + 'parse Math.min, rewrite as least' + ); + }); -tape('parse parses indirect column names', t => { - // direct expression - t.equal( - parse({ f: d => d['x' + 'y'] }).exprs[0] + '', - '(row,data,op)=>data["xy"].get(row)', - 'parsed indirect member as expression' - ); - - // parameter reference - const opt = { - table: { - params: () => ({ col: 'a' }), - column: (name) => name == 'a' 
? {} : null + it('parses expressions with constant values', () => { + function constant(string, result) { + const { exprs } = parse({ f: `d => ${string}` }); + assert.equal( + exprs[0] + '', + `(row,data,op)=>${result}`, + `parsed ${string} constant` + ); } - }; - t.equal( - parse({ f: (d, $) => d[$.col] }, opt).exprs[0] + '', - '(row,data,op)=>data["a"].get(row)', - 'parsed indirect member as param' - ); - - // variable reference - t.throws( - () => parse({ - f: d => { - const col = 'a'; - return d[col]; - } - }), - 'throws on indirect variable' - ); - - // variable reference - t.throws( - () => parse({ f: d => d[d.foo] }), - 'throws on nested column reference' - ); - t.end(); -}); - -tape('parse throws on invalid column names', t => { - const opt = { table: { params: () => ({}), data: () => ({}) } }; - t.throws(() => parse({ f: d => d.foo }, opt)); - t.throws(() => parse({ f: ({ foo }) => foo }, opt)); - t.end(); -}); - -tape('parse parses expressions with op parameter expressions', t => { - const exprs = parse({ - op: d => op.quantile(d.a, op.abs(op.sqrt(0.25))) + constant('undefined', 'void(0)'); + constant('Infinity', 'Number.POSITIVE_INFINITY'); + constant('NaN', 'Number.NaN'); + constant('E', 'Math.E'); + constant('LN2', 'Math.LN2'); + constant('LN10', 'Math.LN10'); + constant('LOG2E', 'Math.LOG2E'); + constant('LOG10E', 'Math.LOG10E'); + constant('PI', 'Math.PI'); + constant('SQRT1_2', 'Math.SQRT1_2'); + constant('SQRT2', 'Math.SQRT2'); + + constant('Math.E', 'Math.E'); + constant('Math.LN2', 'Math.LN2'); + constant('Math.LN10', 'Math.LN10'); + constant('Math.LOG2E', 'Math.LOG2E'); + constant('Math.LOG10E', 'Math.LOG10E'); + constant('Math.PI', 'Math.PI'); + constant('Math.SQRT1_2', 'Math.SQRT1_2'); + constant('Math.SQRT2', 'Math.SQRT2'); + + assert.throws(() => constant('Object'), 'throws on constant Object'); + assert.throws(() => constant('Object.keys'), 'throws on constant Object.keys'); + assert.throws(() => constant('Number.NaN'), 'throws on constant 
Number.NaN'); }); - t.equal( - exprs.ops[0].params[0], 0.5, 'calculated op param' - ); - t.end(); -}); -tape('parse throws on invalid op parameter expressions', t => { - t.throws(() => parse({ op: d => op.quantile(d.a, d.b) })); - t.throws(() => parse({ op: d => op.sum(op.mean(d.a)) })); - t.throws(() => parse({ op: d => op.sum(op.lag(d.a)) })); - t.throws(() => parse({ op: d => op.lag(op.sum(d.a)) })); - t.throws(() => parse({ - op: d => { - const value = 0.5; - return op.quantile(d.a, value); - } - })); - t.throws(() => parse({ - op: d => { - const value = 0.5; - return op.quantile(d.a + value, 0.5); + it('parses expressions with literal values', () => { + function literal(string, result) { + const { exprs } = parse({ f: `d => ${string}` }); + assert.equal( + exprs[0] + '', + `(row,data,op)=>${result}`, + `parsed ${string} literal` + ); } - })); - t.end(); -}); - -tape('parse parses computed object properties', t => { - const { exprs } = parse({ f: d => ({ [d.x]: d.y }) }); - t.equal( - exprs[0] + '', - '(row,data,op)=>({[data.x.get(row)]:data.y.get(row)})', - 'parsed computed object property' - ); - t.end(); -}); -tape('parse parses template literals', t => { - const { exprs } = parse({ f: d => `${d.x} + ${d.y}` }); - t.equal( - exprs[0] + '', - '(row,data,op)=>`${data.x.get(row)} + ${data.y.get(row)}`', - 'parsed template literal' - ); - t.end(); -}); + literal('1', '1'); + literal('1e-5', '1e-5'); + literal('true', 'true'); + literal('false', 'false'); + literal('"foo"', '"foo"'); + literal('[1,2,3]', '[1,2,3]'); + literal('({a:1})', '({a:1})'); + literal('({"b":2})', '({"b":2})'); + }); -tape('parse parses expressions with block statements', t => { - const exprs = { - val: d => { const s = op.sum(d.a); return s * s; } - }; - - t.deepEqual( - parse(exprs, { compiler }), - { - names: [ 'val' ], - exprs: [ '{const s=op(0,row);return (s * s);}' ], - ops: [ - { name: 'sum', fields: [ 'data.a.get(row)' ], params: [], id: 0 } - ] - }, - 'parsed block' - ); + 
it('parses column references with nested properties', () => { + assert.equal( + parse({ f: d => d.x.y }).exprs[0] + '', + '(row,data,op)=>data.x.at(row).y', + 'parsed nested members' + ); - t.equal( - parse(exprs).exprs[0] + '', - '(row,data,op)=>{const s=op(0,row);return (s * s);}', - 'compiled block' - ); + assert.equal( + parse({ f: d => d['x'].y }).exprs[0] + '', + '(row,data,op)=>data["x"].at(row).y', + 'parsed nested members' + ); - t.end(); -}); + assert.equal( + parse({ f: d => d['x']['y'] }).exprs[0] + '', + '(row,data,op)=>data["x"].at(row)[\'y\']', + 'parsed nested members' + ); + }); -tape('parse parses expressions with if statements', t => { - const exprs = { - val1: () => { - const d = 3 - 2; - if (d < 1) { return 1; } else { return 0; } - }, - val2: () => { - const d = 3 - 2; - if (d < 1) { return 1; } - return 0; - } - }; - - t.deepEqual( - parse(exprs, { compiler }), - { - names: ['val1', 'val2'], - exprs: [ - '{const d=(3 - 2);if ((d < 1)){return 1;} else {return 0;};}', - '{const d=(3 - 2);if ((d < 1)){return 1;};return 0;}' - ], - ops: [] - }, - 'parsed if' - ); - - t.end(); -}); + it('parses indirect column names', () => { + // direct expression + assert.equal( + parse({ f: d => d['x' + 'y'] }).exprs[0] + '', + '(row,data,op)=>data["xy"].at(row)', + 'parsed indirect member as expression' + ); -tape('parse parses expressions with switch statements', t => { - const exprs = { - val: () => { - const v = 'foo'; - switch (v) { - case 'foo': return 1; - case 'bar': return 2; - default: return 3; + // parameter reference + const opt = { + table: { + params: () => ({ col: 'a' }), + column: (name) => name == 'a' ? 
{} : null } - } - }; + }; + assert.equal( + parse({ f: (d, $) => d[$.col] }, opt).exprs[0] + '', + '(row,data,op)=>data["a"].at(row)', + 'parsed indirect member as param' + ); - t.equal( - parse(exprs, { compiler }).exprs[0], - '{const v=\'foo\';switch (v) {case \'foo\': return 1;case \'bar\': return 2;default: return 3;};}', - 'parsed switch' - ); + // variable reference + assert.throws( + () => parse({ + f: d => { + const col = 'a'; + return d[col]; + } + }), + 'throws on indirect variable' + ); - t.end(); -}); + // variable reference + assert.throws( + () => parse({ f: d => d[d.foo] }), + 'throws on nested column reference' + ); + }); -tape('parse parses expressions with destructuring assignments', t => { - const exprs = { - arr: () => { - const [start, stop, step] = op.bins('value'); - return op.bin('value', start, stop, step); - }, - obj: () => { - const { start, stop, step } = op.bins('value'); - return op.bin('value', start, stop, step); - }, - nest: () => { - const { start: [{ baz: bop }], stop, step } = op.bins('value'); - return op.bin('value', bop, stop, step); - } - }; - - t.deepEqual( - parse(exprs, { compiler }), - { - names: ['arr', 'obj', 'nest'], - exprs: [ - '{const [start,stop,step]=op(0,row);return fn.bin(\'value\',start,stop,step);}', - '{const {start:start,stop:stop,step:step}=op(0,row);return fn.bin(\'value\',start,stop,step);}', - '{const {start:[{baz:bop}],stop:stop,step:step}=op(0,row);return fn.bin(\'value\',bop,stop,step);}' - ], - ops: [ - { name: 'bins', fields: [ '\'value\'' ], params: [], id: 0 } - ] - }, - 'parsed destructuring assignmeents' - ); + it('throws on invalid column names', () => { + const opt = { table: { params: () => ({}), data: () => ({}) } }; + assert.throws(() => parse({ f: d => d.foo }, opt)); + assert.throws(() => parse({ f: ({ foo }) => foo }, opt)); + }); - t.end(); -}); + it('parses expressions with op parameter expressions', () => { + const exprs = parse({ + op: d => op.quantile(d.a, op.abs(op.sqrt(0.25))) + 
}); + assert.equal( + exprs.ops[0].params[0], 0.5, 'calculated op param' + ); + }); -tape('parse throws on expressions with for loops', t => { - const exprs = { - val: () => { - let v = 0; - for (let i = 0; i < 5; ++i) { - v += i; + it('throws on invalid op parameter expressions', () => { + assert.throws(() => parse({ op: d => op.quantile(d.a, d.b) })); + assert.throws(() => parse({ op: d => op.sum(op.mean(d.a)) })); + assert.throws(() => parse({ op: d => op.sum(op.lag(d.a)) })); + assert.throws(() => parse({ op: d => op.lag(op.sum(d.a)) })); + assert.throws(() => parse({ + op: d => { + const value = 0.5; + return op.quantile(d.a, value); } - return v; - } - }; - t.throws(() => parse(exprs), 'no for loops'); - t.end(); -}); - -tape('parse throws on expressions with while loops', t => { - const exprs = { - val: () => { - let v = 0; - let i = 0; - while (i < 5) { - v += i++; + })); + assert.throws(() => parse({ + op: d => { + const value = 0.5; + return op.quantile(d.a + value, 0.5); } - return v; - } - }; - t.throws(() => parse(exprs), 'no while loops'); - t.end(); -}); + })); + }); -tape('parse throws on expressions with do-while loops', t => { - const exprs = { - val: () => { - let v = 0; - let i = 0; - do { - v += i; - } while (++i < 5); - return v; - } - }; - t.throws(() => parse(exprs), 'no do-while loops'); - t.end(); -}); + it('parses computed object properties', () => { + const { exprs } = parse({ f: d => ({ [d.x]: d.y }) }); + assert.equal( + exprs[0] + '', + '(row,data,op)=>({[data.x.at(row)]:data.y.at(row)})', + 'parsed computed object property' + ); + }); -tape('parse throws on expressions with comma sequences', t => { - const exprs = { val: () => (1, 2) }; - t.throws(() => parse(exprs), 'no comma sequences'); - t.end(); -}); + it('parses template literals', () => { + const { exprs } = parse({ f: d => `${d.x} + ${d.y}` }); + assert.equal( + exprs[0] + '', + '(row,data,op)=>`${data.x.at(row)} + ${data.y.at(row)}`', + 'parsed template literal' + ); + }); 
-tape('parse throws on dirty tricks', t => { - t.throws(() => parse({ f: () => globalThis }), 'no globalThis access'); - t.throws(() => parse({ f: () => global }), 'no global access'); - t.throws(() => parse({ f: () => window }), 'no window access'); - t.throws(() => parse({ f: () => self }), 'no self access'); - t.throws(() => parse({ f: () => this }), 'no this access'); - t.throws(() => parse({ f: () => Object }), 'no Object access'); - t.throws(() => parse({ f: () => Date }), 'no Date access'); - t.throws(() => parse({ f: () => Array }), 'no Array access'); - t.throws(() => parse({ f: () => Number }), 'no Number access'); - t.throws(() => parse({ f: () => Math }), 'no Math access'); - t.throws(() => parse({ f: () => String }), 'no String access'); - t.throws(() => parse({ f: () => RegExp }), 'no RegExp access'); - - t.throws(() => parse({ - f: () => { const foo = [].constructor; return new foo(3); } - }), 'no instantiation'); - - t.throws(() => parse({ - f: () => [].constructor() - }), 'no property invocation'); - - t.throws(() => parse({ - f: () => [].__proto__.unsafe = 1 - }), 'no __proto__ assignment'); - - t.throws(() => parse({ - f: () => 'abc'.toUpperCase() - }), 'no literal method calls'); - - t.throws(() => parse({ - f: () => { const s = 'abc'; return s.toUpperCase(); } - }), 'no identifier method calls'); - - t.throws(() => parse({ - f: () => ('abc')['toUpperCase']() - }), 'no indirect method calls'); - - t.throws(() => parse({ - f: 'd => op.mean(var foo = d.x)' - }), 'no funny business'); - - t.end(); -}); + it('parses expressions with block statements', () => { + const exprs = { + val: d => { const s = op.sum(d.a); return s * s; } + }; -tape('parse supports ast output option', t => { - const ast = parse({ - constant: () => 1 + Math.E, - column: d => d.a * d.b, - agg1: d => op.mean(d.a), - agg2: d => op.corr(d.a, d.b), - agg3: d => 1 + op.quantile(-d.bar, 0.5/2), - win1: d => d.value - op.lag(d.value, 2), - win2: rolling(d => op.mean(d.value), [-3, 
3]), - win3: rolling(() => op.count(), [-3, 3], true) - }, { ast: true }); - - t.deepEqual( - JSON.parse(JSON.stringify(ast.exprs)), - [ + assert.deepEqual( + parse(exprs, { compiler }), { - 'type': 'BinaryExpression', - 'left': { - 'type': 'Literal', - 'value': 1, - 'raw': '1' - }, - 'operator': '+', - 'right': { - 'type': 'Constant', - 'name': 'E', - 'raw': 'Math.E' - } + names: [ 'val' ], + exprs: [ '{const s=op(0,row);return (s * s);}' ], + ops: [ + { name: 'sum', fields: [ 'data.a.at(row)' ], params: [], id: 0 } + ] }, + 'parsed block' + ); + + assert.equal( + parse(exprs).exprs[0] + '', + '(row,data,op)=>{const s=op(0,row);return (s * s);}', + 'compiled block' + ); + }); + + it('parses expressions with if statements', () => { + const exprs = { + val1: () => { + const d = 3 - 2; + if (d < 1) { return 1; } else { return 0; } + }, + val2: () => { + const d = 3 - 2; + if (d < 1) { return 1; } + return 0; + } + }; + + assert.deepEqual( + parse(exprs, { compiler }), { - 'type': 'BinaryExpression', - 'left': { - 'type': 'Column', - 'name': 'a' - }, - 'operator': '*', - 'right': { - 'type': 'Column', - 'name': 'b' + names: ['val1', 'val2'], + exprs: [ + '{const d=(3 - 2);if ((d < 1)){return 1;} else {return 0;};}', + '{const d=(3 - 2);if ((d < 1)){return 1;};return 0;}' + ], + ops: [] + }, + 'parsed if' + ); + }); + + it('parses expressions with switch statements', () => { + const exprs = { + val: () => { + const v = 'foo'; + switch (v) { + case 'foo': return 1; + case 'bar': return 2; + default: return 3; } + } + }; + + assert.equal( + parse(exprs, { compiler }).exprs[0], + '{const v=\'foo\';switch (v) {case \'foo\': return 1;case \'bar\': return 2;default: return 3;};}', + 'parsed switch' + ); + }); + + it('parses expressions with destructuring assignments', () => { + const exprs = { + arr: () => { + const [start, stop, step] = op.bins('value'); + return op.bin('value', start, stop, step); + }, + obj: () => { + const { start, stop, step } = op.bins('value'); + 
return op.bin('value', start, stop, step); }, + nest: () => { + const { start: [{ baz: bop }], stop, step } = op.bins('value'); + return op.bin('value', bop, stop, step); + } + }; + + assert.deepEqual( + parse(exprs, { compiler }), { - 'type': 'CallExpression', - 'callee': { - 'type': 'Function', - 'name': 'mean' - }, - 'arguments': [ - { - 'type': 'Column', - 'name': 'a' - } + names: ['arr', 'obj', 'nest'], + exprs: [ + '{const [start,stop,step]=op(0,row);return fn.bin(\'value\',start,stop,step);}', + '{const {start:start,stop:stop,step:step}=op(0,row);return fn.bin(\'value\',start,stop,step);}', + '{const {start:[{baz:bop}],stop:stop,step:step}=op(0,row);return fn.bin(\'value\',bop,stop,step);}' + ], + ops: [ + { name: 'bins', fields: [ '\'value\'' ], params: [], id: 0 } ] }, - { - 'type': 'CallExpression', - 'callee': { - 'type': 'Function', - 'name': 'corr' + 'parsed destructuring assignmeents' + ); + }); + + it('throws on expressions with for loops', () => { + const exprs = { + val: () => { + let v = 0; + for (let i = 0; i < 5; ++i) { + v += i; + } + return v; + } + }; + assert.throws(() => parse(exprs), 'no for loops'); + }); + + it('throws on expressions with while loops', () => { + const exprs = { + val: () => { + let v = 0; + let i = 0; + while (i < 5) { + v += i++; + } + return v; + } + }; + assert.throws(() => parse(exprs), 'no while loops'); + }); + + it('throws on expressions with do-while loops', () => { + const exprs = { + val: () => { + let v = 0; + let i = 0; + do { + v += i; + } while (++i < 5); + return v; + } + }; + assert.throws(() => parse(exprs), 'no do-while loops'); + }); + + it('throws on expressions with comma sequences', () => { + const exprs = { val: () => (1, 2) }; + assert.throws(() => parse(exprs), 'no comma sequences'); + }); + + it('throws on dirty tricks', () => { + assert.throws(() => parse({ f: () => globalThis }), 'no globalThis access'); + assert.throws(() => parse({ f: () => global }), 'no global access'); + assert.throws(() 
=> parse({ f: () => window }), 'no window access'); + assert.throws(() => parse({ f: () => self }), 'no self access'); + assert.throws(() => parse({ f: () => this }), 'no this access'); + assert.throws(() => parse({ f: () => Object }), 'no Object access'); + assert.throws(() => parse({ f: () => Date }), 'no Date access'); + assert.throws(() => parse({ f: () => Array }), 'no Array access'); + assert.throws(() => parse({ f: () => Number }), 'no Number access'); + assert.throws(() => parse({ f: () => Math }), 'no Math access'); + assert.throws(() => parse({ f: () => String }), 'no String access'); + assert.throws(() => parse({ f: () => RegExp }), 'no RegExp access'); + + assert.throws(() => parse({ + f: () => { const foo = [].constructor; return new foo(3); } + }), 'no instantiation'); + + assert.throws(() => parse({ + f: () => [].constructor() + }), 'no property invocation'); + + assert.throws(() => parse({ + f: () => [].__proto__.unsafe = 1 + }), 'no __proto__ assignment'); + + assert.throws(() => parse({ + f: () => 'abc'.toUpperCase() + }), 'no literal method calls'); + + assert.throws(() => parse({ + f: () => { const s = 'abc'; return s.toUpperCase(); } + }), 'no identifier method calls'); + + assert.throws(() => parse({ + f: () => ('abc')['toUpperCase']() + }), 'no indirect method calls'); + + assert.throws(() => parse({ + f: 'd => op.mean(var foo = d.x)' + }), 'no funny business'); + }); + + it('supports ast output option', () => { + const ast = parse({ + constant: () => 1 + Math.E, + column: d => d.a * d.b, + agg1: d => op.mean(d.a), + agg2: d => op.corr(d.a, d.b), + agg3: d => 1 + op.quantile(-d.bar, 0.5/2), + win1: d => d.value - op.lag(d.value, 2), + win2: rolling(d => op.mean(d.value), [-3, 3]), + win3: rolling(() => op.count(), [-3, 3], true) + }, { ast: true }); + + assert.deepEqual( + JSON.parse(JSON.stringify(ast.exprs)), + [ + { + 'type': 'BinaryExpression', + 'left': { + 'type': 'Literal', + 'value': 1, + 'raw': '1' + }, + 'operator': '+', + 'right': 
{ + 'type': 'Constant', + 'name': 'E', + 'raw': 'Math.E' + } }, - 'arguments': [ - { + { + 'type': 'BinaryExpression', + 'left': { 'type': 'Column', 'name': 'a' }, - { + 'operator': '*', + 'right': { 'type': 'Column', 'name': 'b' } - ] - }, - { - 'type': 'BinaryExpression', - 'left': { - 'type': 'Literal', - 'value': 1, - 'raw': '1' }, - 'operator': '+', - 'right': { + { 'type': 'CallExpression', 'callee': { 'type': 'Function', - 'name': 'quantile' + 'name': 'mean' }, 'arguments': [ { - 'type': 'UnaryExpression', - 'operator': '-', - 'prefix': true, - 'argument': { - 'type': 'Column', - 'name': 'bar' - } + 'type': 'Column', + 'name': 'a' + } + ] + }, + { + 'type': 'CallExpression', + 'callee': { + 'type': 'Function', + 'name': 'corr' + }, + 'arguments': [ + { + 'type': 'Column', + 'name': 'a' }, { - 'type': 'BinaryExpression', - 'left': { - 'type': 'Literal', - 'value': 0.5, - 'raw': '0.5' + 'type': 'Column', + 'name': 'b' + } + ] + }, + { + 'type': 'BinaryExpression', + 'left': { + 'type': 'Literal', + 'value': 1, + 'raw': '1' + }, + 'operator': '+', + 'right': { + 'type': 'CallExpression', + 'callee': { + 'type': 'Function', + 'name': 'quantile' + }, + 'arguments': [ + { + 'type': 'UnaryExpression', + 'operator': '-', + 'prefix': true, + 'argument': { + 'type': 'Column', + 'name': 'bar' + } + }, + { + 'type': 'BinaryExpression', + 'left': { + 'type': 'Literal', + 'value': 0.5, + 'raw': '0.5' + }, + 'operator': '/', + 'right': { + 'type': 'Literal', + 'value': 2, + 'raw': '2' + } + } + ] + } + }, + { + 'type': 'BinaryExpression', + 'left': { + 'type': 'Column', + 'name': 'value' + }, + 'operator': '-', + 'right': { + 'type': 'CallExpression', + 'callee': { + 'type': 'Function', + 'name': 'lag' + }, + 'arguments': [ + { + 'type': 'Column', + 'name': 'value' }, - 'operator': '/', - 'right': { + { 'type': 'Literal', 'value': 2, 'raw': '2' } - } - ] - } - }, - { - 'type': 'BinaryExpression', - 'left': { - 'type': 'Column', - 'name': 'value' + ] + } }, - 'operator': 
'-', - 'right': { + { 'type': 'CallExpression', 'callee': { 'type': 'Function', - 'name': 'lag' + 'name': 'mean' }, 'arguments': [ { 'type': 'Column', 'name': 'value' - }, - { - 'type': 'Literal', - 'value': 2, - 'raw': '2' } ] + }, + { + 'type': 'CallExpression', + 'callee': { + 'type': 'Function', + 'name': 'count' + }, + 'arguments': [] } - }, + ] + ); + }); + + it('optimizes dictionary references', () => { + const cols = { v: { keyFor() { return 1; } } }; + const dt = { column: name => cols[name] }; + + const optimized = { + l_eq2: d => d.v == 'a', + r_eq2: d => 'a' == d.v, + l_eq3: d => d.v === 'a', + r_eq3: d => 'a' === d.v, + l_ne2: d => d.v != 'a', + r_ne2: d => 'a' != d.v, + l_ne3: d => d.v !== 'a', + r_ne3: d => 'a' !== d.v, + l_eqo: d => op.equal(d.v, 'a'), + r_eqo: d => op.equal('a', d.v), + destr: ({ v }) => v === 'a' + }; + + assert.deepEqual( + parse(optimized, { compiler, table: dt }), { - 'type': 'CallExpression', - 'callee': { - 'type': 'Function', - 'name': 'mean' - }, - 'arguments': [ - { - 'type': 'Column', - 'name': 'value' - } - ] + names: [ + 'l_eq2', 'r_eq2', + 'l_eq3', 'r_eq3', + 'l_ne2', 'r_ne2', + 'l_ne3', 'r_ne3', + 'l_eqo', 'r_eqo', + 'destr' + ], + exprs: [ + '(data.v.key(row) == 1)', + '(1 == data.v.key(row))', + '(data.v.key(row) === 1)', + '(1 === data.v.key(row))', + '(data.v.key(row) != 1)', + '(1 != data.v.key(row))', + '(data.v.key(row) !== 1)', + '(1 !== data.v.key(row))', + 'fn.equal(data.v.key(row),1)', + 'fn.equal(1,data.v.key(row))', + '(data.v.key(row) === 1)' + ], + ops: [] }, - { - 'type': 'CallExpression', - 'callee': { - 'type': 'Function', - 'name': 'count' - }, - 'arguments': [] - } - ] - ); + 'optimized references' + ); - t.end(); -}); + const unoptimized = { + ref: d => d.v, + nest: d => d.v.x === 'a', + destr: ({ v }) => v.x === 'a', + l_lte: d => d.v <= 'a', + r_lte: d => 'a' <= d.v + }; -tape('parse optimizes dictionary references', t => { - const cols = { v: { keyFor() { return 1; } } }; - const dt = { column: 
name => cols[name] }; - - const optimized = { - l_eq2: d => d.v == 'a', - r_eq2: d => 'a' == d.v, - l_eq3: d => d.v === 'a', - r_eq3: d => 'a' === d.v, - l_ne2: d => d.v != 'a', - r_ne2: d => 'a' != d.v, - l_ne3: d => d.v !== 'a', - r_ne3: d => 'a' !== d.v, - l_eqo: d => op.equal(d.v, 'a'), - r_eqo: d => op.equal('a', d.v), - destr: ({ v }) => v === 'a' - }; - - t.deepEqual( - parse(optimized, { compiler, table: dt }), - { - names: [ - 'l_eq2', 'r_eq2', - 'l_eq3', 'r_eq3', - 'l_ne2', 'r_ne2', - 'l_ne3', 'r_ne3', - 'l_eqo', 'r_eqo', - 'destr' - ], - exprs: [ - '(data.v.key(row) == 1)', - '(1 == data.v.key(row))', - '(data.v.key(row) === 1)', - '(1 === data.v.key(row))', - '(data.v.key(row) != 1)', - '(1 != data.v.key(row))', - '(data.v.key(row) !== 1)', - '(1 !== data.v.key(row))', - 'fn.equal(data.v.key(row),1)', - 'fn.equal(1,data.v.key(row))', - '(data.v.key(row) === 1)' - ], - ops: [] - }, - 'optimized references' - ); - - const unoptimized = { - ref: d => d.v, - nest: d => d.v.x === 'a', - destr: ({ v }) => v.x === 'a', - l_lte: d => d.v <= 'a', - r_lte: d => 'a' <= d.v - }; - - t.deepEqual( - parse(unoptimized, { compiler, table: dt }), - { - names: [ 'ref', 'nest', 'destr', 'l_lte', 'r_lte' ], - exprs: [ - 'data.v.get(row)', - "(data.v.get(row).x === 'a')", - "(data.v.get(row).x === 'a')", - "(data.v.get(row) <= 'a')", - "('a' <= data.v.get(row))" - ], - ops: [] - }, - 'unoptimized references' - ); - - t.end(); -}); \ No newline at end of file + assert.deepEqual( + parse(unoptimized, { compiler, table: dt }), + { + names: [ 'ref', 'nest', 'destr', 'l_lte', 'r_lte' ], + exprs: [ + 'data.v.at(row)', + "(data.v.at(row).x === 'a')", + "(data.v.at(row).x === 'a')", + "(data.v.at(row) <= 'a')", + "('a' <= data.v.at(row))" + ], + ops: [] + }, + 'unoptimized references' + ); + }); +}); diff --git a/test/format/format-value-test.js b/test/format/format-value-test.js index 5f84f32a..7a97ee7c 100644 --- a/test/format/format-value-test.js +++ 
b/test/format/format-value-test.js @@ -1,11 +1,11 @@ -import tape from 'tape'; -import inferFormat from '../../src/format/infer'; -import formatValue from '../../src/format/value'; +import assert from 'node:assert'; +import inferFormat from '../../src/format/infer.js'; +import formatValue from '../../src/format/value.js'; -function formatsAs(t, values, strings, options) { +function formatsAs(values, strings, options) { const { format } = inferFormat(f => values.forEach(f), options); const out = values.map(v => formatValue(v, format)); - t.deepEqual(out, strings, `formats [${strings.join(', ')}]`); + assert.deepEqual(out, strings, `formats [${strings.join(', ')}]`); } const loc = (y, m, d, H, M, S, u) => @@ -14,121 +14,113 @@ const loc = (y, m, d, H, M, S, u) => const utc = (y, m, d, H, M, S, u) => new Date(Date.UTC(y, m - 1, d, H || 0, M || 0, S || 0, u || 0)); -tape('formatValue formats invalid values', t => { - formatsAs(t, [NaN], ['NaN']); - formatsAs(t, [null], ['null']); - formatsAs(t, [undefined], ['undefined']); - t.end(); +describe('formatValue', () => { + it('formats invalid values', () => { + formatsAs([NaN], ['NaN']); + formatsAs([null], ['null']); + formatsAs([undefined], ['undefined']); + }); + + it('formats boolean values', () => { + formatsAs([true], ['true']); + formatsAs([false], ['false']); + formatsAs([true, false, null], ['true', 'false', 'null']); + }); + + it('formats number values', () => { + // integers + formatsAs([0], ['0']); + formatsAs([-0], ['0']); + formatsAs([1], ['1']); + formatsAs([-1], ['-1']); + + // decimals + formatsAs([Math.E], ['2.718282']); + formatsAs([3.14], ['3.14']); + formatsAs([1/3], ['0.3333'], { maxdigits: 4 }); + formatsAs([1/3], ['0.333333'], { maxdigits: 6 }); + formatsAs([1/3], ['0.33333333'], { maxdigits: 8 }); + formatsAs([-1/3], ['-0.3333'], { maxdigits: 4 }); + formatsAs([-1/3], ['-0.333333'], { maxdigits: 6 }); + formatsAs([-1/3], ['-0.33333333'], { maxdigits: 8 }); + + // fixed -> exponential + 
formatsAs([0.1], ['0.1'], { maxdigits: 4 }); + formatsAs([0.01], ['0.01'], { maxdigits: 4 }); + formatsAs([0.001], ['0.001'], { maxdigits: 4 }); + formatsAs([0.0001], ['0.0001'], { maxdigits: 4 }); + formatsAs([0.00001], ['1.0000e-5'], { maxdigits: 4 }); + formatsAs([0.000001], ['1.0000e-6'], { maxdigits: 4 }); + formatsAs([1e30], ['1e+30']); + formatsAs([-1e30], ['-1e+30']); + formatsAs([1.23e-18], ['1.23e-18']); + formatsAs([-1.23e-18], ['-1.23e-18']); + + // grouped inference + formatsAs([0, 1, 2, 3], ['0', '1', '2', '3']); + formatsAs( + [3.14, null, NaN, 2.71828], + ['3.14000', 'null', 'NaN', '2.71828'] + ); + formatsAs( + [-4/3, -1, -2/3, 1/3, 1, 4/3, 5/3], + ['-1.333', '-1.000', '-0.667', '0.333', '1.000', '1.333', '1.667'], + { maxdigits: 3} + ); + formatsAs( + [-1.23e-18, 9.87654321e24, 1], + ['-1.230000e-18', '9.876543e+24', '1.000000'] + ); + }); + + it('formats date values', () => { + formatsAs([utc(2000, 1, 1)], ['2000-01-01T00:00:00.000Z']); + formatsAs( + [utc(2000, 1, 1), utc(2001, 3, 14)], + ['2000-01-01T00:00:00.000Z', '2001-03-14T00:00:00.000Z'] + ); + + formatsAs([loc(2000, 1, 1)], ['2000-01-01T00:00:00.000']); + formatsAs([loc(2005, 2, 3, 7, 11)], ['2005-02-03T07:11:00.000']); + formatsAs([loc(2005, 2, 3, 7, 11, 0, 5)], ['2005-02-03T07:11:00.005']); + formatsAs( + [loc(2000, 1, 1), loc(2001, 3, 14)], + ['2000-01-01T00:00:00.000', '2001-03-14T00:00:00.000'] + ); + formatsAs( + [loc(2000, 1, 1), loc(2001, 3, 14), loc(2005, 2, 3, 7, 11)], + ['2000-01-01T00:00:00.000', '2001-03-14T00:00:00.000', '2005-02-03T07:11:00.000'] + ); + + formatsAs( + [loc(2000, 1, 1), utc(2001, 3, 14)], + ['2000-01-01T00:00:00.000', '2001-03-13T16:00:00.000'] + ); + formatsAs( + [loc(2000, 1, 1), loc(2001, 3, 14), utc(2005, 2, 3, 7, 11)], + ['2000-01-01T00:00:00.000', '2001-03-14T00:00:00.000', '2005-02-02T23:11:00.000'] + ); + }); + + it('formats array values', () => { + formatsAs([[1, 2, 3]], ['[1,2,3]']); + formatsAs([Int32Array.of(1, 2, 3)], ['[1,2,3]']); + 
formatsAs([Float32Array.of(1, 2, 3)], ['[1,2,3]']); + + formatsAs([['foo']], ['["foo"]']); + formatsAs( + [['foo boo goo woo soo loo roo']], + ['["foo boo goo woo soo loo ro…]'] + ); + }); + + it('formats object values', () => { + formatsAs([{a:1}], ['{"a":1}']); + formatsAs([{a:1}], ['{"a":1}']); + formatsAs([{a: Int32Array.of(1, 2, 3)}], ['{"a":[1,2,3]}']); + formatsAs( + [{key: 'value', 'another key': 'another vlaue'}], + ['{"key":"value","another key"…}'] + ); + }); }); - -tape('formatValue formats boolean values', t => { - formatsAs(t, [true], ['true']); - formatsAs(t, [false], ['false']); - formatsAs(t, [true, false, null], ['true', 'false', 'null']); - t.end(); -}); - -tape('formatValue formats number values', t => { - // integers - formatsAs(t, [0], ['0']); - formatsAs(t, [-0], ['0']); - formatsAs(t, [1], ['1']); - formatsAs(t, [-1], ['-1']); - - // decimals - formatsAs(t, [Math.E], ['2.718282']); - formatsAs(t, [3.14], ['3.14']); - formatsAs(t, [1/3], ['0.3333'], { maxdigits: 4 }); - formatsAs(t, [1/3], ['0.333333'], { maxdigits: 6 }); - formatsAs(t, [1/3], ['0.33333333'], { maxdigits: 8 }); - formatsAs(t, [-1/3], ['-0.3333'], { maxdigits: 4 }); - formatsAs(t, [-1/3], ['-0.333333'], { maxdigits: 6 }); - formatsAs(t, [-1/3], ['-0.33333333'], { maxdigits: 8 }); - - // fixed -> exponential - formatsAs(t, [0.1], ['0.1'], { maxdigits: 4 }); - formatsAs(t, [0.01], ['0.01'], { maxdigits: 4 }); - formatsAs(t, [0.001], ['0.001'], { maxdigits: 4 }); - formatsAs(t, [0.0001], ['0.0001'], { maxdigits: 4 }); - formatsAs(t, [0.00001], ['1.0000e-5'], { maxdigits: 4 }); - formatsAs(t, [0.000001], ['1.0000e-6'], { maxdigits: 4 }); - formatsAs(t, [1e30], ['1e+30']); - formatsAs(t, [-1e30], ['-1e+30']); - formatsAs(t, [1.23e-18], ['1.23e-18']); - formatsAs(t, [-1.23e-18], ['-1.23e-18']); - - // grouped inference - formatsAs(t, [0, 1, 2, 3], ['0', '1', '2', '3']); - formatsAs(t, - [3.14, null, NaN, 2.71828], - ['3.14000', 'null', 'NaN', '2.71828'] - ); - formatsAs(t, - [-4/3, 
-1, -2/3, 1/3, 1, 4/3, 5/3], - ['-1.333', '-1.000', '-0.667', '0.333', '1.000', '1.333', '1.667'], - { maxdigits: 3} - ); - formatsAs(t, - [-1.23e-18, 9.87654321e24, 1], - ['-1.230000e-18', '9.876543e+24', '1.000000'] - ); - - t.end(); -}); - -tape('formatValue formats date values', t => { - formatsAs(t, [utc(2000, 1, 1)], ['2000-01-01T00:00:00.000Z']); - formatsAs(t, - [utc(2000, 1, 1), utc(2001, 3, 14)], - ['2000-01-01T00:00:00.000Z', '2001-03-14T00:00:00.000Z'] - ); - - formatsAs(t, [loc(2000, 1, 1)], ['2000-01-01T00:00:00.000']); - formatsAs(t, [loc(2005, 2, 3, 7, 11)], ['2005-02-03T07:11:00.000']); - formatsAs(t, [loc(2005, 2, 3, 7, 11, 0, 5)], ['2005-02-03T07:11:00.005']); - formatsAs(t, - [loc(2000, 1, 1), loc(2001, 3, 14)], - ['2000-01-01T00:00:00.000', '2001-03-14T00:00:00.000'] - ); - formatsAs(t, - [loc(2000, 1, 1), loc(2001, 3, 14), loc(2005, 2, 3, 7, 11)], - ['2000-01-01T00:00:00.000', '2001-03-14T00:00:00.000', '2005-02-03T07:11:00.000'] - ); - - formatsAs(t, - [loc(2000, 1, 1), utc(2001, 3, 14)], - ['2000-01-01T00:00:00.000', '2001-03-13T16:00:00.000'] - ); - formatsAs(t, - [loc(2000, 1, 1), loc(2001, 3, 14), utc(2005, 2, 3, 7, 11)], - ['2000-01-01T00:00:00.000', '2001-03-14T00:00:00.000', '2005-02-02T23:11:00.000'] - ); - - t.end(); -}); - -tape('formatValue formats array values', t => { - formatsAs(t, [[1, 2, 3]], ['[1,2,3]']); - formatsAs(t, [Int32Array.of(1, 2, 3)], ['[1,2,3]']); - formatsAs(t, [Float32Array.of(1, 2, 3)], ['[1,2,3]']); - - formatsAs(t, [['foo']], ['["foo"]']); - formatsAs(t, - [['foo boo goo woo soo loo roo']], - ['["foo boo goo woo soo loo ro…]'] - ); - - t.end(); -}); - -tape('formatValue formats object values', t => { - formatsAs(t, [{a:1}], ['{"a":1}']); - formatsAs(t, [{a:1}], ['{"a":1}']); - formatsAs(t, [{a: Int32Array.of(1, 2, 3)}], ['{"a":[1,2,3]}']); - formatsAs(t, - [{key: 'value', 'another key': 'another vlaue'}], - ['{"key":"value","another key"…}'] - ); - - t.end(); -}); \ No newline at end of file diff --git 
a/test/format/from-arrow-test.js b/test/format/from-arrow-test.js deleted file mode 100644 index 0461a664..00000000 --- a/test/format/from-arrow-test.js +++ /dev/null @@ -1,112 +0,0 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import fromArrow from '../../src/format/from-arrow'; -import { not } from '../../src/helpers/selection'; -import { table } from '../../src'; -import { isFixedSizeList, isList, isStruct } from '../../src/arrow/arrow-types'; -import { Utf8 } from 'apache-arrow'; - -function arrowTable(data, types) { - return table(data).toArrow({ types }); -} - -tape('fromArrow imports Apache Arrow tables', t => { - const data = { - u: [1, 2, 3, 4, 5], - v: ['a', 'b', null, 'd', 'e'] - }; - const at = arrowTable(data); - - tableEqual(t, fromArrow(at), data, 'arrow data'); - t.end(); -}); - -tape('fromArrow can unpack Apache Arrow tables', t => { - const data = { - u: [1, 2, 3, 4, 5], - v: ['a', 'b', null, 'd', 'e'], - x: ['cc', 'dd', 'cc', 'dd', 'cc'], - y: ['aa', 'aa', null, 'bb', 'bb'] - }; - const at = arrowTable(data, { v: new Utf8() }); - const dt = fromArrow(at); - - tableEqual(t, dt, data, 'arrow data'); - t.ok(dt.column('x').keyFor, 'create dictionary column without nulls'); - t.ok(dt.column('y').keyFor, 'create dictionary column with nulls'); - t.end(); -}); - -tape('fromArrow can select Apache Arrow columns', t => { - const data = { - u: [1, 2, 3, 4, 5], - v: ['a', 'b', null, 'd', 'e'], - x: ['cc', 'dd', 'cc', 'dd', 'cc'], - y: ['aa', 'aa', null, 'bb', 'bb'] - }; - const at = arrowTable(data); - - const s1 = fromArrow(at, { columns: 'x' }); - t.deepEqual(s1.columnNames(), ['x'], 'select by column name'); - tableEqual(t, s1, { x: data.x }, 'correct columns selected'); - - const s2 = fromArrow(at, { columns: ['u', 'y'] }); - t.deepEqual(s2.columnNames(), ['u', 'y'], 'select by column names'); - tableEqual(t, s2, { u: data.u, y: data.y }, 'correct columns selected'); - - const s3 = fromArrow(at, { columns: not('u', 'y') }); - 
t.deepEqual(s3.columnNames(), ['v', 'x'], 'select by helper'); - tableEqual(t, s3, { v: data.v, x: data.x }, 'correct columns selected'); - - const s4 = fromArrow(at, { columns: { u: 'a', x: 'b'} }); - t.deepEqual(s4.columnNames(), ['a', 'b'], 'select by helper'); - tableEqual(t, s4, { a: data.u, b: data.x }, 'correct columns selected'); - - t.end(); -}); - -tape('fromArrow can read Apache Arrow lists', t => { - const l = [[1, 2, 3], null, [4, 5]]; - const at = arrowTable({ l }); - - if (!isList(at.getChild('l').type)) { - t.fail('Arrow column should have List type'); - } - tableEqual(t, fromArrow(at), { l }, 'extract Arrow list'); - t.end(); -}); - -tape('fromArrow can read Apache Arrow fixed-size lists', t => { - const l = [[1, 2], null, [4, 5]]; - const at = arrowTable({ l }); - - if (!isFixedSizeList(at.getChild('l').type)) { - t.fail('Arrow column should have FixedSizeList type'); - } - tableEqual(t, fromArrow(at), { l }, 'extract Arrow list'); - t.end(); -}); - -tape('fromArrow can read Apache Arrow structs', t => { - const s = [{ foo: 1, bar: [2, 3] }, null, { foo: 2, bar: [4] }]; - const at = arrowTable({ s }); - - if (!isStruct(at.getChild('s').type)) { - t.fail('Arrow column should have Struct type'); - } - tableEqual(t, fromArrow(at), { s }, 'extract Arrow struct'); - - t.end(); -}); - -tape('fromArrow can read nested Apache Arrow structs', t => { - const s = [{ foo: 1, bar: { bop: 2 } }, { foo: 2, bar: { bop: 3 } }]; - const at = arrowTable({ s }); - - if (!isStruct(at.getChild('s').type)) { - t.fail('Arrow column should have Struct type'); - } - tableEqual(t, fromArrow(at), { s }, 'extract nested Arrow struct'); - - t.end(); -}); diff --git a/test/format/from-csv-test.js b/test/format/from-csv-test.js index da62b8ab..ae48ea29 100644 --- a/test/format/from-csv-test.js +++ b/test/format/from-csv-test.js @@ -1,6 +1,6 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import fromCSV from '../../src/format/from-csv'; +import assert from 
'node:assert'; +import tableEqual from '../table-equal.js'; +import { fromCSV } from '../../src/index.js'; function data() { return { @@ -21,105 +21,97 @@ const text = [ const tabText = text.map(t => t.split(',').join('\t')); -tape('fromCSV parses delimited text', t => { - const table = fromCSV(text.join('\n')); - t.equal(table.numRows(), 3, 'num rows'); - t.equal(table.numCols(), 5, 'num cols'); - tableEqual(t, table, data(), 'csv parsed data'); - t.end(); -}); - -tape('fromCSV infers types', t => { - function check(msg, values, test) { - const d = fromCSV('col\n' + values.join('\n')).array('col'); - t.ok(d.every(v => v == null || test(v)), msg); - } - - check('boolean', [true, false, '', true], v => typeof v === 'boolean'); - check('number', [1, Math.PI, '', 'NaN'], v => typeof v === 'number'); - check('string', ['a', 1, '', 'c'], v => typeof v === 'string'); - check('date', [ - new Date().toISOString(), '', - new Date(2000, 0, 1).toISOString(), - new Date(1979, 3, 14, 3, 45).toISOString() - ], v => v instanceof Date); - check('date-like strings', ['2022-23', '2023-24'], v => typeof v === 'string'); - t.end(); -}); +describe('fromCSV', () => { + it('parses delimited text', () => { + const table = fromCSV(text.join('\n')); + assert.equal(table.numRows(), 3, 'num rows'); + assert.equal(table.numCols(), 5, 'num cols'); + tableEqual(table, data(), 'csv parsed data'); + }); -tape('fromCSV parses delimited text with delimiter', t => { - const table = fromCSV(tabText.join('\n'), { delimiter: '\t' }); - t.equal(table.numRows(), 3, 'num rows'); - t.equal(table.numCols(), 5, 'num cols'); - tableEqual(t, table, data(), 'csv parsed data with delimiter'); - t.end(); -}); + it('infers types', () => { + function check(msg, values, test) { + const d = fromCSV('col\n' + values.join('\n')).array('col'); + assert.ok(d.every(v => v == null || test(v)), msg); + } -tape('fromCSV parses delimited text with header option', t => { - const table = fromCSV(text.slice(1).join('\n'), { 
header: false }); - const cols = data(); - const d = { - col1: cols.str, - col2: cols.int, - col3: cols.num, - col4: cols.bool, - col5: cols.date - }; - tableEqual(t, table, d, 'csv parsed data with no header'); - t.end(); -}); + check('boolean', [true, false, '', true], v => typeof v === 'boolean'); + check('number', [1, Math.PI, '', 'NaN'], v => typeof v === 'number'); + check('string', ['a', 1, '', 'c'], v => typeof v === 'string'); + check('date', [ + new Date().toISOString(), '', + new Date(2000, 0, 1).toISOString(), + new Date(1979, 3, 14, 3, 45).toISOString() + ], v => v instanceof Date); + check('date-like strings', ['2022-23', '2023-24'], v => typeof v === 'string'); + }); -tape('fromCSV parses delimited text with parse option', t => { - const table = fromCSV(text.join('\n'), { parse: { str: d => d + d } }); - const d = { ...data(), str: ['aa', 'bb', 'cc'] }; - tableEqual(t, table, d, 'csv parsed data with custom parse'); - t.end(); -}); + it('parses delimited text with delimiter', () => { + const table = fromCSV(tabText.join('\n'), { delimiter: '\t' }); + assert.equal(table.numRows(), 3, 'num rows'); + assert.equal(table.numCols(), 5, 'num cols'); + tableEqual(table, data(), 'csv parsed data with delimiter'); + }); -tape('fromCSV parses delimited text with decimal option', t => { - tableEqual(t, - fromCSV('a;b\nu;-1,23\nv;4,56e5\nw;', { delimiter: ';', decimal: ',' }), - { a: ['u', 'v', 'w'], b: [-1.23, 4.56e5, null] }, - 'csv parsed data with decimal option' - ); - t.end(); -}); + it('parses delimited text with header option', () => { + const table = fromCSV(text.slice(1).join('\n'), { header: false }); + const cols = data(); + const d = { + col1: cols.str, + col2: cols.int, + col3: cols.num, + col4: cols.bool, + col5: cols.date + }; + tableEqual(table, d, 'csv parsed data with no header'); + }); -tape('fromCSV parses delimited text with skip options', t => { - const text = '# line 1\n# line 2\na,b\n1,2\n3,4'; - const data = { a: [1, 3], b: [2, 4] }; + 
it('parses delimited text with parse option', () => { + const table = fromCSV(text.join('\n'), { parse: { str: d => d + d } }); + const d = { ...data(), str: ['aa', 'bb', 'cc'] }; + tableEqual(table, d, 'csv parsed data with custom parse'); + }); - tableEqual(t, fromCSV(text, { skip: 2 }), data, - 'csv parsed data with skip option' - ); + it('parses delimited text with decimal option', () => { + tableEqual( + fromCSV('a;b\nu;-1,23\nv;4,56e5\nw;', { delimiter: ';', decimal: ',' }), + { a: ['u', 'v', 'w'], b: [-1.23, 4.56e5, null] }, + 'csv parsed data with decimal option' + ); + }); - tableEqual(t, fromCSV(text, { comment: '#' }), data, - 'csv parsed data with comment option' - ); + it('parses delimited text with skip options', () => { + const text = '# line 1\n# line 2\na,b\n1,2\n3,4'; + const data = { a: [1, 3], b: [2, 4] }; - tableEqual(t, fromCSV(text, { skip: 1, comment: '#' }), data, - 'csv parsed data with skip and comment options' - ); + tableEqual(fromCSV(text, { skip: 2 }), data, + 'csv parsed data with skip option' + ); - t.end(); -}); + tableEqual(fromCSV(text, { comment: '#' }), data, + 'csv parsed data with comment option' + ); -tape('fromCSV applies parsers regardless of autoType flag', t => { - const text = 'a,b\r\n00152,01/01/2021\r\n30219,01/01/2021'; - const table = autoType => fromCSV(text, { - autoType, - parse: { - a: v => v, - b: v => v.split('/').reverse().join('-') - } + tableEqual(fromCSV(text, { skip: 1, comment: '#' }), data, + 'csv parsed data with skip and comment options' + ); }); - const data = { - a: ['00152', '30219'], - b: ['2021-01-01', '2021-01-01'] - }; - - tableEqual(t, table(true), data, 'csv parsed data with autoType true'); - tableEqual(t, table(false), data, 'csv parsed data with autoType false'); - t.end(); -}); \ No newline at end of file + it('applies parsers regardless of autoType flag', () => { + const text = 'a,b\r\n00152,01/01/2021\r\n30219,01/01/2021'; + const table = autoType => fromCSV(text, { + autoType, + parse: 
{ + a: v => v, + b: v => v.split('/').reverse().join('-') + } + }); + const data = { + a: ['00152', '30219'], + b: ['2021-01-01', '2021-01-01'] + }; + + tableEqual(table(true), data, 'csv parsed data with autoType true'); + tableEqual(table(false), data, 'csv parsed data with autoType false'); + }); +}); diff --git a/test/format/from-fixed-test.js b/test/format/from-fixed-test.js index 89b120e1..31d846a6 100644 --- a/test/format/from-fixed-test.js +++ b/test/format/from-fixed-test.js @@ -1,6 +1,6 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import fromFixed from '../../src/format/from-fixed'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { fromFixed } from '../../src/index.js'; function data() { return { @@ -21,88 +21,83 @@ const text = [ 'c378.9false2020-02-29' ]; -tape('fromFixed parses fixed width files using positions', t => { - const table = fromFixed(text.join('\n'), { names, positions }); - t.equal(table.numRows(), 3, 'num rows'); - t.equal(table.numCols(), 5, 'num cols'); - tableEqual(t, table, data(), 'fixed-width parsed data'); - t.end(); -}); +describe('fromFixed', () => { + it('parses fixed width files using positions', () => { + const table = fromFixed(text.join('\n'), { names, positions }); + assert.equal(table.numRows(), 3, 'num rows'); + assert.equal(table.numCols(), 5, 'num cols'); + tableEqual(table, data(), 'fixed-width parsed data'); + }); -tape('fromFixed parses fixed width files using widths', t => { - const table = fromFixed(text.join('\n'), { names, widths }); - t.equal(table.numRows(), 3, 'num rows'); - t.equal(table.numCols(), 5, 'num cols'); - tableEqual(t, table, data(), 'fixed-width parsed data'); - t.end(); -}); + it('parses fixed width files using widths', () => { + const table = fromFixed(text.join('\n'), { names, widths }); + assert.equal(table.numRows(), 3, 'num rows'); + assert.equal(table.numCols(), 5, 'num cols'); + tableEqual(table, data(), 'fixed-width parsed 
data'); + }); -tape('fromFixed infers types', t => { - function check(msg, widths, values, test) { - const d = fromFixed(values.join('\n'), { widths }).array('col1'); - t.ok(d.every(v => v == null || test(v)), msg); - } + it('infers types', () => { + function check(msg, widths, values, test) { + const d = fromFixed(values.join('\n'), { widths }).array('col1'); + assert.ok(d.every(v => v == null || test(v)), msg); + } - check('boolean', [5], - ['true ', 'false', ' ', 'true '], - v => typeof v === 'boolean' - ); - check('number', [3], - ['1 ', '3.14', ' ', 'NaN'], - v => typeof v === 'number' - ); - check('string', [1], - ['a', '1', ' ', 'c'], - v => typeof v === 'string' - ); - check('date', [24], - [ - new Date().toISOString(), - ' ', - new Date(2000, 0, 1).toISOString(), - new Date(1979, 3, 14, 3, 45).toISOString() - ], - v => v instanceof Date - ); - t.end(); -}); + check('boolean', [5], + ['true ', 'false', ' ', 'true '], + v => typeof v === 'boolean' + ); + check('number', [3], + ['1 ', '3.14', ' ', 'NaN'], + v => typeof v === 'number' + ); + check('string', [1], + ['a', '1', ' ', 'c'], + v => typeof v === 'string' + ); + check('date', [24], + [ + new Date().toISOString(), + ' ', + new Date(2000, 0, 1).toISOString(), + new Date(1979, 3, 14, 3, 45).toISOString() + ], + v => v instanceof Date + ); + }); -tape('fromFixed parses text with parse option', t => { - const table = fromFixed(text.join('\n'), { names, widths, parse: { str: d => d + d } }); - const d = { ...data(), str: ['aa', 'bb', 'cc'] }; - tableEqual(t, table, d, 'fixed-width parsed data with custom parse'); - t.end(); -}); + it('parses text with parse option', () => { + const table = fromFixed(text.join('\n'), { names, widths, parse: { str: d => d + d } }); + const d = { ...data(), str: ['aa', 'bb', 'cc'] }; + tableEqual(table, d, 'fixed-width parsed data with custom parse'); + }); -tape('fromFixed parses text with decimal option', t => { - tableEqual(t, - fromFixed( - 'u -1,23\nv4,56e5\nw', - { 
decimal: ',', widths: [1, 6], names: ['a', 'b'] } - ), - { a: ['u', 'v', 'w'], b: [-1.23, 4.56e5, null] }, - 'fixed-width parsed data with decimal option' - ); - t.end(); -}); - -tape('fromFixed parses text with skip options', t => { - const text = '# line 1\n# line 2\n12\n34'; - const data = { a: [1, 3], b: [2, 4] }; - const names = ['a', 'b']; - const widths = [1, 1]; + it('parses text with decimal option', () => { + tableEqual( + fromFixed( + 'u -1,23\nv4,56e5\nw', + { decimal: ',', widths: [1, 6], names: ['a', 'b'] } + ), + { a: ['u', 'v', 'w'], b: [-1.23, 4.56e5, null] }, + 'fixed-width parsed data with decimal option' + ); + }); - tableEqual(t, fromFixed(text, { names, widths, skip: 2 }), data, - 'fixed-width parsed data with skip option' - ); + it('parses text with skip options', () => { + const text = '# line 1\n# line 2\n12\n34'; + const data = { a: [1, 3], b: [2, 4] }; + const names = ['a', 'b']; + const widths = [1, 1]; - tableEqual(t, fromFixed(text, { names, widths, comment: '#' }), data, - 'fixed-width parsed data with comment option' - ); + tableEqual(fromFixed(text, { names, widths, skip: 2 }), data, + 'fixed-width parsed data with skip option' + ); - tableEqual(t, fromFixed(text, { names, widths, skip: 1, comment: '#' }), data, - 'fixed-width parsed data with skip and comment options' - ); + tableEqual(fromFixed(text, { names, widths, comment: '#' }), data, + 'fixed-width parsed data with comment option' + ); - t.end(); -}); \ No newline at end of file + tableEqual(fromFixed(text, { names, widths, skip: 1, comment: '#' }), data, + 'fixed-width parsed data with skip and comment options' + ); + }); +}); diff --git a/test/format/from-json-test.js b/test/format/from-json-test.js index 1b628630..c01f7d8b 100644 --- a/test/format/from-json-test.js +++ b/test/format/from-json-test.js @@ -1,8 +1,6 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import ColumnTable from '../../src/table/column-table'; -import fromJSON from 
'../../src/format/from-json'; -import toJSON from '../../src/format/to-json'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { fromJSON } from '../../src/index.js'; function data() { return { @@ -36,117 +34,63 @@ function wrap(text) { return '{"data":' + text + '}'; } -tape('toJSON formats JSON text with schema', t => { - const dt = new ColumnTable(data()); - t.equal(toJSON(dt), schema(cols(), text), 'json text'); - const names = ['str', 'int']; - t.equal( - toJSON(dt, { limit: 2, columns: names }), - schema(names, '{"str":["a","b"],"int":[1,2]}'), - 'json text with limit' - ); - t.end(); +describe('fromJSON', () => { + it('parses JSON text with schema', () => { + const table = fromJSON(schema(cols(), text)); + tableEqual(table, data(), 'json parsed data'); + assert.deepEqual(table.columnNames(), cols(), 'column names'); + }); + + it('parses JSON text with parse option with schema', () => { + const table = fromJSON(schema(cols(), text), { parse: { str: d => d + d } }); + const d = { ...data(), str: ['aa', 'bb', 'cc'] }; + tableEqual(table, d, 'json parsed data with custom parse'); + assert.deepEqual(table.columnNames(), cols(), 'column names'); + }); + + it('parses JSON text without schema', () => { + const table = fromJSON(wrap(text)); + tableEqual(table, data(), 'json parsed data'); + assert.deepEqual(table.columnNames(), cols(), 'column names'); + }); + + it('parses JSON text with parse option without schema', () => { + const table = fromJSON(wrap(text), { parse: { str: d => d + d } }); + const d = { ...data(), str: ['aa', 'bb', 'cc'] }; + tableEqual(table, d, 'json parsed data with custom parse'); + assert.deepEqual(table.columnNames(), cols(), 'column names'); + }); + + it('parses JSON text as data only', () => { + const table = fromJSON(text); + tableEqual(table, data(), 'json parsed data'); + assert.deepEqual(table.columnNames(), cols(), 'column names'); + }); + + it('parses JSON text with parse option as data only', 
() => { + const table = fromJSON(text, { parse: { str: d => d + d } }); + const d = { ...data(), str: ['aa', 'bb', 'cc'] }; + tableEqual(table, d, 'json parsed data with custom parse'); + assert.deepEqual(table.columnNames(), cols(), 'column names'); + }); + + it('parses ISO date strings', () => { + const values = [ + 0, '', '2.1', '2000', '2022-2023', + new Date(Date.UTC(2000, 0, 1)), + new Date(Date.UTC(2000, 0, 1)), + new Date(2021, 0, 6, 12), + new Date(2021, 0, 6, 4) + ]; + const str = [ + 0, '', '2.1', '2000', '2022-2023', + '2000-01', + '2000-01-01', + '2021-01-06T12:00:00.000', + '2021-01-06T12:00:00.000Z' + ]; + const json = '{"v":' + JSON.stringify(str) + '}'; + const table = fromJSON(json); + assert.deepEqual(table.column('v'), values, 'column values'); + }); }); - -tape('toJSON formats JSON text with format option with schema', t => { - const dt = new ColumnTable(data()); - const names = ['str']; - t.equal( - toJSON(dt, { limit: 2, columns: names, format: { str: d => d + '!' } }), - schema(names, '{"str":["a!","b!"]}'), - 'json text with custom format' - ); - t.end(); -}); - -tape('toJSON formats JSON text without schema', t => { - const dt = new ColumnTable(data()); - t.equal(toJSON(dt, { schema: false }), text, 'json text'); - t.equal( - toJSON(dt, { limit: 2, columns: ['str', 'int'], schema: false }), - '{"str":["a","b"],"int":[1,2]}', - 'json text with limit' - ); - t.end(); -}); - -tape('toJSON formats JSON text with format option without schema', t => { - const dt = new ColumnTable(data()); - t.equal( - toJSON(dt, { - schema: false, - limit: 2, - columns: ['str'], - format: { str: d => d + '!' 
} - }), - '{"str":["a!","b!"]}', - 'json text with custom format' - ); - t.end(); -}); - -tape('fromJSON parses JSON text with schema', t => { - const table = fromJSON(schema(cols(), text)); - tableEqual(t, table, data(), 'json parsed data'); - t.deepEqual(table.columnNames(), cols(), 'column names'); - t.end(); -}); - -tape('fromJSON parses JSON text with parse option with schema', t => { - const table = fromJSON(schema(cols(), text), { parse: { str: d => d + d } }); - const d = { ...data(), str: ['aa', 'bb', 'cc'] }; - tableEqual(t, table, d, 'json parsed data with custom parse'); - t.deepEqual(table.columnNames(), cols(), 'column names'); - t.end(); -}); - -tape('fromJSON parses JSON text without schema', t => { - const table = fromJSON(wrap(text)); - tableEqual(t, table, data(), 'json parsed data'); - t.deepEqual(table.columnNames(), cols(), 'column names'); - t.end(); -}); - -tape('fromJSON parses JSON text with parse option without schema', t => { - const table = fromJSON(wrap(text), { parse: { str: d => d + d } }); - const d = { ...data(), str: ['aa', 'bb', 'cc'] }; - tableEqual(t, table, d, 'json parsed data with custom parse'); - t.deepEqual(table.columnNames(), cols(), 'column names'); - t.end(); -}); - -tape('fromJSON parses JSON text as data only', t => { - const table = fromJSON(text); - tableEqual(t, table, data(), 'json parsed data'); - t.deepEqual(table.columnNames(), cols(), 'column names'); - t.end(); -}); - -tape('fromJSON parses JSON text with parse option as data only', t => { - const table = fromJSON(text, { parse: { str: d => d + d } }); - const d = { ...data(), str: ['aa', 'bb', 'cc'] }; - tableEqual(t, table, d, 'json parsed data with custom parse'); - t.deepEqual(table.columnNames(), cols(), 'column names'); - t.end(); -}); - -tape('fromJSON parses ISO date strings', t => { - const values = [ - 0, '', '2.1', '2000', '2022-2023', - new Date(Date.UTC(2000, 0, 1)), - new Date(Date.UTC(2000, 0, 1)), - new Date(2021, 0, 6, 12), - new Date(2021, 
0, 6, 4) - ]; - const str = [ - 0, '', '2.1', '2000', '2022-2023', - '2000-01', - '2000-01-01', - '2021-01-06T12:00:00.000', - '2021-01-06T12:00:00.000Z' - ]; - const json = '{"v":' + JSON.stringify(str) + '}'; - const table = fromJSON(json); - t.deepEqual(table.column('v').data, values, 'column values'); - t.end(); -}); \ No newline at end of file diff --git a/test/format/load-file-test.js b/test/format/load-file-test.js index 720b0ae1..c5d28386 100644 --- a/test/format/load-file-test.js +++ b/test/format/load-file-test.js @@ -1,59 +1,65 @@ -import tape from 'tape'; -import { load, loadArrow, loadCSV, loadJSON } from '../../src/format/load-file'; +import assert from 'node:assert'; +import { load, loadArrow, loadCSV, loadJSON } from '../../src/index.js'; const PATH = 'test/format/data'; -tape('load loads a file using a relative path', async t => { - const dt = await load(`${PATH}/beers.csv`); - t.deepEqual([dt.numRows(), dt.numCols()], [1203, 5], 'load table'); - t.end(); -}); +describe('load file', () => { + it('loads a file using a relative path', async () => { + const dt = await load(`${PATH}/beers.csv`); + assert.deepEqual([dt.numRows(), dt.numCols()], [1203, 5], 'load table'); + }); -tape('load loads a file using a file protocol url', async t => { - const dt = await load(`file://${process.cwd()}/${PATH}/beers.csv`); - t.deepEqual([dt.numRows(), dt.numCols()], [1203, 5], 'load table'); - t.end(); -}); + it('loads a file using a file protocol url', async () => { + const dt = await load(`file://${process.cwd()}/${PATH}/beers.csv`); + assert.deepEqual([dt.numRows(), dt.numCols()], [1203, 5], 'load table'); + }); -tape('loadCSV loads CSV files from disk', async t => { - const dt = await loadCSV(`${PATH}/beers.csv`); - t.deepEqual([dt.numRows(), dt.numCols()], [1203, 5], 'load csv table'); - t.end(); -}); + it('loadCSV loads CSV files from disk', async () => { + const dt = await loadCSV(`${PATH}/beers.csv`); + assert.deepEqual([dt.numRows(), dt.numCols()], [1203, 
5], 'load csv table'); + }); -tape('loadJSON loads JSON files from disk', async t => { - const rt = await loadJSON(`${PATH}/rows.json`); - t.deepEqual([rt.numRows(), rt.numCols()], [3, 3], 'load json rows'); + it('loadJSON loads JSON files from disk', async () => { + const rt = await loadJSON(`${PATH}/rows.json`); + assert.deepEqual([rt.numRows(), rt.numCols()], [3, 3], 'load json rows'); - const st = await loadJSON(`${PATH}/cols-schema.json`); - t.deepEqual([st.numRows(), st.numCols()], [3, 3], 'load json cols with schema'); + const st = await loadJSON(`${PATH}/cols-schema.json`); + assert.deepEqual([st.numRows(), st.numCols()], [3, 3], 'load json cols with schema'); - const ct = await loadJSON(`${PATH}/cols-only.json`); - t.deepEqual([ct.numRows(), ct.numCols()], [3, 3], 'load json cols no schema'); + const ct = await loadJSON(`${PATH}/cols-only.json`); + assert.deepEqual([ct.numRows(), ct.numCols()], [3, 3], 'load json cols no schema'); + }); - t.end(); -}); + it('loadArrow loads Arrow files from disk', async () => { + const dt = await loadArrow(`${PATH}/flights.arrow`); + assert.deepEqual([dt.numRows(), dt.numCols()], [9999, 3], 'load arrow table'); + }); -tape('loadArrow loads Arrow files from disk', async t => { - const dt = await loadArrow(`${PATH}/flights.arrow`); - t.deepEqual([dt.numRows(), dt.numCols()], [9999, 3], 'load arrow table'); - t.end(); -}); + it('fails on non-existent path', async () => { + try { + await load('/foo/bar/baz/does.not.exist'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); -tape('load fails on non-existent path', t => { - load('/foo/bar/baz/does.not.exist') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + } + }); -tape('loadJSON fails on non-JSON file', t => { - loadJSON(`${PATH}/beers.csv`) - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed 
appropriately'); t.end(); }); -}); + it('loadJSON fails on non-JSON file', async () => { + try { + await loadJSON(`${PATH}/beers.csv`); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('loadArrow fails on non-Arrow file', t => { - loadArrow(`${PATH}/beers.csv`) - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); \ No newline at end of file + it('loadArrow fails on non-Arrow file', async () => { + try { + await loadArrow(`${PATH}/beers.csv`); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); +}); diff --git a/test/format/load-file-url-test.js b/test/format/load-file-url-test.js index 84951ea4..5a8bb427 100644 --- a/test/format/load-file-url-test.js +++ b/test/format/load-file-url-test.js @@ -1,55 +1,64 @@ -import tape from 'tape'; -import { load, loadArrow, loadCSV, loadJSON } from '../../src/format/load-file'; - -tape('load loads from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; - const dt = await load(url); - t.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load table'); - t.end(); -}); +import assert from 'node:assert'; +import { load, loadArrow, loadCSV, loadJSON } from '../../src/index.js'; -tape('loadCSV loads CSV files from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; - const dt = await loadCSV(url); - t.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load csv table'); - t.end(); -}); +describe('load file url', () => { + it('loads from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; + const dt = await load(url); + assert.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load table'); + }); -tape('loadJSON loads JSON files from a URL', async t => { - const url 
= 'https://vega.github.io/vega-datasets/data/budgets.json'; - const rt = await loadJSON(url); - t.deepEqual([rt.numRows(), rt.numCols()], [230, 3], 'load json rows'); + it('loadCSV loads CSV files from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; + const dt = await loadCSV(url); + assert.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load csv table'); + }); - t.end(); -}); + it('loadJSON loads JSON files from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/budgets.json'; + const rt = await loadJSON(url); + assert.deepEqual([rt.numRows(), rt.numCols()], [230, 3], 'load json rows'); + }); -tape('loadArrow loads Arrow files from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/flights-200k.arrow'; - const dt = await loadArrow(url); - t.deepEqual([dt.numRows(), dt.numCols()], [231083, 3], 'load arrow table'); - t.end(); -}); + it('loadArrow loads Arrow files from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/flights-200k.arrow'; + const dt = await loadArrow(url); + assert.deepEqual([dt.numRows(), dt.numCols()], [231083, 3], 'load arrow table'); + }); -tape('load fails on non-existent path', t => { - load('https://foo.bar.baz/does.not.exist') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + it('fails on non-existent path', async () => { + try { + await load('https://foo.bar.baz/does.not.exist'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('load fails on invalid protocol', t => { - load('htsp://vega.github.io/vega-datasets/data/airports.csv') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + it('fails on invalid protocol', async () => { + try { + await 
load('htsp://vega.github.io/vega-datasets/data/airports.csv'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('loadJSON fails on non-JSON file URL', t => { - loadJSON('https://vega.github.io/vega-datasets/data/airports.csv') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + it('loadJSON fails on non-JSON file URL', async () => { + try { + await loadJSON('https://vega.github.io/vega-datasets/data/airports.csv'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('loadArrow fails on non-Arrow file URL', t => { - loadArrow('https://vega.github.io/vega-datasets/data/airports.csv') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); \ No newline at end of file + it('loadArrow fails on non-Arrow file URL', async () => { + try { + await loadArrow('https://vega.github.io/vega-datasets/data/airports.csv'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); +}); diff --git a/test/format/load-url-test.js b/test/format/load-url-test.js index 51a3bb7a..21edc25b 100644 --- a/test/format/load-url-test.js +++ b/test/format/load-url-test.js @@ -1,57 +1,68 @@ -import tape from 'tape'; -import { load, loadArrow, loadCSV, loadJSON } from '../../src/format/load-url'; +import assert from 'node:assert'; +import fetch from 'node-fetch'; +import { load, loadArrow, loadCSV, loadJSON } from '../../src/index-browser.js'; // add global fetch to emulate DOM environment -global.fetch = require('node-fetch'); +global.fetch = fetch; -tape('load loads from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; - const dt = await load(url); - 
t.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load table'); - t.end(); -}); +describe('load url', () => { + it('loads from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; + const dt = await load(url); + assert.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load table'); + }); -tape('loadCSV loads CSV files from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; - const dt = await loadCSV(url); - t.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load csv table'); - t.end(); -}); + it('loadCSV loads CSV files from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/airports.csv'; + const dt = await loadCSV(url); + assert.deepEqual([dt.numRows(), dt.numCols()], [3376, 7], 'load csv table'); + }); -tape('loadJSON loads JSON files from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/budgets.json'; - const rt = await loadJSON(url); - t.deepEqual([rt.numRows(), rt.numCols()], [230, 3], 'load json rows'); - t.end(); -}); + it('loadJSON loads JSON files from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/budgets.json'; + const rt = await loadJSON(url); + assert.deepEqual([rt.numRows(), rt.numCols()], [230, 3], 'load json rows'); + }); -tape('loadArrow loads Arrow files from a URL', async t => { - const url = 'https://vega.github.io/vega-datasets/data/flights-200k.arrow'; - const dt = await loadArrow(url); - t.deepEqual([dt.numRows(), dt.numCols()], [231083, 3], 'load arrow table'); - t.end(); -}); + it('loadArrow loads Arrow files from a URL', async () => { + const url = 'https://vega.github.io/vega-datasets/data/flights-200k.arrow'; + const dt = await loadArrow(url); + assert.deepEqual([dt.numRows(), dt.numCols()], [231083, 3], 'load arrow table'); + }); -tape('load fails on non-existent path', t => { - load('https://foo.bar.baz/does.not.exist') - .then(() => { t.fail('did not 
fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + it('fails on non-existent path', async () => { + try { + await load('https://foo.bar.baz/does.not.exist'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('load fails on invalid protocol', t => { - load('htsp://vega.github.io/vega-datasets/data/airports.csv') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + it('fails on invalid protocol', async () => { + try { + await load('htsp://vega.github.io/vega-datasets/data/airports.csv'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('loadJSON fails on non-JSON file URL', t => { - loadJSON('https://vega.github.io/vega-datasets/data/airports.csv') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); + it('loadJSON fails on non-JSON file URL', async () => { + try { + await loadJSON('https://vega.github.io/vega-datasets/data/airports.csv'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); -tape('loadArrow fails on non-Arrow file URL', t => { - loadArrow('https://vega.github.io/vega-datasets/data/airports.csv') - .then(() => { t.fail('did not fail'); t.end(); }) - .catch(() => { t.pass('failed appropriately'); t.end(); }); -}); \ No newline at end of file + it('loadArrow fails on non-Arrow file URL', async () => { + try { + await loadArrow('https://vega.github.io/vega-datasets/data/airports.csv'); + assert.fail('did not fail'); + } catch (err) { // eslint-disable-line no-unused-vars + assert.ok(true, 'failed appropriately'); + } + }); +}); diff --git a/test/format/to-arrow-test.js b/test/format/to-arrow-test.js deleted 
file mode 100644 index 4d5cb05e..00000000 --- a/test/format/to-arrow-test.js +++ /dev/null @@ -1,303 +0,0 @@ -import tape from 'tape'; -import { readFileSync } from 'fs'; -import { Int8, Type, tableFromIPC, tableToIPC, vectorFromArray } from 'apache-arrow'; -import fromArrow from '../../src/format/from-arrow'; -import fromCSV from '../../src/format/from-csv'; -import fromJSON from '../../src/format/from-json'; -import toArrow from '../../src/format/to-arrow'; -import { table } from '../../src/table'; - -function date(year, month=0, date=1, hours=0, minutes=0, seconds=0, ms=0) { - return new Date(year, month, date, hours, minutes, seconds, ms); -} - -function utc(year, month=0, date=1, hours=0, minutes=0, seconds=0, ms=0) { - return new Date(Date.UTC(year, month, date, hours, minutes, seconds, ms)); -} - -function Int8Vector(data) { - return vectorFromArray(data, new Int8); -} - -function isArrayType(value) { - return Array.isArray(value) - || (value && value.map === Int8Array.prototype.map); -} - -function compareTables(aqt, art) { - const err = aqt.columnNames() - .map(name => compareColumns(name, aqt, art)) - .filter(a => a.length); - return err.length; -} - -function compareColumns(name, aqt, art) { - const normalize = v => v === undefined ? null : v instanceof Date ? 
+v : v; - const idx = aqt.indices(); - const aqc = aqt.column(name); - const arc = art.getChild(name); - const err = []; - for (let i = 0; i < idx.length; ++i) { - let v1 = normalize(aqc.get(idx[i])); - let v2 = normalize(arc.get(i)); - if (isArrayType(v1)) { - v1 = v1.join(); - v2 = [...v2].join(); - } else if (typeof v1 === 'object') { - v1 = JSON.stringify(v1); - v2 = JSON.stringify(v2); - } - if (v1 !== v2) { - err.push({ name, index: i, v1, v2 }); - } - } - return err; -} - -tape('toArrow produces Arrow data for an input table', t => { - const dt = table({ - i: [1, 2, 3, undefined, 4, 5], - f: Float32Array.from([1.2, 2.3, 3.0, 3.4, null, 4.5]), - n: [4.5, 4.4, 3.4, 3.0, 2.3, 1.2], - b: [true, true, false, true, null, false], - s: ['foo', null, 'bar', 'baz', 'baz', 'bar'], - d: [date(2000,0,1), date(2000,1,2), null, date(2010,6,9), date(2018,0,1), date(2020,10,3)], - u: [utc(2000,0,1), utc(2000,1,2), null, utc(2010,6,9), utc(2018,0,1), utc(2020,10,3)], - e: [null, null, null, null, null, null], - v: Int8Vector([10, 9, 8, 7, 6, 5]), - a: [[1, null, 3], [4, 5], null, [6, 7], [8, 9], []], - l: [[1], [2], [3], [4], [5], [6]], - o: [1, 2, 3, null, 5, 6].map(v => v ? 
{ key: v } : null) - }); - - const at = dt.toArrow(); - - t.equal( - compareTables(dt, at), 0, - 'arquero and arrow tables match' - ); - - t.equal( - compareTables(dt, toArrow(dt.objects())), 0, - 'object array and arrow tables match' - ); - - const buffer = tableToIPC(at); - const bt = tableFromIPC(buffer); - - t.equal( - compareTables(dt, bt), 0, - 'arquero and serialized arrow tables match' - ); - - t.equal( - compareTables(fromArrow(bt), at), 0, - 'serialized arquero and arrow tables match' - ); - - t.end(); -}); - -tape('toArrow produces Arrow data for an input CSV', async t => { - const dt = fromCSV(readFileSync('test/format/data/beers.csv', 'utf8')); - const st = dt.derive({ name: d => d.name + '' }); - const at = dt.toArrow(); - - t.equal( - compareTables(st, at), 0, - 'arquero and arrow tables match' - ); - - t.equal( - compareTables(st, toArrow(st.objects())), 0, - 'object array and arrow tables match' - ); - - const buffer = tableToIPC(at); - - t.equal( - compareTables(st, tableFromIPC(buffer)), 0, - 'arquero and serialized arrow tables match' - ); - - t.equal( - compareTables(fromArrow(tableFromIPC(buffer)), at), 0, - 'serialized arquero and arrow tables match' - ); - - t.end(); -}); - -tape('toArrow handles ambiguously typed data', async t => { - const at = table({ x: [1, 2, 3, 'foo'] }).toArrow(); - t.deepEqual( - [...at.getChild('x')], - ['1', '2', '3', 'foo'], - 'fallback to string type if a string is observed' - ); - - t.throws( - () => table({ x: [1, 2, 3, true] }).toArrow(), - 'fail on mixed types' - ); - - t.end(); -}); - -tape('toArrow result produces serialized arrow data', t => { - const dt = fromCSV(readFileSync('test/format/data/beers.csv', 'utf8')) - .derive({ name: d => d.name + '' }); - - const json = dt.toJSON(); - const jt = fromJSON(json); - - const bytes = tableToIPC(dt.toArrow()); - const bt = fromArrow(tableFromIPC(bytes)); - - t.deepEqual( - [bt.toJSON(), jt.toJSON()], - [json, json], - 'arrow and json round trips match' - ); - - 
t.end(); -}); - -tape('toArrow respects columns option', t => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - const at = dt.toArrow({ columns: ['w', 'y'] }); - - t.deepEqual( - at.schema.fields.map(f => f.name), - ['w', 'y'], - 'column subset' - ); - - t.end(); -}); - -tape('toArrow respects limit and offset options', t => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - t.equal( - JSON.stringify([...dt.toArrow({ limit: 2 })]), - '[{"w":"a","x":1,"y":1.6181,"z":true},{"w":"b","x":2,"y":2.7182,"z":true}]', - 'limit' - ); - t.equal( - JSON.stringify([...dt.toArrow({ offset: 1 })]), - '[{"w":"b","x":2,"y":2.7182,"z":true},{"w":"a","x":3,"y":3.1415,"z":false}]', - 'offset' - ); - t.equal( - JSON.stringify([...dt.toArrow({ offset: 1, limit: 1 })]), - '[{"w":"b","x":2,"y":2.7182,"z":true}]', - 'limit and offset' - ); - - t.end(); -}); - -tape('toArrow respects limit and types option', t => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - const at = dt.toArrow({ - types: { w: Type.Utf8, x: Type.Int32, y: Type.Float32 } - }); - - const types = ['w', 'x', 'y', 'z'].map(name => at.getChild(name).type); - - t.deepEqual( - types.map(t => t.typeId), - [Type.Utf8, Type.Int, Type.Float, Type.Bool], - 'type ids match' - ); - t.equal(types[1].bitWidth, 32, 'int32'); - t.equal(types[2].precision, 1, 'float32'); - - t.end(); -}); - -tape('toArrowBuffer generates the correct output for file option', (t) => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - const buffer = dt.toArrowBuffer({ format: 'file' }); - - t.deepEqual( - buffer.slice(0, 8), - new Uint8Array([65, 82, 82, 79, 87, 49, 0, 0]) - ); - t.end(); -}); - -tape('toArrowBuffer generates the correct output for stream 
option', (t) => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - const buffer = dt.toArrowBuffer({ format: 'stream' }); - - t.deepEqual( - buffer.slice(0, 8), - new Uint8Array([255, 255, 255, 255, 88, 1, 0, 0]) - ); - t.end(); -}); - -tape('toArrowBuffer defaults to using stream option', (t) => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - const buffer = dt.toArrowBuffer(); - - t.deepEqual( - buffer.slice(0, 8), - new Uint8Array([255, 255, 255, 255, 88, 1, 0, 0]) - ); - t.end(); -}); - -tape( - 'toArrowBuffer throws an error if the format is not stream or file', - (t) => { - t.throws(() => { - const dt = table({ - w: ['a', 'b', 'a'], - x: [1, 2, 3], - y: [1.6181, 2.7182, 3.1415], - z: [true, true, false] - }); - - dt.toArrowBuffer({ format: 'nonsense' }); - }, 'Unrecognised output format'); - t.end(); - } -); diff --git a/test/format/to-csv-test.js b/test/format/to-csv-test.js index 115683f6..661ca0a7 100644 --- a/test/format/to-csv-test.js +++ b/test/format/to-csv-test.js @@ -1,7 +1,5 @@ -import tape from 'tape'; -import BitSet from '../../src/table/bit-set'; -import ColumnTable from '../../src/table/column-table'; -import toCSV from '../../src/format/to-csv'; +import assert from 'node:assert'; +import { BitSet, ColumnTable, toCSV } from '../../src/index.js'; function data() { return { @@ -22,53 +20,51 @@ const text = [ const tabText = text.map(t => t.split(',').join('\t')); -tape('toCSV formats delimited text', t => { - const dt = new ColumnTable(data()); - t.equal(toCSV(dt), text.join('\n'), 'csv text'); - t.equal( - toCSV(dt, { limit: 2, columns: ['str', 'int'] }), - text.slice(0, 3) - .map(s => s.split(',').slice(0, 2).join(',')) - .join('\n'), - 'csv text with limit' - ); - t.end(); -}); +describe('toCSV', () => { + it('formats delimited text', () => { + const dt = new ColumnTable(data()); + 
assert.equal(toCSV(dt), text.join('\n'), 'csv text'); + assert.equal( + toCSV(dt, { limit: 2, columns: ['str', 'int'] }), + text.slice(0, 3) + .map(s => s.split(',').slice(0, 2).join(',')) + .join('\n'), + 'csv text with limit' + ); + }); -tape('toCSV formats delimited text with delimiter option', t => { - const dt = new ColumnTable(data()); - t.equal( - toCSV(dt, { delimiter: '\t' }), - tabText.join('\n'), - 'csv text with delimiter' - ); - t.equal( - toCSV(dt, { limit: 2, delimiter: '\t', columns: ['str', 'int'] }), - text.slice(0, 3) - .map(s => s.split(',').slice(0, 2).join('\t')) - .join('\n'), - 'csv text with delimiter and limit' - ); - t.end(); -}); + it('formats delimited text with delimiter option', () => { + const dt = new ColumnTable(data()); + assert.equal( + toCSV(dt, { delimiter: '\t' }), + tabText.join('\n'), + 'csv text with delimiter' + ); + assert.equal( + toCSV(dt, { limit: 2, delimiter: '\t', columns: ['str', 'int'] }), + text.slice(0, 3) + .map(s => s.split(',').slice(0, 2).join('\t')) + .join('\n'), + 'csv text with delimiter and limit' + ); + }); -tape('toCSV formats delimited text for filtered table', t => { - const bs = new BitSet(3).not(); bs.clear(1); - const dt = new ColumnTable(data(), null, bs); - t.equal( - toCSV(dt), - [ ...text.slice(0, 2), ...text.slice(3) ].join('\n'), - 'csv text with limit' - ); - t.end(); -}); + it('formats delimited text for filtered table', () => { + const bs = new BitSet(3).not(); bs.clear(1); + const dt = new ColumnTable(data(), null, bs); + assert.equal( + toCSV(dt), + [ ...text.slice(0, 2), ...text.slice(3) ].join('\n'), + 'csv text with limit' + ); + }); -tape('toCSV formats delimited text with format option', t => { - const dt = new ColumnTable(data()); - t.equal( - toCSV(dt, { limit: 2, columns: ['str'], format: { str: d => d + '!' 
} }), - ['str', 'a!', 'b!'].join('\n'), - 'csv text with custom format' - ); - t.end(); -}); \ No newline at end of file + it('formats delimited text with format option', () => { + const dt = new ColumnTable(data()); + assert.equal( + toCSV(dt, { limit: 2, columns: ['str'], format: { str: d => d + '!' } }), + ['str', 'a!', 'b!'].join('\n'), + 'csv text with custom format' + ); + }); +}); diff --git a/test/format/to-html-test.js b/test/format/to-html-test.js index 765d641c..29ad77bb 100644 --- a/test/format/to-html-test.js +++ b/test/format/to-html-test.js @@ -1,135 +1,128 @@ -import tape from 'tape'; -import ColumnTable from '../../src/table/column-table'; -import toHTML from '../../src/format/to-html'; - -tape('toHTML formats html table text', t => { - const l = 'style="text-align: left;"'; - const r = 'style="text-align: right;"'; - const html = (u, v) => [ - '', - ``, - '', - ``, - ``, - ``, - ``, - ``, - '
uv
a1
a2
b3
a4
b5
' - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.equal(toHTML(dt), html(l, r).join(''), 'html text'); - - t.equal( - toHTML(dt, { limit: 3 }), - html(l, r).slice(0, 6).join('') + '', - 'html text with limit' - ); - - t.end(); -}); - -tape('toHTML formats html table text with format option', t => { - const l = 'style="text-align: left;"'; - const r = 'style="text-align: right;"'; - const html = (u, v) => [ - '', - ``, - '', - ``, - ``, - ``, - ``, - ``, - '
uv
aa10
aa20
bb30
aa40
bb50
' - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.equal( - toHTML(dt, { - format: { - u: d => d + d, - v: d => d * 10 - } - }), - html(l, r).join(''), - 'html text with custom format' - ); - - t.end(); +import assert from 'node:assert'; +import { ColumnTable, toHTML } from '../../src/index.js'; + +describe('toHTML', () => { + it('formats html table text', () => { + const l = 'style="text-align: left;"'; + const r = 'style="text-align: right;"'; + const html = (u, v) => [ + '', + ``, + '', + ``, + ``, + ``, + ``, + ``, + '
uv
a1
a2
b3
a4
b5
' + ]; + + const dt = new ColumnTable({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }) + .orderby('v'); + + assert.equal(toHTML(dt), html(l, r).join(''), 'html text'); + + assert.equal( + toHTML(dt, { limit: 3 }), + html(l, r).slice(0, 6).join('') + '', + 'html text with limit' + ); + }); + + it('formats html table text with format option', () => { + const l = 'style="text-align: left;"'; + const r = 'style="text-align: right;"'; + const html = (u, v) => [ + '', + ``, + '', + ``, + ``, + ``, + ``, + ``, + '
uv
aa10
aa20
bb30
aa40
bb50
' + ]; + + const dt = new ColumnTable({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }) + .orderby('v'); + + assert.equal( + toHTML(dt, { + format: { + u: d => d + d, + v: d => d * 10 + } + }), + html(l, r).join(''), + 'html text with custom format' + ); + }); + + it('formats html table text with style option', () => { + const la = 'text-align: left;'; + const ra = 'text-align: right;'; + const cb = 'color: black;'; + const l = `style="${la} ${cb}"`; + const r = `style="${ra} ${cb}"`; + const html = (u, v) => [ + '', + '', + ``, + '', + ``, + ``, + ``, + ``, + ``, + '
uv
a1
a2
b3
a4
b5
' + ]; + + const dt = new ColumnTable({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }) + .orderby('v'); + + assert.equal( + toHTML(dt, { + style: { + tr: (col, idx, row) => `row(${idx},${row})`, + td: 'color: black;' + } + }), + html(l, r).join(''), + 'html text with custom style' + ); + }); + + it('formats html table text with null option', () => { + const a = 'style="text-align: right;"'; + const html = (a) => [ + '', + ``, + '', + ``, + ``, + ``, + ``, + '
u
a
0
null
undefined
' + ]; + + const dt = new ColumnTable({ u: ['a', 0, null, undefined] }); + + assert.equal( + toHTML(dt, { + null: v => `${v}` + }), + html(a).join(''), + 'html text with custom null format' + ); + }); }); - -tape('toHTML formats html table text with style option', t => { - const la = 'text-align: left;'; - const ra = 'text-align: right;'; - const cb = 'color: black;'; - const l = `style="${la} ${cb}"`; - const r = `style="${ra} ${cb}"`; - const html = (u, v) => [ - '', - '', - ``, - '', - ``, - ``, - ``, - ``, - ``, - '
uv
a1
a2
b3
a4
b5
' - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.equal( - toHTML(dt, { - style: { - tr: (col, idx, row) => `row(${idx},${row})`, - td: 'color: black;' - } - }), - html(l, r).join(''), - 'html text with custom style' - ); - - t.end(); -}); - -tape('toHTML formats html table text with null option', t => { - const a = 'style="text-align: right;"'; - const html = (a) => [ - '', - ``, - '', - ``, - ``, - ``, - ``, - '
u
a
0
null
undefined
' - ]; - - const dt = new ColumnTable({ u: ['a', 0, null, undefined] }); - - t.equal( - toHTML(dt, { - null: v => `${v}` - }), - html(a).join(''), - 'html text with custom null format' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/format/to-json-test.js b/test/format/to-json-test.js index ab157767..4c1ba2ce 100644 --- a/test/format/to-json-test.js +++ b/test/format/to-json-test.js @@ -1,6 +1,5 @@ -import tape from 'tape'; -import ColumnTable from '../../src/table/column-table'; -import toJSON from '../../src/format/to-json'; +import assert from 'node:assert'; +import { ColumnTable, toJSON } from '../../src/index.js'; function data() { return { @@ -30,51 +29,49 @@ function schema(names, text) { + '},"data":' + text + '}'; } -tape('toJSON formats JSON text with schema', t => { - const dt = new ColumnTable(data()); - t.equal(toJSON(dt), schema(cols(), text), 'json text'); - const names = ['str', 'int']; - t.equal( - toJSON(dt, { limit: 2, columns: names }), - schema(names, '{"str":["a","b"],"int":[1,2]}'), - 'json text with limit' - ); - t.end(); -}); +describe('toJSON', () => { + it('formats JSON text with schema', () => { + const dt = new ColumnTable(data()); + assert.equal(toJSON(dt), schema(cols(), text), 'json text'); + const names = ['str', 'int']; + assert.equal( + toJSON(dt, { limit: 2, columns: names }), + schema(names, '{"str":["a","b"],"int":[1,2]}'), + 'json text with limit' + ); + }); -tape('toJSON formats JSON text with format option with schema', t => { - const dt = new ColumnTable(data()); - const names = ['str']; - t.equal( - toJSON(dt, { limit: 2, columns: names, format: { str: d => d + '!' } }), - schema(names, '{"str":["a!","b!"]}'), - 'json text with custom format' - ); - t.end(); -}); + it('formats JSON text with format option with schema', () => { + const dt = new ColumnTable(data()); + const names = ['str']; + assert.equal( + toJSON(dt, { limit: 2, columns: names, format: { str: d => d + '!' 
} }), + schema(names, '{"str":["a!","b!"]}'), + 'json text with custom format' + ); + }); -tape('toJSON formats JSON text without schema', t => { - const dt = new ColumnTable(data()); - t.equal(toJSON(dt, { schema: false }), text, 'json text'); - t.equal( - toJSON(dt, { limit: 2, columns: ['str', 'int'], schema: false }), - '{"str":["a","b"],"int":[1,2]}', - 'json text with limit' - ); - t.end(); -}); + it('formats JSON text without schema', () => { + const dt = new ColumnTable(data()); + assert.equal(toJSON(dt, { schema: false }), text, 'json text'); + assert.equal( + toJSON(dt, { limit: 2, columns: ['str', 'int'], schema: false }), + '{"str":["a","b"],"int":[1,2]}', + 'json text with limit' + ); + }); -tape('toJSON formats JSON text with format option without schema', t => { - const dt = new ColumnTable(data()); - t.equal( - toJSON(dt, { - schema: false, - limit: 2, - columns: ['str'], - format: { str: d => d + '!' } - }), - '{"str":["a!","b!"]}', - 'json text with custom format' - ); - t.end(); -}); \ No newline at end of file + it('formats JSON text with format option without schema', () => { + const dt = new ColumnTable(data()); + assert.equal( + toJSON(dt, { + schema: false, + limit: 2, + columns: ['str'], + format: { str: d => d + '!' 
} + }), + '{"str":["a!","b!"]}', + 'json text with custom format' + ); + }); +}); diff --git a/test/format/to-markdown-test.js b/test/format/to-markdown-test.js index eebc5eed..52368b6f 100644 --- a/test/format/to-markdown-test.js +++ b/test/format/to-markdown-test.js @@ -1,62 +1,59 @@ -import tape from 'tape'; -import ColumnTable from '../../src/table/column-table'; -import toMarkdown from '../../src/format/to-markdown'; - -tape('toMarkdown formats markdown table text', t => { - const md = [ - '|u|v|\n', - '|:-|-:|\n', - '|a|1|\n', - '|a|2|\n', - '|b|3|\n', - '|a|4|\n', - '|b|5|\n' - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.equal(toMarkdown(dt), md.join(''), 'markdown text'); - - t.equal( - toMarkdown(dt, { limit: 3 }), - md.slice(0, 5).join(''), - 'markdown text with limit' - ); - - t.end(); +import assert from 'node:assert'; +import { ColumnTable, toMarkdown } from '../../src/index.js'; + +describe('toMarkdown', () => { + it('formats markdown table text', () => { + const md = [ + '|u|v|\n', + '|:-|-:|\n', + '|a|1|\n', + '|a|2|\n', + '|b|3|\n', + '|a|4|\n', + '|b|5|\n' + ]; + + const dt = new ColumnTable({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }) + .orderby('v'); + + assert.equal(toMarkdown(dt), md.join(''), 'markdown text'); + + assert.equal( + toMarkdown(dt, { limit: 3 }), + md.slice(0, 5).join(''), + 'markdown text with limit' + ); + }); + + it('formats markdown table text with format option', () => { + const md = [ + '|u|v|\n', + '|:-|-:|\n', + '|aa|10|\n', + '|aa|20|\n', + '|bb|30|\n', + '|aa|40|\n', + '|bb|50|\n' + ]; + + const dt = new ColumnTable({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }) + .orderby('v'); + + assert.equal( + toMarkdown(dt, { + format: { + u: d => d + d, + v: d => d * 10 + } + }), + md.join(''), + 'markdown text with custom format' + ); + }); }); - -tape('toMarkdown formats markdown table text with format option', t => { - const md 
= [ - '|u|v|\n', - '|:-|-:|\n', - '|aa|10|\n', - '|aa|20|\n', - '|bb|30|\n', - '|aa|40|\n', - '|bb|50|\n' - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.equal( - toMarkdown(dt, { - format: { - u: d => d + d, - v: d => d * 10 - } - }), - md.join(''), - 'markdown text with custom format' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/groupby-equal.js b/test/groupby-equal.js index 2f5c0057..56e6e5e1 100644 --- a/test/groupby-equal.js +++ b/test/groupby-equal.js @@ -1,4 +1,6 @@ -export default function(t, table1, table2, msg) { +import assert from 'node:assert'; + +export default function(table1, table2, msg) { const extract = g => ({ keys: g.keys, names: g.names, @@ -6,9 +8,9 @@ export default function(t, table1, table2, msg) { size: g.size }); - t.deepEqual( + assert.deepEqual( extract(table1.groups()), extract(table2.groups()), msg ); -} \ No newline at end of file +} diff --git a/test/helpers/agg-test.js b/test/helpers/agg-test.js index d2114bd1..f3ff62a0 100644 --- a/test/helpers/agg-test.js +++ b/test/helpers/agg-test.js @@ -1,18 +1,18 @@ -import tape from 'tape'; -import { agg, op, table } from '../../src'; +import assert from 'node:assert'; +import { agg, op, table } from '../../src/index.js'; -tape('agg computes aggregate values', t => { - const dt = table({ a: [1, 2, 3, 4] }); +describe('agg', () => { + it('computes aggregate values', () => { + const dt = table({ a: [1, 2, 3, 4] }); - t.deepEqual( - { - sum: agg(dt, op.sum('a')), - max: agg(dt, op.max('a')), - ext: agg(dt, d => [op.min(d.a), op.max(d.a)]) - }, - { sum: 10, max: 4, ext: [1, 4] }, - 'agg helper' - ); - - t.end(); -}); \ No newline at end of file + assert.deepEqual( + { + sum: agg(dt, op.sum('a')), + max: agg(dt, op.max('a')), + ext: agg(dt, d => [op.min(d.a), op.max(d.a)]) + }, + { sum: 10, max: 4, ext: [1, 4] }, + 'agg helper' + ); + }); +}); diff --git a/test/helpers/escape-test.js 
b/test/helpers/escape-test.js index 5bf17bbf..93c32663 100644 --- a/test/helpers/escape-test.js +++ b/test/helpers/escape-test.js @@ -1,109 +1,83 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { escape, op, query, table } from '../../src'; - -tape('derive supports escaped functions', t => { - const dt = table({ a: [1, 2], b: [3, 4] }); - const sq = x => x * x; - const off = 1; - - tableEqual(t, - dt.derive({ z: escape(d => sq(d.a) + off) }), - { a: [1, 2], b: [3, 4], z: [2, 5] }, - 'derive data with escape' - ); - - tableEqual(t, - dt.derive({ z: escape(d => d.a * -d.b + off) }), - { a: [1, 2], b: [3, 4], z: [-2, -7] }, - 'derive data with escape, two columns' - ); - - tableEqual(t, - dt.params({ foo: 2 }) - .derive({ z: escape((d, $) => sq(d.a) + off + op.abs($.foo)) }), - { a: [1, 2], b: [3, 4], z: [4, 7] }, - 'derive data with escape, op, and params' - ); - - tableEqual(t, - dt.derive({ z: escape(2) }), - { a: [1, 2], b: [3, 4], z: [2, 2] }, - 'derive data with escaped literal value' - ); - - t.end(); -}); - -tape('filter supports escaped functions', t => { - const thresh = 5; - tableEqual(t, - table({ a: [1, 4, 9], b: [1, 2, 3] }).filter(escape(d => d.a < thresh)), - { a: [1, 4], b: [1, 2] }, - 'filter data with escape' - ); - - t.end(); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { escape, op, table } from '../../src/index.js'; + +describe('escape', () => { + it('derive supports escaped functions', () => { + const dt = table({ a: [1, 2], b: [3, 4] }); + const sq = x => x * x; + const off = 1; + + tableEqual( + dt.derive({ z: escape(d => sq(d.a) + off) }), + { a: [1, 2], b: [3, 4], z: [2, 5] }, + 'derive data with escape' + ); + + tableEqual( + dt.derive({ z: escape(d => d.a * -d.b + off) }), + { a: [1, 2], b: [3, 4], z: [-2, -7] }, + 'derive data with escape, two columns' + ); + + tableEqual( + dt.params({ foo: 2 }) + .derive({ z: escape((d, $) => sq(d.a) + off + op.abs($.foo)) }), + { 
a: [1, 2], b: [3, 4], z: [4, 7] }, + 'derive data with escape, op, and params' + ); + + tableEqual( + dt.derive({ z: escape(2) }), + { a: [1, 2], b: [3, 4], z: [2, 2] }, + 'derive data with escaped literal value' + ); + }); + + it('filter supports escaped functions', () => { + const thresh = 5; + tableEqual( + table({ a: [1, 4, 9], b: [1, 2, 3] }).filter(escape(d => d.a < thresh)), + { a: [1, 4], b: [1, 2] }, + 'filter data with escape' + ); + }); + + it('spread supports escaped functions', () => { + const pair = d => [d.v, d.v * d.v + 1]; + + tableEqual( + table({ v: [3, 2, 1] }).spread({ v: escape(pair) }, { as: ['a', 'b'] }), + { a: [3, 2, 1], b: [10, 5, 2] }, + 'spread data with escape' + ); + }); + + it('groupby supports escaped functions', () => { + tableEqual( + table({ v: [3, 2, 1] }).groupby({ g: escape(d => -d.v) }).count(), + { g: [-3, -2, -1], count: [1, 1, 1] }, + 'groupby data with escape' + ); + }); + + it('orderby supports escaped functions', () => { + tableEqual( + table({ v: [1, 2, 3] }).orderby(escape(d => -d.v)), + { v: [3, 2, 1] }, + 'orderby data with escape' + ); + }); + + it('aggregate verbs throw for escaped functions', () => { + assert.throws( + () => table({ v: [1, 2, 3] }).rollup({ v: escape(d => -d.v) }), + 'rollup throws on escaped function' + ); + + assert.throws( + () => table({ g: [1, 2], a: [3, 4] }).pivot('g', { v: escape(d => -d.a) }), + 'pivot throws on escaped function' + ); + }); }); - -tape('spread supports escaped functions', t => { - const pair = d => [d.v, d.v * d.v + 1]; - - tableEqual(t, - table({ v: [3, 2, 1] }).spread({ v: escape(pair) }, { as: ['a', 'b'] }), - { a: [3, 2, 1], b: [10, 5, 2] }, - 'spread data with escape' - ); - - t.end(); -}); - -tape('groupby supports escaped functions', t => { - tableEqual(t, - table({ v: [3, 2, 1] }).groupby({ g: escape(d => -d.v) }).count(), - { g: [-3, -2, -1], count: [1, 1, 1] }, - 'groupby data with escape' - ); - - t.end(); -}); - -tape('orderby supports escaped functions', t 
=> { - tableEqual(t, - table({ v: [1, 2, 3] }).orderby(escape(d => -d.v)), - { v: [3, 2, 1] }, - 'orderby data with escape' - ); - - t.end(); -}); - -tape('aggregate verbs throw for escaped functions', t => { - t.throws( - () => table({ v: [1, 2, 3] }).rollup({ v: escape(d => -d.v) }), - 'rollup throws on escaped function' - ); - - t.throws( - () => table({ g: [1, 2], a: [3, 4] }).pivot('g', { v: escape(d => -d.a) }), - 'pivot throws on escaped function' - ); - - t.end(); -}); - -tape('query serialization throws for escaped functions', t => { - const sq = d => d.a * d.a; - - t.throws( - () => query().derive({ z: escape(sq) }).toObject(), - 'query toObject throws on escaped function' - ); - - t.throws( - () => query().derive({ z: escape(sq) }).toAST(), - 'query toAST throws on escape function' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/helpers/names-test.js b/test/helpers/names-test.js index a5efc645..d2806e3b 100644 --- a/test/helpers/names-test.js +++ b/test/helpers/names-test.js @@ -1,27 +1,27 @@ -import tape from 'tape'; -import { names, table } from '../../src'; +import assert from 'node:assert'; +import { names, table } from '../../src/index.js'; -tape('names produces a rename map', t => { - const dt = table({ x: [1], y: [2], z: [3] }); - const entries = [ ['x', 'a'], ['y', 'b'], ['z', 'c'] ]; +describe('names', () => { + it('produces a rename map', () => { + const dt = table({ x: [1], y: [2], z: [3] }); + const entries = [ ['x', 'a'], ['y', 'b'], ['z', 'c'] ]; - t.deepEqual( - [ - names('a', 'b', 'c')(dt), - names(['a', 'b', 'c'])(dt), - names(['a', 'b'], 'c')(dt), - names('a', 'b')(dt), - names('a', 'b', 'c', 'd')(dt) - ], - [ - new Map(entries), - new Map(entries), - new Map(entries), - new Map(entries.slice(0, 2)), - new Map(entries) - ], - 'names helper' - ); - - t.end(); -}); \ No newline at end of file + assert.deepEqual( + [ + names('a', 'b', 'c')(dt), + names(['a', 'b', 'c'])(dt), + names(['a', 'b'], 'c')(dt), + names('a', 
'b')(dt), + names('a', 'b', 'c', 'd')(dt) + ], + [ + new Map(entries), + new Map(entries), + new Map(entries), + new Map(entries.slice(0, 2)), + new Map(entries) + ], + 'names helper' + ); + }); +}); diff --git a/test/op/array-test.js b/test/op/array-test.js index 7adf4ab2..7dfd70a1 100644 --- a/test/op/array-test.js +++ b/test/op/array-test.js @@ -1,196 +1,188 @@ -import tape from 'tape'; -import { op } from '../../src'; +import assert from 'node:assert'; +import { op } from '../../src/index.js'; -tape('op.compact compacts an array', t => { - t.deepEqual( - [ - op.compact(Float64Array.of(1, NaN, 2)), - op.compact([ 1, 2, 3 ]), - op.compact([ 1, null, 2, undefined, NaN, 3 ]), - op.compact(null), - op.compact(undefined), - op.compact(NaN) - ], - [ - Float64Array.of(1, 2), - [ 1, 2, 3 ], - [ 1, 2, 3 ], - null, - undefined, - NaN - ], - 'compact' - ); - t.end(); -}); +describe('array op', () => { + it('compact compacts an array', () => { + assert.deepEqual( + [ + op.compact(Float64Array.of(1, NaN, 2)), + op.compact([ 1, 2, 3 ]), + op.compact([ 1, null, 2, undefined, NaN, 3 ]), + op.compact(null), + op.compact(undefined), + op.compact(NaN) + ], + [ + Float64Array.of(1, 2), + [ 1, 2, 3 ], + [ 1, 2, 3 ], + null, + undefined, + NaN + ], + 'compact' + ); + }); -tape('op.concat concats an array', t => { - t.deepEqual( - [ - op.concat(), - op.concat([ 1, 2 ], [ 3, 4 ], [ 5 ]), - op.concat(1, 2, [ 3 ]) - ], - [ - [], - [ 1, 2, 3, 4, 5 ], - [ 1, 2, 3 ] - ], - 'concat' - ); - t.end(); -}); + it('concat concats an array', () => { + assert.deepEqual( + [ + op.concat(), + op.concat([ 1, 2 ], [ 3, 4 ], [ 5 ]), + op.concat(1, 2, [ 3 ]) + ], + [ + [], + [ 1, 2, 3, 4, 5 ], + [ 1, 2, 3 ] + ], + 'concat' + ); + }); -tape('op.includes checks if a sequence contains a value', t => { - t.deepEqual( - [ - op.includes([1, 2], 1), - op.includes([1, 2], 1, 1), - op.includes('12', '1'), - op.includes('12', '1', 1), - op.includes(null, 1), - op.includes(undefined, 1), - op.includes(NaN, 1) - ], - 
[ - true, - false, - true, - false, - false, - false, - false - ], - 'includes' - ); - t.end(); -}); + it('includes checks if a sequence contains a value', () => { + assert.deepEqual( + [ + op.includes([1, 2], 1), + op.includes([1, 2], 1, 1), + op.includes('12', '1'), + op.includes('12', '1', 1), + op.includes(null, 1), + op.includes(undefined, 1), + op.includes(NaN, 1) + ], + [ + true, + false, + true, + false, + false, + false, + false + ], + 'includes' + ); + }); -tape('op.indexof finds first index of a value', t => { - t.deepEqual( - [ - op.indexof([2, 1, 1], 1), - op.indexof([2, 1, 1], 3), - op.indexof('211', '1'), - op.indexof('211', '3'), - op.indexof(null, 1), - op.indexof(undefined, 1), - op.indexof(NaN, 1) - ], - [ 1, -1, 1, -1, -1, -1, -1 ], - 'indexof' - ); - t.end(); -}); + it('indexof finds first index of a value', () => { + assert.deepEqual( + [ + op.indexof([2, 1, 1], 1), + op.indexof([2, 1, 1], 3), + op.indexof('211', '1'), + op.indexof('211', '3'), + op.indexof(null, 1), + op.indexof(undefined, 1), + op.indexof(NaN, 1) + ], + [ 1, -1, 1, -1, -1, -1, -1 ], + 'indexof' + ); + }); -tape('op.join maps an array to a string', t => { - t.deepEqual( - [ - op.join([2, 1, 1]), - op.join([2, 1, 1], ' '), - op.join('211', ' '), - op.join(null), - op.join(undefined), - op.join(NaN) - ], - [ '2,1,1', '2 1 1', undefined, undefined, undefined, undefined], - 'join' - ); - t.end(); -}); + it('join maps an array to a string', () => { + assert.deepEqual( + [ + op.join([2, 1, 1]), + op.join([2, 1, 1], ' '), + op.join('211', ' '), + op.join(null), + op.join(undefined), + op.join(NaN) + ], + [ '2,1,1', '2 1 1', undefined, undefined, undefined, undefined], + 'join' + ); + }); -tape('op.lastindexof finds last index of a value', t => { - t.deepEqual( - [ - op.lastindexof([2, 1, 1], 1), - op.lastindexof([2, 1, 1], 3), - op.lastindexof('211', '1'), - op.lastindexof('211', '3'), - op.lastindexof(null, 1), - op.lastindexof(undefined, 1), - op.lastindexof(NaN, 1) - ], - [ 2, 
-1, 2, -1, -1, -1, -1 ], - 'lastindexof' - ); - t.end(); -}); + it('lastindexof finds last index of a value', () => { + assert.deepEqual( + [ + op.lastindexof([2, 1, 1], 1), + op.lastindexof([2, 1, 1], 3), + op.lastindexof('211', '1'), + op.lastindexof('211', '3'), + op.lastindexof(null, 1), + op.lastindexof(undefined, 1), + op.lastindexof(NaN, 1) + ], + [ 2, -1, 2, -1, -1, -1, -1 ], + 'lastindexof' + ); + }); -tape('op.length returns the length of a sequence', t => { - t.deepEqual( - [ - op.length([]), - op.length(''), - op.length([2, 1, 1]), - op.length('211'), - op.length(null), - op.length(undefined), - op.length(NaN) - ], - [ 0, 0, 3, 3, 0, 0, 0 ], - 'length' - ); - t.end(); -}); + it('length returns the length of a sequence', () => { + assert.deepEqual( + [ + op.length([]), + op.length(''), + op.length([2, 1, 1]), + op.length('211'), + op.length(null), + op.length(undefined), + op.length(NaN) + ], + [ 0, 0, 3, 3, 0, 0, 0 ], + 'length' + ); + }); -tape('op.pluck retrieves a property from each array element', t => { - t.deepEqual( - [ - op.pluck([], 'x'), - op.pluck([{ x: 1 }, { x: 2 }, {}, 'foo'], 'x'), - op.pluck('foo', 'x'), - op.pluck(null, 'x'), - op.pluck(undefined, 'x'), - op.pluck(NaN, 'x') - ], - [ - [], [1, 2, undefined, undefined], - undefined, undefined, undefined, undefined - ], - 'pluck' - ); - t.end(); -}); + it('pluck retrieves a property from each array element', () => { + assert.deepEqual( + [ + op.pluck([], 'x'), + op.pluck([{ x: 1 }, { x: 2 }, {}, 'foo'], 'x'), + op.pluck('foo', 'x'), + op.pluck(null, 'x'), + op.pluck(undefined, 'x'), + op.pluck(NaN, 'x') + ], + [ + [], [1, 2, undefined, undefined], + undefined, undefined, undefined, undefined + ], + 'pluck' + ); + }); -tape('op.reverse reverses a sequence', t => { - t.deepEqual( - [ - op.reverse([]), - op.reverse(''), - op.reverse([2, 1, 1]), - op.reverse('211'), - op.reverse(null), - op.reverse(undefined), - op.reverse(NaN) - ], - [ - [], '', [1, 1, 2], '112', - undefined, undefined, 
undefined - ], - 'reverse' - ); - t.end(); -}); + it('reverse reverses a sequence', () => { + assert.deepEqual( + [ + op.reverse([]), + op.reverse(''), + op.reverse([2, 1, 1]), + op.reverse('211'), + op.reverse(null), + op.reverse(undefined), + op.reverse(NaN) + ], + [ + [], '', [1, 1, 2], '112', + undefined, undefined, undefined + ], + 'reverse' + ); + }); -tape('op.slice extracts a subsequence', t => { - t.deepEqual( - [ - op.slice([2, 1, 3]), - op.slice([2, 1, 3], 1), - op.slice([2, 1, 3], 1, -1), - op.slice('213'), - op.slice('213', 1), - op.slice('213', 1, -1), - op.slice(null), - op.slice(undefined), - op.slice(NaN) - ], - [ - [2, 1, 3], [1, 3], [1], - '213', '13', '1', - undefined, undefined, undefined - ], - 'slice' - ); - t.end(); + it('slice extracts a subsequence', () => { + assert.deepEqual( + [ + op.slice([2, 1, 3]), + op.slice([2, 1, 3], 1), + op.slice([2, 1, 3], 1, -1), + op.slice('213'), + op.slice('213', 1), + op.slice('213', 1, -1), + op.slice(null), + op.slice(undefined), + op.slice(NaN) + ], + [ + [2, 1, 3], [1, 3], [1], + '213', '13', '1', + undefined, undefined, undefined + ], + 'slice' + ); + }); }); diff --git a/test/op/date-test.js b/test/op/date-test.js index 046ef462..8460c691 100644 --- a/test/op/date-test.js +++ b/test/op/date-test.js @@ -1,42 +1,40 @@ -import tape from 'tape'; -import { op } from '../../src'; +import assert from 'node:assert'; +import { op } from '../../src/index.js'; -tape('op.dayofyear returns the day of the year', t => { - t.deepEqual([ - op.dayofyear(op.datetime(2000, 0, 1)), - op.dayofyear(op.datetime(2000, 0, 2)), - op.dayofyear(+op.datetime(2000, 11, 30)), - op.dayofyear(+op.datetime(2000, 11, 31)) - ], [1, 2, 365, 366], 'dayofyear'); - t.end(); -}); +describe('date op', () => { + it('dayofyear returns the day of the year', () => { + assert.deepEqual([ + op.dayofyear(op.datetime(2000, 0, 1)), + op.dayofyear(op.datetime(2000, 0, 2)), + op.dayofyear(+op.datetime(2000, 11, 30)), + op.dayofyear(+op.datetime(2000, 
11, 31)) + ], [1, 2, 365, 366], 'dayofyear'); + }); -tape('op.week returns the week of the year', t => { - t.deepEqual([ - op.week(op.datetime(2000, 0, 1)), - op.week(op.datetime(2000, 0, 2)), - op.week(+op.datetime(2000, 11, 30)), - op.week(+op.datetime(2000, 11, 31)) - ], [0, 1, 52, 53], 'week'); - t.end(); -}); + it('week returns the week of the year', () => { + assert.deepEqual([ + op.week(op.datetime(2000, 0, 1)), + op.week(op.datetime(2000, 0, 2)), + op.week(+op.datetime(2000, 11, 30)), + op.week(+op.datetime(2000, 11, 31)) + ], [0, 1, 52, 53], 'week'); + }); -tape('op.utcdayofyear returns the UTC day of the year', t => { - t.deepEqual([ - op.utcdayofyear(op.utcdatetime(2000, 0, 1)), - op.utcdayofyear(op.utcdatetime(2000, 0, 2)), - op.utcdayofyear(+op.utcdatetime(2000, 11, 30)), - op.utcdayofyear(+op.utcdatetime(2000, 11, 31)) - ], [1, 2, 365, 366], 'utcdayofyear'); - t.end(); -}); + it('utcdayofyear returns the UTC day of the year', () => { + assert.deepEqual([ + op.utcdayofyear(op.utcdatetime(2000, 0, 1)), + op.utcdayofyear(op.utcdatetime(2000, 0, 2)), + op.utcdayofyear(+op.utcdatetime(2000, 11, 30)), + op.utcdayofyear(+op.utcdatetime(2000, 11, 31)) + ], [1, 2, 365, 366], 'utcdayofyear'); + }); -tape('op.utcweek returns the UTC week of the year', t => { - t.deepEqual([ - op.utcweek(op.utcdatetime(2000, 0, 1)), - op.utcweek(op.utcdatetime(2000, 0, 2)), - op.utcweek(+op.utcdatetime(2000, 11, 30)), - op.utcweek(+op.utcdatetime(2000, 11, 31)) - ], [0, 1, 52, 53], 'week'); - t.end(); -}); \ No newline at end of file + it('utcweek returns the UTC week of the year', () => { + assert.deepEqual([ + op.utcweek(op.utcdatetime(2000, 0, 1)), + op.utcweek(op.utcdatetime(2000, 0, 2)), + op.utcweek(+op.utcdatetime(2000, 11, 30)), + op.utcweek(+op.utcdatetime(2000, 11, 31)) + ], [0, 1, 52, 53], 'week'); + }); +}); diff --git a/test/op/json-test.js b/test/op/json-test.js index eea7e917..5174d53d 100644 --- a/test/op/json-test.js +++ b/test/op/json-test.js @@ -1,29 +1,29 @@ 
-import tape from 'tape'; -import { op } from '../../src'; +import assert from 'node:assert'; +import { op } from '../../src/index.js'; -tape('op.parse_json parses json strings', t => { - t.deepEqual([ - op.parse_json('1'), - op.parse_json('[3,2,1.2]'), - op.parse_json('{"foo":true,"bar":"bop","baz":null}') - ], [ - 1, - [3, 2, 1.2], - {foo: true, bar: 'bop', baz: null} - ], 'parse_json'); - t.end(); -}); +describe('json op', () => { + it('parse_json parses json strings', () => { + assert.deepEqual([ + op.parse_json('1'), + op.parse_json('[3,2,1.2]'), + op.parse_json('{"foo":true,"bar":"bop","baz":null}') + ], [ + 1, + [3, 2, 1.2], + {foo: true, bar: 'bop', baz: null} + ], 'parse_json'); + }); -tape('op.to_json generates json strings', t => { - t.deepEqual([ - op.to_json(1), - op.to_json([3, 2, 1.2]), - op.to_json({foo: true, bar: 'bop', baz: null, buz: undefined}) - ], [ - '1', - '[3,2,1.2]', - '{"foo":true,"bar":"bop","baz":null}' + it('to_json generates json strings', () => { + assert.deepEqual([ + op.to_json(1), + op.to_json([3, 2, 1.2]), + op.to_json({foo: true, bar: 'bop', baz: null, buz: undefined}) + ], [ + '1', + '[3,2,1.2]', + '{"foo":true,"bar":"bop","baz":null}' - ], 'to_json'); - t.end(); -}); \ No newline at end of file + ], 'to_json'); + }); +}); diff --git a/test/op/math-test.js b/test/op/math-test.js index 2c8cbaed..6b33e53c 100644 --- a/test/op/math-test.js +++ b/test/op/math-test.js @@ -1,18 +1,18 @@ -import tape from 'tape'; -import { op } from '../../src'; +import assert from 'node:assert'; +import { op } from '../../src/index.js'; -tape('op.greatest returns the greatest element', t => { - t.equal(op.greatest(1, 2, 3), 3, 'greatest'); - t.equal(op.greatest(1, null, 3), 3, 'greatest with null'); - t.equal(op.greatest(1, undefined, 3), NaN, 'greatest with undefined'); - t.equal(op.greatest(1, NaN, 3), NaN, 'greatest with NaN'); - t.end(); -}); +describe('math op', () => { + it('greatest returns the greatest element', () => { + 
assert.equal(op.greatest(1, 2, 3), 3, 'greatest'); + assert.equal(op.greatest(1, null, 3), 3, 'greatest with null'); + assert.equal(op.greatest(1, undefined, 3), NaN, 'greatest with undefined'); + assert.equal(op.greatest(1, NaN, 3), NaN, 'greatest with NaN'); + }); -tape('op.least returns the least element', t => { - t.equal(op.least(1, 2, 3), 1, 'least'); - t.equal(op.least(1, null, 3), 0, 'least with null'); - t.equal(op.least(1, undefined, 3), NaN, 'least with undefined'); - t.equal(op.least(1, NaN, 3), NaN, 'least with NaN'); - t.end(); -}); \ No newline at end of file + it('least returns the least element', () => { + assert.equal(op.least(1, 2, 3), 1, 'least'); + assert.equal(op.least(1, null, 3), 0, 'least with null'); + assert.equal(op.least(1, undefined, 3), NaN, 'least with undefined'); + assert.equal(op.least(1, NaN, 3), NaN, 'least with NaN'); + }); +}); diff --git a/test/op/object-test.js b/test/op/object-test.js index 28dcec55..beab8f8b 100644 --- a/test/op/object-test.js +++ b/test/op/object-test.js @@ -1,90 +1,87 @@ -import tape from 'tape'; -import { op } from '../../src'; +import assert from 'node:assert'; +import { op } from '../../src/index.js'; -tape('op.has checks if a object/map/set has a key', t => { - t.deepEqual( - [ - op.has({ a: 1}, 'a'), - op.has({ a: 1}, 'b'), - op.has(new Map([['a', 1]]), 'a'), - op.has(new Map([['a', 1]]), 'b'), - op.has(new Set(['a']), 'a'), - op.has(new Set(['a']), 'b'), - op.has(null, 'a'), - op.has(undefined, 'a'), - op.has(NaN, 'a') - ], - [ - true, false, - true, false, - true, false, - false, - false, - false - ], - 'has' - ); - t.end(); -}); +describe('object op', () => { + it('has checks if a object/map/set has a key', () => { + assert.deepEqual( + [ + op.has({ a: 1}, 'a'), + op.has({ a: 1}, 'b'), + op.has(new Map([['a', 1]]), 'a'), + op.has(new Map([['a', 1]]), 'b'), + op.has(new Set(['a']), 'a'), + op.has(new Set(['a']), 'b'), + op.has(null, 'a'), + op.has(undefined, 'a'), + op.has(NaN, 'a') + ], + [ + 
true, false, + true, false, + true, false, + false, + false, + false + ], + 'has' + ); + }); -tape('op.keys returns object/map keys', t => { - t.deepEqual( - [ - op.keys({ a: 1}), - op.keys(new Map([['a', 1]])), - op.keys(null), - op.keys(undefined), - op.keys(NaN) - ], - [ ['a'], ['a'], [], [], [] ], - 'keys' - ); - t.end(); -}); + it('keys returns object/map keys', () => { + assert.deepEqual( + [ + op.keys({ a: 1}), + op.keys(new Map([['a', 1]])), + op.keys(null), + op.keys(undefined), + op.keys(NaN) + ], + [ ['a'], ['a'], [], [], [] ], + 'keys' + ); + }); -tape('op.values returns object/map/set values', t => { - t.deepEqual( - [ - op.values({ a: 1}), - op.values(new Map([['a', 1]])), - op.values(new Set(['a'])), - op.values(null), - op.values(undefined), - op.values(NaN) - ], - [ [1], [1], ['a'], [], [], [] ], - 'values' - ); - t.end(); -}); + it('values returns object/map/set values', () => { + assert.deepEqual( + [ + op.values({ a: 1}), + op.values(new Map([['a', 1]])), + op.values(new Set(['a'])), + op.values(null), + op.values(undefined), + op.values(NaN) + ], + [ [1], [1], ['a'], [], [], [] ], + 'values' + ); + }); -tape('op.entries returns object/map/set entries', t => { - t.deepEqual( - [ - op.entries({ a: 1}), - op.entries(new Map([['a', 1]])), - op.entries(new Set(['a'])), - op.entries(null), - op.entries(undefined), - op.entries(NaN) - ], - [ [['a', 1]], [['a', 1]], [['a', 'a']], [], [], [] ], - 'entries' - ); - t.end(); -}); + it('entries returns object/map/set entries', () => { + assert.deepEqual( + [ + op.entries({ a: 1}), + op.entries(new Map([['a', 1]])), + op.entries(new Set(['a'])), + op.entries(null), + op.entries(undefined), + op.entries(NaN) + ], + [ [['a', 1]], [['a', 1]], [['a', 'a']], [], [], [] ], + 'entries' + ); + }); -tape('op.object constructs an object from iterable entries', t => { - t.deepEqual( - [ - op.object([['a', 1]]), - op.object(new Map([['b', 2]])), - op.object(null), - op.object(undefined), - op.object(NaN) - ], - [ {a: 
1}, {b: 2}, undefined, undefined, undefined ], - 'object' - ); - t.end(); -}); \ No newline at end of file + it('object constructs an object from iterable entries', () => { + assert.deepEqual( + [ + op.object([['a', 1]]), + op.object(new Map([['b', 2]])), + op.object(null), + op.object(undefined), + op.object(NaN) + ], + [ {a: 1}, {b: 2}, undefined, undefined, undefined ], + 'object' + ); + }); +}); diff --git a/test/op/op-test.js b/test/op/op-test.js index ac3bb56d..97482ed2 100644 --- a/test/op/op-test.js +++ b/test/op/op-test.js @@ -1,46 +1,45 @@ -import tape from 'tape'; -import { aggregateFunctions, functions, windowFunctions } from '../../src/op'; -import op from '../../src/op/op-api'; -import has from '../../src/util/has'; +import assert from 'node:assert'; +import { aggregateFunctions, functions, windowFunctions } from '../../src/op/index.js'; +import op from '../../src/op/op-api.js'; +import has from '../../src/util/has.js'; -tape('op includes all aggregate functions', t => { - let pass = true; - for (const name in aggregateFunctions) { - if (op[name] == null) { - pass = false; - t.fail(`missing aggregate function: ${name}`); +describe('op', () => { + it('includes all aggregate functions', () => { + let pass = true; + for (const name in aggregateFunctions) { + if (op[name] == null) { + pass = false; + assert.fail(`missing aggregate function: ${name}`); + } } - } - t.ok(pass, 'has aggregate functions'); - t.end(); -}); + assert.ok(pass, 'has aggregate functions'); + }); -tape('op includes all window functions', t => { - let pass = true; - for (const name in windowFunctions) { - if (op[name] == null) { - pass = false; - t.fail(`missing window function: ${name}`); + it('includes all window functions', () => { + let pass = true; + for (const name in windowFunctions) { + if (op[name] == null) { + pass = false; + assert.fail(`missing window function: ${name}`); + } } - } - t.ok(pass, 'has window functions'); - t.end(); -}); + assert.ok(pass, 'has window 
functions'); + }); -tape('op functions do not have name collision', t => { - const overlap = []; + it('functions do not have name collision', () => { + const overlap = []; - for (const name in aggregateFunctions) { - if (has(functions, name) || has(windowFunctions, name)) { - overlap.push(name); + for (const name in aggregateFunctions) { + if (has(functions, name) || has(windowFunctions, name)) { + overlap.push(name); + } } - } - for (const name in windowFunctions) { - if (has(functions, name)) { - overlap.push(name); + for (const name in windowFunctions) { + if (has(functions, name)) { + overlap.push(name); + } } - } - t.deepEqual(overlap, [], 'no name collisons'); - t.end(); -}); \ No newline at end of file + assert.deepEqual(overlap, [], 'no name collisons'); + }); +}); diff --git a/test/op/register-test.js b/test/op/register-test.js new file mode 100644 index 00000000..522c8f01 --- /dev/null +++ b/test/op/register-test.js @@ -0,0 +1,93 @@ +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { + aggregateFunctions, + functions, + windowFunctions +} from '../../src/op/index.js'; +import { + addAggregateFunction, + addFunction, + addWindowFunction +} from '../../src/op/register.js'; +import { op, table } from '../../src/index-browser.js'; + +describe('register', () => { + it('addFunction registers new function', () => { + const SECRET = 0xDEADBEEF; + function secret() { return 0xDEADBEEF; } + + addFunction(secret); + addFunction('sssh', secret); + assert.equal(functions.secret(), SECRET, 'add implicitly named function'); + assert.equal(functions.sssh(), SECRET, 'add explicitly named function'); + + assert.throws( + () => addFunction(() => 'foo'), + 'do not accept anonymous functions' + ); + + assert.throws( + () => addFunction('abs', val => val < 0 ? -val : val), + 'do not overwrite existing functions' + ); + + const abs = op.abs; + assert.doesNotThrow( + () => { + addFunction('abs', val => val < 0 ? 
-val : val, { override: true }); + addFunction('abs', abs, { override: true }); + }, + 'support override option' + ); + }); + + it('addAggregateFunction registers new aggregate function', () => { + const create = () => ({ + init: s => (s.altsign = -1, s.altsum = 0), + add: (s, v) => s.altsum += (s.altsign *= -1) * v, + rem: () => {}, + value: s => s.altsum + }); + + addAggregateFunction('altsum', { create, param: [1, 0] }); + assert.deepEqual( + aggregateFunctions.altsum, + { create, param: [1, 0] }, + 'register aggregate function' + ); + assert.equal( + table({ x: [1, 2, 3, 4, 5]}).rollup({ a: d => op.altsum(d.x) }).get('a', 0), + 3, 'evaluate aggregate function' + ); + + assert.throws( + () => addAggregateFunction('mean', { create }), + 'do not overwrite existing function' + ); + }); + + it('addWindowFunction registers new window function', () => { + const create = (offset) => ({ + init: () => {}, + value: (w, f) => w.value(w.index, f) - w.index + (offset || 0) + }); + + addWindowFunction('vmi', { create, param: [1, 1] }); + assert.deepEqual( + windowFunctions.vmi, + { create, param: [1, 1] }, + 'register window function' + ); + tableEqual( + table({ x: [1, 2, 3, 4, 5] }).derive({ a: d => op.vmi(d.x, 1) }).select('a'), + { a: [2, 2, 2, 2, 2] }, + 'evaluate window function' + ); + + assert.throws( + () => addWindowFunction('rank', { create }), + 'do not overwrite existing function' + ); + }); +}); diff --git a/test/op/row-object-test.js b/test/op/row-object-test.js index 1c307b37..1a671d28 100644 --- a/test/op/row-object-test.js +++ b/test/op/row-object-test.js @@ -1,44 +1,44 @@ -import tape from 'tape'; -import { op, table } from '../../src'; - -tape('op.row_object generates objects with row data', t => { - const dt = table({ a: [1, 2], b: [3, 4] }); - - t.deepEqual( - dt.derive({ row: op.row_object() }).array('row'), - dt.objects(), - 'row objects, outside function context' - ); - - t.deepEqual( - dt.derive({ row: () => op.row_object() }).array('row'), - 
dt.objects(), - 'row objects, inside function context' - ); - - t.deepEqual( - dt.derive({ row: op.row_object('a') }).array('row'), - dt.objects({ columns: 'a' }), - 'row objects, column names outside function context' - ); - - t.deepEqual( - dt.derive({ row: () => op.row_object('a' + '') }).array('row'), - dt.objects({ columns: 'a' }), - 'row objects, column names inside function context' - ); - - t.deepEqual( - dt.derive({ row: op.row_object(0) }).array('row'), - dt.objects({ columns: 'a' }), - 'row objects, column indices outside function context' - ); - - t.deepEqual( - dt.derive({ row: () => op.row_object(0 + 0) }).array('row'), - dt.objects({ columns: 'a' }), - 'row objects, column indices inside function context' - ); - - t.end(); +import assert from 'node:assert'; +import { op, table } from '../../src/index.js'; + +describe('row_object op', () => { + it('row_object generates objects with row data', () => { + const dt = table({ a: [1, 2], b: [3, 4] }); + + assert.deepEqual( + dt.derive({ row: op.row_object() }).array('row'), + dt.objects(), + 'row objects, outside function context' + ); + + assert.deepEqual( + dt.derive({ row: () => op.row_object() }).array('row'), + dt.objects(), + 'row objects, inside function context' + ); + + assert.deepEqual( + dt.derive({ row: op.row_object('a') }).array('row'), + dt.objects({ columns: 'a' }), + 'row objects, column names outside function context' + ); + + assert.deepEqual( + dt.derive({ row: () => op.row_object('a' + '') }).array('row'), + dt.objects({ columns: 'a' }), + 'row objects, column names inside function context' + ); + + assert.deepEqual( + dt.derive({ row: op.row_object(0) }).array('row'), + dt.objects({ columns: 'a' }), + 'row objects, column indices outside function context' + ); + + assert.deepEqual( + dt.derive({ row: () => op.row_object(0 + 0) }).array('row'), + dt.objects({ columns: 'a' }), + 'row objects, column indices inside function context' + ); + }); }); diff --git a/test/op/string-test.js 
b/test/op/string-test.js index d1c9b8f6..3d628c44 100644 --- a/test/op/string-test.js +++ b/test/op/string-test.js @@ -1,236 +1,221 @@ -import tape from 'tape'; -import { op } from '../../src'; - -tape('op.parse_date parses date values', t => { - t.deepEqual( - [ - op.parse_date('2001-01-01'), - op.parse_date(null), - op.parse_date(undefined) - ], - [ new Date(Date.UTC(2001, 0, 1)), null, undefined ], - 'parse_date' - ); - t.end(); -}); - -tape('op.parse_float parses float values', t => { - t.deepEqual( - [ - op.parse_float('1.2'), - op.parse_float(null), - op.parse_float(undefined) - ], - [ 1.2, null, undefined ], - 'parse_float' - ); - t.end(); -}); - -tape('op.parse_int parses integer values', t => { - t.deepEqual( - [ - op.parse_int('1'), - op.parse_int('F', 16), - op.parse_int(null), - op.parse_int(undefined) - ], - [ 1, 15, null, undefined ], - 'parse_int' - ); - t.end(); -}); - -tape('op.endswith tests if a string ends with a substring', t => { - t.deepEqual( - [ - op.endswith('123', '3'), - op.endswith('123', '1'), - op.endswith('123', '3', 2), - op.endswith(null, '1'), - op.endswith(undefined, '1') - ], - [ true, false, false, false, false ], - 'endswith' - ); - t.end(); -}); - -tape('op.match returns pattern matches', t => { - t.deepEqual( - [ - op.match('foo', /bar/), - op.match('1 2 3 4', /\d+/).slice(), - op.match('1 2 3 4', /\d+/g), - op.match('1 2 3 4', /\d+ (\d+)/, 1), - op.match('1 2 3 4', /(?<digit>\d+)/, 'digit'), - op.match('1 2 3 4', /\d+/, 'digit'), - op.match(null, /\d+/), - op.match(undefined, /\d+/) - ], - [ - null, ['1'], ['1', '2', '3', '4'], '2', '1', null, - null, undefined - ], - 'match' - ); - t.end(); -}); - -tape('op.normalize normalizes strings', t => { - t.deepEqual( - [ - op.normalize('abc'), - op.normalize('\u006E\u0303'), - op.normalize(null), - op.normalize(undefined) - ], - [ 'abc', '\u00F1', null, undefined ], - 'normalize' - ); - t.end(); -}); - -tape('op.padend pads the end of strings', t => { - t.deepEqual( - [ -
op.padend('abc', 4), - op.padend('abc', 4, ' '), - op.padend('abc', 5, '#'), - op.padend(null), - op.padend(undefined) - ], - [ 'abc ', 'abc ', 'abc##', null, undefined ], - 'padend' - ); - t.end(); -}); - -tape('op.padstart pads the start of strings', t => { - t.deepEqual( - [ - op.padstart('abc', 4), - op.padstart('abc', 4, ' '), - op.padstart('abc', 5, '#'), - op.padstart(null), - op.padstart(undefined) - ], - [ ' abc', ' abc', '##abc', null, undefined ], - 'padstart' - ); - t.end(); -}); - -tape('op.upper maps a string to upper-case', t => { - t.deepEqual( - [ - op.upper('abc'), - op.upper(null), - op.upper(undefined) - ], - [ 'ABC', null, undefined ], - 'upper' - ); - t.end(); -}); - -tape('op.lower maps a string to lower-case', t => { - t.deepEqual( - [ - op.lower('ABC'), - op.lower(null), - op.lower(undefined) - ], - [ 'abc', null, undefined ], - 'lower' - ); - t.end(); -}); - -tape('op.repeat repeats a string', t => { - t.deepEqual( - [ - op.repeat('a', 3), - op.repeat(null, 2), - op.repeat(undefined, 2) - ], - [ 'aaa', null, undefined ], - 'repeat' - ); - t.end(); -}); - -tape('op.replace replaces a pattern within a string', t => { - t.deepEqual( - [ - op.replace('aba', 'a', 'c'), - op.replace('aba', /a/, 'c'), - op.replace('aba', /a/g, 'c'), - op.replace(null, 'a', 'c'), - op.replace(undefined, 'a', 'c') - ], - [ 'cba', 'cba', 'cbc', null, undefined ], - 'replace' - ); - t.end(); -}); - -tape('op.substring extracts a substring', t => { - t.deepEqual( - [ - op.substring('aba', 0, 1), - op.substring('aba', 0, 2), - op.substring('aba', 1, 3), - op.substring(null, 0, 1), - op.substring(undefined, 0, 1) - ], - [ 'a', 'ab', 'ba', null, undefined ], - 'substring' - ); - t.end(); -}); - -tape('op.split splits a string on a delimter pattern', t => { - t.deepEqual( - [ - op.split('aba', ''), - op.split('a,b,a', ','), - op.split('a,b,a', /,/), - op.split(null, ','), - op.split(undefined, ',') - ], - [ ['a', 'b', 'a'], ['a', 'b', 'a'], ['a', 'b', 'a'], [], [] ], - 
'split' - ); - t.end(); -}); - - -tape('op.startswith tests if a starts ends with a substring', t => { - t.deepEqual( - [ - op.startswith('123', '3'), - op.startswith('123', '1'), - op.startswith('123', '1', 2), - op.startswith(null, '1'), - op.startswith(undefined, '1') - ], - [ false, true, false, false, false ], - 'startswith' - ); - t.end(); -}); - -tape('op.trim trims whitespace from a string', t => { - t.deepEqual( - [ - op.trim('1'), - op.trim(' 1 '), - op.trim(null), - op.trim(undefined) - ], - [ '1', '1', null, undefined ], - 'trim' - ); - t.end(); +import assert from 'node:assert'; +import { op } from '../../src/index.js'; + +describe('string op', () => { + it('parse_date parses date values', () => { + assert.deepEqual( + [ + op.parse_date('2001-01-01'), + op.parse_date(null), + op.parse_date(undefined) + ], + [ new Date(Date.UTC(2001, 0, 1)), null, undefined ], + 'parse_date' + ); + }); + + it('parse_float parses float values', () => { + assert.deepEqual( + [ + op.parse_float('1.2'), + op.parse_float(null), + op.parse_float(undefined) + ], + [ 1.2, null, undefined ], + 'parse_float' + ); + }); + + it('parse_int parses integer values', () => { + assert.deepEqual( + [ + op.parse_int('1'), + op.parse_int('F', 16), + op.parse_int(null), + op.parse_int(undefined) + ], + [ 1, 15, null, undefined ], + 'parse_int' + ); + }); + + it('endswith tests if a string ends with a substring', () => { + assert.deepEqual( + [ + op.endswith('123', '3'), + op.endswith('123', '1'), + op.endswith('123', '3', 2), + op.endswith(null, '1'), + op.endswith(undefined, '1') + ], + [ true, false, false, false, false ], + 'endswith' + ); + }); + + it('match returns pattern matches', () => { + assert.deepEqual( + [ + op.match('foo', /bar/), + op.match('1 2 3 4', /\d+/).slice(), + op.match('1 2 3 4', /\d+/g), + op.match('1 2 3 4', /\d+ (\d+)/, 1), + op.match('1 2 3 4', /(?<digit>\d+)/, 'digit'), + op.match('1 2 3 4', /\d+/, 'digit'), + op.match(null, /\d+/), + op.match(undefined, /\d+/) + ], + [
+ null, ['1'], ['1', '2', '3', '4'], '2', '1', null, + null, undefined + ], + 'match' + ); + }); + + it('normalize normalizes strings', () => { + assert.deepEqual( + [ + op.normalize('abc'), + op.normalize('\u006E\u0303'), + op.normalize(null), + op.normalize(undefined) + ], + [ 'abc', '\u00F1', null, undefined ], + 'normalize' + ); + }); + + it('padend pads the end of strings', () => { + assert.deepEqual( + [ + op.padend('abc', 4), + op.padend('abc', 4, ' '), + op.padend('abc', 5, '#'), + op.padend(null), + op.padend(undefined) + ], + [ 'abc ', 'abc ', 'abc##', null, undefined ], + 'padend' + ); + }); + + it('padstart pads the start of strings', () => { + assert.deepEqual( + [ + op.padstart('abc', 4), + op.padstart('abc', 4, ' '), + op.padstart('abc', 5, '#'), + op.padstart(null), + op.padstart(undefined) + ], + [ ' abc', ' abc', '##abc', null, undefined ], + 'padstart' + ); + }); + + it('upper maps a string to upper-case', () => { + assert.deepEqual( + [ + op.upper('abc'), + op.upper(null), + op.upper(undefined) + ], + [ 'ABC', null, undefined ], + 'upper' + ); + }); + + it('lower maps a string to lower-case', () => { + assert.deepEqual( + [ + op.lower('ABC'), + op.lower(null), + op.lower(undefined) + ], + [ 'abc', null, undefined ], + 'lower' + ); + }); + + it('repeat repeats a string', () => { + assert.deepEqual( + [ + op.repeat('a', 3), + op.repeat(null, 2), + op.repeat(undefined, 2) + ], + [ 'aaa', null, undefined ], + 'repeat' + ); + }); + + it('replace replaces a pattern within a string', () => { + assert.deepEqual( + [ + op.replace('aba', 'a', 'c'), + op.replace('aba', /a/, 'c'), + op.replace('aba', /a/g, 'c'), + op.replace(null, 'a', 'c'), + op.replace(undefined, 'a', 'c') + ], + [ 'cba', 'cba', 'cbc', null, undefined ], + 'replace' + ); + }); + + it('substring extracts a substring', () => { + assert.deepEqual( + [ + op.substring('aba', 0, 1), + op.substring('aba', 0, 2), + op.substring('aba', 1, 3), + op.substring(null, 0, 1), + op.substring(undefined, 
0, 1) + ], + [ 'a', 'ab', 'ba', null, undefined ], + 'substring' + ); + }); + + it('split splits a string on a delimter pattern', () => { + assert.deepEqual( + [ + op.split('aba', ''), + op.split('a,b,a', ','), + op.split('a,b,a', /,/), + op.split(null, ','), + op.split(undefined, ',') + ], + [ ['a', 'b', 'a'], ['a', 'b', 'a'], ['a', 'b', 'a'], [], [] ], + 'split' + ); + }); + + it('startswith tests if a starts ends with a substring', () => { + assert.deepEqual( + [ + op.startswith('123', '3'), + op.startswith('123', '1'), + op.startswith('123', '1', 2), + op.startswith(null, '1'), + op.startswith(undefined, '1') + ], + [ false, true, false, false, false ], + 'startswith' + ); + }); + + it('trim trims whitespace from a string', () => { + assert.deepEqual( + [ + op.trim('1'), + op.trim(' 1 '), + op.trim(null), + op.trim(undefined) + ], + [ '1', '1', null, undefined ], + 'trim' + ); + }); }); diff --git a/test/query/query-test.js b/test/query/query-test.js deleted file mode 100644 index 3c9bfdd2..00000000 --- a/test/query/query-test.js +++ /dev/null @@ -1,1217 +0,0 @@ -import tape from 'tape'; -import groupbyEqual from '../groupby-equal'; -import tableEqual from '../table-equal'; -import Query, { query } from '../../src/query/query'; -import { Verbs } from '../../src/query/verb'; -import isFunction from '../../src/util/is-function'; -import { all, desc, not, op, range, rolling, seed, table } from '../../src'; -import { field, func } from './util'; - -const { - count, dedupe, derive, filter, groupby, orderby, - reify, rollup, select, sample, ungroup, unorder, - relocate, rename, impute, fold, pivot, spread, unroll, - cross, join, semijoin, antijoin, - concat, union, except, intersect -} = Verbs; - -tape('Query builds single-table queries', t => { - const q = query() - .derive({ bar: d => d.foo + 1 }) - .rollup({ count: op.count(), sum: op.sum('bar') }) - .orderby('foo', desc('bar'), d => d.baz, desc(d => d.bop)) - .groupby('foo', { baz: d => d.baz, bop: d => d.bop }); 
- - t.deepEqual(q.toObject(), { - verbs: [ - { - verb: 'derive', - values: { bar: func('d => d.foo + 1') }, - options: undefined - }, - { - verb: 'rollup', - values: { - count: func('d => op.count()'), - sum: func('d => op.sum(d["bar"])') - } - }, - { - verb: 'orderby', - keys: [ - field('foo'), - field('bar', { desc: true }), - func('d => d.baz'), - func('d => d.bop', { desc: true }) - ] - }, - { - verb: 'groupby', - keys: [ - 'foo', - { - baz: func('d => d.baz'), - bop: func('d => d.bop') - } - ] - } - ] - }, 'serialized query from builder'); - - t.end(); -}); - -tape('Query supports multi-table verbs', t => { - const q = query() - .concat('concat_table') - .join('join_table'); - - t.deepEqual(q.toObject(), { - verbs: [ - { - verb: 'concat', - tables: ['concat_table'] - }, - { - verb: 'join', - table: 'join_table', - on: undefined, - values: undefined, - options: undefined - } - ] - }, 'serialized query from builder'); - - t.end(); -}); - -tape('Query supports multi-table queries', t => { - const qc = query('concat_table') - .select(not('foo')); - - const qj = query('join_table') - .select(not('bar')); - - const q = query() - .concat(qc) - .join(qj); - - t.deepEqual(q.toObject(), { - verbs: [ - { - verb: 'concat', - tables: [ qc.toObject() ] - }, - { - verb: 'join', - table: qj.toObject(), - on: undefined, - values: undefined, - options: undefined - } - ] - }, 'serialized query from builder'); - - t.end(); -}); - -tape('Query supports all defined verbs', t => { - const verbs = Object.keys(Verbs); - const q = query(); - t.equal( - verbs.filter(v => isFunction(q[v])).length, - verbs.length, - 'query builder supports all verbs' - ); - t.end(); -}); - -tape('Query serializes to objects', t => { - const q = new Query([ - derive({ bar: d => d.foo + 1 }), - rollup({ - count: op.count(), - sum: op.sum('bar') - }), - orderby(['foo', desc('bar'), d => d.baz, desc(d => d.bop)]), - groupby(['foo', { baz: d => d.baz, bop: d => d.bop }]) - ]); - - t.deepEqual(q.toObject(), { - 
verbs: [ - { - verb: 'derive', - values: { bar: func('d => d.foo + 1') }, - options: undefined - }, - { - verb: 'rollup', - values: { - count: func('d => op.count()'), - sum: func('d => op.sum(d["bar"])') - } - }, - { - verb: 'orderby', - keys: [ - field('foo'), - field('bar', { desc: true }), - func('d => d.baz'), - func('d => d.bop', { desc: true }) - ] - }, - { - verb: 'groupby', - keys: [ - 'foo', - { - baz: func('d => d.baz'), - bop: func('d => d.bop') - } - ] - } - ] - }, 'serialized query'); - t.end(); -}); - -tape('Query evaluates unmodified inputs', t => { - const q = new Query([ - derive({ bar: (d, $) => d.foo + $.offset }), - rollup({ count: op.count(), sum: op.sum('bar') }) - ], { offset: 1}); - - const dt = table({ foo: [0, 1, 2, 3] }); - const dr = q.evaluate(dt); - - tableEqual(t, dr, { count: [4], sum: [10] }, 'query data'); - t.end(); -}); - -tape('Query evaluates serialized inputs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([ - derive({ baz: (d, $) => d.foo + $.offset }), - orderby(['bar', 0]), - select([not('bar')]) - ], { offset: 1 }).toObject() - ).evaluate(dt), - { foo: [ 2, 3, 0, 1 ], baz: [ 3, 4, 1, 2 ] }, - 'serialized query data' - ); - - tableEqual( - t, - Query.from( - new Query([ - derive({ bar: (d, $) => d.foo + $.offset }), - rollup({ count: op.count(), sum: op.sum('bar') }) - ], { offset: 1 }).toObject() - ).evaluate(dt), - { count: [4], sum: [10] }, - 'serialized query data' - ); - - t.end(); -}); - -tape('Query evaluates count verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([count()]).toObject() - ).evaluate(dt), - { count: [4] }, - 'count query result' - ); - - tableEqual( - t, - Query.from( - new Query([count({ as: 'cnt' })]).toObject() - ).evaluate(dt), - { cnt: [4] }, - 'count query result, with options' - ); - - t.end(); -}); - -tape('Query evaluates dedupe verbs', t 
=> { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([dedupe([])]).toObject() - ).evaluate(dt), - { foo: [0, 1, 2, 3], bar: [1, 1, 0, 0] }, - 'dedupe query result' - ); - - tableEqual( - t, - Query.from( - new Query([dedupe(['bar'])]).toObject() - ).evaluate(dt), - { foo: [0, 2], bar: [1, 0] }, - 'dedupe query result, key' - ); - - tableEqual( - t, - Query.from( - new Query([dedupe([not('foo')])]).toObject() - ).evaluate(dt), - { foo: [0, 2], bar: [1, 0] }, - 'dedupe query result, key selection' - ); - - t.end(); -}); - -tape('Query evaluates derive verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - const verb = derive( - { - baz: d => d.foo + 1 - op.mean(d.foo), - bop: 'd => 2 * (d.foo - op.mean(d.foo))', - sum: rolling(d => op.sum(d.foo)), - win: rolling(d => op.product(d.foo), [0, 1]) - }, - { - before: 'bar' - } - ); - - tableEqual( - t, - Query.from( - new Query([verb]).toObject() - ).evaluate(dt), - { - foo: [0, 1, 2, 3], - baz: [-0.5, 0.5, 1.5, 2.5], - bop: [-3, -1, 1, 3], - sum: [0, 1, 3, 6], - win: [0, 2, 6, 3], - bar: [1, 1, 0, 0] - }, - 'derive query result' - ); - - t.end(); -}); - -tape('Query evaluates filter verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - const verb = filter(d => d.bar > 0); - - tableEqual( - t, - Query.from( - new Query([verb]).toObject() - ).evaluate(dt), - { - foo: [0, 1], - bar: [1, 1] - }, - 'filter query result' - ); - - t.end(); -}); - -tape('Query evaluates groupby verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - groupbyEqual( - t, - Query.from( - new Query([groupby(['bar'])]).toObject() - ).evaluate(dt), - dt.groupby('bar'), - 'groupby query result' - ); - - groupbyEqual( - t, - Query.from( - new Query([groupby([{bar: d => d.bar}])]).toObject() - ).evaluate(dt), - dt.groupby('bar'), - 'groupby query result, table expression' - ); - - groupbyEqual( 
- t, - Query.from( - new Query([groupby([not('foo')])]).toObject() - ).evaluate(dt), - dt.groupby('bar'), - 'groupby query result, selection' - ); - - t.end(); -}); - -tape('Query evaluates orderby verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([orderby(['bar', 'foo'])]).toObject() - ).evaluate(dt), - { - foo: [2, 3, 0, 1], - bar: [0, 0, 1, 1] - }, - 'orderby query result' - ); - - tableEqual( - t, - Query.from( - new Query([orderby([ - d => d.bar, - d => d.foo - ])]).toObject() - ).evaluate(dt), - { - foo: [2, 3, 0, 1], - bar: [0, 0, 1, 1] - }, - 'orderby query result' - ); - - tableEqual( - t, - Query.from( - new Query([orderby([desc('bar'), desc('foo')])]).toObject() - ).evaluate(dt), - { - foo: [1, 0, 3, 2], - bar: [1, 1, 0, 0] - }, - 'orderby query result, desc' - ); - - t.end(); -}); - -tape('Query evaluates reify verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }).filter(d => d.foo < 1); - - tableEqual( - t, - Query.from( - new Query([ reify() ]).toObject() - ).evaluate(dt), - { foo: [0], bar: [1] }, - 'reify query result' - ); - - t.end(); -}); - -tape('Query evaluates relocate verbs', t => { - const a = [1], b = [2], c = [3], d = [4]; - const dt = table({ a, b, c, d }); - - tableEqual( - t, - Query.from( - new Query([ - relocate('b', { after: 'b' }) - ]).toObject() - ).evaluate(dt), - { a, c, d, b }, - 'relocate query result' - ); - - tableEqual( - t, - Query.from( - new Query([ - relocate(not('b', 'd'), { before: range(0, 1) }) - ]).toObject() - ).evaluate(dt), - { a, c, b, d }, - 'relocate query result' - ); - - t.end(); -}); - -tape('Query evaluates rename verbs', t => { - const a = [1], b = [2], c = [3], d = [4]; - const dt = table({ a, b, c, d }); - - tableEqual( - t, - Query.from( - new Query([ - rename({ d: 'w', a: 'z' }) - ]).toObject() - ).evaluate(dt), - { z: a, b, c, w: d }, - 'rename query result' - ); - - t.end(); -}); - 
-tape('Query evaluates rollup verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([rollup({ - count: op.count(), - sum: op.sum('foo'), - sump1: d => 1 + op.sum(d.foo + d.bar), - avgt2: 'd => 2 * op.mean(op.abs(d.foo))' - })]).toObject() - ).evaluate(dt), - { count: [4], sum: [6], sump1: [9], avgt2: [3] }, - 'rollup query result' - ); - - t.end(); -}); - -tape('Query evaluates sample verbs', t => { - seed(12345); - - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([sample(2)]).toObject() - ).evaluate(dt), - { foo: [ 3, 1 ], bar: [ 0, 1 ] }, - 'sample query result' - ); - - tableEqual( - t, - Query.from( - new Query([sample(2, { replace: true })]).toObject() - ).evaluate(dt), - { foo: [ 3, 0 ], bar: [ 0, 1 ] }, - 'sample query result, replace' - ); - - tableEqual( - t, - Query.from( - new Query([sample(2, { weight: 'foo' })]).toObject() - ).evaluate(dt), - { foo: [ 2, 3 ], bar: [ 0, 0 ] }, - 'sample query result, weight column name' - ); - - tableEqual( - t, - Query.from( - new Query([sample(2, { weight: d => d.foo })]).toObject() - ).evaluate(dt), - { foo: [ 3, 2 ], bar: [ 0, 0 ] }, - 'sample query result, weight table expression' - ); - - seed(null); - t.end(); -}); - -tape('Query evaluates select verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([select(['bar'])]).toObject() - ).evaluate(dt), - { bar: [1, 1, 0, 0] }, - 'select query result, column name' - ); - - tableEqual( - t, - Query.from( - new Query([select([all()])]).toObject() - ).evaluate(dt), - { foo: [0, 1, 2, 3], bar: [1, 1, 0, 0] }, - 'select query result, all' - ); - - tableEqual( - t, - Query.from( - new Query([select([not('foo')])]).toObject() - ).evaluate(dt), - { bar: [1, 1, 0, 0] }, - 'select query result, not' - ); - - tableEqual( - t, - Query.from( - new 
Query([select([range(1, 1)])]).toObject() - ).evaluate(dt), - { bar: [1, 1, 0, 0] }, - 'select query result, range' - ); - - t.end(); -}); - -tape('Query evaluates ungroup verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }).groupby('bar'); - - const qt = Query - .from( new Query([ ungroup() ]).toObject() ) - .evaluate(dt); - - t.equal(qt.isGrouped(), false, 'table is not grouped'); - t.end(); -}); - -tape('Query evaluates unorder verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }).orderby('foo'); - - const qt = Query - .from( new Query([ unorder() ]).toObject() ) - .evaluate(dt); - - t.equal(qt.isOrdered(), false, 'table is not ordered'); - t.end(); -}); - -tape('Query evaluates impute verbs', t => { - const dt = table({ - x: [1, 2], - y: [3, 4], - z: [1, 1] - }); - - const imputed = { - x: [1, 2, 1, 2], - y: [3, 4, 4, 3], - z: [1, 1, 0, 0] - }; - - const verb = impute( - { z: () => 0 }, - { expand: ['x', 'y'] } - ); - - tableEqual( - t, - Query.from( - new Query([verb]).toObject() - ).evaluate(dt), - imputed, - 'impute query result' - ); - - t.end(); -}); - -tape('Query evaluates fold verbs', t => { - const dt = table({ - foo: [0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - const folded = { - key: [ 'foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar' ], - value: [ 0, 1, 1, 1, 2, 0, 3, 0 ] - }; - - tableEqual( - t, - Query.from( - new Query([fold(['foo', 'bar'])]).toObject() - ).evaluate(dt), - folded, - 'fold query result, column names' - ); - - tableEqual( - t, - Query.from( - new Query([fold([all()])]).toObject() - ).evaluate(dt), - folded, - 'fold query result, all' - ); - - tableEqual( - t, - Query.from( - new Query([fold([{ foo: d => d.foo }])]).toObject() - ).evaluate(dt), - { - bar: [ 1, 1, 0, 0 ], - key: [ 'foo', 'foo', 'foo', 'foo' ], - value: [ 0, 1, 2, 3 ] - }, - 'fold query result, table expression' - ); - - t.end(); -}); - -tape('Query evaluates pivot verbs', t => { - const dt = table({ - foo: 
[0, 1, 2, 3], - bar: [1, 1, 0, 0] - }); - - tableEqual( - t, - Query.from( - new Query([pivot(['bar'], ['foo'])]).toObject() - ).evaluate(dt), - { '0': [2], '1': [0] }, - 'pivot query result, column names' - ); - - tableEqual( - t, - Query.from( - new Query([pivot( - [{ bar: d => d.bar }], - [{ foo: op.sum('foo') }] - )]).toObject() - ).evaluate(dt), - { '0': [5], '1': [1] }, - 'pivot query result, table expressions' - ); - - t.end(); -}); - -tape('Query evaluates spread verbs', t => { - const dt = table({ - list: [[1, 2, 3]] - }); - - tableEqual( - t, - Query.from( - new Query([spread(['list'])]).toObject() - ).evaluate(dt), - { - 'list_1': [1], - 'list_2': [2], - 'list_3': [3] - }, - 'spread query result, column names' - ); - - tableEqual( - t, - Query.from( - new Query([spread(['list'], { drop: false })]).toObject() - ).evaluate(dt), - { - 'list': [[1, 2, 3]], - 'list_1': [1], - 'list_2': [2], - 'list_3': [3] - }, - 'spread query result, column names' - ); - - tableEqual( - t, - Query.from( - new Query([spread([{ list: d => d.list }])]).toObject() - ).evaluate(dt), - { - // 'list': [[1, 2, 3]], - 'list_1': [1], - 'list_2': [2], - 'list_3': [3] - }, - 'spread query result, table expression' - ); - - tableEqual( - t, - Query.from( - new Query([spread(['list'], { limit: 2 })]).toObject() - ).evaluate(dt), - { - // 'list': [[1, 2, 3]], - 'list_1': [1], - 'list_2': [2] - }, - 'spread query result, limit' - ); - - t.end(); -}); - -tape('Query evaluates unroll verbs', t => { - const dt = table({ - list: [[1, 2, 3]] - }); - - tableEqual( - t, - Query.from( - new Query([unroll(['list'])]).toObject() - ).evaluate(dt), - { 'list': [1, 2, 3] }, - 'unroll query result, column names' - ); - - tableEqual( - t, - Query.from( - new Query([unroll([{ list: d => d.list }])]).toObject() - ).evaluate(dt), - { 'list': [1, 2, 3] }, - 'unroll query result, table expression' - ); - - tableEqual( - t, - Query.from( - new Query([unroll(['list'], { limit: 2 })]).toObject() - ).evaluate(dt), 
- { 'list': [1, 2] }, - 'unroll query result, limit' - ); - - t.end(); -}); - -tape('Query evaluates cross verbs', t => { - const lt = table({ - x: ['A', 'B'], - y: [1, 2] - }); - - const rt = table({ - u: ['C'], - v: [3] - }); - - const catalog = name => name === 'other' ? rt : null; - - tableEqual( - t, - Query.from( - new Query([ - cross('other') - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], u: ['C', 'C'], v: [3, 3] }, - 'cross query result' - ); - - tableEqual( - t, - Query.from( - new Query([ - cross('other', ['y', 'v']) - ]).toObject() - ).evaluate(lt, catalog), - { y: [1, 2], v: [3, 3] }, - 'cross query result, column name values' - ); - - tableEqual( - t, - Query.from( - new Query([ - cross('other', [ - { y: d => d.y }, - { v: d => d.v } - ]) - ]).toObject() - ).evaluate(lt, catalog), - { y: [1, 2], v: [3, 3] }, - 'cross query result, table expression values' - ); - - tableEqual( - t, - Query.from( - new Query([ - cross('other', { - y: a => a.y, - v: (a, b) => b.v - }) - ]).toObject() - ).evaluate(lt, catalog), - { y: [1, 2], v: [3, 3] }, - 'cross query result, two-table expression values' - ); - - t.end(); -}); - -tape('Query evaluates join verbs', t => { - const lt = table({ - x: ['A', 'B', 'C'], - y: [1, 2, 3] - }); - - const rt = table({ - u: ['A', 'B', 'D'], - v: [4, 5, 6] - }); - - const catalog = name => name === 'other' ? 
rt : null; - - tableEqual( - t, - Query.from( - new Query([ - join('other', ['x', 'u']) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], u: ['A', 'B'], v: [4, 5] }, - 'join query result, column name keys' - ); - - tableEqual( - t, - Query.from( - new Query([ - join('other', (a, b) => op.equal(a.x, b.u)) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], u: ['A', 'B'], v: [4, 5] }, - 'join query result, predicate expression' - ); - - tableEqual( - t, - Query.from( - new Query([ - join('other', ['x', 'u'], [['x', 'y'], 'v']) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], v: [4, 5] }, - 'join query result, column name values' - ); - - tableEqual( - t, - Query.from( - new Query([ - join('other', ['x', 'u'], [all(), not('u')]) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], v: [4, 5] }, - 'join query result, selection values' - ); - - tableEqual( - t, - Query.from( - new Query([ - join('other', ['x', 'u'], [ - { x: d => d.x, y: d => d.y }, - { v: d => d.v } - ]) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], v: [4, 5] }, - 'join query result, table expression values' - ); - - tableEqual( - t, - Query.from( - new Query([ - join('other', ['x', 'u'], { - x: a => a.x, - y: a => a.y, - v: (a, b) => b.v - }) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2], v: [4, 5] }, - 'join query result, two-table expression values' - ); - - tableEqual( - t, - Query.from( - new Query([ - join('other', ['x', 'u'], [['x', 'y'], ['u', 'v']], - { left: true, right: true}) - ]).toObject() - ).evaluate(lt, catalog), - { - x: [ 'A', 'B', 'C', undefined ], - y: [ 1, 2, 3, undefined ], - u: [ 'A', 'B', undefined, 'D' ], - v: [ 4, 5, undefined, 6 ] - }, - 'join query result, full join' - ); - - t.end(); -}); - -tape('Query evaluates semijoin verbs', t => { - const lt = table({ - x: ['A', 'B', 'C'], - y: [1, 2, 3] - }); - - const rt = table({ - u: ['A', 
'B', 'D'], - v: [4, 5, 6] - }); - - const catalog = name => name === 'other' ? rt : null; - - tableEqual( - t, - Query.from( - new Query([ - semijoin('other', ['x', 'u']) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2] }, - 'semijoin query result, column name keys' - ); - - tableEqual( - t, - Query.from( - new Query([ - semijoin('other', (a, b) => op.equal(a.x, b.u)) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B'], y: [1, 2] }, - 'semijoin query result, predicate expression' - ); - - t.end(); -}); - -tape('Query evaluates antijoin verbs', t => { - const lt = table({ - x: ['A', 'B', 'C'], - y: [1, 2, 3] - }); - - const rt = table({ - u: ['A', 'B', 'D'], - v: [4, 5, 6] - }); - - const catalog = name => name === 'other' ? rt : null; - - tableEqual( - t, - Query.from( - new Query([ - antijoin('other', ['x', 'u']) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['C'], y: [3] }, - 'antijoin query result, column name keys' - ); - - tableEqual( - t, - Query.from( - new Query([ - antijoin('other', (a, b) => op.equal(a.x, b.u)) - ]).toObject() - ).evaluate(lt, catalog), - { x: ['C'], y: [3] }, - 'antijoin query result, predicate expression' - ); - - t.end(); -}); - -tape('Query evaluates concat verbs', t => { - const lt = table({ - x: ['A', 'B'], - y: [1, 2] - }); - - const rt = table({ - x: ['B', 'C'], - y: [2, 3] - }); - - const catalog = name => name === 'other' ? rt : null; - - tableEqual( - t, - Query.from( - new Query([ concat(['other']) ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B', 'B', 'C'], y: [1, 2, 2, 3] }, - 'concat query result' - ); - - t.end(); -}); - -tape('Query evaluates concat verbs with subqueries', t => { - const lt = table({ - x: ['A', 'B'], - y: [1, 2] - }); - - const rt = table({ - a: ['B', 'C'], - b: [2, 3] - }); - - const catalog = name => name === 'other' ? 
rt : null; - - const sub = query('other') - .select({ a: 'x', b: 'y' }); - - tableEqual( - t, - Query.from( - new Query([ concat([sub]) ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B', 'B', 'C'], y: [1, 2, 2, 3] }, - 'concat query result' - ); - - t.end(); -}); - -tape('Query evaluates union verbs', t => { - const lt = table({ - x: ['A', 'B'], - y: [1, 2] - }); - - const rt = table({ - x: ['B', 'C'], - y: [2, 3] - }); - - const catalog = name => name === 'other' ? rt : null; - - tableEqual( - t, - Query.from( - new Query([ union(['other']) ]).toObject() - ).evaluate(lt, catalog), - { x: ['A', 'B', 'C'], y: [1, 2, 3] }, - 'union query result' - ); - - t.end(); -}); - -tape('Query evaluates except verbs', t => { - const lt = table({ - x: ['A', 'B'], - y: [1, 2] - }); - - const rt = table({ - x: ['B', 'C'], - y: [2, 3] - }); - - const catalog = name => name === 'other' ? rt : null; - - tableEqual( - t, - Query.from( - new Query([ except(['other']) ]).toObject() - ).evaluate(lt, catalog), - { x: ['A'], y: [1] }, - 'except query result' - ); - - t.end(); -}); - -tape('Query evaluates intersect verbs', t => { - const lt = table({ - x: ['A', 'B'], - y: [1, 2] - }); - - const rt = table({ - x: ['B', 'C'], - y: [2, 3] - }); - - const catalog = name => name === 'other' ? 
rt : null; - - tableEqual( - t, - Query.from( - new Query([ intersect(['other']) ]).toObject() - ).evaluate(lt, catalog), - { x: ['B'], y: [2] }, - 'intersect query result' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/query/util.js b/test/query/util.js deleted file mode 100644 index c7e3e4c1..00000000 --- a/test/query/util.js +++ /dev/null @@ -1,7 +0,0 @@ -export const field = (expr, props) => ({ - expr, field: true, ...props -}); - -export const func = (expr, props) => ({ - expr, func: true, ...props -}); \ No newline at end of file diff --git a/test/query/verb-test.js b/test/query/verb-test.js deleted file mode 100644 index e45f68e7..00000000 --- a/test/query/verb-test.js +++ /dev/null @@ -1,612 +0,0 @@ -import tape from 'tape'; -import { query } from '../../src/query/query'; -import { Verb, Verbs } from '../../src/query/verb'; -import { - all, bin, desc, endswith, matches, not, op, range, rolling, startswith -} from '../../src'; -import { field, func } from './util'; - -const { - count, dedupe, derive, filter, groupby, orderby, - reify, rollup, sample, select, ungroup, unorder, - relocate, rename, impute, pivot, unroll, join, concat -} = Verbs; - -function test(t, verb, expect, msg) { - const object = verb.toObject(); - t.deepEqual(object, expect, msg); - const rt = Verb.from(object).toObject(); - t.deepEqual(rt, expect, msg + ' round-trip'); -} - -tape('count verb serializes to object', t => { - test(t, - count(), - { - verb: 'count', - options: undefined - }, - 'serialized count, no options' - ); - - test(t, - count({ as: 'cnt' }), - { - verb: 'count', - options: { as: 'cnt' } - }, - 'serialized count, with options' - ); - - t.end(); -}); - -tape('dedupe verb serializes to object', t => { - test(t, - dedupe(), - { - verb: 'dedupe', - keys: [] - }, - 'serialized dedupe, no keys' - ); - - test(t, - dedupe(['id', d => d.foo]), - { - verb: 'dedupe', - keys: [ - 'id', - func('d => d.foo') - ] - }, - 'serialized dedupe, with keys' - ); - - 
t.end(); -}); - -tape('derive verb serializes to object', t => { - const verb = derive( - { - foo: 'd.bar * 5', - bar: d => d.foo + 1, - baz: rolling(d => op.mean(d.foo), [-3, 3]) - }, - { - before: 'bop' - } - ); - - test(t, - verb, - { - verb: 'derive', - values: { - foo: 'd.bar * 5', - bar: func('d => d.foo + 1'), - baz: func( - 'd => op.mean(d.foo)', - { window: { frame: [ -3, 3 ], peers: false } } - ) - }, - options: { - before: 'bop' - } - }, - 'serialized derive verb' - ); - - t.end(); -}); - -tape('filter verb serializes to object', t => { - test(t, - filter(d => d.foo > 2), - { - verb: 'filter', - criteria: func('d => d.foo > 2') - }, - 'serialized filter verb' - ); - - t.end(); -}); - -tape('groupby verb serializes to object', t => { - const verb = groupby([ - 'foo', - { baz: d => d.baz, bop: d => d.bop } - ]); - - test(t, - verb, - { - verb: 'groupby', - keys: [ - 'foo', - { - baz: func('d => d.baz'), - bop: func('d => d.bop') - } - ] - }, - 'serialized groupby verb' - ); - - const binVerb = groupby([{ bin0: bin('foo') }]); - - test(t, - binVerb, - { - verb: 'groupby', - keys: [ - { - bin0: 'd => op.bin(d["foo"], ...op.bins(d["foo"]), 0)' - } - ] - }, - 'serialized groupby verb, with binning' - ); - - t.end(); -}); - -tape('orderby verb serializes to object', t => { - const verb = orderby([ - 1, - 'foo', - desc('bar'), - d => d.baz, - desc(d => d.bop) - ]); - - test(t, - verb, - { - verb: 'orderby', - keys: [ - 1, - field('foo'), - field('bar', { desc: true }), - func('d => d.baz'), - func('d => d.bop', { desc: true }) - ] - }, - 'serialized orderby verb' - ); - - t.end(); -}); - -tape('reify verb serializes to AST', t => { - const verb = reify(); - - test(t, - verb, - { verb: 'reify' }, - 'serialized reify verb' - ); - - t.end(); -}); - -tape('relocate verb serializes to object', t => { - test(t, - relocate(['foo', 'bar'], { before: 'baz' }), - { - verb: 'relocate', - columns: ['foo', 'bar'], - options: { before: 'baz' } - }, - 'serialized relocate 
verb' - ); - - test(t, - relocate(not('foo'), { after: range('a', 'b') }), - { - verb: 'relocate', - columns: { not: ['foo'] }, - options: { after: { range: ['a', 'b'] } } - }, - 'serialized relocate verb' - ); - - t.end(); -}); - -tape('rename verb serializes to object', t => { - test(t, - rename([{ foo: 'bar' }]), - { - verb: 'rename', - columns: [{ foo: 'bar' }] - }, - 'serialized rename verb' - ); - - test(t, - rename([{ foo: 'bar' }, { baz: 'bop' }]), - { - verb: 'rename', - columns: [{ foo: 'bar' }, { baz: 'bop' }] - }, - 'serialized rename verb' - ); - - t.end(); -}); - -tape('rollup verb serializes to object', t => { - const verb = rollup({ - count: op.count(), - sum: op.sum('bar'), - mean: d => op.mean(d.foo) - }); - - test(t, - verb, - { - verb: 'rollup', - values: { - count: func('d => op.count()'), - sum: func('d => op.sum(d["bar"])'), - mean: func('d => op.mean(d.foo)') - } - }, - 'serialized rollup verb' - ); - - t.end(); -}); - -tape('sample verb serializes to object', t => { - test(t, - sample(2, { replace: true }), - { - verb: 'sample', - size: 2, - options: { replace: true } - }, - 'serialized sample verb' - ); - - test(t, - sample(() => op.count()), - { - verb: 'sample', - size: { expr: '() => op.count()', func: true }, - options: undefined - }, - 'serialized sample verb, size function' - ); - - test(t, - sample('() => op.count()'), - { - verb: 'sample', - size: '() => op.count()', - options: undefined - }, - 'serialized sample verb, size function as string' - ); - - test(t, - sample(2, { weight: 'foo' }), - { - verb: 'sample', - size: 2, - options: { weight: 'foo' } - }, - 'serialized sample verb, weight column name' - ); - - test(t, - sample(2, { weight: d => 2 * d.foo }), - { - verb: 'sample', - size: 2, - options: { weight: { expr: 'd => 2 * d.foo', func: true } } - }, - 'serialized sample verb, weight table expression' - ); - - t.end(); -}); - -tape('select verb serializes to object', t => { - const verb = select([ - 'foo', - 'bar', - { bop: 
'boo', baz: 'bao' }, - all(), - range(0, 1), - range('a', 'b'), - not('foo', 'bar', range(0, 1), range('a', 'b')), - matches('foo.bar'), - matches(/a|b/i), - startswith('foo.'), - endswith('.baz') - ]); - - test(t, - verb, - { - verb: 'select', - columns: [ - 'foo', - 'bar', - { bop: 'boo', baz: 'bao' }, - { all: [] }, - { range: [0, 1] }, - { range: ['a', 'b'] }, - { - not: [ - 'foo', - 'bar', - { range: [0, 1] }, - { range: ['a', 'b'] } - ] - }, - { matches: ['foo\\.bar', ''] }, - { matches: ['a|b', 'i'] }, - { matches: ['^foo\\.', ''] }, - { matches: ['\\.baz$', ''] } - ] - }, - 'serialized select verb' - ); - - t.end(); -}); - -tape('ungroup verb serializes to object', t => { - test(t, - ungroup(), - { verb: 'ungroup' }, - 'serialized ungroup verb' - ); - - t.end(); -}); - -tape('unorder verb serializes to object', t => { - test(t, - unorder(), - { verb: 'unorder' }, - 'serialized unorder verb' - ); - - t.end(); -}); - -tape('impute verb serializes to object', t => { - const verb = impute( - { v: () => 0 }, - { expand: 'x' } - ); - - test(t, - verb, - { - verb: 'impute', - values: { - v: func('() => 0') - }, - options: { - expand: 'x' - } - }, - 'serialized impute verb' - ); - - t.end(); -}); - -tape('pivot verb serializes to object', t => { - const verb = pivot( - ['key'], - ['value', { sum: d => op.sum(d.foo), prod: op.product('bar') }], - { sort: false } - ); - - test(t, - verb, - { - verb: 'pivot', - keys: ['key'], - values: [ - 'value', - { - sum: func('d => op.sum(d.foo)'), - prod: func('d => op.product(d["bar"])') - } - ], - options: { sort: false } - }, - 'serialized pivot verb' - ); - - t.end(); -}); - -tape('unroll verb serializes to object', t => { - test(t, - unroll(['foo', 1]), - { - verb: 'unroll', - values: [ 'foo', 1 ], - options: undefined - }, - 'serialized unroll verb' - ); - - test(t, - unroll({ - foo: d => d.foo, - bar: d => op.split(d.bar, ' ') - }), - { - verb: 'unroll', - values: { - foo: { expr: 'd => d.foo', func: true }, - bar: { 
expr: 'd => op.split(d.bar, \' \')', func: true } - }, - options: undefined - }, - 'serialized unroll verb, values object' - ); - - test(t, - unroll(['foo'], { index: true }), - { - verb: 'unroll', - values: [ 'foo' ], - options: { index: true } - }, - 'serialized unroll verb, index boolean' - ); - - test(t, - unroll(['foo'], { index: 'idxnum' }), - { - verb: 'unroll', - values: [ 'foo' ], - options: { index: 'idxnum' } - }, - 'serialized unroll verb, index string' - ); - - test(t, - unroll(['foo'], { drop: [ 'bar' ] }), - { - verb: 'unroll', - values: [ 'foo' ], - options: { drop: [ 'bar' ] } - }, - 'serialized unroll verb, drop column name' - ); - - test(t, - unroll(['foo'], { drop: d => d.bar }), - { - verb: 'unroll', - values: [ 'foo' ], - options: { drop: { expr: 'd => d.bar', func: true } } - }, - 'serialized unroll verb, drop table expression' - ); - - t.end(); -}); - -tape('join verb serializes to object', t => { - const verbSel = join( - 'tableRef', - ['keyL', 'keyR'], - [all(), not('keyR')], - { suffix: ['_L', '_R'] } - ); - - test(t, - verbSel, - { - verb: 'join', - table: 'tableRef', - on: [ - [field('keyL')], - [field('keyR')] - ], - values: [ - [ { all: [] } ], - [ { not: ['keyR'] } ] - ], - options: { suffix: ['_L', '_R'] } - }, - 'serialized join verb, column selections' - ); - - const verbCols = join( - 'tableRef', - [ - [d => d.keyL], - [d => d.keyR] - ], - [ - ['keyL', 'valL', { foo: d => 1 + d.valL }], - ['valR', { bar: d => 2 * d.valR }] - ], - { suffix: ['_L', '_R'] } - ); - - test(t, - verbCols, - { - verb: 'join', - table: 'tableRef', - on: [ - [ func('d => d.keyL') ], - [ func('d => d.keyR') ] - ], - values: [ - ['keyL', 'valL', { foo: func('d => 1 + d.valL') } ], - ['valR', { bar: func('d => 2 * d.valR') }] - ], - options: { suffix: ['_L', '_R'] } - }, - 'serialized join verb, column lists' - ); - - const verbExpr = join( - 'tableRef', - (a, b) => op.equal(a.keyL, b.keyR), - { - key: a => a.keyL, - foo: a => a.foo, - bar: (a, b) => b.bar - 
} - ); - - test(t, - verbExpr, - { - verb: 'join', - table: 'tableRef', - on: func('(a, b) => op.equal(a.keyL, b.keyR)'), - values: { - key: func('a => a.keyL'), - foo: func('a => a.foo'), - bar: func('(a, b) => b.bar') - }, - options: undefined - }, - 'serialized join verb, table expressions' - ); - - t.end(); -}); - -tape('concat verb serializes to object', t => { - test(t, - concat(['foo', 'bar']), - { - verb: 'concat', - tables: ['foo', 'bar'] - }, - 'serialized concat verb' - ); - - const ct1 = query('foo').select(not('bar')); - const ct2 = query('bar').select(not('foo')); - - test(t, - concat([ct1, ct2]), - { - verb: 'concat', - tables: [ ct1.toObject(), ct2.toObject() ] - }, - 'serialized concat verb, with subqueries' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/query/verb-to-ast-test.js b/test/query/verb-to-ast-test.js deleted file mode 100644 index 7365a460..00000000 --- a/test/query/verb-to-ast-test.js +++ /dev/null @@ -1,880 +0,0 @@ -import tape from 'tape'; -import { query } from '../../src/query/query'; -import { Verbs } from '../../src/query/verb'; -import { - all, bin, desc, endswith, matches, not, op, range, rolling, startswith -} from '../../src'; - -const { - count, dedupe, derive, filter, groupby, orderby, - reify, rollup, sample, select, ungroup, unorder, - relocate, rename, impute, pivot, unroll, join, concat -} = Verbs; - -function toAST(verb) { - return JSON.parse(JSON.stringify(verb.toAST())); -} - -tape('count verb serializes to AST', t => { - t.deepEqual( - toAST(count()), - { type: 'Verb', verb: 'count' }, - 'ast count, no options' - ); - - t.deepEqual( - toAST(count({ as: 'cnt' })), - { - type: 'Verb', - verb: 'count', - options: { as: 'cnt' } - }, - 'ast count, with options' - ); - - t.end(); -}); - -tape('dedupe verb serializes to AST', t => { - t.deepEqual( - toAST(dedupe()), - { - type: 'Verb', - verb: 'dedupe', - keys: [] - }, - 'ast dedupe, no keys' - ); - - t.deepEqual( - toAST(dedupe(['id', d => d.foo, d 
=> Math.abs(d.bar)])), - { - type: 'Verb', - verb: 'dedupe', - keys: [ - { type: 'Column', name: 'id' }, - { type: 'Column', name: 'foo' }, - { - type: 'CallExpression', - callee: { type: 'Function', name: 'abs' }, - arguments: [ { type: 'Column', name: 'bar' } ] - } - ] - }, - 'ast dedupe, with keys' - ); - t.end(); -}); - -tape('derive verb serializes to AST', t => { - const verb = derive( - { - col: d => d.foo, - foo: 'd.bar * 5', - bar: d => d.foo + 1, - baz: rolling(d => op.mean(d.foo), [-3, 3]) - }, - { - before: 'bop' - } - ); - - t.deepEqual( - toAST(verb), - { - type: 'Verb', - verb: 'derive', - values: [ - { type: 'Column', name: 'foo', as: 'col' }, - { - type: 'BinaryExpression', - left: { type: 'Column', name: 'bar' }, - operator: '*', - right: { type: 'Literal', value: 5, raw: '5' }, - as: 'foo' - }, - { - type: 'BinaryExpression', - left: { type: 'Column', name: 'foo' }, - operator: '+', - right: { type: 'Literal', value: 1, raw: '1' }, - as: 'bar' - }, - { - type: 'Window', - frame: [ -3, 3 ], - peers: false, - expr: { - type: 'CallExpression', - callee: { type: 'Function', name: 'mean' }, - arguments: [ { type: 'Column', name: 'foo' } ] - }, - as: 'baz' - } - ], - options: { - before: [ - { type: 'Column', name: 'bop' } - ] - } - }, - 'ast derive verb' - ); - - t.end(); -}); - -tape('filter verb serializes to AST', t => { - const ast = { - type: 'Verb', - verb: 'filter', - criteria: { - type: 'BinaryExpression', - left: { type: 'Column', name: 'foo' }, - operator: '>', - right: { type: 'Literal', value: 2, raw: '2' } - } - }; - - t.deepEqual( - toAST(filter(d => d.foo > 2)), - ast, - 'ast filter verb' - ); - - t.deepEqual( - toAST(filter('d.foo > 2')), - ast, - 'ast filter verb, expr string' - ); - - t.end(); -}); - -tape('groupby verb serializes to AST', t => { - t.deepEqual( - toAST(groupby([ - 'foo', - 1, - { baz: d => d.baz, bop: d => d.bop, bar: d => Math.abs(d.bar) } - ])), - { - type: 'Verb', - verb: 'groupby', - keys: [ - { type: 'Column', 
name: 'foo' }, - { type: 'Column', index: 1 }, - { type: 'Column', name: 'baz', as: 'baz' }, - { type: 'Column', name: 'bop', as: 'bop' }, - { - type: 'CallExpression', - callee: { type: 'Function', name: 'abs' }, - arguments: [ { type: 'Column', name: 'bar' } ], - as: 'bar' - } - ] - }, - 'ast groupby verb' - ); - - t.deepEqual( - toAST(groupby([{ bin0: bin('foo') }])), - { - type: 'Verb', - verb: 'groupby', - keys: [ - { - as: 'bin0', - type: 'CallExpression', - callee: { type: 'Function', name: 'bin' }, - arguments: [ - { type: 'Column', name: 'foo' }, - { - type: 'SpreadElement', - argument: { - type: 'CallExpression', - callee: { type: 'Function', name: 'bins' }, - arguments: [{ type: 'Column', name: 'foo' }] - } - }, - { type: 'Literal', value: 0, raw: '0' } - ] - } - ] - }, - 'ast groupby verb, with binning' - ); - - t.end(); -}); - -tape('orderby verb serializes to AST', t => { - const verb = orderby([ - 1, - 'foo', - desc('bar'), - d => d.baz, - desc(d => d.bop) - ]); - - t.deepEqual( - toAST(verb), - { - type: 'Verb', - verb: 'orderby', - keys: [ - { type: 'Column', index: 1 }, - { type: 'Column', name: 'foo' }, - { type: 'Descending', expr: { type: 'Column', name: 'bar' } }, - { type: 'Column', name: 'baz' }, - { type: 'Descending', expr: { type: 'Column', name: 'bop' } } - ] - }, - 'ast orderby verb' - ); - - t.end(); -}); - -tape('reify verb serializes to AST', t => { - const verb = reify(); - - t.deepEqual( - toAST(verb), - { type: 'Verb', verb: 'reify' }, - 'ast reify verb' - ); - - t.end(); -}); - -tape('relocate verb serializes to AST', t => { - t.deepEqual( - toAST(relocate(['foo', 'bar'], { before: 'baz' })), - { - type: 'Verb', - verb: 'relocate', - columns: [ - { type: 'Column', name: 'foo' }, - { type: 'Column', name: 'bar' } - ], - options: { - before: [ { type: 'Column', name: 'baz' } ] - } - }, - 'ast relocate verb' - ); - - t.deepEqual( - toAST(relocate(not('foo'), { after: range('a', 'b') })), - { - type: 'Verb', - verb: 'relocate', - 
columns: [ - { - type: 'Selection', - operator: 'not', - arguments: [ { type: 'Column', name: 'foo' } ] - } - ], - options: { - after: [ - { - type: 'Selection', - operator: 'range', - arguments: [ - { type: 'Column', name: 'a' }, - { type: 'Column', name: 'b' } - ] - } - ] - } - }, - 'ast relocate verb' - ); - - t.end(); -}); - -tape('rename verb serializes to AST', t => { - t.deepEqual( - toAST(rename([{ foo: 'bar' }])), - { - type: 'Verb', - verb: 'rename', - columns: [ - { type: 'Column', name: 'foo', as: 'bar' } - ] - }, - 'ast rename verb' - ); - - t.deepEqual( - toAST(rename([{ foo: 'bar' }, { baz: 'bop' }])), - { - type: 'Verb', - verb: 'rename', - columns: [ - { type: 'Column', name: 'foo', as: 'bar' }, - { type: 'Column', name: 'baz', as: 'bop' } - ] - }, - 'ast rename verb' - ); - - t.end(); -}); - -tape('rollup verb serializes to AST', t => { - const verb = rollup({ - count: op.count(), - sum: op.sum('bar'), - mean: d => op.mean(d.foo) - }); - - t.deepEqual( - toAST(verb), - { - type: 'Verb', - verb: 'rollup', - values: [ - { - as: 'count', - type: 'CallExpression', - callee: { type: 'Function', name: 'count' }, - arguments: [] - }, - { - as: 'sum', - type: 'CallExpression', - callee: { type: 'Function', name: 'sum' }, - arguments: [{ type: 'Column', name: 'bar' } ] - }, - { - as: 'mean', - type: 'CallExpression', - callee: { type: 'Function', name: 'mean' }, - arguments: [ { type: 'Column', name: 'foo' } ] - } - ] - }, - 'ast rollup verb' - ); - - t.end(); -}); - -tape('sample verb serializes to AST', t => { - t.deepEqual( - toAST(sample(2, { replace: true })), - { - type: 'Verb', - verb: 'sample', - size: 2, - options: { replace: true } - }, - 'ast sample verb' - ); - - t.deepEqual( - toAST(sample(() => op.count())), - { - type: 'Verb', - verb: 'sample', - size: { - type: 'CallExpression', - callee: { type: 'Function', name: 'count' }, - arguments: [] - } - }, - 'ast sample verb, size function' - ); - - t.deepEqual( - toAST(sample('() => 
op.count()')), - { - type: 'Verb', - verb: 'sample', - size: { - type: 'CallExpression', - callee: { type: 'Function', name: 'count' }, - arguments: [] - } - }, - 'ast sample verb, size function as string' - ); - - t.deepEqual( - toAST(sample(2, { weight: 'foo' })), - { - type: 'Verb', - verb: 'sample', - size: 2, - options: { weight: { type: 'Column', name: 'foo' } } - }, - 'ast sample verb, weight column name' - ); - - t.deepEqual( - toAST(sample(2, { weight: d => 2 * d.foo })), - { - type: 'Verb', - verb: 'sample', - size: 2, - options: { - weight: { - type: 'BinaryExpression', - left: { type: 'Literal', value: 2, raw: '2' }, - operator: '*', - right: { type: 'Column', name: 'foo' } - } - } - }, - 'ast sample verb, weight table expression' - ); - - t.end(); -}); - -tape('select verb serializes to AST', t => { - const verb = select([ - 'foo', - 'bar', - { bop: 'boo', baz: 'bao' }, - all(), - range(0, 1), - range('a', 'b'), - not('foo', 'bar', range(0, 1), range('a', 'b')), - matches('foo.bar'), - matches(/a|b/i), - startswith('foo.'), - endswith('.baz') - ]); - - t.deepEqual( - toAST(verb), - { - type: 'Verb', - verb: 'select', - columns: [ - { type: 'Column', name: 'foo' }, - { type: 'Column', name: 'bar' }, - { type: 'Column', name: 'bop', as: 'boo' }, - { type: 'Column', name: 'baz', as: 'bao' }, - { type: 'Selection', operator: 'all' }, - { - type: 'Selection', - operator: 'range', - arguments: [ - { type: 'Column', index: 0 }, - { type: 'Column', index: 1 } - ] - }, - { - type: 'Selection', - operator: 'range', - arguments: [ - { type: 'Column', name: 'a' }, - { type: 'Column', name: 'b' } - ] - }, - { - type: 'Selection', - operator: 'not', - arguments: [ - { type: 'Column', name: 'foo' }, - { type: 'Column', name: 'bar' }, - { - type: 'Selection', - operator: 'range', - arguments: [ - { type: 'Column', index: 0 }, - { type: 'Column', index: 1 } - ] - }, - { - type: 'Selection', - operator: 'range', - arguments: [ - { type: 'Column', name: 'a' }, - { type: 
'Column', name: 'b' } - ] - } - ] - }, - { - type: 'Selection', - operator: 'matches', - arguments: [ 'foo\\.bar', '' ] - }, - { - type: 'Selection', - operator: 'matches', - arguments: [ 'a|b', 'i' ] - }, - { - type: 'Selection', - operator: 'matches', - arguments: [ '^foo\\.', '' ] - }, - { - type: 'Selection', - operator: 'matches', - arguments: [ '\\.baz$', '' ] - } - ] - }, - 'ast select verb' - ); - - t.end(); -}); - -tape('ungroup verb serializes to AST', t => { - const verb = ungroup(); - - t.deepEqual( - toAST(verb), - { type: 'Verb', verb: 'ungroup' }, - 'ast ungroup verb' - ); - - t.end(); -}); - -tape('unorder verb serializes to AST', t => { - const verb = unorder(); - - t.deepEqual( - toAST(verb), - { type: 'Verb', verb: 'unorder' }, - 'ast unorder verb' - ); - - t.end(); -}); - -tape('pivot verb serializes to AST', t => { - const verb = pivot( - ['key'], - ['value', { sum: d => op.sum(d.foo), prod: op.product('bar') }], - { sort: false } - ); - - t.deepEqual( - toAST(verb), - { - type: 'Verb', - verb: 'pivot', - keys: [ { type: 'Column', name: 'key' } ], - values: [ - { type: 'Column', name: 'value' }, - { - as: 'sum', - type: 'CallExpression', - callee: { type: 'Function', name: 'sum' }, - arguments: [ { type: 'Column', name: 'foo' } ] - }, - { - as: 'prod', - type: 'CallExpression', - callee: { type: 'Function', name: 'product' }, - arguments: [ { type: 'Column', name: 'bar' } ] - } - ], - options: { sort: false } - }, - 'ast pivot verb' - ); - - t.end(); -}); - -tape('impute verb serializes to AST', t => { - const verb = impute( - { v: () => 0 }, - { expand: 'x' } - ); - - t.deepEqual( - toAST(verb), - { - type: 'Verb', - verb: 'impute', - values: [ { as: 'v', type: 'Literal', raw: '0', value: 0 } ], - options: { expand: [ { type: 'Column', name: 'x' } ] } - }, - 'ast impute verb' - ); - - t.end(); -}); - -tape('unroll verb serializes to AST', t => { - t.deepEqual( - toAST(unroll(['foo', 1])), - { - type: 'Verb', - verb: 'unroll', - values: [ - { 
type: 'Column', name: 'foo' }, - { type: 'Column', index: 1 } - ] - }, - 'ast unroll verb' - ); - - t.deepEqual( - toAST(unroll({ - foo: d => d.foo, - bar: d => op.split(d.bar, ' ') - })), - { - type: 'Verb', - verb: 'unroll', - values: [ - { type: 'Column', name: 'foo', as: 'foo' }, - { - as: 'bar', - type: 'CallExpression', - callee: { type: 'Function', name: 'split' }, - arguments: [ - { type: 'Column', name: 'bar' }, - { type: 'Literal', value: ' ', raw: '\' \'' } - ] - } - ] - }, - 'ast unroll verb, values object' - ); - - t.deepEqual( - toAST(unroll(['foo'], { index: true })), - { - type: 'Verb', - verb: 'unroll', - values: [ { type: 'Column', name: 'foo' } ], - options: { index: true } - }, - 'ast unroll verb, index boolean' - ); - - t.deepEqual( - toAST(unroll(['foo'], { index: 'idxnum' })), - { - type: 'Verb', - verb: 'unroll', - values: [ { type: 'Column', name: 'foo' } ], - options: { index: 'idxnum' } - }, - 'ast unroll verb, index string' - ); - - t.deepEqual( - toAST(unroll(['foo'], { drop: [ 'bar' ] })), - { - type: 'Verb', - verb: 'unroll', - values: [ { type: 'Column', name: 'foo' } ], - options: { - drop: [ { type: 'Column', name: 'bar' } ] - } - }, - 'ast unroll verb, drop column name' - ); - - t.deepEqual( - toAST(unroll(['foo'], { drop: d => d.bar })), - { - type: 'Verb', - verb: 'unroll', - values: [ { type: 'Column', name: 'foo' } ], - options: { - drop: [ { type: 'Column', name: 'bar' } ] - } - }, - 'ast unroll verb, drop table expression' - ); - - t.end(); -}); - -tape('join verb serializes to AST', t => { - const verbSel = join( - 'tableRef', - ['keyL', 'keyR'], - [all(), not('keyR')], - { suffix: ['_L', '_R'] } - ); - - t.deepEqual( - toAST(verbSel), - { - type: 'Verb', - verb: 'join', - table: 'tableRef', - on: [ - [ { type: 'Column', name: 'keyL' } ], - [ { type: 'Column', name: 'keyR' } ] - ], - values: [ - [ { type: 'Selection', operator: 'all' } ], - [ { - type: 'Selection', - operator: 'not', - arguments: [ { type: 'Column', name: 
'keyR' } ] - } ] - ], - options: { suffix: ['_L', '_R'] } - }, - 'ast join verb, column selections' - ); - - const verbCols = join( - 'tableRef', - [ - [d => d.keyL], - [d => d.keyR] - ], - [ - ['keyL', 'valL', { foo: d => 1 + d.valL }], - ['valR', { bar: d => 2 * d.valR }] - ], - { suffix: ['_L', '_R'] } - ); - - t.deepEqual( - toAST(verbCols), - { - type: 'Verb', - verb: 'join', - table: 'tableRef', - on: [ - [ { type: 'Column', name: 'keyL' } ], - [ { type: 'Column', name: 'keyR' } ] - ], - values: [ - [ - { type: 'Column', name: 'keyL' }, - { type: 'Column', name: 'valL' }, - { - as: 'foo', - type: 'BinaryExpression', - left: { type: 'Literal', 'value': 1, 'raw': '1' }, - operator: '+', - right: { type: 'Column', name: 'valL' } - } - ], - [ - { type: 'Column', name: 'valR' }, - { - as: 'bar', - type: 'BinaryExpression', - left: { type: 'Literal', 'value': 2, 'raw': '2' }, - operator: '*', - right: { type: 'Column', name: 'valR' } - } - ] - ], - options: { suffix: ['_L', '_R'] } - }, - 'ast join verb, column lists' - ); - - const verbExpr = join( - 'tableRef', - (a, b) => op.equal(a.keyL, b.keyR), - { - key: a => a.keyL, - foo: a => a.foo, - bar: (a, b) => b.bar - } - ); - - t.deepEqual( - toAST(verbExpr), - { - type: 'Verb', - verb: 'join', - table: 'tableRef', - on: { - type: 'CallExpression', - callee: { type: 'Function', name: 'equal' }, - arguments: [ - { type: 'Column', table: 1, name: 'keyL' }, - { type: 'Column', table: 2, name: 'keyR' } - ] - }, - values: [ - { type: 'Column', table: 1, name: 'keyL', as: 'key' }, - { type: 'Column', table: 1, name: 'foo', as: 'foo' }, - { type: 'Column', table: 2, name: 'bar', as: 'bar' } - ] - }, - 'ast join verb, table expressions' - ); - - t.end(); -}); - -tape('concat verb serializes to AST', t => { - t.deepEqual( - toAST(concat(['foo', 'bar'])), - { - type: 'Verb', - verb: 'concat', - tables: ['foo', 'bar'] - }, - 'ast concat verb' - ); - - const ct1 = query('foo').select(not('bar')); - const ct2 = 
query('bar').select(not('foo')); - - t.deepEqual( - toAST(concat([ct1, ct2])), - { - type: 'Verb', - verb: 'concat', - tables: [ - { - type: 'Query', - verbs: [ - { - type: 'Verb', - verb: 'select', - columns: [ - { - type: 'Selection', - operator: 'not', - arguments: [ { type: 'Column', name: 'bar' } ] - } - ] - } - ], - table: 'foo' - }, - { - type: 'Query', - verbs: [ - { - type: 'Verb', - verb: 'select', - columns: [ - { - type: 'Selection', - operator: 'not', - arguments: [ { type: 'Column', name: 'foo' } ] - } - ] - } - ], - table: 'bar' - } - ] - }, - 'ast concat verb, with subqueries' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/register-test.js b/test/register-test.js deleted file mode 100644 index e718fa96..00000000 --- a/test/register-test.js +++ /dev/null @@ -1,236 +0,0 @@ -import tape from 'tape'; -import tableEqual from './table-equal'; -import { aggregateFunctions, functions, windowFunctions } from '../src/op'; -import { ExprObject } from '../src/query/constants'; -import { - addAggregateFunction, - addFunction, - addPackage, - addTableMethod, - addVerb, - addWindowFunction -} from '../src/register'; -import { op, query, table } from '../src'; - -tape('addFunction registers new function', t => { - const SECRET = 0xDEADBEEF; - function secret() { return 0xDEADBEEF; } - - addFunction(secret); - addFunction('sssh', secret); - t.equal(functions.secret(), SECRET, 'add implicitly named function'); - t.equal(functions.sssh(), SECRET, 'add explicitly named function'); - - t.throws( - () => addFunction(() => 'foo'), - 'do not accept anonymous functions' - ); - - t.throws( - () => addFunction('abs', val => val < 0 ? -val : val), - 'do not overwrite existing functions' - ); - - const abs = op.abs; - t.doesNotThrow( - () => { - addFunction('abs', val => val < 0 ? 
-val : val, { override: true }); - addFunction('abs', abs, { override: true }); - }, - 'support override option' - ); - - t.end(); -}); - -tape('addAggregateFunction registers new aggregate function', t => { - const create = () => ({ - init: s => (s.altsign = -1, s.altsum = 0), - add: (s, v) => s.altsum += (s.altsign *= -1) * v, - rem: () => {}, - value: s => s.altsum - }); - - addAggregateFunction('altsum', { create, param: [1, 0] }); - t.deepEqual( - aggregateFunctions.altsum, - { create, param: [1, 0] }, - 'register aggregate function' - ); - t.equal( - table({ x: [1, 2, 3, 4, 5]}).rollup({ a: d => op.altsum(d.x) }).get('a', 0), - 3, 'evaluate aggregate function' - ); - - t.throws( - () => addAggregateFunction('mean', { create }), - 'do not overwrite existing function' - ); - - t.end(); -}); - -tape('addWindowFunction registers new window function', t => { - const create = (offset) => ({ - init: () => {}, - value: (w, f) => w.value(w.index, f) - w.index + (offset || 0) - }); - - addWindowFunction('vmi', { create, param: [1, 1] }); - t.deepEqual( - windowFunctions.vmi, - { create, param: [1, 1] }, - 'register window function' - ); - tableEqual(t, - table({ x: [1, 2, 3, 4, 5] }).derive({ a: d => op.vmi(d.x, 1) }).select('a'), - { a: [2, 2, 2, 2, 2] }, - 'evaluate window function' - ); - - t.throws( - () => addWindowFunction('rank', { create }), - 'do not overwrite existing function' - ); - - t.end(); -}); - -tape('addTableMethod registers a new table method', t => { - const dim1 = (t, ...args) => [t.numRows(), t.numCols(), ...args]; - const dim2 = (t) => [t.numRows(), t.numCols()]; - - addTableMethod('dims', dim1); - - t.deepEqual( - table({ a: [1, 2, 3], b: [4, 5, 6] }).dims('a', 'b'), - [3, 2, 'a', 'b'], - 'register table method' - ); - - t.throws( - () => addTableMethod('_foo', dim1), - 'do not allow names that start with underscore' - ); - - t.throws( - () => addTableMethod('toCSV', dim1, { override: true }), - 'do not override reserved names' - ); - - 
t.doesNotThrow( - () => addTableMethod('dims', dim1), - 'allow reassignment of existing value' - ); - - t.throws( - () => addTableMethod('dims', dim2), - 'do not override without option' - ); - - t.doesNotThrow( - () => addTableMethod('dims', dim2, { override: true }), - 'allow override with option' - ); - - t.deepEqual( - table({ a: [1, 2, 3], b: [4, 5, 6] }).dims('a', 'b'), - [3, 2], - 'register overridden table method' - ); - - t.end(); -}); - -tape('addVerb registers a new verb', t => { - const rup = (t, exprs) => t.rollup(exprs); - - addVerb('rup', rup, [ - { name: 'exprs', type: ExprObject } - ]); - - tableEqual(t, - table({ a: [1, 2, 3], b: [4, 5, 6] }).rup({ sum: op.sum('a') }), - { sum: [ 6 ] }, - 'register verb with table' - ); - - t.deepEqual( - query().rup({ sum: op.sum('a') }).toObject(), - { - verbs: [ - { - verb: 'rup', - exprs: { sum: { expr: 'd => op.sum(d["a"])', func: true } } - } - ] - }, - 'register verb with query' - ); - - t.end(); -}); - -tape('addPackage registers an extension package', t => { - const pkg = { - functions: { - secret_p: () => 0xDEADBEEF - }, - aggregateFunctions: { - altsum_p: { - create: () => ({ - init: s => (s.altsign = -1, s.altsum = 0), - add: (s, v) => s.altsum += (s.altsign *= -1) * v, - rem: () => {}, - value: s => s.altsum - }), - param: [1, 0] - } - }, - windowFunctions: { - vmi_p: { - create: (offset) => ({ - init: () => {}, - value: (w, f) => w.value(w.index, f) - w.index + (offset || 0) - }), - param: [1, 1] - } - }, - tableMethods: { - dims_p: t => [t.numRows(), t.numCols()] - }, - verbs: { - rup_p: { - method: (t, exprs) => t.rollup(exprs), - params: [ { name: 'exprs', type: ExprObject } ] - } - } - }; - - addPackage(pkg); - - t.equal(functions.secret_p, pkg.functions.secret_p, 'functions'); - t.equal(aggregateFunctions.altsum_p, pkg.aggregateFunctions.altsum_p, 'aggregate functions'); - t.equal(windowFunctions.vmi_p, pkg.windowFunctions.vmi_p, 'window functions'); - t.equal(table().dims_p.fn, 
pkg.tableMethods.dims_p, 'table methods'); - t.equal(table().rup_p.fn, pkg.verbs.rup_p.method, 'verbs'); - - t.doesNotThrow( - () => addPackage(pkg), - 'allow reassignment of existing value' - ); - - t.throws( - () => addPackage({ functions: { secret_p: () => 1 } }), - 'do not override without option' - ); - - const secret_p = () => 42; - addPackage({ functions: { secret_p } }, { override: true }); - t.equal( - functions.secret_p, secret_p, - 'allow override with option' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/table-equal.js b/test/table-equal.js index 553238b0..89f95c27 100644 --- a/test/table-equal.js +++ b/test/table-equal.js @@ -1,16 +1,17 @@ -import isArray from '../src/util/is-array'; -import isDate from '../src/util/is-date'; -import isObject from '../src/util/is-object'; -import isRegExp from '../src/util/is-regexp'; -import isTypedArray from '../src/util/is-typed-array'; +import assert from 'node:assert'; +import isArray from '../src/util/is-array.js'; +import isDate from '../src/util/is-date.js'; +import isObject from '../src/util/is-object.js'; +import isRegExp from '../src/util/is-regexp.js'; +import isTypedArray from '../src/util/is-typed-array.js'; -export default function(t, table, data, message) { +export default function(table, data, message) { table = table.reify(); const tableData = {}; for (const name of table.columnNames()) { tableData[name] = Array.from(table.column(name), arrayMap); } - t.deepEqual(tableData, data, message); + assert.deepEqual(tableData, data, message); } function arrayMap(value) { @@ -26,4 +27,4 @@ function objectMap(value) { obj[name] = arrayMap(value[name]); } return obj; -} \ No newline at end of file +} diff --git a/test/table/bitset-test.js b/test/table/bitset-test.js index 98019fa8..00ed0132 100644 --- a/test/table/bitset-test.js +++ b/test/table/bitset-test.js @@ -1,103 +1,99 @@ -import tape from 'tape'; -import BitSet from '../../src/table/bit-set'; - -tape('BitSet manages a set of bits', 
t => { - const buckets = 5; - const size = 32 * buckets; - const bs = new BitSet(32 * buckets); - - // size - t.equal(bs.length, size, 'correct size'); - - // empty initial state - let tally = 0; - for (let i = 0; i < size; ++i) { - if (bs.get(i)) ++tally; - } - t.equal(tally, 0, 'bitset is clear'); - t.equal(bs.count(), 0, 'count = 0'); - - // set bits - let set = []; - const idx = [0, 33, 66, 99, 132]; - idx.forEach(i => bs.set(i)); - for (let i = 0; i < size; ++i) { - if (bs.get(i)) set.push(i); - } - t.deepEqual(set, idx, 'bitset has set bits'); - set = []; - for (let i = 0, b = bs.next(0); i < buckets; ++i, b = bs.next(b + 1)) { - set.push(b); - } - t.deepEqual(set, idx, 'bitset iterates set bits'); - t.equal(bs.count(), buckets, `count = ${buckets}`); - - // clear bits - for (let i = 0; i < buckets; ++i) { - bs.clear(33 * i); - } - tally = 0; - for (let i = 0; i < size; ++i) { - if (bs.get(i)) ++tally; - } - t.equal(tally, 0, 'bitset is clear'); - t.equal(bs.count(), 0, 'count = 0'); - - t.end(); -}); - -tape('BitSet ands with another BitSet', t => { - const ai = [1, 5, 9, 32, 34, 56, 62]; - const bi = [1, 4, 9, 33, 34, 55, 68]; - const a = new BitSet(69); - const b = new BitSet(69); - ai.forEach(i => a.set(i)); - bi.forEach(i => b.set(i)); - - const ab = a.and(b); - const ba = b.and(a); - - t.equal(ab.length, Math.min(a.length, b.length), 'correct size'); - t.equal(ab.length, ba.length, 'matching size'); - - const idx = [1, 9, 34].reduce((m, i) => (m[i] = 1, m), {}); - const flags = ['', '', '']; - for (let i = 0; i < ab.length; ++i) { - flags[0] += idx[i] ? 1 : 0; - flags[1] += ab.get(i) ? 1 : 0; - flags[2] += ba.get(i) ? 
1 : 0; - } - - t.equal(flags[0], flags[1], 'bitset ab values'); - t.equal(flags[0], flags[2], 'bitset ba values'); - - t.end(); -}); - -tape('BitSet ors with another BitSet', t => { - const ai = [1, 5, 9, 32, 34, 56, 62]; - const bi = [1, 4, 9, 33, 34, 55, 68]; - const a = new BitSet(69); - const b = new BitSet(69); - ai.forEach(i => a.set(i)); - bi.forEach(i => b.set(i)); - - const ab = a.or(b); - const ba = b.or(a); - - t.equal(ab.length, Math.max(a.length, b.length), 'correct size'); - t.equal(ab.length, ba.length, 'matching size'); - - const idx = ai.concat(bi).reduce((m, i) => (m[i] = 1, m), {}); - const flags = ['', '', '']; - for (let i = 0; i < ab.length; ++i) { - flags[0] += idx[i] ? 1 : 0; - flags[1] += ab.get(i) ? 1 : 0; - flags[2] += ba.get(i) ? 1 : 0; - } - - t.equal(flags[0], flags[1], 'bitset ab values'); - t.equal(flags[0], flags[2], 'bitset ba values'); - - t.end(); +import assert from 'node:assert'; +import { BitSet } from '../../src/index.js'; + +describe('BitSet', () => { + it('manages a set of bits', () => { + const buckets = 5; + const size = 32 * buckets; + const bs = new BitSet(32 * buckets); + + // size + assert.equal(bs.length, size, 'correct size'); + + // empty initial state + let tally = 0; + for (let i = 0; i < size; ++i) { + if (bs.get(i)) ++tally; + } + assert.equal(tally, 0, 'bitset is clear'); + assert.equal(bs.count(), 0, 'count = 0'); + + // set bits + let set = []; + const idx = [0, 33, 66, 99, 132]; + idx.forEach(i => bs.set(i)); + for (let i = 0; i < size; ++i) { + if (bs.get(i)) set.push(i); + } + assert.deepEqual(set, idx, 'bitset has set bits'); + set = []; + for (let i = 0, b = bs.next(0); i < buckets; ++i, b = bs.next(b + 1)) { + set.push(b); + } + assert.deepEqual(set, idx, 'bitset iterates set bits'); + assert.equal(bs.count(), buckets, `count = ${buckets}`); + + // clear bits + for (let i = 0; i < buckets; ++i) { + bs.clear(33 * i); + } + tally = 0; + for (let i = 0; i < size; ++i) { + if (bs.get(i)) ++tally; + } + 
assert.equal(tally, 0, 'bitset is clear'); + assert.equal(bs.count(), 0, 'count = 0'); + }); + + it('ands with another BitSet', () => { + const ai = [1, 5, 9, 32, 34, 56, 62]; + const bi = [1, 4, 9, 33, 34, 55, 68]; + const a = new BitSet(69); + const b = new BitSet(69); + ai.forEach(i => a.set(i)); + bi.forEach(i => b.set(i)); + + const ab = a.and(b); + const ba = b.and(a); + + assert.equal(ab.length, Math.min(a.length, b.length), 'correct size'); + assert.equal(ab.length, ba.length, 'matching size'); + + const idx = [1, 9, 34].reduce((m, i) => (m[i] = 1, m), {}); + const flags = ['', '', '']; + for (let i = 0; i < ab.length; ++i) { + flags[0] += idx[i] ? 1 : 0; + flags[1] += ab.get(i) ? 1 : 0; + flags[2] += ba.get(i) ? 1 : 0; + } + + assert.equal(flags[0], flags[1], 'bitset ab values'); + assert.equal(flags[0], flags[2], 'bitset ba values'); + }); + + it('ors with another BitSet', () => { + const ai = [1, 5, 9, 32, 34, 56, 62]; + const bi = [1, 4, 9, 33, 34, 55, 68]; + const a = new BitSet(69); + const b = new BitSet(69); + ai.forEach(i => a.set(i)); + bi.forEach(i => b.set(i)); + + const ab = a.or(b); + const ba = b.or(a); + + assert.equal(ab.length, Math.max(a.length, b.length), 'correct size'); + assert.equal(ab.length, ba.length, 'matching size'); + + const idx = ai.concat(bi).reduce((m, i) => (m[i] = 1, m), {}); + const flags = ['', '', '']; + for (let i = 0; i < ab.length; ++i) { + flags[0] += idx[i] ? 1 : 0; + flags[1] += ab.get(i) ? 1 : 0; + flags[2] += ba.get(i) ? 
1 : 0; + } + + assert.equal(flags[0], flags[1], 'bitset ab values'); + assert.equal(flags[0], flags[2], 'bitset ba values'); + }); }); diff --git a/test/table/column-table-test.js b/test/table/column-table-test.js index d10cf0ea..a7b98b0a 100644 --- a/test/table/column-table-test.js +++ b/test/table/column-table-test.js @@ -1,544 +1,52 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { not } from '../../src/helpers/selection'; -import BitSet from '../../src/table/bit-set'; -import ColumnTable from '../../src/table/column-table'; - -tape('ColumnTable supports varied column types', t => { - const data = { - int: Uint32Array.of(1, 2, 3, 4, 5), - num: Float64Array.of(1.2, 2.3, 3.4, 4.5, 5.6), - str: ['a1', 'b2', 'c3', 'd4', 'e5'], - chr: 'abcde', - obj: [{a:1}, {b:2}, {c:3}, {d:4}, {e:5}] - }; - - const ref = { - int: [1, 2, 3, 4, 5], - num: [1.2, 2.3, 3.4, 4.5, 5.6], - str: ['a1', 'b2', 'c3', 'd4', 'e5'], - chr: ['a', 'b', 'c', 'd', 'e'], - obj: [{a:1}, {b:2}, {c:3}, {d:4}, {e:5}] - }; - - const ct = new ColumnTable(data); - - t.equal(ct.numRows(), 5, 'num rows'); - t.equal(ct.numCols(), 5, 'num cols'); - - const rows = [0, 1, 2, 3, 4]; - const get = { - int: rows.map(row => ct.get('int', row)), - num: rows.map(row => ct.get('num', row)), - str: rows.map(row => ct.get('str', row)), - chr: rows.map(row => ct.get('chr', row)), - obj: rows.map(row => ct.get('obj', row)) - }; - t.deepEqual(get, ref, 'extracted get values match'); - - const getters = ['int', 'num', 'str', 'chr', 'obj'].map(name => ct.getter(name)); - const getter = { - int: rows.map(row => getters[0](row)), - num: rows.map(row => getters[1](row)), - str: rows.map(row => getters[2](row)), - chr: rows.map(row => getters[3](row)), - obj: rows.map(row => getters[4](row)) - }; - t.deepEqual(getter, ref, 'extracted getter values match'); - - const arrays = ['int', 'num', 'str', 'chr', 'obj'].map(name => ct.columnArray(name)); - const array = { - int: rows.map(row => arrays[0][row]), 
- num: rows.map(row => arrays[1][row]), - str: rows.map(row => arrays[2][row]), - chr: rows.map(row => arrays[3][row]), - obj: rows.map(row => arrays[4][row]) - }; - t.deepEqual(array, ref, 'extracted columnArray values match'); - - const scanned = { - int: [], - num: [], - str: [], - chr: [], - obj: [] - }; - ct.scan((row, data) => { - for (const col in data) { - scanned[col].push(data[col].get(row)); - } - }); - t.deepEqual(scanned, ref, 'scanned values match'); - - t.end(); -}); - -tape('ColumnTable scan supports filtering and ordering', t => { - const table = new ColumnTable({ - a: ['a', 'a', 'a', 'b', 'b'], - b: [2, 1, 4, 5, 3] +import assert from 'node:assert'; +import { ColumnTable, Table } from '../../src/index.js'; +import * as verbs from '../../src/verbs/index.js'; + +describe('ColumnTable', () => { + it('extends Table', () => { + const dt = new ColumnTable({ x: [1, 2, 3] }); + assert.ok(dt instanceof Table, 'ColumnTable extends Table'); }); - let idx = []; - table.scan(row => idx.push(row), true); - t.deepEqual(idx, [0, 1, 2, 3, 4], 'standard scan'); - - const filter = new BitSet(5); - [1, 2, 4].forEach(i => filter.set(i)); - const ft = table.create({ filter }); - idx = []; - ft.scan(row => idx.push(row), true); - t.deepEqual(idx, [1, 2, 4], 'filtered scan'); - - const order = (u, v, { b }) => b.get(u) - b.get(v); - const ot = table.create({ order }); - t.ok(ot.isOrdered(), 'is ordered'); - idx = []; - ot.scan(row => idx.push(row), true); - t.deepEqual(idx, [1, 0, 4, 2, 3], 'ordered scan'); - - idx = []; - ot.scan(row => idx.push(row)); - t.deepEqual(idx, [0, 1, 2, 3, 4], 'no-order scan'); - - t.end(); -}); - -tape('ColumnTable scan supports early termination', t => { - const table = new ColumnTable({ - a: ['a', 'a', 'a', 'b', 'b'], - b: [2, 1, 4, 5, 3] + it('includes transformation verbs', () => { + const proto = ColumnTable.prototype; + assert.ok( + typeof proto.count === 'function', + 'ColumnTable includes count verb' + ); + for (const verbName of 
Object.keys(verbs)) { + assert.ok( + typeof proto[verbName] === 'function', + `ColumnTable includes ${verbName} verb` + ); + } }); - let count; - const visitor = (row, data, stop) => { if (++count > 1) stop(); }; - - count = 0; - table.scan(visitor, true); - t.equal(count, 2, 'standard scan'); - - count = 0; - const filter = new BitSet(5); - [1, 2, 4].forEach(i => filter.set(i)); - table.create({ filter }).scan(visitor, true); - t.equal(count, 2, 'filtered scan'); - - count = 0; - const order = (u, v, { b }) => b.get(u) - b.get(v); - table.create({ order }).scan(visitor, true); - t.equal(count, 2, 'ordered scan'); - - t.end(); -}); - -tape('ColumnTable memoizes indices', t => { - const ut = new ColumnTable({ v: [1, 3, 2] }); - const ui = ut.indices(false); - t.equal(ut.indices(), ui, 'memoize unordered'); - - const ot = ut.orderby('v'); - const of = ot.indices(false); - const oi = ot.indices(); - t.notEqual(of, oi, 'respect order flag'); - t.equal(ot.indices(), oi, 'memoize ordered'); - t.deepEqual(Array.from(oi), [0, 2, 1], 'indices ordered'); - - t.end(); -}); - -tape('ColumnTable supports column values output', t => { - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .filter(d => d.v > 1) - .orderby('v'); - - t.deepEqual( - Array.from(dt.values('u')), - ['a', 'b', 'a', 'b'], - 'column values, strings' - ); - - t.deepEqual( - Array.from(dt.values('v')), - [2, 3, 4, 5], - 'column values, numbers' - ); - - t.deepEqual( - Int32Array.from(dt.values('v')), - Int32Array.of(2, 3, 4, 5), - 'column values, typed array' - ); - - t.end(); -}); - -tape('ColumnTable supports column array output', t => { - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .filter(d => d.v > 1) - .orderby('v'); - - t.deepEqual( - dt.array('u'), - ['a', 'b', 'a', 'b'], - 'column array, strings' - ); - - t.deepEqual( - dt.array('v'), - [2, 3, 4, 5], - 'column array, numbers' - ); - - t.deepEqual( - dt.array('v', 
Int32Array), - Int32Array.of(2, 3, 4, 5), - 'column array, typed array' - ); - - t.end(); -}); - -tape('ColumnTable supports object output', t => { - const output = [ - { u: 'a', v: 1 }, - { u: 'a', v: 2 }, - { u: 'b', v: 3 }, - { u: 'a', v: 4 }, - { u: 'b', v: 5 } - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.deepEqual(dt.objects(), output, 'object data'); - - t.deepEqual( - dt.objects({ limit: 3 }), - output.slice(0, 3), - 'object data with limit' - ); - - t.deepEqual( - dt.objects({ columns: not('v') }), - output.map(d => ({ u: d.u })), - 'object data with column selection' - ); - - t.deepEqual( - dt.objects({ columns: { u: 'a', v: 'b'} }), - output.map(d => ({ a: d.u, b: d.v })), - 'object data with renaming column selection' - ); - - t.deepEqual( - dt.object(), - output[0], - 'single object, implicit row' - ); - - t.deepEqual( - dt.object(0), - output[0], - 'single object, explicit row' - ); - - t.deepEqual( - dt.object(1), - output[1], - 'single object, explicit row' - ); - - t.end(); -}); - -tape('ColumnTable supports grouped object output', t => { - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.deepEqual( - dt.groupby('u').objects({ grouped: 'object' }), - { - a: [ - { u: 'a', v: 1 }, - { u: 'a', v: 2 }, - { u: 'a', v: 4 } - ], - b: [ - { u: 'b', v: 3 }, - { u: 'b', v: 5 } - ] - }, - 'grouped object output' - ); - - t.deepEqual( - dt.groupby('u').objects({ grouped: 'entries' }), - [ - ['a',[ - { u: 'a', v: 1 }, - { u: 'a', v: 2 }, - { u: 'a', v: 4 } - ]], - ['b',[ - { u: 'b', v: 3 }, - { u: 'b', v: 5 } - ]] - ], - 'grouped entries output' - ); - - t.deepEqual( - dt.groupby('u').objects({ grouped: 'map' }), - new Map([ - ['a',[ - { u: 'a', v: 1 }, - { u: 'a', v: 2 }, - { u: 'a', v: 4 } - ]], - ['b',[ - { u: 'b', v: 3 }, - { u: 'b', v: 5 } - ]] - ]), - 'grouped map output' - ); - - t.deepEqual( - dt.groupby('u').objects({ 
grouped: true }), - new Map([ - ['a',[ - { u: 'a', v: 1 }, - { u: 'a', v: 2 }, - { u: 'a', v: 4 } - ]], - ['b',[ - { u: 'b', v: 3 }, - { u: 'b', v: 5 } - ]] - ]), - 'grouped map output, using true' - ); - - t.deepEqual( - dt.filter(d => d.v < 4).groupby('u').objects({ grouped: 'object' }), - { - a: [ - { u: 'a', v: 1 }, - { u: 'a', v: 2 } - ], - b: [ - { u: 'b', v: 3 } - ] - }, - 'grouped object output, with filter' - ); - - t.deepEqual( - dt.groupby('u').objects({ limit: 3, grouped: 'object' }), - { - a: [ - { u: 'a', v: 1 }, - { u: 'a', v: 2 } - ], - b: [ - { u: 'b', v: 3 } - ] - }, - 'grouped object output, with limit' - ); - - t.deepEqual( - dt.groupby('u').objects({ offset: 2, grouped: 'object' }), - { - a: [ - { u: 'a', v: 4 } - ], - b: [ - { u: 'b', v: 3 }, - { u: 'b', v: 5 } - ] - }, - 'grouped object output, with offset' - ); - - const dt2 = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - w: ['y', 'x', 'y', 'z', 'x'], - v: [2, 1, 4, 5, 3] - }) - .orderby('v'); - - t.deepEqual( - dt2.groupby(['u', 'w']).objects({ grouped: 'object' }), - { - a: { - x: [{ u: 'a', w: 'x', v: 1 }], - y: [{ u: 'a', w: 'y', v: 2 },{ u: 'a', w: 'y', v: 4 }] - }, - b: { - x: [{ u: 'b', w: 'x', v: 3 }], - z: [{ u: 'b', w: 'z', v: 5 }] - } - }, - 'grouped nested object output' - ); - - t.deepEqual( - dt2.groupby(['u', 'w']).objects({ grouped: 'entries' }), - [ - ['a', [ - ['y', [{ u: 'a', w: 'y', v: 2 }, { u: 'a', w: 'y', v: 4 }]], - ['x', [{ u: 'a', w: 'x', v: 1 }]] - ]], - ['b', [ - ['z', [{ u: 'b', w: 'z', v: 5 }]], - ['x', [{ u: 'b', w: 'x', v: 3 }]] - ]] - ], - 'grouped nested entries output' - ); - - t.deepEqual( - dt2.groupby(['u', 'w']).objects({ grouped: 'map' }), - new Map([ - ['a', new Map([ - ['x', [{ u: 'a', w: 'x', v: 1 }]], - ['y', [{ u: 'a', w: 'y', v: 2 },{ u: 'a', w: 'y', v: 4 }]] - ])], - ['b', new Map([ - ['x', [{ u: 'b', w: 'x', v: 3 }]], - ['z', [{ u: 'b', w: 'z', v: 5 }]] - ])] - ]), - 'grouped nested map output' - ); - - t.end(); -}); - -tape('ColumnTable 
supports iterator output', t => { - const output = [ - { u: 'a', v: 2 }, - { u: 'a', v: 1 }, - { u: 'a', v: 4 }, - { u: 'b', v: 5 }, - { u: 'b', v: 3 } - ]; - - const dt = new ColumnTable({ - u: ['a', 'a', 'a', 'b', 'b'], - v: [2, 1, 4, 5, 3] - }); - - t.deepEqual([...dt], output, 'iterator data'); - t.deepEqual( - [...dt.orderby('v')], - output.slice().sort((a, b) => a.v - b.v), - 'iterator data orderby' - ); - - t.end(); -}); - -tape('ColumnTable toString shows table state', t => { - const dt = new ColumnTable({ - a: ['a', 'a', 'a', 'b', 'b'], - b: [2, 1, 4, 5, 3] + it('includes output format methods', () => { + const proto = ColumnTable.prototype; + assert.ok( + typeof proto.toArrow === 'function', + 'ColumnTable includes toArrow' + ); + assert.ok( + typeof proto.toArrowIPC === 'function', + 'ColumnTable includes toArrowIPC' + ); + assert.ok( + typeof proto.toCSV === 'function', + 'ColumnTable includes toCSV' + ); + assert.ok( + typeof proto.toHTML === 'function', + 'ColumnTable includes toHTML' + ); + assert.ok( + typeof proto.toJSON === 'function', + 'ColumnTable includes toJSON' + ); + assert.ok( + typeof proto.toMarkdown === 'function', + 'ColumnTable includes toMarkdown' + ); }); - t.equal( - dt.toString(), - '[object Table: 2 cols x 5 rows]', - 'table toString' - ); - - const filter = new BitSet(5); - [1, 2, 4].forEach(i => filter.set(i)); - t.equal( - dt.create({ filter }).toString(), - '[object Table: 2 cols x 3 rows (5 backing)]', - 'filtered table toString' - ); - - const groups = { names: ['a'], get: [row => dt.get('a', row)], size: 2 }; - t.equal( - dt.create({ groups }).toString(), - '[object Table: 2 cols x 5 rows, 2 groups]', - 'grouped table toString' - ); - - const order = (u, v, { b }) => b[u] - b[v]; - t.equal( - dt.create({ order }).toString(), - '[object Table: 2 cols x 5 rows, ordered]', - 'ordered table toString' - ); - - t.equal( - dt.create({ filter, order, groups }).toString(), - '[object Table: 2 cols x 3 rows (5 backing), 2 groups, 
ordered]', - 'filtered, grouped, ordered table toString' - ); - - t.end(); }); - -tape('ColumnTable assign merges tables', t => { - const t1 = new ColumnTable({ a: [1], b: [2], c: [3] }); - const t2 = new ColumnTable({ b: [-2], d: [4] }); - const t3 = new ColumnTable({ a: [-1], e: [5] }); - const dt = t1.assign(t2, t3); - - tableEqual(t, dt, { - a: [-1], b: [-2], c: [3], d: [4], e: [5] - }, 'assigned data'); - - t.deepEqual( - dt.columnNames(), - ['a', 'b', 'c', 'd', 'e'], - 'assigned names' - ); - - t.throws( - () => t1.assign(new ColumnTable({ c: [1, 2, 3] })), - 'throws on mismatched row counts' - ); - - tableEqual(t, t1.assign({ b: [-2], d: [4] }), { - a: [1], b: [-2], c: [3], d: [4] - }, 'assigned data from object'); - - t.throws( - () => t1.assign({ c: [1, 2, 3] }), - 'throws on mismatched row counts from object' - ); - - t.end(); -}); - -tape('ColumnTable transform applies transformations', t => { - const dt = new ColumnTable({ a: [1, 2], b: [2, 3], c: [3, 4] }); - - tableEqual(t, - dt.transform( - t => t.filter(d => d.c > 3), - t => t.select('a', 'b'), - t => t.reify() - ), - { a: [2], b: [3] }, - 'transform pipeline' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/table/columns-from-test.js b/test/table/columns-from-test.js index bb877be9..a0f9105c 100644 --- a/test/table/columns-from-test.js +++ b/test/table/columns-from-test.js @@ -1,130 +1,123 @@ -import tape from 'tape'; -import columnsFrom from '../../src/table/columns-from'; - -tape('columnsFrom supports array input', t => { - t.deepEqual( - columnsFrom([]), - { }, - 'from empty array, names implicit' - ); - - t.deepEqual( - columnsFrom([], ['a', 'b']), - { a: [], b: [] }, - 'from empty array, names explicit' - ); - - t.deepEqual( - columnsFrom([ {a: 1, b: 2}, {a: 3, b: 4} ]), - { a: [1, 3], b: [2, 4] }, - 'from array, names implicit' - ); - - t.deepEqual( - columnsFrom([ {a: 1, b: 2}, {a: 3, b: 4} ], ['a', 'b']), - { a: [1, 3], b: [2, 4] }, - 'from array, names explicit' - ); - 
- t.deepEqual( - columnsFrom([ {a: 1, b: 2}, {a: 3, b: 4} ], ['b']), - { b: [2, 4] }, - 'from array, names partial' - ); - - t.end(); +import assert from 'node:assert'; +import { columnsFrom } from '../../src/table/columns-from.js'; + +describe('columnsFrom', () => { + it('supports array input', () => { + assert.deepEqual( + columnsFrom([]), + { }, + 'from empty array, names implicit' + ); + + assert.deepEqual( + columnsFrom([], ['a', 'b']), + { a: [], b: [] }, + 'from empty array, names explicit' + ); + + assert.deepEqual( + columnsFrom([ {a: 1, b: 2}, {a: 3, b: 4} ]), + { a: [1, 3], b: [2, 4] }, + 'from array, names implicit' + ); + + assert.deepEqual( + columnsFrom([ {a: 1, b: 2}, {a: 3, b: 4} ], ['a', 'b']), + { a: [1, 3], b: [2, 4] }, + 'from array, names explicit' + ); + + assert.deepEqual( + columnsFrom([ {a: 1, b: 2}, {a: 3, b: 4} ], ['b']), + { b: [2, 4] }, + 'from array, names partial' + ); + }); + + it('supports iterable input', () => { + const data = [ {a: 1, b: 2}, {a: 3, b: 4} ]; + const iterable = { + [Symbol.iterator]: () => data.values() + }; + + assert.deepEqual( + columnsFrom(iterable), + { a: [1, 3], b: [2, 4] }, + 'from iterable, names implicit' + ); + + assert.deepEqual( + columnsFrom(iterable, ['a', 'b']), + { a: [1, 3], b: [2, 4] }, + 'from iterable, names explicit' + ); + + assert.deepEqual( + columnsFrom(iterable, ['b']), + { b: [2, 4] }, + 'from iterable, names partial' + ); + }); + + it('supports object input', () => { + assert.deepEqual( + columnsFrom({}), + { key: [], value: [] }, + 'from empty object, names implicit' + ); + + assert.deepEqual( + columnsFrom([], ['k', 'v']), + { k: [], v: [] }, + 'from empty object, names explicit' + ); + + assert.deepEqual( + columnsFrom({ a: 1, b: 2, c: 3 }), + { key: ['a', 'b', 'c'], value: [1, 2, 3] }, + 'from object, names implicit' + ); + + assert.deepEqual( + columnsFrom({ a: 1, b: 2, c: 3 }, ['k', 'v']), + { k: ['a', 'b', 'c'], v: [1, 2, 3] }, + 'from object, names explicit' + ); + + 
assert.deepEqual( + columnsFrom({ a: 1, b: 2, c: 3 }, [null, 'v']), + { v: [1, 2, 3] }, + 'from object, names partial' + ); + }); + + it('supports map input', () => { + const map = new Map([ ['a', 1], ['b', 2], ['c', 3] ]); + + assert.deepEqual( + columnsFrom(map), + { key: ['a', 'b', 'c'], value: [1, 2, 3] }, + 'from map, names implicit' + ); + + assert.deepEqual( + columnsFrom(map, ['k', 'v']), + { k: ['a', 'b', 'c'], v: [1, 2, 3] }, + 'from map, names explicit' + ); + + assert.deepEqual( + columnsFrom(map, [null, 'v']), + { v: [1, 2, 3] }, + 'from map, names partial' + ); + }); + + it('throws on unsupported type', () => { + assert.throws(() => columnsFrom(true), 'no boolean'); + assert.throws(() => columnsFrom(new Date()), 'no date'); + assert.throws(() => columnsFrom(12.3), 'no number'); + assert.throws(() => columnsFrom(/bop/), 'no regexp'); + assert.throws(() => columnsFrom('foo'), 'no string'); + }); }); - -tape('columnsFrom supports iterable input', t => { - const data = [ {a: 1, b: 2}, {a: 3, b: 4} ]; - const iterable = { - [Symbol.iterator]: () => data.values() - }; - - t.deepEqual( - columnsFrom(iterable), - { a: [1, 3], b: [2, 4] }, - 'from iterable, names implicit' - ); - - t.deepEqual( - columnsFrom(iterable, ['a', 'b']), - { a: [1, 3], b: [2, 4] }, - 'from iterable, names explicit' - ); - - t.deepEqual( - columnsFrom(iterable, ['b']), - { b: [2, 4] }, - 'from iterable, names partial' - ); - - t.end(); -}); - -tape('columnsFrom supports object input', t => { - t.deepEqual( - columnsFrom({}), - { key: [], value: [] }, - 'from empty object, names implicit' - ); - - t.deepEqual( - columnsFrom([], ['k', 'v']), - { k: [], v: [] }, - 'from empty object, names explicit' - ); - - t.deepEqual( - columnsFrom({ a: 1, b: 2, c: 3 }), - { key: ['a', 'b', 'c'], value: [1, 2, 3] }, - 'from object, names implicit' - ); - - t.deepEqual( - columnsFrom({ a: 1, b: 2, c: 3 }, ['k', 'v']), - { k: ['a', 'b', 'c'], v: [1, 2, 3] }, - 'from object, names explicit' - ); - - 
t.deepEqual( - columnsFrom({ a: 1, b: 2, c: 3 }, [null, 'v']), - { v: [1, 2, 3] }, - 'from object, names partial' - ); - - t.end(); -}); - -tape('columnsFrom supports map input', t => { - const map = new Map([ ['a', 1], ['b', 2], ['c', 3] ]); - - t.deepEqual( - columnsFrom(map), - { key: ['a', 'b', 'c'], value: [1, 2, 3] }, - 'from map, names implicit' - ); - - t.deepEqual( - columnsFrom(map, ['k', 'v']), - { k: ['a', 'b', 'c'], v: [1, 2, 3] }, - 'from map, names explicit' - ); - - t.deepEqual( - columnsFrom(map, [null, 'v']), - { v: [1, 2, 3] }, - 'from map, names partial' - ); - - t.end(); -}); - -tape('columnsFrom throws on unsupported type', t => { - t.throws(() => columnsFrom(true), 'no boolean'); - t.throws(() => columnsFrom(new Date()), 'no date'); - t.throws(() => columnsFrom(12.3), 'no number'); - t.throws(() => columnsFrom(/bop/), 'no regexp'); - t.throws(() => columnsFrom('foo'), 'no string'); - t.end(); -}); \ No newline at end of file diff --git a/test/table/table-test.js b/test/table/table-test.js new file mode 100644 index 00000000..e2146d5d --- /dev/null +++ b/test/table/table-test.js @@ -0,0 +1,491 @@ +import assert from 'node:assert'; +import { BitSet, Table, not } from '../../src/index.js'; +import { filter, groupby, orderby } from '../../src/verbs/index.js'; + +describe('Table', () => { + it('supports varied column types', () => { + const data = { + int: Uint32Array.of(1, 2, 3, 4, 5), + num: Float64Array.of(1.2, 2.3, 3.4, 4.5, 5.6), + str: ['a1', 'b2', 'c3', 'd4', 'e5'], + chr: 'abcde', + obj: [{a:1}, {b:2}, {c:3}, {d:4}, {e:5}] + }; + + const ref = { + int: [1, 2, 3, 4, 5], + num: [1.2, 2.3, 3.4, 4.5, 5.6], + str: ['a1', 'b2', 'c3', 'd4', 'e5'], + chr: ['a', 'b', 'c', 'd', 'e'], + obj: [{a:1}, {b:2}, {c:3}, {d:4}, {e:5}] + }; + + const ct = new Table(data); + + assert.equal(ct.numRows(), 5, 'num rows'); + assert.equal(ct.numCols(), 5, 'num cols'); + + const rows = [0, 1, 2, 3, 4]; + const get = { + int: rows.map(row => ct.get('int', row)), + 
num: rows.map(row => ct.get('num', row)), + str: rows.map(row => ct.get('str', row)), + chr: rows.map(row => ct.get('chr', row)), + obj: rows.map(row => ct.get('obj', row)) + }; + assert.deepEqual(get, ref, 'extracted get values match'); + + const getters = ['int', 'num', 'str', 'chr', 'obj'].map(name => ct.getter(name)); + const getter = { + int: rows.map(row => getters[0](row)), + num: rows.map(row => getters[1](row)), + str: rows.map(row => getters[2](row)), + chr: rows.map(row => getters[3](row)), + obj: rows.map(row => getters[4](row)) + }; + assert.deepEqual(getter, ref, 'extracted getter values match'); + + const arrays = ['int', 'num', 'str', 'chr', 'obj'].map(name => ct.array(name)); + const array = { + int: rows.map(row => arrays[0][row]), + num: rows.map(row => arrays[1][row]), + str: rows.map(row => arrays[2][row]), + chr: rows.map(row => arrays[3][row]), + obj: rows.map(row => arrays[4][row]) + }; + assert.deepEqual(array, ref, 'extracted array values match'); + + const scanned = { + int: [], + num: [], + str: [], + chr: [], + obj: [] + }; + ct.scan((row, data) => { + for (const col in data) { + scanned[col].push(data[col].at(row)); + } + }); + assert.deepEqual(scanned, ref, 'scanned values match'); + }); + + it('copies and freezes column object', () => { + const cols = { x: [1, 2, 3 ]}; + const table = new Table(cols); + const data = table.data(); + assert.notStrictEqual(data, cols, 'is copied'); + assert.strictEqual(data, table._data, 'is direct'); + assert.ok(Object.isFrozen(data), 'is frozen'); + assert.throws(() => data.y = [4, 5, 6], 'throws on edit'); + }); + + it('copies and freezes column name list', () => { + const names = ['y', 'x']; + const cols = { x: [1, 2, 3 ], y: [4, 5, 6]}; + const table = new Table(cols, names); + assert.notStrictEqual(table._names, names, 'is copied'); + assert.ok(Object.isFrozen(table._names), 'is frozen'); + assert.throws(() => table._names.push('z'), 'throws on edit'); + }); + + it('scan supports filtering and 
ordering', () => { + const table = new Table({ + a: ['a', 'a', 'a', 'b', 'b'], + b: [2, 1, 4, 5, 3] + }); + + let idx = []; + table.scan(row => idx.push(row), true); + assert.deepEqual(idx, [0, 1, 2, 3, 4], 'standard scan'); + + const filter = new BitSet(5); + [1, 2, 4].forEach(i => filter.set(i)); + const ft = table.create({ filter }); + idx = []; + ft.scan(row => idx.push(row), true); + assert.deepEqual(idx, [1, 2, 4], 'filtered scan'); + + const order = (u, v, { b }) => b.at(u) - b.at(v); + const ot = table.create({ order }); + assert.ok(ot.isOrdered(), 'is ordered'); + idx = []; + ot.scan(row => idx.push(row), true); + assert.deepEqual(idx, [1, 0, 4, 2, 3], 'ordered scan'); + + idx = []; + ot.scan(row => idx.push(row)); + assert.deepEqual(idx, [0, 1, 2, 3, 4], 'no-order scan'); + }); + + it('scan supports early termination', () => { + const table = new Table({ + a: ['a', 'a', 'a', 'b', 'b'], + b: [2, 1, 4, 5, 3] + }); + + let count; + const visitor = (row, data, stop) => { if (++count > 1) stop(); }; + + count = 0; + table.scan(visitor, true); + assert.equal(count, 2, 'standard scan'); + + count = 0; + const filter = new BitSet(5); + [1, 2, 4].forEach(i => filter.set(i)); + table.create({ filter }).scan(visitor, true); + assert.equal(count, 2, 'filtered scan'); + + count = 0; + const order = (u, v, { b }) => b.at(u) - b.at(v); + table.create({ order }).scan(visitor, true); + assert.equal(count, 2, 'ordered scan'); + }); + + it('memoizes indices', () => { + const ut = new Table({ v: [1, 3, 2] }); + const ui = ut.indices(false); + assert.equal(ut.indices(), ui, 'memoize unordered'); + + const ot = orderby(ut, 'v'); + const of = ot.indices(false); + const oi = ot.indices(); + assert.notEqual(of, oi, 'respect order flag'); + assert.equal(ot.indices(), oi, 'memoize ordered'); + assert.deepEqual(Array.from(oi), [0, 2, 1], 'indices ordered'); + }); + + it('supports column values output', () => { + const t = new Table({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 
3] + }); + const dt = orderby(filter(t, d => d.v > 1), 'v'); + + assert.deepEqual( + Array.from(dt.values('u')), + ['a', 'b', 'a', 'b'], + 'column values, strings' + ); + + assert.deepEqual( + Array.from(dt.values('v')), + [2, 3, 4, 5], + 'column values, numbers' + ); + + assert.deepEqual( + Int32Array.from(dt.values('v')), + Int32Array.of(2, 3, 4, 5), + 'column values, typed array' + ); + }); + + it('supports column array output', () => { + const t = new Table({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }); + const dt = orderby(filter(t, d => d.v > 1), 'v'); + + assert.deepEqual( + dt.array('u'), + ['a', 'b', 'a', 'b'], + 'column array, strings' + ); + + assert.deepEqual( + dt.array('v'), + [2, 3, 4, 5], + 'column array, numbers' + ); + + assert.deepEqual( + dt.array('v', Int32Array), + Int32Array.of(2, 3, 4, 5), + 'column array, typed array' + ); + }); + + it('supports object output', () => { + const output = [ + { u: 'a', v: 1 }, + { u: 'a', v: 2 }, + { u: 'b', v: 3 }, + { u: 'a', v: 4 }, + { u: 'b', v: 5 } + ]; + + const dt = orderby(new Table({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }), 'v'); + + assert.deepEqual(dt.objects(), output, 'object data'); + + assert.deepEqual( + dt.objects({ limit: 3 }), + output.slice(0, 3), + 'object data with limit' + ); + + assert.deepEqual( + dt.objects({ columns: not('v') }), + output.map(d => ({ u: d.u })), + 'object data with column selection' + ); + + assert.deepEqual( + dt.objects({ columns: { u: 'a', v: 'b'} }), + output.map(d => ({ a: d.u, b: d.v })), + 'object data with renaming column selection' + ); + + assert.deepEqual( + dt.object(), + output[0], + 'single object, implicit row' + ); + + assert.deepEqual( + dt.object(0), + output[0], + 'single object, explicit row' + ); + + assert.deepEqual( + dt.object(1), + output[1], + 'single object, explicit row' + ); + }); + + it('supports grouped object output', () => { + const dt = orderby(new Table({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 
4, 5, 3] + }), 'v'); + + assert.deepEqual( + groupby(dt, 'u').objects({ grouped: 'object' }), + { + a: [ + { u: 'a', v: 1 }, + { u: 'a', v: 2 }, + { u: 'a', v: 4 } + ], + b: [ + { u: 'b', v: 3 }, + { u: 'b', v: 5 } + ] + }, + 'grouped object output' + ); + + assert.deepEqual( + groupby(dt, 'u').objects({ grouped: 'entries' }), + [ + ['a',[ + { u: 'a', v: 1 }, + { u: 'a', v: 2 }, + { u: 'a', v: 4 } + ]], + ['b',[ + { u: 'b', v: 3 }, + { u: 'b', v: 5 } + ]] + ], + 'grouped entries output' + ); + + assert.deepEqual( + groupby(dt, 'u').objects({ grouped: 'map' }), + new Map([ + ['a',[ + { u: 'a', v: 1 }, + { u: 'a', v: 2 }, + { u: 'a', v: 4 } + ]], + ['b',[ + { u: 'b', v: 3 }, + { u: 'b', v: 5 } + ]] + ]), + 'grouped map output' + ); + + assert.deepEqual( + groupby(dt, 'u').objects({ grouped: true }), + new Map([ + ['a',[ + { u: 'a', v: 1 }, + { u: 'a', v: 2 }, + { u: 'a', v: 4 } + ]], + ['b',[ + { u: 'b', v: 3 }, + { u: 'b', v: 5 } + ]] + ]), + 'grouped map output, using true' + ); + + assert.deepEqual( + groupby(filter(dt, d => d.v < 4), 'u') + .objects({ grouped: 'object' }), + { + a: [ + { u: 'a', v: 1 }, + { u: 'a', v: 2 } + ], + b: [ + { u: 'b', v: 3 } + ] + }, + 'grouped object output, with filter' + ); + + assert.deepEqual( + groupby(dt, 'u').objects({ limit: 3, grouped: 'object' }), + { + a: [ + { u: 'a', v: 1 }, + { u: 'a', v: 2 } + ], + b: [ + { u: 'b', v: 3 } + ] + }, + 'grouped object output, with limit' + ); + + assert.deepEqual( + groupby(dt, 'u').objects({ offset: 2, grouped: 'object' }), + { + a: [ + { u: 'a', v: 4 } + ], + b: [ + { u: 'b', v: 3 }, + { u: 'b', v: 5 } + ] + }, + 'grouped object output, with offset' + ); + + const dt2 = orderby(new Table({ + u: ['a', 'a', 'a', 'b', 'b'], + w: ['y', 'x', 'y', 'z', 'x'], + v: [2, 1, 4, 5, 3] + }), 'v'); + + assert.deepEqual( + groupby(dt2, ['u', 'w']).objects({ grouped: 'object' }), + { + a: { + x: [{ u: 'a', w: 'x', v: 1 }], + y: [{ u: 'a', w: 'y', v: 2 },{ u: 'a', w: 'y', v: 4 }] + }, + b: { + x: [{ u: 
'b', w: 'x', v: 3 }], + z: [{ u: 'b', w: 'z', v: 5 }] + } + }, + 'grouped nested object output' + ); + + assert.deepEqual( + groupby(dt2, ['u', 'w']).objects({ grouped: 'entries' }), + [ + ['a', [ + ['y', [{ u: 'a', w: 'y', v: 2 }, { u: 'a', w: 'y', v: 4 }]], + ['x', [{ u: 'a', w: 'x', v: 1 }]] + ]], + ['b', [ + ['z', [{ u: 'b', w: 'z', v: 5 }]], + ['x', [{ u: 'b', w: 'x', v: 3 }]] + ]] + ], + 'grouped nested entries output' + ); + + assert.deepEqual( + groupby(dt2, ['u', 'w']).objects({ grouped: 'map' }), + new Map([ + ['a', new Map([ + ['x', [{ u: 'a', w: 'x', v: 1 }]], + ['y', [{ u: 'a', w: 'y', v: 2 },{ u: 'a', w: 'y', v: 4 }]] + ])], + ['b', new Map([ + ['x', [{ u: 'b', w: 'x', v: 3 }]], + ['z', [{ u: 'b', w: 'z', v: 5 }]] + ])] + ]), + 'grouped nested map output' + ); + }); + + it('supports iterator output', () => { + const output = [ + { u: 'a', v: 2 }, + { u: 'a', v: 1 }, + { u: 'a', v: 4 }, + { u: 'b', v: 5 }, + { u: 'b', v: 3 } + ]; + + const dt = new Table({ + u: ['a', 'a', 'a', 'b', 'b'], + v: [2, 1, 4, 5, 3] + }); + + assert.deepEqual([...dt], output, 'iterator data'); + assert.deepEqual( + [...orderby(dt, 'v')], + output.slice().sort((a, b) => a.v - b.v), + 'iterator data orderby' + ); + }); + + it('toString shows table state', () => { + const dt = new Table({ + a: ['a', 'a', 'a', 'b', 'b'], + b: [2, 1, 4, 5, 3] + }); + + assert.equal( + dt.toString(), + '[object Table: 2 cols x 5 rows]', + 'table toString' + ); + + const filter = new BitSet(5); + [1, 2, 4].forEach(i => filter.set(i)); + assert.equal( + dt.create({ filter }).toString(), + '[object Table: 2 cols x 3 rows (5 backing)]', + 'filtered table toString' + ); + + const groups = { names: ['a'], get: [row => dt.get('a', row)], size: 2 }; + assert.equal( + dt.create({ groups }).toString(), + '[object Table: 2 cols x 5 rows, 2 groups]', + 'grouped table toString' + ); + + const order = (u, v, { b }) => b[u] - b[v]; + assert.equal( + dt.create({ order }).toString(), + '[object Table: 2 cols x 5 
rows, ordered]', + 'ordered table toString' + ); + + assert.equal( + dt.create({ filter, order, groups }).toString(), + '[object Table: 2 cols x 3 rows (5 backing), 2 groups, ordered]', + 'filtered, grouped, ordered table toString' + ); + }); +}); diff --git a/test/types-test.ts b/test/types-test.ts new file mode 100644 index 00000000..b6422555 --- /dev/null +++ b/test/types-test.ts @@ -0,0 +1,104 @@ +// Example code that should not cause any TypeScript errors +import * as aq from '../src/api.js'; +const { op } = aq; + +const dt = aq.table({ + x: [1, 2, 3], + y: ['a', 'b', 'c'] +}); +const other = aq.table({ u: [3, 2, 1 ] }); +const other2 = aq.table({ x: [4, 5, 6 ] }); + +export const rt = dt + .antijoin(other) + .antijoin(other, ['keyL', 'keyR']) + .antijoin(other, (a, b) => op.equal(a.keyL, b.keyR)) + .assign({ z: [4, 5, 6] }, other) + .concat(other) + .concat(other, other2) + .concat([other, other2]) + .count() + .count({ as: 'foo' }) + .cross(other) + .cross(other, [['leftKey', 'leftVal'], ['rightVal']]) + .dedupe() + .dedupe('y') + .derive({ + row1: op.row_number(), + lead1: op.lead('s'), + row2: () => op.row_number(), + lead2: (d: {s: string}) => op.lead(op.trim(d.s)), + z: (d: {x: number}) => (d.x - op.average(d.x)) / op.stdev(d.x), + avg: aq.rolling( + (d: {x: number}) => op.average(d.x), + [-5, 5] + ), + mix: (d: any) => d.x > 2 ? 
d.u : d.z + }) + .except(other) + .except(other, other2) + .except([other, other2]) + .filter((d: any) => d.x > 2 && d.s !== 'foo') + .filter((d: {x: number, s: string}) => d.x > 2 && d.s !== 'foo') + .fold('colA') + .fold(['colA', 'colB'], { as: ['k', 'v'] }) + .groupby('y') + .ungroup() + .groupby({ g: 'y' }) + .ungroup() + .impute({ v: () => 0 }) + .impute({ v: (d: {v: number}) => op.mean(d.v) }) + .impute({ v: () => 0 }, { expand: ['x', 'y'] }) + .intersect(other) + .intersect(other, other2) + .intersect([other, other2]) + .join(other, ['keyL', 'keyR']) + .join(other, (a, b) => op.equal(a.keyL, b.keyR)) + .join_left(other, ['keyL', 'keyR']) + .join_left(other, (a, b) => op.equal(a.keyL, b.keyR)) + .join_right(other, ['keyL', 'keyR']) + .join_right(other, (a, b) => op.equal(a.keyL, b.keyR)) + .join_full(other, ['keyL', 'keyR']) + .join_full(other, (a, b) => op.equal(a.keyL, b.keyR)) + .lookup(other, ['key1', 'key2'], 'value1', 'value2') + .orderby('x', aq.desc('u')) + .unorder() + .pivot('key', 'value') + .pivot(['keyA', 'keyB'], ['valueA', 'valueB']) + .pivot({ key: (d: any) => d.key }, { value: (d: any) => op.sum(d.value) }) + .relocate(['x', 'y'], { after: 'z' }) + .rename({ x: 'xx', y: 'yy' }) + .rollup({ + min1: op.min('x'), + max1: op.max('x'), + sum1: op.sum('x'), + mode1: op.mode('x'), + min2: (d: {x: number}) => op.min(d.x), + max2: (d: {s: string}) => op.max(d.s), + sum2: (d: {x: number}) => op.sum(d.x), + mode2: (d: {d: Date}) => op.mode(d.d), + mix: (d: {x: number, z: number}) => op.min(d.x) + op.sum(d.z) + }) + .sample(100) + .sample(100, { replace: true }) + .select('x') + .select({ x: 'xx' }) + .select(aq.all(), aq.not('x'), aq.range(0, 5)) + .semijoin(other) + .semijoin(other, ['keyL', 'keyR']) + .semijoin(other, (a, b) => op.equal(a.keyL, b.keyR)) + .slice(1, -1) + .slice(2) + .spread({ a: (d: any) => op.split(d.y, '') }) + .spread('arrayCol', { limit: 100 }) + .union(other) + .union(other, other2) + .union([other, other2]) + .unroll('arrayCol', 
{ limit: 1000 }); + +export const arrow : import('apache-arrow').Table = dt.toArrow(); +export const buf : Uint8Array = dt.toArrowIPC(); +export const csv : string = dt.toCSV({ delimiter: '\t' }); +export const json : string = dt.toJSON({ columns: ['x', 'y'] }); +export const html : string = dt.toHTML(); +export const md : string = dt.toMarkdown(); diff --git a/test/verbs/assign-test.js b/test/verbs/assign-test.js new file mode 100644 index 00000000..c256f29f --- /dev/null +++ b/test/verbs/assign-test.js @@ -0,0 +1,36 @@ +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; + +describe('assign', () => { + it('assign merges tables', () => { + const t1 = table({ a: [1], b: [2], c: [3] }); + const t2 = table({ b: [-2], d: [4] }); + const t3 = table({ a: [-1], e: [5] }); + const dt = t1.assign(t2, t3); + + tableEqual(dt, { + a: [-1], b: [-2], c: [3], d: [4], e: [5] + }, 'assigned data'); + + assert.deepEqual( + dt.columnNames(), + ['a', 'b', 'c', 'd', 'e'], + 'assigned names' + ); + + assert.throws( + () => t1.assign(table({ c: [1, 2, 3] })), + 'throws on mismatched row counts' + ); + + tableEqual(t1.assign({ b: [-2], d: [4] }), { + a: [1], b: [-2], c: [3], d: [4] + }, 'assigned data from object'); + + assert.throws( + () => t1.assign({ c: [1, 2, 3] }), + 'throws on mismatched row counts from object' + ); + }); +}); diff --git a/test/verbs/concat-test.js b/test/verbs/concat-test.js index e5627e4f..3d63b4ab 100644 --- a/test/verbs/concat-test.js +++ b/test/verbs/concat-test.js @@ -1,42 +1,40 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; -tape('concat combines tables', t => { - const t1 = table({ a: [1, 2], b: [3, 4] }); - const t2 = table({ a: [3, 4], c: [5, 6] }); - const dt = t1.concat(t2); +describe('concat', () => 
{ + it('combines tables', () => { + const t1 = table({ a: [1, 2], b: [3, 4] }); + const t2 = table({ a: [3, 4], c: [5, 6] }); + const dt = t1.concat(t2); - t.equal(dt.numRows(), 4, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2, 3, 4], - b: [3, 4, undefined, undefined] - }, 'concat data'); + assert.equal(dt.numRows(), 4, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2, 3, 4], + b: [3, 4, undefined, undefined] + }, 'concat data'); + }); - t.end(); -}); - -tape('concat combines multiple tables', t => { - const t1 = table({ a: [1, 2], b: [3, 4] }); - const t2 = table({ a: [3, 4], c: [5, 6] }); - const t3 = table({ a: [5, 6], b: [7, 8] }); + it('combines multiple tables', () => { + const t1 = table({ a: [1, 2], b: [3, 4] }); + const t2 = table({ a: [3, 4], c: [5, 6] }); + const t3 = table({ a: [5, 6], b: [7, 8] }); - const dt = t1.concat(t2, t3); - t.equal(dt.numRows(), 6, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2, 3, 4, 5, 6], - b: [3, 4, undefined, undefined, 7, 8] - }, 'concat data'); + const dt = t1.concat(t2, t3); + assert.equal(dt.numRows(), 6, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2, 3, 4, 5, 6], + b: [3, 4, undefined, undefined, 7, 8] + }, 'concat data'); - const at = t1.concat([t2, t3]); - t.equal(at.numRows(), 6, 'num rows'); - t.equal(at.numCols(), 2, 'num cols'); - tableEqual(t, at, { - a: [1, 2, 3, 4, 5, 6], - b: [3, 4, undefined, undefined, 7, 8] - }, 'concat data'); - - t.end(); -}); \ No newline at end of file + const at = t1.concat([t2, t3]); + assert.equal(at.numRows(), 6, 'num rows'); + assert.equal(at.numCols(), 2, 'num cols'); + tableEqual(at, { + a: [1, 2, 3, 4, 5, 6], + b: [3, 4, undefined, undefined, 7, 8] + }, 'concat data'); + }); +}); diff --git a/test/verbs/dedupe-test.js b/test/verbs/dedupe-test.js index fb989404..6281a26d 100644 --- a/test/verbs/dedupe-test.js +++ 
b/test/verbs/dedupe-test.js @@ -1,31 +1,31 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; -tape('dedupe de-duplicates table', t => { - const dt = table({ a: [1, 2, 1, 2, 1], b: [3, 4, 3, 4, 5] }) - .dedupe(); +describe('dedupe', () => { + it('de-duplicates table', () => { + const dt = table({ a: [1, 2, 1, 2, 1], b: [3, 4, 3, 4, 5] }) + .dedupe(); - t.equal(dt.numRows(), 3, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2, 1], - b: [3, 4, 5] - }, 'dedupe data'); - t.equal(dt.isGrouped(), false, 'dedupe not grouped'); - t.end(); -}); + assert.equal(dt.numRows(), 3, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2, 1], + b: [3, 4, 5] + }, 'dedupe data'); + assert.equal(dt.isGrouped(), false, 'dedupe not grouped'); + }); -tape('dedupe de-duplicates table based on keys', t => { - const dt = table({ a: [1, 2, 1, 2, 1], b: [3, 4, 3, 4, 5] }) - .dedupe('a'); + it('de-duplicates table based on keys', () => { + const dt = table({ a: [1, 2, 1, 2, 1], b: [3, 4, 3, 4, 5] }) + .dedupe('a'); - t.equal(dt.numRows(), 2, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2], - b: [3, 4] - }, 'dedupe data'); - t.equal(dt.isGrouped(), false, 'dedupe not grouped'); - t.end(); -}); \ No newline at end of file + assert.equal(dt.numRows(), 2, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2], + b: [3, 4] + }, 'dedupe data'); + assert.equal(dt.isGrouped(), false, 'dedupe not grouped'); + }); +}); diff --git a/test/verbs/derive-test.js b/test/verbs/derive-test.js index ffe53e08..321cc5f5 100644 --- a/test/verbs/derive-test.js +++ b/test/verbs/derive-test.js @@ -1,359 +1,342 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { op, 
rolling, table } from '../../src'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { op, rolling, table } from '../../src/index.js'; const { abs, lag, mean, median, rank, stdev } = op; -tape('derive creates new columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const dt = table(data).derive({ c: d => d.a + d.b }); - t.equal(dt.numRows(), 4, 'num rows'); - t.equal(dt.numCols(), 3, 'num cols'); - tableEqual(t, dt, { ...data, c: [3, 7, 11, 15] }, 'derive data'); - t.end(); -}); - -tape('derive overwrites existing columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - tableEqual(t, - table(data).derive({ a: d => d.a + d.b }), - { ...data, a: [3, 7, 11, 15] }, - 'derive data' - ); - t.end(); -}); - -tape('derive drops existing columns with option', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - tableEqual(t, - table(data).derive({ z: d => d.a + d.b }, { drop: true }), - { z: [3, 7, 11, 15] }, - 'derive data' - ); - t.end(); -}); - -tape('derive can relocate new columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const t1 = table(data).derive({ z: d => d.a + d.b }, { before: 'a' }); - - tableEqual(t, - t1, - { z: [3, 7, 11, 15], ...data }, - 'derive data, with before' - ); - - t.deepEqual( - t1.columnNames(), - ['z', 'a', 'b'], - 'derive data columns, with before' - ); - - const t2 = table(data).derive({ z: d => d.a + d.b }, { after: 'a' }); - - tableEqual(t, - t2, - { a: data.a, z: [3, 7, 11, 15], b: data.b }, - 'derive data, with after' - ); - - t.deepEqual( - t2.columnNames(), - ['a', 'z', 'b'], - 'derive data columns, with after' - ); - - const t3 = table(data).derive({ a: d => -d.a, z: d => d.a + d.b }, { after: 'b' }); - - tableEqual(t, - t3, - { a: [-1, -3, -5, -7], b: data.b, z: [3, 7, 11, 15] }, - 'derive data, with after and overwrite' - ); - - t.deepEqual( - t3.columnNames(), - ['a', 'b', 'z'], - 'derive data columns, 
with after and overwrite' - ); - - t.end(); -}); - -tape('derive supports aggregate and window operators', t => { - const n = 10; - const k = Array(n); - const a = Array(n); - const b = Array(n); - - for (let i = 0; i < n; ++i) { - k[i] = i % 3; - a[i] = i; - b[i] = i + 1; - } - - const td = table({ k, a, b }) - .groupby('k') - .orderby('a') - .derive({ - rank: () => rank(), - diff: ({ a, b }) => a - lag(b, 1, 0), - roll: rolling(d => mean(d.a), [-2, 0]) - }); - tableEqual(t, td.select('rank', 'diff', 'roll'), { - rank: [1, 1, 1, 2, 2, 2, 3, 3, 3, 4], - diff: [0, 1, 2, 2, 2, 2, 2, 2, 2, 2], - roll: [0, 1, 2, 1.5, 2.5, 3.5, 3, 4, 5, 6] - }, 'derive window queries'); - - const tz = td - .ungroup() - .derive({ - z: ({ a }) => abs(a - mean(a)) / stdev(a) - }); - tableEqual(t, tz.select('z'), { - z: [ - 1.4863010829205867, - 1.1560119533826787, - 0.8257228238447705, - 0.49543369430686224, - 0.1651445647689541, - 0.1651445647689541, - 0.49543369430686224, - 0.8257228238447705, - 1.1560119533826787, - 1.4863010829205867 - ] - }, 'z-score'); - - const tm = tz - .derive({ dev: d => abs(d.a - median(d.a)) }) - .rollup({ mad: d => median(d.dev) }); - tableEqual(t, tm.select('mad'), { - mad: [ 2.5 ] - }, 'mad'); - - t.end(); -}); - -tape('derive supports parameters', t => { - const output = { - n: [1, 2, 3, 4], - p: [NaN, 1, 1, 1 ] - }; - - const dt = table({ n: [1, 2, 3, 4] }).params({lag: 1}); - - tableEqual(t, - dt.derive({p: (d, $) => d.n * $.lag - op.lag(d.n, 1)}), - output, - 'parameter in main scope' - ); - - tableEqual(t, - dt.derive({p: (d, $) => d.n - op.lag(d.n, $.lag)}), - output, - 'parameter in operator input scope' - ); - - tableEqual(t, - dt.derive({p: 'd.n * $.lag - op.lag(d.n, 1)'}), - output, - 'default parameter in main scope' - ); - - tableEqual(t, - dt.derive({p: 'd.n * lag - op.lag(d.n, 1)'}), - output, - 'direct parameter in main scope' - ); - - tableEqual(t, - dt.derive({p: 'd.n - op.lag(d.n, $.lag)'}), - output, - 'default parameter in operator input 
scope' - ); - - tableEqual(t, - dt.derive({p: 'd.n - op.lag(d.n, lag)'}), - output, - 'direct parameter in operator input scope' - ); - - t.end(); -}); - -tape('derive supports differing window frames', t => { - const dt = table({ x: [1, 2, 3, 4, 5, 6] }) - .derive({ - cs0: rolling(d => op.sum(d.x)), - cs4: rolling(d => op.sum(d.x), [-4, 0]), - cs2: rolling(d => op.sum(d.x), [-2, 0]) - }); - - tableEqual(t, dt, - { - x: [1, 2, 3, 4, 5, 6], - cs0: [1, 3, 6, 10, 15, 21], - cs4: [1, 3, 6, 10, 15, 20], - cs2: [1, 3, 6, 9, 12, 15] - }, - 'derive data' - ); - - t.end(); -}); - -tape('derive supports streaming value windows', t => { - const dt = table({ val: [1, 2, 3, 4, 5] }) - .orderby('val') - .derive({ - sum: rolling(op.sum('val'), [-2, 0]), - index: () => op.row_number() - 1 - }) - .derive({ - frame: rolling(op.array_agg('index'), [-2, 0]) - }); - - tableEqual(t, dt, - { - val: [1, 2, 3, 4, 5], - sum: [1, 3, 6, 9, 12], - index: [0, 1, 2, 3, 4], - frame: [ [0], [0,1], [0,1,2], [1,2,3], [2,3,4] ] - }, - 'derive data' - ); - t.end(); -}); - -tape('derive supports bigint values', t => { - const data = { - v: [1n, 2n, 3n, 4n, 5n] - }; - - function roll(obj) { - for (const key in obj) { - obj[key] = rolling(obj[key], [-1, 1]); +describe('derive', () => { + it('creates new columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const dt = table(data).derive({ c: d => d.a + d.b }); + assert.equal(dt.numRows(), 4, 'num rows'); + assert.equal(dt.numCols(), 3, 'num cols'); + tableEqual(dt, { ...data, c: [3, 7, 11, 15] }, 'derive data'); + }); + + it('overwrites existing columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + tableEqual( + table(data).derive({ a: d => d.a + d.b }), + { ...data, a: [3, 7, 11, 15] }, + 'derive data' + ); + }); + + it('drops existing columns with option', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + tableEqual( + table(data).derive({ z: d => d.a + d.b }, { drop: true }), + 
{ z: [3, 7, 11, 15] }, + 'derive data' + ); + }); + + it('can relocate new columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const t1 = table(data).derive({ z: d => d.a + d.b }, { before: 'a' }); + + tableEqual( + t1, + { z: [3, 7, 11, 15], ...data }, + 'derive data, with before' + ); + + assert.deepEqual( + t1.columnNames(), + ['z', 'a', 'b'], + 'derive data columns, with before' + ); + + const t2 = table(data).derive({ z: d => d.a + d.b }, { after: 'a' }); + + tableEqual( + t2, + { a: data.a, z: [3, 7, 11, 15], b: data.b }, + 'derive data, with after' + ); + + assert.deepEqual( + t2.columnNames(), + ['a', 'z', 'b'], + 'derive data columns, with after' + ); + + const t3 = table(data).derive({ a: d => -d.a, z: d => d.a + d.b }, { after: 'b' }); + + tableEqual( + t3, + { a: [-1, -3, -5, -7], b: data.b, z: [3, 7, 11, 15] }, + 'derive data, with after and overwrite' + ); + + assert.deepEqual( + t3.columnNames(), + ['a', 'b', 'z'], + 'derive data columns, with after and overwrite' + ); + }); + + it('supports aggregate and window operators', () => { + const n = 10; + const k = Array(n); + const a = Array(n); + const b = Array(n); + + for (let i = 0; i < n; ++i) { + k[i] = i % 3; + a[i] = i; + b[i] = i + 1; } - return obj; - } - - const dt = table(data) - .derive(roll({ - v: d => 2n ** d.v, - sum: op.sum('v'), - prod: op.product('v'), - min: op.min('v'), - max: op.max('v'), - med: op.median('v'), - vals: op.array_agg('v'), - uniq: op.array_agg_distinct('v') - })); - - t.deepEqual( - dt.objects(), - [ - { v: 2n, sum: 3n, prod: 2n, min: 1n, max: 2n, med: 1n, vals: [ 1n, 2n ], uniq: [ 1n, 2n ] }, - { v: 4n, sum: 6n, prod: 6n, min: 1n, max: 3n, med: 2n, vals: [ 1n, 2n, 3n ], uniq: [ 1n, 2n, 3n ] }, - { v: 8n, sum: 9n, prod: 24n, min: 2n, max: 4n, med: 3n, vals: [ 2n, 3n, 4n ], uniq: [ 2n, 3n, 4n ] }, - { v: 16n, sum: 12n, prod: 60n, min: 3n, max: 5n, med: 4n, vals: [ 3n, 4n, 5n ], uniq: [ 3n, 4n, 5n ] }, - { v: 32n, sum: 9n, prod: 20n, min: 4n, 
max: 5n, med: 4n, vals: [ 4n, 5n ], uniq: [ 4n, 5n ] } - ], - 'derive data' - ); - - t.end(); -}); - -tape('derive aggregates support ordered tables', t => { - const rt = table({ v: [3, 1, 4, 2] }) - .orderby('v') - .derive({ a: op.array_agg('v') }); - tableEqual(t, rt, { - v: [1, 2, 3, 4], - a: [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]] - }, 'derive data'); - t.end(); -}); + const td = table({ k, a, b }) + .groupby('k') + .orderby('a') + .derive({ + rank: () => rank(), + diff: ({ a, b }) => a - lag(b, 1, 0), + roll: rolling(d => mean(d.a), [-2, 0]) + }); + tableEqual(td.select('rank', 'diff', 'roll'), { + rank: [1, 1, 1, 2, 2, 2, 3, 3, 3, 4], + diff: [0, 1, 2, 2, 2, 2, 2, 2, 2, 2], + roll: [0, 1, 2, 1.5, 2.5, 3.5, 3, 4, 5, 6] + }, 'derive window queries'); + + const tz = td + .ungroup() + .derive({ + z: ({ a }) => abs(a - mean(a)) / stdev(a) + }); + tableEqual(tz.select('z'), { + z: [ + 1.4863010829205867, + 1.1560119533826787, + 0.8257228238447705, + 0.49543369430686224, + 0.1651445647689541, + 0.1651445647689541, + 0.49543369430686224, + 0.8257228238447705, + 1.1560119533826787, + 1.4863010829205867 + ] + }, 'z-score'); + + const tm = tz + .derive({ dev: d => abs(d.a - median(d.a)) }) + .rollup({ mad: d => median(d.dev) }); + tableEqual(tm.select('mad'), { + mad: [ 2.5 ] + }, 'mad'); + }); + + it('supports parameters', () => { + const output = { + n: [1, 2, 3, 4], + p: [NaN, 1, 1, 1 ] + }; + + const dt = table({ n: [1, 2, 3, 4] }).params({lag: 1}); + + tableEqual( + dt.derive({p: (d, $) => d.n * $.lag - op.lag(d.n, 1)}), + output, + 'parameter in main scope' + ); + + tableEqual( + dt.derive({p: (d, $) => d.n - op.lag(d.n, $.lag)}), + output, + 'parameter in operator input scope' + ); + + tableEqual( + dt.derive({p: 'd.n * $.lag - op.lag(d.n, 1)'}), + output, + 'default parameter in main scope' + ); + + tableEqual( + dt.derive({p: 'd.n * lag - op.lag(d.n, 1)'}), + output, + 'direct parameter in main scope' + ); + + tableEqual( + dt.derive({p: 'd.n - 
op.lag(d.n, $.lag)'}), + output, + 'default parameter in operator input scope' + ); + + tableEqual( + dt.derive({p: 'd.n - op.lag(d.n, lag)'}), + output, + 'direct parameter in operator input scope' + ); + }); + + it('supports differing window frames', () => { + const dt = table({ x: [1, 2, 3, 4, 5, 6] }) + .derive({ + cs0: rolling(d => op.sum(d.x)), + cs4: rolling(d => op.sum(d.x), [-4, 0]), + cs2: rolling(d => op.sum(d.x), [-2, 0]) + }); + + tableEqual(dt, + { + x: [1, 2, 3, 4, 5, 6], + cs0: [1, 3, 6, 10, 15, 21], + cs4: [1, 3, 6, 10, 15, 20], + cs2: [1, 3, 6, 9, 12, 15] + }, + 'derive data' + ); + }); + + it('supports streaming value windows', () => { + const dt = table({ val: [1, 2, 3, 4, 5] }) + .orderby('val') + .derive({ + sum: rolling(op.sum('val'), [-2, 0]), + index: () => op.row_number() - 1 + }) + .derive({ + frame: rolling(op.array_agg('index'), [-2, 0]) + }); + + tableEqual(dt, + { + val: [1, 2, 3, 4, 5], + sum: [1, 3, 6, 9, 12], + index: [0, 1, 2, 3, 4], + frame: [ [0], [0,1], [0,1,2], [1,2,3], [2,3,4] ] + }, + 'derive data' + ); + }); + + it('supports bigint values', () => { + const data = { + v: [1n, 2n, 3n, 4n, 5n] + }; + + function roll(obj) { + for (const key in obj) { + obj[key] = rolling(obj[key], [-1, 1]); + } + return obj; + } -tape('derive supports recode function', t => { - const dt = table({ x: ['foo', 'bar', 'baz'] }); - - tableEqual(t, - dt.derive({ x: d => op.recode(d.x, {foo: 'farp', bar: 'borp'}, 'other') }), - { x: ['farp', 'borp', 'other'] }, - 'derive data, recode inline map Object' - ); - - const map = { - foo: 'farp', - bar: 'borp' - }; - - tableEqual(t, - dt.params({ map }) - .derive({ x: (d, $) => op.recode(d.x, $.map) }), - { x: ['farp', 'borp', 'baz'] }, - 'derive data, recode param map Object' - ); - - const map2 = new Map([['foo', 'farp'], ['bar', 'borp']]); - - tableEqual(t, - dt.params({ map2 }) - .derive({ x: (d, $) => op.recode(d.x, $.map2) }), - { x: ['farp', 'borp', 'baz'] }, - 'derive data, recode param map Map' - ); 
- - t.end(); + const dt = table(data) + .derive(roll({ + v: d => 2n ** d.v, + sum: op.sum('v'), + prod: op.product('v'), + min: op.min('v'), + max: op.max('v'), + med: op.median('v'), + vals: op.array_agg('v'), + uniq: op.array_agg_distinct('v') + })); + + assert.deepEqual( + dt.objects(), + [ + { v: 2n, sum: 3n, prod: 2n, min: 1n, max: 2n, med: 1n, vals: [ 1n, 2n ], uniq: [ 1n, 2n ] }, + { v: 4n, sum: 6n, prod: 6n, min: 1n, max: 3n, med: 2n, vals: [ 1n, 2n, 3n ], uniq: [ 1n, 2n, 3n ] }, + { v: 8n, sum: 9n, prod: 24n, min: 2n, max: 4n, med: 3n, vals: [ 2n, 3n, 4n ], uniq: [ 2n, 3n, 4n ] }, + { v: 16n, sum: 12n, prod: 60n, min: 3n, max: 5n, med: 4n, vals: [ 3n, 4n, 5n ], uniq: [ 3n, 4n, 5n ] }, + { v: 32n, sum: 9n, prod: 20n, min: 4n, max: 5n, med: 4n, vals: [ 4n, 5n ], uniq: [ 4n, 5n ] } + ], + 'derive data' + ); + }); + + it('aggregates support ordered tables', () => { + const rt = table({ v: [3, 1, 4, 2] }) + .orderby('v') + .derive({ a: op.array_agg('v') }); + + tableEqual(rt, { + v: [1, 2, 3, 4], + a: [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]] + }, 'derive data'); + }); + + it('supports recode function', () => { + const dt = table({ x: ['foo', 'bar', 'baz'] }); + + tableEqual( + dt.derive({ x: d => op.recode(d.x, {foo: 'farp', bar: 'borp'}, 'other') }), + { x: ['farp', 'borp', 'other'] }, + 'derive data, recode inline map Object' + ); + + const map = { + foo: 'farp', + bar: 'borp' + }; + + tableEqual( + dt.params({ map }) + .derive({ x: (d, $) => op.recode(d.x, $.map) }), + { x: ['farp', 'borp', 'baz'] }, + 'derive data, recode param map Object' + ); + + const map2 = new Map([['foo', 'farp'], ['bar', 'borp']]); + + tableEqual( + dt.params({ map2 }) + .derive({ x: (d, $) => op.recode(d.x, $.map2) }), + { x: ['farp', 'borp', 'baz'] }, + 'derive data, recode param map Map' + ); + }); + + it('supports fill window functions', () => { + const t1 = table({ x: ['a', null, undefined, 'b', NaN, null, 'c'] }); + + tableEqual( + t1.derive({ x: 
op.fill_down('x') }), + { x: ['a', 'a', 'a', 'b', 'b', 'b', 'c'] }, + 'derive data, fill_down' + ); + + tableEqual( + t1.derive({ x: op.fill_up('x') }), + { x: ['a', 'b', 'b', 'b', 'c', 'c', 'c'] }, + 'derive data, fill_up' + ); + + const t2 = table({ x: [null, 'a', null] }); + + tableEqual( + t2.derive({ x: op.fill_down('x', '?') }), + { x: ['?', 'a', 'a'] }, + 'derive data, fill_down with default' + ); + + tableEqual( + t2.derive({ x: op.fill_up('x', '?') }), + { x: ['a', 'a', '?'] }, + 'derive data, fill_up with default' + ); + }); }); - -tape('derive supports fill window functions', t => { - const t1 = table({ x: ['a', null, undefined, 'b', NaN, null, 'c'] }); - - tableEqual(t, - t1.derive({ x: op.fill_down('x') }), - { x: ['a', 'a', 'a', 'b', 'b', 'b', 'c'] }, - 'derive data, fill_down' - ); - - tableEqual(t, - t1.derive({ x: op.fill_up('x') }), - { x: ['a', 'b', 'b', 'b', 'c', 'c', 'c'] }, - 'derive data, fill_up' - ); - - const t2 = table({ x: [null, 'a', null] }); - - tableEqual(t, - t2.derive({ x: op.fill_down('x', '?') }), - { x: ['?', 'a', 'a'] }, - 'derive data, fill_down with default' - ); - - tableEqual(t, - t2.derive({ x: op.fill_up('x', '?') }), - { x: ['a', 'a', '?'] }, - 'derive data, fill_up with default' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/filter-test.js b/test/verbs/filter-test.js index 34aa6b39..8427cd21 100644 --- a/test/verbs/filter-test.js +++ b/test/verbs/filter-test.js @@ -1,55 +1,52 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { op, table } from '../../src'; - -tape('filter filters a table', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const ft = table(cols).filter(d => 1 < d.a && d.a < 7).reify(); - - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - tableEqual(t, ft, { a: [3, 5], b: [4, 6] }, 'filtered data'); - t.end(); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { 
op, table } from '../../src/index.js'; + +describe('filter', () => { + it('filters a table', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const ft = table(cols).filter(d => 1 < d.a && d.a < 7).reify(); + + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + tableEqual(ft, { a: [3, 5], b: [4, 6] }, 'filtered data'); + }); + + it('filters a filtered table', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const ft = table(cols) + .filter(d => 1 < d.a) + .filter(d => d.a < 7) + .reify(); + + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + tableEqual(ft, { a: [3, 5], b: [4, 6] }, 'filter data'); + }); + + it('supports value functions', () => { + const cols = { a: ['aa', 'ab', 'ba', 'bb'], b: [2, 4, 6, 8] }; + const ft = table(cols).filter(d => op.startswith(d.a, 'a')); + tableEqual(ft, { a: ['aa', 'ab'], b: [2, 4] }, 'filter data'); + }); + + it('supports aggregate functions', () => { + const cols = { a: [1, 3, 5, 7], b: [2, 4, 6, 8] }; + const ft = table(cols).filter(({ a }) => a < op.median(a)); + tableEqual(ft, { a: [1, 3], b: [2, 4] }, 'filter data'); + }); + + it('supports window functions', () => { + const cols = { a: [1, 3, 5, 7], b: [2, 4, 6, 8]}; + const ft = table(cols).filter(({ a }) => op.lag(a) > 2); + tableEqual(ft, { a: [5, 7], b: [6, 8] }, 'filter data'); + }); }); - -tape('filter filters a filtered table', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const ft = table(cols) - .filter(d => 1 < d.a) - .filter(d => d.a < 7) - .reify(); - - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - tableEqual(t, ft, { a: [3, 5], b: [4, 6] }, 'filter data'); - t.end(); -}); - -tape('filter supports value functions', t => { - const cols = { a: ['aa', 'ab', 'ba', 'bb'], b: [2, 4, 6, 8] }; - const ft = table(cols).filter(d => op.startswith(d.a, 'a')); - tableEqual(t, ft, { a: ['aa', 'ab'], 
b: [2, 4] }, 'filter data'); - t.end(); -}); - -tape('filter supports aggregate functions', t => { - const cols = { a: [1, 3, 5, 7], b: [2, 4, 6, 8] }; - const ft = table(cols).filter(({ a }) => a < op.median(a)); - tableEqual(t, ft, { a: [1, 3], b: [2, 4] }, 'filter data'); - t.end(); -}); - -tape('filter supports window functions', t => { - const cols = { a: [1, 3, 5, 7], b: [2, 4, 6, 8]}; - const ft = table(cols).filter(({ a }) => op.lag(a) > 2); - tableEqual(t, ft, { a: [5, 7], b: [6, 8] }, 'filter data'); - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/fold-test.js b/test/verbs/fold-test.js index dc4187a7..b4fc3fe6 100644 --- a/test/verbs/fold-test.js +++ b/test/verbs/fold-test.js @@ -1,6 +1,5 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { not, table } from '../../src'; +import tableEqual from '../table-equal.js'; +import { not, table } from '../../src/index.js'; function data() { return { @@ -18,20 +17,19 @@ function output(key = 'key', value = 'value') { }; } -tape('fold generates key-value pair columns', t => { - const ut = table(data()).fold(['x', 'y']); - tableEqual(t, ut, output(), 'fold data'); - t.end(); -}); +describe('fold', () => { + it('generates key-value pair columns', () => { + const ut = table(data()).fold(['x', 'y']); + tableEqual(ut, output(), 'fold data'); + }); -tape('fold accepts select statements', t => { - const ut = table(data()).fold(not('k')); - tableEqual(t, ut, output(), 'fold selected data'); - t.end(); -}); + it('accepts select statements', () => { + const ut = table(data()).fold(not('k')); + tableEqual(ut, output(), 'fold selected data'); + }); -tape('fold accepts named output columns', t => { - const ut = table(data()).fold(['x', 'y'], { as: ['u', 'v'] }); - tableEqual(t, ut, output('u', 'v'), 'fold as data'); - t.end(); -}); \ No newline at end of file + it('accepts named output columns', () => { + const ut = table(data()).fold(['x', 'y'], { as: ['u', 'v'] }); + 
tableEqual(ut, output('u', 'v'), 'fold as data'); + }); +}); diff --git a/test/verbs/groupby-test.js b/test/verbs/groupby-test.js index 9d806f87..601897ab 100644 --- a/test/verbs/groupby-test.js +++ b/test/verbs/groupby-test.js @@ -1,173 +1,161 @@ -import tape from 'tape'; -import fromArrow from '../../src/format/from-arrow'; -import { desc, op, table } from '../../src'; - -tape('groupby computes groups based on field names', t => { - const data = { - k: 'aabb'.split(''), - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const gt = table(data).groupby('k'); - - t.equal(gt.numRows(), 4, 'num rows'); - t.equal(gt.numCols(), 3, 'num cols'); - t.equal(gt.isGrouped(), true, 'is grouped'); - - const { keys, names, rows, size } = gt.groups(); - t.deepEqual( - { keys, names, rows, size }, - { - keys: Uint32Array.from([0, 0, 1, 1]), - names: ['k'], - rows: [0, 2], - size: 2 - }, - 'group data' - ); - t.end(); +import assert from 'node:assert'; +import { desc, fromArrow, op, table, toArrow } from '../../src/index.js'; + +describe('groupby', () => { + it('computes groups based on field names', () => { + const data = { + k: 'aabb'.split(''), + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const gt = table(data).groupby('k'); + + assert.equal(gt.numRows(), 4, 'num rows'); + assert.equal(gt.numCols(), 3, 'num cols'); + assert.equal(gt.isGrouped(), true, 'is grouped'); + + const { keys, names, rows, size } = gt.groups(); + assert.deepEqual( + { keys, names, rows, size }, + { + keys: Uint32Array.from([0, 0, 1, 1]), + names: ['k'], + rows: [0, 2], + size: 2 + }, + 'group data' + ); + }); + + it('computes groups based on a function', () => { + const data = { + k: 'aabb'.split(''), + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const gt = table(data).groupby({ key: d => d.k }); + + assert.equal(gt.numRows(), 4, 'num rows'); + assert.equal(gt.numCols(), 3, 'num cols'); + assert.equal(gt.isGrouped(), true, 'is grouped'); + + const { keys, names, rows, size } = gt.groups(); + assert.deepEqual( 
+ { keys, names, rows, size }, + { + keys: Uint32Array.from([0, 0, 1, 1]), + names: ['key'], + rows: [0, 2], + size: 2 + }, + 'group data' + ); + }); + + it('supports aggregate functions', () => { + const data = { a: [1, 3, 5, 7] }; + const gt = table(data).groupby({ res: d => op.abs(d.a - op.mean(d.a)) }); + + const { keys, names, rows, size } = gt.groups(); + assert.deepEqual( + { keys, names, rows, size }, + { + keys: Uint32Array.from([0, 1, 1, 0]), + names: ['res'], + rows: [0, 1], + size: 2 + }, + 'group data' + ); + }); + + it('supports grouped aggregate functions', () => { + const data = { k: [0, 0, 1, 1], a: [1, 3, 5, 7] }; + const gt = table(data) + .groupby('k') + .groupby({ res: d => d.a - op.mean(d.a) }); + + const { keys, names, rows, size } = gt.groups(); + assert.deepEqual( + { keys, names, rows, size }, + { + keys: Uint32Array.from([0, 1, 0, 1]), + names: ['res'], + rows: [0, 1], + size: 2 + }, + 'group data' + ); + }); + + it('throws on window functions', () => { + const data = { a: [1, 3, 5, 7] }; + assert.throws(() => table(data).groupby({ res: d => op.lag(d.a) }), 'no window'); + }); + + it('persists after filter', () => { + const dt = table({ a: [1, 3, 5, 7] }) + .groupby('a') + .filter(d => d.a > 1); + + assert.ok(dt.isGrouped(), 'is grouped'); + + const { rows, get } = dt.groups(); + assert.deepEqual( + rows.map(r => get[0](r)), + [3, 5, 7], + 'retrieves correct group values' + ); + }); + + it('persists after select', () => { + const dt = table({ a: [1, 3, 5, 7], b: [2, 4, 6, 8] }) + .groupby('a') + .select('b'); + + assert.ok(dt.isGrouped(), 'is grouped'); + + const { rows, get } = dt.groups(); + assert.deepEqual( + rows.map(r => get[0](r)), + [1, 3, 5, 7], + 'retrieves correct group values' + ); + }); + + it('persists after reify', () => { + const dt = table({ a: [1, 3, 5, 7], b: [2, 4, 6, 8] }) + .groupby('a') + .orderby(desc('b')) + .filter(d => d.a > 1) + .select('b') + .reify(); + + assert.ok(dt.isGrouped(), 'is grouped'); + + const { 
rows, get } = dt.groups(); + assert.deepEqual( + rows.map(r => get[0](r)), + [3, 5, 7], + 'retrieves correct group values' + ); + }); + + it('optimizes Arrow dictionary columns', () => { + const dt = fromArrow(toArrow( + table({ + d: ['a', 'a', 'b', 'b'], + v: [1, 2, 3, 4] + }) + )); + + const gt = dt.groupby('d'); + assert.equal( + gt.groups().keys, + dt.column('d').groups().keys, + 'groupby reuses internal dictionary keys' + ); + }); }); - -tape('groupby computes groups based on a function', t => { - const data = { - k: 'aabb'.split(''), - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const gt = table(data).groupby({ key: d => d.k }); - - t.equal(gt.numRows(), 4, 'num rows'); - t.equal(gt.numCols(), 3, 'num cols'); - t.equal(gt.isGrouped(), true, 'is grouped'); - - const { keys, names, rows, size } = gt.groups(); - t.deepEqual( - { keys, names, rows, size }, - { - keys: Uint32Array.from([0, 0, 1, 1]), - names: ['key'], - rows: [0, 2], - size: 2 - }, - 'group data' - ); - t.end(); -}); - -tape('groupby supports aggregate functions', t => { - const data = { a: [1, 3, 5, 7] }; - const gt = table(data).groupby({ res: d => op.abs(d.a - op.mean(d.a)) }); - - const { keys, names, rows, size } = gt.groups(); - t.deepEqual( - { keys, names, rows, size }, - { - keys: Uint32Array.from([0, 1, 1, 0]), - names: ['res'], - rows: [0, 1], - size: 2 - }, - 'group data' - ); - t.end(); -}); - -tape('groupby supports grouped aggregate functions', t => { - const data = { k: [0, 0, 1, 1], a: [1, 3, 5, 7] }; - const gt = table(data) - .groupby('k') - .groupby({ res: d => d.a - op.mean(d.a) }); - - const { keys, names, rows, size } = gt.groups(); - t.deepEqual( - { keys, names, rows, size }, - { - keys: Uint32Array.from([0, 1, 0, 1]), - names: ['res'], - rows: [0, 1], - size: 2 - }, - 'group data' - ); - t.end(); -}); - -tape('groupby throws on window functions', t => { - const data = { a: [1, 3, 5, 7] }; - t.throws(() => table(data).groupby({ res: d => op.lag(d.a) }), 'no window'); - 
t.end(); -}); - -tape('groupby persists after filter', t => { - const dt = table({ a: [1, 3, 5, 7] }) - .groupby('a') - .filter(d => d.a > 1); - - t.ok(dt.isGrouped(), 'is grouped'); - - const { rows, get } = dt.groups(); - t.deepEqual( - rows.map(r => get[0](r)), - [3, 5, 7], - 'retrieves correct group values' - ); - - t.end(); -}); - -tape('groupby persists after select', t => { - const dt = table({ a: [1, 3, 5, 7], b: [2, 4, 6, 8] }) - .groupby('a') - .select('b'); - - t.ok(dt.isGrouped(), 'is grouped'); - - const { rows, get } = dt.groups(); - t.deepEqual( - rows.map(r => get[0](r)), - [1, 3, 5, 7], - 'retrieves correct group values' - ); - - t.end(); -}); - -tape('groupby persists after reify', t => { - const dt = table({ a: [1, 3, 5, 7], b: [2, 4, 6, 8] }) - .groupby('a') - .orderby(desc('b')) - .filter(d => d.a > 1) - .select('b') - .reify(); - - t.ok(dt.isGrouped(), 'is grouped'); - - const { rows, get } = dt.groups(); - t.deepEqual( - rows.map(r => get[0](r)), - [3, 5, 7], - 'retrieves correct group values' - ); - - t.end(); -}); - -tape('groupby optimizes Arrow dictionary columns', t => { - const dt = fromArrow( - table({ - d: ['a', 'a', 'b', 'b'], - v: [1, 2, 3, 4] - }).toArrow() - ); - - const gt = dt.groupby('d'); - t.equal( - gt.groups().keys, - dt.column('d').groups().keys, - 'groupby reuses internal dictionary keys' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/impute-test.js b/test/verbs/impute-test.js index ab0734bb..a0bdeee7 100644 --- a/test/verbs/impute-test.js +++ b/test/verbs/impute-test.js @@ -1,183 +1,168 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { op, table } from '../../src'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { op, table } from '../../src/index.js'; const na = undefined; -tape('impute imputes values for an ungrouped table', t => { - const dt = table({ x: [1, null, NaN, undefined, 3] }); - - tableEqual(t, - dt.impute({ x: 
() => 2 }), - { x: [1, 2, 2, 2, 3] }, - 'impute data, constant' - ); - - tableEqual(t, - dt.impute({ x: op.mean('x') }), - { x: [1, 2, 2, 2, 3] }, - 'impute data, mean' - ); - - t.end(); -}); - -tape('impute imputes values for a grouped table', t => { - const dt = table({ - k: [0, 0, 0, 0, 1, 1, 1, 1], - x: [1, null, NaN, 3, 3, null, undefined, 5] - }).groupby('k'); - - const t1 = dt.impute({ x: () => 2 }); - tableEqual(t, t1, { - k: [0, 0, 0, 0, 1, 1, 1, 1], - x: [1, 2, 2, 3, 3, 2, 2, 5] - }, 'impute data, constant'); - - t.equal(dt.groups(), t1.groups(), 'groups'); - - const t2 = dt.impute({ x: op.mean('x') }); - tableEqual(t, t2, { - k: [0, 0, 0, 0, 1, 1, 1, 1], - x: [1, 2, 2, 3, 3, 4, 4, 5] - }, 'impute data, mean'); - - t.equal(dt.groups(), t2.groups(), 'groups'); - - t.end(); -}); - -tape('impute imputes expanded rows for an ungrouped table', t => { - const dt = table({ - x: ['a', 'b', 'c'], - y: [1, 2, 3], - z: ['x', 'x', 'x'] - }) - .impute(null, { expand: ['x', 'y'] }) - .orderby('x', 'y') - .reify(); - - tableEqual(t, dt, { - x: ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], - y: [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ], - z: ['x', na, na, na, 'x', na, na, na, 'x'] - }, 'impute data'); - - t.equal(dt.groups(), null, 'no groups'); - - t.end(); +describe('impute', () => { + it('imputes values for an ungrouped table', () => { + const dt = table({ x: [1, null, NaN, undefined, 3] }); + + tableEqual( + dt.impute({ x: () => 2 }), + { x: [1, 2, 2, 2, 3] }, + 'impute data, constant' + ); + + tableEqual( + dt.impute({ x: op.mean('x') }), + { x: [1, 2, 2, 2, 3] }, + 'impute data, mean' + ); + }); + + it('imputes values for a grouped table', () => { + const dt = table({ + k: [0, 0, 0, 0, 1, 1, 1, 1], + x: [1, null, NaN, 3, 3, null, undefined, 5] + }).groupby('k'); + + const t1 = dt.impute({ x: () => 2 }); + tableEqual(t1, { + k: [0, 0, 0, 0, 1, 1, 1, 1], + x: [1, 2, 2, 3, 3, 2, 2, 5] + }, 'impute data, constant'); + + assert.equal(dt.groups(), t1.groups(), 'groups'); + + const 
t2 = dt.impute({ x: op.mean('x') }); + tableEqual(t2, { + k: [0, 0, 0, 0, 1, 1, 1, 1], + x: [1, 2, 2, 3, 3, 4, 4, 5] + }, 'impute data, mean'); + + assert.equal(dt.groups(), t2.groups(), 'groups'); + }); + + it('imputes expanded rows for an ungrouped table', () => { + const dt = table({ + x: ['a', 'b', 'c'], + y: [1, 2, 3], + z: ['x', 'x', 'x'] + }) + .impute(null, { expand: ['x', 'y'] }) + .orderby('x', 'y') + .reify(); + + tableEqual(dt, { + x: ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], + y: [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ], + z: ['x', na, na, na, 'x', na, na, na, 'x'] + }, 'impute data'); + + assert.equal(dt.groups(), null, 'no groups'); + }); + + it('imputes expanded rows for a grouped table', () => { + const dt = table({ + x: ['a', 'a', 'b', 'c'], + y: [1, 1, 2, 3], + z: ['x', 'x', 'y', 'z'], + v: [0, 9, 8, 7] + }) + .groupby('x', 'y') + .impute(null, { expand: 'z' }) + .orderby('x', 'y', 'z') + .reify(); + + tableEqual(dt, { + x: ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], + y: [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 3 ], + z: ['x', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'], + v: [ 0, 9, na, na, na, 8, na, na, na, 7 ] + }, 'impute data'); + + assert.deepEqual( + Array.from(dt.groups().keys), + [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], + 'group keys' + ); + }); + + it('imputes values and rows for an ungrouped table', () => { + const imp = 'imp'; + const dt = table({ + x: ['a', 'b', 'c'], + y: [1, 2, 3], + z: ['x', 'x', 'x'] + }) + .impute({ z: () => 'imp' }, { expand: ['x', 'y'] }) + .orderby('x', 'y') + .reify(); + + tableEqual(dt, { + x: ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], + y: [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ], + z: ['x', imp, imp, imp, 'x', imp, imp, imp, 'x'] + }, 'impute data'); + + assert.equal(dt.groups(), null, 'no groups'); + }); + + it('imputes expanded rows for a grouped table', () => { + const dt = table({ + x: ['a', 'a', 'b', 'c'], + y: [1, 1, 2, 3], + z: ['x', 'x', 'y', 'z'], + v: [0, 9, 8, 7] + }) + .groupby('x', 'y') + .impute({ v: op.max('v') 
}, { expand: 'z' }) + .orderby('x', 'y', 'z') + .reify(); + + tableEqual(dt, { + x: ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], + y: [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 3 ], + z: ['x', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'], + v: [ 0, 9, 9, 9, 8, 8, 8, 7, 7, 7 ] + }, 'impute data'); + + assert.deepEqual( + Array.from(dt.groups().keys), + [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], + 'group keys' + ); + }); + + it('imputes expanded rows given fixed values', () => { + const dt = table({ + x: ['a', 'a', 'b', 'c'], + y: [1, 1, 2, 3], + z: ['x', 'x', 'y', 'z'], + v: [0, 9, 8, 7] + }) + .groupby('x', 'y') + .impute({ v: op.max('v') }, { expand: { z: ['x', 'y', 'z'] } }) + .orderby('x', 'y', 'z') + .reify(); + + tableEqual(dt, { + x: ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], + y: [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 3 ], + z: ['x', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'], + v: [ 0, 9, 9, 9, 8, 8, 8, 7, 7, 7 ] + }, 'impute data'); + + assert.deepEqual( + Array.from(dt.groups().keys), + [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], + 'group keys' + ); + }); + + it('throws on non-existent values column', () => { + const dt = table({ x: [1, null, NaN, undefined, 3] }); + assert.throws(() => dt.impute({ z: () => 1 })); + }); }); - -tape('impute imputes expanded rows for a grouped table', t => { - const dt = table({ - x: ['a', 'a', 'b', 'c'], - y: [1, 1, 2, 3], - z: ['x', 'x', 'y', 'z'], - v: [0, 9, 8, 7] - }) - .groupby('x', 'y') - .impute(null, { expand: 'z' }) - .orderby('x', 'y', 'z') - .reify(); - - tableEqual(t, dt, { - x: ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], - y: [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 3 ], - z: ['x', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'], - v: [ 0, 9, na, na, na, 8, na, na, na, 7 ] - }, 'impute data'); - - t.deepEqual( - Array.from(dt.groups().keys), - [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], - 'group keys' - ); - - t.end(); -}); - -tape('impute imputes values and rows for an ungrouped table', t => { - const imp = 'imp'; - const dt = table({ - x: ['a', 'b', 'c'], 
- y: [1, 2, 3], - z: ['x', 'x', 'x'] - }) - .impute({ z: () => 'imp' }, { expand: ['x', 'y'] }) - .orderby('x', 'y') - .reify(); - - tableEqual(t, dt, { - x: ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], - y: [ 1, 2, 3, 1, 2, 3, 1, 2, 3 ], - z: ['x', imp, imp, imp, 'x', imp, imp, imp, 'x'] - }, 'impute data'); - - t.equal(dt.groups(), null, 'no groups'); - - t.end(); -}); - -tape('impute imputes expanded rows for a grouped table', t => { - const dt = table({ - x: ['a', 'a', 'b', 'c'], - y: [1, 1, 2, 3], - z: ['x', 'x', 'y', 'z'], - v: [0, 9, 8, 7] - }) - .groupby('x', 'y') - .impute({ v: op.max('v') }, { expand: 'z' }) - .orderby('x', 'y', 'z') - .reify(); - - tableEqual(t, dt, { - x: ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], - y: [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 3 ], - z: ['x', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'], - v: [ 0, 9, 9, 9, 8, 8, 8, 7, 7, 7 ] - }, 'impute data'); - - t.deepEqual( - Array.from(dt.groups().keys), - [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], - 'group keys' - ); - - t.end(); -}); - -tape('impute imputes expanded rows given fixed values', t => { - const dt = table({ - x: ['a', 'a', 'b', 'c'], - y: [1, 1, 2, 3], - z: ['x', 'x', 'y', 'z'], - v: [0, 9, 8, 7] - }) - .groupby('x', 'y') - .impute({ v: op.max('v') }, { expand: { z: ['x', 'y', 'z'] } }) - .orderby('x', 'y', 'z') - .reify(); - - tableEqual(t, dt, { - x: ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'], - y: [ 1, 1, 1, 1, 2, 2, 2, 3, 3, 3 ], - z: ['x', 'x', 'y', 'z', 'x', 'y', 'z', 'x', 'y', 'z'], - v: [ 0, 9, 9, 9, 8, 8, 8, 7, 7, 7 ] - }, 'impute data'); - - t.deepEqual( - Array.from(dt.groups().keys), - [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], - 'group keys' - ); - - t.end(); -}); - -tape('impute throws on non-existent values column', t => { - const dt = table({ x: [1, null, NaN, undefined, 3] }); - - t.throws(() => dt.impute({ z: () => 1 })); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/join-filter-test.js b/test/verbs/join-filter-test.js index 
f8103466..405099a4 100644 --- a/test/verbs/join-filter-test.js +++ b/test/verbs/join-filter-test.js @@ -1,6 +1,5 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; function joinTables() { return [ @@ -16,144 +15,133 @@ function joinTables() { ]; } -tape('semijoin uses natural join criteria', t => { - const tl = table({ k: [1, 2, 3], a: [3, 4, 0]}); - const tr = table({ k: [1, 2], b: [5, 6]}); - - const tj = tl.semijoin(tr); - - tableEqual(t, tj, { - k: [ 1, 2 ], - a: [ 3, 4 ] - }, 'natural semijoin data'); - - t.end(); -}); - -tape('semijoin filters left table to matching rows', t => { - const [tl, tr] = joinTables(); - const output = { - k: [ 'a', 'b', 'b' ], - x: [ 1, 2, 3 ], - y: [ 9, 8, 7 ] - }; - - tableEqual(t, - tl.semijoin(tr, ['k', 'u']), - output, - 'semijoin data, with keys' - ); - - tableEqual(t, - tl.semijoin(tr, (a, b) => a.k === b.u), - output, - 'semijoin data, with predicate' - ); - - t.end(); -}); - -tape('antijoin uses natural join criteria', t => { - const tl = table({ k: [1, 2, 3], a: [3, 4, 0]}); - const tr = table({ k: [1, 2], b: [5, 6]}); - - const tj = tl.antijoin(tr); - - tableEqual(t, tj, { - k: [ 3 ], - a: [ 0 ] - }, 'natural antijoin data'); - - t.end(); +describe('semijoin', () => { + it('uses natural join criteria', () => { + const tl = table({ k: [1, 2, 3], a: [3, 4, 0]}); + const tr = table({ k: [1, 2], b: [5, 6]}); + + const tj = tl.semijoin(tr); + + tableEqual(tj, { + k: [ 1, 2 ], + a: [ 3, 4 ] + }, 'natural semijoin data'); + }); + + it('filters left table to matching rows', () => { + const [tl, tr] = joinTables(); + const output = { + k: [ 'a', 'b', 'b' ], + x: [ 1, 2, 3 ], + y: [ 9, 8, 7 ] + }; + + tableEqual( + tl.semijoin(tr, ['k', 'u']), + output, + 'semijoin data, with keys' + ); + + tableEqual( + tl.semijoin(tr, (a, b) => a.k === b.u), + output, + 'semijoin data, with predicate' 
+ ); + }); }); -tape('antijoin filters left table to non-matching rows', t => { - const [tl, tr] = joinTables(); - const output = { - k: [ 'c' ], - x: [ 4 ], - y: [ 6 ] - }; - - tableEqual(t, - tl.antijoin(tr, ['k', 'u']), - output, - 'antijoin data, with keys' - ); - - tableEqual(t, - tl.antijoin(tr, (a, b) => a.k === b.u), - output, - 'antijoin data, with predicate' - ); - - t.end(); +describe('antijoin', () => { + it('uses natural join criteria', () => { + const tl = table({ k: [1, 2, 3], a: [3, 4, 0]}); + const tr = table({ k: [1, 2], b: [5, 6]}); + + const tj = tl.antijoin(tr); + + tableEqual(tj, { + k: [ 3 ], + a: [ 0 ] + }, 'natural antijoin data'); + }); + + it('filters left table to non-matching rows', () => { + const [tl, tr] = joinTables(); + const output = { + k: [ 'c' ], + x: [ 4 ], + y: [ 6 ] + }; + + tableEqual( + tl.antijoin(tr, ['k', 'u']), + output, + 'antijoin data, with keys' + ); + + tableEqual( + tl.antijoin(tr, (a, b) => a.k === b.u), + output, + 'antijoin data, with predicate' + ); + }); }); -tape('except returns table given empty input', t => { - const data = { k: [1, 2, 3], a: [3, 4, 0] }; - const tl = table(data); - tableEqual(t, tl.except([]), data, 'except data'); - t.end(); +describe('except', () => { + it('returns table given empty input', () => { + const data = { k: [1, 2, 3], a: [3, 4, 0] }; + const tl = table(data); + tableEqual(tl.except([]), data, 'except data'); + }); + + it('removes intersecting rows', () => { + const tl = table({ k: [1, 2, 3], a: [3, 4, 0]}); + const tr = table({ k: [1, 2], a: [3, 4]}); + + tableEqual(tl.except(tr), { + k: [ 3 ], + a: [ 0 ] + }, 'except data'); + }); + + it('removes intersecting rows for multiple tables', () => { + const t0 = table({ k: [1, 2, 3], a: [3, 4, 0] }); + const t1 = table({ k: [1], a: [3]}); + const t2 = table({ k: [2], a: [4]}); + + tableEqual(t0.except(t1, t2), { + k: [ 3 ], + a: [ 0 ] + }, 'except data'); + }); }); -tape('except removes intersecting rows', t => { - const tl = 
table({ k: [1, 2, 3], a: [3, 4, 0]}); - const tr = table({ k: [1, 2], a: [3, 4]}); - - tableEqual(t, tl.except(tr), { - k: [ 3 ], - a: [ 0 ] - }, 'except data'); - - t.end(); +describe('intersect', () => { + it('returns empty table given empty input', () => { + const tl = table({ k: [1, 2, 3], a: [3, 4, 0] }); + + tableEqual(tl.intersect([]), { + k: [ ], + a: [ ] + }, 'intersect data'); + }); + + it('removes non-intersecting rows', () => { + const tl = table({ k: [1, 2, 3], a: [3, 4, 0] }); + const tr = table({ k: [1, 2], a: [3, 4]}); + + tableEqual(tl.intersect(tr), { + k: [ 1, 2 ], + a: [ 3, 4 ] + }, 'intersect data'); + }); + + it('removes non-intersecting rows for multiple tables', () => { + const t0 = table({ k: [1, 2, 3], a: [3, 4, 0] }); + const t1 = table({ k: [1], a: [3]}); + const t2 = table({ k: [2], a: [4]}); + + tableEqual(t0.intersect(t1, t2), { + k: [ ], + a: [ ] + }, 'intersect data'); + }); }); - -tape('except removes intersecting rows for multiple tables', t => { - const t0 = table({ k: [1, 2, 3], a: [3, 4, 0] }); - const t1 = table({ k: [1], a: [3]}); - const t2 = table({ k: [2], a: [4]}); - - tableEqual(t, t0.except(t1, t2), { - k: [ 3 ], - a: [ 0 ] - }, 'except data'); - - t.end(); -}); - -tape('intersect returns empty table given empty input', t => { - const tl = table({ k: [1, 2, 3], a: [3, 4, 0] }); - - tableEqual(t, tl.intersect([]), { - k: [ ], - a: [ ] - }, 'intersect data'); - - t.end(); -}); - -tape('intersect removes non-intersecting rows', t => { - const tl = table({ k: [1, 2, 3], a: [3, 4, 0] }); - const tr = table({ k: [1, 2], a: [3, 4]}); - - tableEqual(t, tl.intersect(tr), { - k: [ 1, 2 ], - a: [ 3, 4 ] - }, 'intersect data'); - - t.end(); -}); - -tape('intersect removes non-intersecting rows for multiple tables', t => { - const t0 = table({ k: [1, 2, 3], a: [3, 4, 0] }); - const t1 = table({ k: [1], a: [3]}); - const t2 = table({ k: [2], a: [4]}); - - tableEqual(t, t0.intersect(t1, t2), { - k: [ ], - a: [ ] - }, 'intersect 
data'); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/join-test.js b/test/verbs/join-test.js index 0baaf022..916a5917 100644 --- a/test/verbs/join-test.js +++ b/test/verbs/join-test.js @@ -1,6 +1,6 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { all, not, op, table } from '../../src'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { all, not, op, table } from '../../src/index.js'; function joinTables() { return [ @@ -16,481 +16,451 @@ function joinTables() { ]; } -tape('cross computes Cartesian product', t => { - const [tl, tr] = joinTables(); +describe('cross', () => { + it('computes Cartesian product', () => { + const [tl, tr] = joinTables(); - const tj = tl.cross(tr); + const tj = tl.cross(tr); - t.equal(tj.numRows(), tl.numRows() * tr.numRows()); - tableEqual(t, tj, { - k: [ 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ], - x: [ 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 ], - y: [ 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 6, 6, 6, 6 ], - u: [ 'b', 'a', 'b', 'd', 'b', 'a', 'b', 'd', 'b', 'a', 'b', 'd', 'b', 'a', 'b', 'd' ], - v: [ 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0 ] - }, 'cross data'); - - t.end(); -}); - -tape('cross computes Cartesian product with column selection', t => { - const [tl, tr] = joinTables(); - - const tj = tl.cross(tr, [not('y'), not('u')]); - - t.equal(tj.numRows(), tl.numRows() * tr.numRows()); - tableEqual(t, tj, { - k: [ 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ], - x: [ 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 ], - v: [ 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0 ] - }, 'selected cross data'); - - t.end(); -}); - -tape('cross computes Cartesian product with column renaming', t => { - const [tl, tr] = joinTables(); - - const tj = tl.cross(tr, [ - {j: d => d.k, z: d => d.x}, - {w: d => d.v} - ]); - - t.equal(tj.numRows(), tl.numRows() * tr.numRows()); - 
tableEqual(t, tj, { - j: [ 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ], - z: [ 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 ], - w: [ 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0 ] - }, 'selected cross data'); - - t.end(); -}); - -tape('join performs natural join', t => { - const tl = table({ k: [1, 2, 3], a: [3, 4, 1]}); - const t1 = table({ k: [1, 2, 4], b: [5, 6, 2]}); - const t2 = table({ u: [1, 2], v: [5, 6]}); - - tableEqual(t, tl.join(t1), { - k: [ 1, 2 ], - a: [ 3, 4 ], - b: [ 5, 6 ] - }, 'natural join data, common columns'); - - tableEqual(t, tl.join_left(t1), { - k: [ 1, 2, 3 ], - a: [ 3, 4, 1 ], - b: [ 5, 6, undefined ] - }, 'natural left join data, common columns'); - - tableEqual(t, tl.join_right(t1), { - k: [ 1, 2, 4 ], - a: [ 3, 4, undefined ], - b: [ 5, 6, 2 ] - }, 'natural right join data, common columns'); - - tableEqual(t, tl.join_full(t1), { - k: [ 1, 2, 3, 4 ], - a: [ 3, 4, 1, undefined ], - b: [ 5, 6, undefined, 2 ] - }, 'natural full join data, common columns'); - - t.throws( - () =>tl.join(t2), - 'natural join throws, no common columns' - ); - - t.end(); -}); - -tape('join handles filtered tables', t => { - const tl = table({ - key: [1, 2, 3, 4], - value1: [1, 2, 3, 4] - }).filter(d => d.key < 3); - - const tr = table({ - key: [1, 2, 5], - value2: [1, 2, 5] + assert.equal(tj.numRows(), tl.numRows() * tr.numRows()); + tableEqual(tj, { + k: [ 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ], + x: [ 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 ], + y: [ 9, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 6, 6, 6, 6 ], + u: [ 'b', 'a', 'b', 'd', 'b', 'a', 'b', 'd', 'b', 'a', 'b', 'd', 'b', 'a', 'b', 'd' ], + v: [ 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0 ] + }, 'cross data'); }); - tableEqual(t, tl.join_left(tr), { - key: [ 1, 2 ], - value1: [ 1, 2 ], - value2: [ 1, 2 ] - }, 'natural left join on filtered data'); - - tableEqual(t, tl.join_right(tr), { - key: [ 1, 2, 5 ], - value1: [ 
1, 2, undefined ], - value2: [ 1, 2, 5 ] - }, 'natural right join on filtered data'); - - const dt = table({ - year: [2017, 2017, 2017, 2018, 2018, 2018], - month: ['01', '02', 'YR', '01', '02', 'YR'], - count: [6074, 7135, 220582, 5761, 6764, 222153] - }); - - const jt = dt - .filter(d => d.month === 'YR') - .select('year', {count: 'total'}) - .join(dt.filter(d => d.month !== 'YR')); - - tableEqual(t, jt, { - total: [ 220582, 220582, 222153, 222153 ], - year: [ 2017, 2017, 2018, 2018 ], - month: [ '01', '02', '01', '02' ], - count: [ 6074, 7135, 5761, 6764 ] - }, 'join of two filtered tables'); - - t.end(); -}); + it('computes Cartesian product with column selection', () => { + const [tl, tr] = joinTables(); -tape('join performs inner join with predicate', t => { - const [tl, tr] = joinTables(); + const tj = tl.cross(tr, [not('y'), not('u')]); - const tj = tl.join(tr, (a, b) => a.k === b.u, { - k: d => d.k, - x: d => d.x, - y: d => d.y, - u: (a, b) => b.u, - v: (a, b) => b.v, - z: (a, b) => a.x + b.v + assert.equal(tj.numRows(), tl.numRows() * tr.numRows()); + tableEqual(tj, { + k: [ 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ], + x: [ 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 ], + v: [ 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0 ] + }, 'selected cross data'); }); - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b' ], - x: [ 1, 2, 2, 3, 3 ], - y: [ 9, 8, 8, 7, 7 ], - u: [ 'a', 'b', 'b', 'b', 'b' ], - v: [ 4, 5, 6, 5, 6 ], - z: [ 5, 7, 8, 8, 9 ] - }, 'inner join data'); - - t.end(); -}); + it('computes Cartesian product with column renaming', () => { + const [tl, tr] = joinTables(); -tape('join_left performs left outer join with predicate', t => { - const [tl, tr] = joinTables(); + const tj = tl.cross(tr, [ + {j: d => d.k, z: d => d.x}, + {w: d => d.v} + ]); - const tj = tl.join_left(tr, (a, b) => a.k === b.u, { - k: d => d.k, - x: d => d.x, - y: d => d.y, - u: (a, b) => b.u, - v: (a, b) => b.v, - z: (a, b) => a.x + b.v 
+ assert.equal(tj.numRows(), tl.numRows() * tr.numRows()); + tableEqual(tj, { + j: [ 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c' ], + z: [ 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4 ], + w: [ 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0, 5, 4, 6, 0 ] + }, 'selected cross data'); }); - - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b', 'c' ], - x: [ 1, 2, 2, 3, 3, 4 ], - y: [ 9, 8, 8, 7, 7, 6 ], - u: [ 'a', 'b', 'b', 'b', 'b', undefined ], - v: [ 4, 5, 6, 5, 6, undefined ], - z: [ 5, 7, 8, 8, 9, NaN ] - }, 'left join data'); - - t.end(); }); -tape('join_right performs right outer join with predicate', t => { - const [tl, tr] = joinTables(); - - const tj = tl.join_right(tr, (a, b) => a.k === b.u, { - k: d => d.k, - x: d => d.x, - y: d => d.y, - u: (a, b) => b.u, - v: (a, b) => b.v, - z: (a, b) => a.x + b.v +describe('join', () => { + it('performs natural join', () => { + const tl = table({ k: [1, 2, 3], a: [3, 4, 1]}); + const t1 = table({ k: [1, 2, 4], b: [5, 6, 2]}); + const t2 = table({ u: [1, 2], v: [5, 6]}); + + tableEqual(tl.join(t1), { + k: [ 1, 2 ], + a: [ 3, 4 ], + b: [ 5, 6 ] + }, 'natural join data, common columns'); + + tableEqual(tl.join_left(t1), { + k: [ 1, 2, 3 ], + a: [ 3, 4, 1 ], + b: [ 5, 6, undefined ] + }, 'natural left join data, common columns'); + + tableEqual(tl.join_right(t1), { + k: [ 1, 2, 4 ], + a: [ 3, 4, undefined ], + b: [ 5, 6, 2 ] + }, 'natural right join data, common columns'); + + tableEqual(tl.join_full(t1), { + k: [ 1, 2, 3, 4 ], + a: [ 3, 4, 1, undefined ], + b: [ 5, 6, undefined, 2 ] + }, 'natural full join data, common columns'); + + assert.throws( + () =>tl.join(t2), + 'natural join throws, no common columns' + ); }); - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b', undefined ], - x: [ 1, 2, 2, 3, 3, undefined ], - y: [ 9, 8, 8, 7, 7, undefined ], - u: [ 'a', 'b', 'b', 'b', 'b', 'd' ], - v: [ 4, 5, 6, 5, 6, 0 ], - z: [ 5, 7, 8, 8, 9, NaN ] - }, 'right join data'); - - t.end(); -}); - 
-tape('join_full performs full outer join with predicate', t => { - const [tl, tr] = joinTables(); - - const tj = tl.join_full(tr, (a, b) => a.k === b.u, { - k: d => d.k, - x: d => d.x, - y: d => d.y, - u: (a, b) => b.u, - v: (a, b) => b.v, - z: (a, b) => a.x + b.v + it('handles filtered tables', () => { + const tl = table({ + key: [1, 2, 3, 4], + value1: [1, 2, 3, 4] + }).filter(d => d.key < 3); + + const tr = table({ + key: [1, 2, 5], + value2: [1, 2, 5] + }); + + tableEqual(tl.join_left(tr), { + key: [ 1, 2 ], + value1: [ 1, 2 ], + value2: [ 1, 2 ] + }, 'natural left join on filtered data'); + + tableEqual(tl.join_right(tr), { + key: [ 1, 2, 5 ], + value1: [ 1, 2, undefined ], + value2: [ 1, 2, 5 ] + }, 'natural right join on filtered data'); + + const dt = table({ + year: [2017, 2017, 2017, 2018, 2018, 2018], + month: ['01', '02', 'YR', '01', '02', 'YR'], + count: [6074, 7135, 220582, 5761, 6764, 222153] + }); + + const jt = dt + .filter(d => d.month === 'YR') + .select('year', {count: 'total'}) + .join(dt.filter(d => d.month !== 'YR')); + + tableEqual(jt, { + total: [ 220582, 220582, 222153, 222153 ], + year: [ 2017, 2017, 2018, 2018 ], + month: [ '01', '02', '01', '02' ], + count: [ 6074, 7135, 5761, 6764 ] + }, 'join of two filtered tables'); }); - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b', 'c', undefined ], - x: [ 1, 2, 2, 3, 3, 4, undefined ], - y: [ 9, 8, 8, 7, 7, 6, undefined ], - u: [ 'a', 'b', 'b', 'b', 'b', undefined, 'd' ], - v: [ 4, 5, 6, 5, 6, undefined, 0 ], - z: [ 5, 7, 8, 8, 9, NaN, NaN ] - }, 'full join data'); - - t.end(); -}); + it('performs inner join with predicate', () => { + const [tl, tr] = joinTables(); + + const tj = tl.join(tr, (a, b) => a.k === b.u, { + k: d => d.k, + x: d => d.x, + y: d => d.y, + u: (a, b) => b.u, + v: (a, b) => b.v, + z: (a, b) => a.x + b.v + }); + + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b' ], + x: [ 1, 2, 2, 3, 3 ], + y: [ 9, 8, 8, 7, 7 ], + u: [ 'a', 'b', 'b', 'b', 'b' ], + v: [ 4, 5, 6, 5, 6 ], + 
z: [ 5, 7, 8, 8, 9 ] + }, 'inner join data'); + }); -tape('join performs inner join with keys', t => { - const [tl, tr] = joinTables(); + it('performs inner join with keys', () => { + const [tl, tr] = joinTables(); - const tj = tl.join(tr, ['k', 'u'], [all(), not('u')]); + const tj = tl.join(tr, ['k', 'u'], [all(), not('u')]); - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b' ], - x: [ 1, 2, 2, 3, 3 ], - y: [ 9, 8, 8, 7, 7 ], - v: [ 4, 5, 6, 5, 6 ] - }, 'inner join data'); + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b' ], + x: [ 1, 2, 2, 3, 3 ], + y: [ 9, 8, 8, 7, 7 ], + v: [ 4, 5, 6, 5, 6 ] + }, 'inner join data'); + }); - t.end(); -}); + it('handles column name collisions', () => { + const [tl] = joinTables(); + const tr = table({ k: ['a', 'b'], x: [9, 8] }); + + const tj_inner = tl.join(tr, 'k'); + tableEqual(tj_inner, { + k: [ 'a', 'b', 'b' ], + x_1: [ 1, 2, 3 ], + y: [ 9, 8, 7 ], + x_2: [ 9, 8, 8 ] + }, 'name collision inner join data'); + + const tj_full = tl.join_full(tr, 'k'); + tableEqual(tj_full, { + k: [ 'a', 'b', 'b', 'c' ], + x_1: [ 1, 2, 3, 4 ], + y: [ 9, 8, 7, 6 ], + x_2: [ 9, 8, 8, undefined ] + }, 'name collision full join data'); + + const tj1 = tl.join(tr, ['k', 'k'], [all(), all()]); + tableEqual(tj1, { + k_1: [ 'a', 'b', 'b' ], + x_1: [ 1, 2, 3 ], + y: [ 9, 8, 7 ], + k_2: [ 'a', 'b', 'b' ], + x_2: [ 9, 8, 8 ] + }, 'name collision join data'); + + const tj2 = tl.join(tr, ['k', 'k'], [ + all(), + all(), + { y: (a, b) => a.x + b.x } + ]); + tableEqual(tj2, { + k_1: [ 'a', 'b', 'b' ], + x_1: [ 1, 2, 3 ], + k_2: [ 'a', 'b', 'b' ], + x_2: [ 9, 8, 8 ], + y: [ 10, 10, 11 ] + }, 'name override join data'); + }); -tape('join_left performs left outer join with keys', t => { - const [tl, tr] = joinTables(); + it('does not treat null values as equal', () => { + const tl = table({ u: ['a', null, undefined, NaN], a: [1, 2, 3, 4] }); + const tr = table({ v: [null, undefined, NaN, 'a'], b: [9, 8, 7, 6] }); - const tj = tl.join_left(tr, ['k', 'u'], [all(), 
not('u')]); + const tj1 = tl.join(tr, ['u', 'v'], [all(), all()]); - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b', 'c' ], - x: [ 1, 2, 2, 3, 3, 4 ], - y: [ 9, 8, 8, 7, 7, 6 ], - v: [ 4, 5, 6, 5, 6, undefined ] - }, 'left join data'); + tableEqual(tj1, { + u: [ 'a' ], + v: [ 'a' ], + a: [ 1 ], + b: [ 6 ] + }, 'null join data with keys'); - t.end(); -}); + const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v), [all(), all()]); -tape('join_right performs right outer join with keys', t => { - const [tl, tr] = joinTables(); + tableEqual(tj2, { + u: [ 'a' ], + v: [ 'a' ], + a: [ 1 ], + b: [ 6 ] + }, 'null join data with equal predicate'); + }); - const tj = tl.join_right(tr, ['k', 'u'], [not('k'), all()]); + it('supports date-valued keys', () => { + const d1 = new Date(2000, 0, 1); + const d2 = new Date(2012, 1, 3); + const tl = table({ u: [d1, d2, null], a: [9, 8, 7] }); + const tr = table({ v: [new Date(+d1), +d2], b: [5, 4] }); - tableEqual(t, tj, { - x: [ 1, 2, 2, 3, 3, undefined ], - y: [ 9, 8, 8, 7, 7, undefined ], - u: [ 'a', 'b', 'b', 'b', 'b', 'd' ], - v: [ 4, 5, 6, 5, 6, 0 ] - }, 'right join data'); + const tj1 = tl.join(tr, ['u', 'v'], [all(), not('v')]); - t.end(); -}); + tableEqual(tj1, { + u: [d1, d2], + a: [9, 8], + b: [5, 4] + }, 'hash join data with date keys'); -tape('join_full performs full outer join with keys', t => { - const [tl, tr] = joinTables(); + const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v), [all(), not('v')]); - const tj = tl.join_full(tr, ['k', 'u'], [all(), all()]); - - tableEqual(t, tj, { - k: [ 'a', 'b', 'b', 'b', 'b', 'c', undefined ], - x: [ 1, 2, 2, 3, 3, 4, undefined ], - y: [ 9, 8, 8, 7, 7, 6, undefined ], - u: [ 'a', 'b', 'b', 'b', 'b', undefined, 'd' ], - v: [ 4, 5, 6, 5, 6, undefined, 0 ] - }, 'full join data'); + tableEqual(tj2, { + u: [d1, d2], + a: [9, 8], + b: [5, 4] + }, 'loop join data with date keys'); + }); - t.end(); -}); + it('supports regexp-valued keys', () => { + const tl = table({ u: [/foo/g, /bar.*/i, 
null], a: [9, 8, 7] }); + const tr = table({ v: [/foo/g, /bar.*/i], b: [5, 4] }); -tape('join handles column name collisions', t => { - const [tl] = joinTables(); - const tr = table({ k: ['a', 'b'], x: [9, 8] }); - - const tj_inner = tl.join(tr, 'k'); - tableEqual(t, tj_inner, { - k: [ 'a', 'b', 'b' ], - x_1: [ 1, 2, 3 ], - y: [ 9, 8, 7 ], - x_2: [ 9, 8, 8 ] - }, 'name collision inner join data'); - - const tj_full = tl.join_full(tr, 'k'); - tableEqual(t, tj_full, { - k: [ 'a', 'b', 'b', 'c' ], - x_1: [ 1, 2, 3, 4 ], - y: [ 9, 8, 7, 6 ], - x_2: [ 9, 8, 8, undefined ] - }, 'name collision full join data'); - - const tj1 = tl.join(tr, ['k', 'k'], [all(), all()]); - tableEqual(t, tj1, { - k_1: [ 'a', 'b', 'b' ], - x_1: [ 1, 2, 3 ], - y: [ 9, 8, 7 ], - k_2: [ 'a', 'b', 'b' ], - x_2: [ 9, 8, 8 ] - }, 'name collision join data'); - - const tj2 = tl.join(tr, ['k', 'k'], [ - all(), - all(), - { y: (a, b) => a.x + b.x } - ]); - tableEqual(t, tj2, { - k_1: [ 'a', 'b', 'b' ], - x_1: [ 1, 2, 3 ], - k_2: [ 'a', 'b', 'b' ], - x_2: [ 9, 8, 8 ], - y: [ 10, 10, 11 ] - }, 'name override join data'); - - t.end(); -}); + const tj1 = tl.join(tr, ['u', 'v'], [all(), not('v')]); -tape('join does not treat null values as equal', t => { - const tl = table({ u: ['a', null, undefined, NaN], a: [1, 2, 3, 4] }); - const tr = table({ v: [null, undefined, NaN, 'a'], b: [9, 8, 7, 6] }); + tableEqual(tj1, { + u: [/foo/g, /bar.*/i], + a: [9, 8], + b: [5, 4] + }, 'hash join data with regexp keys'); - const tj1 = tl.join(tr, ['u', 'v'], [all(), all()]); + const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v), [all(), not('v')]); - tableEqual(t, tj1, { - u: [ 'a' ], - v: [ 'a' ], - a: [ 1 ], - b: [ 6 ] - }, 'null join data with keys'); + tableEqual(tj2, { + u: [/foo/g, /bar.*/i], + a: [9, 8], + b: [5, 4] + }, 'loop join data with regexp keys'); + }); - const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v), [all(), all()]); + it('supports array-valued keys', () => { + const tl = table({ u: [[1, 2], [3, 
4], null], a: [9, 8, 7] }); + const tr = table({ v: [[1, 2], [3, 4]], b: [5, 4] }); - tableEqual(t, tj2, { - u: [ 'a' ], - v: [ 'a' ], - a: [ 1 ], - b: [ 6 ] - }, 'null join data with equal predicate'); + const tj1 = tl.join(tr, ['u', 'v']); - t.end(); -}); + tableEqual(tj1, { + u: [[1, 2], [3, 4]], + a: [9, 8], + v: [[1, 2], [3, 4]], + b: [5, 4] + }, 'hash join data with array keys'); -tape('join supports date-valued keys', t => { - const d1 = new Date(2000, 0, 1); - const d2 = new Date(2012, 1, 3); - const tl = table({ u: [d1, d2, null], a: [9, 8, 7] }); - const tr = table({ v: [new Date(+d1), +d2], b: [5, 4] }); + const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v)); - const tj1 = tl.join(tr, ['u', 'v'], [all(), not('v')]); + tableEqual(tj2, { + u: [[1, 2], [3, 4]], + a: [9, 8], + v: [[1, 2], [3, 4]], + b: [5, 4] + }, 'loop join data with array keys'); + }); - tableEqual(t, tj1, { - u: [d1, d2], - a: [9, 8], - b: [5, 4] - }, 'hash join data with date keys'); + it('supports object-valued keys', () => { + const tl = table({ u: [{k: 1, l: [2]}, {k: 2}, null], a: [9, 8, 7] }); + const tr = table({ v: [{k: 1, l: [2]}, {k: 2}], b: [5, 4] }); - const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v), [all(), not('v')]); + const tj1 = tl.join(tr, ['u', 'v']); - tableEqual(t, tj2, { - u: [d1, d2], - a: [9, 8], - b: [5, 4] - }, 'loop join data with date keys'); + tableEqual(tj1, { + u: [{k: 1, l: [2]}, {k: 2}], + a: [9, 8], + v: [{k: 1, l: [2]}, {k: 2}], + b: [5, 4] + }, 'hash join data with object keys'); - t.end(); -}); + const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v)); -tape('join supports regexp-valued keys', t => { - const tl = table({ u: [/foo/g, /bar.*/i, null], a: [9, 8, 7] }); - const tr = table({ v: [/foo/g, /bar.*/i], b: [5, 4] }); + tableEqual(tj2, { + u: [{k: 1, l: [2]}, {k: 2}], + a: [9, 8], + v: [{k: 1, l: [2]}, {k: 2}], + b: [5, 4] + }, 'loop join data with object keys'); + }); - const tj1 = tl.join(tr, ['u', 'v'], [all(), not('v')]); + it('allows 
empty suffix', () => { + const t1 = table({ k: [1, 2, 3], a: [3, 4, 1]}); + const t2 = table({ k: [1, 2, 3], a: [5, 6, 2]}); - tableEqual(t, tj1, { - u: [/foo/g, /bar.*/i], - a: [9, 8], - b: [5, 4] - }, 'hash join data with regexp keys'); + const tj = t1.join(t2, ['k','k'], [all(), not('k')], { suffix: ['', '_2'] }); - const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v), [all(), not('v')]); + tableEqual(tj, { + k: [1, 2, 3], + a: [3, 4, 1], + a_2: [5, 6, 2] + }, 'join with empty suffix left'); - tableEqual(t, tj2, { - u: [/foo/g, /bar.*/i], - a: [9, 8], - b: [5, 4] - }, 'loop join data with regexp keys'); + const tj2 = t1.join(t2, ['k','k'], [all(), not('k')], { suffix: ['_2', ''] }); - t.end(); + tableEqual(tj2, { + k: [1, 2, 3], + a_2: [3, 4, 1], + a: [5, 6, 2] + }, 'join with empty suffix right'); + }); }); -tape('join supports array-valued keys', t => { - const tl = table({ u: [[1, 2], [3, 4], null], a: [9, 8, 7] }); - const tr = table({ v: [[1, 2], [3, 4]], b: [5, 4] }); - - const tj1 = tl.join(tr, ['u', 'v']); - - tableEqual(t, tj1, { - u: [[1, 2], [3, 4]], - a: [9, 8], - v: [[1, 2], [3, 4]], - b: [5, 4] - }, 'hash join data with array keys'); +describe('join_left', () => { + it('performs left outer join with predicate', () => { + const [tl, tr] = joinTables(); + + const tj = tl.join_left(tr, (a, b) => a.k === b.u, { + k: d => d.k, + x: d => d.x, + y: d => d.y, + u: (a, b) => b.u, + v: (a, b) => b.v, + z: (a, b) => a.x + b.v + }); + + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b', 'c' ], + x: [ 1, 2, 2, 3, 3, 4 ], + y: [ 9, 8, 8, 7, 7, 6 ], + u: [ 'a', 'b', 'b', 'b', 'b', undefined ], + v: [ 4, 5, 6, 5, 6, undefined ], + z: [ 5, 7, 8, 8, 9, NaN ] + }, 'left join data'); + }); - const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v)); + it('performs left outer join with keys', () => { + const [tl, tr] = joinTables(); - tableEqual(t, tj2, { - u: [[1, 2], [3, 4]], - a: [9, 8], - v: [[1, 2], [3, 4]], - b: [5, 4] - }, 'loop join data with array keys'); + const tj 
= tl.join_left(tr, ['k', 'u'], [all(), not('u')]); - t.end(); + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b', 'c' ], + x: [ 1, 2, 2, 3, 3, 4 ], + y: [ 9, 8, 8, 7, 7, 6 ], + v: [ 4, 5, 6, 5, 6, undefined ] + }, 'left join data'); + }); }); -tape('join supports object-valued keys', t => { - const tl = table({ u: [{k: 1, l: [2]}, {k: 2}, null], a: [9, 8, 7] }); - const tr = table({ v: [{k: 1, l: [2]}, {k: 2}], b: [5, 4] }); - - const tj1 = tl.join(tr, ['u', 'v']); - - tableEqual(t, tj1, { - u: [{k: 1, l: [2]}, {k: 2}], - a: [9, 8], - v: [{k: 1, l: [2]}, {k: 2}], - b: [5, 4] - }, 'hash join data with object keys'); +describe('join_right', () => { + it('performs right outer join with predicate', () => { + const [tl, tr] = joinTables(); + + const tj = tl.join_right(tr, (a, b) => a.k === b.u, { + k: d => d.k, + x: d => d.x, + y: d => d.y, + u: (a, b) => b.u, + v: (a, b) => b.v, + z: (a, b) => a.x + b.v + }); + + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b', undefined ], + x: [ 1, 2, 2, 3, 3, undefined ], + y: [ 9, 8, 8, 7, 7, undefined ], + u: [ 'a', 'b', 'b', 'b', 'b', 'd' ], + v: [ 4, 5, 6, 5, 6, 0 ], + z: [ 5, 7, 8, 8, 9, NaN ] + }, 'right join data'); + }); - const tj2 = tl.join(tr, (a, b) => op.equal(a.u, b.v)); + it('performs right outer join with keys', () => { + const [tl, tr] = joinTables(); - tableEqual(t, tj2, { - u: [{k: 1, l: [2]}, {k: 2}], - a: [9, 8], - v: [{k: 1, l: [2]}, {k: 2}], - b: [5, 4] - }, 'loop join data with object keys'); + const tj = tl.join_right(tr, ['k', 'u'], [not('k'), all()]); - t.end(); + tableEqual(tj, { + x: [ 1, 2, 2, 3, 3, undefined ], + y: [ 9, 8, 8, 7, 7, undefined ], + u: [ 'a', 'b', 'b', 'b', 'b', 'd' ], + v: [ 4, 5, 6, 5, 6, 0 ] + }, 'right join data'); + }); }); -tape('join allows empty suffix', t => { - const t1 = table({ k: [1, 2, 3], a: [3, 4, 1]}); - const t2 = table({ k: [1, 2, 3], a: [5, 6, 2]}); - - const tj = t1.join(t2, ['k','k'], [all(), not('k')], { suffix: ['', '_2'] }); +describe('join_full', () => { + 
it('performs full outer join with keys', () => { + const [tl, tr] = joinTables(); - tableEqual(t, tj, { - k: [1, 2, 3], - a: [3, 4, 1], - a_2: [5, 6, 2] - }, 'join with empty suffix left'); + const tj = tl.join_full(tr, ['k', 'u'], [all(), all()]); - const tj2 = t1.join(t2, ['k','k'], [all(), not('k')], { suffix: ['_2', ''] }); - - tableEqual(t, tj2, { - k: [1, 2, 3], - a_2: [3, 4, 1], - a: [5, 6, 2] - }, 'join with empty suffix right'); + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b', 'c', undefined ], + x: [ 1, 2, 2, 3, 3, 4, undefined ], + y: [ 9, 8, 8, 7, 7, 6, undefined ], + u: [ 'a', 'b', 'b', 'b', 'b', undefined, 'd' ], + v: [ 4, 5, 6, 5, 6, undefined, 0 ] + }, 'full join data'); + }); - t.end(); + it('performs full outer join with predicate', () => { + const [tl, tr] = joinTables(); + + const tj = tl.join_full(tr, (a, b) => a.k === b.u, { + k: d => d.k, + x: d => d.x, + y: d => d.y, + u: (a, b) => b.u, + v: (a, b) => b.v, + z: (a, b) => a.x + b.v + }); + + tableEqual(tj, { + k: [ 'a', 'b', 'b', 'b', 'b', 'c', undefined ], + x: [ 1, 2, 2, 3, 3, 4, undefined ], + y: [ 9, 8, 8, 7, 7, 6, undefined ], + u: [ 'a', 'b', 'b', 'b', 'b', undefined, 'd' ], + v: [ 4, 5, 6, 5, 6, undefined, 0 ], + z: [ 5, 7, 8, 8, 9, NaN, NaN ] + }, 'full join data'); + }); }); diff --git a/test/verbs/lookup-test.js b/test/verbs/lookup-test.js index ef022b1b..1b45c8d6 100644 --- a/test/verbs/lookup-test.js +++ b/test/verbs/lookup-test.js @@ -1,54 +1,54 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { op, table } from '../../src'; - -tape('lookup retrieves values from lookup table with string values', t => { - const right = table({ - key: [1, 2, 3], - u: ['a', 'b', 'c'], - v: [5, 3, 1] +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { op, table } from '../../src/index.js'; + +describe('lookup', () => { + it('retrieves values from lookup table with string values', () => { + const right = table({ + key: [1, 2, 3], + 
u: ['a', 'b', 'c'], + v: [5, 3, 1] + }); + + const left = table({ + id: [1, 2, 3, 4, 1] + }); + + const lt = left.lookup(right, ['id', 'key'], ['u', 'v']); + + assert.equal(lt.numRows(), 5, 'num rows'); + assert.equal(lt.numCols(), 3, 'num cols'); + + tableEqual(lt, { + id: [1, 2, 3, 4, 1], + u: ['a', 'b', 'c', undefined, 'a'], + v: [5, 3, 1, undefined, 5] + }, 'lookup data'); }); - const left = table({ - id: [1, 2, 3, 4, 1] + it('retrieves values from lookup table with function values', () => { + const right = table({ + key: [1, 2, 3], + u: ['a', 'b', 'c'], + v: [5, 3, 1] + }); + + const left = table({ + id: [1, 2, 3, 4, 1] + }); + + const lt = left.lookup(right, ['id', 'key'], { + u: d => d.u, + v: d => d.v - op.mean(d.v) + }); + + assert.equal(lt.numRows(), 5, 'num rows'); + assert.equal(lt.numCols(), 3, 'num cols'); + + tableEqual(lt, { + id: [1, 2, 3, 4, 1], + u: ['a', 'b', 'c', undefined, 'a'], + v: [2, 0, -2, undefined, 2] + }, 'lookup data'); }); - - const lt = left.lookup(right, ['id', 'key'], ['u', 'v']); - - t.equal(lt.numRows(), 5, 'num rows'); - t.equal(lt.numCols(), 3, 'num cols'); - - tableEqual(t, lt, { - id: [1, 2, 3, 4, 1], - u: ['a', 'b', 'c', undefined, 'a'], - v: [5, 3, 1, undefined, 5] - }, 'lookup data'); - t.end(); }); - -tape('lookup retrieves values from lookup table with function values', t => { - const right = table({ - key: [1, 2, 3], - u: ['a', 'b', 'c'], - v: [5, 3, 1] - }); - - const left = table({ - id: [1, 2, 3, 4, 1] - }); - - const lt = left.lookup(right, ['id', 'key'], { - u: d => d.u, - v: d => d.v - op.mean(d.v) - }); - - t.equal(lt.numRows(), 5, 'num rows'); - t.equal(lt.numCols(), 3, 'num cols'); - - tableEqual(t, lt, { - id: [1, 2, 3, 4, 1], - u: ['a', 'b', 'c', undefined, 'a'], - v: [2, 0, -2, undefined, 2] - }, 'lookup data'); - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/orderby-test.js b/test/verbs/orderby-test.js index 4fcf8fa1..ea30f1f3 100644 --- a/test/verbs/orderby-test.js +++ 
b/test/verbs/orderby-test.js @@ -1,50 +1,47 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { desc, op, table } from '../../src'; - -tape('orderby orders a table', t => { - const data = { - a: [2, 2, 3, 3, 1, 1], - b: [1, 2, 1, 2, 1, 2] - }; - - const ordered = { - a: [1, 1, 2, 2, 3, 3], - b: [2, 1, 2, 1, 2, 1] - }; - - const dt = table(data).orderby('a', desc('b')); - - const rows = []; - dt.scan(row => rows.push(row), true); - t.deepEqual(rows, [5, 4, 1, 0, 3, 2], 'orderby scan'); - - tableEqual(t, dt, ordered, 'orderby data'); - - t.end(); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { desc, op, table } from '../../src/index.js'; + +describe('orderby', () => { + it('orders a table', () => { + const data = { + a: [2, 2, 3, 3, 1, 1], + b: [1, 2, 1, 2, 1, 2] + }; + + const ordered = { + a: [1, 1, 2, 2, 3, 3], + b: [2, 1, 2, 1, 2, 1] + }; + + const dt = table(data).orderby('a', desc('b')); + + const rows = []; + dt.scan(row => rows.push(row), true); + assert.deepEqual(rows, [5, 4, 1, 0, 3, 2], 'orderby scan'); + + tableEqual(dt, ordered, 'orderby data'); + }); + + it('supports aggregate functions', () => { + const data = { + a: [1, 2, 2, 3, 4, 5], + b: [9, 8, 7, 6, 5, 4] + }; + + const dt = table(data) + .groupby('a') + .orderby(d => op.mean(d.b)) + .reify(); + + tableEqual( dt, { + a: [5, 4, 3, 2, 2, 1], + b: [4, 5, 6, 8, 7, 9] + }, 'orderby data'); + }); + + it('throws on window functions', () => { + const data = { a: [1, 3, 5, 7] }; + assert.throws(() => table(data).orderby({ res: d => op.lag(d.a) }), 'no window'); + }); }); - -tape('orderby supports aggregate functions', t => { - const data = { - a: [1, 2, 2, 3, 4, 5], - b: [9, 8, 7, 6, 5, 4] - }; - - const dt = table(data) - .groupby('a') - .orderby(d => op.mean(d.b)) - .reify(); - - tableEqual(t, dt, { - a: [5, 4, 3, 2, 2, 1], - b: [4, 5, 6, 8, 7, 9] - }, 'orderby data'); - - t.end(); -}); - -tape('orderby throws on window functions', t 
=> { - const data = { a: [1, 3, 5, 7] }; - t.throws(() => table(data).orderby({ res: d => op.lag(d.a) }), 'no window'); - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/pivot-test.js b/test/verbs/pivot-test.js index 28924c2c..db186686 100644 --- a/test/verbs/pivot-test.js +++ b/test/verbs/pivot-test.js @@ -1,142 +1,133 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { op, table } from '../../src'; - -tape('pivot generates cross-tabulation for single value', t => { - const data = { - k: ['a', 'b', 'c'], - x: [1, 2, 3] - }; - - const ut = table(data).pivot('k', 'x'); - tableEqual(t, ut, { - a: [ 1 ], b: [ 2 ], c: [ 3 ] - }, 'pivot data'); - t.end(); -}); - -tape('pivot generates cross-tabulation for single value as function', t => { - const data = { - k: ['a', 'b', 'c'], - x: [1, 2, 3] - }; - - const ut = table(data).pivot('k', { x: d => op.any(d.x) }); - tableEqual(t, ut, { - a: [ 1 ], b: [ 2 ], c: [ 3 ] - }, 'pivot data'); - t.end(); -}); - -tape('pivot generates cross-tabulation with groupby', t => { - const data = { - g: [0, 0, 1, 1], - k: ['a', 'b', 'a', 'b'], - x: [1, 2, 3, 4] - }; - - const ut = table(data).groupby('g').pivot('k', 'x'); - tableEqual(t, ut, { - g: [ 0, 1 ], a: [ 1, 3 ], b: [ 2, 4 ] - }, 'pivot data'); - t.end(); -}); - -tape('pivot generates cross-tabulation for multiple values', t => { - const data = { - k: ['a', 'b', 'b', 'c'], - x: [+1, +2, +2, +3], - y: [-9, -2, +2, -7] - }; - - const ut = table(data).pivot('k', { - x: d => op.sum(d.x), - y: d => op.product(op.abs(d.y)) +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { op, table } from '../../src/index.js'; + +describe('pivot', () => { + it('generates cross-tabulation for single value', () => { + const data = { + k: ['a', 'b', 'c'], + x: [1, 2, 3] + }; + + const ut = table(data).pivot('k', 'x'); + tableEqual(ut, { + a: [ 1 ], b: [ 2 ], c: [ 3 ] + }, 'pivot data'); }); - tableEqual(t, ut, { - x_a: [ 1 
], - x_b: [ 4 ], - x_c: [ 3 ], - y_a: [ 9 ], - y_b: [ 4 ], - y_c: [ 7 ] - }, 'pivot data'); + it('generates cross-tabulation for single value as function', () => { + const data = { + k: ['a', 'b', 'c'], + x: [1, 2, 3] + }; - t.end(); -}); + const ut = table(data).pivot('k', { x: d => op.any(d.x) }); + tableEqual(ut, { + a: [ 1 ], b: [ 2 ], c: [ 3 ] + }, 'pivot data'); + }); -tape('pivot respects input options', t => { - const data = { - k: ['a', 'b', 'c'], - j: ['d', 'e', 'f'], - x: [1, 2, 3], - y: [9, 8, 7] - }; - - const ut = table(data).pivot(['k', 'j'], ['x', 'y'], { - keySeparator: '/', - valueSeparator: ':', - limit: 2 + it('generates cross-tabulation with groupby', () => { + const data = { + g: [0, 0, 1, 1], + k: ['a', 'b', 'a', 'b'], + x: [1, 2, 3, 4] + }; + + const ut = table(data).groupby('g').pivot('k', 'x'); + tableEqual(ut, { + g: [ 0, 1 ], a: [ 1, 3 ], b: [ 2, 4 ] + }, 'pivot data'); }); - tableEqual(t, ut, { - 'x:a/d': [ 1 ], - 'x:b/e': [ 2 ], - 'y:a/d': [ 9 ], - 'y:b/e': [ 8 ] - }, 'pivot data'); + it('generates cross-tabulation for multiple values', () => { + const data = { + k: ['a', 'b', 'b', 'c'], + x: [+1, +2, +2, +3], + y: [-9, -2, +2, -7] + }; + + const ut = table(data).pivot('k', { + x: d => op.sum(d.x), + y: d => op.product(op.abs(d.y)) + }); + + tableEqual(ut, { + x_a: [ 1 ], + x_b: [ 4 ], + x_c: [ 3 ], + y_a: [ 9 ], + y_b: [ 4 ], + y_c: [ 7 ] + }, 'pivot data'); + }); - t.end(); -}); + it('respects input options', () => { + const data = { + k: ['a', 'b', 'c'], + j: ['d', 'e', 'f'], + x: [1, 2, 3], + y: [9, 8, 7] + }; + + const ut = table(data).pivot(['k', 'j'], ['x', 'y'], { + keySeparator: '/', + valueSeparator: ':', + limit: 2 + }); + + tableEqual(ut, { + 'x:a/d': [ 1 ], + 'x:b/e': [ 2 ], + 'y:a/d': [ 9 ], + 'y:b/e': [ 8 ] + }, 'pivot data'); + }); -tape('pivot correctly orders integer column names', t => { - const data = { - g: ['a', 'a', 'a', 'b', 'b', 'b'], - k: [2002, 2001, 2000, 2002, 2001, 2000], - x: [1, 2, 3, 4, 5, 6] - }; + 
it('correctly orders integer column names', () => { + const data = { + g: ['a', 'a', 'a', 'b', 'b', 'b'], + k: [2002, 2001, 2000, 2002, 2001, 2000], + x: [1, 2, 3, 4, 5, 6] + }; - const ut = table(data) - .groupby('g') - .pivot('k', 'x', { sort: false }); + const ut = table(data) + .groupby('g') + .pivot('k', 'x', { sort: false }); - tableEqual(t, ut, { - 'g': ['a', 'b'], '2002': [ 1, 4 ], '2001': [ 2, 5 ], '2000': [ 3, 6 ] - }, 'pivot data'); + tableEqual(ut, { + 'g': ['a', 'b'], '2002': [ 1, 4 ], '2001': [ 2, 5 ], '2000': [ 3, 6 ] + }, 'pivot data'); - t.deepEqual(ut.columnNames(), ['g', '2002', '2001', '2000']); - t.end(); -}); + assert.deepEqual(ut.columnNames(), ['g', '2002', '2001', '2000']); + }); -tape('pivot handles filtered and ordered table', t => { - const dt = table({ - country: ['France', 'France', 'France', 'Germany', 'Germany', 'Germany', 'Japan', 'Japan', 'Japan'], - year: [2017, 2018, 2019, 2017, 2018, 2019, 2017, 2018, 2019], - expenditure: ['NA', 51410, 52229, 45340, 46512, 51190, 46542, 46618, 46562] - }) - .filter(d => d.year > 2017) - .orderby('country') - .groupby('country') - .pivot('year', 'expenditure'); - - tableEqual(t, dt, { - country: ['France', 'Germany', 'Japan'], - 2018: [51410, 46512, 46618], - 2019: [52229,51190,46562] - }, 'pivot data'); - - t.end(); -}); + it('handles filtered and ordered table', () => { + const dt = table({ + country: ['France', 'France', 'France', 'Germany', 'Germany', 'Germany', 'Japan', 'Japan', 'Japan'], + year: [2017, 2018, 2019, 2017, 2018, 2019, 2017, 2018, 2019], + expenditure: ['NA', 51410, 52229, 45340, 46512, 51190, 46542, 46618, 46562] + }) + .filter(d => d.year > 2017) + .orderby('country') + .groupby('country') + .pivot('year', 'expenditure'); + + tableEqual(dt, { + country: ['France', 'Germany', 'Japan'], + 2018: [51410, 46512, 46618], + 2019: [52229,51190,46562] + }, 'pivot data'); + }); -tape('pivot handles count aggregates', t => { - const data = { - k: ['a', 'b', 'b'] - }; + it('handles 
count aggregates', () => { + const data = { + k: ['a', 'b', 'b'] + }; - const ut = table(data).pivot('k', { count: op.count() }); - tableEqual(t, ut, { - a: [ 1 ], b: [ 2 ] - }, 'pivot data'); - t.end(); + const ut = table(data).pivot('k', { count: op.count() }); + tableEqual(ut, { + a: [ 1 ], b: [ 2 ] + }, 'pivot data'); + }); }); diff --git a/test/verbs/reduce-test.js b/test/verbs/reduce-test.js index 470c536d..5e2531d4 100644 --- a/test/verbs/reduce-test.js +++ b/test/verbs/reduce-test.js @@ -1,41 +1,40 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import countPattern from '../../src/engine/reduce/count-pattern'; -import { table } from '../../src'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; +import countPattern from '../../src/verbs/reduce/count-pattern.js'; -tape('reduce produces multiple aggregates', t => { - const data = { - text: ['foo bar', 'foo', 'bar baz', 'baz'] - }; +describe('reduce', () => { + it('produces multiple aggregates', () => { + const data = { + text: ['foo bar', 'foo', 'bar baz', 'baz'] + }; - const dt = table(data).reduce(countPattern('text')); + const dt = table(data).reduce(countPattern('text')); - t.equal(dt.numRows(), 3, 'num rows'); - t.equal(dt.numCols(), 2, 'num columns'); - tableEqual(t, dt, { - word: ['foo', 'bar', 'baz'], - count: [2, 2, 2] - }, 'reduce data'); - t.end(); -}); - -tape('reduce produces grouped multiple aggregates', t => { - const data = { - key: ['a', 'a', 'b', 'b'], - text: ['foo bar', 'foo', 'bar baz', 'baz bop'] - }; + assert.equal(dt.numRows(), 3, 'num rows'); + assert.equal(dt.numCols(), 2, 'num columns'); + tableEqual(dt, { + word: ['foo', 'bar', 'baz'], + count: [2, 2, 2] + }, 'reduce data'); + }); - const dt = table(data) - .groupby('key') - .reduce(countPattern('text')); + it('produces grouped multiple aggregates', () => { + const data = { + key: ['a', 'a', 'b', 'b'], + text: ['foo bar', 'foo', 'bar 
baz', 'baz bop'] + }; - t.equal(dt.numRows(), 5, 'num rows'); - t.equal(dt.numCols(), 3, 'num columns'); - tableEqual(t, dt, { - key: ['a', 'a', 'b', 'b', 'b'], - word: ['foo', 'bar', 'bar', 'baz', 'bop'], - count: [2, 1, 1, 2, 1] - }, 'reduce data'); + const dt = table(data) + .groupby('key') + .reduce(countPattern('text')); - t.end(); -}); \ No newline at end of file + assert.equal(dt.numRows(), 5, 'num rows'); + assert.equal(dt.numCols(), 3, 'num columns'); + tableEqual(dt, { + key: ['a', 'a', 'b', 'b', 'b'], + word: ['foo', 'bar', 'bar', 'baz', 'bop'], + count: [2, 1, 1, 2, 1] + }, 'reduce data'); + }); +}); diff --git a/test/verbs/reify-test.js b/test/verbs/reify-test.js index bea6401c..ac11a16e 100644 --- a/test/verbs/reify-test.js +++ b/test/verbs/reify-test.js @@ -1,50 +1,46 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src/table'; -import { fromArrow, toArrow } from '../../src'; - -tape('reify materializes filtered and ordered tables', t => { - const dt = table({ - a: [5, 4, 3, 2, 1], - b: [1, 1, 0, 0, 1] +import tableEqual from '../table-equal.js'; +import { fromArrow, table, toArrow } from '../../src/index.js'; + +describe('reify', () => { + it('materializes filtered and ordered tables', () => { + const dt = table({ + a: [5, 4, 3, 2, 1], + b: [1, 1, 0, 0, 1] + }); + + const rt = dt.filter(d => d.b) + .orderby('a') + .reify(); + + tableEqual(rt, + { a: [1, 4, 5], b: [1, 1, 1] }, + 'reify data' + ); }); - const rt = dt.filter(d => d.b) - .orderby('a') - .reify(); - - tableEqual(t, rt, - { a: [1, 4, 5], b: [1, 1, 1] }, - 'reify data' - ); - - t.end(); + it('preserves binary data', () => { + const data = [ + { a: 1.0, b: 'a', c: [1], d: new Date(2000, 0, 1, 1) }, + { a: 1.3, b: 'b', c: [2], d: new Date(2001, 1, 1, 2) }, + { a: 1.5, b: 'c', c: [3], d: new Date(2002, 2, 1, 3) }, + { a: 1.7, b: 'd', c: [4], d: new Date(2003, 3, 1, 4) } + ]; + + const dt = fromArrow(toArrow(data)); + const rt = dt.filter(d 
=> d.b !== 'c').reify(); + + tableEqual(rt, + { + a: [1.0, 1.3, 1.7], + b: ['a', 'b', 'd'], + c: [[1], [2], [4]], + d: [ + new Date(2000, 0, 1, 1), + new Date(2001, 1, 1, 2), + new Date(2003, 3, 1, 4) + ] + }, + 'reify data' + ); + }); }); - -tape('reify preserves binary data', t => { - const data = [ - { a: 1.0, b: 'a', c: [1], d: new Date(2000, 0, 1, 1) }, - { a: 1.3, b: 'b', c: [2], d: new Date(2001, 1, 1, 2) }, - { a: 1.5, b: 'c', c: [3], d: new Date(2002, 2, 1, 3) }, - { a: 1.7, b: 'd', c: [4], d: new Date(2003, 3, 1, 4) } - ]; - - const dt = fromArrow(toArrow(data)); - const rt = dt.filter(d => d.b !== 'c').reify(); - - tableEqual(t, rt, - { - a: [1.0, 1.3, 1.7], - b: ['a', 'b', 'd'], - c: [[1], [2], [4]], - d: [ - new Date(2000, 0, 1, 1), - new Date(2001, 1, 1, 2), - new Date(2003, 3, 1, 4) - ] - }, - 'reify data' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/relocate-test.js b/test/verbs/relocate-test.js index 2509dba9..1702c0ed 100644 --- a/test/verbs/relocate-test.js +++ b/test/verbs/relocate-test.js @@ -1,91 +1,85 @@ -import tape from 'tape'; -import { not, range, table } from '../../src'; - -tape('relocate repositions columns', t => { - const a = [1], b = [2], c = [3], d = [4]; - const dt = table({ a, b, c, d }); - - t.deepEqual( - dt.relocate('a', { before: 'd' }).columnNames(), - ['b', 'c', 'a', 'd'], - 'relocate data, before' - ); - - t.deepEqual( - dt.relocate(not('b', 'd'), { before: 'b' }).columnNames(), - ['a', 'c', 'b', 'd'], - 'relocate data, before' - ); - - t.deepEqual( - dt.relocate(not('b', 'd'), { after: 'd' }).columnNames(), - ['b', 'd', 'a', 'c'], - 'relocate data, after' - ); - - t.deepEqual( - dt.relocate(not('b', 'd'), { before: 'c' }).columnNames(), - ['b', 'a', 'c', 'd'], - 'relocate data, before self' - ); - - t.deepEqual( - dt.relocate(not('b', 'd'), { after: 'a' }).columnNames(), - ['a', 'c', 'b', 'd'], - 'relocate data, after self' - ); - - t.end(); -}); - -tape('relocate repositions columns using 
multi-column anchor', t => { - const a = [1], b = [2], c = [3], d = [4]; - const dt = table({ a, b, c, d }); - - t.deepEqual( - dt.relocate([1, 3], { before: range(2, 3) }).columnNames(), - ['a', 'b', 'd', 'c'], - 'relocate data, before range' - ); - - t.deepEqual( - dt.relocate([1, 3], { after: range(2, 3) }).columnNames(), - ['a', 'c', 'b', 'd'], - 'relocate data, after range' - ); - - t.end(); -}); - -tape('relocate repositions and renames columns', t => { - const a = [1], b = [2], c = [3], d = [4]; - const dt = table({ a, b, c, d }); - - t.deepEqual( - dt.relocate({ a: 'e', c: 'f' }, { before: { b: '?' } }).columnNames(), - ['e', 'f', 'b', 'd'], - 'relocate data, before plus rename' - ); - - t.deepEqual( - dt.relocate({ a: 'e', c: 'f' }, { after: { b: '?' } }).columnNames(), - ['b', 'e', 'f', 'd'], - 'relocate data, after plus rename' - ); - - t.end(); -}); - -tape('relocate throws errors for invalid options', t => { - const a = [1], b = [2], c = [3], d = [4]; - const dt = table({ a, b, c, d }); - - t.throws(() => dt.relocate(not('b', 'd')), 'missing options'); - t.throws(() => dt.relocate(not('b', 'd'), {}), 'empty options'); - t.throws( - () => dt.relocate(not('b', 'd'), { before: 'b', after: 'b' }), - 'over-specified options' - ); - - t.end(); +import assert from 'node:assert'; +import { not, range, table } from '../../src/index.js'; + +describe('relocate', () => { + it(' repositions columns', () => { + const a = [1], b = [2], c = [3], d = [4]; + const dt = table({ a, b, c, d }); + + assert.deepEqual( + dt.relocate('a', { before: 'd' }).columnNames(), + ['b', 'c', 'a', 'd'], + 'relocate data, before' + ); + + assert.deepEqual( + dt.relocate(not('b', 'd'), { before: 'b' }).columnNames(), + ['a', 'c', 'b', 'd'], + 'relocate data, before' + ); + + assert.deepEqual( + dt.relocate(not('b', 'd'), { after: 'd' }).columnNames(), + ['b', 'd', 'a', 'c'], + 'relocate data, after' + ); + + assert.deepEqual( + dt.relocate(not('b', 'd'), { before: 'c' }).columnNames(), + 
['b', 'a', 'c', 'd'], + 'relocate data, before self' + ); + + assert.deepEqual( + dt.relocate(not('b', 'd'), { after: 'a' }).columnNames(), + ['a', 'c', 'b', 'd'], + 'relocate data, after self' + ); + }); + + it(' repositions columns using multi-column anchor', () => { + const a = [1], b = [2], c = [3], d = [4]; + const dt = table({ a, b, c, d }); + + assert.deepEqual( + dt.relocate([1, 3], { before: range(2, 3) }).columnNames(), + ['a', 'b', 'd', 'c'], + 'relocate data, before range' + ); + + assert.deepEqual( + dt.relocate([1, 3], { after: range(2, 3) }).columnNames(), + ['a', 'c', 'b', 'd'], + 'relocate data, after range' + ); + }); + + it(' repositions and renames columns', () => { + const a = [1], b = [2], c = [3], d = [4]; + const dt = table({ a, b, c, d }); + + assert.deepEqual( + dt.relocate({ a: 'e', c: 'f' }, { before: { b: '?' } }).columnNames(), + ['e', 'f', 'b', 'd'], + 'relocate data, before plus rename' + ); + + assert.deepEqual( + dt.relocate({ a: 'e', c: 'f' }, { after: { b: '?' 
} }).columnNames(), + ['b', 'e', 'f', 'd'], + 'relocate data, after plus rename' + ); + }); + + it(' throws errors for invalid options', () => { + const a = [1], b = [2], c = [3], d = [4]; + const dt = table({ a, b, c, d }); + + assert.throws(() => dt.relocate(not('b', 'd')), 'missing options'); + assert.throws(() => dt.relocate(not('b', 'd'), {}), 'empty options'); + assert.throws( + () => dt.relocate(not('b', 'd'), { before: 'b', after: 'b' }), + 'over-specified options' + ); + }); }); diff --git a/test/verbs/rename-test.js b/test/verbs/rename-test.js index 05151fe2..e54ec14e 100644 --- a/test/verbs/rename-test.js +++ b/test/verbs/rename-test.js @@ -1,49 +1,49 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src'; - -tape('rename renames columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abcd'.split('') - }; - - tableEqual(t, - table(data).rename({ a: 'z'}), - { z: data.a, b: data.b, c: data.c }, - 'renamed data, single column' - ); - - tableEqual(t, - table(data).rename({ a: 'z', b: 'y' }), - { z: data.a, y: data.b, c: data.c }, - 'renamed data, multiple columns' - ); - - t.deepEqual( - table(data).rename({ a: 'z', c: 'x' }).columnNames(), - ['z', 'b', 'x'], - 'renamed data, preserves order' - ); - - tableEqual(t, - table(data).rename('a', 'b'), - data, - 'renamed data, no rename' - ); - - tableEqual(t, - table(data).rename(), - data, - 'renamed data, no arguments' - ); - - tableEqual(t, - table(data).rename({ a: 'z'}, { c: 'x' }), - { z: data.a, b: data.b, x: data.c }, - 'renamed data, multiple arguments' - ); - - t.end(); -}); \ No newline at end of file +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; + +describe('rename', () => { + it(' renames columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abcd'.split('') + }; + + tableEqual( + table(data).rename({ a: 'z'}), + { z: data.a, 
b: data.b, c: data.c }, + 'renamed data, single column' + ); + + tableEqual( + table(data).rename({ a: 'z', b: 'y' }), + { z: data.a, y: data.b, c: data.c }, + 'renamed data, multiple columns' + ); + + assert.deepEqual( + table(data).rename({ a: 'z', c: 'x' }).columnNames(), + ['z', 'b', 'x'], + 'renamed data, preserves order' + ); + + tableEqual( + table(data).rename('a', 'b'), + data, + 'renamed data, no rename' + ); + + tableEqual( + table(data).rename(), + data, + 'renamed data, no arguments' + ); + + tableEqual( + table(data).rename({ a: 'z'}, { c: 'x' }), + { z: data.a, b: data.b, x: data.c }, + 'renamed data, multiple arguments' + ); + }); +}); diff --git a/test/verbs/rollup-test.js b/test/verbs/rollup-test.js index fd7d7b62..3cd53a33 100644 --- a/test/verbs/rollup-test.js +++ b/test/verbs/rollup-test.js @@ -1,297 +1,278 @@ -/* eslint-disable no-undef */ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { bin, op, table } from '../../src'; - -tape('rollup produces flat aggregates', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const rt = table(data).rollup({ sum: d => op.sum(d.a + d.b) }); - - t.equal(rt.numRows(), 1, 'num rows'); - t.equal(rt.numCols(), 1, 'num cols'); - tableEqual(t, rt, { sum: [36] }, 'rollup data'); - t.end(); -}); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { bin, op, table } from '../../src/index.js'; + +const { + avg_rank, bins, corr, exp, log, log2, mean, + median, sqrt, stdev, sum, valid, variance +} = op; + +describe('rollup', () => { + it('produces flat aggregates', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const rt = table(data).rollup({ sum: d => op.sum(d.a + d.b) }); + + assert.equal(rt.numRows(), 1, 'num rows'); + assert.equal(rt.numCols(), 1, 'num cols'); + tableEqual(rt, { sum: [36] }, 'rollup data'); + }); -tape('rollup produces grouped aggregates', t => { - const data = { - k: ['a', 'a', 'b', 'b'], - 
a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const rt = table(data) - .groupby({ key: d => d.k }) - .rollup({ sum: d => op.sum(d.a + d.b) }); - - t.equal(rt.numRows(), 2, 'num rows'); - t.equal(rt.numCols(), 2, 'num cols'); - tableEqual(t, rt, { - key: ['a', 'b'], - sum: [10, 26] - }, 'rollup data'); - t.end(); -}); + it('produces grouped aggregates', () => { + const data = { + k: ['a', 'a', 'b', 'b'], + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; + + const rt = table(data) + .groupby({ key: d => d.k }) + .rollup({ sum: d => op.sum(d.a + d.b) }); + + assert.equal(rt.numRows(), 2, 'num rows'); + assert.equal(rt.numCols(), 2, 'num cols'); + tableEqual(rt, { + key: ['a', 'b'], + sum: [10, 26] + }, 'rollup data'); + }); -tape('rollup handles empty tables', t => { - [[], [null, null], [undefined, undefined], [NaN, NaN]].forEach(v => { - const rt = table({ v }).rollup({ - sum: op.sum('v'), - prod: op.product('v'), - mode: op.mode('v'), - med: op.median('v'), - min: op.min('v'), - max: op.min('v'), - sd: op.stdev('v') + it('handles empty tables', () => { + [[], [null, null], [undefined, undefined], [NaN, NaN]].forEach(v => { + const rt = table({ v }).rollup({ + sum: op.sum('v'), + prod: op.product('v'), + mode: op.mode('v'), + med: op.median('v'), + min: op.min('v'), + max: op.min('v'), + sd: op.stdev('v') + }); + + tableEqual(rt, { + sum: [undefined], + prod: [undefined], + mode: [undefined], + med: [undefined], + min: [undefined], + max: [undefined], + sd: [undefined] + }, 'rollup data, ' + (v.length ? v[0] : 'empty')); }); - - tableEqual(t, rt, { - sum: [undefined], - prod: [undefined], - mode: [undefined], - med: [undefined], - min: [undefined], - max: [undefined], - sd: [undefined] - }, 'rollup data, ' + (v.length ? 
v[0] : 'empty')); }); - t.end(); -}); - -tape('rollup handles empty input', t => { - const dt = table({ x: ['a', 'b', 'c'] }); - - tableEqual(t, dt.rollup(), { }, 'rollup data, no groups'); + it('handles empty input', () => { + const dt = table({ x: ['a', 'b', 'c'] }); - tableEqual(t, - dt.groupby('x').rollup(), - { x: ['a', 'b', 'c'] }, - 'rollup data, groups' - ); - - t.end(); -}); + tableEqual(dt.rollup(), { }, 'rollup data, no groups'); -tape('rollup supports bigint values', t => { - const data = { - v: [1n, 2n, 3n, 4n, 5n] - }; - - const dt = table(data) - .rollup({ - any: op.any('v'), - dist: op.distinct('v'), - cnt: op.count('v'), - val: op.valid('v'), - inv: op.invalid('v'), - sum: op.sum('v'), - prod: op.product('v'), - min: op.min('v'), - max: op.max('v'), - med: op.median('v'), - vals: op.array_agg('v'), - uniq: op.array_agg_distinct('v') - }); + tableEqual( + dt.groupby('x').rollup(), + { x: ['a', 'b', 'c'] }, + 'rollup data, groups' + ); + }); - t.deepEqual( - dt.objects()[0], - { - any: 1n, - dist: 5, - cnt: 5, - val: 5, - inv: 0, - sum: 15n, - prod: 120n, - min: 1n, - max: 5n, - med: 3n, - vals: [1n, 2n, 3n, 4n, 5n], - uniq: [1n, 2n, 3n, 4n, 5n] - }, - 'rollup data' - ); - t.end(); -}); + it('supports bigint values', () => { + const data = { + v: [1n, 2n, 3n, 4n, 5n] + }; + + const dt = table(data) + .rollup({ + any: op.any('v'), + dist: op.distinct('v'), + cnt: op.count('v'), + val: op.valid('v'), + inv: op.invalid('v'), + sum: op.sum('v'), + prod: op.product('v'), + min: op.min('v'), + max: op.max('v'), + med: op.median('v'), + vals: op.array_agg('v'), + uniq: op.array_agg_distinct('v') + }); + + assert.deepEqual( + dt.objects()[0], + { + any: 1n, + dist: 5, + cnt: 5, + val: 5, + inv: 0, + sum: 15n, + prod: 120n, + min: 1n, + max: 5n, + med: 3n, + vals: [1n, 2n, 3n, 4n, 5n], + uniq: [1n, 2n, 3n, 4n, 5n] + }, + 'rollup data' + ); + }); -tape('rollup supports ordered tables', t => { - const rt = table({ v: [3, 1, 4, 2] }) - .orderby('v') - .rollup({ 
v: op.array_agg('v') }); + it('supports ordered tables', () => { + const rt = table({ v: [3, 1, 4, 2] }) + .orderby('v') + .rollup({ v: op.array_agg('v') }); + tableEqual(rt, { v: [ [1, 2, 3, 4] ] }, 'rollup data'); + }); - tableEqual(t, rt, { v: [ [1, 2, 3, 4] ] }, 'rollup data'); - t.end(); -}); + it('supports object_agg functions', () => { + const data = { + g: [0, 0, 1, 1, 1], + k: ['a', 'b', 'a', 'b', 'a'], + v: [1, 2, 3, 4, 5] + }; + + const dt = table(data).groupby('g'); + + assert.deepEqual( + dt.rollup({ o: op.object_agg('k', 'v') }).array('o'), + [ + { a: 1, b: 2 }, + { a: 5, b: 4 } + ], + 'rollup data - object_agg' + ); + + assert.deepEqual( + dt.rollup({ o: op.entries_agg('k', 'v') }).array('o'), + [ + [['a', 1], ['b', 2]], + [['a', 3], ['b', 4], ['a', 5]] + ], + 'rollup data - entries_agg' + ); + + assert.deepEqual( + dt.rollup({ o: op.map_agg('k', 'v') }).array('o'), + [ + new Map([['a', 1], ['b', 2]]), + new Map([['a', 5], ['b', 4]]) + ], + 'rollup data - map_agg' + ); + }); -tape('rollup supports object_agg functions', t => { - const data = { - g: [0, 0, 1, 1, 1], - k: ['a', 'b', 'a', 'b', 'a'], - v: [1, 2, 3, 4, 5] - }; - - const dt = table(data).groupby('g'); - - t.deepEqual( - dt.rollup({ o: op.object_agg('k', 'v') }) - .columnArray('o'), - [ - { a: 1, b: 2 }, - { a: 5, b: 4 } - ], - 'rollup data - object_agg' - ); - - t.deepEqual( - dt.rollup({ o: op.entries_agg('k', 'v') }) - .columnArray('o'), - [ - [['a', 1], ['b', 2]], - [['a', 3], ['b', 4], ['a', 5]] - ], - 'rollup data - entries_agg' - ); - - t.deepEqual( - dt.rollup({ o: op.map_agg('k', 'v') }) - .columnArray('o'), - [ - new Map([['a', 1], ['b', 2]]), - new Map([['a', 5], ['b', 4]]) - ], - 'rollup data - map_agg' - ); - - t.end(); -}); + it('supports histogram', () => { + const data = { x: [1, 1, 3, 4, 5, 6, 7, 8, 9, 10] }; + const result = { + b0: [1, 3, 4, 5, 6, 7, 8, 9, 10], + b1: [1.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5], + count: [2, 1, 1, 1, 1, 1, 1, 1, 1] + }; + + const dt = 
table(data) + .groupby({ + b0: d => bin(d.x, ...bins(d.x, 20)), + b1: d => bin(d.x, ...bins(d.x, 20), 1) + }) + .count(); + tableEqual( dt, result, 'histogram'); + + const ht = table(data) + .groupby({ + b0: bin('x', { maxbins: 20 }), + b1: bin('x', { maxbins: 20, offset: 1 }) + }) + .count(); + tableEqual( ht, result, 'histogram from bin helper, maxbins'); + + const st = table(data) + .groupby({ + b0: bin('x', { step: 0.5 }), + b1: bin('x', { step: 0.5, offset: 1 }) + }) + .count(); + tableEqual( st, result, 'histogram from bin helper, step'); + }); -tape('rollup supports histogram', t => { - const data = { x: [1, 1, 3, 4, 5, 6, 7, 8, 9, 10] }; - const result = { - b0: [1, 3, 4, 5, 6, 7, 8, 9, 10], - b1: [1.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5], - count: [2, 1, 1, 1, 1, 1, 1, 1, 1] - }; - - const dt = table(data) - .groupby({ - b0: d => bin(d.x, ...bins(d.x, 20)), - b1: d => bin(d.x, ...bins(d.x, 20), 1) - }) - .count(); - tableEqual(t, dt, result, 'histogram'); - - const ht = table(data) - .groupby({ - b0: bin('x', { maxbins: 20 }), - b1: bin('x', { maxbins: 20, offset: 1 }) - }) - .count(); - tableEqual(t, ht, result, 'histogram from bin helper, maxbins'); - - const st = table(data) - .groupby({ - b0: bin('x', { step: 0.5 }), - b1: bin('x', { step: 0.5, offset: 1 }) - }) - .count(); - tableEqual(t, st, result, 'histogram from bin helper, step'); - - t.end(); -}); + it('supports dot product', () => { + const data = { x: [1, 2, 3], y: [4, 5, 6] }; + const dt = table(data).rollup({ dot: d => sum(d.x * d.y) }); + tableEqual(dt, { dot: [32] }, 'dot product'); + }); -tape('rollup supports dot product', t => { - const data = { x: [1, 2, 3], y: [4, 5, 6] }; - const dt = table(data).rollup({ dot: d => sum(d.x * d.y) }); - tableEqual(t, dt, { dot: [32] }, 'dot product'); - t.end(); -}); + it('supports geometric mean', () => { + const data = { x: [1, 2, 3, 4, 5, 6] }; + const dt = table(data).rollup({ gm: d => exp(mean(log(d.x))) }); + const gm = [ Math.pow(1 * 2 * 3 
* 4 * 5 * 6, 1/6) ]; + tableEqual(dt, { gm }, 'geometric mean'); + }); -tape('rollup supports geometric mean', t => { - const data = { x: [1, 2, 3, 4, 5, 6] }; - const dt = table(data).rollup({ gm: d => exp(mean(log(d.x))) }); - const gm = [ Math.pow(1 * 2 * 3 * 4 * 5 * 6, 1/6) ]; - tableEqual(t, dt, { gm }, 'geometric mean'); - t.end(); -}); + it('supports harmonic mean', () => { + const data = { x: [1, 2, 3, 4, 5, 6] }; + const dt = table(data).rollup({ hm: d => 1 / mean(1 / d.x) }); + const hm = [ 6 / (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6) ]; + tableEqual(dt, { hm }, 'harmonic mean'); + }); -tape('rollup supports harmonic mean', t => { - const data = { x: [1, 2, 3, 4, 5, 6] }; - const dt = table(data).rollup({ hm: d => 1 / mean(1 / d.x) }); - const hm = [ 6 / (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6) ]; - tableEqual(t, dt, { hm }, 'harmonic mean'); - t.end(); -}); + it('supports median skew', () => { + const data = { x: [1, 2, 3, 4, 5, 1000] }; + const dt = table(data).rollup({ + ms: ({ x }) => stdev(x) ? (mean(x) - median(x)) / stdev(x) : 0 + }); + tableEqual(dt, { ms: [ 0.4070174034861516 ] }, 'median skew'); + }); -tape('rollup supports median skew', t => { - const data = { x: [1, 2, 3, 4, 5, 1000] }; - const dt = table(data).rollup({ - ms: ({ x }) => stdev(x) ? 
(mean(x) - median(x)) / stdev(x) : 0 + it('supports vector norm', () => { + const data = { x: [1, 2, 3, 4, 5] }; + const dt = table(data).rollup({ vn: d => sqrt(sum(d.x * d.x)) }); + const vn = [ Math.sqrt(1 + 4 + 9 + 16 + 25) ]; + tableEqual(dt, { vn }, 'vector norm'); }); - tableEqual(t, dt, { ms: [ 0.4070174034861516 ] }, 'median skew'); - t.end(); -}); -tape('rollup supports vector norm', t => { - const data = { x: [1, 2, 3, 4, 5] }; - const dt = table(data).rollup({ vn: d => sqrt(sum(d.x * d.x)) }); - const vn = [ Math.sqrt(1 + 4 + 9 + 16 + 25) ]; - tableEqual(t, dt, { vn }, 'vector norm'); - t.end(); -}); + it('supports cohens d', () => { + const data = { + a: [3, 4, 5, 6, 7], + b: [1, 2, 3, 4, 5] + }; + const dt = table(data).rollup({ + cd: ({ a, b }) => { + const va = (valid(a) - 1) * variance(a); + const vb = (valid(b) - 1) * variance(b); + const sd = sqrt((va + vb) / (valid(a) + valid(b) - 2)); + return sd ? (mean(a) - mean(b)) / sd : 0; + } + }); + tableEqual(dt, { cd: [ 1.2649110640673518 ] }, 'cohens d'); + }); -tape('rollup supports cohens d', t => { - const data = { - a: [3, 4, 5, 6, 7], - b: [1, 2, 3, 4, 5] - }; - const dt = table(data).rollup({ - cd: ({ a, b }) => { - const va = (valid(a) - 1) * variance(a); - const vb = (valid(b) - 1) * variance(b); - const sd = sqrt((va + vb) / (valid(a) + valid(b) - 2)); - return sd ? (mean(a) - mean(b)) / sd : 0; - } + it('supports entropy', () => { + const data = { x: [1, 1, 1, 2, 2, 3] }; + const dt = table(data) + .groupby('x') + .count({ as: 'num' }) + .derive({ p: d => d.num / sum(d.num) }) + .rollup({ h: d => -sum(d.p ? d.p * log2(d.p) : 0) }); + tableEqual(dt, { h: [ 1.4591479170272448 ] }, 'entropy'); }); - tableEqual(t, dt, { cd: [ 1.2649110640673518 ] }, 'cohens d'); - t.end(); -}); -tape('rollup supports entropy', t => { - const data = { x: [1, 1, 1, 2, 2, 3] }; - const dt = table(data) - .groupby('x') - .count({ as: 'num' }) - .derive({ p: d => d.num / sum(d.num) }) - .rollup({ h: d => -sum(d.p ? 
d.p * log2(d.p) : 0) }); - tableEqual(t, dt, { h: [ 1.4591479170272448 ] }, 'entropy'); - t.end(); -}); + it('supports spearman rank correlation', () => { + const data = { + a: [1, 2, 2, 3, 4, 5], + b: [9, 8, 8, 7, 6, 5] + }; -tape('rollup supports spearman rank correlation', t => { - const data = { - a: [1, 2, 2, 3, 4, 5], - b: [9, 8, 8, 7, 6, 5] - }; + const dt = table(data) + .orderby('a').derive({ ra: () => avg_rank() }) + .orderby('b').derive({ rb: () => avg_rank() }) + .rollup({ rho: d => corr(d.ra, d.rb) }); - const dt = table(data) - .orderby('a').derive({ ra: () => avg_rank() }) - .orderby('b').derive({ rb: () => avg_rank() }) - .rollup({ rho: d => corr(d.ra, d.rb) }); + tableEqual(dt, { rho: [ -1 ]}, 'spearman rank correlation'); + }); - tableEqual(t, dt, { rho: [ -1 ]}, 'spearman rank correlation'); - t.end(); + it('errors when parsing window functions', () => { + assert.throws(() => + table({ a: [1, 2, 2, 3, 4, 5] }) + .rollup({ x: d => op.first_value(d.a) }) + ); + }); }); - -tape('rollup errors when parsing window functions', t => { - t.throws(() => - table({ a: [1, 2, 2, 3, 4, 5] }) - .rollup({ x: d => op.first_value(d.a) }) - ); - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/sample-test.js b/test/verbs/sample-test.js index e7ce1652..415916f7 100644 --- a/test/verbs/sample-test.js +++ b/test/verbs/sample-test.js @@ -1,17 +1,17 @@ -import tape from 'tape'; -import { frac, table } from '../../src'; +import assert from 'node:assert'; +import { frac, table } from '../../src/index.js'; -function check(t, table, replace, prefix = '') { +function check(table, replace, prefix = '') { prefix = `${prefix}sample ${replace ? 
'replace ' : ''}rows`; const vals = []; const cnts = {}; table.scan((row, data) => { - const val = data.a.get(row); + const val = data.a.at(row); vals.push(val); cnts[val] = (cnts[val] || 0) + 1; }); - t.ok( + assert.ok( vals.every(v => v === 1 || v === 3 || v === 5 || v === 7), `${prefix} valid` ); @@ -20,138 +20,131 @@ function check(t, table, replace, prefix = '') { ? Object.values(cnts).some(c => c > 1) : Object.values(cnts).every(c => c === 1); - t.ok(test, `${prefix} count`); + assert.ok(test, `${prefix} count`); } -tape('sample draws a sample without replacement', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; +describe('sample', () => { + it('draws a sample without replacement', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; - const ft = table(cols).sample(2); + const ft = table(cols).sample(2); - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - check(t, ft, false); - t.end(); -}); - -tape('sample draws a maximal sample without replacement', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, false); + }); - const ft = table(cols).sample(10); + it('draws a maximal sample without replacement', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; - t.equal(ft.numRows(), 4, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - check(t, ft, false); - t.end(); -}); + const ft = table(cols).sample(10); -tape('sample draws a sample with replacement', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; + assert.equal(ft.numRows(), 4, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, false); + }); - const ft = table(cols).sample(10, { replace: true }); + it('draws a sample with replacement', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; - t.equal(ft.numRows(), 10, 'num rows'); - t.equal(ft.numCols(), 2, 'num 
cols'); - check(t, ft, true); - t.end(); -}); + const ft = table(cols).sample(10, { replace: true }); -tape('sample draws a column-weighted sample without replacement', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; - - const ft = table(cols).sample(2, { weight: 'a' }); + assert.equal(ft.numRows(), 10, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, true); + }); - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - check(t, ft, false, 'weighted '); - t.end(); -}); + it('draws a column-weighted sample without replacement', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; -tape('sample draws an expression-weighted sample without replacement', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; + const ft = table(cols).sample(2, { weight: 'a' }); - const ft = table(cols).sample(2, { weight: d => d.a }); + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, false, 'weighted '); + }); - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - check(t, ft, false, 'expr weighted '); - t.end(); -}); + it('draws an expression-weighted sample without replacement', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; -tape('sample draws a weighted sample with replacement', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; + const ft = table(cols).sample(2, { weight: d => d.a }); - const ft = table(cols).sample(10, { weight: 'a', replace: true }); + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, false, 'expr weighted '); + }); - t.equal(ft.numRows(), 10, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - check(t, ft, true, 'weighted '); - t.end(); -}); + it('draws a weighted sample with replacement', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; -tape('sample tables support downstream transforms', t => 
{ - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; + const ft = table(cols).sample(10, { weight: 'a', replace: true }); - const dt = table(cols) - .sample(10, { weight: 'a', replace: true }) - .filter(d => d.a > 1) - .groupby(['a', 'b']) - .count(); + assert.equal(ft.numRows(), 10, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, true, 'weighted '); + }); - t.equal(dt.numCols(), 3, 'num cols'); - t.end(); -}); + it('tables support downstream transforms', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; -tape('sample supports dynamic sample size', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8] - }; + const dt = table(cols) + .sample(10, { weight: 'a', replace: true }) + .filter(d => d.a > 1) + .groupby(['a', 'b']) + .count(); - const ft = table(cols).sample(frac(0.5)); + assert.equal(dt.numCols(), 3, 'num cols'); + }); - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - check(t, ft, false); - t.end(); -}); + it('supports dynamic sample size', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8] + }; -tape('sample supports stratified sample', t => { - const cols = { - a: [1, 3, 5, 7], - b: [2, 2, 4, 4] - }; + const ft = table(cols).sample(frac(0.5)); - const ft = table(cols).groupby('b').sample(1); + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + check(ft, false); + }); - t.equal(ft.numRows(), 2, 'num rows'); - t.equal(ft.numCols(), 2, 'num cols'); - t.deepEqual( - ft.column('b').data.sort((a, b) => a - b), - [2, 4], - 'stratify keys' - ); - check(t, ft, false); - t.end(); -}); \ No newline at end of file + it('supports stratified sample', () => { + const cols = { + a: [1, 3, 5, 7], + b: [2, 2, 4, 4] + }; + + const ft = table(cols).groupby('b').sample(1); + + assert.equal(ft.numRows(), 2, 'num rows'); + assert.equal(ft.numCols(), 2, 'num cols'); + assert.deepEqual( + ft.column('b').sort((a, b) => a - b), + [2, 4], + 
'stratify keys' + ); + check(ft, false); + }); +}); diff --git a/test/verbs/select-test.js b/test/verbs/select-test.js index 26a10c77..8e6d0861 100644 --- a/test/verbs/select-test.js +++ b/test/verbs/select-test.js @@ -1,214 +1,204 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; import { all, desc, endswith, matches, not, range, startswith, table -} from '../../src'; - -tape('select selects a subset of columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abcd'.split('') - }; - - const st = table(data).select('a', 'c'); - - t.equal(st.numRows(), 4, 'num rows'); - t.equal(st.numCols(), 2, 'num cols'); - tableEqual(t, st, { a: data.a, c: data.c }, 'selected data'); - t.end(); -}); - -tape('select handles columns with numeric names', t => { - const data = { - country: [0], - '1999': [1], - '2000': [2] - }; - - const dt = table(data, ['country', '1999', '2000']); - t.deepEqual( - dt.columnNames(), - ['country', '1999', '2000'] - ); - - t.deepEqual( - dt.select('1999', 'country', '2000').columnNames(), - ['1999', 'country', '2000'] - ); - - t.end(); -}); - -tape('select renames columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abcd'.split('') - }; - - const st = table(data).select({ b: 'foo', c: 'bar', a: 'baz' }); - - t.deepEqual(st.columnNames(), ['foo', 'bar', 'baz'], 'renamed columns'); - tableEqual(t, st, { - foo: data.b, bar: data.c, baz: data.a - }, 'selected data'); - t.end(); +} from '../../src/index.js'; + +describe('select', () => { + it('selects a subset of columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abcd'.split('') + }; + + const st = table(data).select('a', 'c'); + + assert.equal(st.numRows(), 4, 'num rows'); + assert.equal(st.numCols(), 2, 'num cols'); + tableEqual(st, { a: data.a, c: data.c }, 'selected data'); + }); + + it('handles columns with numeric names', () => 
{ + const data = { + country: [0], + '1999': [1], + '2000': [2] + }; + + const dt = table(data, ['country', '1999', '2000']); + assert.deepEqual( + dt.columnNames(), + ['country', '1999', '2000'] + ); + + assert.deepEqual( + dt.select('1999', 'country', '2000').columnNames(), + ['1999', 'country', '2000'] + ); + }); + + it('renames columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abcd'.split('') + }; + + const st = table(data).select({ b: 'foo', c: 'bar', a: 'baz' }); + + assert.deepEqual(st.columnNames(), ['foo', 'bar', 'baz'], 'renamed columns'); + tableEqual(st, { + foo: data.b, bar: data.c, baz: data.a + }, 'selected data'); + }); + + it('uses last instance of repeated columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abcd'.split('') + }; + + const st = table(data).select(all(), { a: 'x', c: 'y' }, { c: 'z' }); + + assert.deepEqual(st.columnNames(), ['x', 'b', 'z'], 'renamed columns'); + tableEqual(st, { + x: data.a, b: data.b, z: data.c + }, 'selected data'); + }); + + it('reorders columns', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abcd'.split('') + }; + + const dt = table(data); + const st = dt.select(dt.columnNames().reverse()); + + assert.deepEqual(st.columnNames(), ['c', 'b', 'a'], 'reordered names'); + assert.deepEqual( + [st.columnIndex('a'), st.columnIndex('b'), st.columnIndex('c')], + [2, 1, 0], + 'reordered indices' + ); + tableEqual(st, data, 'selected data'); + }); + + it('accepts selection helpers', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abcd'.split('') + }; + + assert.deepEqual( + table(data).select(not('a', 'c')).columnNames(), + ['b'], + 'select not name' + ); + + assert.deepEqual( + table(data).select(not(1)).columnNames(), + ['a', 'c'], + 'select not index' + ); + + assert.deepEqual( + table(data).select(range('b', 'c')).columnNames(), + ['b', 'c'], + 'select range name' + ); + + assert.deepEqual( + table(data).select(range('c', 
'b')).columnNames(), + ['b', 'c'], + 'select reversed range name' + ); + + assert.deepEqual( + table(data).select(range(0, 1)).columnNames(), + ['a', 'b'], + 'select range index' + ); + + assert.deepEqual( + table(data).select(range(1, 0)).columnNames(), + ['a', 'b'], + 'select reversed range index' + ); + + assert.deepEqual( + table(data).select(not(range(0, 1))).columnNames(), + ['c'], + 'select not range' + ); + + assert.deepEqual( + table(data).select(matches('b')).columnNames(), + ['b'], + 'select match string' + ); + + assert.deepEqual( + table(data).select(matches(/A|c/i)).columnNames(), + ['a', 'c'], + 'select match regexp' + ); + + const data2 = { + 'foo.bar': [], + 'foo.baz': [], + 'bop.bar': [], + 'bop.baz': [] + }; + + assert.deepEqual( + table(data2).select(startswith('foo.')).columnNames(), + ['foo.bar', 'foo.baz'], + 'select startswith' + ); + + assert.deepEqual( + table(data2).select(endswith('.baz')).columnNames(), + ['foo.baz', 'bop.baz'], + 'select startswith' + ); + }); + + it('does not conflict with groupby', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 4, 6, 8], + c: 'abbb'.split('') + }; + + const st = table(data).groupby('c').select('a', 'b', {'c': 'd'}); + + assert.deepEqual( + st.columnNames(), + ['a', 'b', 'd'], + 'renamed columns' + ); + + tableEqual(st.count({ as: 'n' }), { + c: ['a', 'b'], n: [1, 3] + }, 'groupby not conflicted'); + }); + + it('does not conflict with orderby', () => { + const data = { + a: [1, 3, 5, 7], + b: [2, 6, 8, 4], + c: 'abcd'.split('') + }; + + const st = table(data).orderby(desc('b')).select('c').reify(); + + tableEqual(st, { + c: ['c', 'b', 'd', 'a'] + }, 'orderby not conflicted'); + }); }); - -tape('select uses last instance of repeated columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abcd'.split('') - }; - - const st = table(data).select(all(), { a: 'x', c: 'y' }, { c: 'z' }); - - t.deepEqual(st.columnNames(), ['x', 'b', 'z'], 'renamed columns'); - tableEqual(t, st, { - 
x: data.a, b: data.b, z: data.c - }, 'selected data'); - t.end(); -}); - -tape('select reorders columns', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abcd'.split('') - }; - - const dt = table(data); - const st = dt.select(dt.columnNames().reverse()); - - t.deepEqual(st.columnNames(), ['c', 'b', 'a'], 'reordered names'); - t.deepEqual( - [st.columnIndex('a'), st.columnIndex('b'), st.columnIndex('c')], - [2, 1, 0], - 'reordered indices' - ); - tableEqual(t, st, data, 'selected data'); - t.end(); -}); - -tape('select accepts selection helpers', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abcd'.split('') - }; - - t.deepEqual( - table(data).select(not('a', 'c')).columnNames(), - ['b'], - 'select not name' - ); - - t.deepEqual( - table(data).select(not(1)).columnNames(), - ['a', 'c'], - 'select not index' - ); - - t.deepEqual( - table(data).select(range('b', 'c')).columnNames(), - ['b', 'c'], - 'select range name' - ); - - t.deepEqual( - table(data).select(range('c', 'b')).columnNames(), - ['b', 'c'], - 'select reversed range name' - ); - - t.deepEqual( - table(data).select(range(0, 1)).columnNames(), - ['a', 'b'], - 'select range index' - ); - - t.deepEqual( - table(data).select(range(1, 0)).columnNames(), - ['a', 'b'], - 'select reversed range index' - ); - - t.deepEqual( - table(data).select(not(range(0, 1))).columnNames(), - ['c'], - 'select not range' - ); - - t.deepEqual( - table(data).select(matches('b')).columnNames(), - ['b'], - 'select match string' - ); - - t.deepEqual( - table(data).select(matches(/A|c/i)).columnNames(), - ['a', 'c'], - 'select match regexp' - ); - - const data2 = { - 'foo.bar': [], - 'foo.baz': [], - 'bop.bar': [], - 'bop.baz': [] - }; - - t.deepEqual( - table(data2).select(startswith('foo.')).columnNames(), - ['foo.bar', 'foo.baz'], - 'select startswith' - ); - - t.deepEqual( - table(data2).select(endswith('.baz')).columnNames(), - ['foo.baz', 'bop.baz'], - 'select startswith' - ); - - 
t.end(); -}); - -tape('select does not conflict with groupby', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 4, 6, 8], - c: 'abbb'.split('') - }; - - const st = table(data).groupby('c').select('a', 'b', {'c': 'd'}); - - t.deepEqual( - st.columnNames(), - ['a', 'b', 'd'], - 'renamed columns' - ); - - tableEqual(t, st.count({ as: 'n' }), { - c: ['a', 'b'], n: [1, 3] - }, 'groupby not conflicted'); - - t.end(); -}); - -tape('select does not conflict with orderby', t => { - const data = { - a: [1, 3, 5, 7], - b: [2, 6, 8, 4], - c: 'abcd'.split('') - }; - - const st = table(data).orderby(desc('b')).select('c').reify(); - - tableEqual(t, st, { - c: ['c', 'b', 'd', 'a'] - }, 'orderby not conflicted'); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/slice-test.js b/test/verbs/slice-test.js index 8a63cc47..5f7a303b 100644 --- a/test/verbs/slice-test.js +++ b/test/verbs/slice-test.js @@ -1,88 +1,85 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src'; - -tape('slice slices a table', t => { - const dt = table({ v: [1, 2, 3, 4] }); - - tableEqual(t, - dt.slice(), - { v: [1, 2, 3, 4] }, - 'sliced data, all' - ); - - tableEqual(t, - dt.slice(2), - { v: [3, 4] }, - 'sliced data, start' - ); - - tableEqual(t, - dt.slice(1, 3), - { v: [2, 3] }, - 'sliced data, start and end' - ); - - tableEqual(t, - dt.slice(1, -1), - { v: [2, 3] }, - 'sliced data, start and negative end' - ); - - tableEqual(t, - dt.slice(-3, -1), - { v: [2, 3] }, - 'sliced data, negative start and end' - ); - - tableEqual(t, - dt.slice(-1000, -900), - { v: [] }, - 'sliced data, extreme negative start and end' - ); - - t.end(); +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; + +describe('slice', () => { + it('slices a table', () => { + const dt = table({ v: [1, 2, 3, 4] }); + + tableEqual( + dt.slice(), + { v: [1, 2, 3, 4] }, + 'sliced data, all' + ); + + tableEqual( + dt.slice(2), + { v: [3, 
4] }, + 'sliced data, start' + ); + + tableEqual( + dt.slice(1, 3), + { v: [2, 3] }, + 'sliced data, start and end' + ); + + tableEqual( + dt.slice(1, -1), + { v: [2, 3] }, + 'sliced data, start and negative end' + ); + + tableEqual( + dt.slice(-3, -1), + { v: [2, 3] }, + 'sliced data, negative start and end' + ); + + tableEqual( + dt.slice(-1000, -900), + { v: [] }, + 'sliced data, extreme negative start and end' + ); + }); + + it('slices a grouped table', () => { + const dt = table({ v: [1, 2, 3, 4, 5, 6, 7] }) + .groupby({ k: d => d.v % 2 }); + + tableEqual( + dt.slice(), + { v: [1, 2, 3, 4, 5, 6, 7] }, + 'sliced data, all' + ); + + tableEqual( + dt.slice(2), + { v: [5, 6, 7] }, + 'sliced data, start' + ); + + tableEqual( + dt.slice(1, 3), + { v: [3, 4, 5, 6] }, + 'sliced data, start and end' + ); + + tableEqual( + dt.slice(1, -1), + { v: [3, 4, 5] }, + 'sliced data, start and negative end' + ); + + tableEqual( + dt.slice(-3, -1), + { v: [2, 3, 4, 5] }, + 'sliced data, negative start and end' + ); + + tableEqual( + dt.slice(-1000, -900), + { v: [] }, + 'sliced data, extreme negative start and end' + ); + }); }); - -tape('slice slices a grouped table', t => { - const dt = table({ v: [1, 2, 3, 4, 5, 6, 7] }) - .groupby({ k: d => d.v % 2 }); - - tableEqual(t, - dt.slice(), - { v: [1, 2, 3, 4, 5, 6, 7] }, - 'sliced data, all' - ); - - tableEqual(t, - dt.slice(2), - { v: [5, 6, 7] }, - 'sliced data, start' - ); - - tableEqual(t, - dt.slice(1, 3), - { v: [3, 4, 5, 6] }, - 'sliced data, start and end' - ); - - tableEqual(t, - dt.slice(1, -1), - { v: [3, 4, 5] }, - 'sliced data, start and negative end' - ); - - tableEqual(t, - dt.slice(-3, -1), - { v: [2, 3, 4, 5] }, - 'sliced data, negative start and end' - ); - - tableEqual(t, - dt.slice(-1000, -900), - { v: [] }, - 'sliced data, extreme negative start and end' - ); - - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/spread-test.js b/test/verbs/spread-test.js index a1a3e5e8..d6a14595 100644 --- 
a/test/verbs/spread-test.js +++ b/test/verbs/spread-test.js @@ -1,130 +1,124 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { op, table } from '../../src'; - -tape('spread produces multiple columns from arrays', t => { - const data = { - text: ['foo bar bop', 'foo', 'bar baz', 'baz bop'] - }; - - const dt = table(data).spread( - { split: d => op.split(d.text, ' ') }, - { limit: 2 } - ); - - tableEqual(t, dt, { - ...data, - split_1: [ 'foo', 'foo', 'bar', 'baz' ], - split_2: [ 'bar', undefined, 'baz', 'bop' ] - }, 'spread data'); - t.end(); -}); - -tape('spread supports column name argument', t => { - const data = { - list: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']] - }; - - const dt = table(data).spread('list', { drop: false, limit: 2 }); - - tableEqual(t, dt, { - ...data, - list_1: [ 'foo', 'foo', 'bar', 'baz' ], - list_2: [ 'bar', undefined, 'baz', 'bop' ] - }, 'spread data'); - t.end(); -}); - -tape('spread supports column index argument', t => { - const data = { - list: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']] - }; - - const dt = table(data).spread(0, { limit: 2 }); - - tableEqual(t, dt, { - list_1: [ 'foo', 'foo', 'bar', 'baz' ], - list_2: [ 'bar', undefined, 'baz', 'bop' ] - }, 'spread data'); - t.end(); -}); - -tape('spread supports multiple input columns', t => { - const data = { - a: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']], - b: [['baz', 'bop'], ['bar', 'baz'], ['foo'], ['foo', 'bar', 'bop']] - }; - - const dt = table(data).spread(['a', 'b'], { limit: 2 }); - - tableEqual(t, dt, { - a_1: [ 'foo', 'foo', 'bar', 'baz' ], - a_2: [ 'bar', undefined, 'baz', 'bop' ], - b_1: [ 'baz', 'bar', 'foo', 'foo' ], - b_2: [ 'bop', 'baz', undefined, 'bar' ] - }, 'spread data'); - t.end(); -}); - -tape('spread supports as option with single column input', t => { - const data = { - list: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']] - }; - - const dt = 
table(data).spread('list', { as: ['bip', 'bop'] }); - - tableEqual(t, dt, { - bip: [ 'foo', 'foo', 'bar', 'baz' ], - bop: [ 'bar', undefined, 'baz', 'bop' ] - }, 'spread data with as'); - t.end(); -}); - -tape('spread ignores as option with multi column input', t => { - const data = { - key: ['a', 'b', 'c', 'd'], - a: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']], - b: [['baz', 'bop'], ['bar', 'baz'], ['foo'], ['foo', 'bar', 'bop']] - }; - - const dt = table(data).spread(['a', 'b'], { limit: 2, as: ['bip', 'bop'] }); - - tableEqual(t, dt, { - key: ['a', 'b', 'c', 'd'], - a_1: [ 'foo', 'foo', 'bar', 'baz' ], - a_2: [ 'bar', undefined, 'baz', 'bop' ], - b_1: [ 'baz', 'bar', 'foo', 'foo' ], - b_2: [ 'bop', 'baz', undefined, 'bar' ] - }, 'spread data with as'); - t.end(); -}); - -tape('spread handles arrays of varying length', t => { - const data1 = { - u: [ - ['A', 'B', 'C'], - ['D', 'E'] - ] - }; - const data2 = { - u: data1.u.slice().reverse() - }; - const obj = [ - { u_1: 'A', u_2: 'B', u_3: 'C' }, - { u_1: 'D', u_2: 'E', u_3: undefined } - ]; - - t.deepEqual( - table(data1).spread('u').objects(), - obj, - 'spread data, larger first' - ); - - t.deepEqual( - table(data2).spread('u').objects(), - obj.slice().reverse(), - 'spread data, smaller first' - ); - - t.end(); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { op, table } from '../../src/index.js'; + +describe('spread', () => { + it('produces multiple columns from arrays', () => { + const data = { + text: ['foo bar bop', 'foo', 'bar baz', 'baz bop'] + }; + + const dt = table(data).spread( + { split: d => op.split(d.text, ' ') }, + { limit: 2 } + ); + + tableEqual(dt, { + ...data, + split_1: [ 'foo', 'foo', 'bar', 'baz' ], + split_2: [ 'bar', undefined, 'baz', 'bop' ] + }, 'spread data'); + }); + + it('supports column name argument', () => { + const data = { + list: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']] + }; + + const dt = 
table(data).spread('list', { drop: false, limit: 2 }); + + tableEqual(dt, { + ...data, + list_1: [ 'foo', 'foo', 'bar', 'baz' ], + list_2: [ 'bar', undefined, 'baz', 'bop' ] + }, 'spread data'); + }); + + it('supports column index argument', () => { + const data = { + list: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']] + }; + + const dt = table(data).spread(0, { limit: 2 }); + + tableEqual(dt, { + list_1: [ 'foo', 'foo', 'bar', 'baz' ], + list_2: [ 'bar', undefined, 'baz', 'bop' ] + }, 'spread data'); + }); + + it('supports multiple input columns', () => { + const data = { + a: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']], + b: [['baz', 'bop'], ['bar', 'baz'], ['foo'], ['foo', 'bar', 'bop']] + }; + + const dt = table(data).spread(['a', 'b'], { limit: 2 }); + + tableEqual(dt, { + a_1: [ 'foo', 'foo', 'bar', 'baz' ], + a_2: [ 'bar', undefined, 'baz', 'bop' ], + b_1: [ 'baz', 'bar', 'foo', 'foo' ], + b_2: [ 'bop', 'baz', undefined, 'bar' ] + }, 'spread data'); + }); + + it('supports as option with single column input', () => { + const data = { + list: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']] + }; + + const dt = table(data).spread('list', { as: ['bip', 'bop'] }); + + tableEqual(dt, { + bip: [ 'foo', 'foo', 'bar', 'baz' ], + bop: [ 'bar', undefined, 'baz', 'bop' ] + }, 'spread data with as'); + }); + + it('ignores as option with multi column input', () => { + const data = { + key: ['a', 'b', 'c', 'd'], + a: [['foo', 'bar', 'bop'], ['foo'], ['bar', 'baz'], ['baz', 'bop']], + b: [['baz', 'bop'], ['bar', 'baz'], ['foo'], ['foo', 'bar', 'bop']] + }; + + const dt = table(data).spread(['a', 'b'], { limit: 2, as: ['bip', 'bop'] }); + + tableEqual(dt, { + key: ['a', 'b', 'c', 'd'], + a_1: [ 'foo', 'foo', 'bar', 'baz' ], + a_2: [ 'bar', undefined, 'baz', 'bop' ], + b_1: [ 'baz', 'bar', 'foo', 'foo' ], + b_2: [ 'bop', 'baz', undefined, 'bar' ] + }, 'spread data with as'); + }); + + it('handles arrays of varying 
length', () => { + const data1 = { + u: [ + ['A', 'B', 'C'], + ['D', 'E'] + ] + }; + const data2 = { + u: data1.u.slice().reverse() + }; + const obj = [ + { u_1: 'A', u_2: 'B', u_3: 'C' }, + { u_1: 'D', u_2: 'E', u_3: undefined } + ]; + + assert.deepEqual( + table(data1).spread('u').objects(), + obj, + 'spread data, larger first' + ); + + assert.deepEqual( + table(data2).spread('u').objects(), + obj.slice().reverse(), + 'spread data, smaller first' + ); + }); }); diff --git a/test/verbs/union-test.js b/test/verbs/union-test.js index 892daabf..3bec6a33 100644 --- a/test/verbs/union-test.js +++ b/test/verbs/union-test.js @@ -1,65 +1,63 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { table } from '../../src'; - -tape('union combines tables', t => { - const t1 = table({ a: [1, 2], b: [3, 4] }); - const t2 = table({ a: [3, 4], c: [5, 6] }); - const dt = t1.union(t2); - - t.equal(dt.numRows(), 4, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2, 3, 4], - b: [3, 4, undefined, undefined] - }, 'union data'); - t.end(); +import assert from 'node:assert'; +import tableEqual from '../table-equal.js'; +import { table } from '../../src/index.js'; + +describe('union', () => { + it('combines tables', () => { + const t1 = table({ a: [1, 2], b: [3, 4] }); + const t2 = table({ a: [3, 4], c: [5, 6] }); + const dt = t1.union(t2); + + assert.equal(dt.numRows(), 4, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2, 3, 4], + b: [3, 4, undefined, undefined] + }, 'union data'); + }); + + it('combines multiple tables', () => { + const t1 = table({ a: [1, 2], b: [3, 4] }); + const t2 = table({ a: [3, 4], c: [5, 6] }); + const t3 = table({ a: [5, 6], b: [7, 8] }); + + const dt = t1.union(t2, t3); + assert.equal(dt.numRows(), 6, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2, 3, 4, 5, 6], + b: [3, 4, undefined, undefined, 7, 8] + }, 'union data'); 
+ + const at = t1.union([t2, t3]); + assert.equal(at.numRows(), 6, 'num rows'); + assert.equal(at.numCols(), 2, 'num cols'); + tableEqual(at, { + a: [1, 2, 3, 4, 5, 6], + b: [3, 4, undefined, undefined, 7, 8] + }, 'union data'); + }); + + it('deduplicates combined data', () => { + const t1 = table({ a: [1, 2], b: [3, 4] }); + const t2 = table({ a: [1, 2], b: [3, 6] }); + const dt = t1.union(t2); + + assert.equal(dt.numRows(), 3, 'num rows'); + assert.equal(dt.numCols(), 2, 'num cols'); + tableEqual(dt, { + a: [1, 2, 2], + b: [3, 4, 6] + }, 'union data'); + + const t3 = table({ a: [1, 1], b: [5, 5] }); + const ut = t1.union(t3); + + assert.equal(ut.numRows(), 3, 'num rows'); + assert.equal(ut.numCols(), 2, 'num cols'); + tableEqual(ut, { + a: [1, 2, 1], + b: [3, 4, 5] + }, 'union data'); + }); }); - -tape('union combines multiple tables', t => { - const t1 = table({ a: [1, 2], b: [3, 4] }); - const t2 = table({ a: [3, 4], c: [5, 6] }); - const t3 = table({ a: [5, 6], b: [7, 8] }); - - const dt = t1.union(t2, t3); - t.equal(dt.numRows(), 6, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2, 3, 4, 5, 6], - b: [3, 4, undefined, undefined, 7, 8] - }, 'union data'); - - const at = t1.union([t2, t3]); - t.equal(at.numRows(), 6, 'num rows'); - t.equal(at.numCols(), 2, 'num cols'); - tableEqual(t, at, { - a: [1, 2, 3, 4, 5, 6], - b: [3, 4, undefined, undefined, 7, 8] - }, 'union data'); - - t.end(); -}); - -tape('union deduplicates combined data', t => { - const t1 = table({ a: [1, 2], b: [3, 4] }); - const t2 = table({ a: [1, 2], b: [3, 6] }); - const dt = t1.union(t2); - - t.equal(dt.numRows(), 3, 'num rows'); - t.equal(dt.numCols(), 2, 'num cols'); - tableEqual(t, dt, { - a: [1, 2, 2], - b: [3, 4, 6] - }, 'union data'); - - const t3 = table({ a: [1, 1], b: [5, 5] }); - const ut = t1.union(t3); - - t.equal(ut.numRows(), 3, 'num rows'); - t.equal(ut.numCols(), 2, 'num cols'); - tableEqual(t, ut, { - a: [1, 2, 1], - b: [3, 4, 5] - }, 
'union data'); - t.end(); -}); \ No newline at end of file diff --git a/test/verbs/unroll-test.js b/test/verbs/unroll-test.js index 8837fe5d..5ff51df8 100644 --- a/test/verbs/unroll-test.js +++ b/test/verbs/unroll-test.js @@ -1,180 +1,169 @@ -import tape from 'tape'; -import tableEqual from '../table-equal'; -import { not, op, table } from '../../src'; - -tape('unroll generates rows for array values', t => { - const data = { - k: ['a', 'b'], - x: [[1, 2, 3], [1, 2, 3]] - }; - - const ut = table(data).unroll('x', { limit: 2 }); - - tableEqual(t, ut, { - k: ['a', 'a', 'b', 'b'], - x: [1, 2, 1, 2] - }, 'unroll data'); - t.end(); +import tableEqual from '../table-equal.js'; +import { not, op, table } from '../../src/index.js'; + +describe('unroll', () => { + it('generates rows for array values', () => { + const data = { + k: ['a', 'b'], + x: [[1, 2, 3], [1, 2, 3]] + }; + + const ut = table(data).unroll('x', { limit: 2 }); + + tableEqual(ut, { + k: ['a', 'a', 'b', 'b'], + x: [1, 2, 1, 2] + }, 'unroll data'); + }); + + it('generates rows for array values with index', () => { + const data = { + k: ['a', 'b'], + x: [[1, 2, 3], [1, 2, 3]] + }; + + const ut = table(data).unroll('x', { limit: 2, index: true }); + + tableEqual(ut, { + k: ['a', 'a', 'b', 'b'], + x: [1, 2, 1, 2], + index: [0, 1, 0, 1] + }, 'unroll data with index'); + }); + + it('generates rows for array values with named index', () => { + const data = { + k: ['a', 'b'], + x: [[1, 2, 3], [1, 2, 3]] + }; + + const ut = table(data).unroll('x', { limit: 2, index: 'arridx' }); + + tableEqual(ut, { + k: ['a', 'a', 'b', 'b'], + x: [1, 2, 1, 2], + arridx: [0, 1, 0, 1] + }, 'unroll data with index'); + }); + + it('generates rows for parallel array values', () => { + const data = { + k: ['a', 'b'], + x: [[1, 2, 3], [4, 5, 6]], + y: [[9, 8, 7], [9, 8]] + }; + + const ut = table(data).unroll(['x', 'y']); + + tableEqual(ut, { + k: ['a', 'a', 'a', 'b', 'b', 'b'], + x: [1, 2, 3, 4, 5, 6], + y: [9, 8, 7, 9, 8, undefined] + }, 
'unroll data'); + }); + + it('generates rows for parallel array values with index', () => { + const data = { + k: ['a', 'b'], + x: [[1, 2, 3], [4, 5, 6]], + y: [[9, 8, 7], [9, 8]] + }; + + const ut = table(data).unroll(['x', 'y'], { index: true }); + + tableEqual(ut, { + k: ['a', 'a', 'a', 'b', 'b', 'b'], + x: [1, 2, 3, 4, 5, 6], + y: [9, 8, 7, 9, 8, undefined], + index: [0, 1, 2, 0, 1, 2] + }, 'unroll data with index'); + }); + + it('generates rows for parallel array values with named index', () => { + const data = { + k: ['a', 'b'], + x: [[1, 2, 3], [4, 5, 6]], + y: [[9, 8, 7], [9, 8]] + }; + + const ut = table(data).unroll(['x', 'y'], { index: 'arridx' }); + + tableEqual(ut, { + k: ['a', 'a', 'a', 'b', 'b', 'b'], + x: [1, 2, 3, 4, 5, 6], + y: [9, 8, 7, 9, 8, undefined], + arridx: [0, 1, 2, 0, 1, 2] + }, 'unroll data with index'); + }); + + it('generates rows for derived array', () => { + const data = { + k: ['a', 'b'], + x: ['foo bar', 'baz bop bop'] + }; + + const ut = table(data).unroll({ t: d => op.split(d.x, ' ') }); + + tableEqual(ut, { + k: ['a', 'a', 'b', 'b', 'b'], + x: ['foo bar', 'foo bar', 'baz bop bop', 'baz bop bop', 'baz bop bop'], + t: ['foo', 'bar', 'baz', 'bop', 'bop'] + }, 'unroll data'); + }); + + it('can invert a rollup', () => { + const data = { + k: ['a', 'a', 'b', 'b'], + x: [1, 2, 3, 4] + }; + + const ut = table(data) + .groupby('k') + .rollup({ x: d => op.array_agg(d.x) }) + .unroll('x'); + + tableEqual(ut, data, 'unroll rollup data'); + }); + + it('preserves column order', () => { + const ut = table({ + x: [[1, 2, 3, 4, 5]], + v: [0] + }) + .unroll('x'); + + tableEqual(ut, { + x: [1, 2, 3, 4, 5], + v: [0, 0, 0, 0, 0] + }, 'unroll data'); + }); + + it('can drop columns', () => { + const dt = table({ + x: [[1, 2, 3, 4, 5]], + u: [0], + v: [1] + }); + + tableEqual(dt.unroll('x', { drop: 'x' }), { + u: [0, 0, 0, 0, 0], + v: [1, 1, 1, 1, 1] + }, 'unroll drop-1 data'); + + tableEqual(dt.unroll('x', { drop: ['u', 'x'] }), { + v: [1, 1, 1, 1, 
1] + }, 'unroll drop-2 array data'); + + tableEqual(dt.unroll('x', { drop: [0, 1] }), { + v: [1, 1, 1, 1, 1] + }, 'unroll drop-2 index data'); + + tableEqual(dt.unroll('x', { drop: { u: 1, x: 1 } }), { + v: [1, 1, 1, 1, 1] + }, 'unroll drop-2 object data'); + + tableEqual(dt.unroll('x', { drop: not('v') }), { + v: [1, 1, 1, 1, 1] + }, 'unroll drop-not data'); + }); }); - -tape('unroll generates rows for array values with index', t => { - const data = { - k: ['a', 'b'], - x: [[1, 2, 3], [1, 2, 3]] - }; - - const ut = table(data).unroll('x', { limit: 2, index: true }); - - tableEqual(t, ut, { - k: ['a', 'a', 'b', 'b'], - x: [1, 2, 1, 2], - index: [0, 1, 0, 1] - }, 'unroll data with index'); - t.end(); -}); - -tape('unroll generates rows for array values with named index', t => { - const data = { - k: ['a', 'b'], - x: [[1, 2, 3], [1, 2, 3]] - }; - - const ut = table(data).unroll('x', { limit: 2, index: 'arridx' }); - - tableEqual(t, ut, { - k: ['a', 'a', 'b', 'b'], - x: [1, 2, 1, 2], - arridx: [0, 1, 0, 1] - }, 'unroll data with index'); - t.end(); -}); - -tape('unroll generates rows for parallel array values', t => { - const data = { - k: ['a', 'b'], - x: [[1, 2, 3], [4, 5, 6]], - y: [[9, 8, 7], [9, 8]] - }; - - const ut = table(data).unroll(['x', 'y']); - - tableEqual(t, ut, { - k: ['a', 'a', 'a', 'b', 'b', 'b'], - x: [1, 2, 3, 4, 5, 6], - y: [9, 8, 7, 9, 8, undefined] - }, 'unroll data'); - t.end(); -}); - -tape('unroll generates rows for parallel array values with index', t => { - const data = { - k: ['a', 'b'], - x: [[1, 2, 3], [4, 5, 6]], - y: [[9, 8, 7], [9, 8]] - }; - - const ut = table(data).unroll(['x', 'y'], { index: true }); - - tableEqual(t, ut, { - k: ['a', 'a', 'a', 'b', 'b', 'b'], - x: [1, 2, 3, 4, 5, 6], - y: [9, 8, 7, 9, 8, undefined], - index: [0, 1, 2, 0, 1, 2] - }, 'unroll data with index'); - t.end(); -}); - -tape('unroll generates rows for parallel array values with named index', t => { - const data = { - k: ['a', 'b'], - x: [[1, 2, 3], [4, 5, 
6]], - y: [[9, 8, 7], [9, 8]] - }; - - const ut = table(data).unroll(['x', 'y'], { index: 'arridx' }); - - tableEqual(t, ut, { - k: ['a', 'a', 'a', 'b', 'b', 'b'], - x: [1, 2, 3, 4, 5, 6], - y: [9, 8, 7, 9, 8, undefined], - arridx: [0, 1, 2, 0, 1, 2] - }, 'unroll data with index'); - t.end(); -}); - -tape('unroll generates rows for derived array', t => { - const data = { - k: ['a', 'b'], - x: ['foo bar', 'baz bop bop'] - }; - - const ut = table(data).unroll({ t: d => op.split(d.x, ' ') }); - - tableEqual(t, ut, { - k: ['a', 'a', 'b', 'b', 'b'], - x: ['foo bar', 'foo bar', 'baz bop bop', 'baz bop bop', 'baz bop bop'], - t: ['foo', 'bar', 'baz', 'bop', 'bop'] - }, 'unroll data'); - t.end(); -}); - -tape('unroll can invert a rollup', t => { - const data = { - k: ['a', 'a', 'b', 'b'], - x: [1, 2, 3, 4] - }; - - const ut = table(data) - .groupby('k') - .rollup({ x: d => op.array_agg(d.x) }) - .unroll('x'); - - tableEqual(t, ut, data, 'unroll rollup data'); - t.end(); -}); - -tape('unroll preserves column order', t => { - const ut = table({ - x: [[1, 2, 3, 4, 5]], - v: [0] - }) - .unroll('x'); - - tableEqual(t, ut, { - x: [1, 2, 3, 4, 5], - v: [0, 0, 0, 0, 0] - }, 'unroll data'); - - t.end(); -}); - -tape('unroll can drop columns', t => { - const dt = table({ - x: [[1, 2, 3, 4, 5]], - u: [0], - v: [1] - }); - - tableEqual(t, dt.unroll('x', { drop: 'x' }), { - u: [0, 0, 0, 0, 0], - v: [1, 1, 1, 1, 1] - }, 'unroll drop-1 data'); - - tableEqual(t, dt.unroll('x', { drop: ['u', 'x'] }), { - v: [1, 1, 1, 1, 1] - }, 'unroll drop-2 array data'); - - tableEqual(t, dt.unroll('x', { drop: [0, 1] }), { - v: [1, 1, 1, 1, 1] - }, 'unroll drop-2 index data'); - - tableEqual(t, dt.unroll('x', { drop: { u: 1, x: 1 } }), { - v: [1, 1, 1, 1, 1] - }, 'unroll drop-2 object data'); - - tableEqual(t, dt.unroll('x', { drop: not('v') }), { - v: [1, 1, 1, 1, 1] - }, 'unroll drop-not data'); - - t.end(); -}); \ No newline at end of file diff --git a/tsconfig.json b/tsconfig.json index 
ee068bf1..b94ec148 100644
--- a/tsconfig.json
+++ b/tsconfig.json
@@ -1,10 +1,15 @@
 {
-  "include": ["src/**/*"],
+  "include": ["src/index-browser.js", "src/index.js"],
   "compilerOptions": {
     "allowJs": true,
+    "checkJs": true,
     "declaration": true,
     "emitDeclarationOnly": true,
+    "esModuleInterop": true,
+    "module": "node16",
+    "moduleResolution": "node16",
     "outDir": "dist/types",
+    "target": "es2022",
     "skipLibCheck": true
   }
 }