Implemented ListArray and ListOffsetArray's __getitem__. (#11)
* Start by refreshing RawArray class.

* Reinstate the RawArray tests.

* RawArray now has the infrastructure for getitem.

* Maybe I shouldn't be working on RawArray; I don't see how this will fit into ListOffsetArray.

* Work on ListArray instead.

* Cleaned up setid.

* Ready to work on ListArray.

* More cleaning up; working on making getitem universal.

* Cleaned up a lot of duplication in pyawkward.cpp.

* Rename Content::get (and others) to ::getitem_at and ::slice to ::getitem_range.

* Start on ListArray tests.

* Tested ListArray::getitem_at and ListArray::getitem_range.

* Ready to work on ListArray::getitem_next.

* Implemented basic (not entirely correct) ListArray::getitem for SliceArray.

* more correct

* Very nearly have recursive ListArray::getitem((array, array)).

* it works

* Split ListArray::getitem(array) into advanced and non-advanced cases.

* [skip ci] calling NumpyArray::getitem_next from getitem_next isn't looking promising

* It looks like NumpyArray::getitem_next(3 args) can be a simple call to NumpyArray::getitem_next(6 args)

* ListArray::getitem_next slice and array tests work; need to clean up.

* Cleaned up.

* Solved ListArray::getitem_next(SliceAt).

* Cleaned up ListArray::getitem_next(SliceAt).

* ListArray::getitem_next(SliceEllipsis), but SliceNewAxis will have to wait for RegularArray.

* ListOffsetArray can do everything ListArray can do.

* Started converting cases that create ListArrays into creating ListOffsetArrays.

* Continuing to convert cases that create ListArrays into creating ListOffsetArrays.

* ListArray and ListOffsetArray now share an entry getitem_next.

* Fix problems in compilation.

* Fix more problems in compilation.

* If *not* py27.

* Implemented a new setid for ListOffsetArray.

* Finished up PR #11.

* Fix warnings on Windows and MacOS.
jpivarski authored Sep 26, 2019
1 parent 9fb0a9a commit fe17e09
Showing 33 changed files with 2,872 additions and 1,236 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -71,6 +71,9 @@ docs/_build/

############################################################# C and C++

# ctest
Testing/

# Prerequisites
*.d

2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -48,7 +48,7 @@ add_library(awkward-static STATIC $<TARGET_OBJECTS:awkward-objects>)
add_library(awkward SHARED $<TARGET_OBJECTS:awkward-objects>)
target_link_libraries(awkward-static PRIVATE awkward-cpu-kernels-static)
target_link_libraries(awkward PRIVATE awkward-cpu-kernels-static)
addtest(test-PR8-rawarray "tests/test_PR8_rawarray_and_slices.cpp")
addtest(test-PR10 "tests/test_PR10_rawarray_getitem.cpp")

pybind11_add_module(layout src/pyawkward.cpp)
target_link_libraries(layout PRIVATE awkward-static)
1 change: 1 addition & 0 deletions README.md
@@ -56,6 +56,7 @@ The following features of awkward 0.x will be features of awkward 1.x.
* 2019-09-02 (PR [#7](../../pull/7)): refactored `Index`, `Identity`, and `ListOffsetArray` (and any other array types with `Index`, which is nearly all of them) to have a 32-bit and a 64-bit version. My original plan to only support 64-bit in "chunked arrays" with 32-bit everywhere else is hereby scrapped—both bit widths will be supported on all indexes. Non-native endian, non-trivial strides, and multidimensional `Index`/`Identity` are not supported, though all of these features are allowed for `NumpyArray` (which is _content_, not an _index_). The only limitation on `NumpyArray` is that data must be C-ordered, not Fortran-ordered.
* 2019-09-21 (PR [#8](../../pull/8)): C++ NumpyArray::getitem is done, setting the pattern for other classes (external C functions). The Numba and Identity extensions are not done, which would be necessary to fully set the pattern. This involved a lot of investigation (see [studies/getitem.py](https://github.com/jpivarski/awkward-1.0/blob/master/studies/getitem.py)).
* 2019-09-21 (PR [#9](../../pull/9)): `Identity` is correctly passed through `NumpyArray` slices and `__getitem__` uses `get`, `slice`, or the full `getitem`, depending on argument complexity.
* 2019-09-26 (PR [#11](../../pull/11)): fully implemented `ListArray` and `ListOffsetArray`'s `__getitem__`.

## Roadmap

2 changes: 1 addition & 1 deletion VERSION_INFO
@@ -1 +1 @@
0.1.3
0.1.4
2 changes: 1 addition & 1 deletion awkward1/__init__.py
@@ -2,6 +2,6 @@

import awkward1.layout
import awkward1._numba
from awkward1.operations.format import *
from awkward1.operations.convert import *

__version__ = awkward1.layout.__version__
@@ -4,6 +4,7 @@

import numpy

import awkward1.util
import awkward1.layout

def tolist(array):
@@ -16,7 +17,7 @@ def tolist(array):
elif isinstance(array, awkward1.layout.NumpyArray):
return numpy.asarray(array).tolist()

elif isinstance(array, (awkward1.layout.ListOffsetArray32, awkward1.layout.ListOffsetArray64)):
elif isinstance(array, awkward1.util.anycontent):
return [tolist(x) for x in array]

else:
11 changes: 11 additions & 0 deletions awkward1/util.py
@@ -0,0 +1,11 @@
# BSD 3-Clause License; see https://github.com/jpivarski/awkward-1.0/blob/master/LICENSE

import awkward1.layout

anycontent = (
awkward1.layout.NumpyArray,
awkward1.layout.ListArray32,
awkward1.layout.ListArray64,
awkward1.layout.ListOffsetArray32,
awkward1.layout.ListOffsetArray64,
)
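The `anycontent` tuple works because Python's built-in `isinstance` accepts a tuple of classes and matches any of them. The sketch below illustrates that dispatch pattern with stand-in classes; `FakeNumpyArray`, `FakeListArray`, and `fake_anycontent` are hypothetical names invented for this example and are not part of `awkward1.layout`.

```python
class FakeNumpyArray:
    """Hypothetical stand-in for a flat leaf array (not awkward1.layout.NumpyArray)."""
    def __init__(self, data):
        self.data = list(data)

class FakeListArray:
    """Hypothetical stand-in for a list-type array; iterating yields its sub-arrays."""
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        return iter(self.items)

# One tuple naming every content type, in the spirit of awkward1.util.anycontent.
fake_anycontent = (FakeNumpyArray, FakeListArray)

def tolist(array):
    if isinstance(array, FakeNumpyArray):
        # The leaf case is checked first, so it never reaches the generic branch.
        return list(array.data)
    elif isinstance(array, fake_anycontent):
        # isinstance accepts a tuple of classes and matches any of them.
        return [tolist(x) for x in array]
    else:
        return array

nested = FakeListArray([FakeNumpyArray([1.1, 2.2]), FakeNumpyArray([]), FakeNumpyArray([3.3])])
print(tolist(nested))  # [[1.1, 2.2], [], [3.3]]
```

Keeping the type list in one place means a new content class only has to be registered once rather than in every `isinstance` check.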
10 changes: 8 additions & 2 deletions include/awkward/Content.h
@@ -5,6 +5,7 @@

#include "awkward/cpu-kernels/util.h"
#include "awkward/Identity.h"
#include "awkward/Slice.h"

namespace awkward {
class Content {
@@ -15,11 +16,16 @@ namespace awkward {
virtual const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const = 0;
virtual int64_t length() const = 0;
virtual const std::shared_ptr<Content> shallow_copy() const = 0;
virtual const std::shared_ptr<Content> get(int64_t at) const = 0;
virtual const std::shared_ptr<Content> slice(int64_t start, int64_t stop) const = 0;
virtual const std::shared_ptr<Content> getitem_at(int64_t at) const = 0;
virtual const std::shared_ptr<Content> getitem_range(int64_t start, int64_t stop) const = 0;
virtual const std::shared_ptr<Content> getitem(const Slice& where) const;
virtual const std::shared_ptr<Content> getitem_next(const std::shared_ptr<SliceItem> head, const Slice& tail, const Index64& advanced) const = 0;
virtual const std::shared_ptr<Content> carry(const Index64& carry) const = 0;
virtual const std::pair<int64_t, int64_t> minmax_depth() const = 0;

const std::string tostring() const;
const std::shared_ptr<Content> getitem_ellipsis(const Slice& tail, const Index64& advanced) const;
const std::shared_ptr<Content> getitem_newaxis(const Slice& tail, const Index64& advanced) const;
};
}
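The `getitem_next(head, tail, advanced)` signature suggests a head/tail recursion over the items of a slice tuple. The following is only a rough sketch of that idea on nested Python lists, assuming integer and `slice` items only; it omits the `advanced` index (array-based indexing), ellipsis, and newaxis, and the function bodies are illustrative rather than a transcription of the C++.

```python
def getitem_next(data, head, tail):
    # head is the slice item for this depth; tail holds the items for deeper levels.
    if head is None:
        return data  # no slice items left: return what remains unchanged
    next_head = tail[0] if tail else None
    next_tail = tail[1:]
    if isinstance(head, int):
        # An integer removes this dimension and recurses into the chosen element.
        return getitem_next(data[head], next_head, next_tail)
    elif isinstance(head, slice):
        # A range keeps this dimension and applies the rest to every element.
        return [getitem_next(x, next_head, next_tail) for x in data[head]]
    else:
        raise TypeError("unsupported slice item: %r" % (head,))

def getitem(data, where):
    # Normalize a single item to a tuple, then split it into head and tail.
    if not isinstance(where, tuple):
        where = (where,)
    if len(where) == 0:
        return data
    return getitem_next(data, where[0], where[1:])

data = [[1, 2, 3], [], [4, 5], [6]]
print(getitem(data, (2, 0)))                       # 4
print(getitem(data, (slice(None), slice(0, 1))))   # [[1], [], [4], [6]]
```

Each level consumes one slice item and hands the remainder down, which is why a single entry point can serve arbitrarily nested content.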

10 changes: 6 additions & 4 deletions include/awkward/Identity.h
@@ -32,10 +32,11 @@ namespace awkward {
const int64_t width() const { return width_; }
const int64_t length() const { return length_; }

virtual const std::shared_ptr<Identity> to64() const = 0;
virtual const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const = 0;
virtual const std::shared_ptr<Identity> slice(int64_t start, int64_t stop) const = 0;
virtual const std::shared_ptr<Identity> getitem_range(int64_t start, int64_t stop) const = 0;
virtual const std::shared_ptr<Identity> shallow_copy() const = 0;
virtual const std::shared_ptr<Identity> getitem_carry_64(Index64& carry) const = 0;
virtual const std::shared_ptr<Identity> getitem_carry_64(const Index64& carry) const = 0;

protected:
const Ref ref_;
@@ -57,10 +58,11 @@ namespace awkward {

const std::shared_ptr<T> ptr() const { return ptr_; }

virtual const std::shared_ptr<Identity> to64() const;
virtual const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const;
virtual const std::shared_ptr<Identity> slice(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Identity> getitem_range(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Identity> shallow_copy() const;
virtual const std::shared_ptr<Identity> getitem_carry_64(Index64& carry) const;
virtual const std::shared_ptr<Identity> getitem_carry_64(const Index64& carry) const;

const std::string tostring() const;
const std::vector<T> get(int64_t at) const;
6 changes: 3 additions & 3 deletions include/awkward/Index.h
@@ -18,7 +18,7 @@ namespace awkward {
class IndexOf: public Index {
public:
IndexOf<T>(int64_t length)
: ptr_(std::shared_ptr<T>(new T[(size_t)length], awkward::util::array_deleter<T>()))
: ptr_(std::shared_ptr<T>(length == 0 ? nullptr : new T[(size_t)length], awkward::util::array_deleter<T>()))
, offset_(0)
, length_(length) { }
IndexOf<T>(const std::shared_ptr<T> ptr, int64_t offset, int64_t length)
@@ -32,8 +32,8 @@ namespace awkward {

const std::string tostring() const;
const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const;
T get(int64_t at) const;
IndexOf<T> slice(int64_t start, int64_t stop) const;
T getitem_at(int64_t at) const;
IndexOf<T> getitem_range(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Index> shallow_copy() const;

private:
2 changes: 1 addition & 1 deletion include/awkward/Iterator.h
@@ -17,7 +17,7 @@ namespace awkward {
const int64_t where() const { return where_; }

const bool isdone() const { return where_ >= content_.get()->length(); }
const std::shared_ptr<Content> next() { return content_.get()->get(where_++); }
const std::shared_ptr<Content> next() { return content_.get()->getitem_at(where_++); }

const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const;
const std::string tostring() const;
50 changes: 50 additions & 0 deletions include/awkward/ListArray.h
@@ -0,0 +1,50 @@
// BSD 3-Clause License; see https://github.com/jpivarski/awkward-1.0/blob/master/LICENSE

#ifndef AWKWARD_LISTARRAY_H_
#define AWKWARD_LISTARRAY_H_

#include <memory>

#include "awkward/cpu-kernels/util.h"
#include "awkward/Index.h"
#include "awkward/Identity.h"
#include "awkward/Content.h"

namespace awkward {
template <typename T>
class ListArrayOf: public Content {
public:
ListArrayOf<T>(const std::shared_ptr<Identity> id, const IndexOf<T> starts, const IndexOf<T> stops, const std::shared_ptr<Content> content)
: id_(id)
, starts_(starts)
, stops_(stops)
, content_(content) { }

const IndexOf<T> starts() const { return starts_; }
const IndexOf<T> stops() const { return stops_; }
const std::shared_ptr<Content> content() const { return content_.get()->shallow_copy(); }

virtual const std::shared_ptr<Identity> id() const { return id_; }
virtual void setid();
virtual void setid(const std::shared_ptr<Identity> id);
virtual const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const;
virtual int64_t length() const;
virtual const std::shared_ptr<Content> shallow_copy() const;
virtual const std::shared_ptr<Content> getitem_at(int64_t at) const;
virtual const std::shared_ptr<Content> getitem_range(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Content> getitem_next(const std::shared_ptr<SliceItem> head, const Slice& tail, const Index64& advanced) const;
virtual const std::shared_ptr<Content> carry(const Index64& carry) const;
virtual const std::pair<int64_t, int64_t> minmax_depth() const;

private:
std::shared_ptr<Identity> id_;
const IndexOf<T> starts_;
const IndexOf<T> stops_;
const std::shared_ptr<Content> content_;
};

typedef ListArrayOf<int32_t> ListArray32;
typedef ListArrayOf<int64_t> ListArray64;
}

#endif // AWKWARD_LISTARRAY_H_
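To make the `starts`/`stops`/`content` layout of `ListArrayOf<T>` concrete, here is a small pure-Python model. `ListArraySketch` is a hypothetical name, plain lists stand in for `IndexOf<T>` and `Content`, and the handling of negative indexes is an assumption rather than something this header confirms.

```python
class ListArraySketch:
    """Hypothetical pure-Python model; plain lists stand in for IndexOf<T> and Content."""

    def __init__(self, starts, stops, content):
        if len(stops) < len(starts):
            raise ValueError("stops must be at least as long as starts")
        self.starts = list(starts)
        self.stops = list(stops)
        self.content = content

    def __len__(self):
        return len(self.starts)

    def getitem_at(self, at):
        # Assumption: negative indexes count from the end, as in Python.
        regular_at = at + len(self) if at < 0 else at
        if not 0 <= regular_at < len(self):
            raise IndexError("index out of range")
        return self.content[self.starts[regular_at]:self.stops[regular_at]]

    def getitem_range(self, start, stop):
        # Only starts and stops are sliced; content is shared, not copied.
        lo, hi, _ = slice(start, stop).indices(len(self))
        return ListArraySketch(self.starts[lo:hi], self.stops[lo:hi], self.content)

    def tolist(self):
        return [self.getitem_at(i) for i in range(len(self))]

content = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9]
array = ListArraySketch([0, 3, 3, 5, 6], [3, 3, 5, 6, 10], content)
print(array.getitem_at(2))                 # [3.3, 4.4]
print(array.getitem_range(1, 4).tolist())  # [[], [3.3, 4.4], [5.5]]

# Lists need not be contiguous or in order within content.
shuffled = ListArraySketch([5, 0], [6, 3], content)
print(shuffled.tolist())                   # [[5.5], [0.0, 1.1, 2.2]]
```

Because each list is described by its own start and stop, a range selection touches only the two index arrays and leaves `content` alone.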
12 changes: 7 additions & 5 deletions include/awkward/ListOffsetArray.h
@@ -1,7 +1,7 @@
// BSD 3-Clause License; see https://github.com/jpivarski/awkward-1.0/blob/master/LICENSE

#ifndef AWKWARD_LISTOFFSETARRAYCONTENT_H_
#define AWKWARD_LISTOFFSETARRAYCONTENT_H_
#ifndef AWKWARD_LISTOFFSETARRAY_H_
#define AWKWARD_LISTOFFSETARRAY_H_

#include <memory>

@@ -28,8 +28,10 @@ namespace awkward {
virtual const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const;
virtual int64_t length() const;
virtual const std::shared_ptr<Content> shallow_copy() const;
virtual const std::shared_ptr<Content> get(int64_t at) const;
virtual const std::shared_ptr<Content> slice(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Content> getitem_at(int64_t at) const;
virtual const std::shared_ptr<Content> getitem_range(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Content> getitem_next(const std::shared_ptr<SliceItem> head, const Slice& tail, const Index64& advanced) const;
virtual const std::shared_ptr<Content> carry(const Index64& carry) const;
virtual const std::pair<int64_t, int64_t> minmax_depth() const;

private:
@@ -42,4 +44,4 @@ namespace awkward {
typedef ListOffsetArrayOf<int64_t> ListOffsetArray64;
}

#endif // AWKWARD_LISTOFFSETARRAYCONTENT_H_
#endif // AWKWARD_LISTOFFSETARRAY_H_
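The commit message notes that `ListOffsetArray` can do everything `ListArray` can do. One way to see why: a single `offsets` array can be viewed as a `starts`/`stops` pair by dropping its last and first elements. A minimal sketch with plain Python lists follows; `offsets_to_starts_stops` is a hypothetical helper name, not a function from this commit.

```python
def offsets_to_starts_stops(offsets):
    # offsets has one more entry than the number of lists it describes.
    if len(offsets) == 0:
        raise ValueError("offsets must have at least one element")
    starts = list(offsets[:-1])  # every offset except the last
    stops = list(offsets[1:])    # every offset except the first
    return starts, stops

offsets = [0, 3, 3, 5, 6, 10]
starts, stops = offsets_to_starts_stops(offsets)
print(starts)  # [0, 3, 3, 5, 6]
print(stops)   # [3, 3, 5, 6, 10]

content = [0.0, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9]
lists = [content[a:b] for a, b in zip(starts, stops)]
print(lists)   # [[0.0, 1.1, 2.2], [], [3.3, 4.4], [5.5], [6.6, 7.7, 8.8, 9.9]]
```

The reverse direction is not always possible, since arbitrary `starts` and `stops` may overlap, leave gaps, or be out of order.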
10 changes: 6 additions & 4 deletions include/awkward/NumpyArray.h
@@ -46,17 +46,19 @@ namespace awkward {
virtual const std::string tostring_part(const std::string indent, const std::string pre, const std::string post) const;
virtual int64_t length() const;
virtual const std::shared_ptr<Content> shallow_copy() const;
virtual const std::shared_ptr<Content> get(int64_t at) const;
virtual const std::shared_ptr<Content> slice(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Content> getitem_at(int64_t at) const;
virtual const std::shared_ptr<Content> getitem_range(int64_t start, int64_t stop) const;
virtual const std::shared_ptr<Content> getitem(const Slice& where) const;
virtual const std::shared_ptr<Content> getitem_next(const std::shared_ptr<SliceItem> head, const Slice& tail, const Index64& advanced) const;
virtual const std::shared_ptr<Content> carry(const Index64& carry) const;
virtual const std::pair<int64_t, int64_t> minmax_depth() const;

bool iscontiguous() const;
void become_contiguous();
const NumpyArray contiguous() const;
const NumpyArray contiguous_next(Index64 bytepos) const;
const std::shared_ptr<Content> getitem(const Slice& slice) const;
const NumpyArray getitem_bystrides(const std::shared_ptr<SliceItem>& head, const Slice& tail, int64_t length) const;
const NumpyArray getitem_next(const std::shared_ptr<SliceItem> head, const Slice& tail, Index64& carry, Index64& advanced, int64_t length, int64_t stride) const;
const NumpyArray getitem_next(const std::shared_ptr<SliceItem> head, const Slice& tail, const Index64& carry, const Index64& advanced, int64_t length, int64_t stride) const;

private:
std::shared_ptr<Identity> id_;