Table item access definition
Examples of item access using astropy.table are shown below. The output of each example is shown in detail further down, along with commentary about the returned object. Please feel free to edit this page directly to comment.
import numpy as np
from astropy.table import Column, TableColumns, Table

tc = TableColumns(cols=[Column('a'), Column('b'), Column('c')])
tc['a']
tc[1]
tc['a', 'b']
tc[1:3]

t = Table(np.arange(30).reshape(10,3), names=('a','b','c'))
t.columns
t['a']
t[1]
t[2:5]
t[np.array([2,5,7])]
t['a', 'c']
Define Table Columns
>>> tc = TableColumns(cols=[Column('a'), Column('b'), Column('c')])
The current implementation does not have a tight coupling between TableColumns and a parent table. There is a table attribute, but it isn't actually used for anything (and will be removed unless we identify a use). The original suggestion of having slicing or multiple-column access return the corresponding Table seems inconsistent. It feels like slicing a TableColumns object should return another TableColumns object. One can always do Table(tc[2:5]) to make a new table with columns 2:5, and the Table object now supports selecting multiple columns to create a new table.
If there is no real table coupling then calling the class ColumnList makes more sense, except that it is implemented as an OrderedDict, so implying that it behaves like a list would be confusing. I'm not tied to using OrderedDict; this was inherited from ATPy. In my head the columns in a table are really a list-like entity which you should be able to access by matching an item to the column name (or names in the case of VO). I always want to do for col in table.columns and have col be a Column, not the name of a Column. So maybe move to a plain list as the basis for ColumnList?
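To make the list-based idea concrete, here is a minimal pure-Python sketch. It is only an illustration under stated assumptions: the Column class below is a bare stand-in (not astropy's Column), and this ColumnList is a hypothetical design, not existing astropy code. It subclasses list so that iteration yields columns, while item access still accepts a name, a position, a tuple of names, or a slice.

```python
class Column:
    """Bare stand-in for a table column (hypothetical, not astropy's Column)."""

    def __init__(self, name, data=()):
        self.name = name
        self.data = list(data)

    def __repr__(self):
        return f"<Column name={self.name!r}>"


class ColumnList(list):
    """Hypothetical list of columns that can also be indexed by name."""

    @property
    def names(self):
        return tuple(col.name for col in self)

    def __getitem__(self, item):
        if isinstance(item, str):
            # Access by name: return the first column with a matching name.
            for col in self:
                if col.name == item:
                    return col
            raise KeyError(item)
        if isinstance(item, tuple):
            # Tuple of names: return a new ColumnList with those columns.
            return ColumnList(self[name] for name in item)
        if isinstance(item, slice):
            # Slice: return a new ColumnList rather than a plain list.
            return ColumnList(super().__getitem__(item))
        # Integer position: plain list behaviour.
        return super().__getitem__(item)


cols = ColumnList([Column('a'), Column('b'), Column('c')])
by_name = cols['a']          # one Column
by_position = cols[1]        # one Column
subset = cols['a', 'b']      # new ColumnList
sliced = cols[1:3]           # new ColumnList
iterated = [col.name for col in cols]  # iteration yields Column objects
```

With this shape, for col in cols gives Column objects directly, which is the behaviour asked for above. The cost is that lookup by name becomes a linear scan instead of a dictionary lookup.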
Access table column by name : returns one Column
>>> tc['a']
<Column name='a units='None' format='None' description='None'>
array([], dtype=float64)
Access table column by position : returns one Column
>>> tc[1]
<Column name='b units='None' format='None' description='None'>
array([], dtype=float64)
Multiple columns : Returns new TableColumns
>>> tc['a', 'b']
<TableColumns names=('a','b')>
Columns slice : Returns new TableColumns
>>> tc[1:3]
<TableColumns names=('b','c')>
Define a Table : 10 rows and 3 columns
>>> t = Table(np.arange(30).reshape(10,3), names=('a','b','c'))
Get the underlying TableColumns object
>>> t.columns
<TableColumns names=('a','b','c')>
The main unique functionality brought by this object is selecting columns by numerical index instead of name. Personally I never find a need to do this, but if others find this useful then it is now there.
Get a Column : returns REF to Table.columns['a']
>>> t['a']
<Column name='a units='None' format='None' description='None'>
array([ 0, 3, 6, 9, 12, 15, 18, 21, 24, 27])
Get a Row : returns Row object with REF to table data
>>> t[1]
(3, 4, 5)
In the current code this just returns self._data[1], i.e. an np.void REF to the row values.
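The REF behaviour of np.void can be checked directly in NumPy. The sketch below assumes the table data is held in a structured array; the data array is a hand-built stand-in for self._data, not astropy's internal object.

```python
import numpy as np

# Stand-in for the table's structured array (assumption, built by hand).
raw = np.arange(30).reshape(10, 3)
data = np.empty(10, dtype=[('a', 'i8'), ('b', 'i8'), ('c', 'i8')])
for i, name in enumerate(data.dtype.names):
    data[name] = raw[:, i]

row = data[1]      # integer index on a structured array gives an np.void
row['a'] = -1      # structured scalars are mutable and write through to data
```

After the assignment, data['a'][1] holds the new value, so a row returned this way behaves as a reference to the table data rather than a snapshot of it.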
Get a Table slice : returns new Table object with REF view of rows 2:5
>>> t[2:5]
<Table rows=3 names=('a','b','c')>
array([(6, 7, 8), (9, 10, 11), (12, 13, 14)],
      dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])
Currently the code returns a COPY of the rows. I think this should be changed to be a view.
Get a fancy index slice : returns COPY of the rows
>>> t[np.array([2,5,7])]  # Table obj with rows 2, 5, 7 (COPY)
<Table rows=3 names=('a','b','c')>
array([(6, 7, 8), (15, 16, 17), (21, 22, 23)],
      dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])
This behavior is consistent with NumPy.
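For comparison, here is a minimal NumPy sketch of the two cases, a basic slice and a fancy index, on a plain structured array. The data array is a hand-built stand-in for the table data (an assumption for illustration), so this shows NumPy's semantics only, not what Table currently does.

```python
import numpy as np

# Stand-in for the table's structured array (assumption, built by hand).
raw = np.arange(30).reshape(10, 3)
data = np.empty(10, dtype=[('a', 'i8'), ('b', 'i8'), ('c', 'i8')])
for i, name in enumerate(data.dtype.names):
    data[name] = raw[:, i]

sliced = data[2:5]                   # basic slice: a view of rows 2, 3, 4
picked = data[np.array([2, 5, 7])]   # fancy index: a copy of rows 2, 5, 7

slice_shares = np.shares_memory(data, sliced)
fancy_shares = np.shares_memory(data, picked)

sliced['a'][0] = -1   # writes through to data['a'][2]
picked['a'][1] = -2   # changes only the copy; data['a'][5] is untouched
```

Here slice_shares is True and fancy_shares is False, which matches the view semantics proposed for t[2:5] and the copy semantics for t[np.array([2,5,7])].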
Select columns from Table : returns new Table with COPY of selected column data.
>>> t['a', 'c']  # Table with cols 'a', 'c' (COPY)
<Table rows=10 names=('a','c')>
array([(0, 2), (3, 5), (6, 8), (9, 11), (12, 14), (15, 17), (18, 20),
       (21, 23), (24, 26), (27, 29)],
      dtype=[('a', '<i8'), ('c', '<i8')])
Is COPY or REF better here? Probably most users would imagine they are getting a copy when they do this operation, and in some sense it is closer to fancy indexing than slicing.
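If COPY is the chosen behaviour, it corresponds to taking an explicit copy of the selected fields in NumPy. The sketch below assumes the table data is a structured array, with data as a hand-built stand-in. Whether the un-copied selection data[['a', 'c']] shares memory with the original has varied between NumPy versions, so the sketch calls .copy() explicitly instead of relying on either behaviour.

```python
import numpy as np

# Stand-in for the table's structured array (assumption, built by hand).
raw = np.arange(30).reshape(10, 3)
data = np.empty(10, dtype=[('a', 'i8'), ('b', 'i8'), ('c', 'i8')])
for i, name in enumerate(data.dtype.names):
    data[name] = raw[:, i]

# Select two fields and force an independent copy of their data.
selected = data[['a', 'c']].copy()

selected['a'][0] = -1   # modifies the copy only; data['a'][0] is still 0
```

The explicit copy keeps the result independent of the original array, which is what most users would probably expect from selecting columns.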