@@ -68,7 +68,7 @@ First, initialize column types you want to extract from the `StructuredDataset`.

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 31-32
+ :lines: 36-37
```

Define a task that opens a structured dataset by calling `all()`.
@@ -78,7 +78,7 @@ For instance, you can use ``pa.Table`` to convert the Pandas DataFrame to a PyAr

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 42-52
+ :lines: 47-57
```

The code may result in runtime failures if the columns do not match.
@@ -91,7 +91,7 @@ and enable the CSV serialization by annotating the structured dataset with the C

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 58-72
+ :lines: 63-77
```

## Storage driver and location
@@ -230,14 +230,14 @@ and the byte format, which in this case is `PARQUET`.

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 128-130
+ :lines: 133-135
```

You can now use `numpy.ndarray` to deserialize the parquet file to NumPy and serialize a task's output (NumPy array) to a parquet file.

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 135-148
+ :lines: 140-153
```

:::{note}
@@ -248,7 +248,7 @@ You can run the code locally as follows:

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 152-156
+ :lines: 157-161
```

### The nested typed columns
@@ -261,7 +261,7 @@ Nested field StructuredDataset should be run when flytekit version > 1.11.0.

```{literalinclude} /examples/data_types_and_io/data_types_and_io/structured_dataset.py
:caption: data_types_and_io/structured_dataset.py
- :lines: 158-285
+ :lines: 163-290
```

[flytesnacks]: https://github.com/flyteorg/flytesnacks/tree/master/examples/data_types_and_io/