@@ -28,6 +28,7 @@ tensorflow 2's ``tf.data.Dataset``.
 It provides a simple solution to oversampling / stratification, weighted
 sampling, and finally converting to a ``torch.utils.data.DataLoader``.

+
 Install
 =======

@@ -41,6 +42,7 @@ Or, for the old-timers:

     pip install pytorch-datastream

+
 Usage
 =====

@@ -72,6 +74,43 @@ a more extensive list on API and usage.
     .state_dict
     .load_state_dict

+
+Simple image dataset example
+----------------------------
+Here's a basic example of loading images from a directory:
+
+.. code-block:: python
+
+    from datastream import Dataset
+    from pathlib import Path
+    from PIL import Image
+
+    # Assuming images are in a directory structure like:
+    # images/
+    #   class1/
+    #     image1.jpg
+    #     image2.jpg
+    #   class2/
+    #     image3.jpg
+    #     image4.jpg
+
+    image_dir = Path("images")
+    image_paths = list(image_dir.glob("**/*.jpg"))
+
+    dataset = (
+        Dataset.from_paths(image_paths, pattern=r".*/(?P<class_name>\w+)/(?P<image_name>\w+).jpg")
+        .map(lambda row: dict(
+            image=Image.open(row["path"]),
+            class_name=row["class_name"],
+            image_name=row["image_name"],
+        ))
+    )
+
+    # Access an item from the dataset
+    first_item = dataset[0]
+    print(f"Class: {first_item['class_name']}, Image name: {first_item['image_name']}")
+
+
 Merge / stratify / oversample datastreams
 -----------------------------------------
 The fruit datastreams given below repeatedly yield the string of their fruit
     >>> next(iter(datastream.data_loader(batch_size=8)))
     ['apple', 'apple', 'pear', 'banana', 'apple', 'apple', 'pear', 'banana']

+
 Zip independently sampled datastreams
 -------------------------------------
 The fruit datastreams given below repeatedly yield the string of their fruit
@@ -101,12 +141,8 @@ type.
     >>> next(iter(datastream.data_loader(batch_size=4)))
     [('apple', 'pear'), ('apple', 'banana'), ('apple', 'pear'), ('apple', 'banana')]

+
 More usage examples
 -------------------
 See the `documentation <https://pytorch-datastream.readthedocs.io/en/latest/>`_
 for more usage examples.
-
-Install from source
-===================
-
-.. pip install -e .
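
A note on the ``pattern`` argument in the new image example: it is a regular expression with two named groups, and the example reads those groups back as ``row["class_name"]`` and ``row["image_name"]``. The sketch below uses only the standard library to show what that pattern extracts from a path. The helper ``parse_path`` is hypothetical, and using ``fullmatch`` is an assumption; how ``Dataset.from_paths`` applies the pattern internally is not shown in this diff.

```python
import re
from pathlib import PurePosixPath

# Pattern copied verbatim from the README example in this diff.
PATTERN = re.compile(r".*/(?P<class_name>\w+)/(?P<image_name>\w+).jpg")


def parse_path(path):
    """Hypothetical helper: return a row-like dict, or None if there is no match."""
    text = PurePosixPath(path).as_posix()
    # Assumption: the whole path has to match the pattern.
    match = PATTERN.fullmatch(text)
    if match is None:
        return None
    # Named groups become dictionary keys next to the path itself.
    return dict(path=text, **match.groupdict())


rows = [
    parse_path("images/class1/image1.jpg"),
    parse_path("images/class2/image3.jpg"),
]
for row in rows:
    print(row["class_name"], row["image_name"])
```

Because ``\w+`` only matches letters, digits and underscores, a file such as ``images/class1/my-image.jpg`` would not match under these assumptions, so it may be worth mentioning that restriction in the README.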