Implement MAE with support for position, time, latlon & channel embeddings #47
Conversation
**srmsoumya** commented Nov 21, 2023 (edited)
- Add GeoViT from vit-pytorch that encodes position, time, latlon & channel embeddings
- Add GeoMAE from vit-pytorch
- Run a sample epoch to test the loss curve & training
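For context, the fixed sincos embedding for position & bands mentioned in this PR's commits can be sketched in pure Python. This is a simplified illustration of the standard ViT/MAE-style 1-D sine/cosine encoding, not the PR's actual implementation:

```python
import math

def sincos_embedding(pos: float, dim: int) -> list[float]:
    """1-D fixed sine/cosine embedding of even dimension `dim`,
    in the style of ViT/MAE position encodings (a simplified sketch)."""
    assert dim % 2 == 0, "dim must be even"
    emb = []
    for i in range(dim // 2):
        # Frequencies decay geometrically, as in the original Transformer paper.
        freq = 1.0 / (10000 ** (2 * i / dim))
        emb.append(math.sin(pos * freq))
        emb.append(math.cos(pos * freq))
    return emb
```

Because the embedding is fixed (not learned), the same scheme can encode patch position, timestep, or band index by feeding in the appropriate scalar.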
for more information, see https://pre-commit.ci
srm_trainer.py
Outdated
```python
dm = ClayDataModule(batch_size=128, num_workers=8)
model = GeoMAEModule(lr=1e-2)
trainer = L.Trainer(
    fast_dev_run=False,
    accelerator="gpu",
    devices=1,
    # precision="16-mixed",
    max_epochs=50,
    # log_every_n_steps=100,
    logger=CSVLogger("logs", name="geomae"),
)
trainer.fit(model, dm)
```
Let me know once you've got the code to a stable state, and I can help with setting up the LightningCLI code to allow for different datamodules/models, following https://lightning.ai/docs/pytorch/2.1.0/cli/lightning_cli_intermediate_2.html#multiple-lightningdatamodules. That way we can have a single trainer file and do:

```shell
python trainer.py fit --data=GeoTIFFDataPipeModule --model=ViTLitModule  # from PR52
python trainer.py fit --data=ClayDataModule --model=GeoMAEModule  # this PR47
```
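The dispatch idea behind those commands (resolving `--data`/`--model` names to classes) can be sketched without Lightning. This is only an illustration of what LightningCLI does for you via its class registries; the stub classes are stand-ins, not the real modules:

```python
import argparse

# Hypothetical stand-ins for the real LightningDataModule/LightningModule classes.
class ClayDataModule: pass
class GeoTIFFDataPipeModule: pass
class GeoMAEModule: pass
class ViTLitModule: pass

DATAMODULES = {cls.__name__: cls for cls in (ClayDataModule, GeoTIFFDataPipeModule)}
MODELS = {cls.__name__: cls for cls in (GeoMAEModule, ViTLitModule)}

def parse_cli(argv: list[str]):
    """Resolve --data/--model names to instances, mimicking LightningCLI dispatch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("subcommand", choices=["fit"])
    parser.add_argument("--data", choices=DATAMODULES, required=True)
    parser.add_argument("--model", choices=MODELS, required=True)
    args = parser.parse_args(argv)
    return DATAMODULES[args.data](), MODELS[args.model]()
```

With this, both PR variants share one entrypoint and differ only in the command-line arguments.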
Sure, I initially had issues with the CUDA 12.x setup on my machine and thought it might be because of Lightning CLI.
Yes, I would like to move to the CLI once the experiment is working fine.
Is it an error like `RuntimeError: GET was unable to find an engine to execute this computation`? I've been getting that on my AWS EC2 G5 instance, and had to set `export TORCH_CUDNN_V8_API_DISABLED=1` as a workaround, from https://discuss.pytorch.org/t/segment-anything-model-unable-to-find-engine-to-execute-this-computation/179172.
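As an alternative to exporting the variable in the shell, it can be set from Python. Note this is my own sketch of applying the linked workaround, and it only takes effect if it runs before torch initializes cuDNN, so it belongs at the very top of the script:

```python
import os

# Workaround from the linked PyTorch forum thread: disable cuDNN's v8 API.
# Must be set before `import torch` triggers cuDNN initialization.
os.environ["TORCH_CUDNN_V8_API_DISABLED"] = "1"
```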
Spotted some issues with how you're getting the coordinates from the GeoTIFF (these are UTM coordinates, not latlon or lonlat).
Heads up also that I've opened a small-ish PR to have the LightningDataModule return the spatiotemporal metadata (see #66), since we'll need to save the embeddings with some spatiotemporal metadata too.
src/dataset.py
Outdated
```python
bounds = chip.bounds
centroid_x = (bounds.left + bounds.right) / 2
centroid_y = (bounds.top + bounds.bottom) / 2
```
Coordinates from `chip.bounds` are UTM coordinates and not latlon. You would need to reproject first, or use `chip.lnglat` (https://rasterio.readthedocs.io/en/latest/api/rasterio._base.html#rasterio._base.DatasetBase.lnglat).
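A cheap sanity check (my own sketch, not from the PR) would have flagged this bug early: geographic lon/lat values fall in narrow fixed ranges, while UTM eastings/northings are in metres and typically in the hundreds of thousands:

```python
def looks_like_lonlat(x: float, y: float) -> bool:
    """Heuristic: geographic coordinates fall within [-180, 180] x [-90, 90];
    UTM eastings/northings are metre values far outside those ranges."""
    return -180.0 <= x <= 180.0 and -90.0 <= y <= 90.0
```

Asserting this on the centroid before writing it into the sample dict would catch un-reprojected UTM values immediately.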
I made an implementation to compute lat/lon centroids from bounds in the first tiler version, have a look:

Line 97 in 9a41e3d

```python
def bbox_centroid(bounds, epsg):
```
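One possible shape for such a helper, sketched with pyproj (the actual `bbox_centroid` implementation may differ; the function name and signature here are illustrative):

```python
from pyproj import Transformer

def bbox_centroid_lonlat(bounds: tuple[float, float, float, float], epsg: int):
    """Centroid of (left, bottom, right, top) bounds in the given projected
    EPSG code, reprojected to geographic lon/lat (EPSG:4326)."""
    left, bottom, right, top = bounds
    x = (left + right) / 2
    y = (bottom + top) / 2
    # always_xy=True keeps (x, y) = (easting, northing) -> (lon, lat) ordering.
    transformer = Transformer.from_crs(f"EPSG:{epsg}", "EPSG:4326", always_xy=True)
    lon, lat = transformer.transform(x, y)
    return lon, lat
```

For example, a centroid at easting 500000 m in UTM zone 33N (EPSG:32633) sits on the zone's central meridian, 15°E.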
src/dataset.py
Outdated
```python
return {
    "pixels": chip.read(),
    "timestep": (year, month, day),
    "latlon": (centroid_x, centroid_y),
```
A tuple like `(centroid_x, centroid_y)` would be lonlat, not latlon: longitude is X, latitude is Y.
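The fix is a one-line swap. A tiny sketch of the corrected ordering (variable names mirror the snippet above):

```python
def as_latlon(centroid_x: float, centroid_y: float) -> tuple[float, float]:
    """centroid_x is longitude (X axis) and centroid_y is latitude (Y axis),
    so a latlon tuple must reverse them."""
    return (centroid_y, centroid_x)
```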
Thanks @weiji14, swapping the values.
@weiji14 @lillythomas I ran the
Link is 404 for me. Most probably not shared with me.
Retrieve the patch size value from the model architecture, rather than hardcoding it as 32. Also ensure that the input image shape is the same as the predicted image from the decoder.
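A sketch of that idea; `FakeModel` and the flat `patch_size` attribute are illustrative (the real architecture may expose it differently, e.g. nested under a patch-embedding submodule):

```python
class FakeModel:
    # Illustrative stand-in: the real model may nest this attribute.
    patch_size = 32

def patch_grid(model, image_size: int) -> tuple[int, int]:
    """Derive the patch grid from the model itself instead of hardcoding 32."""
    p = model.patch_size
    if image_size % p != 0:
        raise ValueError(f"image size {image_size} does not tile into {p}px patches")
    return (image_size // p, image_size // p)
```

Checking divisibility up front also guards the decoder-shape invariant: if the input tiles evenly, unpatchifying the prediction reproduces the input image shape.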
Gonna tidy up a few things before merging in this PR. Specifically, moving some code around, plus a few other things:
- Have one entrypoint to run the model using Lightning CLI. Switched the model from ViTLitModule to CLAYModule, and the datamodule from GeoTIFFDataPipeModule to ClayDataModule. Temporarily disabling the logging and monitoring callbacks for now.
- Putting the CLAYModule (LightningModule) together with the CLAY torch.nn.Module in a single model_clay.py file. Have mentioned in src/README.md that model_clay.py is the one with custom spatiotemporal encoders, while the previous model_vit.py contains the vanilla Vision Transformer implementation.
- Add matplotlib-base: publication quality figures in Python!
- Putting the DataLoader code in one file - datamodule.py. The regular torch Dataset classes are placed on top of the existing torchdata-based functions/classes.
- Moving the LogIntermediatePredictions callback class from callbacks.py into callbacks_wandb.py.
- Getting the WandbLogger directly from the trainer, rather than having to pass it through `__init__`. Adapted from https://github.com/ashleve/lightning-hydra-template/blob/334601c0326a50ff301fbd76057b36408cf97ffa/src/callbacks/wandb_callbacks.py#L16C1-L34C6
- Converted docstrings from numpydoc style, which uses less horizontal space but more vertical space. Also added a noqa comment for three instances of `PLR0913 Too many arguments in function definition`.
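The "get the WandbLogger directly from the trainer" pattern can be sketched like this. All names here are illustrative stand-ins (the real helper, adapted from the linked lightning-hydra-template, works against Lightning's actual classes); matching by class name keeps the sketch free of a hard wandb/lightning dependency:

```python
def get_wandb_logger(trainer):
    """Find the WandbLogger on the trainer, so callbacks need not
    receive it through __init__. `trainer` is anything with a `loggers` list."""
    for logger in getattr(trainer, "loggers", []):
        if type(logger).__name__ == "WandbLogger":
            return logger
    raise ValueError("WandbLogger not found on trainer")

# Stand-ins for lightning.pytorch.loggers.WandbLogger and the Trainer:
class WandbLogger: pass

class FakeTrainer:
    loggers = [WandbLogger()]
```

A callback can then call `get_wandb_logger(trainer)` inside its hooks instead of storing the logger at construction time.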
Ok @srmsoumya, I've cleaned up the code a little and fixed all the lint errors. It would be nice to get some unit tests in at some point, but we can iterate on this later. Thanks again for implementing this Clay model for v0 🚀
…dings (#47)

* Add modified ViT to encode latlon, time, channels & position embeddings
* Add MAE for modified ViT
* Add docstrings & fix issue with complex indexing
* Fix the comments on loss computation
* Add datamodule & trainer to run an epoch of training
* Normalize data before feeding to the model
* Add fixed sincos embedding for position & bands
* Add logging & ckpt options
* Fix the order of coords from lat,lon to lon,lat
* Add clay tiny, small, medium, large model versions
* Remove hardcoded patch size in LogIntermediatePredictions callback

  Retrieve the patch size value from the model architecture, rather than hardcoding as 32. Also ensure that the input image shape is the same as the predicted image from the decoder.

* Run clay small on image size 512 for 10 epochs with grad_acc
* Make the clay construction configurable
* Return the data path to reference for vector embeddings
* Remove duplicate dataset.py & geovit.py
* 🔀 Merge srm_trainer.py into trainer.py

  Have one entrypoint to run the model using Lightning CLI. Switched model from ViTLitModule to CLAYModule, and datamodule from GeoTIFFDataPipeModule to ClayDataModule. Temporarily disabling the logging and monitoring callbacks for now.

* 🔀 Combine clay.py and model.py into model_clay

  Putting the CLAYModule (LightningModule) together with the CLAY torch.nn.Module in a single model_clay.py file. Have mentioned in src/README.md that model_clay.py is the one with custom spatiotemporal encoders, while the previous model_vit.py contains the vanilla Vision Transformer implementation.

* ➕ Add matplotlib-base

  Publication quality figures in Python!

* 🚚 Move ClayDataset and ClayDataModule into datamodule.py

  Putting the DataLoader code in one file - datamodule.py. The regular torch Dataset classes are placed on top of the existing torchdata-based functions/classes.

* 🚚 Move LogIntermediatePredictions callback into callbacks_wandb

  Moving the LogIntermediatePredictions callback class from callbacks.py into callbacks_wandb.py.

* ♻️ Get WandB logger properly using a shared function

  Getting the WandbLogger directly from the trainer, rather than having to pass it through `__init__`. Adapted from https://github.com/ashleve/lightning-hydra-template/blob/334601c0326a50ff301fbd76057b36408cf97ffa/src/callbacks/wandb_callbacks.py#L16C1-L34C6

* 🚨 Wrap docstring and fix too-many-arguments lint error

  Converted docstrings from numpydoc style, which uses less horizontal space but more vertical space. Also added a noqa comment for three instances of `PLR0913 Too many arguments in function definition`.

---------

Co-authored-by: SRM <soumya@developmentseed.org>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Wei Ji <23487320+weiji14@users.noreply.github.com>