
Temporal embedding #22

Closed
lillythomas opened this issue Nov 3, 2023 · 8 comments

@lillythomas
Contributor

Temporal

For v0, we've decided to structure the inputs as mono-temporal (i.e., one time step per data cube). To combine inputs, we'll seek to match S1 and S2 captures within +/- 3 days, though this may take some experimentation. To capture temporal semantics, we'll embed the timestamp.
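For reference, a minimal sketch of that +/- 3 day matching logic, assuming each capture object carries a `timestamp` attribute of type `datetime` (the function and attribute names here are hypothetical, not from the actual pipeline):

```python
from datetime import timedelta

MAX_OFFSET = timedelta(days=3)

def match_s1_s2(s1_captures, s2_captures):
    """Pair each Sentinel-2 capture with the closest-in-time
    Sentinel-1 capture, keeping only pairs within +/- 3 days."""
    pairs = []
    for s2 in s2_captures:
        closest = min(s1_captures, key=lambda s1: abs(s1.timestamp - s2.timestamp))
        if abs(closest.timestamp - s2.timestamp) <= MAX_OFFSET:
            pairs.append((closest, s2))
    return pairs
```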

@weiji14
Contributor

weiji14 commented Nov 8, 2023

The Sentinel-1 and Sentinel-2 image pair will have different timestamps, so I suppose we'll encode the median timestamp for the temporal embedding? Or do we encode two timestamps (one for each satellite sensor)?

@yellowcap
Member

We were also discussing whether to use absolute time or relative time within a year. Absolute time would give the model prior knowledge of the weather in each year or season. A few considerations:

  • It is not clear what will happen to this embedding moving forward in time, when the model gets timestamps from a future that it has not been trained with.
  • We can test a cyclic embedding, similar to the spherical harmonics in the spatial dimension (see the sketch after this list).
  • Compare a single-number timestamp vs. a year/month/day pattern.
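A minimal sketch of what such a cyclic encoding could look like, mapping day-of-year onto a sine/cosine pair so that late December lands next to early January. This is purely illustrative, not the chosen implementation:

```python
import math
from datetime import datetime

def cyclic_time_encoding(ts: datetime) -> tuple[float, float]:
    """Encode day-of-year on the unit circle so Dec 31 and Jan 1
    end up close together, unlike a raw day-of-year scalar."""
    day_of_year = ts.timetuple().tm_yday
    angle = 2 * math.pi * day_of_year / 365.25
    return math.sin(angle), math.cos(angle)

# Dec 31 and Jan 1 produce nearly identical encodings:
print(cyclic_time_encoding(datetime(2023, 12, 31)))
print(cyclic_time_encoding(datetime(2024, 1, 1)))
```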

@yellowcap mentioned this issue Nov 13, 2023
@weiji14 added this to the v0 Release milestone Nov 14, 2023
@weiji14
Contributor

weiji14 commented Nov 21, 2023

@srmsoumya is starting some work on this at #47. We discussed a little bit yesterday about fixed vs learnable embeddings, but I think we may have confused the terminology a bit. According to https://stats.stackexchange.com/questions/470804/what-is-the-difference-between-position-embedding-vs-positional-encoding-in-bert:

  • Positional encodings are fixed. E.g., we could have a static function that encodes a datetime such that January is close to December.
  • Positional embeddings are learnable. These are representations that have been learned from the input data.

So, just to be clear, do we want to use a temporal positional encoding that is fixed, and/or a temporal embedding that is learned?
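To make that terminology concrete, here is a hedged PyTorch sketch of the two options side by side; the embedding width and month-level granularity are arbitrary choices for illustration, not the model's actual design:

```python
import math
import torch
import torch.nn as nn

DIM = 768  # embedding width, arbitrary for illustration

def fixed_month_encoding(month: int, dim: int = DIM) -> torch.Tensor:
    """Fixed (non-learnable) sinusoidal encoding of the month.
    The cyclic phase puts January next to December."""
    angle = 2 * math.pi * (month - 1) / 12
    freqs = torch.arange(dim // 2, dtype=torch.float32) + 1
    return torch.cat([torch.sin(freqs * angle), torch.cos(freqs * angle)])

# Learnable alternative: one trainable vector per month,
# updated by backprop like any other model weight.
learned_month_embedding = nn.Embedding(num_embeddings=12, embedding_dim=DIM)
january_vec = learned_month_embedding(torch.tensor([0]))
```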

@danhammer
Collaborator

danhammer commented Dec 1, 2023

How is time handled in the first release of the embeddings, mentioned in Clay-foundation/office#51 (Scope 1 for webapp)? Are the embeddings generated for a mosaic for a specific time range?

If yes, I'd be interested in exploring the easiest integration of time at first -- just generating embeddings for the same area at two different time periods (say, 2021 and 2022). We'd just append the two and run the vector search over the appended embedding, length 1,536 = 768*2 for now. I know this sounds overly simple, but it's a pretty decent way to find recent, illegal mining activity, for example. And we wouldn't have to encode time yet within the model. That is, we can start to build out the UI/UX interactions based on time in parallel with a more robust examination of time.
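A rough numpy sketch of that appended-embedding search; the array shapes and the cosine-similarity ranking are assumptions for illustration, not the production pipeline:

```python
import numpy as np

def append_and_search(emb_2021, emb_2022, query, top_k=5):
    """Concatenate per-year embeddings (N, 768) -> (N, 1536),
    then rank locations by cosine similarity to a query vector."""
    stacked = np.concatenate([emb_2021, emb_2022], axis=1)  # (N, 1536)
    stacked /= np.linalg.norm(stacked, axis=1, keepdims=True)
    query = query / np.linalg.norm(query)
    scores = stacked @ query  # cosine similarity per location
    return np.argsort(scores)[::-1][:top_k]
```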

@brunosan
Member

brunosan commented Dec 1, 2023

This issue was unassigned, so kicking it to @yellowcap to delegate if needed.

If I understand correctly, the embeddings of the given file are very self-similar, so the image appears semantically flat, as I would expect (minimum cosine similarity is 0.999).

[image: cosine similarity result]
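For context, a minimal sketch of that self-similarity check over a set of patch embeddings (the array name and layout are assumed):

```python
import numpy as np

def min_pairwise_cosine(embeddings: np.ndarray) -> float:
    """Smallest pairwise cosine similarity among patch embeddings;
    a value near 1.0 means the scene is semantically flat."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    return float(sims.min())
```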

@yellowcap
Member

> generating embeddings for the same area at two different time periods (say, 2021 and 2022)

That is definitely feasible. The current run of embeddings is on the training data, in which we only have one date per location. But as discussed previously, if we agree on an AOI and the dates, we can generate the imagery for those and run inference to produce the embeddings.

@brunosan
Member

brunosan commented Dec 7, 2023

Is there a low-effort lift where we can add ~3 timestamps per location?
That way the model learns that each location can change semantics to some degree (crop stage, floods, small clouds, ...).

@yellowcap
Member

The current architecture receives the date; we will increase date diversity (multiple dates for the same location) in the next pipeline run.
