Commit cf0f708

ZAP 3: Remote image thumbnailing proposal.
1 parent 71020fd commit cf0f708

1 file changed: +152 -0 lines changed

zaps/0003-remote-thumbnailing.md

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
# Remote image thumbnailing

## Types of remote images

### Inline image URLs

These are `![Alt text](https://example.com/image.png)` images, which are
explicit[^1]. At message send time, we should render these as spinners in our
initial `rendered_content`[^2], and start a background job to fetch them. If
the background fetch returns an image type, we should resize the image into our
set of supported sizes/formats, and cache these in-memory[^3]. The duration of
this cache should be based on the caching headers of the remote image.

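As a rough illustration of basing the cache duration on the remote caching headers, the sketch below derives a lifetime from `Cache-Control`. The function name, the three bounds, and the choice to treat `no-store`/`no-cache` as "cache for the minimum time" are all assumptions made for this example; the proposal does not specify them, and `Expires` handling is omitted for brevity.

```python
from typing import Mapping

# Hypothetical bounds; the proposal does not specify any values.
MIN_TTL_SECONDS = 60
MAX_TTL_SECONDS = 7 * 24 * 60 * 60
DEFAULT_TTL_SECONDS = 60 * 60


def cache_ttl_seconds(headers: Mapping[str, str]) -> int:
    """Derive a cache lifetime from the remote response's Cache-Control header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    directives = [
        part.strip().lower() for part in lowered.get("cache-control", "").split(",")
    ]
    # Assumption: we still cache briefly, so that a burst of clients does not
    # trigger a burst of remote fetches.
    if "no-store" in directives or "no-cache" in directives:
        return MIN_TTL_SECONDS
    for directive in directives:
        name, _, value = directive.partition("=")
        if name.strip() == "max-age":
            try:
                max_age = int(value.strip().strip('"'))
            except ValueError:
                break  # Malformed max-age: fall back to the default.
            # Clamp the remote server's wishes into our own bounds.
            return max(MIN_TTL_SECONDS, min(max_age, MAX_TTL_SECONDS))
    return DEFAULT_TTL_SECONDS
```

Clamping keeps a misconfigured or hostile remote server from forcing either constant re-fetching or effectively permanent caching.
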
The message should then be silently updated to point to a signed `/thumbnail`
URL with a "reasonable" size/format; the signature covers the remote URL, but
not the size or format, since clients may rewrite those to their preferred
values.

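One way to sign only the URL is an HMAC over the `url` value alone, as in the minimal sketch below. The `signature` parameter name, the `SECRET_KEY` constant, and the helper names are hypothetical; the proposal does not define the signing scheme or the query-string layout.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical server-side secret; a real deployment would load this from settings.
SECRET_KEY = b"replace-with-a-real-server-secret"


def sign_url(remote_url: str) -> str:
    """Sign only the remote URL, deliberately excluding size and format."""
    return hmac.new(SECRET_KEY, remote_url.encode("utf-8"), hashlib.sha256).hexdigest()


def build_thumbnail_url(remote_url: str, size: str = "thumbnail") -> str:
    """Build a /thumbnail URL carrying the remote URL, a size, and the signature."""
    query = urlencode(
        {"url": remote_url, "size": size, "signature": sign_url(remote_url)}
    )
    return f"/thumbnail?{query}"


def is_valid_request(thumbnail_url: str) -> bool:
    """Check the signature against the url parameter, ignoring size."""
    params = parse_qs(urlsplit(thumbnail_url).query)
    remote_url = params.get("url", [""])[0]
    signature = params.get("signature", [""])[0]
    # Constant-time comparison, on bytes so non-ASCII input cannot raise.
    return hmac.compare_digest(
        sign_url(remote_url).encode("utf-8"), signature.encode("utf-8")
    )
```

Because `size` is not part of the signed material, a client can change `size=thumbnail` to another value and the request still validates, while any change to `url` invalidates it.
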
On the server side, the `/thumbnail` endpoint validates the signature, and
returns 400 (possibly with the content of a static failure image, varying on
the `Accept` header) if the signature is invalid. It then checks that the
requested size/format is in its supported set, and "rounds" to the closest
match if it is not (based on the `Accept` header for format). If the thumbnail
size/format is in cache, it serves it.

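The "rounding" step could look something like the following sketch. The supported sizes and formats listed here are invented for the example, as is the distance metric; the sketch also ignores `q` values and wildcards in `Accept`, which a real implementation would likely want to honor.

```python
from typing import List, Tuple

# Hypothetical supported set; the proposal does not enumerate sizes or formats.
SUPPORTED_SIZES: List[Tuple[int, int]] = [(100, 75), (300, 200), (840, 560)]
# Preference order; the last entry is the fallback when nothing matches.
SUPPORTED_FORMATS: List[str] = ["image/webp", "image/jpeg"]


def round_size(width: int, height: int) -> Tuple[int, int]:
    """Return the supported size closest to the requested one."""
    return min(
        SUPPORTED_SIZES,
        key=lambda size: abs(size[0] - width) + abs(size[1] - height),
    )


def choose_format(accept_header: str) -> str:
    """Pick the first supported format explicitly listed in the Accept header."""
    accepted = {
        part.split(";")[0].strip().lower() for part in accept_header.split(",")
    }
    for image_format in SUPPORTED_FORMATS:
        if image_format in accepted:
            return image_format
    # Assumption: fall back to the most widely supported format.
    return SUPPORTED_FORMATS[-1]
```

For instance, under these assumed values a request for 320x210 would be served from the 300x200 rendition, and a client that only sends `*/*` would receive the fallback format.
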
If the requested image is not in our local cache, the server must re-fetch it.
This happens synchronously, after which it must resize the image, store it in
cache, and then provide the appropriate resized image to the client. Any
non-expired[^4] size/format combinations should be re-rendered and inserted
into the cache at the same time, for three reasons: the network fetch time is
likely significant compared to the resize time; we should endeavor to provide a
consistent preview if the image is mutating over time; and one access may
herald other accesses from other clients.

If either the initial fetch or a subsequent re-fetch times out, returns a
document with a non-image `Content-Type`, or returns content that cannot be
parsed as its purported image type, then we cache and return stock "invalid
image" content. We may wish to set an upper time bound on this cached failure
(or multiple different bounds, based on the failure type), to handle
intermittent failures.

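To make the "different bounds, based on the failure type" idea concrete, here is a small sketch that classifies a fetch outcome and maps each failure to a cache lifetime. The enum, the function, and every number are hypothetical; the only part taken from the text above is the list of three failure conditions.

```python
from enum import Enum
from typing import Optional


class FetchFailure(Enum):
    TIMEOUT = "timeout"
    NOT_AN_IMAGE = "not_an_image"
    UNPARSEABLE = "unparseable"


# Hypothetical bounds: timeouts are most likely to be intermittent, so the
# "invalid image" placeholder is cached for a shorter time in that case.
FAILURE_TTL_SECONDS = {
    FetchFailure.TIMEOUT: 60,
    FetchFailure.NOT_AN_IMAGE: 60 * 60,
    FetchFailure.UNPARSEABLE: 60 * 60,
}


def classify_fetch(
    timed_out: bool, content_type: Optional[str], parsed_ok: bool
) -> Optional[FetchFailure]:
    """Return the failure type for a fetch, or None if the fetch succeeded."""
    if timed_out:
        return FetchFailure.TIMEOUT
    # Drop parameters such as "; charset=utf-8" before checking the media type.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if not media_type.startswith("image/"):
        return FetchFailure.NOT_AN_IMAGE
    if not parsed_ok:
        return FetchFailure.UNPARSEABLE
    return None
```

A caller would look up `FAILURE_TTL_SECONDS[failure]` when storing the placeholder, so that a transient timeout is retried much sooner than a URL that consistently serves HTML.
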
The content requests must be made through Smokescreen, to ensure that they
cannot be redirected (via DNS or HTTP) into private IP space.

[^1]:
    We render these even if image previews are disabled, presumably? Since
    that setting is mostly about not fetching random network resources, not
    about preventing image uploads from rendering inline?

[^2]:
    How do we know how much space the spinner should take up? We do not know
    anything about the height of the returned image yet, and yet need to
    choose a height that minimizes or avoids vertical movement.

[^3]: In memcached? Or on disk, but then we need to do manual flushing of it?
[^4]: Possibly _all_ size/format combinations, for maximum consistency?

### Inline URLs

These are messages of the form:

```markdown
Look at my picture:

https://example.com/image.png
```

Or:

```markdown
[Look at my picture](https://example.com/image.png)
```

That is, a link (implied or explicit) with a URL which ends in an image
extension, assuming that image previews are enabled on the server and realm.

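A check for "a URL which ends in an image extension" might be sketched as follows. The set of extensions and the function name are assumptions for illustration; the proposal does not list which extensions count.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical list of extensions treated as "probably an image".
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp"}


def looks_like_image_url(url: str) -> bool:
    """Return True if the URL's path ends in a known image extension."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Only the path is inspected, so query strings and fragments are ignored.
    extension = posixpath.splitext(parts.path)[1].lower()
    return extension in IMAGE_EXTENSIONS
```

This is only a hint: the server still has to fetch the URL and inspect the response before treating it as an image.
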
The extension provides a light implication that the URL is an image, which we
should inline. The above plan for inline image URLs holds, with the exception
that _nothing_ is inlined upon first message send, and in the event of failure
or non-image content, the message is not updated in any way[^5].

The effect of this is that non-explicit image URLs are never retried if their
initial fetch fails, even if the failure was intermittent.

[^5]:
    This means that these messages will grow taller after sending, which is a
    bad thing? We could also render them as spinners, and update to plain text
    if the request fails, which means users will be less likely to have vertical
    movement, but will see less information about the image until thumbnailing
    completes.

### Inline bare URLs

These are messages of the form:

```markdown
https://example.com/image.png
```

That is, a body consisting entirely of a URL which ends in an image extension,
assuming that image previews are enabled on the server and realm.

These are treated like "Inline URLs", above, with the additional change that
the entire content of the message is silently updated with the thumbnailed
image, should it turn out to actually be an image.

### Opengraph images

These are messages of the form:

```markdown
https://example.com/
```

...where `example.com` has `og:...` tags which we can preview, assuming that
`INLINE_URL_EMBED_PREVIEW` is enabled and the realm has URL previews enabled.

Any images from this preview will be treated as "Inline image URLs", above.

## Effects on existing URL endpoints

### `/thumbnail?url=...&size=...`

Existing `/thumbnail` URLs are of the form:

    /thumbnail?url=user_uploads%2F2%2F85%2FXoqF0K7XEOLVGylgdpof80RB%2Fimg.png&size=full
    /thumbnail?url=user_uploads%2F2%2F85%2FXoqF0K7XEOLVGylgdpof80RB%2Fimg.png&size=thumbnail

    /thumbnail?url=https%3A%2F%2Fwww.example.com%2Fimages%2Ffilename.png&size=full
    /thumbnail?url=https%3A%2F%2Fwww.example.com%2Fimages%2Ffilename.png&size=thumbnail

These were only generated by `THUMBNAIL_IMAGES = True` servers; they may appear
in historical messages even if that setting is not currently enabled.

The former two currently serve the full-size `/user_uploads/` equivalents,
regardless of `size`. The latter two accepted unauthenticated, unsigned
requests, and were not rate-limited; for security reasons, they currently
always return 401.

The endpoint will begin supporting _signed_ URL requests. These sign the `url`
parameter, and allow the caller to adjust the size and format of the response.
See "Inline image URLs", above.

### `/external_content/...`

Since thumbnailing is performed on all remote images, Camo is no longer needed
for images in new messages; all images are served either through
`/user_uploads/...` or `/thumbnail?...`

However, until videos are rendered as their server-side-generated thumbnails,
videos must continue to go through Camo; previous messages also still encode
`/external_content/` URLs, which should still be served.

So for backwards compatibility, the Camo server should be preserved for now, and
continue to serve `/external_content/` URLs.
