| 1 | +# Remote image thumbnailing |
| 2 | + |
| 3 | +## Types of remote images |
| 4 | + |
| 5 | +### Inline image URLs |
| 6 | + |
| 7 | +These are `` images, which are |
| 8 | +explicit[^1]. At message send time, we should render these as spinners in our |
| 9 | +initial `rendered_content`[^2], and insert a background process to fetch them. If |
| 10 | +the background fetch returns an image type, we should resize the image into our |
| 11 | +set of supported sizes/types, and cache these in-memory[^3]. The duration of |
| 12 | +this cache should be based on the caching headers of the remote image. |
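One way to derive the cache duration from the remote response is sketched below. This is only an illustration under stated assumptions: the function name `cache_ttl`, the default of one hour, and the one-week cap are all hypothetical, and only the `Cache-Control` header is consulted (`Expires` and `Age` are ignored).

```python
import re
from typing import Mapping

# Hypothetical policy values; the design above does not specify them.
DEFAULT_TTL_SECONDS = 60 * 60
MAX_TTL_SECONDS = 7 * 24 * 60 * 60


def cache_ttl(headers: Mapping[str, str]) -> int:
    """Return how long (in seconds) to cache a fetched remote image.

    Assumes `headers` uses the canonical "Cache-Control" key.
    """
    cache_control = headers.get("Cache-Control", "").lower()
    directives = [d.strip() for d in cache_control.split(",")]
    # Treat "do not reuse without revalidation" directives as uncacheable.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            # Cap the remote server's value so one response cannot pin
            # an entry in the cache indefinitely.
            return min(int(match.group(1)), MAX_TTL_SECONDS)
    return DEFAULT_TTL_SECONDS
```

For example, `cache_ttl({"Cache-Control": "public, max-age=600"})` yields 600 seconds, while a response with no caching headers falls back to the assumed default.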
| 13 | + |
| 14 | +The message should then be silently updated to point to a signed `/thumbnail` |
| 15 | +URL with a "reasonable" size/format; the signing covers the URL, but not the |
| 16 | +size, since clients may rewrite it to their preferred size. |
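The property that the signature covers the URL but not the size can be sketched with an HMAC. This is a minimal illustration, not the actual implementation: the `sig` query parameter, the helper names, and the hard-coded key are assumptions made for the example.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical key; a real server would load its own secret.
SECRET_KEY = b"example-secret-not-for-production"


def sign_url(remote_url: str) -> str:
    """Hex HMAC-SHA256 over the remote URL only (size is not covered)."""
    return hmac.new(SECRET_KEY, remote_url.encode("utf-8"), hashlib.sha256).hexdigest()


def build_thumbnail_path(remote_url: str, size: str) -> str:
    """Build a /thumbnail path; `sig` is an assumed parameter name."""
    query = urlencode({"url": remote_url, "size": size, "sig": sign_url(remote_url)})
    return f"/thumbnail?{query}"


def is_valid_request(path: str) -> bool:
    """Check the signature against the `url` parameter, ignoring `size`."""
    params = parse_qs(urlsplit(path).query)
    url = params.get("url", [""])[0]
    sig = params.get("sig", [""])[0]
    # Constant-time comparison, done on bytes so arbitrary input is safe.
    return hmac.compare_digest(sig.encode("utf-8"), sign_url(url).encode("utf-8"))
```

Because `size` is outside the signed data, a client can rewrite it to its preferred value without invalidating the request, whereas any change to `url` causes validation to fail.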
| 17 | + |
| 18 | +On the server side, the `/thumbnail` endpoint validates the signature, and |
| 19 | +returns 400 (possibly with the content of a static failure image, varying on |
| 20 | +the `Accept` header) if the signature is invalid. It then checks that the |
| 21 | +requested size/format is in its supported set, and "rounds" to the closest |
| 22 | +match if it is not (based on the `Accept` header for format). If the thumbnail |
| 23 | +size/format is in cache, it serves it. |
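The "rounding" step might look like the following sketch. The supported sizes and formats listed here are hypothetical placeholders, the distance metric is one arbitrary choice among several, and the `Accept` parsing deliberately ignores q-values for brevity.

```python
# Hypothetical supported set; the design above does not enumerate it.
SUPPORTED_SIZES = [(100, 75), (300, 200), (840, 560)]
# In server preference order; the last entry doubles as the fallback.
SUPPORTED_FORMATS = ["image/webp", "image/jpeg"]


def round_size(width: int, height: int) -> tuple:
    """Return the supported size closest to the requested one."""
    return min(
        SUPPORTED_SIZES,
        key=lambda size: abs(size[0] - width) + abs(size[1] - height),
    )


def choose_format(accept_header: str) -> str:
    """Pick a supported format the client accepts (q-values are ignored)."""
    accepted = {part.split(";")[0].strip().lower() for part in accept_header.split(",")}
    for fmt in SUPPORTED_FORMATS:
        if fmt in accepted or "image/*" in accepted or "*/*" in accepted:
            return fmt
    # Nothing matched: fall back to the most widely supported format.
    return SUPPORTED_FORMATS[-1]
```

A request for 320x210 would be served the 300x200 rendition under these assumed sizes, and a client that only lists `image/jpeg` would receive JPEG rather than WebP.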
| 24 | + |
| 25 | +If the requested image is not in our local cache, the server must re-fetch it. |
| 26 | +This happens synchronously, after which it must resize and store it in cache, |
| 27 | +and then provide the appropriate resized image to the client. Any |
| 28 | +non-expired[^4] size/format combinations should be re-rendered and inserted |
| 29 | +into the cache at the same time, for three reasons: the network fetch time is |
| 30 | +likely significant compared to the resize time; we should endeavor to provide |
| 31 | +a consistent preview if the image is mutating over time; and one access may |
| 32 | +herald other accesses from other clients. |
| 33 | + |
| 34 | +In the event that either the initial fetch or a subsequent re-fetch times out, |
| 35 | +returns a document with a non-image `Content-Type`, or cannot be parsed as its |
| 36 | +purported image type, we cache and return stock "invalid image" content. We |
| 37 | +may wish to set an upper time bound on this cache entry (or multiple different |
| 38 | +bounds, based on the failure type), to handle intermittent failures. |
| 39 | + |
| 40 | +The content requests must be made through Smokescreen, to ensure that they |
| 41 | +cannot be redirected (via DNS or HTTP) into private IP space. |
| 42 | + |
| 43 | +[^1]: |
| 44 | + We render these even if image previews are disabled, presumably? Since |
| 45 | + that's mostly about not fetching random network resources, not about |
| 46 | + preventing image uploads from rendering inline? |
| 47 | + |
| 48 | +[^2]: |
| 49 | + How do we know how much space the spinner should take up? We do not know |
| 50 | + anything about the height of the returned image yet, and yet need to |
| 51 | + choose a height that minimizes or avoids vertical movement. |
| 52 | + |
| 53 | +[^3]: In memcached? Or on disk, but then we need to do manual flushing of it? |
| 54 | +[^4]: Possibly _all_ size/format combinations, for maximum consistency? |
| 55 | + |
| 56 | +### Inline URLs |
| 57 | + |
| 58 | +These are messages of the form: |
| 59 | + |
| 60 | +```markdown |
| 61 | +Look at my picture: |
| 62 | + |
| 63 | +https://example.com/image.png |
| 64 | +``` |
| 65 | + |
| 66 | +Or: |
| 67 | + |
| 68 | +```markdown |
| 69 | +[Look at my picture](https://example.com/image.png) |
| 70 | +``` |
| 71 | + |
| 72 | +That is, a link (implied or explicit) with a URL which ends in an image |
| 73 | +extension, assuming that image previews are enabled on the server and realm. |
| 74 | + |
| 75 | +The extension provides a light implication that the URL is an image, which we |
| 76 | +should inline. The above plan for inline image URLs holds, with the exception |
| 77 | +that _nothing_ is inlined upon first message send, and in the event of failure |
| 78 | +or non-image content, the message is not updated in any way[^5]. |
| 79 | + |
| 80 | +The effect of this is that non-explicit image URLs are never retried if their |
| 81 | +initial fetch fails, even if the failure was intermittent. |
| 82 | + |
| 83 | +[^5]: |
| 84 | + This means that these messages will grow taller after sending, which is a |
| 85 | + bad thing? We could also render them as spinners, and update to plain text |
| 86 | + if the request fails, which means users will be less likely to have vertical |
| 87 | + movement, but will see less information about the image until thumbnailing |
| 88 | + completes. |
| 89 | + |
| 90 | +### Inline bare URLs |
| 91 | + |
| 92 | +These are messages of the form: |
| 93 | + |
| 94 | +```markdown |
| 95 | +https://example.com/image.png |
| 96 | +``` |
| 97 | + |
| 98 | +That is, a body entirely of a URL which ends in an image extension, assuming |
| 99 | +that image previews are enabled on the server and realm. |
| 100 | + |
| 101 | +These are treated as "Inline URLs", above, with the additional change that the |
| 102 | +entire content of the message is silently updated with the thumbnailed image, |
| 103 | +should it turn out to actually be an image. |
| 104 | + |
| 105 | +### Opengraph images |
| 106 | + |
| 107 | +These are messages of the form: |
| 108 | + |
| 109 | +```markdown |
| 110 | +https://example.com/ |
| 111 | +``` |
| 112 | + |
| 113 | +...where `example.com` has `og:...` tags which we can preview, assuming that |
| 114 | +`INLINE_URL_EMBED_PREVIEW` is enabled and the realm has URL previews enabled. |
| 115 | + |
| 116 | +Any images from this preview will be treated as "Inline image URLs", above. |
| 117 | + |
| 118 | +## Effects on existing URL endpoints |
| 119 | + |
| 120 | +### `/thumbnail?url=...&size=...` |
| 121 | + |
| 122 | +Existing `/thumbnail` URLs are of the form: |
| 123 | + |
| 124 | + /thumbnail?url=user_uploads%2F2%2F85%2FXoqF0K7XEOLVGylgdpof80RB%2Fimg.png&size=full |
| 125 | + /thumbnail?url=user_uploads%2F2%2F85%2FXoqF0K7XEOLVGylgdpof80RB%2Fimg.png&size=thumbnail |
| 126 | + |
| 127 | + /thumbnail?url=https%3A%2F%2Fwww.example.com%2Fimages%2Ffilename.png&size=full |
| 128 | + /thumbnail?url=https%3A%2F%2Fwww.example.com%2Fimages%2Ffilename.png&size=thumbnail |
| 129 | + |
| 130 | +These were only generated by `THUMBNAIL_IMAGES = True` servers; they may appear |
| 131 | +in historical messages even if it is not currently set. |
| 132 | + |
| 133 | +The former two currently serve the full-size `/user_uploads/` equivalents, |
| 134 | +regardless of `size`. The latter two accepted unauthenticated unsigned requests, |
| 135 | +and were not rate-limited; for security reasons, they currently always return 401. |
| 136 | + |
| 137 | +The endpoint will begin supporting _signed_ URL requests. These sign the `url` |
| 138 | +parameter, and allow the caller to adjust the size and format of the response. |
| 139 | +See "Inline image URLs", above. |
| 140 | + |
| 141 | +### `/external_content/...` |
| 142 | + |
| 143 | +Since thumbnailing is performed on all remote images, Camo is no longer needed |
| 144 | +for images in new messages; all images are served either through |
| 145 | +`/user_uploads/...` or `/thumbnail?...`. |
| 146 | + |
| 147 | +However, until videos are rendered as their server-side-generated thumbnails, |
| 148 | +videos must continue to go through Camo; previous messages also still encode |
| 149 | +`/external_content/` URLs, which should still be served. |
| 150 | + |
| 151 | +So for backwards-compatibility, the Camo server should be preserved for now, and |
| 152 | +continue to serve `/external_content/` URLs. |