Oh, The Imagery!

Pre-note: This is an old article from when I headed Isoscribe. I’m reposting it here, because I thought it was somewhat cool.

We’ve been working with image uploading

Quite a lot, actually. It's been one of the major pain points in the in-development versions of Isoscribe.

So, let me walk you through what our process has been so far.

Why we convert images

When you upload an image to our site, it is likely in a format that is expensive to store. Sometimes the image is very large, and since we pay for storage, a lot of very large images would cost us a lot of money.

Sometimes, images are in a poorly supported format. We like to keep all of our images as both WebP and PNG. WebP gives us a format for cutting-edge browsers, so they can reduce bandwidth usage. PNG gives us a format that is very compatible - PNGs have been around seemingly since the dawn of time, and practically every browser can display them.
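As a rough illustration of how a server could choose between the two stored formats, here is a minimal sketch that inspects the browser's `Accept` header. This is an assumption for illustration only: the post does not say how the right format gets picked, and the function name `pick_image_format` is made up.

```python
def pick_image_format(accept_header: str) -> str:
    """Return "webp" if the Accept header allows image/webp, else "png"."""
    # Each comma-separated entry may carry parameters such as ";q=0.8".
    for entry in accept_header.split(","):
        parts = [p.strip() for p in entry.split(";")]
        if parts[0].lower() != "image/webp":
            continue
        # A missing q value means full preference; q=0 means "not acceptable".
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            return "webp"
    # PNG is the compatible fallback for everything else.
    return "png"
```

A browser that sends `image/avif,image/webp,*/*;q=0.8` would get WebP, while one that sends only `image/png,*/*` would get PNG.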

But converting the images means we have to run a conversion step.

An explanation of our architecture

We use Amazon S3 for our static files, and that lives behind the Cloudflare content delivery network (CDN). All of our servers are stateless, meaning that they never store files locally, and you can connect to any server and the website will still work.

This creates a few issues

Amazon S3 behind the Cloudflare CDN makes our static files incredibly fast. It is also very cheap to run, because we don't need any servers of our own to support it.

So what’s the drawback?

We can't upload a file directly to one of our servers and have it served to users. There has to be an intermediary step to place it on S3.

Our first attempt

We started with a direct file upload to our API, which stored the image in RAM, then uploaded it to S3. This worked to start with - except, of course, it used tons of RAM.

And even setting the RAM aside, we had a bigger problem - our requests kept timing out.

The image used on our distributed cache post started out 14,000 pixels wide. When I uploaded this file, it took so long to convert the image and then upload it to S3 that the request would time out and return an error.
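To get a feel for why an image that wide is heavy to process, here is a back-of-the-envelope sketch of how much memory the decoded pixels take. Only the 14,000 pixel width comes from the story above; the 8,000 pixel height and the 4 bytes per pixel (RGBA) are assumptions for illustration.

```python
def decoded_size_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Memory needed to hold the uncompressed pixels of an image."""
    return width * height * bytes_per_pixel


# Assumed dimensions: 14,000 px wide (from the post), 8,000 px tall (made up).
size = decoded_size_bytes(14_000, 8_000)
print(f"{size:,} bytes")  # 448,000,000 bytes
```

That is the size of the raw pixel data alone, no matter how small the compressed file on disk is.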

Our second attempt

After this, I tried out some other “fun” things. I rewrote the image uploading code, trying to multithread it properly, to no avail. So what did I do next?

Lambda

Essentially, I wrote a big Lambda function (AWS Lambda is Amazon's serverless code product) that would accept an uploaded file, convert it, and then upload it to S3. This actually worked fairly well, and got the job done.

And then we ran into CORS issues

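Essentially, the browser treats upload.isoscribe.com and isoscribe.com as different origins, and by default it won't let a page on isoscribe.com send an upload to upload.isoscribe.com unless that server explicitly allows it. This makes sense from a security standpoint - what if somebody's malicious code was running on there? We can't let anybody break our website.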
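To make the "different site" part concrete: browsers compare the scheme, host, and port of two URLs, and a subdomain counts as a different host. The sketch below mimics that comparison; the helper names `origin` and `is_same_origin` are made up, and it simplifies what a real browser does.

```python
from urllib.parse import urlsplit

# Ports implied by the scheme when the URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def is_same_origin(url_a: str, url_b: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


print(is_same_origin("https://isoscribe.com/editor", "https://isoscribe.com/api"))  # True
print(is_same_origin("https://isoscribe.com", "https://upload.isoscribe.com"))      # False
```

Since the hosts differ, a request from the main site to the upload subdomain is cross-origin, which is where CORS comes in.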

So I gave up.

Our third attempt

After the direct upload Lambda, I tried something new. Amazon S3 has something called "presigned URLs," which let me create a URL that you can upload to. This is a remarkable solution, because it maintains security but allows you to upload directly to S3, the storage behind our CDN. So, I used a Lambda to create presigned URLs for clients.

And then we ran into CORS issues… again

Yeah, so this wasn’t my brightest moment. You can’t make a request to upload.isoscribe.com from isoscribe.com, as I mentioned before.

Attempt 3.5

So, I took all the code out of the Lambda and shoved it into our API. Now, you can query our API and ask for an upload URL, and it will happily spit one out for you, and then you upload to S3. This is how it works currently, and it works fantastically.
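For the curious, here is a minimal sketch of what the "spit out an upload URL" step could look like. It is not the actual Isoscribe code. The function name `create_upload_url`, the `uploads/` key prefix, and the five-minute expiry are assumptions. The only real API used is `generate_presigned_url`, which boto3 S3 clients provide; the client is passed in so the sketch itself does not need AWS credentials.

```python
import uuid


def create_upload_url(s3_client, bucket: str, extension: str, expires_in: int = 300) -> dict:
    """Ask S3 for a presigned PUT URL that points at a fresh, random key."""
    # A random key means clients cannot pick or overwrite object names.
    key = f"uploads/{uuid.uuid4().hex}.{extension}"
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds until the URL stops working
    )
    return {"key": key, "url": url}


# Usage with boto3 (assumed to be installed and configured):
#   import boto3
#   create_upload_url(boto3.client("s3"), "example-bucket", "png")
```

The client then sends the file to the returned URL with an HTTP PUT, and the upload goes straight to S3 without passing through the API servers.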

But what about image conversion?

It's still an issue. With presigned URLs, the upload never passes through our code, so we can't convert the image as part of the upload. So what do we do?

We use an Amazon SNS topic to trigger a Lambda that takes the uploaded images, converts them, and puts them in the right place.

In actual English: We get a notification every time a file is uploaded, and a miniature server converts the image for us when it gets the right notification.
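Here is a rough sketch of what such a notification-driven Lambda could look like. It assumes the SNS message wraps a standard S3 event notification, which is the usual shape when a bucket publishes to a topic. The names `uploaded_objects`, `convert_image`, and `handler` are made up, and the actual conversion is left as a placeholder.

```python
import json
from urllib.parse import unquote_plus


def uploaded_objects(event: dict) -> list:
    """Extract (bucket, key) pairs from an SNS event that wraps S3 notifications."""
    found = []
    for record in event.get("Records", []):
        # SNS delivers the original S3 notification as a JSON string.
        message = json.loads(record["Sns"]["Message"])
        # Messages without "Records" (for example S3 test events) are skipped.
        for s3_record in message.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded, with spaces sent as "+".
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            found.append((bucket, key))
    return found


def convert_image(bucket: str, key: str) -> None:
    """Hypothetical placeholder: download, convert to WebP and PNG, re-upload."""
    print(f"would convert s3://{bucket}/{key}")


def handler(event: dict, context) -> dict:
    """Lambda entry point: convert every object named in the notification."""
    objects = uploaded_objects(event)
    for bucket, key in objects:
        convert_image(bucket, key)
    return {"converted": len(objects)}
```

Because the conversion runs after the upload has finished, a slow conversion no longer holds a user's request open.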

Thanks for reading!

–E

This post is licensed under CC BY 4.0 by the author.