Hacker School Day 18: Cropping, Letterboxing, Aspect Ratios, Terrible GIFs

I keep saying to myself, “Today will be the last day of the GIF project”. Today was another last day, and still not the last day. I’ve been getting too excited about fixing cropping: I haven’t seen another free YouTube-to-GIF service that lets you crop before processing. Yesterday’s post about cropping ended with me remembering the original cropping plan. The problem:

You can embed a YouTube video in a webpage at any height and width you want! Woo!
YouTube will decide what size video to put in that window. (It might also letterbox the video, which is relevant later).
When a user on my website wants to crop the video for a GIF, the size and coordinates of the cropping box are bound to the size of the embedded viewport.
When my server downloads the video, the size of the video is different from the size of the viewport. If the server downloads a video in 1080p, and the request for a 200×200 pixel crop starting at (0,0) came from a viewport that’s 480 pixels wide, the crop coordinates no longer point at the area the user selected.
It’s not easy to determine the size of the source video YouTube chose to put into the player, so I can’t tell the server to download the same-sized video on the backend. If I ask the JS API for the size, it responds with “medium”.
When my sister was very young, she did a report on the state of New Jersey. The report included the size of the population. She also wrote “medium”, and it was probably as informative.

The obvious answer is to send the dimensions and coordinates of the bounding box, along with the size of the viewport. Then the server can scale the coordinates and size of the bounding box during processing by the ratio between the real size and the size the user edited from. I implemented that this morning, and it was easy and good.
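A rough sketch of that scaling in Python. This is not the project's actual code: the names `CropBox` and `scale_crop` are made up for illustration, and it assumes the viewport shows the whole video frame with no letterboxing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropBox:
    """A crop rectangle: top-left corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def scale_crop(box, viewport_size, video_size):
    """Map a crop box drawn in the viewport onto the downloaded video.

    viewport_size and video_size are (width, height) tuples.
    Assumes the viewport displays the full frame (no letterboxing).
    """
    viewport_w, viewport_h = viewport_size
    video_w, video_h = video_size
    ratio_x = video_w / viewport_w
    ratio_y = video_h / viewport_h
    return CropBox(
        x=round(box.x * ratio_x),
        y=round(box.y * ratio_y),
        width=round(box.width * ratio_x),
        height=round(box.height * ratio_y),
    )


# A 200x200 crop at (0,0) in a 480x360 viewport, applied to a 1440x1080 download.
scaled = scale_crop(CropBox(0, 0, 200, 200), (480, 360), (1440, 1080))
print(scaled)
```

Here the download is three times the viewport in each direction, so the 200×200 box becomes 600×600 on the real video.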

Not good

It stopped being good as soon as I loaded a widescreen video. Today I read that new YouTube videos typically have a 16:9 aspect ratio (widescreen). This becomes a problem for me:

In my page, I provide a viewer that is 480×360 pixels. This is a 4:3 aspect ratio, which is also the aspect ratio of the video I’ve been using to test things.
When you load a widescreen video in a viewport that’s 4:3, YouTube automatically letterboxes the video, meaning it adds black bars above and below the video.
Since my JavaScript code doesn’t know where the letterboxing is, the cropping code lets the user select an area that includes letterboxing, which means that the final crop is off. The server never gets letterboxed videos because it has no viewport to cram things into.
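To put numbers on the letterboxing, here is a small Python sketch of the geometry. It assumes the player fits the video inside the viewport, preserves its aspect ratio, and centers it; that is a guess about the player's behavior, and `letterbox_bars` is a hypothetical helper, not something from the project.

```python
def letterbox_bars(viewport_size, video_aspect):
    """Return the bar thickness (in pixels) on each side of the viewport.

    viewport_size is (width, height); video_aspect is (w, h), e.g. (16, 9).
    Assumes the video is scaled to fit and centered.
    """
    viewport_w, viewport_h = viewport_size
    aspect_w, aspect_h = video_aspect

    # Height the video takes up if it fills the viewport's full width.
    fitted_height = viewport_w * aspect_h / aspect_w
    if fitted_height <= viewport_h:
        bar = (viewport_h - fitted_height) / 2
        return {"top": bar, "bottom": bar, "left": 0.0, "right": 0.0}

    # Otherwise the video fills the height and the bars go on the sides.
    fitted_width = viewport_h * aspect_w / aspect_h
    bar = (viewport_w - fitted_width) / 2
    return {"top": 0.0, "bottom": 0.0, "left": bar, "right": bar}


bars = letterbox_bars((480, 360), (16, 9))
print(bars)
```

Under those assumptions, a 16:9 video in the 480×360 viewer occupies 480×270 pixels, leaving a 45-pixel bar at the top and another at the bottom that the crop box can wander into.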

After a lot of munging around, I ended up using the still-accessible V2 YouTube Info API, which returns a chunk of JSON about a given video. The important piece of information is the existence of a key named ‘aspectRatio’. If it exists, it has the value ‘widescreen’. Otherwise, the video has a 4:3 aspect ratio. I couldn’t find a more detailed method of determining this.

Letterboxing

I tripped myself up by using a music video in testing. The video was uploaded with letterboxing by the content creator. So when I played the video in my page, it had letterboxing from YouTube (for being in the wrong-sized player), AND had letterboxing from the actual content of the video itself. This led to me trying to figure out why YouTube has variable-sized letterboxing.

Once I realized the video came with letterboxing, I was able to change the code so it:

Asks the API for JSON about the video.
If ‘aspectRatio’ is defined, creates a viewport that is 480×270 pixels, which has a 16:9 aspect ratio and will not force letterboxing on the video.
Otherwise, uses 480×360.
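Those steps boil down to one branch. A minimal Python sketch follows, assuming the API response has already been parsed into a dict with ‘aspectRatio’ as a top-level key; the real response layout isn't shown here, so that flat shape and the `choose_viewport` name are illustrative only.

```python
WIDESCREEN_VIEWPORT = (480, 270)  # 16:9
STANDARD_VIEWPORT = (480, 360)    # 4:3


def choose_viewport(video_info):
    """Pick a viewport size from (hypothetically flattened) video metadata.

    Only the presence of the key matters: when it is there, the video
    is widescreen; when it is missing, the video is treated as 4:3.
    """
    if "aspectRatio" in video_info:
        return WIDESCREEN_VIEWPORT
    return STANDARD_VIEWPORT


print(choose_viewport({"aspectRatio": "widescreen"}))
print(choose_viewport({}))
```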

I changed the backend to crop based on ratios, and also fought avconv for the right ordering of commands. I needed to crop the video according to the scaled coordinates/size, and then scale the PNG to the actual size of the cropping rectangle. So the video I download gets cropped to the same logical area the user wanted, but I might have a gigantic PNG. The user expects to get a GIF that’s the same size as the little box they drew, so I need to scale that cropped PNG back down to the real, un-scaled bounding box dimensions.
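The ordering matters because filters in a chain run left to right: crop first (in the downloaded video's coordinates), then scale down to the size the user drew. Here is a sketch that builds such a filter string in Python. The actual avconv command isn't shown in this post, so treat this as an assumption: it relies on the `crop=width:height:x:y` and `scale=width:height` filter syntax passed via `-vf`, and `build_filter` is a made-up helper.

```python
def build_filter(scaled_crop, output_size):
    """Build a crop-then-scale filter chain string.

    scaled_crop is (x, y, width, height) in the downloaded video's pixels.
    output_size is (width, height) of the box the user drew in the viewport.
    """
    x, y, width, height = scaled_crop
    out_w, out_h = output_size
    # Crop comes first so the coordinates refer to the full-size video;
    # scale then shrinks the cropped frames to what the user expects.
    return f"crop={width}:{height}:{x}:{y},scale={out_w}:{out_h}"


video_filter = build_filter((0, 0, 600, 600), (200, 200))
print(video_filter)
```

Swapping the two filters would scale the whole frame down first and then crop with coordinates that no longer match it, which is the kind of ordering bug described above.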

Almost done

I fixed some other weird problems with Jcrop. It changes the divs when it’s applied to an element, and in my use case, calling destroy takes too much away. I fixed this, which means that you can create multiple GIFs in a row from multiple videos without the DOM getting all messed up.

It finally feels done! I could totally stop now, but here’s what I’d like to do if I had infinite time:

A couple tiny usability touchups: using the text input to specify seconds can be nicer than using the slider on very large videos
Change some wording about ‘viewports’ for the API’s benefit. In reality, with scaling, the viewport can stand in for the desired size of the output GIF
Add options for speed of output GIF
Let the user put text on the output GIF
Spin up a couple nodes to watch this thing scale out. I’ve tried to design it in a way that should work — the only requirement should be a networked filesystem of some sort to hold the GIFs. Even that might not be necessary.
Do more general JavaScript/frontend cleaning. I’m definitely not a frontend programmer and I think it shows. I’d also like to try using Bower and all the other things the cool kids use
Add support for WebSockets

I might do this tomorrow, or I might run as fast and as hard as I can away from JavaScript things forever. I’ll see tomorrow.

Published

October 29, 2014

Stan Schwertly in Hacker School | October 29, 2014