It’s already week 3 of Hacker School! You can see more information about week one here, and you can read last week’s posts here:
- Hacker School Day 6 – Weekend Catchup/New Plan
- Hacker School Day 7 – Recursion
- Hacker School Day 8 – YouTube to Gif in Go, Apache Spark
- Hacker School Day 9 – Redis, Go, Config, Bash Golf, Presentations
- Hacker School Day 10 – syslog, youtube-dl, webm
I spent this weekend looking for a place to live in November. I haven’t found anything yet, but I have a promising lead I hope will work out. I did look a little into content negotiation. I have the Youtube-Gif-Go API mostly working, but I wanted the same resource to come back as JSON when a client requests JSON, and as a GIF when the client asks for one and the GIF is available. I went down a long rabbit hole with this one. You can skip the next section if you don’t want to read about content negotiation in Go.
First I thought I should try to leverage mux, the package I’m using for routes in my API. For example, right now my route looks like:
The mux documentation shows that you can chain together specifiers for your route, and includes an option for matching against headers. So in a magical world, you could say:
If you’re already familiar with the Accept header, you’ll notice the problem: Accept headers let the HTTP client specify which types of content it wants, along with a “weight” for each type (also known as a quality factor, and set with the parameter ‘q’). This lets the HTTP client say “I prefer GIFs, but if you can’t give me that, I’ll take a JPG, and as a last-ditch effort, I’ll take whatever you can give me”. In short, you need to parse these strings, because an exact match on the header value isn’t enough: a client might not strictly say “Give me JSON”, but rather “I prefer JSON, but you can hand out XML too if you have to”. Checking out the issues on the mux repository shows someone with the same hopes dashed:
So then I went hunting for an Accept header parser in core and it wasn’t there. But I did find a mailing list thread on Golang-Nuts where someone asked a similar question, and another person provided a library. They were encouraged to submit a patch to get it into core. Excited, I went to check if I could find it in core. Instead, I found the followup thread:
So much for that. I spent a long time fighting SSL certificates trying to install the person’s Accept header parsing package, eventually installed it by hand, and then didn’t use it. In the meantime I switched all responses over to JSON.
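For anyone curious what the q-value handling actually involves, here is a minimal sketch using only the standard library. It is not the package from the mailing list thread, and the names `parseAccept`, `matches`, and `negotiate` are made up for illustration. It is also deliberately simplified: it ignores parameters other than `q`, does not rank specific types above wildcards at equal weight, and does not let a `q=0` entry veto a type that a wildcard would otherwise allow.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// acceptEntry is one media range from an Accept header with its weight.
type acceptEntry struct {
	MediaType string
	Q         float64
}

// parseAccept splits an Accept header into media ranges, sorted by
// descending q value. Entries with equal weight keep their header order.
func parseAccept(header string) []acceptEntry {
	var entries []acceptEntry
	for _, part := range strings.Split(header, ",") {
		fields := strings.Split(part, ";")
		mediaType := strings.ToLower(strings.TrimSpace(fields[0]))
		if mediaType == "" {
			continue
		}
		q := 1.0 // default weight when no q parameter is present
		for _, param := range fields[1:] {
			kv := strings.SplitN(strings.TrimSpace(param), "=", 2)
			if len(kv) != 2 || strings.ToLower(strings.TrimSpace(kv[0])) != "q" {
				continue
			}
			if parsed, err := strconv.ParseFloat(strings.TrimSpace(kv[1]), 64); err == nil {
				q = parsed
			}
		}
		entries = append(entries, acceptEntry{MediaType: mediaType, Q: q})
	}
	sort.SliceStable(entries, func(i, j int) bool { return entries[i].Q > entries[j].Q })
	return entries
}

// matches reports whether a media range such as "image/*" covers an offer.
func matches(pattern, offer string) bool {
	if pattern == "*/*" || pattern == offer {
		return true
	}
	if strings.HasSuffix(pattern, "/*") {
		return strings.HasPrefix(offer, strings.TrimSuffix(pattern, "*"))
	}
	return false
}

// negotiate returns the offer the client prefers most, or "" if none fits.
// A missing Accept header is treated as "anything goes".
func negotiate(header string, offers []string) string {
	if strings.TrimSpace(header) == "" {
		header = "*/*"
	}
	for _, entry := range parseAccept(header) {
		if entry.Q <= 0 {
			continue // q=0 means "not acceptable"
		}
		for _, offer := range offers {
			if matches(entry.MediaType, offer) {
				return offer
			}
		}
	}
	return ""
}

func main() {
	offers := []string{"application/json", "image/gif"}
	fmt.Println(negotiate("image/gif, image/jpeg;q=0.8, */*;q=0.1", offers))
	fmt.Println(negotiate("application/json;q=0.9, image/gif;q=0.5", offers))
}
```

A handler could call something like `negotiate(r.Header.Get("Accept"), offers)` and answer with 406 Not Acceptable when it gets back an empty string.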
I spent today working on polishing more of the API before moving to the frontend. I’ve found it interesting that my commits are much more terrible in Go. I think it’s partly because I’m hacking everything together, and partly because I’m not as familiar with it as I am with Perl. When I commit in a language that feels natural to me, it’s easier/less intimidating to tease out the logical commits. I hope/expect my Go commits to be better in time.
Today I added parameter validation to the API. Right now you can send:
- Start second
- Cropping coordinates and size
Go’s core makes it easy to get those stringified parameters, but it was becoming a hassle to validate that each numeric parameter actually looked like a number. I found validate, an awesome package that uses struct field tagging and reflection to assign methods to struct fields for validation. So you can say “this string element gets validated by this function” inline.
I ended up doing my own gross reflect hack to translate form parameters into the struct before validating it. There’s gotta be a better way to do this. I believe it’s a problem with the way I’ve structured the data. It’s at least on my list to reconsider as soon as possible.
I added a “finalize” worker to complement the downloader, chopper, and stitcher workers. It cleanly moves the output gif to its final destination and cleans up the work directory. I like this distinction because the worker could theoretically mount a remote filesystem to move it to a final destination.
I also made the jobs mark themselves as active while the job is being performed. This tells you more than “the job made it past the download stage”: you can see that “the video is being chopped up now”. I also modified key creation/updates so they come with a one-hour expiration until they’re finalized. This should help eliminate stale jobs that never got processed.
Jobs also get passed the directory of the workspace from the job before it. Prior to this commit, I was using the same tempdir across all jobs. I like this method for the same reason as before — it eliminates assumptions, which I think is a good thing.
The next steps are going to be:
- make responses more RESTful: creation should return a Location header with the resource URL to poll. I also want to add more rel links for details about the resource, and actually do content negotiation, at least for the completed GIF resource
- blackbox tests: should be straightforward to write a suite of tests that just hit the API and make sure things work the way they should. Maybe look into some mock testing so I don’t have to wait for the actual creation
- actually make the workers use goroutines
- real supervisor for workers instead of a bash script for launching
- frontend website: the challenge here will be making sure I can get a thumbnail that is the same size as the video I’m about to download
That’s the wall of text for tonight!