
Hacker School Day 52 – Kernels, Magic, and Docs

Today is the last day of Hacker School.

Kernels

This week I played with kernels. I’ve been investigating an oddity I discovered while working on getting process invocations the hard way. At the bottom of every process’s stack (the highest address), there are eight empty bytes. Here’s the tail of the xxd output of a process’s stack (saved to a file, so the addresses are relative):

0020fe8: 01110101 01110011 01110010 00101111 01110011 01100010 usr/sb
0020fee: 01101001 01101110 00101111 01101101 01111001 01110011 in/mys
0020ff4: 01110001 01101100 01100100 00000000 00000000 00000000 qld...
0020ffa: 00000000 00000000 00000000 00000000 00000000 00000000 ......

The digits in the middle are the bytes written out in binary, and you can see nine all-zero bytes. The first zero byte is the NUL terminating the string before it. The next eight are due to this line in the kernel:

bprm->p = vma->vm_end - sizeof(void *);

This sets the pointer to the end of the stack right before the kernel starts saving information about the process. I’ve been trying, without success, to find out why the kernel subtracts the size of a void pointer here. I talked a little bit in the last post about where I’ve looked so far, and I haven’t had any luck.
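As a sanity check on the byte counting, here is a small Go sketch that parses the binary dump quoted above and counts the zero bytes at the very end. The helper names (parseBinaryDump, countTrailingZeros) are made up for this illustration, and the constant 8 assumes a 64-bit machine where sizeof(void *) is eight bytes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// dump is the tail of the binary xxd output quoted in the post.
const dump = `0020fe8: 01110101 01110011 01110010 00101111 01110011 01100010 usr/sb
0020fee: 01101001 01101110 00101111 01101101 01111001 01110011 in/mys
0020ff4: 01110001 01101100 01100100 00000000 00000000 00000000 qld...
0020ffa: 00000000 00000000 00000000 00000000 00000000 00000000 ......`

// ptrSize is an assumption: sizeof(void *) on a 64-bit machine.
const ptrSize = 8

// parseBinaryDump (hypothetical helper) pulls the bytes out of each line,
// skipping the leading offset and the trailing ASCII column.
func parseBinaryDump(s string) []byte {
	var out []byte
	for _, line := range strings.Split(s, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		for _, f := range fields[1:] {
			if len(f) != 8 {
				continue // not an 8-digit binary column (e.g. the ASCII column)
			}
			v, err := strconv.ParseUint(f, 2, 8)
			if err != nil {
				continue
			}
			out = append(out, byte(v))
		}
	}
	return out
}

// countTrailingZeros (hypothetical helper) counts zero bytes at the end of b.
func countTrailingZeros(b []byte) int {
	n := 0
	for i := len(b) - 1; i >= 0 && b[i] == 0; i-- {
		n++
	}
	return n
}

func main() {
	b := parseBinaryDump(dump)
	zeros := countTrailingZeros(b)
	fmt.Printf("last string: %q\n", string(b[:len(b)-zeros]))
	fmt.Printf("trailing zero bytes: %d (1 NUL terminator + %d more)\n", zeros, zeros-1)
	fmt.Printf("extra bytes match assumed pointer size: %v\n", zeros-1 == ptrSize)
}
```

This only confirms the arithmetic in the dump. It says nothing about why the kernel reserves those bytes.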

A few days ago Adrien suggested just building the kernel with that part taken out. If things stopped working, I’d at least know that something depends on it. It’s not the most exhaustive test in the world, but it sounded like a great idea.

After stumbling a few times, I got grub to boot into my new kernel and…nothing broke! The stack looks just like I originally expected:

0020fee: 01100101 01110010 01101100 00000000 00101111 01110101 erl./u
0020ff4: 01110011 01110010 00101111 01100010 01101001 01101110 sr/bin
0020ffa: 00101111 01110000 01100101 01110010 01101100 00000000 /perl.

Aside from this, the entries in /proc look fine, and all of the standard utilities seem to behave normally. I was expecting my terminal to look like the end of this StackOverflow answer, but nothing seemed different. I’m not sure what to think. I want to believe it’s used somewhere, but if I can prove that it’s some historical baggage, maybe it can be removed? This is a pretty central piece of code though, so I suspect it’s useful in some way I don’t yet understand. I’m going to continue investigating, and if I still can’t find anything, I might invoke Cunningham’s Law and see what happens via a patch.

Magic

I also played in a Magic: The Gathering tournament the other night with some other Hacker Schoolers and had a blast. It was a really great time.

Docs

I also followed up on an issue I opened on this Go wrapper for SDL2. I tried using it during Ludum Dare and was frustrated with the lack of Godoc comments connecting a function name with the original SDL documentation. The codebase is pretty huge and the project is young, so adding these would take a lot of manual effort. However, many of the function names and struct types directly translate to a URL in the SDL wiki.

I started writing a messy Perl script to try to automatically add a baseline Godoc comment to as many functions and types as I could. The idea is to go through the code to find functions and types, mush together a URL string based on some guessing rules, then check whether that URL actually exists. Some of these transformations are very straightforward: the documentation for the Go binding’s AudioSpec struct maps directly onto https://wiki.libsdl.org/SDL_AudioSpec. Some of them are more complicated. For example, this method has a Renderer receiver:

func (renderer *Renderer) SetScale...

But the SDL page for this is SDL_RenderSetScale. Renderer was a special case, but still formulaic: strip the “er” to change it to Render, then try putting it in front of the function name and checking whether that’s a real documentation page. I started getting rate-limited by the SDL wiki, so I found a tarball of its contents and continued testing locally.
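To make the guessing rules concrete, here is a rough Go sketch of the idea (the real script was Perl, and this is not a port of it). All of the names here (candidateNames, guessURL, pageExists, knownPages) are invented for illustration, and the existence check is a stand-in map holding just the two page names mentioned in this post rather than a real lookup against the wiki or the tarball.

```go
package main

import (
	"fmt"
	"strings"
)

const wikiBase = "https://wiki.libsdl.org/"

// knownPages stands in for the wiki (or the local tarball).
// It holds only the two page names mentioned in the post.
var knownPages = map[string]bool{
	"SDL_AudioSpec":      true,
	"SDL_RenderSetScale": true,
}

// pageExists is a hypothetical existence check against knownPages.
func pageExists(name string) bool { return knownPages[name] }

// candidateNames lists wiki page names to try for a Go identifier.
// receiver is "" for plain functions and types.
func candidateNames(receiver, name string) []string {
	names := []string{"SDL_" + name}
	if receiver != "" {
		names = append(names, "SDL_"+receiver+name)
	}
	if receiver == "Renderer" {
		// Special case from the post: Renderer -> Render, then prefix.
		names = append(names, "SDL_"+strings.TrimSuffix(receiver, "er")+name)
	}
	return names
}

// guessURL returns the first candidate URL whose page exists.
func guessURL(receiver, name string, exists func(string) bool) (string, bool) {
	for _, n := range candidateNames(receiver, name) {
		if exists(n) {
			return wikiBase + n, true
		}
	}
	return "", false
}

func main() {
	inputs := []struct{ receiver, name string }{
		{"", "AudioSpec"},
		{"Renderer", "SetScale"},
		{"", "NoSuchThing"},
	}
	for _, in := range inputs {
		if url, ok := guessURL(in.receiver, in.name, pageExists); ok {
			fmt.Printf("// %s (%s)\n", in.name, url)
		} else {
			fmt.Printf("no page found for %s\n", in.name)
		}
	}
}
```

Passing the existence check in as a function makes it easy to swap a local tarball lookup for live HTTP requests, which is roughly what getting rate-limited forced me to do.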

It only took a handful of guessing rules to get pretty good coverage. Overall I added documentation for 443 functions and types. I had to skip some of the related SDL packages like sdl_mixer, since their documentation URLs have numbers in the filename:

https://www.libsdl.org/projects/SDL_mixer/docs/SDL_mixer_68.html

If the project considers my PR reasonable, I’d be fine pre-processing these, but I wanted to hold off on doing that ahead of time in case the effort ended up being wasted. We’ll see!

Today I’m reading up on some things and trying not to be distracted by the fact that Hacker School is ending. We’re having a party tonight.


  1. The bitfield is a good approach! This question ends by asking how you’d solve it with ample memory (the bitmap) and if you only had tiny amounts of memory but ample disk space (the solution in this post). Nowadays IO is much more precious!

    To be fair to the book, the second edition uses “4 billion 32-bit integers”, so the intent of the question is “you have a whole lotta numbers”. A new edition would be much higher (as high as it takes for the bitfield solution to be impractical for the second answer :P)

    Thanks for commenting!