Monday, December 28, 2015

Making a Raspberry Pi Cluster's Rack

The frame, fully assembled, with trays

It's hard to play with a Raspberry Pi and not wonder what a bunch of them wired together could do. We're talking about $40 computers with 1GB RAM, a quad-core 900MHz processor, and a GPU, all on a credit card-sized board. They're not too far off, specs-wise, from the ThinkPad X60 I'm typing this post on. All that with a $40 price tag -- how could you not want as many as possible?

It probably wouldn't surprise anyone who knows me to learn that I've been working on doing exactly this: wiring a bunch of Raspberry Pis into a cluster for distributed computing. I've now collected enough of the hardware that the big priority is building a rack to keep it all organized. There are a bunch of cool Pi cluster rack designs out there. One 3D-printed design stands out in particular, both for the idea behind it and the implementation. I considered a few different designs, but none were a perfect match for my goals. In the end, it seemed like the best option was to roll my own rack.

Every design is tuned to the resources available to its designer. Being at my family's house for a week over winter break, I had a seven-day window of access to my dad's wood shop, his CNC machine, and his help. Naturally, I was inclined to take full advantage of all three. First of all, though, came deciding what exactly it was we would make.

Design
A few different design constraints are involved here. I wanted every Pis easy to access or remove. I wanted the finished product to look nice. I wanted to keep the footprint small, because this thing is probably going to live on my desk. It was also important to make sure the design wouldn't restrict airflow, because this cluster will be running some long jobs without any heat sinking. Lastly, budget dictated an eight-board setup.

I decided to power over USB, because my understanding is that the Pi has a protective self-resetting fuse in line with the USB power input. This fuse keeps power supply malfunctions from permanently destroying your board, but powering over GPIO seems to bypass this protection. I didn't put too much time into verifying this info, since I heard it from a trusted source, and in any case powering the boards over USB from a couple externally powered hubs is also nice and simple. If I ever decide to overclock these boards (and, let's be real, that's happening sooner or later), I'll probably enlist some help and make a custom power supply to deal with the increased power demands at that point. Maybe mount some heat sinks, too.

These parameters together with the tools available led me to the design you saw above. It's a vertical rack of five trays: on the bottom, a tray for USB hubs and an 8-port Ethernet switch; above it, four trays, all set up to mount two Pis each. Each Pi mount point also has an elliptical hole to promote airflow. It seemed important not to neglect either side of the boards ventilation-wise, since both sides of the Pi 2 have important chips on them (CPU and GPU on top, RAM on bottom). The Pis are rotated 180 degrees from each other, so that both have their Ethernet port and USB power input exposed.

The final design.

The Pis dictated the dimensions of the trays, which in turn dictated the dimensions of the wooden frame. The frame, assembled from square cuts of 3/4-inch-thick mahogany, ended up with a footprint of just about 7.5x5in (disregarding backstops), and a standing height of 9.5in. After cutting as many square beams as possible from the board, there was a thin strip left; we cut from this two more 9.5in pieces and used these as backstops.

Assembly
It was only in hindsight that I realized how simple this process really is. The vast majority of our time was spent not on woodworking but on trying to troubleshoot the CNC machine toolchain. There don't seem to be nearly enough mature open-source projects in this domain. This is an area where a few dedicated programmers could make a respectable impact. The process of setting up a working toolchain was so painful that I'm not going to say another word about it, except to note for posterity that when we did get things working it was using the proprietary tools provided by the manufacturer (Shapeoko) start-to-finish. One wishes that there could be a better way.

The CNC machine, hard at work cutting a tray. Not pictured: hours and hours of troubleshooting.

Anyway! Enough of that. The tray design ended up being pretty simple -- two pairs of four screw holes lined up to the Raspberry Pi's mount hole dimensions (link is to B+ measurements, but the Pi 2's hole spacing is identical). With those set to mill, it's time to cut some wood! We had this nice, light, 3/4-inch-thick mahogany lying around, so we used that, cutting as much as possible into square rods and cutting these rods down to the various lengths needed (four each at 3.5in, 6in, and 9.5in). This left a thin residual strip from the end of the board, which we later cut into backstops. But before backstops come tray slots, and before tray slots, the frame's sides need to be assembled.

The sides were easy enough to build: just take the relevant cut-to-size pieces (after sanding them reasonably smooth!), measure out whatever ground clearance you might want, glop some wood glue onto the points of contact, press everything together, slap on some clamps, and let it sit for a good long while.

Letting it sit.

Cutting the slots is easy, too, if there's a tablesaw around: just measure & mark the slot locations, lower the blade so it extends from the table by however deep you want the slots to be, and gouge away. I opted to be sort of generous with the cut depth (after all, better too wide than too narrow), but what I've come to realize is that the tolerance used here has a surprisingly big impact on how cleanly the trays slide in. It pays to be as accurate as possible.

Once the slots are cut, it's time to glue up the other four crosspieces and bring the whole frame together. This step was a bit tricky, since all four beams have to be left to set at roughly the same time in order to keep everything nicely aligned, which turns out to be easier said than done. The problem is that once two pieces are in and clamped down, the remaining ones get progressively harder to wedge in without losing most or all of their wood glue. I really don't have any better advice than to use generous amounts of glue (remember that spillover can always be cut/scraped/sanded off) and to have two pairs of hands involved: one person gently pulling the soon-to-be contact points apart, and another person to put the new beam in and get it lined up just right.

Do not disturb.

Once this glue cures, the frame's complete except for backstops. These really could not be easier to attach, especially compared to what came before them.

After that, all that's left to do is sand this puppy damn smooth and slap on some finish. This wood took the finish well, each coat drying pretty quickly. We put on ballpark four or five coats (who counts?), continuing until it looked good, then sanded it down and added a final layer.

A delicate process.

Oh -- there is one more thing left to do, and that's to mount the Pis on their trays. As you can see, we hit a slight gotcha here:

Asymmetry... adds character?

If you think it looks like those boards are missing half their standoffs, you're absolutely right. Here's what happened. We drew up our parts list for screws, standoffs, nuts, and washers, and rolled on down to the local hardware store (Stone Way Hardware, great place). Tracking down the 2.5mm hardware took a while, since just about nobody except computer & electronics hobbyists use this stuff... arguably for good reason. Turns out the stock was pretty sparse. We asked if they might have more "in the back". No such luck. The clerk commented, after hearing the amounts we were looking for, that we wanted as many of these components as they sold in about two years. He didn't seem particularly moved by the chance to get two years ahead of quota, and so we had to make due with half of everything.

Fortunately, it only really takes two screws & standoffs to do a "good enough" job of mounting a Pi if you install them diagonally from each other. It's not pretty, but it works. The boards are mounted and stable now, but I definitely plan to get the rest of the hardware at some point down the road. It'd probably be better for the boards, and it'd just look better besides. Maybe I'll check back in at Stone Way in a couple years.

The rack, with a network switch in its tray and 6 out of 8 Pis mounted

Wednesday, December 9, 2015

Thoughts on some 2015 Putnam Exam Problems

Now that a few days have gone by, I thought I'd write a few notes on things I found interesting while working on this year's Putnam exam.

This year's problems seemed pretty different from last year's. That year, I had trouble making use of all my time; this year, I found myself working into the last five minutes in each session. The A section felt very approachable in particular; B was certainly more arcane, but still didn't feel prohibitively so. There were interesting problems to talk about on both sides.

A PDF with this year's problems can be found attached to the first post here: (link)

A1, B1

Like last year's A1, these were easy problems whose main barrier to solution was that they required some nontrivial background knowledge. Not particularly interesting or exceptional.

For A1, this comes down to the fact that there's a simple geometric property which uniquely defines the point P. This in turn allows simple algebraic expressions for the two small areas under the chords AP and PB, and once you have those it's just a matter of integral-crunching to prove their equality.

For B1, it's a question of how much you know about the relevant facts in analysis. I didn't take the time to work out a full solution; the curious can find a good discussion here. The idea of letting g(x) = f(x)h(x), then working out first three derivatives and looking for interesting properties seems to bear fruit. It is useful to remember that if h has no zeroes, then g's zeroes are exactly f's zeroes.

A2

This was a fun one. The sequence is easy to work out by hand; it's also easy to work out a closed form for arbitrary terms. Doing either of these things can lead to the solution, if they are paired with the right ideas. The important first impression in either case is that since we're talking about odd prime factors, we should be on the lookout for properties concerning divisibility and prime factorization.

The straightforward way to find the closed form is by noting that the sequence can be modeled in terms of vectors and matrices:


allows one to obtain vectors containing consecutive elements of the sequence. It's possible to get the closed form of the sequence from this representation by solving for the matrix's eigenvectors and rewriting the starting vector as a linear combination of them. This makes it easy to get rid of the matrix. The closed form allows us to figure out some properties about certain elements of the sequence dividing others, and that gets us well on our way.

Working the sequence out by hand also can be surprisingly useful. In fact, the first thing I did was work out the first eight terms or so by hand and find the prime factorization of each one. This greatly helped to inform the rest of my strategy.

Note that a simple pigeonhole argument shows that the sequence is eventually periodic under any modulus. This means that if we work out the sequence mod p until we've found where its periodicity starts and how long the period is, then we can predict all indices n such that p divides a_n.

That strategy, while guaranteed to eventually work, will kind of take a while. The astute problem-solver can probably expect that no small primes bear fruit under this approach -- that would be too easy! So, while we could work out all the sequences up to say p=17 and cross our fingers that one of them turns out to have a zero at index 2015, there's probably a better use for that time. If we're committed to this strategy, though, then what other primes can we use? Well, a_5 = 2*181, which is a pretty big prime that shows up in a pretty weird place -- that might be worth checking out, right?

Right, indeed. If you do the arithmetic right, you find that 181 turns out to divide a_2015. Sometimes it pays to trust your intuition!

As an aside, there exists an interesting sort of symmetry between the zero terms of the sequence. Here, for example, is the start of the the sequence mod 181:

1, 2, 7, 26, 97, 0, 84, 155, 174, 179, 180, 179, 174, 155, 84, 0, 97, 26, 7, 2, 1, 2, 7, 26, 97, 0, 84, 155, 174, 179, 180, 179, 174, 155, 84, 0, ...

See the symmetry? Where's that coming from?

A3

Still kicking myself for this one. The insufficiently careful might think, as I in the haze of 10-minutes-left-in-the-session adrenaline did, that the exponent can be rewritten as,
which would sure make things a bunch easier! Sadly, the last step of this simplification fails to account for complex roots of unity -- something which, a few minutes into the break, I was absolutely kicking myself for. Even in the moment, it felt too easy. The actual solution methods turn out not to actually be that difficult -- more on that from the usual scary-smart folks here.

B2

This problem was so cool. Seriously. If you haven't tried it yet, you probably should. The patterns are really interesting and formulating rigorous arguments around them is a surprisingly tricky exercise. I don't want to say too much, because I don't want to spoil the fun of it -- the comments below will start out general and get gradually more specific. One word of warning: be cautious of conjectures concerning periodicity until you see a good proof for them.

It's possible to prove constraints on the least significant digits of terms in the sequence, and it's possible to put some useful constraints on their density. Once you have those tools, it's a short jump to finding large sums guaranteed to be in the sequence. If you find general forms for these sums, you can set them equal to 2015 mod 10,000. Then, if you know the extended Euclidean algorithm, you can invert any multiplications in these expressions and end up with some very strong hints about conditions sum terms would have to satisfy, if they exist at all.

Carefully following this line of thought, I was able to show that 4668+4669+4670 is in the sequence, which in turn implies that 14004+14005+14006 is. This is all we need, since 14004+14005+14006 = 42015 blaze it

I found it really helpful, in formulating these steps, to write out the first 60 positive integers in a 10x6 grid and work out the start of the sequence on this grid, crossing out summands and circling sums, doing my best to place sums > 60 roughly where they would appear if the grid extended far enough to include them.

B3

Interesting problem. It can be done through casework if you have the linear algebra chops. There turn out to be two main categories of matrices satisfying the condition. I definitely would've worked on this one if I'd had more time, but making sure my B2 argument was rigorous took way too long. Lesson learned: practicing formally writing up unusual arguments can be a productive use of one's time.

B4

There is a technique known as Ravi substitution which I had never heard of before reading about this problem, but which turns out to be well-known at the higher levels of competitive problem solving. It allows one to simplify the sides-of-a-triangle constraint. As far as I can tell, there seem to be two different forms of Ravi substitution: one for when the sides are reals, and one for when they are integers.

If the sides a, b, c are reals, it turns out that the sides-of-a-triangle condition is equivalent to specifying that there exist some positive reals x, y, z, such that a = x + y, b = x + z, and c = y + z.

If the sides a, b, c are integers (as in this problem) then that substitution is not possible, but several others are. For instance, we can define x, y, z as x = a + b - c, y = a + c - b, z = b + c - a. We can use the triangle inequality to trivially prove that x, y, z are all positive, and some mucking about with parity lets us set up a nice bijective property between triples a, b, c, and triples x, y, z. Once we have that, we can make substitutions and crunch out algebra. BIGTRO's comment in this thread works this line of thought out to the end; I wish I'd had the time to try and find all this.

B6

This is a really interesting problem, but I think that whoever came up with it may have been criminally insane. That's all I have to say here.

Saturday, September 12, 2015

this is just to say

i have drunk the protein shake that was in the fridge and which you were probably saving for breakfast it was so gross what
was i thinking

Tuesday, September 1, 2015

What Does the Internet of Things Mean for Security?

We are standing at the edge of a very steep hill full of sharp rocks, and internet-connected hardware manufacturers are trying to push us over.

Imagine you woke up one day to find out that overnight you lost access to every account you use online -- Facebook, Twitter, Gmail, you name it. Worse, because all your password resets ran through your Gmail account, there's no easy way to get these accounts back. Now imagine that all of this happened because you bought the wrong fridge.

This sounds like a thought experiment, but it isn't. For those few who're unlucky enough to own the vulnerable model of 'smart' fridge, this could actually happen. This is reality. It's made by Samsung and costs about $3600. If an attacker can get within radio distance of the fridge, they could take over the fridge owner's Google account without breaking a sweat.

Once we've got that (imagining ourselves now in the attacker's shoes, taking advantage of some poor sap who dropped close to four grand on an absurd fridge), we can pull up their other accounts, hit that big fat "password reset" button, and get a link delivered to your new mailbox inviting you to set the account's password to whatever you like. Note that the strength of the original passwords has no bearing on whether this attack can work.

Thursday, July 9, 2015

Diving Deeper Into Deep Dream: Different Distortions

The moment you get Google Research's Deep Dream project set up, you can make it do incredible things. If you want to generate trippy zooms with lots of dogs and eyes, the code Google provides works more or less out-of-the-box.

For reference, here is a gist for a .py file containing all the code from the notebook. This is meant to be run through an interactive interpreter, e.g. by providing the -i flag at runtime or, in iPython, starting a new notebook and then executing e.g. "run demo.py".

And, here is a gist for a similar but extended piece of code which takes an input file on the command line and performs the iterative process described at the very end of the researchers' original iPython notebook. Both these files are almost entirely composed of code from that notebook; they're just provided as references, so we can have some well-established starting points from which to branch out.

Thursday, July 2, 2015

Setting Up Deep Dream, Google Research's Hallucinatory Work of Genius

Google Research wrote recently about a technique they call "Inceptionism", which needs to be seen to be believed. Click through and check out their pictures. A full gallery of official images can be found here. The techniques used to generate these sorts of images were described in broad strokes in this blog post, but the level of detail stopped just short of what someone wanting to replicate their results might have wanted.

That is, until they published their source code.

In this repo (which was created just this morning, as of this post's writing) is an IPython notebook which contains everything necessary to duplicate the results you see in their blog post. That's so cool that it's honestly a little hard to believe. The dependencies are detailed in the repo. There are surprisingly few of them, and they're all surprisingly easy to get set up. Let me walk you through what I had to do on my Debian Stretch system to get everything up and running.

Wednesday, May 27, 2015

Evaluating the Alternating Harmonic Series

Here's a fun little note I just found while organizing my desk. I discovered this proof during a class on infinite series last summer. The professor mentioned the limit of the alternating harmonic series, then commented offhand that "this limit is true, but we cannot prove it yet." Never one to shy from a challenge, I decided to see whether he was right, and had roughly this proof scrawled on a piece of paper by the end of class. He suggested that I try to publish it in an undergrad journal, but that didn't pan out for various reasons. Nevertheless, it's a fun proof, and I wasn't able to find it anywhere online, so I thought I'd share it here.

The alternating harmonic series is defined as


which, of course, is just the harmonic series with the sign of every other term flipped.

It is well known that the harmonic series diverges. Slightly less well known is that the alternating harmonic series converges to a very clean limit -- strangely enough, it turns out to be ln 2. This fact, if it comes up, is usually proved just as a trivial consequence of the Euler-Mascheroni constant's most common identity. Not only is that proof beyond the reach of many undergrads, it's also so trivial as to border on inane. It's like using a sledgehammer to pound in a nail: of course it works, but is it really the best way? Someone pass me a hammer.

Consider the following series:

Sunday, May 24, 2015

Finding Files with Multiple Substrings

I'd like to share with you a challenge I ran into last night, a problem that's really difficult unless you take to the command line.

Some background: I play Go. The GoGoD game database, which I recently bought and downloaded, has somewhere on the order of 80,000 Go game records in it. These games are all in .sgf format, which is a simple plaintext format. Records of games from professionals of many different ranks are included. The rating scale for professionals starts at 1 dan (or "shodan") and goes up to 9 dan, the highest rating a player can achieve. Games between professionals are always interesting, but games between 9dan professionals are, in particular, packed with subtlety and value for the ambitious student of Go.

Any game from this database is worth studying, but with over eighty thousand games to choose from, it helps to be able to pick highlights, and what better way than by focusing on games between 9dans?

If we decide to take this route, we're immediately faced with a second question: sure, we can just open games at random until we find one between 9dans, but this is boring and slow. Are we doomed to tedium, or is there some way we could search this database of more than eighty thousand games, pick out only the games between 9dans, and copy these over to a special "elite database"?

To the average user, this might sound impossible. On the command line, it's almost trivial.

Wednesday, April 15, 2015

In Defense of Honors Programs (or, Read a Fucking Book)

In planning the follow-up to a survival skills event for incoming Honors freshmen, the question came up of how best to articulate to new students the value of WWU's honors program. Especially for STEM students, the fact is that the liberal-arts-focused Honors courseload can be expected to overlap very little (if at all) with their major's courses. That's a non-negligible number of extra credits, and there's nothing scarier than a delay in graduation, and so lots of people in the program question, at one point or another, whether the program is worth seeing through to the end. It's our job to try and make the case that it is.

Of course, regardless of obligation, it'd be disingenuous of us to argue this point if we did not ourselves wholly believe it. For my part, I am lucky in this regard, because in spite of (because of?) being a Math/CS double major, I'm convinced that the honors classes I've taken have on average been far more valuable than anything I've taken in either of my majors. Ditto for the classes I'm planning on taking. That's not because I don't like my majors -- far from it, I'm crazy about them. But computer science is close being treated as a trade skill, and math's "beauty cold and austere, like that of sculpture" (Russell) nourishes only certain parts of the mind. Both majors are fantastic for what they are, but what they are is not a complete education. Similar things could be said for practically any other major: They leave gaps, and honors fills those gaps.

My first, informal attempt at articulating to my co-planners this argument for why to stick with honors went something along these lines:

Why stay with honors? Because we offer things you can't get anywhere else. We have classes you can take as a freshman that're on par with the upper-division classes in specialized majors. You can just be like, "I want to learn about Russian literature this quarter" or "I want to learn about film this quarter" and nobody's going to be like, what the fuck, you're a chem major. These kinds of classes are just there, and anyone in the program can take them. If you don't want to be an English major but you want to study great books, we're your best shot. If you want to be in rooms full of clever people talking about Nabokov or Robbe-Grillet or Bely or Dante or for that matter anyone else worth reading, we've got you covered. We've got some of the best professors in the university teaching our classes. Great classes, taught by great professors -- what more do you fucking want?

2015 iCTF Retrospective

After some delays, last Friday saw the 2015 UCSB iCTF come and go, and it was quite the experience. This is (to my knowledge) the first year that Western has joined the competition. We didn't exactly place first, but it was great fun and lots of lessons were learned. Now that it's over and we've had a few days to reflect, here are some thoughts on what we did right, what we could've done better, and what I didn't go in expecting.

There were two big surprises, for myself and (I think) the rest of the team: First, the sheer mind-boggling number of exposed vulnerable services -- upwards of 40, I'm pretty sure -- and second, the inability to directly attack other teams.

Monday, March 30, 2015

You know what?

It doesn't seem like there's any good way to handle armchair "philosophers" who think they have something deep to say about the idea of certainty. "See, the thing is, you can't actually know anything for sure! I mean, like, what if we're all just brains in vats?" The only compelling argument I see for this point proceeds by example: The claim is that nobody can truly know anything whatsoever. Anybody who can in all sincerity make this claim must know very little -- perhaps little enough to serve as supporting evidence.

Joking aside, this is one of the silliest ideas out there. Those who contract it generally do so within a year of their first serious exposure to the great philosophers, in that honeymoon phase where everything makes sense and the ideas of "discarding all assumptions" and "questioning everything" still seem novel. Socrates and Descartes are particularly severe offenders, as brief or unconsidered treatments of their works can easily give the impression that they advocated this total uncertainty. What, more precisely, is this misconstrued idea, though?

It's the idea that we just can't know anything for real, man. You know? Like, think about it -- everything we know comes from observing the world, via our senses, right, and how can we trust our senses? What if we all see, like, different colors? What if two plus two doesn't always equal four? What if it could equal five, but we don't believe that because we've never seen it? Or because we can't see it? Would we even know how to count if we were all blind? I, like... I dunno, dude. I dunno. Whoa. We don't even know that. We don't know anything. Not really.

Thursday, February 19, 2015

Symbolic Computational Geometry in Python: Heron's Formula

Floating point numbers suck. Sometimes they're the best we've got, but they still suck. Floating point numbers are to math as Physics 101 lab measurements are to the laws of physics: you get basically what you expect, but with a margin of error tacked on. Sometimes that's okay. Sometimes it's not. But what choice do we have? I'm so glad you asked, because it turns out we have plenty.

But before we get into that, let's convince ourselves that we have a problem to solve. Luckily, this is really easy. Suppose we want to find out, for some terribly important reason, the area of the triangle with vertices at (3,4), (7,10), and (11, 161). We don't get pen and paper -- in fact, the only tool at our disposal is a Python shell. If we know some geometry, then our naive approach might go something like this:

Thursday, February 12, 2015

Cheat Sheet: Windowing in Screen & Vim

As a lazy computer science student, I find myself working on programs over SSH a lot. This tends to pose some challenges. Unless you want to mess around with -X and -Y flags, all you get is one terminal to do all your work in. Having to work in this environment has taught me one lesson: When all you have is a terminal, everything looks like a window manager. Or something. Anyway. There are lots of ways to split up a terminal session, and this post covers some common commands involved.

Tuesday, February 3, 2015

Cracking General Substitution Ciphers

As some of you have requested, today we'll be talking about paper-and-pencil cryptography and how we can use computers to poke it full of holes!

We're going to start with some background on basic (pre-digital) cryptography. Then, we're going to describe the specific cipher we're attacking here, along with some standard paper-and-pencil approaches to cracking it. Thirdly, we'll discuss one algorithmic approach powered by a dictionary file. Then, we'll look at some Python scripts implementing the discussed ideas. Two different scripts will be provided: A bare-bones one written to be simple to understand, and a batteries-included fancy one which elegantly implements the main logic using coroutines.

By the end of this post, we'll know what this script is doing and why it works.