Monday, May 30, 2016

Reading List - 2015

A little light on the technical books this year, but I tried to focus my efforts on a worthwhile periodical instead.

Books

Bonus Level



Tuesday, March 29, 2016

Presidential Candidate Twitter Bot

I set up a Twitter bot to say presidential things.


Currently, it uses tweets from Trump, Cruz, Clinton, and Sanders as its corpus. I want to compare and contrast the content it produces with that of the rest of the political elite to see how it stacks up.

I figure it's got a 50/50 on being better or worse.

Why?



Much of modern political debate is pithy one-liners fit for reality television, so this just seems to make sense. With so many jobs being "disrupted" by tech, political debate seems like a logical next industry.

In the future, I assume the Presidential position will be filled by a bot that uses the combined tweets of Congress as its input. And perhaps, eventually, the congressional representatives will simply be bots pulling content from their constituents.

How?



My previous bots (like The Peterbot) used the Python NLTK library and the awesome command-line Twitter client "t". This was a good option when my primary input was blogs and websites, but it was sort of a pain to get working properly with Twitter feeds.

I've since migrated to the popular twitter_ebooks gem and set up a daily cron job to update the corpus.
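For the curious, here is a minimal sketch of the general idea behind text bots like this one. It is not the twitter_ebooks code, and it assumes a simple first-order word chain rather than whatever model the gem actually builds. The function names (build_chain, generate_tweet) and the sample tweets are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(tweets):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for tweet in tweets:
        words = tweet.split()
        for current_word, next_word in zip(words, words[1:]):
            chain[current_word].append(next_word)
    return chain


def generate_tweet(chain, start_word, max_words=20, rng=None):
    """Walk the chain from start_word, picking a random follower each step."""
    rng = rng or random.Random()
    words = [start_word]
    # Stop at the length limit or when the last word has no known follower.
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Hypothetical stand-in corpus; a real bot would load archived tweets.
sample_tweets = [
    "we will make the economy great",
    "we will fight for the middle class",
    "the economy is rigged",
]

chain = build_chain(sample_tweets)
print(generate_tweet(chain, "we"))
```

Every adjacent pair of words in the output appeared somewhere in the corpus, which is why the results sound plausible in small doses and fall apart over longer stretches.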



The horrifyingly presidential picture was made with Morph Thing.


Saturday, February 27, 2016

Naming Variables: A Short Story

On the Importance of Naming Variables
~a short story~

During the last semester of my engineering degree, I took a class on information retrieval (IR), which focused on search engine internals and file matching. I was taking the class in parallel with a "capstone" course. The capstone was the last requirement for graduation, where teams of students were paired up to work on a project for an actual client company. Capstone was a notorious timesink, so most other courses were put on the back-burner.

The final project for the IR class was to make a program that would accept an image, process it for patterns, and match it to similar images from a corpus. The project was split into two halves: building the searchable corpus and building the front-end. We were grouped into pairs and I happened to wind up with a fellow capstoner, Jim. We talked briefly and I took the corpus half.

We had well over a month to complete the project but, like most college projects, it was ignored until the weekend before the due date. The icing on the cake was the realization that the program for corpus generation took well over an hour to run, even with a small subset of images. This made testing tedious and slow.

An all-nighter ensued, and when I finally passed out, the process was quietly humming away, abusing images and collecting patterns. Without the front-end to actually accept input and match images, I could only hope that the corpus would be correct.

The next day, I asked Jim for some testing but received only vague responses. It was clear he hadn't completed the front-end yet. No sweat, we've all been there. I sent him my half of the project so he could finish up.

Finals came and went. IR was last, and the final would be a class presentation of our image-matching projects. It was officially the last thing I ever needed to do for my degree. Excitement ran high. Jim said that our image results were weird, but was positive they were correct. Right, sure thing.

Classmates demonstrated their projects, and it quickly became clear that our results were definitely not correct. An image of the American flag should not have been matched as similar to an image of a kitten. We fumbled through our presentation, realizing we had failed.

Feeling what I assume was frustration and pity, the professor decided to give us until the end of the day to repent. We would receive a just-technically-passing grade on the project if we could fix the matching. The last class of our college careers was now officially over but we were not done. We ran to the lab and furiously opened vim terminals.

I pored over my code, comparing implementations to documentation. It was dark outside now. No breaks, no dinner. The code looked solid. This was my personal nightmare scenario. At least an obvious error would have meant we were done.

Jim was having no luck, either, so we started doubling up on the code. I rolled my chair over to Jim's desk and realized something was wrong. There were no functions, no classes - just a single large block of code in a single large file. I tried to dig into it. Variables were named single sequential letters: a, b, c, etc. Soon the letters doubled: af, ag, ah. I was now in "ba" territory. Jim noticed my horror, and explained that this style of coding saved him significant time during his development. Somehow, I was doubtful.

An hour later, all hope seemed lost. It was past 9:30pm now. From a few feet away, Jim began to laugh. I assumed he'd gone insane. No, he'd found the error! In one of the crucial pattern-matching calls, he'd passed in "af," a buffer of pre-processed image data, when it should have been "ag," a buffer of processed pattern data. So obvious, we both should have caught it straight away, I was informed.

A wave of relief as Jim sent an email to the professor, followed by a wave of anger as I debated killing him with the heavy mechanical keyboard. The next morning, technically our first day of post-college, was spent with the professor, demonstrating our now-working program. He was rightly skeptical of our explanation (read: completely pissed off), but passed us with a promised low score. It was a sad way to end, but we were done.

When I asked him what he was thinking with his variable naming, Jim was confused. "This is a common programming problem." No, it really isn't. Words were exchanged, ways were parted, and I'm not sure what happened to Jim after that. As a farewell gift, this story has always stuck with me, and I hope the same is true for him, too.

So, please, for everyone's sake, the next time you're working on a team project, take an extra two seconds to name your variable "image_data_buffer" instead of "af." It may not seem important now but you'll thank yourself if shit hits the fan.
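To make the point concrete, here is a small sketch of what descriptive names buy you. This is not the project's code: extract_patterns and pattern_distance are hypothetical stand-ins, and the "pattern" is just a four-bucket brightness histogram assumed for illustration.

```python
def extract_patterns(image_data_buffer):
    """Hypothetical stand-in: bucket 0-255 pixel values into a 4-bin histogram."""
    pattern_data_buffer = [0] * 4
    for pixel in image_data_buffer:
        pattern_data_buffer[min(pixel // 64, 3)] += 1
    return pattern_data_buffer


def pattern_distance(query_patterns, corpus_patterns):
    """Sum of absolute differences between two pattern buffers (lower is closer)."""
    if len(query_patterns) != len(corpus_patterns):
        raise ValueError("pattern buffers must have the same length")
    return sum(abs(q - c) for q, c in zip(query_patterns, corpus_patterns))


image_data_buffer = [0, 10, 200, 255, 130, 64]             # raw pixels
pattern_data_buffer = extract_patterns(image_data_buffer)  # processed patterns
corpus_pattern_buffer = [0, 1, 1, 4]                       # a stored corpus entry

# With real names, the wrong argument looks wrong at a glance:
#   pattern_distance(image_data_buffer, corpus_pattern_buffer)
# With "af" and "ag", the same mistake is one keystroke and invisible.
print(pattern_distance(pattern_data_buffer, corpus_pattern_buffer))
```

The length check is a cheap extra guard: in this sketch, passing the raw image buffer by mistake raises a ValueError instead of quietly matching flags to kittens.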