My Octopress Blog

A blogging framework for hackers.

Port(1): A Four-Letter Word?

To me, port has always been kind of a dirty word. Sure, it’s nice to have a package manager for Mac, especially after getting used to apt-get. Still, things tend to show up in weird places, and paths get confused.

For instance, I was extremely frustrated this week to find that on OS X Lion, gcc 4.4.5 just would… not… compile. Frustrating stuff. I was tasked with porting an enormous existing in-house code base (of about 60-100k lines) to Mac, and was dismayed to find that it required C++0x features, which are unsupported in gcc 4.2.

Giving up, I turned to MacPorts as a broken, empty shell of a man. MacPorts was able to build it, though relegated to /opt/, and though I could add that to my path, this new version of gcc didn’t know about the libraries I had installed by hand in /usr/local/. Of course, I could edit all the makefiles, or do some other magics, but it turns out MacPorts can be bent to your will.

Like most, I had installed the binary release of MacPorts, configured to live in /opt/, but if you instead build from source, you can:

./configure --prefix=/usr/local --with-unsupported-prefix

This not only makes MacPorts itself reside in /usr/local/, but it will in turn install its packages there as well! I don’t think I’m the only one who appreciates that kind of consistency – all my libraries in the right place. I still feel slightly dirty whenever I have to rely on port, but at least when I do, I can save a little face.

Error Macro Win

Still porting Linux code to Mac, I’ve been trying to keep up a useful habit: using the #warning and #error preprocessor directives. This code is riddled with #ifdefs checking whether or not we’re building on a Mac, substituting alternatives for the Linux-only system calls, but in parsing these large chunks of code, sometimes I forget what I’m doing. How horrible would it be to accidentally leave the Mac-only code block empty when it’s meant to actually do something?

So, whenever I open up one such block, I add a little reminder:

#ifndef MAC
// ... The Linux-only code
// ... takes up
// ... a lot
// ... of space
#else
#error "Don't forget to implement this for Mac!"
#endif

At least I’ll always catch it at compile time, and when I fix/add the Mac-only code, I can go ahead and remove it. It’s not that I’m forgetful, but I’ve shot myself in the foot enough times at this point.

Command Line Stopwatch (Time Cat)

If you find yourself with a terminal and you need a stopwatch: $> time cat

cat(1) by default reads from stdin if no arguments are provided, until an EOF is reached (Ctrl+d). time(1) waits until the command it runs terminates, so in effect, it’s a stopwatch that runs until you press Ctrl+d.

Boost, Typedef, #define and GCC Pain

Recently I’ve been working on porting some code to Mac, and yesterday I ran into a bug that stumped me for a little bit. Compiling against Boost was raising a bunch of errors, specifically in lines that seemed pretty innocuous (from cstdint.hpp):

  using ::int8_t;
  using ::int_least8_t;
  using ::int_fast8_t;
  using ::uint8_t;
  using ::uint_least8_t;
  using ::uint_fast8_t;

G++ kept giving me errors for each of those lines: error: expected unqualified-id before ‘signed’, referring to the line using ::int8_t. I’m a little embarrassed that I couldn’t figure it out right away, but eventually I figured out that it was caused by int8_t being #define‘d somewhere else. For those of you that don’t know, #define really just takes one term and substitutes every subsequent occurrence of that term with one you provide.

// If you define it as a macro
#define int8_t signed char
// Then this line will be interpreted very differently from how you expect
using ::int8_t;
// Gets interpreted as "using ::signed char"!!!

And this is what g++ had been complaining about. That is not legal C++ syntax; I’m sorry I doubted you, g++! But there still remained a larger question: where were these types getting defined? I didn’t want to be in the business of “patching” a library, and especially the largely impeccable Boost library. Typically these types (int8_t, uint8_t, etc.) are defined in cstdint or stdint.h, but looking at the system’s version, I found to my surprise that they were not macro-defined, but typedef’d, which is the right way to do it.

Sidebar: In general, you should be using typedef instead of #define for this reason, and for another very good one. Because #define macros just go through code and blindly replace tokens, it can be difficult to trace the origin of a type; typedefs are carried through, so even after preprocessing you can still see what the semantic meaning was (that you wanted int8_t specifically, not just something that happens to be the same type). When debugging, this extra type information can be helpful. Similarly, you should generally use const to define constants in your code instead of #define macros, because while you might remember what a magic number means when you write it into your library, the meaning of that particular constant becomes unclear when you encounter its value while debugging. (If you haven’t, read Scott Meyers’ Effective C++.)

Getting back to the morality tale: the library I was porting wasn’t macro-defining int8_t, and stdint.h wasn’t either, so where was the culprit? There were potentially hundreds of places it could be, and I was running out of good guesses. Luckily, SEOmoz C++ shaman Martin taught me a little ninja magic: use the -E flag with g++ to run only the preprocessor stage, and redirect the output into a file. When compiling with make, it typically spits out the offending g++ command, so you can rerun just that one command with -E; the preprocessor fetches all the header files and gloms them in order into one giant input file. Then, search that file for lines where “define” and int8_t occur together! In two minutes we found the header that was causing all this trouble, after I had spent two hours reading about where the problem might be.

In the end we found it in a very small library that we happened to use, and on Mac we had just been using a slightly old version and this problem had been fixed in subsequent releases. Still, I’m glad to have added this preprocessor trick to my toolkit.

A Sign That You’re Doing It Wrong

Lifehacker recently posted an article for Memory Restart, which restarts Firefox when its RAM consumption becomes too high.

To me, it’s a pretty clear sign that you’ve done something wrong when people are writing plugins for your browser to restart it because it consumes too much RAM with some regularity.

Champagne Problem

I had an interview a few weeks ago, and I spent a little time preparing for it. I puzzled over some logic problems, and read some common interview questions, knowing that there would still likely be questions I hadn’t heard. Still, I did have one cache hit, so to speak.

One of my favorite questions, though, was one about pouring bottles of champagne: Suppose you are in charge of pouring the champagne for the midnight toast at a prestigious party with more VIPs than you can count. You have 10 waiters, who are wheeling in 1000 bottles of champagne and some bad news; exactly one of these bottles is poisoned, and if you drink even a single drop, you will become violently ill in one hour. If you serve tainted champagne to any guest, you and your employees will be fired on the spot.

Your waiters are sympathetic to your plight, and have thus volunteered to be taste-testers – a night home on the couch being miserable is better than being unemployed. Because of time constraints, though, you only have time for one round of testing before the tasters get sick and you have to serve the champagne.

The naive approach is to split the bottles up into groups of 100, and have each waiter try a drop from each. You would know from whichever waiter gets sick which group of 100 bottles has the bad bottle, but there would be 99 good bottles wasted, and of course, the wasted good bottles get deducted from your pay.

So, you want everyone to keep their jobs (and not serve tainted booze) while wasting as few good bottles as possible. How many bottles do you have to waste?

You don’t have to waste a single bottle of good champagne, while ensuring that no VIP gets sick. Label your waiters 1, 2, 3, and so on up to 10. Label the bottles of champagne 1, 2, 3, and so on up to 1000. Then, for each bottle, convert its number to binary, and associate each waiter with one bit: waiter i samples every bottle whose i-th bit (counting from the least significant) is 1. For example, bottle 417 has representation 0110100001, meaning that waiters 1, 6, 8, and 9 would sample a small drop from it, while bottle 418 (0110100010) would be sampled by waiters 2, 6, 8, and 9. In this way, each bottle has a unique set of waiters who sampled it. Then, when waiters start becoming sick – say, waiters 3, 5, 6, and 10 – you know that bottle 1000110100 (564) is the contaminated one.

JavaScript Unit Testing With QUnit

In the vein of habits I wish I had picked up in Software Engineering, I’ve been increasingly using unit tests. Working on a project recently with a friend of mine, it fell to me to pick out a unit testing library for our development.

My uninformed search led me to QUnit, from the same people that bring us jQuery, which we already use heavily. We found it immediately simple to use, expressive, and powerful. Much of this particular site involves AJAX – most operations rely on information retrieved from remote resources – so we wanted to be able to test that our interface to those resources is working as well.

To that end, the vast majority of our tests use QUnit’s asyncTest function, which lets you perform any kind of asynchronous request and then, in your callback, signal to QUnit (by calling start()) that your test has all its necessary information and can continue. For example:

asyncTest("Our Site's API", function() {
	$.ajax({
		// ...
		success: function(response) {
			ok('data' in response, "We expect data in the response.");
			start();  // resume the test runner now that the response is in
		}
	});
});

One big ‘get’ in my mind is that it comes from the same group that produces a library we already use heavily, so the two tend to be thought out in similar ways. Plus, it runs in the browser, and has nice styling for the interface that makes your unit tests look extra classy!

The QUnit site has a lot more examples and demos, but this concludes my shameless plug for a unit testing suite I’ve come to appreciate very much.

Git in Four Commands

Git is not a complicated tool for most things. I still find it a little tricky to set up for multiple users, but even that’s pretty easy. It really caters to the use-case where you’re just starting a project, or are the sole developer, and just want to keep track of changes and versions, make branches, etc.

The first major point of git is that everyone has their own copy of the repository. When you commit changes, you commit them to your local copy of the repository. If you are working on a group project, there is a shared resource that can be pushed to and pulled from, but I actually like that it takes an extra command to do that – it forces me to make sure that it’s what I actually want to do. Now, to the bare-bones commands:

  1. git init – Wherever you’re writing your code, type “git init”. It creates an empty repository in that directory (there’s a magical hidden folder “.git” that gets created and knows things).
  2. git status – Git knows which files have changed since you last saved changes, and it will happily tell you which files are new and changed with this command.
  3. git add – When you change files and are at a point where it makes sense to save changes to your code (a bug fix, a new feature, etc.), tell git which files you want to put in this commit with “git add”. If committing a set of changes with git is like shooting a gun, then adding files to be committed is like loading that gun. Git knows which files have changed, but it can make sense to group changed files into different logical commits. For example, if you fix two bugs between commits, you might want to add the changes for each bug fix separately.
  4. git commit – When you have added a bunch of changed files to be committed, now you’re ready to actually commit those changes. Type “git commit -m ‘A short, meaningful summary of the changes that happened.’”

There is a lot more to git, and a million tutorials that will explain things in more detail, but these are the commands I spend 95% of my time using, and enough to at least get you started tracking changes for anything and everything. It doesn’t matter very much which version control system you use – just use something.

Unfuddle

For a recent project, we were looking for an issue tracker. Our biggest criterion was that it be free – we weren’t planning on doing anything complicated, so the available issue trackers were essentially equal in our eyes. We ended up going with Unfuddle, but it wasn’t until after signing up that we discovered its biggest strength: an API.

There are other issue trackers with an API, but this one seemed reasonable enough, so we used it. The big get for us, building a web-based app, is that it lowers the barrier for reporting bugs. We set up our development server to include a bug-reporting button that collects a little report and sends it to our server, which uses the Unfuddle API to add a ticket. It also records and submits a little information about the user’s browser, current session, etc. that might help in debugging their problem.

Not the most complicated or profound system in the world, but it does put user-reported (and tester-reported) bugs in the same place we check our to-dos and other issues. No checking a support email account we remember once a month; it just puts bugs right where we’re looking anyway. I hope to share a little bit of the backend code soon!