Reality-based Interviewing

March 18, 2014 2 comments

The job market for senior software developers is very hot right now (as of early 2014), at least in Seattle.  I know of several companies in the area that have aggressive hiring plans for the next 12 months, often with the goal of doubling or tripling in size.  Most of these companies intend to hire mostly or exclusively senior developers because their long-term productivity is so much greater than the typical junior or just-out-of-school developer.  I find it remarkable, however, that these same companies often don’t have an interviewing strategy that’s designed to select the kind of candidates they’re actually looking for.

A fairly typical interview strategy consists of an introductory conversation focused on cultural fit and soft skills.  This is sometimes a panel interview but is relatively short.  The vast majority of time is spent in one-on-one sessions doing whiteboard algorithm coding problems.  These coding problems are typically small enough that there’s a possibility of completing them in an hour but also obscure or tricky enough that most candidates have a significant likelihood of messing up or not finishing.  The candidate has to understand the description of the problem, construct an algorithm, and write code for the algorithm on a whiteboard, all the while talking out loud to reveal his or her thinking process.  Google is the most famous for this style of interviewing, but Microsoft has been doing it for decades as well and the whole rest of the industry has pretty much followed the lead of the big dogs.

Broken By Design

So what’s the problem?  Well, according to Laszlo Bock, senior vice president of people operations at Google, it simply doesn’t work:

Years ago, we did a study to determine whether anyone at Google is particularly good at hiring. We looked at tens of thousands of interviews, and everyone who had done the interviews and what they scored the candidate, and how that person ultimately performed in their job. We found zero relationship. It’s a complete random mess, except for one guy who was highly predictive because he only interviewed people for a very specialized area, where he happened to be the world’s leading expert.

Why doesn’t it work?  The short answer is because what we ask people to do in interviews often has very little relationship to what we actually expect them to do on the job.  Maria Konnikova offers a longer answer in a New Yorker article:

The major problem with most attempts to predict a specific outcome, such as interviews, is decontextualization: the attempt takes place in a generalized environment, as opposed to the context in which a behavior or trait naturally occurs. Google’s brainteasers measure how good people are at quickly coming up with a clever, plausible-seeming solution to an abstract problem under pressure. But employees don’t experience this particular type of pressure on the job. What the interviewee faces, instead, is the objective of a stressful, artificial interview setting: to make an impression that speaks to her qualifications in a limited time, within the narrow parameters set by the interviewer. What’s more, the candidate is asked to handle an abstracted “gotcha” situation, where thinking quickly is often more important than thinking well. Instead of determining how someone will perform on relevant tasks, the interviewer measures how the candidate will handle a brainteaser during an interview, and not much more.

(Edit: she’s referring to non-coding brainteaser questions here, but many “coding” interview questions also fall into the toy brainteaser category.)

Why is it that we set out with the intent of hiring senior developers who are valuable specifically for their maturity, experience, and enormous depth of knowledge about how to build large software systems and end up evaluating them strictly on small-scale tactical coding exercises that would look right at home in any undergraduate homework set?  Why do we evaluate them in an artificial environment with horribly artificial tools?  Why is it so completely out of context?

Don’t get me wrong – it’s obviously important that the people we hire be intelligent, logical, and competent at tactical coding.  But tactical coding is only one part of what makes a great senior developer.  There are many vital technical skills that we don’t often explore.  When our interviews are artificial and one-dimensional we end up with poor-quality teams, because a healthy software team needs a variety of people with various strengths.  Selecting for only one narrow type of skill, even if it’s a useful skill, is a mistake.  There has to be a way to create more relevance between our interview questions and the palette of skills we’re actually looking for.

Focus on Reality

What is it that we do all day at our jobs?  We should take whatever that is and transfer it as directly as we possibly can into an interview setting.  If we’re not doing it all day every day we shouldn’t ask our candidates to do it either.

Let’s start with the venerable whiteboard: when was the last time any of us wrote more than a couple of lines of syntactically-correct code on a whiteboard outside of an interview setting?  That’s not a job skill we value, so don’t make candidates do it.  Give them a laptop to use, or better yet, allow them to bring their own, if they have one, so they have a familiar programming environment to work with.

Next, what kind of problems do we solve?  A few of us regularly invent brand new algorithmic concepts, be it in computer vision, artificial intelligence, or other “hard computer science” fields.  But let’s be honest – the vast majority of us spend our time doing more prosaic things.  We just move data from point A to point B, transforming it along the way, and hopefully without mangling it, losing parts of it, or going offline during the process.  That’s pretty much it.  Our success is measured in our ability to take old, crappy code, modify it to perform some additional business function, and (if we’re really talented) leave it in a slightly less old and slightly less crappy state than we found it.  Most of us will never invent any new algorithm from first principles that’s worth publishing in a journal.

This is the reality of day-to-day software development.  Small-scale tactical coding skills are expected to measure up to a certain consistent bar, but the key differentiator is the ability to write self-documenting, maintainable, bug-free code, and to design architectural systems that don’t force a wholesale rewrite every three years.  Clever binary tree algorithms we can get from the internet; a clean, supple codebase depends directly on the quality of the developers we hire and ultimately determines whether our companies succeed or fail.

How do those skills translate into an interview setting?  I think a refactoring exercise is a great way to accomplish this.  Give the candidate a piece of code that works correctly at the moment but is ugly and/or fragile, then ask them to extend the behavior of the code with an additional feature.  The new feature should be small, even trivial (this isn’t an algorithm test), because the challenge for the candidate is to add the new behavior without introducing any regression bugs while leaving the code in a much better state than they found it.  I’m sure we all have lots of great messes we can pull straight from our production codebases (don’t lie, you know you do!), but if you want an example, take a look at the Gilded Rose Kata (originally in C# but available in several other languages as well).

A few companies have expanded even further on this idea of reality-based interviewing.  They’ve done things like dropping the candidate into the team room for the entire day and having them pair with multiple team members on actual production code.  Other companies have given candidates a short-term contract job that can be completed in a week of evenings or a weekend.  The candidate gets paid a standard contracting rate for their time and the company gets either good code and a well-validated employee or at worst avoids a bad full-time hire.  Those techniques may have logistical problems for many companies, and they don’t scale to high volume very well, but every company ought to be able to come up with some way to ground their interviewing process more firmly in reality.

Edit: coincidentally, Daniel Blumenthal published a defense of the traditional whiteboard coding exercise just after I wrote this.  It’s an interesting counterpoint to what I’ve written here.  He wrote, “you don’t get to choose how you get interviewed.”  That is, of course, completely correct.  My argument is not that we should make interviews easier to pass, or lower our standards, but rather that we should construct our interviews to screen for the skills that we actually want our employees to have.  If you really need your people to solve “completely novel problems”, as Daniel wrote, then interview for that skill.  If you actually need other things, interview for those things.

Categories: Uncategorized

Merging Two Git Repositories Into One Repository Without Losing File History

January 22, 2013 9 comments

A while ago my team had code for our project spread out in two different Git repositories.  Over time we realized that there was no good reason for this arrangement and that it was just a general hassle and source of friction, so we decided to combine our two repositories into one repository containing both halves of the code base, with each of the old repositories in its own subdirectory.  However, we wanted to preserve all of the change history from each repo and have it available in the new repository.

The good news is that Git makes this sort of thing very easy to do.  Since a repository in Git is just a directed acyclic graph, it’s trivial to glue two graphs together and make one big graph.  The bad news is that there are a few different ways to do it and some of them end up with a less desirable result (at least for our purposes) than others.  For instance, do a web search on this subject and you’ll get a lot of information about git submodules or subtree merges, both of which are kind of complex and are designed for the situation where you’re trying to bring in source code from an external project or library and you want to bring in more changes from that project in the future, or ship your changes back to them.  One side effect of this is that when you import the source code using a subtree merge all of the files show up as newly added files.  You can see the history of commits for those files in aggregate (i.e. you can view the commits in the DAG) but if you try to view the history for a specific file in your sub-project all you’ll get is one commit for that file – the subtree merge.

This is generally not a problem for the “import an external library” scenario but I was trying to do something different.  I wanted to glue two repositories together and have them look as though they had been one repository all along.  I didn’t need the ability to extract changes and ship them back anywhere because my old repositories would be retired.  Fortunately, after much research and trial-and-error it turned out that it’s actually very easy to do what I was trying to do, and it requires just a couple of straightforward git commands.

The basic idea is that we follow these steps:

  1. Create a new empty repository New.
  2. Make an initial commit because we need one before we do a merge.
  3. Add a remote to old repository OldA.
  4. Merge OldA/master to New/master.
  5. Make a subdirectory OldA.
  6. Move all files into subdirectory OldA.
  7. Commit all of the file moves.
  8. Repeat steps 3-7 for OldB.

A PowerShell script for these steps might look like this:

# Assume the current directory is where we want the new repository to be created
# Create the new repository
git init

# Before we do a merge, we have to have an initial commit, so we’ll make a dummy commit
dir > deleteme.txt
git add .
git commit -m "Initial dummy commit"

# Add a remote for the old repo and fetch it
git remote add -f old_a <OldA repo URL>

# Merge the files from old_a/master into new/master
git merge old_a/master

# Clean up our dummy file because we don’t need it any more
git rm .\deleteme.txt
git commit -m "Clean up initial file"

# Move the old_a repo files and folders into a subdirectory so they don’t collide with the other repo coming later
mkdir old_a
dir -exclude old_a | %{git mv $_.Name old_a}

# Commit the move
git commit -m "Move old_a files into subdir"

# Do the same thing for old_b
git remote add -f old_b <OldB repo URL>
git merge old_b/master
mkdir old_b
dir -exclude old_a,old_b | %{git mv $_.Name old_b}
git commit -m "Move old_b files into subdir"

Very simple.  Now we have all the files from OldA and OldB in repository New, sitting in separate subdirectories, and we have both the commit history and the individual file history for all files.  (Since we did a rename, you have to use “git log --follow <file>” to see that history, but that’s true for any file rename operation, not just for our repo-merge.)
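For example, to check the history of one of the moved files (the path here is just illustrative):

# Show the full history of a file across its move into the old_a subdirectory
git log --follow old_a/SomeFile.cs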

Obviously you could instead merge old_b into old_a (which becomes the new combined repo) if you’d rather do that – modify the script to suit.
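If you go that route, a minimal sketch of the variant looks like this, reusing the same commands as above but run inside your existing old_a clone (the remote URL is a placeholder):

# Move old_a's own files into a subdirectory first so they won't collide
mkdir old_a
dir -exclude old_a | %{git mv $_.Name old_a}
git commit -m "Move old_a files into subdir"

# Then bring old_b in exactly as before
git remote add -f old_b <OldB repo URL>
git merge old_b/master
mkdir old_b
dir -exclude old_a,old_b | %{git mv $_.Name old_b}
git commit -m "Move old_b files into subdir"

No dummy commit is needed in this case because old_a already has history.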

If we have in-progress feature branches in the old repositories that also need to come over to the new repository, that’s also quite easy:

# Bring over a feature branch from one of the old repos
git checkout -b feature-in-progress
git merge -s recursive -Xsubtree=old_a old_a/feature-in-progress

This is the only non-obvious part of the whole operation.  We’re doing a normal recursive merge here (the “-s recursive” part isn’t strictly necessary because that’s the default), but we’re passing an option to the recursive strategy that tells Git we’ve moved the target into a subdirectory, which helps it line the histories up correctly.  This is not the same thing as the “subtree merge” mentioned earlier.

So, if you’re simply trying to merge two repositories together into one repository and make it look like it was that way all along, don’t mess with submodules or subtree merges.  Just do a few regular, normal merges and you’ll have what you want.

Categories: Uncategorized

The Windows Experience Index for your system could not be computed

December 14, 2012 1 comment

I recently had a problem with a Windows 8 computer where I couldn’t run the Windows Experience Index.  I had previously run the WEI just fine on this computer but all of a sudden it started giving me this error message:

[Screenshot of the “The Windows Experience Index for your system could not be computed” error dialog]

A cursory search online didn’t yield a solution but I did find out that WEI writes a log file to C:\Windows\Performance\WinSAT\winsat.log.  Looking in that log file showed me this at the end of the log:

338046 (4136) - exe\syspowertools.cpp:0983: > Read the active power scheme as '381b4222-f694-41f0-9685-ff5bb260df2e'
338046 (4136) - exe\main.cpp:2925: > power policy saved.
338078 (4136) - exe\syspowertools.cpp:1018: ERROR: Cannot set the current power scheme to '8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c': The instance name passed was not recognized as valid by a WMI data provider.
338078 (4136) - exe\main.cpp:2942: ERROR: Can't set high power state.
338078 (4136) - exe\processwinsaterror.cpp:0298: Unspecified error 29 occured.
338078 (4136) - exe\processwinsaterror.cpp:0319: Writing exit code, cant msg and why msg to registry
338078 (4136) - exe\syspowertools.cpp:1015: > Set the active power scheme to '381b4222-f694-41f0-9685-ff5bb260df2e'
338078 (4136) - exe\main.cpp:2987: > Power state restored.
338078 (4136) - exe\main.cpp:3002: > Successfully reenabled EMD.
338109 (4136) - exe\watchdog.cpp:0339: Watch dog system shutdown
338109 (4136) - exe\main.cpp:5341: > exit value = 29.

Ah! This computer is used by my entire family and I had been using the Local Group Policy Editor to lock down some settings that I didn’t want people to change, including the power management policy.  Apparently, if users can’t change the power management policy then WEI can’t change it either, and it gets grumpy about that.

The solution was to turn off enforcement of the active power management plan, run WEI (which now worked fine), then re-enable enforcement.
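If you want to verify that the power scheme can actually be switched again after relaxing the policy (which is the step WinSAT was choking on), a quick sanity check with powercfg from an elevated prompt looks something like this; the GUIDs are the Balanced and High Performance schemes from the log above:

# List the power schemes on the machine and show which one is active
powercfg /list
powercfg /getactivescheme

# Switch to High Performance and then back to Balanced (GUIDs from the log)
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c
powercfg /setactive 381b4222-f694-41f0-9685-ff5bb260df2e

If both /setactive calls succeed, WEI should be able to do the same thing.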

Categories: Uncategorized

Fixing the error “Unable to launch the IIS Express Web server. Failed to register URL. Access is denied.”

May 3, 2012 2 comments

I had a Visual Studio web project that was configured to use IIS Express on port 8080.  That should normally be fine and IIS Express is supposed to be able to run without administrator privileges, but when I would try to run the app I would get this error:

Unable to launch the IIS Express Web server.

Failed to register URL "http://localhost:8080/" for site "MySite" application "/". Error description: Access is denied. (0x80070005).

Launching Visual Studio with administrator credentials allowed it to run the application successfully, but that kind of defeated the purpose of using IIS Express in the first place.

It turns out that the problem in my case was that something else had previously created a URL reservation for port 8080.  I have no idea what did it, but running this command in PowerShell showed the culprit:

[C:\Users\Eric] 5/3/2012 2:39 PM
14> netsh http show urlacl | select-string "8080"

    Reserved URL            : http://+:8080/

The solution was to run this command in an elevated shell:

[C:\Users\Eric] 5/3/2012 2:39 PM
2> netsh http delete urlacl http://+:8080/

URL reservation successfully deleted

Now I can run my web app from VS without elevation.
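As an aside, if whatever created that reservation actually needs it, adding a second reservation scoped to your own account should also work instead of deleting the existing one (run from an elevated shell; DOMAIN\YourUser is a placeholder for your account):

netsh http add urlacl url=http://localhost:8080/ user=DOMAIN\YourUser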

Categories: Uncategorized

Fixing “Could not load file or assembly ‘Microsoft.SqlServer.Management.SqlParser’”

April 17, 2012 6 comments

It’s incredibly frustrating that this still happens, but it turns out that if you have Visual Studio 2010 and SQL Server Express 2008 (or R2) on your machine, and you uninstall SQL Server Express 2008 and install SQL Server Express 2012 instead, you’ll get an error trying to load database projects in Visual Studio 2010: “Could not load file or assembly ‘Microsoft.SqlServer.Management.SqlParser’”.

Why can’t SQL Server 2012 install the stuff it knows Visual Studio requires?  Fine, whatever.

The fix for this problem is the same as the last time I posted something like this:

  1. Locate your Visual Studio 2010 installation media.
  2. In the \WCU\DAC folder, you’ll find three MSIs: DACFramework_enu.msi, DACProjectSystemSetup_enu.msi, and TSqlLanguageService_enu.msi. Run and install each of them. (Possibly just the third one is required in this case.  I’m not sure.)
  3. Reapply Visual Studio 2010 SP1.
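If you’d rather drive step 2 from a command prompt, it might look something like this (assuming D: is the mounted Visual Studio 2010 installation media):

cd D:\WCU\DAC
msiexec /i DACFramework_enu.msi
msiexec /i DACProjectSystemSetup_enu.msi
msiexec /i TSqlLanguageService_enu.msi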

You should be back to a working state.

Categories: Uncategorized

I Hate #Regions

April 11, 2012 Leave a comment

This is mostly a note to myself so I don’t lose this awesome Visual Studio extension.

I hate C# #region tags in Visual Studio.  I hate opening a class file and then having to expand all the regions in order to read the code.  If you feel the need to use regions to make your class file more manageable, then it’s quite likely that your class is actually too big and desperately needs to be refactored.  Just . . . no.

But since not everyone is as enlightened as myself (ahem), sometimes I have to work with code that uses lots of #region tags.  In those cases where I can’t delete them all with extreme prejudice, I can at least install the excellent I Hate #Regions Visual Studio extension that will auto-expand regions whenever I open a file and will also display the #region tags in very small type so that they’re not as obnoxious.  Ah, relief!

Categories: Uncategorized

Eliminating Friction, Part 3: The Build Server

March 1, 2012 Leave a comment

Once we had a build script for Jumala we could do something about setting up a continuous integration build server.

Before the build server got going, we had several chronic problems that sapped time and morale from our team:

  • People would forget to include a new file with a commit or break the build in various other ways. This wouldn’t be discovered until someone else updated from source control and couldn’t build any more. That’s never fun.
  • Because there wasn’t any official build, the process of deploying builds to the live web sites was pretty much having someone build on his/her local machine then manually copy the resulting binaries to the web server. Every once in a while a bad build would go out with personal configuration options or uncommitted code changes that shouldn’t have been made public.  That’s never fun either.
  • As a result of the previous point, we could never tell exactly what build was running on the web servers at any given moment.  The best we could do was inspect timestamps on the files but that didn’t tell us exactly what commits were included in a build.

Needless to say, that wasn’t a great experience for our development team.  The solution was to set up a continuous integration build server that would sync, build, and report automatically after each commit.

Solving the problem

I downloaded and installed TeamCity from JetBrains.  TeamCity is one of those rare tools that manages to combine both ridiculously awesome ease of use and ridiculously powerful configurability into one package.  I highly recommend it if you’re looking for a CI build server.  JetBrains offers very generous licensing terms for TeamCity which makes it free to use for most small teams.  By the time your team is large enough to need more than what the free license provides, you won’t mind paying for it.

I set up a build configuration for our main branch in SVN, set it to build and run unit tests after every commit, and configured TeamCity to send email to all users on build failures. This way when someone breaks the build we know about it instantly and we know exactly what caused it.  Sure, we still have occasional build breaks but they’re not a big deal any more.

I extended our Rake build script to copy all binaries to a designated output folder in the source tree then configured TeamCity to grab that output folder and store it as the build artifacts.  Now when we want to deploy a build to our web servers, we can go to the TeamCity interface, download the build artifacts for a particular build, and deploy them without fear of random junk polluting them.  No more deploying personal dev builds!

I also set up a build numbering scheme for the TeamCity builds so that every build is stamped with a <major>.<minor>.<TeamCity build counter>.<SVN revision number> scheme.  This lets us unambiguously trace back any build binary to exactly where it came from and what it contains.
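In TeamCity this is just the “Build number format” setting on the build configuration’s General Settings page.  It ends up looking something like the line below, where the leading 1.4 is only an example and the last two pieces are TeamCity’s built-in parameters:

1.4.%build.counter%.%build.vcs.number%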

Finally, once the TeamCity build was working smoothly I created a template from the main branch build and now use that template to quickly define new build configurations for our release branches as we create them.  When we create a new release branch I copy the version number information from the main branch build definition to the new release branch build definition, then bump the minor version number and reset the TeamCity build counter on the main branch.  This way all builds from all branches are unambiguous and we know when and where they came from.

A dedicated build server has been considered a best practice for decades now, and a CI build server doing builds on every single commit has been a best practice for a long time as well, but small startup teams can sometimes get so caught up in building the product that they forget to attend to the necessary engineering infrastructure.  Setting up a build server isn’t a very sexy job but it pays huge dividends in reduced friction over time.  These days it’s amazingly easy to do it, too.  If your team doesn’t have a CI build server, you should get one right away.  Once you do you’ll wonder how you ever got along without it.

Categories: Development, Tools