Hail the Heroes - Denver Startup Week 2016 talk
If you were asked to identify the biggest killer of productivity for a software engineer, what would you say it is?
You might say “email” or “drop-in managers,” but those two things have the same root cause. Disruptions! The mother of all flow-killing workplace activities.
Disruptions can obliterate an engineer’s productivity. On average, developers only have two hours of disruption-free time per day.
We all know how this feels, even after a long day. You keep wondering, “What did I even do today?” If you find yourself asking that question a lot, I have great news for you. You need a hero.
Note: This is the post version of the talk given at Denver Startup Week 2016. See this post for the slides and information on the talk.
A little background
Before I answer what a hero is and how it can help you, let me give some background on where we started. I lead the SRE team at InVision and we were plagued with disruptions. Understandably, we had many days in which we didn’t know “where the time went.” We needed to remove these disruptions so that we could build the platform that the entire company depended on.
What was disrupting us? A quick, informal survey showed that we had a lot of the typical disruptions that most systems administrators or IT help desks have. That would be fine, but we weren’t solely a help desk; we were an SRE group. Building the infrastructure that would power the future of our exciting SaaS offering was being neglected while we were helping with printers.
When we looked at how critical the disruptions were, the results were fairly clear: the requests behind them had to be handled, but they were nonetheless killing our productivity.
We solved the problem by leveraging a concept that we already were familiar with: being on call. Except this wasn’t the kind of on call that wakes you up at 3am with that horrible PagerDuty voice. This was a different kind of on call and we wanted to set that apart while giving the role a dose of dignity and awesomeness. The result is what we call our Heroes!
What is a hero?
Our hero isn’t the kind that wears tight pants while flying around. Our hero is the team’s representative to the rest of the company. Our hero lets the rest of us focus on making progress with our forward-thinking work, knowing that the disruptions are being handled.
The hero role rotates like an on call rotation would and it also has a few other expectations. As a hero:
- You’re responsible for the random questions and small change requests from the organization as a whole.
- You’re responsible for automating the issues/questions that come up.
The first responsibility of a hero helps a little, but it doesn’t go very far on its own. As the organization grows, without the second responsibility the volume of hero work only keeps growing until it overwhelms the hero. The second responsibility is key. With it, we can decrease both the raw number of disruptions and the time it takes to complete them.
Since we wanted to approach the hero role as an experiment and only roll out an MVP until we were sure it would work, we wanted to keep it very light on the implementation side. What we ended up with is a fully SaaS solution that serves its purpose while adding very little overhead for either the requesting teams or the hero.
To start, we use Slack as our primary communication tool. One of the things that Slack gives you is the ability to define a “slash” command that will post to any endpoint that you give it. The slash command “/hail-hero” is the first part of our hero workflow. The receiving end of the web hook that the slash command posts to is hosted on Zapier. By using Zapier we can tie the receiving of that web hook into our other tools, most notably JIRA and PagerDuty.
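To make the first hop of that workflow concrete, here is a minimal sketch of what the receiving end of a slash command has to do. We let Zapier handle this, so none of the code below is ours. It only assumes that Slack sends slash commands as a form-encoded request body with fields such as `command`, `text`, `user_name`, and `channel_name`. The function name `parse_hail` and the shape of the ticket dictionary are hypothetical.

```python
from urllib.parse import parse_qs


def parse_hail(body: str) -> dict:
    """Turn a form-encoded slash-command body into a ticket-like dict.

    Hypothetical helper: the real workflow uses Zapier as the glue,
    and the ticket fields below are illustrative, not a JIRA schema.
    """
    # parse_qs returns a list per key; keep the first value of each.
    fields = {key: values[0] for key, values in parse_qs(body).items()}

    if fields.get("command") != "/hail-hero":
        raise ValueError(f"unexpected command: {fields.get('command')!r}")

    text = fields.get("text", "").strip()
    if not text:
        raise ValueError("a hero request needs a description")

    return {
        "summary": text[:80],  # short title for the ticket
        "description": text,   # full request as typed in Slack
        "requester": fields.get("user_name", "unknown"),
        "channel": fields.get("channel_name", "unknown"),
    }


if __name__ == "__main__":
    sample = (
        "command=%2Fhail-hero&text=please+reset+my+VPN+access"
        "&user_name=jane&channel_name=general"
    )
    print(parse_hail(sample))
```

Whatever sits behind the webhook, the useful part is the same: capture who asked, where they asked, and what they asked for, so the request can become a tracked ticket instead of a tap on the shoulder.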
Why both JIRA and PagerDuty? We use JIRA day in and day out for tracking our other work, so it was a natural fit. Also, by using JIRA we can create and link any other work that would take more time than we wanted to allocate in the hero process. PagerDuty was used because it helps us easily create and use a rotation of users who will be on hero duty. As an added benefit, PagerDuty can escalate hero tickets if they aren’t being responded to in a timely fashion.
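For anyone without a scheduling tool, the two things we lean on PagerDuty for can be sketched in a few lines. This is an illustration of the idea, not how PagerDuty works internally. The weekly shift length, the 30-minute timeout, the team names, and the function names `hero_on_duty` and `should_escalate` are all assumptions made up for the example.

```python
from datetime import date, datetime, timedelta


def hero_on_duty(team: list[str], start: date, today: date,
                 shift_days: int = 7) -> str:
    """Pick the current hero from a simple round-robin rotation.

    `start` is the first day of the first shift. The 7-day shift
    length is an assumed default.
    """
    shifts_elapsed = (today - start).days // shift_days
    return team[shifts_elapsed % len(team)]


def should_escalate(opened_at: datetime, now: datetime, acknowledged: bool,
                    timeout: timedelta = timedelta(minutes=30)) -> bool:
    """Escalate a hero ticket nobody has acknowledged within `timeout`.

    The 30-minute default is an assumption for illustration.
    """
    return not acknowledged and now - opened_at >= timeout


if __name__ == "__main__":
    team = ["ana", "ben", "cy"]  # hypothetical team members
    first_shift = date(2016, 9, 5)
    print(hero_on_duty(team, first_shift, date(2016, 9, 14)))

    opened = datetime(2016, 9, 14, 10, 0)
    print(should_escalate(opened, datetime(2016, 9, 14, 10, 45), False))
```

A hosted tool adds the parts that are tedious to build yourself, such as shift swaps, overrides, and notifications, which is why we did not write this ourselves.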
Overall, the hero system has worked phenomenally well. Business continuity is being maintained and our team has a sense of focus. We are making progress on our most substantial goals. Beyond its success for the SRE team, four other teams have adopted this system, and more are looking to get their own heroes.
Since this has been a huge success, we’re looking at moving away from using Zapier as the logical glue and moving towards integrating this into a chat bot that we use for other business functions.
If you have any questions or are thinking about implementing ways in which you want your engineers to focus, please reach out to me on Twitter @jdowdle.