An Elegant Puzzle - William Larson

Note: While reading a book whenever I come across something interesting, I highlight it on my Kindle. Later I turn those highlights into a blogpost. It is not a complete summary of the book. These are my notes which I intend to go back to later. Let’s start!

Organizational design gets the right people in the right places, empowers them to make decisions, and then holds them accountable for their results. Maintained consistently and changed sparingly, nothing else will help you scale more.
You’ll find yourself sizing teams during reorganizations to accommodate growth from hiring, and when considering how to support new projects. It’ll be an unusual month that you won’t consider some aspect of team design.
The guiding principles I use for sizing teams are: Managers should support six to eight engineers This gives them enough time for active coaching, coordinating, and furthering their team’s mission by writing strategies, leading change, so on.
Tech Lead Managers (TLMs). Managers supporting fewer than four engineers tend to function as TLMs, taking on a share of design and implementation work. For some folks this role can uniquely leverage their strengths, but it’s a role with limited career opportunities. To progress as a manager, they’ll want more time to focus on developing their management skills. Alternatively, to progress toward staff engineering roles, they’ll find it difficult to spend enough time on the technical details. Coaches. Managers supporting more than eight or nine engineers typically act as coaches and safety nets for problems. They are too busy to actively invest in their team or their team’s area of responsibility. It’s reasonable to ask managers to support larger teams during the transition to a more stable configuration, but it is a bad status quo. Managers-of-managers should support four to six managers This gives them enough time to coach, to align with stakeholders, and to do a reasonable amount of investment in their organization. On the other hand, it will also keep them busy enough that they won’t be tempted to create work for their team. Ramping up. Managers supporting fewer than four other managers should be in a period of active learning on either the problem domain or on transitioning from supporting engineers to supporting managers. In the steady state, this can lead to folks feeling underutilized, or being tempted to meddle in daily operations. Coaches. Similar to supporting a large team of engineers, supporting a large team of managers leaves you functioning purely as a problem-solving coach.
On-call rotations want eight engineers For production on-call responsibilities, I’ve found that two-tier 24/7 support requires eight engineers. As teams holding their own pagers have become increasingly mainstream, this has become an important sizing constraint, and I try to ensure that every engineering team’s steady state is eight people. Shared rotations. It is sometimes necessary to pool multiple teams together to reach the eight engineers necessary for a 24/7 on-call rotation. This is an effective intermediate step toward teams owning their own on-call rotations, but it is not a good long-term solution. Most folks find being on-call for components that they’re unfamiliar with to be disproportionately stressful. Small teams (fewer than four members) are not teams I’ve sponsored quite a few teams of one or two people, and each time I’ve regretted it. To repeat: I have regretted it every single time. An important property of teams is that they abstract the complexities of the individuals that compose them. Teams with fewer than four individuals are a sufficiently leaky abstraction that they function indistinguishably from individuals. To reason about a small team’s delivery, you’ll have to know about each on-call shift, vacation, and interruption. They are also fragile, with one departure easily moving them from innovation back into toiling to maintain technical debt.
Keep innovation and maintenance together. A frequent practice is to spin up a new team to innovate while existing teams are bogged down in maintenance. I’ve historically done this myself, but I’ve moved toward innovating within existing teams. This requires very deliberate decision-making and some bravery, but in exchange you’ll get higher morale and a culture of learning, and will avoid creating a two-tiered class system of innovators and maintainers.
Fitting together those guiding principles, the playbook that I’ve developed is surprisingly simple and effective: Teams should be six to eight during steady state. To create a new team, grow an existing team to eight to ten, and then bud into two teams of four or five. Never create empty teams. Never leave managers supporting more than eight individuals.
Teams are slotted into a continuum of four states: A team is falling behind if each week their backlog is longer than it was the week before. Typically, people are working extremely hard but not making much progress, morale is low, and your users are vocally dissatisfied. A team is treading water if they’re able to get their critical work done, but are not able to start paying down technical debt or begin major new projects. Morale is a bit higher, but people are still working hard, and your users may seem happier because they’ve learned that asking for help won’t go anywhere. A team is repaying debt when they’re able to start paying down technical debt, and are beginning to benefit from the debt repayment snowball: each piece of debt you repay leads to more time to repay more debt. A team is innovating when their technical debt is sustainably low, morale is high, and the majority of work is satisfying new user needs. Teams want to climb from falling behind to innovating, while entropy drags them backward. Each state requires a different tact.
When the team is falling behind, the system fix is to hire more people until the team moves into treading water. Provide tactical support by setting expectations with users, beating the drum around the easy wins you can find, and injecting optimism. As a caveat, the system fix is to hire net new people, increasing the overall capacity of the company. Sometimes people instead attempt to capture more resources from the existing company, and I’m pretty negative on that. People are not fungible, and generally folks end up in useful places, so I’m skeptical of reassigning existing individuals to drive optimality. By nature, it’s also impossible for this kind of discussion to not become political, even when everyone involved has deep trust in and respect for each other. When the team is treading water, the system fix is to consolidate the team’s efforts to finish more things, and to reduce concurrent work until they’re able to begin repaying debt (e.g., limit work in progress). Tactically, the focus here is on helping people transition from a personal view of productivity to a team view. When the team is repaying debt, the system fix is to add time. Everything is already working, you just need to find space to allow the compounding value of paying down technical debt to grow. Tactically try to find ways to support your users while also repaying debt, to avoid disappearing into technical debt repayment from your users’ perspective. Especially for a team that started out falling behind and is now repaying debt, your stakeholders are probably antsy waiting for the team to start delivering new stuff, and your obligation is to prevent that impatience from causing a backslide! Innovating is a bit different, because you’ve nominally reached the end of the continuum, but there is still a system fix! In this case, it’s to maintain enough slack in your team’s schedule that the team can build quality into their work, operate continuously in innovation, and avoid backtracking. Tactically, ensure that the work your team is doing is valued: the quickest path out of innovation is to be viewed as a team that builds science projects, which inevitably leads to the team being defunded. I can’t stress enough that these fixes are slow. This is because systems accumulate months or years of static, and you have to drain that all away. Conversely, the same properties that make these fixes slow to fix make them extremely durable once in effect! The hard part is maintaining faith in your plan—both your faith and the broader organization’s faith. At some point, you may want to launder accountability through a reorg, or maybe skip out to a new job, but if you do that you’re also skipping the part where you get to learn. Stay the path.
As an organizational leader, you’ll be dealing with a number of teams, each of which is in a different place on this continuum. You’ll also have limited resources to apply, and they’ll usually be insufficient to simultaneously move every team down the continuum. Many folks try to move all teams at the same time, peanut buttering their limited resources, but resist that indecision-framed-as-fairness: it’s not a fair outcome if no one gets anything. For each constraint, prioritize one team at a time. If most teams are falling behind, then hire onto one team until it’s staffed enough to tread water, and only then move to the next. While this is true for all constraints, it’s particularly important for hiring. Adding new individuals to a team disrupts that team’s gelling process, so I’ve found it much easier to have rapid growth periods for any given team, followed by consolidation/gelling periods during which the team gels. The organization will never stop growing, but each team will.
Fundamentally, I believe that sustained productivity comes from high-performing teams, and that disassembling a high-performing team leads to a significant loss of productivity, even if the members are fully retained. In this worldview, high-performing teams are sacred, and I’m quite hesitant to disassemble them. Teams take a long time to gel. When a group has been working together for a few years, they understand each other and know how to set each other up for success in a truly remarkable way. Shifting individuals across teams can reset the clock on gelling, especially for teams in the early stages of gelling, and when there are significant differences in team culture. That’s not to say that you want teams to never change—that leads to stagnation—but perhaps preserving a team’s gelled state requires restrained growth.
Another reason that I lean away from moving folks off high-performing teams is that most teams have high fixed costs and relatively small variable costs: moving one person can shift an innovating team back into falling behind, and now neither team is doing particularly well. This is especially true on teams responsible for products and services. My rule of thumb is that it takes eight engineers on a team to support a two-tier on-call rotation, so I’m generally reluctant to move any team with membership below that line. However, fixed costs come in many other varieties: “keeping the lights on” work, precommitted contracts, support questions from other teams, etc. There are some teams with very low fixed costs—a startup without any users, a team supporting a product that you’ve turned off entirely—and I suspect that the rules for those teams are different. I also suspect that such teams are quite uncommon in successful companies.
The premise of moving folks to optimize global efficiency also implies a deeper understanding of how productivity is generated than I’ve ever personally achieved. I’m a strong believer in not adding more resources to a team with visible slack, but I’m less convinced that the inverse applies. The expected time to complete a new task approaches infinity as a team’s utilization approaches 100 percent, and most teams have many dependencies on other teams. Together, these facts mean you can often slow a team down by shifting resources to it, because doing so creates new upstream constraints. In further defense of slack, I find that teams put spare capacity to great use by improving areas within their aegis, in both incremental and novel ways. As a bonus, they tend to do these improvements with minimal coordination costs, such that the local productivity doesn’t introduce drag on the surrounding system. Most importantly, “slackful” teams function as an organizational debugger: you don’t have to consider them when debugging the overall organizational throughput. I’ve found it much easier to work a couple constraints at a time, solving forward without needing to revisit previous constraints. Okay, so what does work? I’ve found it most fruitful to move scope between teams, preserving the teams themselves. If a team has significant slack, then incrementally move responsibility to them, at which point they’ll start locally optimizing their expanded workload. It’s best to do this slowly to maintain slack in the team, but if it’s a choice of moving people rapidly or shifting scope rapidly, I’ve found that the latter is more effective and less disruptive.
Shifting scope works better than moving people because it avoids re-gelling costs, and it preserves system behavior.
The other approach that I’ve seen work well is to rotate individuals for a fixed period into an area that needs help. The fixed duration allows them to retain their identity and membership in their current team, giving their full focus to helping out, rather than splitting their focus between performing the work and finding membership in the new team. This is also a safe way to measure how much slack the team really has!
When I started at Uber, we were almost 1,000 employees and were doubling the headcount every six months. An old-timer summarized their experience as: “We’re growing so quickly that every six months we’re a new company.” A bystander quickly added a corollary: “Which means our process is always six months behind our head count.” Helping my team be successful when a defunct process merges with a constant influx of new engineers and system load has been one of the most rewarding opportunities I’ve had in my career. This is an attempt to explore the challenges and propose some strategies I’ve seen for mitigating and overcoming them.
Productively integrating large numbers of engineers is hard. Just how challenging this is depends on how quickly you can ramp engineers up to self-sufficient productivity, but if you’re doubling every six months and it takes six to twelve months to ramp up, then you can quickly find a scenario in which untrained engineers increasingly outnumber the trained engineers, and each trained engineer is devoting much of their time to training a couple of newer engineers.
Imagine a scenario in which training a single new engineer takes about 10 hours per week from each trained engineer, and in which untrained engineers are one-third as productive as trained engineers. The result is the right-hand chart’s (admittedly, pretty worst-case scenario) ratio of two-untrained-to-one-trained. Worse, for those three people you’re only getting the productivity of 1.16 trained engineers (2 x .33 for the untrained engineers plus .5 x 1 for the trainer). You also need to factor in the time spent on hiring. If you’re trying to double every six months, and about 10 percent of candidates undergoing phone screens eventually join, then you need to do ten interviews per existing engineer in that time period, with each interview taking about two hours to prep, perform, and debrief. That’s less than four hours per engineer per month if you can leverage your entire existing team, but training comes up again here: if it takes you six months to get the average engineer onto your interview loop, each trained engineer is now doing three to four hours of hiring-related work per week, and your trained engineers are down to approximately 0.4 efficiency. The overall team is getting 1.06 engineers’ worth of work out of every three engineers. It’s not just training and hiring, though: For every additional order of magnitude of engineers, you need to design and maintain a new layer of management. For every ~10 engineers, you need an additional team, which requires more coordination.14 Each engineer means more commits and deployments per day, creating load on your development tools. Most outages are caused by deployments, so more deployments drive more outages, which in turn require incident management, mitigations, and postmortems. Having more engineers leads to more specialized teams and systems, which require increasingly small on-call rotations so that your on-call engineers have enough system context to debug and resolve production issues. Consequently, relative time invested in on-call goes up. Let’s do a bit more handwavy math to factor these in. Only your trained engineers can go on-call. They’re on-call one week a month, and are busy about half their time on-call. So that’s a total impact of five hours per week for your trained engineers, who are now down to 0.275 efficiency, and your team is now getting less than the output of a single trained engineer for every three engineers you’ve hired.
Understanding the overall impact of increased load comes down to a few important trends: Most system-implemented systems are designed to support one to two orders’ magnitude of growth from the current load. Even systems designed for more growth tend to run into limitations within one to two orders of magnitude. If your traffic doubles every six months, then your load increases an order of magnitude every 18 months. (And sometimes new features or products cause load to increase much more quickly.) The cardinality of supported systems increases over time as you add teams, and as “trivial” systems go from unsupported afterthoughts to focal points for entire teams as the systems reach scaling plateaus (things like Apache Kafka, mail delivery, Redis, etc.). If your company is designing systems to last one order of magnitude and is doubling every six months, then you’ll have to re-implement every system twice every three years. This creates a great deal of risk—almost every platform team is working on a critical scaling project—and can also create a great deal of resource contention to finish these concurrent rewrites.
The real productivity killer is not system rewrites but the migrations that follow those rewrites. Poorly designed migrations expand the consequences of this rewrite loop from the individual teams supporting the systems to the entire surrounding organization. If each migration takes a week, each team is eight engineers, and you’re doing four migrations a year, then you’re losing about 1 percent of your company’s total productivity. If each of those migrations takes closer to a month, or if they are only possible for your small cadre of trained engineers—whose time is already tightly contended for—then the impact becomes far more pronounced. There is a lot more that could be said here—companies that mature rapidly often have tight and urgent deadlines around pursuing various critical projects, and around moving to multiple data centers, to active-active designs, and to new international regions—but I think we’ve covered our bases on how increasing system load can become a drag on overall engineering throughput.
When your company has decided that it is going to grow, you cannot stop it from growing, but, on the other hand, you absolutely can concentrate that growth, such that your teams alternate between periods of rapid hiring and periods of consolidation and gelling. Most teams work best when scoped to approximately eight engineers, as each team gets to that point, you can move the hiring spigot to another team (or to a new team). As the post-hiring team gels, eventually the entire group will be trained and able to push projects forward.
You can do something similar on an individual basis, rotating engineers off of interviewing periodically to give them time to recuperate. With high interview loads, you’ll sometimes notice last year’s solid interviewer giving a poor experience to a candidate or rejecting every incoming candidate. If your engineer is doing more than three interviews a week, it is a useful act of mercy to give them a month off every three or four months. One specific tool that I’ve found extremely helpful here is an ownership registry, which allows you to look up who owns what, eliminating the frequent “Who owns X?” variety of question. You’ll need this sort of thing to automate paging the right on-call rotation, so you might as well get two useful tools out of it!
In my opinion, probably the most important opportunity is designing your software to be flexible. I’ve described this as “fail open and layer policy”; the best system rewrite is the one that didn’t happen, and if you can avoid baking in arbitrary policy decisions that will change frequently over time, then you are much more likely to be able to keep using a system for the long term.
If you’re going to have to rewrite your systems every few years due to increased scale, let’s avoid any unnecessary rewrites, ya know? Along these lines, if you can keep your interfaces generic, then you are able to skip the migration phase of system re-implementation, which tends to be the longest and trickiest phase, and you can iterate much more quickly and maintain fewer concurrent versions. There is absolutely a cost to maintaining this extra layer of indirection, but if you’ve already rewritten a system twice, take the time to abstract the interface as part of the third rewrite and thank yourself later. (By the time you’d do the fourth rewrite, you’d be dealing with migrating six times as many engineers.)
Finally, a related antipattern is the gatekeeper pattern. Having humans who perform gatekeeping activities creates very odd social dynamics, and is rarely a great use of a human’s time. When at all possible, build systems with sufficient isolation that you can allow most actions to go forward. And when they do occasionally fail, make sure that they fail with a limited blast radius. There are some cases in which gatekeepers are necessary for legal or compliance reasons, or because a system is deeply frail, but I think that we should generally treat gatekeeping as a significant implementation bug rather than as a stability feature to be emulated.
Typically, my organizational philosophy is to stabilize team-by-team and organization-by-organization. Ensuring any given area is well on the path to health before moving my focus. I try not to push risks onto teams that are functioning well. You do need to delegate some risks, but generally I think it’s best to only delegate solvable risk. If something simply isn’t likely to go well, I think it’s best to hold the bag yourself. You may be the best suited to manage the risk, but you’re almost certainly the best positioned to take responsibility. As an organizational leader, you’ll always have a portfolio of risk, and you’ll always be doing very badly at some things that are important to you. That’s not only okay, it’s unavoidable.
Two or three years into a role, you may find that your personal rate of learning has trailed off. You know your team well, the industry particulars are no longer quite as intimidating, and you have solved the mystery of getting things done at your company. This can be a sign to start looking for your next role, but it’s also a great opportunity to build experience with succession planning. Succession planning is thinking through how the organization would function without you, documenting those gaps, and starting to fill them in. It’s awkward enough to talk about that it doesn’t get much discussion, but it’s a foundational skill for building an enduring organization.
The first step in succession planning is to figure out what you do. This seems like it should be easy, but I’ve found it surprisingly hard! There are the obvious things you do—one-on-ones, meetings, head count planning—but you’re probably filling in a hundred little holes that you don’t even think about. The approach I’ve taken is to consider your work from several different angles. Take a look at your calendar and write down your role in meetings. This goes for explicit roles, like owning a meeting’s agenda, and also for more nuanced roles, like being the first person to champion others’ ideas, or the person who is diplomatic enough to raise difficult concerns. Take a second pass on your calendar for non-meeting stuff, like interviewing and closing candidates. Look back over the past six months for recurring processes, like roadmap planning, performance calibrations, or head count decisions, and document your role in each of those processes. For each of the individuals you support, in which areas are your skills and actions most complementary to theirs? How do you help them? What do they rely on you for? Maybe it’s authorization, advice navigating the organization, or experience in the technical domain. Audit inbound chats and emails for requests and questions coming your way. If you keep a to-do list, look at the categories of the work you’ve completed over the past six months, as well as the stuff you’ve been wanting to do but keep putting off. Think through the external relationships that have been important for you in your current role. What kinds of folks have been important, and who are the strategic partners that someone needs to know? After exploring each of these avenues, you’ll have quite a long list of things. Test the list on a few folks whom you work closely with and see if you’ve missed anything. Congratulations, now you know what your job is!
Take your list, and for each item try to identify the individuals who could readily take on that work. Good job, cross those out. For items without someone who is ready today, identify a handful of individuals who could potentially take it over. (Depending on the size of your list, it may be helpful to cluster similar items into groups to reduce the toil of running this exercise.) If you’re working at a well-established company, you may find that there aren’t too many gaps that couldn’t be readily filled by someone else. However, if you’re at a company going through hypergrowth, it’s common to find that everyone is already working in the most complex role of their career, and you’ll uncover gaps, gaping and cavernous. Filter the gaps down to two lists: The first should cover the easiest gaps to close. Maybe it’ll require a written document or a quick introduction. You should be able to close one of these in less than four hours. The latter will be the riskiest gaps. These are the areas where you’re uniquely valuable to the company, where other folks are missing skills, and where getting the tasks done is truly important. You’d expect closing one of these gaps to require ongoing effort over several months. Write up a plan to close all of the easy gaps and one or two of the riskiest gaps. Add it to your personal goals, and then, congrats, you’ve completed a round of succession planning!
This isn’t a one-time tool, but rather a great exercise to run once a year to identify things you could be delegating. This helps nurture an enduring organization, and also frees up time for you to continue growing into a larger role as well. You can even get a sense of how well you’re doing by taking a two- or three-week vacation and seeing what slips through the cracks. Those items can be the start of next year’s list!
Many effective leaders I’ve worked with have the uncanny knack for working on leveraged problems. In some problem domains, the product management skill set is extraordinarily effective for identifying useful problems, but systems thinking is the most universally useful tool kit I’ve found.
I’ve spent a lot of time pondering the authors’ definition of velocity. They focus on four measures of developer velocity: Delivery lead time is the time from the creation of code to its use in production. Deployment frequency is how often you deploy code. Change fail rate is how frequently changes fail. Time to restore service is the time spent recovering from defects.
Let’s see if we can model them into a system that we can use to reason about developer productivity: Pull requests are converted into ready commits based on our code review rate. Ready commits convert into deployed commits at deploy rate. Deployed commits convert into incidents at defect rate. Incidents are remediated into reverted commits at recovery rate. Reverted commits are debugged into new pull requests at debug rate. Linking these pieces together, we see a feedback loop, in which the system’s downstream behavior impacts its upstream behavior. With a sufficiently high defect rate or slow recovery rate, you could easily see a world where each deploy leaves you even further behind.
If you don’t have a backlog of ready commits, then speeding up your deploy rate may not be valuable. Likewise, if your defect rate is very low, then reducing your recovery time will have little impact on the system.
Once you start thinking about systems, you’ll find that it’s hard to stop. Pretty much any difficult problem is worth trying to represent as a system, and even without numbers plugged in I find them powerful thinking aids.
Strategies are grounded documents which explain the trade-offs and actions that will be taken to address a specific challenge. Visions are aspirational documents that enable individuals who don’t work closely together to make decisions that fit together cleanly.
Good goals are a composition of four specific kinds of numbers: A target states where you want to reach. A baseline identifies where you are today. A trend describes the current velocity. A time frame sets bounds for the change. Put these all together, and a well-structured goal takes the form of: “In Q3, we will reduce time to render our frontpage from 600ms (p95) to 300ms (p95). In Q2, render time increased from 500ms to 600ms.”
There are two particularly interesting kinds of goals: investments and baselines. Investments describe a future state that you want to reach, and baselines describe aspects of the present that you want to preserve. Imagine that you wanted to speed up your data pipeline. Your goal might be, “Core batch jobs should finish within three hours (p95) by the end of Q3. They currently take six hours (p95), and over the course of Q2 they got two hours slower.” This is a well-structured goal, but it’s also incomplete because you could likely reach that goal tomorrow by doubling the size of your cluster, which is probably not a desirable outcome. The best way to avoid such unintended outcomes is to pair your investment goals with baseline metrics, sometimes referred to as countervailing metrics. For the data pipeline example, a few of the baseline metrics might be: Efficiency of running core batch jobs should not exceed current price of $0.05 per GB. Core batch jobs should not increase alert load on teams operating or using the pipeline, which are currently alerting twice per week.
Infrastructure cost is a great example of a baseline metric. When you’re asked to take responsibility for a company’s overall infrastructure costs, you’re going to start from a goal along the lines of “Maintain infrastructure costs at their current percentage of net revenue of 30 percent.” (That percentage is a fictional number for this example’s purposes, since the percentage will depend on your industry and maturity, but I have found that tying it against net revenue is more useful than pinning it at a specific dollar amount.)
Migrations are both essential and frustratingly frequent as your codebase ages and your business grows: most tools and processes only support about one order of magnitude of growth before becoming ineffective, so rapid growth makes migrations a way of life. This isn’t because you have bad processes or poor tools—quite the opposite. The fact that something stops working at significantly increased scale is a sign that it was designed appropriately to the previous constraints rather than being over-designed.
Lore tells us that Googlers have a phrase, “running to stand still,” to describe a team whose entire capacity is consumed in upgrading dependencies and patterns, such that the group can’t make forward progress on the product/system they own. Spending all your time on migrations is extreme, but every midsize company has a long queue of migrations that it can’t staff: moving from VMs to containers, rolling out circuit-breaking, moving to the new build tool . . . the list extends effortlessly into the sunset.
The good news is that while migrations are hard, there is a pretty standard playbook that works remarkably well: de-risk, enable, then finish. De-risk: The first phase of a migration is de-risking it, and to do so as quickly and cheaply as possible. Write a design document and shop it with the teams that you believe will have the hardest time migrating. Iterate. Shop it with teams who have atypical patterns and edge cases. Iterate. Test it against the next six to twelve months of roadmap. Iterate. After you’ve evolved the design, the next step is to embed into the most challenging one or two teams, and work side by side with those teams to build, evolve, and migrate to the new system. Don’t start with the easiest migrations, which can lead to a false sense of security. Effective de-risking is essential, because each team who endorses a migration is making a bet on you that you’re going to get this damn thing done, and not leave them with a migration to an abandoned system that they have to revert to. If you leave one migration partially finished, people will be exceedingly suspicious of participating in the next. Enable: Once you’ve validated the solution that solves the intended problem, it’s time to start sharpening your tools. Many folks start migrations by generating tracking tickets for teams to implement, but it’s better to slow down and build tooling to programmatically migrate the easy 90 percent. This radically reduces the migration’s cost to the broader organization, which increases the organization’s success rate and creates more future opportunities to migrate. Once you’ve handled as much of the migration programmatically as possible, figure out the self-service tooling and documentation that you can provide to allow teams to make the necessary changes without getting stuck. The best migration tools are incremental and reversible: folks should be able to immediately return to previous behavior if something goes wrong, and they should have the necessary expressiveness to de-risk their particular migration path. Documentation and self-service tooling are products, and they thrive under the same regime: sit down with some teams and watch them follow your instructions, then improve them. Find another team. Repeat. Spending an extra two days intentionally making your documentation clean and your tools intuitive can save years in large migrations. Do it! Finish: The last phase of a migration is deprecating the legacy system that you’ve replaced. This requires getting to 100 percent adoption, and that can be quite challenging. Start by stopping the bleeding, which is ensuring that all newly written code uses the new approach. That can be installing a ratchet in your linters, or updating your documentation and self-service tooling. This is always the first step, because it turns time into your friend. Instead of falling behind by default, you’re now making progress by default. Okay, now you should start generating tracking tickets, and set in place a mechanism which pushes migration status to teams that need to migrate and to the general management structure. It’s important to give wider management context around migrations because the managers are the people who need to prioritize the migrations: if a team isn’t working on a migration, it’s typically because their leadership has not prioritized it. At this point, you’re pretty close to complete, but you have the long tail of weird or unstaffed work. Your tool now is: finish it yourself. It’s not necessarily fun, but getting to 100 percent is going to require the team leading the migration to dig into the nooks and crannies themselves. My final tip for finishing migrations centers around recognition. It’s important to celebrate migrations while they’re ongoing, but the majority of the celebration and recognition should be reserved for their successful completion. In particular, starting but not finishing migrations often incurs significant technical debt, so your incentives and recognition structure should be careful to avoid perverse incentives.
I believe that, at quickly growing companies, there are two managerial skills that have a disproportionate impact on your organization’s success: making technical migrations cheap, and running clean reorganizations. Do both well, and you can skip that lovely running-to-stand-still sensation, and invest your attention more fruitfully.
My approach for planning organization change: Validate that organizational change is the right tool. Project headcount a year out. Set target ratio of management to individual contributors. Identify logical teams and groups of teams. Plan staffing for the teams and groups. Commit to moving forward. Roll out the change.
There are two best kinds of reorganizations: The one that solves a structural problem. The one that you don’t do.
There is only one worst kind of reorg: the one you do because you’re avoiding a people management issue.
My checklist for ensuring that a reorganization is appropriate: Is the problem structural? Organization change offers the opportunity to increase communication, reduce decision friction, and focus attention; if you’re looking for a different change, consider if there’s a more direct approach. Are you reorganizing to work around a broken relationship? Management is a profession where karma always comes due, and you’ll be better off addressing the underlying issue than continuing to work around it. Does the problem already exist? It’s better to wait until a problem actively exists before solving it, because it’s remarkably hard to predict future problems. Even if you’re right that the problem will occur, you may end up hitting a different problem first. Are the conditions temporary? Are you in a major crunch period or otherwise doing something you don’t anticipate doing again? If so, then it may be easier to patch through and rethink on the other side, and avoid optimizing for a transient failure mode.
The first step of designing the organization is determining its approximate total size. I recommend reasoning through this number from three or four different directions: 1. An optimistic number based on what’s barely possible. 2. A number based on the “natural size” of your organization, if every team and role was fully staffed. 3. A realistic number based on historical hiring rates. Then merge those into a single number.
Once you have your head count projection, you need to identify how many individuals you want each manager to support. This number particularly depends on your company’s working definition of an engineering manager’s role. If engineering managers are expected to do hands-on technical work, then their teams should likely be three to five engineers (unless the team has been working together well for a long time, in which case things get very specific and hard to generalize about). Otherwise, targeting five to eight engineers, depending on experience level, is pretty typical. If you’re targeting more than eight engineers per manager, then it’s worth reflecting on why you believe your managers can support a significantly higher load than industry average: Are they exceptionally experienced? Are your expectations lower than typical? Once you have the numbers, these are useful to ground you in the general number of teams and groups of teams you should have. In the first case, with 35 engineers, you’re going to want between one and three groups, containing a total of five or six teams. In the latter, with 74 engineers, you’ll want two to four groups, comprised of 12 to 15 teams. Once you’ve grounded yourself, here are some additional considerations: Can you write a crisp mission statement for each team? Would you personally be excited to be a member of each of the teams, as well as to be the manager of each of those teams? Put teams that work together (especially poorly) as close together as possible. This minimizes the distance for escalations during disagreements, allowing arbiters to have sufficient context. Also, most poor working relationships are the by-product of information gaps, and nothing fills those faster than proximity. Can you define clear interfaces for each team? Can you list the areas of ownership for each team? Have you created a gap-less map of ownership, such that each responsibility is owned by a team? Try to avoid implicitly creating holes of ownership. If you need to create explicit holes of ownership, that’s a better solution (essentially, defining unstaffed teams). Are there compelling candidate pitches for each of those teams? As always, are you over-optimizing on individuals versus establishing a sensible structure?
With your organization design and head count planning, you can roughly determine when you’ll need to fill each of the technical and management leadership positions. From there, you have four sources of candidates to staff them: Team members who are ready to fill the roles now. Team members who can grow into the roles in the time frame. Internal transfers from within your company. External hires who already have the skills. That is probably an ordered list of how you should try to fill the key roles. This is true both because you want people who already know your culture and because reorganizations that depend on yet-to-be-hired individuals are much harder to pull off successfully.
Now it’s time to make a go decision. A few questions to ask yourself before you decide to fully commit: Are the changes meaningful net positive? Will the new structure last at least six months? What problems did you discover during design? What will trigger the reorg after this one? Who is going to be impacted most? After you’ve answered those questions, make sure to get not only your own buy-in but also buy-ins from your peers and leadership. Organizational change is rather resistant to rollback, so you have to be collectively committed to moving forward with it, even if it runs into challenges along the way (which, if history holds, it almost certainly will).
There are three key elements to a good rollout: Explanation of reasoning driving the reorganization. Documentation of how each person and team will be impacted. Availability and empathy to help bleed off frustration from impacted individuals. In general, the actual tactics for doing this are: Discuss with heavily impacted individuals in private first. Ensure that managers and other key individuals are prepared to explain the reasoning behind the changes. Send an email out documenting the changes. Be available for discussion. If necessary, hold an organization all-hands, but probably try not to. People don’t process well in large groups, and the best discussions take place in small rooms. Double down on doing skip-level one-on-ones. And with that, you’re done! You’ve worked through an engineering reorganization. Hopefully, you won’t need to do that again for a while. As a closing thought, an organization is both (1) a collection of people and (2) a manifestation of an idea separate from the individuals comprising it. You can’t reason about organizations purely from either direction. There are many, exceedingly valid, different ways to think about any given reorganization, and you should use these ideas as one model for thinking through changes, not as a definitive roadmap.
When I transitioned from directly supporting teams to instead partnering their managers, I struggled to remain effective without understanding their day-to-day tasks. My first instincts were to retain the same fidelity of context over a much wider area, and for the individuals working with me this was probably indistinguishable from micromanagement. Maybe it even was micromanagement. Thanks to a great deal of feedback and reflection, I’ve gotten more deliberate at identifying where to engage and where to hang back, a process that I call identifying your controls. Controls are the mechanisms that you use to align with other leaders you work with, and they can range from defining metrics to sprint planning (although I wouldn’t recommend the latter). There is no universal set of controls—depending on the size of team and your relationships with its leaders, you’ll want to mix and match—but the controls structure itself is universally applicable. Some of the most common controls that I’ve seen and used: Metrics align on outcomes while leaving flexibility around how the outcomes are achieved. Visions ensure that you agree on long-term direction while preserving short-term flexibility. Strategies confirm you have a shared understanding of the current constraints and how to address them. Organization design allows you to coordinate the evolution of a wider organization within the context of sub-organizations. Head count and transfers are the ultimate form of prioritization, and a good forum for validating how organizational priorities align across individual teams. Roadmaps align on problem selection and solution validation. Performance reviews coordinate culture and recognition. Etc. There are an infinite number of other possibilities, many of which are specific to your company’s particular meetings and forums. Start with this list, but don’t stick to it! For whatever controls you pick, the second step is to agree on the degree of alignment for each one. Some of the levels that I’ve found useful are: I’ll do it. Stuff that I will personally be responsible for doing. When you’re going to do something, it’s better to be explicit and avoid confusion on responsibilities. Best used sparingly. Preview. I’d like to be involved early and often. This is probably an area where we aren’t quite on the same page and this will help us avoid redoing work. Review. I’d like to weigh in before it gets published or fully rolled out, but we’re pretty aligned on the topic. Notes. Projects I’d like to follow but don’t have much I can add to. Often used for wide-reaching initiatives on which we are well aligned, and I want to be able to represent my colleagues’ work correctly. No surprises. The work that we’re currently aligned on but requires updates to keep my mental model intact. If I’m asked about a related problem, I want to be able to answer it correctly. This is particularly important for me, as my effectiveness is evaluated based on my ability to stay on top of new problems. Let me know. We’re well aligned on this, since my colleagues have done it before and done it well. I want them to let me know if something comes up that I can help with, but otherwise I’m totally confident it’ll go well, so we don’t need to talk about this much. Combine your controls and the degree of alignment for each, and you’ve established the interface between you and the folks you support. This reduces the ambiguity of how you work together and allows everyone to focus. It’s useful for agreeing on performance goals, and is also very useful for exposing alignment gaps between you and individuals you work with (For example, it’s a worrisome sign if you want to preview every bit of someone’s work, unless you just started working together.) Finally, this is a useful diagnostic for you as a leader to identify if you are micromanaging. If you simply can’t imagine a world where you don’t preview everyone’s work, it’s probably time to reflect a bit on what’s holding you back from letting the team thrive.
The options for approaching a career, particularly for those of us privileged to work in technology, are so extraordinarily vast that exploring them effectively requires a different approach. This vastness also means that you, as a manager, can’t simply give folks a career path: you’ll inevitably steer them toward the most obvious avenues and through avoidable competition. Flipping perspectives, it’s also quite challenging to plan your own career. I sometimes find myself walking from one meeting where I’m coaching someone on their career goals, straight into a second meeting where I struggle to string together words to articulate my own. The hardest bit is that most folks are always at the furthest point in their career, each change a step into the unknown, with limited visibility into the upcoming opportunities that their company can provide. The intersection that I’ve found between the individual’s and their manager’s perspective is the career narrative.
If your long-term goal is to be the head of engineering at a mid size company, you could approach that linearly by trying to expand your role bit by bit at your current company on the track to becoming its head of engineering. That’ll work for roughly one person at the company, but for everyone else pursuing that same path it will probably be suboptimal. A different approach would be to instead work on identifying the gaps that would keep you from being a strong head of engineering, and then start using your current role to help fill those gaps. A prototypical head of engineering will be skilled at organizational design, process design, business strategy, recruiting, mentoring, coaching, public speaking, and written communication. They’ll also have a broad personal network and a broad foundation from product engineering to infrastructure engineering. That’s not even a particularly complete list of relevant skills! There are so many different aspects to build out, and you can find opportunities to practice all of them in your current role. There’s no need to convince yourself that your current role is holding you back—everything you need is here.
Importantly, while generalized career paths won’t necessarily align cleanly with your goals, they are also unlikely to take full advantage of your strengths. An important part of setting your goals is developing areas you’re less experienced in to maximize your global success, but it’s equally important to succeed locally within your current environment by prioritizing doing what you do well. With all of this in mind, take an hour and write up as many goals as you can for what you’d like to accomplish in the next one to five years. Then prioritize the list, pick a few that you’d like to focus on for the next three to six months, and share it with your manager at your next one-on-one.
Once you’ve identified goals to pursue, the next step is to translate those goals into actions, and this is where your manager can be a leveraged partner in iterating on your career narrative. Managers tend to have a strong sense of the business’s needs, and that gives them the superpower of finding the intersection of your interests and the business’s priorities. That translation is a creative pursuit, so don’t leave this entirely to your manager: participate as well! Brainstorm projects, research how folks at other companies have pursued similar goals, and educate your manager on aspects of your goals that they don’t know much about. (For example, engineers often have more conference speaking experience than engineering managers do.) Bringing your list of goals to this discussion helps ensure that it’s successful. If you don’t bring a rough draft, by default you’ll get steered toward the contested commons, and your career narrative will be a dull instrument for progress. This refined list of goals, aligned to your company’s priorities, then becomes a central artifact for how you and your manager collaborate on your career evolution. Every quarter or so, take some time to refresh the document and review it together. If you’re unconvinced that it’s worth your time to write a career narrative, you certainly don’t have to write one. Most folks get away with not writing one for their entire career and have deeply fulfilling careers despite the absence. That said, if you don’t, then there is probably no one guiding your career. Chasing the next promotion is at best a marker on a mass-produced treasure map, with every shovel and metal detector re-covering the same patch. Don’t go there. Go somewhere that’s disproportionately valuable to you because of who you are and what you want. The briefest of media trainings: When I was working at Digg, I was fortunate enough to get five minutes of media training from my colleague Christine. As a testament to her, that brief training lodged deeply in my head, and I’ve found myself repeating it frequently ever since. Eventually, I realized that I should probably just write it up! The three rules for speaking with the media: Answer the question you want to be asked.
The three rules for speaking with the media: Answer the question you want to be asked. If someone asks a very difficult or challenging question, reframe it into one that you’re comfortable answering. Don’t accept a question’s implicit framing, but instead take the opportunity to frame it yourself. Don’t Think of An Elephant by George Lakoff is a phenomenal, compact guide to framing issues. Stay positive. Negative stories can be very compelling. They are quite risky, too! As an interviewee, find a positive framing and stick to it. This is especially true when it comes to competitors and controversy. Speak in threes. Narrow your message down to three concise points, make them your refrain, and continue to refer back to your three speaking points.
Early on in my career, I spent a lot of time trying to find my leadership style. Recently, I think it’s more useful to think about growing yourself as a leader by developing a range of styles and applying them thoughtfully to your circumstances. Confining yourself to one style is just too hard.
One of the trickiest, and most common, leadership scenarios is leading without authority, and I’ve written about one of the styles that I’ve found surprisingly effective in those conditions. I call it Model, Document, Share.
Imagine that you’ve started a new job as an engineering manager, and the teams around you are too busy to use a planning process. You’ve mentioned to your peers a few times that you’ve seen kanban work effectively, but folks tried it two years ago and are still upset whenever the word is mentioned: it just doesn’t work here. Your first reaction might be to confront this head-on, but it takes a while to build credibility after starting a new job. Sure, you’ve been hired for your experience, so they respect your judgment, but it’s a hard sell to convince someone that your personal experience should invalidate their personal experience. I’ve been trying something different. Model. Start measuring your team’s health (maybe using short, monthly surveys) and your team’s throughput (do some lightweight form of task sizing, even if you just do it informally with a senior engineer on the team or with yourself), which will allow you to establish the baseline before your change. Then just start running kanban. Don’t publicize it, don’t make a big deal about it, just start doing it with your team. Frame it as a short experiment with the team, and start trying it. Keep iterating on it until you’re confident it works. Have the courage to keep at it for a while, and also the courage to stop doing it if it doesn’t work after a month or two. Use the team’s health and throughput metrics to ground your decision about whether it’s working. Document. After you’ve discovered an effective approach, document the problem you set out to solve, the learning process you went through, and the details of how another team would adopt the practice for themselves. Be as detailed as possible: make a canonical document, and even get a few folks on other teams to check that it’s readable from their perspective. Share. The final step is to share your documented approach, along with your experience doing the rollout, in a short email. Don’t ask everyone to adopt the practice, don’t lobby for change, just present the approach and your experience following it.
You’ll spend the majority of your time refining approaches that work effectively for your team, a bit of your time documenting how you did it, and almost no time trying to convince folks to change their approach. Strangely, in my experience, this has often led to more adoption than top-down mandates have.
Now that you’ve decided to create a decision-making group, it’s time to get into the choices! Influence. How do you expect this group to influence results? Will they be an authoritative group that makes binding decisions? Will you rely on the natural authority of the members you select? Will they primarily work through advocacy? The answers to these questions along with the particular folks in the group, will be the primary factor in how your group impacts the positive and negative freedoms of those they work with. Interface. How will other teams interact with this team? Will they submit tickets, send emails, attend a weekly review session? Will you be reviewing work before it launches, or previewing designs before they’re staffed? Depending on the kind of influence they’re exerting and how your company works, you’ll want to play around with different approaches. Size. How large should the group be? If it’s six or fewer individuals, it’s possible for them to gel into a true team, one whose members know each other well, work together closely, and shift a significant portion of their individual identities into the team. If the group has more than ten members, you’ll find it hard to even have a good discussion, and it’ll need to be structured into sub-groups to function well (rotation that spreads members over time, working groups of some members to focus on particular topics, and so on). The larger the group, the more responsibility becomes diffuse, and the more you’ll need to have specified roles within the group, for example a member responsible for coordinating members’ focus. Time commitment. How much time will members spend working in this group? Will this be their top priority, or will they still primarily be working on other projects? Higher time commitment correlates with more actions and decisions. Consequently, my sense is that you want time commitment to be higher for areas where folks are directly impacted by the consequences of their decisions, and to be lower for scenarios with weaker feedback loops. Identity. Should members view their role in the group as their primary identity? Should they continue to identify primarily as members of their existing team? You’ll need a small team and high time commitment to support individuals shifting their identity. Selection process. How will you select members? I’ve found the best method to be a structured selection process, in which you identify the requirements to be a member and the skills that you believe will be valuable, and then allow folks to apply. Membership in these groups often becomes an important signal of organizational status, which makes having a consistent process for selecting membership especially important. Length of term. How long will members serve? Are these permanent assignments, or are they fixed terms for, say, six months? If they are fixed terms, are folks eligible for subsequent elections? Is there a term limit? I’ve tried most combinations, and my sense is that the best default is fixed terms, while allowing current individuals to remain eligible, and without enacting term limits. Representation. How representative will this group be? Will you explicitly select folks based on their teams, tenure, or seniority, or will you allow clusters? Attention to this can help you avoid architecture reviews that are missing front-end and product engineers, as well as product reviews without infrastructure perspective. Predictably, each of these decisions will impact the effectiveness of the others, which can make designing the group you want quite tricky. Some formats will require particular kinds of individuals to staff them, and you have to design groups that will work with the people available to participate and the culture they are participating within. Before you release that email you’re writing to spin up a new centralized decision-making group, it’s worth talking about the four ways these groups consistently fail. They tend to be domineering, bottlenecked, status-oriented, or inert. Domineering groups significantly reduce individuals’ negative and positive freedoms, and become churn factories for members. This is most common when those making decisions are abstracted from the consequences of the decisions, e.g., architecture groups in which the members write little code. Bottlenecked groups tend to be very helpful, but are trying to do more then they’re actually able to do. These groups get worn down, and either burn out their members or create a structured backlog to avoid burning themselves out, which ends up seriously slowing down folks who need their authority. Status-oriented groups place more emphasis on being a member of the group than on the group’s nominal purpose. The value of the group revolves around recognition rather than contribution, leading to folks who try to join the group for status, and the diffusion of whatever original mission the group set out to resolve. Inert groups just don’t do much of anything. Typically, these are groups whose members have not gelled or are too busy. On the plus side, this is by far the most benign of the four failure modes—but you’re also missing out on a great deal of opportunity! Having experienced each of these a few times, I try to ensure that there is a manager embedded into every centralized group, and that the manager is responsible for iterating on the format to dodge these pitfalls.
I’ve picked up some tips that I hope will help you prepare for your next presentation: Communication is company-specific. Every company has different communication styles and patterns. Successful presenters probably can’t tell you what they do to succeed, but if you watch them and take notes, you’ll identify some consistent patterns. Each time you watch individuals present to leadership, study their approach. Start with the conclusion. Particularly in written communication, folks skim until they get bored and then stop reading. Accommodate this behavior by starting with what’s important, instead of building toward it gradually. Frame why the topic matters. Typically, you’ll be presenting on an area that you’re intimately familiar with, and it’s probably very obvious to you why the work matters. This will be much less obvious to folks who don’t think about the area as often. Start by explaining why your work matters to the company. Everyone loves a narrative. Another aspect of framing the topic is providing a narrative of where things are, how you got here, and where you’re going now. This should be a sentence or two along the lines of, “Last year, we had trouble closing several important customers due to concerns about our scalability. We identified our databases as our constraints to scaling, and since then our focus has been moving to a new sharding model that enables horizontal scaling. That’s going well, and we expect to finish in Q3.” Prepare for detours. Many forums will allow you to lead your presentation according to plan, but that is an unreliable prediction when presenting to senior leadership. Instead, you need to be prepared to lead the entire presentation yourself, while being equally ready for the discussion to derail toward a thread of unexpected questions. Answer directly. Senior leaders tend to be indirectly responsible for wide areas, and frequently pierce into areas to debug problems. Their experience “debug piercing” tunes their radar for evasive answers, and you don’t want to be a blip on that screen. Instead of hiding problems, use them as an opportunity to explain your plans to address them. Deep in the data. You should be deep enough in your data that you can use it to answer unexpected questions. This means spending time exploring the data, and the most common way to do that is to run a thorough goal-setting exercise. Derive actions from principles. One of your aims is to provide a mental model of how you view the topic, allowing folks to get familiar with how you make decisions. Showing you are “in the data” is part of this. The other aspect is defining the guiding principles you’re using to approach decisions. Discuss the details. Some executives test presenters by diving into the details, trying to uncover an area the presenter is uncomfortable speaking on. You should be familiar with the details, e.g., project statuses, but I think that it’s usually best to reframe the discussion when you get too far into the details. Try to pop up to either the data or the principles, which tend to be more useful conversations. Prepare a lot; practice a little. If you’re presenting to your entire company, practicing your presentation is time well spent. Leadership presentations tend to quickly detour, so practice isn’t quite as useful. Practice until you’re comfortable, but prefer to prepare instead getting deeper into the data, details, and principles. As a corollary, if you’re knowledgeable in the area you’re responsible for, and have spent time getting comfortable with the format, over time you’ll find that you won’t need to prepare much for these specifically. Rather, whether you’re able to present effectively without much preparation will become a signal for whether you’re keeping up with your span of responsibility. Make a clear ask. If you don’t go into a meeting with leadership with a clear goal, your meeting will either go nowhere or go poorly. Start the meeting by explicitly framing your goal!
My general approach to presenting to senior leaders is: Tie topic to business value. One or two sentences to answer the question “Why should anyone care?” Establish historical narrative. Two to four sentences to help folks understand how things are going, how we got here, and what the next planned step is. Explicit ask. What are you looking for from the audience? One or two sentences. Data-driven diagnosis. Along the lines of a strategy’s diagnosis phase, explain the current constraints and context, primarily through data. Try to provide enough raw data to allow people to follow your analysis. If you only provide analysis, then you’re asking folks to take you on trust, which can come across as evasive. This should be a few paragraphs, up to a page. Decision-making principles. Explain the principles that you’re applying against the diagnosis, articulating the mental model you are using to make decisions. What’s next and when it’ll be done. Apply your principles to the diagnosis to generate the next steps. It should be clear to folks reading along how your actions derive from your principles and the data. If it’s not, then either rework your principles or your actions! Return to explicit ask. The final step is to return to your explicit ask and ensure that you get the information or guidance you need.
The most impactful changes I’ve made to how I manage time are: Quarterly time retrospective. Every quarter, I spend a few hours categorizing my calendar from the past three months to figure out how I’ve invested my time. This is useful for me to reflect on the major projects I’ve done, and also to get a sense of my general allocation of time. I then use this analysis to shuffle my goal time allocation for the next quarter. Most folks are skeptical of whether this is time well spent, but I’ve found it particularly helpful, and it’s the cornerstone of my efforts to be mindful of my time. Prioritize long-term success over short-term quality. As your scope increases, the important work that you’re responsible for may simply not be possible to finish. Worse, the work that you believe is most important, perhaps high-quality one-on-ones, is often competing with work that’s essential to long-term success, like hiring for a critical role. Ultimately, you have to prioritize long-term success, even if it’s personally unrewarding to do so in the short term. It’s not that I like this approach, it’s that nothing else works. Finish small, leveraged things. If you’re doing leveraged work, then each thing that you finish should create more bandwidth to do more future work. It’s also very rewarding to finish things. Together, these factors allow large volumes of quick things to build into crescendoing momentum. Stop doing things. When you’re quite underwater, a surprisingly underutilized technique is to stop doing things. If you drop things in an unstructured way, this goes very poorly, but done with structure this works every time. Identify some critical work that you won’t do, recategorize that newly unstaffed work as organizational risk, and then alert your team and management chain that you won’t be doing it. This last bit is essential: it’s fine to drop things, but it’s quite bad to silently drop them. Size backward, not forward. A good example of this is scheduling skip-levels. When you start managing a multi-tier team, say 20 individuals, you can specify a frequency for skip-levels and reason forward to figure out how many hours of skip-levels you’ll do in a given week. Say you have 16 indirect reports, and you want to see them once a month for 30 minutes, so you end up doing two hours per week. This stops working as your team grows, because there is simply no reasonable frequency that won’t end up consuming an unsustainable number of hours. Instead, specify the number of hours you’re able to dedicate to the activity, perhaps two per week, and perform as many skip-levels as possible within that amount of time. This method keeps you in control of your time allocation, and it scales as your team grows. Delegate working “in the system.” Wherever you’re working “in the system,” design a path that will enable someone else to take on that work. It might be that this plan will take a year to come together, and that’s fine, but what’s not all right is if it’s going to take a year and you haven’t even started. Trust the system you build. Once you’ve built the system, at some point you have to learn to trust it. The most important case of this is handing off the responsibility to handle exceptions. Many managers hold onto the authority to handle exceptions for too long, and at that point you lose much of the system’s leverage. Handling exceptions can easily consume all of your energy, and either delegating them or designing them out of the system is essential to scaling your time. Decouple participation from productivity. As you grow more senior, you’ll be invited to more meetings, and many of those meetings will come with significant status. Attending those meetings can make you feel powerful, but you have to keep perspective about whether you’re accomplishing much by attending. Sometimes, being able to convey important context to your team is super valuable, and in those cases you should keep attending, but don’t fall into the trap of assuming that attendance is valuable. Hire until you are slightly ahead of growth. The best gift of time management that you can give yourself is hiring capable folks, and hiring them before you get overwhelmed. By having a clear organizational design, you can hire folks into roles before their absence becomes crippling. Calendar blocking. Creating blocks of time on your calendar is the perennial trick of time management: add three or four two-hour blocks scattered across your week to support more focused work. It’s not especially effective, but it does work to some extent and is quick to set up, which has made me a devoted user. Getting administrative support. Once you’ve exhausted all the above tools and approaches, the final thing to consider is getting administrative support. I was once quite skeptical of whether admin support is necessary—and, until your organization and commitments reach a certain level of complexity, it isn’t—but at some point having someone else handling the dozens of little interruptions is a remarkable improvement. As you start using more of these approaches, you won’t immediately find yourself less busy, but you will gradually start to finish more work. Over a longer period of time, though, you can get less busy by prioritizing finishing things with the goal of reducing load. If you’re creative and consequent, and if you don’t fall into the trap of believing that being busy is being productive, you’ll find a way to get the workload under control.
One of the most rewarding elements of being in a supportive work environment: building a community of learning with your peers.
Recently I’ve been spending more time facilitating a broader learning community of engineering managers. When I first started facilitating the group, we focused on content-rich presentations. Each slide was dense with important lessons and essential details. It didn’t work well. Folks weren’t engaged. Attendance dropped. Learning was not the order of the day. Since then, we’ve iterated on format and eventually stumbled on an approach that has worked consistently: Be a facilitator, not a lecturer. Folks want to learn from each other more than they want to learn from a single presenter. Step back and facilitate. Brief presentations, long discussions. Present a few minutes of content, maybe five, and then move into discussion. Keep the discussions short enough that even if a group doesn’t get traction on a given topic, it doesn’t become awkward. A good time limit would be 10 minutes. Small breakout groups. Giving folks time to discuss in small groups allows them to learn a bit about the topic in a small, safe place. It also gives everyone an opportunity to be part of the discussion, which is a lot more engaging than listening to others for an hour. Bring learnings to the full group. After discussions, give each group an opportunity to bring their discussion back to the larger group, to allow the groups to cross-pollinate their learnings. Choose topics that people already know about. Successful topics are ones that people have already thought about, typically because these concepts are core to their daily work. Ideally the topic is something that they do but would like to get better at, such as one-on-ones, mentorship, coaching, or career development. People find it very hard to discuss content that they’ve just learned if it’s too far away from their previous experience. That also creates an environment where learning has to come from the facilitator instead of from the group at large. Encourage tenured folks to attend. For many learning communities, you’ll find that the most senior or most tenured folks opt out to focus on other work. This is a shame, because there is so much for them to teach newer folks, and also because it creates an opportunity for them to learn from and get to know new members. Optional pre-reads. Some folks aren’t comfortable being introduced to a new topic in public, and for those individuals, providing a list of optional pre-reads can help them prepare for the discussion. I find that most people don’t read them (which, surprisingly, I also found true when hosting paper-reading groups46), but for some folks they’re very helpful. Checking in. Depending on the size of the group, it can be powerful to start by checking in with each other, having each person give a 20- or 30-second self-introduction. The format we’ve been using lately is your name, your team, and one sentence about what’s on your mind. This is especially useful in quickly growing communities, as it makes it easier for individuals to meet each other. The thing I enjoy most about this format is that it gives folks what they really want, spending time learning from each other, and is remarkably quick for the facilitator to prepare. I’m far from a seasoned facilitator, and I’ve also found this format to be a rewarding and safe opportunity for me to grow as a facilitator.
At an early job, I worked with a coworker whose philosophy was “If you don’t ask for it, you’ll never get it.” Which culminated in continuous escalations to management for exceptional treatment. This approach was pretty far from my intuition of how a well-run organization should work, and it grated on my belief that consistency is a precondition of fairness. Since then, I’ve come to believe that environments that tolerate frequent exceptions are not only susceptible to bias but are also inefficient. Keeping an organization aligned is challenging in the best of times, and exceptions undermine one of the most powerful mechanisms for alignment: consistency. Conversely, organizations survive by adapting to the dynamic circumstances they live in. An organization stubbornly insisting on its established routines is a company pacing a path whose grooves lead to failure. How do you reconcile consistency and change? As with most seemingly opposing goals, the more time I spent considering them, the less they were mutually exclusive. Eventually, a unified approach emerged, which I call “Work the policy, not the exceptions.”
Every policy you write is a small strategy, built by identifying your goals and the constraints that bring actions into alignment with those goals. It’s probably easiest to dig into an example. One of the most interesting policies that I recently worked on was designing who is able to join which teams in a company with both many remote employees and many geographically distributed offices. Our most important goals were: Make every office a first-tier office; there are no second-tier offices. A first-tier office must own multiple critical projects, and its work shouldn’t be constrained by support from other offices. Furthermore, it must have many individuals physically present who work together closely. Ensure that remote engineers remain an essential, well-supported cohort of the company. Once we had agreed upon our goals, the next step was to codify some constraints in order to narrow the scope of allowed actions to those that supported our goals. In this case, a (slightly simplified) set of constraints might be: Teams are staffed in, at most, one office. (I say “at most” in order to support the premise of teams composed entirely of remote engineers.) Employees in an office must be members of a team in that office. Remote employees may work on any team. Employees within a 60 minute commute of an office must work from that office. These are good examples of constraints because they clearly constrain allowed behaviors. You could imagine less opinionated constraints, such as “folks in an office should prefer working on teams within that office,” but that would do less to constrain behavior. If you find yourself writing constraints that don’t actually constrain choice, it’s useful to check if you’re dancing around an unstated goal that’s limiting your options. For example, in the above, you might have an unstated goal that an employee pursuing their preferred work is more important than offices being first-tier, which would lead you toward weak constraint. The fixed cost of creating and maintaining a policy is high enough that I generally don’t recommend writing policies that do little to constrain behavior. In fact, that’s a useful definition of bad policy. In such cases, I instead recommend writing norms, which provide nonbinding recommendations. Because they’re nonbinding, they don’t require escalations to address ambiguities or edge cases.
Once you’ve supported your goals through constraints, all you have to do is consistently uphold your constraints. This is easy to say, but consistency requires no little bravery. Even with the best intentions, I’ve often gone astray when it was time for me to support my own policies. The two reasons that applying policy consistently is particularly challenging are: Accepting reduced opportunity space. Good constraints make trade-offs that deliberately narrow your opportunity space. Some of the opportunities that you’ll encounter within that space will be exceptionally good, and it’s hard to stay true when faced with concrete consequences. Locally suboptimal. Satisfying global constraints inevitably leads to local inefficiency, sometimes forcing some teams to deal with deeply challenging circumstances in order to support a broader goal that they may experience little benefit from. It’s hard to ask folks to accept such circumstances, harder to be someone in one of those local inefficiencies, and hardest yet to stick to the decisions at real personal cost to the folks you’re impacting. When we’ve picked thoughtful constraints to allow us to accomplish important goals, we need the courage to bypass these opportunities and accept these local inefficiencies. If we don’t summon and maintain that courage, we incur most of the costs and receive few of the benefits.
Policy success is directly dependent on how we handle requests for exception. Granting exceptions undermines people’s sense of fairness, and sets a precedent that undermines future policy. In environments where exceptions become normalized, leaders often find that issuing writs of exception—for policies they themselves have designed—starts to swallow up much of their time. Organizations spending significant time on exceptions are experiencing exception debt. The escape is to stop working the exceptions, and instead work the policy.
Once you’ve invested so much time into drafting policy, you have to avoid undermining your work, and yourself, with exceptions. That said, you can’t simply ignore escalations and exceptions requests, which often represent incompatibilities between the reality you designed your policy for and the reality you’re operating in. Instead, collect every escalation as a test case for reconsidering your constraints. Once you’ve collected enough escalations, revisit the constraints that you developed in the original policy, merge in the challenges discovered in applying the policy, and either reaffirm the existing constraints or generate a new series of constraints that handle the escalations more effectively. This approach is powerful because it creates a release valve for folks who are frustrated with rough edges in your current policies—they’re still welcome to escalate—while also ensuring that everyone is operating in a consistent, fair environment; escalations will only be used as inputs for updated policy, not handled in a one-off fashion. The approach also maintains working on policy as a leveraged operation for leadership, avoiding the onerous robes of an exceptions judge. When you roll out a policy, it’s quite helpful to declare a future time when you’ll refresh it, which ensures that you’ll have the time to fully evaluate your new policy before attempting revision. It’s fairly common for folks to modify good, effective policy due to concerns that arise before the policy has had time to show its effect. At a sufficiently high rate of change, policy is indistinguishable from exception. The next time you’re about to dive into fixing a complicated one-off situation, consider taking a step back and documenting the problem but not trying to solve it. Commit to refreshing the policy in a month, and batch all exceptions requests until then. Merge the escalations and your current policy into a new revision. This will save your time, build teams’ trust in the system, and move you from working the exceptions to working the policy.
We have the opportunity to create an environment for those around us to be their best, in fair surroundings. For me, that’s both an opportunity and an obligation for managers, and saying no in that room with my manager and CTO was, in part, my decision to hold the line on what’s right. However, there was a second no in that room, and it’s one you’ll use routinely even under the best of circumstances. That no is an expression of what is possible for the team you lead to do. I felt that the decision would be wrong, but also that the precedent of firing people for on-call mistakes would irreparably damage the morale of a team who already saw their phone batteries drained before the end of a 12 hour on-call shift. This no is explaining your team’s constraints to folks outside the team, and it’s one of the most important activities you undertake as an engineering leader.
When folks want you to commit to more work than you believe you can deliver, your goal is to provide a compelling explanation of how your team finishes work. Finishes is particularly important, as opposed to does, because partial work has no value, and your team’s defining constraints are often in the finishing stages. The most effective approach that I’ve found for explaining your team’s delivery process is to build a kanban board2 describing the steps that work goes through, and documenting who performs which steps. You don’t have to switch to using a kanban system, although I’ve found it very effective for debugging team performance, you just have to populate the board once to expose your current constraints. Using this board, you’ll be able to explain what the current constraints are for execution, and help your team narrow suggestions for improvement down to areas that will actually help. If you don’t provide this framework, people tend to start making suggestions everywhere across your process, which at best means many ideas won’t reduce load where it’s most helpful, and at worst may inadvertently increase load. You want to focus your team on your core constraint, the single inefficient component that’s slowing down your throughput of finished work. Once you’ve focused the conversation on your core constraint, the next step is explaining what’s preventing you from solving for it. At many technology companies this comes down to technical debt or toil. However, the specters of technical debt and toil have been used to shirk so much responsibility that simply naming them tends to be unconvincing.
There are two ways to add capacity: move existing resources to the team (away from what they’re currently doing) or create new resources (typically through hiring). Neither is a panacea. The best outcome of a discussion on velocity is to identify a reality-based approach that will support your core constraint. The second-best outcome is for folks to agree that you’re properly allocated against your constraints and to shift the conversation to prioritization. (Those are the only good outcomes.) Although shifting from a discussion about velocity to one about prioritization is a good outcome, expressing your priorities convincingly can be a difficult, daunting task. I recommend breaking it down into three discrete steps: document all your incoming asks, develop guiding principles for how work is selected, and then share subsets of tasks you’ve selected based on those guiding principles. Hopefully, documenting your incoming asks is as straightforward as auditing your team’s tickets, but it’s pretty common for some of the most important asks to be undocumented. What I’ve found effective is to blend existing planning artifacts (typically quarterly/annual plans) and your tickets into a list, and then test it against your most important stakeholders. Keep asking those who routinely have dependencies on your team, “Does this seem like the right list of tasks?” The result will be a fairly accurate artifact. From there, you have to pick the guiding principles that you’ll use for selecting tasks. How you’ll do that will depend on your team and your company—infrastructure teams will pick different guides5 than product teams will, but they’ll likely be grounded in your company’s top-level plans and will intersect with your team’s mission. (The most controversial guides tend to be statements about the value of current work versus future work, for example doing investment work today that will pay off in two years but limit value in the short term. One technique that I’ve found useful for this particular scenario is specifying quotas for both immediate and long-term work.) The last step is sitting down with your team and applying your guiding principles to the universe of incoming asks, then finding the subset to prioritize. You’ll continuously get more requests to do work, so it’s important that this process is lightweight enough that you can rerun it periodically. Which, it so happens, is exactly what you should do when a stakeholder disagrees with your priorities. Understand their requests, and sit down with them to test their ask against your guiding principles and your currently prioritized work. If their request is more important than your current work, shift priorities at your next planning session. (To limit churn created by shifting priorities, it’s useful to wait for the next planning session instead of making these changes immediately. This does mean that you’ll need to be refreshing your plan at least monthly.)
When I started managing, my leadership philosophy was simple: The Golden Rule makes a lot of sense. Give everyone an explicit area of ownership that they are responsible for. Reward and status should derive from finishing high-quality work. Lead from the front, and never ask anyone to do something you wouldn’t.
I believe that almost every internal problem can be traced back to a missing or poor relationship, and that with great relationships it is possible to come together and solve almost anything.
A few years back, one of the leaders I worked with told me, “With the right people, any process works, and with the wrong people, no process works.”
Lately, I’ve come to have something of a mantra for guiding decision making: do the right thing for the company, the right thing for the team, and the right thing for yourself, in that order. This is pretty obvious on some levels, but I’ve found it to be a useful thinking aid. First, all thinking should start from a company perspective, and you should make sure that what you’re doing is not creating negative externalities for the company or the other teams you work with. For example, you’re really excited about trying out a new programming language in a project, but you should also make sure that you’ve considered the additional maintenance cost for the rest of the company. Next, make sure that your choices are being made on behalf of your team, not on your own behalf. This might mean pushing back on a timeline that would force your team into a death march, even though it’s uncomfortable to have that conversation with your manager or your product partner. Last in the list is yourself, but while I do believe that you should generally put yourself last, it’s also a reminder to “pay yourself.” Burnout is endemic in our industry, and a burned-out manager often leaks onto their team. Give as much as you can sustainably give, and draw the line there. Think for yourself So much of what we take for granted is cargo-culted instead of done with intention. Early in your management career, you’ll have to figure out how to approach common challenges: interviewing, performance management, promotions, raises. It’s totally all right to start out by following what you see around you—learning from your peers is critical to success. However, it’s also important to be honest about which of your practices are truly best practices and which you’re following on autopilot.
Leadership is matching appropriate action to your current context, and it’s pretty uncommon that any two situations will flourish from the same behaviors. If you’re working in the growth plates—or outside of them—for the first time, treat it like a brand-new role. It is!
Managers work more indirectly, so when we get stuck it isn’t always quite as obvious, but we absolutely do get stuck, both on individual projects and in our careers. Here are a few ways that happens. Newer managers, often in their first couple of years: Only manage down. This often manifests in building something your team wants to build, but which the company and your customers aren’t interested in. Only manage up. In Pearl S. Buck’s The Good Earth, she writes, “All power comes from the Earth.” In management, power comes from a healthy team. Some managers focus so much on following their management’s wishes that their team evaporates beneath them. Never manage up. Your team’s success and recognition depend significantly on your relationship with your management chain. It’s common for excellent work to go unnoticed because it was never shared upward. Optimize locally. Picking technologies that the company can’t support, or building a product that puts you in competition with another team. Assume that hiring never solves any problems. When you’re behind, it can be tempting to spend all of your time firefighting and neglect hiring, but if your business is growing quickly, then eventually you hire or burn out. Don’t spend time building relationships. Your team’s impact depends largely on doing something that other teams or customers want, and getting it shipped out the door. This is extraordinarily hard without a supportive social network within the company. Define their role too narrowly. Effective managers tend to become the glue in their team, filling any gaps. Sometimes that means doing things you don’t really want to do, in order to set a good example. Forget that their manager is a human being. It’s easy to get frustrated with your manager when they put you in bad situations, forget to tell you something important, or commit your team to something without consulting you, but they almost certainly did it with the best of intentions. To have a good relationship with your manager, you have to give them room to make mistakes. More experienced managers: Do what worked at their previous company. When you start a new job or new role, it’s important to pause to listen and foster awareness before you start “fixing” everything. Otherwise, you’re fixing problems that may not exist, and doing so with tools that may not be appropriate. Spend too much time building relationships. This is particularly common in managers coming from larger companies into smaller ones, and it creates the perception that the manager isn’t contributing anything of value. This tends to be because smaller companies expect more execution focus than relationship management focus from their managers. Assume that more hiring can solve every problem. Adding a few wonderful people to the team can solve many problems, but adding too many people can dilute your culture, and lead to people with unclear roles and responsibilities. Abscond rather than delegate. Delegation is important, but it’s easy to go too far and ignore the critical responsibilities that you’ve asked others to take on, only to discover an easily averted disaster later on. Become disconnected from ground truth. Particularly at larger companies, it can become frequent to make decisions that appear to be fundamentally disconnected from reality.
Managers of any and all levels of experience: Mistake team size for impact. Managing a larger team is not a better job, it’s a different job. It also won’t make you important or make you happier. It’s hard to unlearn a fixation on team size, but if you can, it’ll change your career for the better. Mistake title for impact. Titles are arbitrary social constructs that only make sense in the context they’re given. Titles don’t translate across companies meaningfully, and they’re a deeply flawed way to judge yourself or others. Don’t let them become your goal. Confuse authority with truth. Authority lets you get away with weak arguments and poor justifications, but it’s a pretty expensive way to work with people, because they’ll eventually turn off their minds and simply follow orders—if they’re in a complicated compensation or life situation, that is. Otherwise, they’ll just leave. Don’t trust the team enough to delegate. You can’t scale your impact or engage your team if you don’t give them enough room to do things differently than you would. Many organizations become bottlenecked on approvals, which is a sure proxy for lack of trust. Let other people manage their time. Most managers have significantly more work they could be doing than they’re able to do. This will probably be your status quo for the rest of your career, and it’s important to prioritize your time on important things, and not simply on whatever happens to end up on your calendar. Only see the problems. It can be easy to only see what’s going wrong, and forget to celebrate the good stuff. Down this path lie frustration and madness.
To partner successfully with your manager: You need them to know a few things about you. You need to know a few things about them. You should occasionally update the things you know about each other. Things your manager should know about you: What problems you’re trying solve. How you’re trying to solve each of them. That you’re making progress. (Specifically, that you’re not stuck.) What you prefer to work on. (So that they can staff you properly.) How busy you are. (So that they know if you can take on an opportunity that comes up.) What your professional goals and growth areas are. Where you are between bored and challenged. How you believe you’re being measured. (A rubric, company values, some KPIs, etc.) Some managers are easier to keep informed than others, and success hinges on finding the communication mechanism that works for them. The approach that I’ve found works well is: Maintain a document with this information, which you keep updated and share with your manager. For some managers, this will be enough! Mission accomplished. Sprinkle this information into your one-on-ones, focusing on information gaps (you’re not seeing support around a growth area, you’re too busy, or not busy enough, and so on). Success is filling in information gaps, not reciting a mantra. At some regular point, maybe quarterly, write up a self-reflection that covers each of those aspects. (I’ve been experimenting with a “career narrative” format that is essentially a stack of quarterly self-reflections.) Share that with your manager, and maybe with your peers too!
A few managers seemingly just don’t care, and I’ve always found that those managers do care, and are too stressed to participate in successful communication. This leads to the other key aspect of managing up: knowing some things about your manager and their needs. Here are some good things to know: What are their current priorities? Particularly, what are their problems and key initiatives. When I get asked this question, I often can’t answer it directly, because what I’m focused on is people-related, but it’s a warning sign if your manager never answers it (either because because they don’t know, or they are always working on people issues). How stressed are they? How busy are they? Do they feel like they have time to grow in their role or are they grinding? Is there anything you can do to help? This is particularly valuable for managers who don’t have strong delegation instincts. What is their management’s priority for them? What are they trying to improve on themselves, and what are their goals? This is particularly valuable to know if they appear stuck, because you may be able to help unstick them. (You could be especially helpful by redefining impact in terms of work that your team can accomplish versus growing team size, which is a frequent source of stickiness!) It’s relatively uncommon for managers to be unwilling to answer these kinds of questions. (Either they’re open and glad to share or are willing to speak about themselves.) However, it is fairly common for them to not know the answers. In those cases, each of these questions can be a pretty expansive topic for a one-on-one. Finding managerial scope I was chatting with an engineering manager last week, and he mentioned that the jobs he really wants are VP of Engineering roles, but he feels that no one is willing to take a gamble on him.
As managers looking to grow ourselves, we should really be pursuing scope: not enumerating people but taking responsibility for the success of increasingly important and complex facets of the organization and company. This is where advancing your career can veer away from a zero-sum competition to have the largest team and evolve into a virtuous cycle of empowering the organization and taking on more responsibility. There is a lot less competition for hard work. Companies will always need someone to run their cost-accounting initiatives, to set up their approach to on-call, to iterate on their engineer-recruiting process. Strong execution in these crosscutting projects will give you the same personal and career growth as managing a larger team. Project-managing an initiative working with 50 engineering managers is a far better learning opportunity than managing an organization of 50, and it builds the same skills. This realization was very important and empowering for me: you can always find an opportunity to increase your scope and learning, even in a company that doesn’t have room for more directors or vice presidents.
When I began managing managers, things shifted. I felt certain that I knew how to solve all the problems, but I didn’t know how to rely on others to solve them, and often learned of problems long after they’d deteriorated. Delegation, metrics, meetings, and process—practices that I’d considered obvious or unimportant—crept into my tool kit, and I started to regain my footing.
As a functional leader, you’ll be expected to set your own direction with little direction from others. When things in your area are going poorly, you’ll be swamped with more direction and input than you can readily absorb, but when things are going well, you’ll often be responsible for supplying your own direction and that of your team. If you don’t supply it yourself, you’ll start to feel the pull of irrelevance: Maybe no one really cares what we do? What would happen if I stopped showing up? Maybe I should be doing something different? That initial instinct to leave after hitting a pocket of seeming irrelevance is a comforting one, but it’s the wrong way to go. You can certainly avoid the current swells of ambivalence by switching jobs, but if you’re successful at another company then you’ll end up in the same situation. This is a symptom of success. You have to learn the lesson it’s trying to teach you: How to set your organization’s direction and your own. In a recent chat with an engineering manager, they mentioned a low-grade, ambient anxiety around their impact, which they’d felt since moving into a role focused on supporting managers rather than on leading a team directly. I think that this resonates with everyone’s transition to managing managers: it’s an unsettling period when you lose access to what used to make you happy—partnering directly with a team—and haven’t found new sources of self-worth in your work. This isn’t the only reason this transition is hard, it’s also hard because a lot of your skills and habits stop working well. The skill that scales the worst is outworking your problems. This is particularly frustrating, because your ability to put your head down and solve gritty, important problems was probably a big part of how you were promoted. Now it’s the wrong answer to most of your problems. This isn’t because it’s bad behavior, just because it’s too inefficient to accomplish the breadth and quantity of things you need to get done. If you’re sitting at the post-transition moment, detached from the work you loved, and with your instincts driving you into a pile of work you can’t make a dent in, I have a tool that’s been useful for me and might be useful for you! For every problem that comes your way—an email asking for a decision, a production problem, a dispute around on-call, a request to transfer from one team to another—you must pick one of three options: Close out. Close it out in a way such that this specific ask is entirely resolved. This means making a decision and communicating it to all involved participants. This strategy is a success if this particular task never comes back to you; and your goal is to finish this particular task as quickly and as permanently as possible. Solve. Design a solution such that you won’t need to spend time on this particular kind of ask again in the next six months. This is often designing norms or process, but depending on the kind of issue, this might be coaching an individual. With this option, your goal is to finish off an entire category of tasks. Delegate. Ideally, this is to redirect the ask to someone who is responsible for that kind of work, but sometimes it is a one-off. If you can’t close out a task or solve it, your only other option is to delegate it to someone who either has the specialized skills to close/solve it, or who can work in the system. No matter what problem comes your way, you’re not allowed to solve issues any other way! Give this method a try for a week, and see if it helps you navigate your role more effectively.
I’ve found a framework for thinking about inclusion efforts that is simple but that has allowed me to think about the problem broadly, identify useful programs, and move from anxiety to implementation. The basis: an inclusive organization is one in which individuals have access to opportunity and membership. Opportunity is having access to professional success and development. Membership is participating as a version of themselves that they feel comfortable with.
Good process is as lightweight as possible, while being rigorous enough to consistently work. Structure application allows folks to learn how the processes work, and lets them build trust by watching the consistent, repeated application of those processes.
Have you ever worked at a company where the same two people always got the most important projects? Me too. It’s frustrating to watch these opportunities to learn from the sidelines, and reliance on a small group can easily limit a company’s throughput as it grows. This is so important that I’ve come to believe that having a wide cohort of coworkers who lead critical projects is one of the most important signifiers of good organizational health. It’s a particularly powerful metric because it simultaneously measures the company’s capability to execute projects and the extent that its members have access to growth. The former measurement helps determine the company’s potential throughput, and the latter correlates heavily with inclusivity.
Over the past six months, I’ve been hiring engineering manager-ofmanagers roles. These roles are scarcer than line management roles, and they vary more across companies.
We’ve focused on testing these categories: Partnership. Have they been effective partners to their peers, and to the team that they’ve managed? Execution. Can they support the team in operational excellence? Vision. Can they present a compelling, energizing vision of the future state of their team and its scope? Strategy. Can they identify the necessary steps to transform the present into their vision? Spoken and written communication. Can they convey complicated topics in both written and verbal communication? Can they do all this while being engaging and tuning the level of detail to their audience? Stakeholder management. Can they make others, especially executives, feel heard? Can they make these stakeholders feel confident that they’ll address any concerns? This evaluation doesn’t cover every aspect of being an effective senior leader. But it does cover the raw skills that form the foundation of one’s success.
To test these categories, we’re using these tools: Peer and team feedback. Collect written feedback from four or five coworkers. Include peers on other teams. Include people the applicants have managed. Include people they would not have managed. My biggest advice? Lean into controversial feedback, not away from it. Listen to would-be dissenters, and hear their concerns. A 90-day plan. The applicant writes a 90-day plan of how they’d transition into the role, and what they would focus on. They emphasize specific tactics, time management, and where they’d put their attention. This is also a great opportunity to understand their diagnosis of the current situation. Provide written feedback to them on their plan. Have them incorporate that feedback into their plan. This is an opportunity to try out working together in the new role. Vision/strategy document. The applicant writes a combined vision/strategy document. It outlines where the new team will be in two to three years, and how they’ll steer the team to get there. Provide written feedback on the document. Have them incorporate that feedback. Vision/strategy presentation. Have the applicant present their vision/strategy document to a group of three to four peers. Have the peers ask questions, and see how the applicant responds to this feedback. Executive presentation. Have the applicant present their strategy document, one-on-one, with an executive. In particular, test for their ability to adapt communication to different stakeholders. Running the process takes a lot of time, but it’s rewarding time. In fact, this has generated more useful feedback than anything else I’ve done over the past year.
Stocks and flows are especially valuable in understanding the failure of projects and teams. Projects fall behind one sprint at a time. Technical debt strangles projects over months. Projects fail slowly—and fixing them takes time, too. Working at a frenetic pace for a couple of weeks or a month may feel like a major outpouring of effort and energy, but it’s near impossible to quickly counteract a deficit created over months of poor implementation or management choices.
Without policy, your tool is stepping back and allowing the brokenness to collapse under its own weight. A deeply flawed system can’t be saved by band-aids, but it can easily absorb your happiness to slightly extend its viability. If you step back, you conserve your energy and avoid creating rifts by pushing others away in hero mode, and you will be ready to be a part of a new—hopefully more functional—system after the reset does occur. This is a very uncomfortable process, and if you’re a hardworking, loyal person, then it probably goes deeply against your nature. It certainly goes against my nature, but I believe that this is one case in which following my nature is a detriment to both myself and those around me. Projects fail all the time, people screw up all the time. Usually it’s by failing to acknowledge missteps that we exacerbate them. If we acknowledge errors quickly, and cut our losses on bad decisions before burning ourselves out, then we can learn from our mistakes and improve. Kill your heroes and stop doing it harder. Don’t trap yourself in your mistakes, learn from them and move forward.
Luck plays such an extraordinary role in each individual’s career progression that sometimes the entire concept of career planning seems dubious. However, as managers, we have an outsize influence in reducing the role of luck in the careers of others. That potential to influence calls us to hold ourselves accountable for creating fair and effective processes for interviewing, promoting, and growing the folks we work with.
Working at a company isn’t a single continuous experience. Rather it’s a mix of stable eras and periods of rapid change that bridge between eras. Thriving requires both finding a way to succeed in each new era and successfully navigating the transitional periods. You yourself trigger some transitions, like switching companies. Others happen on their own schedule: a treasured coworker leaves, your manager moves on, or the company runs out of funding.
The good news is that both the stable eras and the transitions are great opportunities for growing yourself. Transitions are opportunities to raise the floor by building competency in new skills, and in stable periods you can raise the ceiling by developing mastery in the skills that the new era values. As the cycle repeats, your elevated floor will allow you to weather most transitions, and you’ll thrive in most eras by leveraging your expanding masteries.
It’s suggested that mature companies have more stable periods while startups have a greater propensity for change, but it’s been my experience that what matters most is the particular team you join. I’ve seen extremely static startups, and very dynamic teams within larger organizations.
Even hypergrowth companies tend to have teams that are largely sheltered from change by either their management or because they’re too far away from the company’s primary constraints to get attention. By tracking your eras and transitions, you can avoid lingering in any era beyond the point when you’re developing new masteries. This will allow you to continue your personal growth even if you’re working in what some would describe as a boring, mature company. The same advice applies if you’re within a quickly growing company or startup: don’t treat growth as a foregone conclusion. Growth only comes from change, and that is something you can influence.
Reflecting on the interviews I’ve run over the past few years and those I got to experience recently, I believe that, while interviewing well is far from easy, it is fairly simple. Be kind to the candidate. Ensure that all interviewers agree on the role’s requirements. Understand the signal your interview is checking for (and how to search that signal out). Come to your interview prepared to interview. Deliberately express interest in candidates. Create feedback loops for interviewers and the loop’s designer. Instrument and optimize as you would any conversion funnel.
Almost every unkind interviewer I’ve worked with has been either suffering from interview burnout after doing many interviews per week for many months or has been busy with other work to the extent that they have started to view interviews as a burden rather than a contribution. To fix that, give them an interview sabbatical for a month or two, and make sure that their overall workload is sustainable before moving them back into the interview rotation.
For some roles—especially roles that vary significantly between companies, like engineering managers, product managers, or architects—this is the primary failure mode for interviews, and preventing it requires reinforcing expectations during every candidate debrief in order to ensure interviewers are “calibrated.” I’ve found that agreeing on the expected skills for a given role can be far harder than anyone anticipates, and it can require spending significant time with your interviewers to agree on what the role requires. (This is often in the context of what extent and kind of programming experience is needed in engineering management, DevOps, and data science roles.)
After you’ve broken the role down into a certain set of skills and requirements, the next step is to break your interview loop into a series of interview slots that together cover all of those signals. Typically, each skill is covered by two different interviewers to create some redundancy in signal detection, in case one of the interviews doesn’t go cleanly. Just identifying the signals you want is only half of the battle, though; you also need to make sure that the interviewer and the interview format actually expose that signal. It really depends on the signal you’re looking for, but a few of the interview formats that I’ve found very effective are: Prepared presentations on a topic. Instead of asking the candidate to explain some architecture on the spur of the moment, give them a warning before the interview that you’ll ask them to talk about a given topic for 30 minutes, which is a closer approximation of what they’d be doing on the job. Debugging or extending an existing codebase on a laptop (ideally on their laptop). This is much more akin to the day-to-day work of development than writing an algorithm on the board. A great problem can involve algorithmic components without coming across as a pointless algorithmic question. (One company I spoke with had me implement a full-stack auto-suggest feature for a search inbox, which required implementing a prefix tree, but the interviewer avoided framing it as yet another algo question.) Giving demos of an existing product or feature (ideally the one they’d be working on.) This task helps them learn more about your product and get a sense of whether they have interest around what you’re doing, and it helps you get a sense of how they deliver feedback and criticism. Roleplays (operating off a script that describes the situation.) This option can be pretty effective if you can get the interviewers to buy into it, allowing you to get the candidate to create more realistic behavior (architecting a system together, giving feedback on poor performance, running a client meeting, and so on).
Make sure your candidates know that you’re excited about them.
Whenever you extend an offer to a candidate, have every interviewer send a note to them saying that they enjoyed the interview. (Compliment rules apply: more detailed explanations are much more meaningful.) At that point, as an interviewer, it can be easy to want to get back to your “real job,” but resist the temptation to quit closing just before you close: it’s a very powerful human experience to receive a dozen positive emails when you’re pondering if you should accept a job offer.
Ensure that you build feedback loops into your process, both for the interviewers and for the designer of the interview process.
For pair interviews, have a new interviewer (even if they are experienced somewhere else!) start by observing a more experienced interviewer for a few sessions, and then gradually take on more of the interview until eventually the more senior candidate is doing the observing. Since your goal is to create a consistent experience for your candidates, this is as important for new hires who are experienced interviewing elsewhere as it is for a new college grad.
To get the full benefit of calibration and feedback, after the interview have each interviewer write up their candidate feedback independently before the two discuss the interview and candidate together. Generally, I’m against kibitzing about a candidate before the group debrief, in order to reduce bias in later interviews based on an earlier one, but I think this is a reasonable exception since you’ve experienced the same interview together. Also, in a certain sense, calibrating on interviewing at your company is about having a consistent bias in how you view candidates, independently of who on your team interviews them. Beyond the interviewers getting feedback, it’s also critical that the person who owns or designs the interview loop get feedback. The best place to get that is from the candidate and from the interview debrief session.
For direct feedback from candidates, I’ve started to ask every candidate during my “manager interview” sessions how the process has been and what we could do to improve. The feedback is typically surprisingly candid, although many candidates aren’t really prepared to answer the question after five hours of interviews. (It’s easy to get into the mode of surviving the interviews rather than thinking critically about the process that is being used to evaluate you.) The other, more common, mechanism is to have the recruiters do a casual debrief with each candidate at the end of the day. Both of these mechanisms are tricky because candidates are often exhausted and the power dynamics of interviewing work against honest feedback. Maybe we should start proactively asking every candidate to fill out an anonymous Glassdoor review on their interview experience. That said, this is definitely a place where starting to collect some feedback is more important than trying to get it perfect in the first pass, so start collecting something and go from there.
If your ratio of sourced referrals to direct ones goes down, then you probably have a problem (specifically, probably a morale problem in your existing team). And if your acceptance rate goes down, then perhaps your offers are not high enough, but it also might be that your best interviewer has burned out on interviewing and is pushing people away. Keep watching these numbers and listening to candidate post-process feedback, and you can go to sleep at night knowing that the process is still on the rails.
The single clearest indicator of strong recruiting organizations is a close, respectful partnership between the recruiting and engineering functions. Spending some time cold sourcing is a great way to build empathy for the challenges that recruiters face on a daily basis, and it’s also an excellent opportunity to learn from the recruiters you partner with! We’ve been doing weekly cold sourcing meetings as a partnership between our engineering managers and engineering recruiters, and it’s been a great forum for learning, empathy-building, and, of course, hiring.
The number of approaches to performance management is uncountably vast, but most of them are composed of three elements: Career ladders describe the expected evolution an individual will undergo in their job. For example, a software engineer ladder might describe expectations of a software engineer, a software engineer II, a senior software engineer, and a staff engineer. Performance designations rate individuals’ performance for a given period against the expectations of their ladder and level. Performance cycles occur once, twice, or four times a year, with the goal of assigning consistent performance designations.
A performance designation is an explicit statement of how an individual is performing against the expectations of their career ladder at their current level over a particular period of time. Because these designations are explicit, they are a backstop against miscommunication between the company and an employee. However it is a cause of concern and debugging if the explicit designation doesn’t match the implicit signals someone has received.
Many calibration systems depend heavily on whether managers are effective, concise presenters, which can become a larger factor in an individual’s designation than their own work. Don’t allow managers to pitch their candidates in the room, but instead have everyone read the manager review. This still depends on the manager’s preparation, but it reduces the pressure to perform in the calibration session itself.
Compare against the ladder, not against others. Comparing folks against each other tends to introduce false equivalencies without adding much clarity. Focus on the ladder instead.
Performance designations are usually not meant to be the primary mechanism for handling poor performance. Instead, feedback for weak performance should be delivered immediately. Waiting for performance designations to deal with performance issues is typically a sign of managerial avoidance. That said, it does serve as an effective backstop for ensuring that these kinds of issues are being addressed.
Don’t hire for potential. Hiring for potential is a major vector for bias, and you should try to avoid it. If you do decide to include potential, then spend time developing an objective rubric for potential, and ensure that the signals it indexes on are consistently discoverable.
One tension in management is staying far enough out of the details to let folks innovate, yet staying near enough to keep the work well-aligned with your company’s value structures. Seeing this challenge from a variety of perspectives with different teams, I’ve collected a playbook of tools to facilitate the balancing act, and a loose framework for rolling those tools out. A few caveats about process rollouts: Teams and organizations have a very limited appetite for new process; try to roll out one change at a time, and don’t roll out the next change until the previous change has enthusiastic compliance. Process needs to be adapted to its environment, and success comes from blending it with your particular context.
The criteria I use to evaluate if a team’s sprint works well: Team knows what they should be working on. Team knows why their work is valuable. Team can determine if their work is complete. Team knows how to figure out what to work on next. Stakeholders can learn what the team is working on. Stakeholders can learn what the team plans to work on next. Stakeholders know how to influence the team’s plans.
As you spend less time with your teams, you’ll want to start a weekly staff meeting with your managers. The best versions I’ve seen start with brief updates from each attendee, at most a couple of minutes per person, and then move into group discussions on shared topics. Topics might include running effective sprints, planning, career development, or whatever else proves useful. Done well, these discussions are the key learning forum for you and the managers you work with.
As your organization starts to get even larger and you’re mostly managing middle managers, the playbook shifts again. Your staff meeting has changed in one of two ways: The meeting has so many managers in it that they can’t even provide important updates. Plus the discussions have become unwieldly, with a couple of folks dominating conversations. Alternatively, your meeting now includes your middle managers, who themselves are likely missing some of the context about what their teams are working on or struggling with. The mechanism I’ve found most helpful in this case is to ensure that every team has a clear set of directional metrics in an easily discoverable dashboard. The metrics should cover both the longer-term goals of the team (user adoption, revenue, return users, etc.) and the operational baselines necessary to know if the team is functioning well (on-call load, incidents, availability, cost, and so on.) For each metric, the dashboard should make three things clear: the current value, the goal value, and the trend between them.
Now your staff meetings can start with a quick metrics review to give a sense of whether there is somewhere you need to drill in, and, rather than peanut buttering, you can focus your attention on projects that are exceeding or struggling. The other mechanism I’ve found to be exceptionally useful at this point is team snippets. These come out every two to four weeks and give snapshots of each team’s sprints: what they’re doing, why they’re doing it, and what they’re planning to do next. These are valuable for you to retain a sense of what your teams are working on, but they are invaluable for decentralizing coordination and communication between teams in your organization, as you become increasingly ineffective in that role. Along the way, remember that your old problems still exist, it’s just that other folks are dealing with them instead. As you roll out new processes to solve your personal pain points, you should be handing off processes to your managers, and keeping those practices intact and running. This will leave you with a tapestry of reinforcing processes, which support you and each layer of management that you support.