Oops! I Broke LinkedIn... And What I Learned From It

(c) photo credit: FOX

As a software engineer here at LinkedIn, I devote my time to making people’s experiences better on our website and apps. You may notice that we’re creating and testing new ideas and making changes to our products on a day-to-day basis.

As an avid user of LinkedIn, I’m in the same shoes as you, constantly looking for ways to improve the website. I often ponder new ideas and I’m genuinely interested in strengthening the LinkedIn community.

However, for better or worse, I make mistakes along the way. Last week, I broke a page on LinkedIn’s main site because I didn’t consider all of the possibilities of our global market. My parents, who live in China and Africa, saw a page that read “This feature is currently unavailable”.

My teammate first discovered the bug and sent the page to me at 7 PM. Upon seeing this, the first thing that came to my mind was: “Oops… I broke linkedin.com!” Three months into my first full-time job, I was a deer caught in headlights. I had no idea what to do! Half of my colleagues had left and I had to give my manager a call. Instead of blaming me, she asked me to calm down. Then, she helped me identify the impact of the problem by narrowing it down to specific use cases. After that, she directed me to a point of contact (POC) who could make the decision on how to solve this issue.

I went to the POC and said, “I’ve got a problem”.
“Of course you do,” he said effortlessly, “so let’s fix it!”

I tried my best to conceal my anxiety and sat down with him to start troubleshooting. After a quick discussion and cross-team collaboration, we devised a bug-fix plan that night and I finished a patch the next day. Now everything’s back to normal and I will never tell you which page that was. You can’t imagine the smile I have on my face as I write!

So, what did I learn from this?

1. LinkedIn’s release architecture is amazingly fast. With our new engineering stack, we could roll out new features to all of our 300 million+ users in a day. As you can see from Facebook changing its famous motto from “Move Fast and Break Things” to “Move Fast with Stable Infra”, there are both excitement and challenges when we can move quicker than ever.

2. Thorough testing and a controlled ramp-up mechanism are absolute necessities. Always have a second pair of eyes checking not only the code itself (and your soon-to-be-published post!), but also the product design and the workflow.

3. Don’t Panic. As printed on the cover of The Hitchhiker’s Guide to the Galaxy, the first and most important thing to do when a disastrous event occurs is to adjust your mindset and remain calm. My manager later told me a horror story about how she broke the front page of LinkedIn for two hours and dealt with it in a “war-room” with little sleep that night. The engineering world is not a perfect world. We’re constantly making mistakes and correcting them, so be preemptive and embrace challenges.

4. “Relationships matter”, one of LinkedIn’s core values, refers to investing in relationships with both customers and employees at LinkedIn. The fact that my supervisor and colleagues didn’t blame me, and instead did their best to help me resolve the issue, was such a heart-warming moment that I will always cherish it.

5. Mistakes will happen. The more I practice in a real software engineering environment, the more appealing a Test-Driven Development (TDD) approach seems to me. While adapting to TDD requires more initial effort and is not necessarily easy to get used to straight out of school, writing the tests first catches problems early and saves a lot of headaches. In the future, this will be my top choice whenever applicable.

So this is the story and what I’ve learned from breaking linkedin.com. Have you ever made a big mistake? I would love to hear from all of you!