How Facebook Catches Bugs On Its 100 Million+ Lines Of Code Platform

Facebook is a giant of the web, and only a few services match its level of sophistication.

Its codebase consists of more than 100 million lines of code (LOC), which is indeed a lot of typing. Facebook's teams of engineers not only wrote that code, they are also responsible for maintaining it: tweaking it when necessary, adding more, and sometimes removing some.

With billions of users to serve, Facebook needs to make sure that its platform is up and secure, and capable of delivering all the content it has to its ever-hungry users.

Since no product is 100% flawless, maintaining Facebook is certainly not an easy task, especially considering that it is impractical for Facebook's engineers to manually review an endless stream of code changes.

According to Facebook:

"Facebook’s web codebase currently contains more than 100 million lines of Hack code, and changes thousands of times per day. To handle the sheer volume of code, we build sophisticated systems that help our security engineers review code."

Zoncolan flowchart (source: https://engineering.fb.com/security/zoncolan)

This is why Facebook engineers created 'Zoncolan'.

It's essentially a "static analysis" tool that maps the behavior of the functions in Facebook's more than 100 million LOC, looking for potential problems in individual branches as well as in the interactions between different paths through the program.

As one of the tools that Facebook created to help its security engineers review code, Zoncolan checks for known types of bugs and can scan the entire Facebook codebase in under 30 minutes, a task that could take months or years if attempted manually.

This should ease the process of finding and squashing bugs, allowing engineers to quickly tweak or change new features before they go live.

This static analysis approach scales extremely well, because the tool works from "rules" that describe undesirable architecture or code behavior and automatically scans the system for whole classes of bugs.

It's practical because once a certain kind of bug has been seen, its pattern can be captured so that the same kind of bug is caught before it reappears. The system not only flags potential problems but also gives engineers real-time feedback and helps them learn to avoid pitfalls.
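
To make the idea of a "rule" concrete, here is a minimal sketch in Python. It is not Zoncolan's rule format, which the article does not describe. The Rule class, the toy statement format, and the function names (get_request_param, concat, run_sql) are all assumptions invented for illustration. The rule says that data coming from a "source" must never reach a "sink", and the scanner walks a straight-line program looking for that pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: data from any source must not reach any sink."""
    name: str
    sources: frozenset
    sinks: frozenset


# Toy "program": each statement is (target, called_function, argument_names).
# This format is invented for the example; real tools analyze parsed source code.
PROGRAM = [
    ("user_id", "get_request_param", []),
    ("query", "concat", ["sql_prefix", "user_id"]),
    ("rows", "run_sql", ["query"]),        # user input reaches the sink
    ("count", "run_sql", ["fixed_query"]),  # constant query, no user input
]

RULE = Rule(
    name="sql-injection",
    sources=frozenset({"get_request_param"}),
    sinks=frozenset({"run_sql"}),
)


def scan(program, rule):
    """Return (line, rule name, sink, tainted argument) for each violation."""
    tainted = set()
    findings = []
    for line_no, (target, func, args) in enumerate(program, start=1):
        tainted_args = [a for a in args if a in tainted]
        # A sink called with tainted data is a violation of the rule.
        if func in rule.sinks and tainted_args:
            findings.append((line_no, rule.name, func, tainted_args[0]))
        # The result is tainted if it comes from a source or from tainted input.
        if func in rule.sources or tainted_args:
            tainted.add(target)
        else:
            tainted.discard(target)
    return findings


print(scan(PROGRAM, RULE))  # [(3, 'sql-injection', 'run_sql', 'query')]
```

Because the rule is data rather than code, one scanner can check many rules, and a new class of bug can be covered by adding a rule instead of writing a new tool.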

According to a security engineer at Facebook:

"Every time an engineer makes a proposed change to our codebase, Zoncolan will start running in the background, and it will either report to that engineer directly or it will flag to one of our security engineers who’s on call."

Tools like Zoncolan are effective at catching the same type of bug, or developer mistake, over and over again.

Because they can be good at recognizing data flows and patterns, which helps cut down on false positives, these kinds of tools are often used in a variety of environments by the security community and other companies.

Google, for example, also has its own custom-built static analysis tool, which is capable of evaluating the company's eye-watering billions of LOC.

But in the case of Zoncolan, the static analysis program is custom-built to suit only Facebook's specific code.

Zoncolan uses a technique called 'abstract interpretation' to track user-controlled input through the codebase. As it parses code, the tool builds data structures that represent the behavior of functions in the code (the control-flow graph) and how those functions interact (the call graph). It then creates a summary of the behavior of each function.

Instead of actually running each line of code the way an interpreter or HHVM would, Zoncolan only records properties that are relevant to potentially dangerous flows of information.
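
The summary idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not Zoncolan's implementation: the program representation, the function names (build_query, execute, handler, get_request_param, run_sql, concat), and the shape of a summary are all hypothetical. Each function is summarized once as "which inputs can flow to the return value" and "which inputs can reach a dangerous sink", and callers reuse that summary instead of re-analyzing the callee. Nothing is executed; the analysis only tracks where data could have come from.

```python
SOURCE = "SOURCE"  # marker for user-controlled data

# Hypothetical source and sink names, chosen only for this example.
SOURCES = {"get_request_param"}
SINKS = {"run_sql"}

# Toy program: function name -> (parameters, body, returned variable).
# Each body statement is (target, callee, argument_names).
FUNCTIONS = {
    "build_query": (["fragment"], [("q", "concat", ["fragment"])], "q"),
    "execute": (["sql"], [("r", "run_sql", ["sql"])], "r"),
    "handler": ([], [
        ("uid", "get_request_param", []),
        ("query", "build_query", ["uid"]),
        ("rows", "execute", ["query"]),
    ], "rows"),
}


def analyze_function(name, summaries):
    """Compute (origins of the return value, origins that reach a sink)."""
    params, body, returned = FUNCTIONS[name]
    # Abstract state: variable -> set of origins (a parameter index or SOURCE).
    env = {p: {i} for i, p in enumerate(params)}
    reaches_sink = set()
    for target, callee, args in body:
        arg_origins = [env.get(a, set()) for a in args]
        if callee in SOURCES:
            result = {SOURCE}
        elif callee in SINKS:
            for origins in arg_origins:
                reaches_sink |= origins
            result = set()
        elif callee in FUNCTIONS:
            # Apply the callee's summary instead of re-analyzing its body.
            ret, sink = summaries[callee]
            result = set()
            for o in ret:
                result |= {SOURCE} if o == SOURCE else arg_origins[o]
            for o in sink:
                if o != SOURCE:
                    reaches_sink |= arg_origins[o]
        else:
            # Unknown helper: assume its output depends on all of its inputs.
            result = set().union(*arg_origins)
        env[target] = result
    return frozenset(env.get(returned, set())), frozenset(reaches_sink)


def summarize():
    """Re-analyze every function until no summary changes (a fixed point)."""
    summaries = {name: (frozenset(), frozenset()) for name in FUNCTIONS}
    changed = True
    while changed:
        changed = False
        for name in FUNCTIONS:
            new = analyze_function(name, summaries)
            if new != summaries[name]:
                summaries[name] = new
                changed = True
    return summaries


def findings(summaries):
    """Functions in which user-controlled data reaches a sink."""
    return sorted(n for n, (_, sink) in summaries.items() if SOURCE in sink)


print(findings(summarize()))  # ['handler']
```

In this toy, build_query is summarized as "parameter 0 flows to the return value" and execute as "parameter 0 reaches a sink", so the flow in handler is found across two calls without running any of the code.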

In other words, Zoncolan was designed specifically to hunt security bugs, unlike similar tools that are generally developed to find a broader array of design and performance bugs.

This is one of the reasons why Zoncolan does not perform well when it comes to finding new types of vulnerabilities on its own.

"As with any system of this type, Zoncolan cannot find every possible issue. But it does allow us to find classes of issue that lend themselves well to detection via static analysis," explained Facebook.

For example, the tool couldn't spot the permission issues that breached Facebook users' private messages.

So Zoncolan does help Facebook in its bug-hunting efforts, but it is not a silver bullet.