Zero Trust Network Build, Osquery Fit & GitLab’s Real Life Roadmap

Kathy Wang, GitLab’s Sr. Director of Security, and Philippe Lafoucrière, a distinguished GitLab Engineer, recently presented “Towards Zero Trust at GitLab.com” at Google’s Cloud Next ‘19 event.

While a simple “zero trust” Google search will return a variety of educational resources on the topic, what I valued most about the GitLab team’s story was how pragmatically they break down the steps they took (and are still taking) along their Zero Trust journey. It’s also one of the few case studies showcasing a 100% cloud native application protection (CNAPP) organization working to implement the Zero Trust approach BEFORE a major security breach. Below, I’ll recap the core concepts of Kathy & Philippe’s talk, but you can also catch the full conversation in this YouTube video.

First, let’s start with a quick explanation of what Zero Trust is (or, as Google calls it, BeyondCorp).

What Is Zero Trust?

Cloudflare defines Zero Trust as:

“...an IT security model that requires strict identity verification for every person and device trying to access resources on a private network, regardless of whether they are sitting within or outside of the network perimeter.”

Kathy shares some additional perspective: traditional network security is heavily perimeter-based, hard on the outside and soft on the inside. When an attacker does inevitably gain access, they can move laterally, gain privileged access, and cause a lot of headaches. Not ideal.

Zero Trust means the device is authenticated and authorized, the user is authenticated and authorized, and decisions are risk-based and dynamic. In other words, rules are enforced so that each access request takes into account the context, the device used, the application and data requested, and the employee’s role in the organization.

For example, HR data might be available to HR, but only when using a corporate, managed system would an HR employee be able to access salary information. The same goes for system accounts accessing APIs and databases.
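The HR example can be sketched as a small default-deny policy check. This is only an illustration of the idea, not GitLab’s implementation: the `AccessRequest` fields, the `POLICY` table, and the resource names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Context considered for a single access decision (hypothetical fields)."""
    user_authenticated: bool
    device_authenticated: bool
    device_managed: bool  # True for a corporate, managed system
    role: str             # e.g. "hr" or "engineering"
    resource: str         # e.g. "hr_directory" or "salary_data"


# Hypothetical policy table: resource -> (allowed roles, managed device required)
POLICY = {
    "hr_directory": ({"hr"}, False),
    "salary_data": ({"hr"}, True),
}


def is_allowed(req: AccessRequest) -> bool:
    """Return True only if the user, the device, and the role all check out."""
    if not (req.user_authenticated and req.device_authenticated):
        return False
    rule = POLICY.get(req.resource)
    if rule is None:
        return False  # default deny: unknown resources are never accessible
    allowed_roles, needs_managed_device = rule
    if req.role not in allowed_roles:
        return False
    if needs_managed_device and not req.device_managed:
        return False
    return True
```

Under these assumptions, an HR employee on a personal laptop could read `hr_directory` but would be refused `salary_data` until they switched to a managed machine.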

Kathy Describes Zero Trust as:

Not a product, but a process (that involves multiple products, configurations, procedures, and people)
Not a new idea
Not built all at once

In fact, even with corporate backing, Kathy shared that the road to building a Zero Trust Network can easily take 9-12 months. Older companies with a large IT footprint may take much longer, though any incremental progress also reduces risk accordingly and is worth it on its own. Considering this investment in time and resources, it’s helpful to understand why GitLab (and others) see value in building a Zero Trust Network. Kathy cited GitLab’s reasons as:

Lateral movement is much harder
Stolen credentials are less valuable
Known vulnerabilities that are easy to exploit will be rarer
Non-targeted attacks have less value
GitLab has a 100% remote workforce (so people are connecting remotely from around the world)

For another in-depth perspective on what Zero Trust is, check out this comprehensive review of a Zero Trust Security Model provided by Akamai Technologies.

How Did GitLab Approach the Process?

This is where the pragmatic and approachable perspective comes into play. Kathy shares that, while the timeline and process for building a Zero Trust Network is long, it can also be broken up into buckets representing independent streams of work that can be built in parallel. Taking this bucketed approach to the build out allows speed and flexibility. We’ll get to GitLab’s defined buckets in a moment, but first, take a look at the foundational policies and systems that were put in place to support their Zero Trust build out:

Data Classification Policy - Identifies and governs what data is being stored/processed and what level of priority or sensitivity it is
GCP Security Guidelines - The org follows Google’s best practices
Internal Acceptable Use Policy
HRIS System - A database of who works here, what role they are in & what access they have
Homogeneous Endpoints - macOS and Linux only

With that context and foundation, Kathy and team then identified their three main buckets to help “wrap their heads around where to get started.” At GitLab, those buckets became:

Customer data - Anything that will process and store customer data and is centrally managed
Endpoints - User/employee laptops and devices, individually managed
Backend Infrastructure and third-party SaaS - Anything that does not process/store customer repo data (Slack, Zoom, Salesforce)

The buckets were then expanded into a roadmap to identify the critical components of each work path as shown in the diagram below.

Click here to skip ahead to this part of the video.

Processes, Policies & Technologies

Attacking the work across these three segments also required solving three major problems. Each is outlined below, along with the process, policy, and technologies required:

Problem 1: Managing User Identity Access

[Video Bookmark]

CSO.com defines identity and access management (IAM) as “defining and managing the roles and access privileges of individual network users and the circumstances in which users are granted (or denied) those privileges... The core objective of IAM systems is one digital identity per individual.” For GitLab, it was critical to build an IAM system that provided visibility into their macOS user endpoints and production Linux servers and could handle fast-paced onboarding and offboarding of staff. (Kathy shares that GitLab is on track to grow from ~200 to 1,000 employees over a two-year period!) GitLab’s IAM system is required to:

Verify endpoint integrity: Because they are a Linux and macOS shop, they looked deeply at Osquery and explored Uptycs and Kolide for deployment and management.
Verify access level aligns with role: This required a constantly updated org chart database or HRIS (Human Resource Information System)
Onboarding/offboarding of cloud services: Centralized SSO > Okta, Duo, Google Cloud Identity & Google Cloud Identity Secure LDAP
Minimize credential theft: U2F devices > Google’s Titan Security Keys
Enforce data classification policy: DLP solution > G Suite Enterprise
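To make the “verify endpoint integrity” item more concrete, here is a minimal sketch of how osquery results might feed a health check. It assumes the rows come from something like `osqueryi --json "<query>"`, which returns a JSON array of objects with string values. The two queries use the `disk_encryption` and `os_version` tables as I understand the osquery schema, so check them against your deployed version. The `endpoint_is_healthy` function and the minimum OS version are hypothetical, and this is not how Uptycs or Kolide evaluate endpoints.

```python
import json

# Queries one might schedule (assumed table and column names; verify locally).
DISK_ENCRYPTION_QUERY = "SELECT name, encrypted FROM disk_encryption;"
OS_VERSION_QUERY = "SELECT name, major, minor FROM os_version;"


def endpoint_is_healthy(disk_rows, os_rows, min_os_major):
    """Pass only if every reported volume is encrypted and the OS is recent enough.

    A real policy would likely filter disk_rows to the relevant volumes first.
    """
    if not disk_rows or not os_rows:
        return False  # no data means no trust
    all_encrypted = all(row.get("encrypted") == "1" for row in disk_rows)
    os_recent = all(int(row.get("major", "0")) >= min_os_major for row in os_rows)
    return all_encrypted and os_recent


# Sample data shaped like osquery JSON output (values are strings).
sample_disks = json.loads('[{"name": "/dev/disk1s1", "encrypted": "1"}]')
sample_os = json.loads('[{"name": "macOS", "major": "10", "minor": "14"}]')
healthy = endpoint_is_healthy(sample_disks, sample_os, min_os_major=10)
```

The result of a check like this could then be one of the inputs to the access decision, alongside the user’s identity and role.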

Problem 2: Securing GitLab Applications

[Video Bookmark]

Philippe helped to build a system that educated and empowered their developers to own the security of their applications and code, acknowledging that a secure network doesn’t mean much if they’re shipping insecure applications. His goal was to relieve the security team of that burden by “shifting security left” and making it a seamless part of the developers’ process. Much of what they’ve built below is also a part of their product vision and offering in GitLab Defend.

Here’s the process and tools their engineers and developers use to “Trust What’s Running In Production Environments”:

Secure GitLab - Scan every commit for security issues

SAST (Static Application Security Testing) > testing the source code
Dependency scanning
Container scanning + binauthz
DAST (Dynamic Application Security Testing) > testing the running application
Coming in 2019 > IAST and fuzzing

Trust What’s Deployed - Remove humans from the process

Ensure only trusted containers can be deployed
Sign and annotate images during CI phase (define an attestor policy)
Binary authorization (Grafeas and Kritis)
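The sign-then-verify flow above can be sketched in a few lines. This is a simplified stand-in, not the Grafeas/Kritis API: real binary authorization relies on attestations backed by asymmetric signatures, whereas this sketch uses an HMAC with a shared key purely to keep the example self-contained. The function names and the key are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def sign_digest(image_digest: str, attestor_key: bytes) -> str:
    """CI step: produce an attestation for an image digest (HMAC stand-in)."""
    return hmac.new(attestor_key, image_digest.encode("utf-8"), hashlib.sha256).hexdigest()


def may_deploy(image_digest: str, attestation: Optional[str], attestor_key: bytes) -> bool:
    """Deploy gate: allow only images whose attestation verifies for this digest."""
    if attestation is None:
        return False  # unsigned images are never deployed
    expected = sign_digest(image_digest, attestor_key)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, attestation)


# Hypothetical usage: CI signs the digest, the deploy step verifies it.
ci_key = b"example-attestor-key"  # placeholder; a real key would live in a KMS
digest = "sha256:" + hashlib.sha256(b"container image bytes").hexdigest()
attestation = sign_digest(digest, ci_key)
```

Because the attestation is bound to the image digest, rebuilding or tampering with the image invalidates it, which is what lets the deploy step run without a human in the loop.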

Dynamically Manage Keys

Google Key Management Service: Only specific users can request keys
Secrets divided up based on Chef role
JSON files stored and encrypted on GCS
Access restricted by environment
Keys are auto-rotated every 90 days
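As a rough illustration of the 90-day rotation rule, the check below flags keys whose age has reached the rotation period. A managed service such as Google KMS can typically handle the schedule itself, so this is only a sketch of the logic. The key names and the `keys_to_rotate` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=90)  # matches the 90-day rule in the list above


def rotation_due(created_at: datetime, now: datetime) -> bool:
    """A key is due once it has existed for the full rotation period."""
    return now - created_at >= ROTATION_PERIOD


def keys_to_rotate(keys: dict, now: datetime) -> list:
    """Return the names of keys due for rotation, sorted for stable output."""
    return sorted(name for name, created in keys.items() if rotation_due(created, now))


# Hypothetical inventory: key name -> creation time (UTC).
inventory = {
    "staging-db": datetime(2019, 1, 1, tzinfo=timezone.utc),
    "prod-api": datetime(2019, 3, 1, tzinfo=timezone.utc),
}
due = keys_to_rotate(inventory, now=datetime(2019, 4, 1, tzinfo=timezone.utc))
```

With these sample dates, `staging-db` is exactly 90 days old and is flagged, while `prod-api` is only 31 days old and is not.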

Audit Google Cloud Identity/StackDriver logs
Proxy combined with WAF
Audit git actions

Proactively Identify Compromised Accounts - Using machine learning and data analysis to identify compromise

For a refresher, here’s a helpful article describing the differences between SAST, DAST and IAST.

Problem 3: Securing GitLab infrastructure

[Video Bookmark]

GitLab.com has over 2 million users that trust their sensitive data to GitLab. It’s not only a customer environment, but also where GitLab itself manages its product repos. Securing the GitLab.com infrastructure is imperative. Using Google’s security best practices (and the Google Cloud Security Command Center) as a guidepost, here’s what their current process entails:

Vulnerability management - How do we ensure systems are patched in a timely manner? > Tenable.io
Asset management & ownership - Who owns what assets? > Uptycs/osquery
Mitigate abuse activities - Stopping DDoS, etc. > Fastly, Cloudflare
Blocking lateral movement - Using Google Virtual Private Clouds (VPC) enables compartmentalized access
Cloud policy automation - Forseti Security helps enforce our cloud policies

Taking It All In

It’s easy to appreciate how much work has gone into GitLab’s Zero Trust journey and the diligence that was required to plan, evaluate, implement and refine the various components. While Kathy and Philippe have outlined several granular actions/requirements, don’t lose sight of these higher-level takeaways when considering your own Zero Trust Network journey:

Seek internal buy-in/support
Gear up for a long-term project
Don’t underestimate foundational policies and procedures
Break the work into segments that can be built independently
Educate internally
Do this before a major breach

Check out the last few minutes of Kathy and Philippe’s presentation to hear about their lessons learned and advice for others considering a Zero Trust Network.

Learn more about Osquery:

Osquery: What it is, how it works, and how to use it
