By Niranjan Shankar, Software engineer at Microsoft Azure
The widespread migration of on-premises workloads to the cloud, particularly in the aftermath of the Covid-19 pandemic, has made cloud security a growing concern for stakeholders across industries. An accompanying uptick in cloud-based cyberattacks has compelled governments and enterprises to adopt “Zero-Trust” security frameworks in place of traditional perimeter-based network security. Whereas conventional security architectures assume that all communication originating from within a firewall is trustworthy, Zero-Trust models embrace a practice of “never trust, always verify” and restrict each user’s access to only those resources and privileges necessary to perform their designated functions.
For microservices, an increasingly popular architectural approach under which applications are broken down into individual services with specific functions, implementing Zero-Trust security and networking resiliency is vital but also challenging. In highly dynamic and heterogeneous environments with numerous service-to-service interactions and complex routing rules, building networking logic directly into each service and securing communication between workloads can be cumbersome. Issuing and rotating Transport Layer Security (TLS) certificates to authenticate and encrypt traffic is especially burdensome and error-prone when done manually.
A service mesh allows developers to shift the responsibilities of networking, security, and observability into a separate, dedicated layer of infrastructure, known as the “mesh,” and to focus instead on the business functions of their microservices, with minimal or no changes to their application code. The data plane of a service mesh consists of proxies (or “sidecars”) injected alongside each application instance and is responsible for securing and controlling traffic throughout the mesh. The control plane manages the data plane by dynamically configuring the proxies based on specifications (usually custom resources) provided by the operator.
Source: Posta, Christian E., and Rinor Maloku. “Chapter 1 – Introducing the Istio Service Mesh.” Istio in Action, Manning Publications Co, Shelter Island, NY, 2022.
Because all traffic flows through each service’s sidecar, the service mesh can control both ends of network communication between applications, collect rich metrics and signals regarding the behavior of the network, encrypt communication between services, and effectively enforce authentication and authorization policies. The control plane also distributes and rotates TLS certificates for workloads in the mesh, thereby automating certificate management and greatly simplifying the work of implementing a Zero-Trust framework.
Zero-Trust Security with the Istio Service Mesh
Istio is a widely used, open-source service mesh, written in Go, that is designed to solve these networking and security challenges for cloud-native microservice applications running on Kubernetes. Istio injects Envoy, an application-layer proxy, as a sidecar alongside each service instance; Envoy implements the mesh’s networking resiliency, traffic management, and security capabilities.
By default, all traffic between services onboarded to the mesh is encrypted and secured with mutual Transport Layer Security (mTLS), a protocol in which the client’s certificate is also validated during the TLS handshake. When Istio is installed onto the cluster, it generates a self-signed certificate for istiod (the control plane service) to serve as the root certificate for services in the mesh. When a mesh workload is started, an “agent” colocated with the Envoy proxy issues a certificate signing request (CSR) on behalf of the workload to istiod’s built-in certificate authority (CA). After validating the CSR, istiod’s CA issues a digitally signed X.509 certificate, in the form of a SPIFFE Verifiable Identity Document (SVID), to the agent, which passes it to the sidecar proxy via Envoy’s Secret Discovery Service (SDS) API. These provisioned certificates are used for service-to-service mTLS authentication.
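To make the identity carried by these certificates concrete: Istio encodes each workload’s identity as a SPIFFE URI of the form spiffe://<trust domain>/ns/<namespace>/sa/<service account>. The Python sketch below builds and parses such an identifier. The function names are hypothetical helpers rather than part of any Istio API, and the `cluster.local` trust domain (Istio’s usual default) and `webapp` service account are illustrative assumptions.

```python
from urllib.parse import urlparse


def spiffe_id(trust_domain: str, namespace: str, service_account: str) -> str:
    """Build an Istio-style SPIFFE ID for a workload's service account."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def parse_spiffe_id(uri: str) -> dict:
    """Split an Istio-style SPIFFE ID into its parts, or raise ValueError."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    well_formed = (
        parsed.scheme == "spiffe"
        and parsed.netloc != ""
        and len(parts) == 4
        and parts[0] == "ns"
        and parts[2] == "sa"
        and parts[1] != ""
        and parts[3] != ""
    )
    if not well_formed:
        raise ValueError(f"not an Istio-style SPIFFE ID: {uri!r}")
    return {
        "trust_domain": parsed.netloc,
        "namespace": parts[1],
        "service_account": parts[3],
    }


if __name__ == "__main__":
    # Hypothetical workload: service account "webapp" in namespace "default".
    print(parse_spiffe_id(spiffe_id("cluster.local", "default", "webapp")))
```

Because the identity is derived from the namespace and service account rather than from an IP address, it stays stable as workloads are rescheduled, which is what makes it usable in authorization rules.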
While traffic between services in the mesh injected with the Envoy proxy is mutually authenticated and encrypted by default, mesh workloads are initially configured to also accept plaintext traffic from, and send outgoing plaintext traffic to, services outside of the mesh. This is to make it easier for large businesses managing numerous applications to onboard them to the mesh. Thus, operators will need to take additional steps to strictly enforce that workloads only communicate via mTLS.
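As a minimal sketch of those additional steps, the Python snippet below generates a mesh-wide PeerAuthentication resource in STRICT mode, together with a DestinationRule that makes sidecars originate Istio mutual TLS. It assumes the root namespace is `istio-system` (the usual default) and that the `v1beta1` API versions are served by the installed Istio release; the helper function names are hypothetical. Recent Istio releases typically select mTLS for in-mesh destinations automatically, so the DestinationRule is shown only to make the client side explicit.

```python
import json

ROOT_NAMESPACE = "istio-system"  # assumption: Istio's default root namespace


def strict_peer_authentication(namespace: str = ROOT_NAMESPACE) -> dict:
    """PeerAuthentication with no selector: covers every workload in
    `namespace`, or the whole mesh when `namespace` is the root namespace."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": "STRICT"}},  # reject plaintext connections
    }


def mutual_tls_destination_rule(host: str = "*.local",
                                namespace: str = ROOT_NAMESPACE) -> dict:
    """DestinationRule telling client sidecars to send Istio mTLS to `host`."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "DestinationRule",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {
            "host": host,
            "trafficPolicy": {"tls": {"mode": "ISTIO_MUTUAL"}},
        },
    }


def as_manifest_list(*resources: dict) -> str:
    """Wrap resources in a Kubernetes List and serialize it as JSON."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": list(resources)},
        indent=2,
    )


if __name__ == "__main__":
    print(as_manifest_list(strict_peer_authentication(),
                           mutual_tls_destination_rule()))
```

The JSON output can be saved to a file and applied with `kubectl apply -f`. Teams onboarding gradually may prefer to call `strict_peer_authentication` with one namespace at a time rather than the root namespace.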
Istio also leverages Envoy’s out-of-the-box role-based access control (RBAC) to provide a simple and declarative API to enforce fine-grained authorization policies. Though requests to workloads that don’t have any authorization policies specified are permitted by default, a mesh-wide catch-all policy can easily be configured to deny all incoming traffic that isn’t explicitly allowed. If needed, mesh operators can use an external authorization service to enforce access control for their applications.
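The catch-all policy described above can be sketched as follows. An AuthorizationPolicy with an empty spec in the root namespace allows nothing mesh-wide, and narrower ALLOW policies then open up specific paths. The `istio-system` root namespace, the `catalog` app label, the `webapp` service account, and the helper names are illustrative assumptions, not values from any particular deployment.

```python
import json


def deny_all_policy(root_namespace: str = "istio-system") -> dict:
    """Mesh-wide catch-all: an ALLOW policy with an empty spec matches no
    request, so anything not allowed by another policy is denied."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "allow-nothing", "namespace": root_namespace},
        "spec": {},
    }


def allow_get_from(principal: str, app_label: str, namespace: str) -> dict:
    """Allow GET requests to workloads labeled app=<app_label>, but only
    from the given mTLS-authenticated service identity."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": f"allow-get-{app_label}", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app_label}},
            "action": "ALLOW",
            "rules": [
                {
                    "from": [{"source": {"principals": [principal]}}],
                    "to": [{"operation": {"methods": ["GET"]}}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical services: "webapp" may read from "catalog"; nothing else may.
    policies = [
        deny_all_policy(),
        allow_get_from("cluster.local/ns/default/sa/webapp", "catalog", "default"),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": policies},
                     indent=2))
```

Note that the `principals` field relies on the peer identity established by mTLS, so rules like this are only meaningful for traffic that is mutually authenticated.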
In addition to securing service-to-service communication, Istio supports end-user authentication and authorization with JSON Web Tokens (JWTs). This is typically performed at an ingress gateway at the edge of the mesh, which receives incoming HTTP and TCP connections and can be configured to keep requests with invalid tokens from entering the cluster. Istio authorization policies can be set to deny requests without tokens and to permit different levels of access based on JWT claims, for instance granting regular users read access to a resource while allowing administrators write-level access.
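A rough sketch of that end-user flow, under stated assumptions, is shown below. A RequestAuthentication resource on its own rejects requests that carry invalid tokens but still admits requests with no token at all, so it is paired with a DENY policy for tokenless requests and an ALLOW policy keyed on a claim. The issuer and JWKS URLs, the `group` claim and its `admin` value, the `istio: ingressgateway` label, and the helper names are all hypothetical placeholders.

```python
import json

GATEWAY_SELECTOR = {"matchLabels": {"istio": "ingressgateway"}}  # assumed label
API_VERSION = "security.istio.io/v1beta1"


def jwt_request_authentication(issuer: str, jwks_uri: str,
                               namespace: str = "istio-system") -> dict:
    """Validate JWTs from `issuer`; requests with invalid tokens are rejected."""
    return {
        "apiVersion": API_VERSION,
        "kind": "RequestAuthentication",
        "metadata": {"name": "ingress-jwt", "namespace": namespace},
        "spec": {
            "selector": GATEWAY_SELECTOR,
            "jwtRules": [{"issuer": issuer, "jwksUri": jwks_uri}],
        },
    }


def require_jwt_policy(namespace: str = "istio-system") -> dict:
    """Deny any request that has no authenticated request principal."""
    return {
        "apiVersion": API_VERSION,
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "require-jwt", "namespace": namespace},
        "spec": {
            "selector": GATEWAY_SELECTOR,
            "action": "DENY",
            "rules": [{"from": [{"source": {"notRequestPrincipals": ["*"]}}]}],
        },
    }


def claims_based_policy(namespace: str = "istio-system") -> dict:
    """Any authenticated user may read; only the admin group may write."""
    authenticated = [{"source": {"requestPrincipals": ["*"]}}]
    return {
        "apiVersion": API_VERSION,
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "read-vs-write", "namespace": namespace},
        "spec": {
            "selector": GATEWAY_SELECTOR,
            "action": "ALLOW",
            "rules": [
                {"from": authenticated,
                 "to": [{"operation": {"methods": ["GET"]}}]},
                {"from": authenticated,
                 "to": [{"operation": {"methods": ["POST", "PUT", "DELETE"]}}],
                 "when": [{"key": "request.auth.claims[group]",
                           "values": ["admin"]}]},
            ],
        },
    }


if __name__ == "__main__":
    items = [
        jwt_request_authentication("https://issuer.example.com",
                                   "https://issuer.example.com/jwks.json"),
        require_jwt_policy(),
        claims_based_policy(),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items},
                     indent=2))
```

Attaching all three resources to the gateway keeps unauthenticated traffic out of the cluster entirely; the same policies could instead be scoped to individual workloads by changing the selector.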
The following are the specific Istio custom resources that mesh operators can use to configure these authentication and authorization policies:
- PeerAuthentication: Set what type of traffic (mTLS vs plaintext) will be accepted by the sidecar.
- RequestAuthentication: Configure workloads to perform end-user authentication based on the JWT carried by each request.
- DestinationRule: Set what type of traffic (plaintext vs TLS vs mTLS) will be sent by the sidecar.
- AuthorizationPolicy: Define access control for workloads in the mesh and, optionally, delegate decisions to an external authorization service.
Source: Posta, Christian E., and Rinor Maloku. “Chapter 9 – Securing Microservice Communication.” Istio in Action, Manning Publications Co, Shelter Island, NY, 2022.
By automating certificate management, facilitating end-to-end traffic encryption, and mutually authenticating mesh workloads, Istio significantly streamlines the process of securing cloud-native microservice environments. Nonetheless, developers still need to take additional steps to maximize the mesh’s security. To strictly enforce mTLS across the mesh, deny requests without JWTs, and permit only traffic that is explicitly authorized, the aforementioned custom resources need to be configured accordingly. Operators also need to verify that they haven’t misconfigured their environment, as such errors can leave mesh workloads vulnerable to attacks from malicious actors. Finally, enterprises should continue to follow traditional application and network security guidelines, such as regularly scanning their application images for vulnerabilities and patching them, using up-to-date versions of software packages, and limiting the attack surface of cloud deployments through firewall rules and Identity and Access Management (IAM).
If organizations implement these best practices and take advantage of Istio’s ease of use and its networking and security features, their journey to Zero-Trust security will become a whole lot easier.
Niranjan Shankar is a software engineer at Microsoft Azure working on the Azure Kubernetes Service team. He has worked on both Azure-managed and open-source service mesh offerings.