Introduction to Istio for Non-Infrastructure Engineers
Hello. I'm Narazaki from the Woven Payment Solution Development Group. We are involved in the development of the payment infrastructure application used by Woven by Toyota at Toyota Woven City, and are developing payment-related functions across the backend, web frontend, and mobile application. Within this project, I am mainly responsible for the development of backend applications. The payment backend we are developing contains microservices and runs on a Kubernetes-based platform called City Platform.
In this article, I would like to introduce Istio, a mechanism for setting up microservice networks on Kubernetes. My aim is to explain its purpose and functions in an easy-to-understand manner for backend application engineers who are used to writing business logic. I hope this article deepens your understanding of configurations that use Istio, helps you isolate the cause of issues during troubleshooting, and facilitates smooth communication with infrastructure and network engineers.
What is Istio?
In a microservices architecture, a single operation spans multiple services, so communication between those services becomes a cost that has to be managed. As application engineers, we tend to think that it doesn't matter as long as things connect, but infrastructure engineers want to control the network layer effectively.
That is why Istio was created: to manage settings such as network routing and security centrally and declaratively, much like Kubernetes manifests, and to provide integrated operational monitoring of network status. Because the network is structured like a mesh, these functions are collectively referred to as a service mesh.
Istio Architecture: Data Plane and Control Plane
First, it's essential to understand the architecture of Istio. Like Kubernetes, Istio is divided into a control plane and a data plane. In Kubernetes, the control plane receives API requests from kubectl and other clients and manages resources such as pods, while the data plane is the pods where the applications actually run.
Istio’s data plane employs a network proxy called Envoy. If necessary, the control plane injects Envoy as a sidecar container next to the container where our code runs.
Figure: Control Plane and Data Plane
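As a concrete illustration of the sidecar injection described above, automatic injection is commonly switched on per namespace with a label. The following is a minimal sketch; the namespace name test is a placeholder, and your platform may manage injection differently (for example, with revision labels).

```yaml
# Minimal sketch: a namespace labeled for automatic sidecar injection.
# The name "test" is an assumption for illustration.
apiVersion: v1
kind: Namespace
metadata:
  name: test
  labels:
    istio-injection: enabled # pods created in this namespace get an Envoy sidecar
```

Pods that were already running before the label was added do not get a sidecar until they are recreated.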
Why Do We Need Istio?
Envoy is a network proxy application that can run independently. Its configuration items are so varied that configuring even a single Envoy instance as intended is not easy (at least for non-infrastructure engineers!).
In a complex microservices architecture, the network is like a mesh connecting the inside and outside of the cluster, which requires configuring numerous Envoy proxies. It is easy to imagine how hard it would be to set up each one individually and make everything work the way you want.
Resources Configurable in Istio
Introducing Istio makes the following features available:
- Traffic management (service discovery, load balancing)
- Observability (logging, distributed tracing)
- Security such as authentication and authorization
On the other hand, in our experience, backend application engineers often see each of these as a black box: we do not know what is actually configurable, or which configuration file to look at when we encounter unintended behavior. Let's take a look at what Istio's configurable resources mean in practice.
Gateway
Traffic entering and leaving a Kubernetes cluster is referred to as ingress and egress. Istio handles this traffic with an Envoy proxy called a gateway, which is literally the gateway to the Istio network. It is configured with a manifest like the one below. The manifest itself rarely contains detailed settings that application engineers need to know, but other resources often reference it through their gateways field, so make sure the Gateway is configured properly first.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: test-gateway
spec:
  selector:
    istio: ingressgateway # LoadBalancer service available by default when Istio is installed
  servers:
    - port:
        number: 80 # listening port
        name: http
        protocol: HTTP # allowed protocols
      hosts:
        - "*" # host name
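For comparison, a Gateway that accepts HTTPS instead of plain HTTP could look like the sketch below. The resource name, the host name app.example.com, and the secret name test-gateway-cert are hypothetical; the secret is assumed to be a TLS secret that exists alongside the ingress gateway.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: test-gateway-https # hypothetical name
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 443 # listening port for TLS traffic
        name: https
        protocol: HTTPS
      tls:
        mode: SIMPLE # the gateway terminates TLS with a server certificate
        credentialName: test-gateway-cert # hypothetical Kubernetes secret holding the certificate
      hosts:
        - "app.example.com" # hypothetical host name
```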
Virtual Service
Kubernetes has a mechanism called Service that allows Deployments and StatefulSets to be accessed from the network inside the cluster. Istio's Virtual Service, in turn, defines the routes to those Services.
While it is powerful because a very large number of settings can be defined, take care to avoid duplication or conflicts with other settings. If requests are not reaching a service, there may be a mistake in the Virtual Service configuration. The istioctl analyze command may report configuration errors, so it is worth running.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: test-virtualservice
  namespace: test
spec:
  hosts:
    - "*" # host name the rules below apply to; "*" applies them to any host name
  gateways:
    - test-gateway # the Gateway defined above; multiple gateways can be listed
    - mesh # add 'mesh' to allow intracluster communication that does not go through a Gateway
  http:
    - match: # conditions for filtering requests
        - uri:
            prefix: /service-a # URI pattern; exact match, regex, etc. can also be selected
      route:
        - destination:
            host: service-a # destination service
            port:
              number: 80
    - match: # multiple routing rules and destinations can be defined
        - uri:
            prefix: /service-b
      route:
        - destination:
            host: service-b
            port:
              number: 80
  exportTo:
    - "." # Kubernetes namespaces this rule is exported to; "." means only the namespace where the rule is defined
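To illustrate the other match types and the ability to define multiple destinations, here is a hedged sketch of a rule that matches a URI by regular expression and splits traffic between two services by weight. The resource name, the host name, and the services service-a-v1 and service-a-v2 are hypothetical and not taken from our environment.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: weighted-virtualservice # hypothetical name
  namespace: test
spec:
  hosts:
    - app.example.com # hypothetical host name
  gateways:
    - test-gateway
  http:
    - match:
        - uri:
            regex: "^/service-a/v[0-9]+/.*" # regular-expression match instead of prefix
      route:
        - destination:
            host: service-a-v1 # hypothetical Service receiving most of the traffic
            port:
              number: 80
          weight: 90
        - destination:
            host: service-a-v2 # hypothetical Service receiving the remainder
            port:
              number: 80
          weight: 10
```

Rules in the http list are evaluated in order and the first match wins, so more specific matches should be placed before broader ones.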
Authorization Policy
An Authorization Policy controls communication between specific services. Rules can be defined in detail by protocol, request path, HTTP method, and so on, so application engineers may have many opportunities to configure them. On the other hand, misconfigured rules and unexpected pitfalls are common, so be sure to run tests after configuring.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: access-allow-policy
  namespace: test
spec:
  selector:
    matchLabels:
      app: some-application # label on the pod
  action: ALLOW # permission rule
  rules:
    - from: # define the source of the request
        - source:
            principals:
              - cluster.local/ns/test/sa/authorized-service-account # Kubernetes service account
      to: # request receiver definition
        - operation:
            methods: ["POST"] # HTTP methods allowed
            paths:
              - "/some-important-request" # permitted endpoints
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-policy
  namespace: test
spec:
  selector:
    matchLabels:
      app: some-application
  action: DENY # example of denying a request
  rules:
    - to:
        - operation:
            paths: ["/forbidden-path"]
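One pitfall worth illustrating: as soon as an ALLOW policy selects a workload, any request that matches none of its ALLOW rules is denied, and DENY policies are evaluated before ALLOW policies. A common way to make this explicit is a policy with an empty spec, which allows nothing in its namespace, so every permitted path then has to be opened with a separate ALLOW policy. The sketch below assumes the test namespace used above; the policy name is hypothetical.

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-nothing # hypothetical name
  namespace: test
spec: {} # no selector and no rules: applies to every workload in the namespace and matches no request
```

Note also that rules based on principals rely on mutual TLS between the workloads, so a rule like the one in access-allow-policy cannot match plain-text traffic.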
Other Settings
There are other resources and configuration files as well, but I will omit them here. Just like Kubernetes resources, each has a schema that defines its own configuration items. If your team uses a particular resource, it is a good idea to check its documentation once to see what can be configured.
Specific Examples of Common Debugging and Troubleshooting
First of all, make sure there are no mistakes in the configuration. Running the istioctl analyze command reports most misconfigurations as errors. If RBAC is enforced, as in a production environment, and access to Istio-related resources is restricted, ask an authorized infrastructure engineer to run it.
If no misconfiguration is reported, check how far the request gets. Look at the application and sidecar logs to see whether communication breaks at the gateway or somewhere before the application pods. If the request appears to pass through the gateway, it is a good idea to check the sidecar container logs of the destination pod in its namespace, for example with kubectl logs <pod-name> -c istio-proxy -n <namespace>.
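Whether the istio-proxy container prints a line per request depends on whether Envoy access logging is enabled in your mesh. If the sidecar logs look empty, one option is the Telemetry resource, sketched below under the assumption that your Istio version supports the Telemetry API and that you are allowed to apply it to the test namespace; the resource name is hypothetical.

```yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: access-logging # hypothetical name
  namespace: test # scope: workloads in this namespace only
spec:
  accessLogging:
    - providers:
        - name: envoy # built-in provider that writes Envoy access logs to the proxy's standard output
```

In a shared cluster, this kind of setting is usually owned by the infrastructure team, so it is worth asking them before applying it.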
For intracluster communication, you can run curl inside a container. However, recent Docker base images often omit tools that the application does not need at runtime, so attach a debugging container, for example with kubectl debug <pod-name> -n <namespace> -it --image=curlimages/curl:latest -- /bin/sh, and check whether names can be resolved inside the cluster.
If communication is being blocked, check the Virtual Service manifest. If there is a problem with authentication or authorization, refer to the Authorization Policy manifest to locate the misconfiguration. Routing and authorization are areas where settings made in multiple layers easily conflict. You can list the authorization rules applied to a pod with the istioctl x authz check <pod-name>.<namespace> command.
In addition, what seems like a network error at first glance often turns out to be an implementation problem. At the same time, the implementation side should also review the network and authentication/authorization settings.
The following is what I do when I run into network-related errors.
- Isolate the cause by running the istioctl analyze command or checking the logs to see whether the Istio configuration is incorrect.
- Check the network communication from inside and outside the cluster using the curl and kubectl debug commands.
- Check the application configuration, such as whether the deployed application listens for requests on the port specified by the infrastructure layer.
- Check the request to see whether the client application implements the required authentication and authorization mechanisms.
If an observability stack such as Kiali is enabled, misconfigurations and communication status can also be checked through a GUI.
Conclusion
By learning about the specific configurable items and their meanings, I hope you gained insight into some of the functions that were black-boxed. Also, some of you may have realised that the configuration items are surprisingly simple.
On the other hand, I believe the difficulty of Istio lies not in the network configuration itself but in the production operation phase, such as keeping operations continuously stable (applying version patches and verifying behavior each time). As a backend application engineer, I would like to further understand the behavior of Istio and test the application's performance under actual operational conditions.