Security Problems
End-user authentication fails
With Istio, you can enable authentication for end users. Currently, the end user credential supported by the Istio authentication policy is JWT. The following is a guide for troubleshooting the end user JWT authentication.
Check your Istio authentication policy: principalBinding should be set to USE_ORIGIN to authenticate the end user.
If jwksUri isn’t set, make sure the JWT issuer is a URL and that url + /.well-known/openid-configuration can be opened in a browser. For example, if the JWT issuer is https://accounts.google.com, make sure https://accounts.google.com/.well-known/openid-configuration is a valid URL and can be opened in a browser.
If the JWT token is placed in the Authorization header in HTTP requests, make sure the JWT token is valid (not expired, etc.). The fields in a JWT token can be decoded by using online JWT parsing tools, e.g., jwt.io.
Get the Istio proxy (i.e., Envoy) logs to verify that the configuration Pilot distributes is correct.
For example, if the authentication policy is enforced on the httpbin service in the namespace foo, use the command below to get logs from the Istio proxy, then make sure local_jwks is set and the HTTP response code appears in the Istio proxy logs.
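Separately from the proxy log check, the issuer and token checks in the steps above can be done offline instead of with an online parser. The following is a minimal sketch, not part of Istio: the helper names discovery_url, decode_jwt_payload, and is_expired are hypothetical, and the code only decodes the token payload without verifying its signature.

```python
import base64
import json
import time
from typing import Optional


def discovery_url(issuer: str) -> str:
    """Return the OpenID discovery URL derived from a JWT issuer URL."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a JWT. The signature is NOT verified."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def is_expired(payload: dict, now: Optional[float] = None) -> bool:
    """Return True if the payload carries an exp claim that is not in the future."""
    now = time.time() if now is None else now
    return "exp" in payload and payload["exp"] <= now
```

For a token taken from the Authorization header, is_expired(decode_jwt_payload(token)) reports whether the exp claim has passed, and discovery_url(payload["iss"]) gives the URL that should open in a browser when jwksUri isn’t set.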
Authorization is too restrictive
When you first enable authorization for a service, all requests are denied by default. After you add one or more authorization policies, then matching requests should flow through. If all requests continue to be denied, you can try the following:
Make sure there is no typo in your policy YAML file.
Avoid enabling authorization for Istio control plane components, including Mixer, Pilot, and Ingress. Istio authorization policy is designed for authorizing access to services in the Istio mesh. Enabling it for Istio control plane components may cause unexpected behavior.
Make sure that your ServiceRoleBinding and referred ServiceRole objects are in the same namespace (by checking the “metadata”/“namespace” line).
Make sure that your service role and service role binding policies don’t use any HTTP-only fields for TCP services. Otherwise, Istio ignores the policies as if they didn’t exist.
In a Kubernetes environment, make sure all services in a ServiceRole object are in the same namespace as the ServiceRole itself. For example, if a service in a ServiceRole object is a.default.svc.cluster.local, the ServiceRole must be in the default namespace (the metadata/namespace line should be default). For non-Kubernetes environments, all ServiceRoles and ServiceRoleBindings for a mesh should be in the same namespace.
Visit Ensure Authorization is Enabled Correctly to find out the exact cause.
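The Kubernetes namespace rule above can be checked mechanically once a ServiceRole has been loaded into a dictionary (for example from its YAML). This is a rough sketch under assumptions: the function name services_outside_namespace is hypothetical, the input follows the metadata/namespace and spec/rules/services layout of a ServiceRole, and only fully qualified names of the form name.namespace.svc... are inspected.

```python
def services_outside_namespace(service_role: dict) -> list:
    """Return the services whose namespace differs from the ServiceRole's own namespace."""
    namespace = service_role["metadata"]["namespace"]
    mismatched = []
    for rule in service_role.get("spec", {}).get("rules", []):
        for service in rule.get("services", []):
            labels = service.split(".")
            # Only <name>.<namespace>.svc... names are checked; entries
            # such as "*" carry no namespace label and are skipped.
            if len(labels) >= 3 and labels[2] == "svc" and labels[1] != namespace:
                mismatched.append(service)
    return mismatched
```

An empty result means every fully qualified service in the role lives in the role’s namespace. Any returned name points at a rule that violates the rule described above.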
Authorization is too permissive
If authorization checks are enabled for a service and yet requests to the service aren’t being blocked, then authorization was likely not enabled successfully. To verify, follow these steps:
Check the authorization concept documentation to correctly apply Istio authorization.
Avoid enabling authorization for Istio control plane components, including Mixer, Pilot, and Ingress. The Istio authorization features are designed for authorizing access to services in an Istio mesh. Enabling the authorization features for the Istio control plane components can cause unexpected behavior.
In your Kubernetes environment, check deployments in all namespaces to make sure there is no legacy deployment left that can cause an error in Pilot. You can disable Pilot’s authorization plug-in if there is an error pushing authorization policy to Envoy.
Visit Ensure Authorization is Enabled Correctly to find out the exact cause.
Ensure authorization is enabled correctly
The ClusterRbacConfig default cluster-level singleton custom resource controls the authorization functionality globally.
Run the following command to list existing ClusterRbacConfig:
Verify there is only one instance of ClusterRbacConfig with name default. Otherwise, Istio disables the authorization functionality and ignores all policies.
If there is more than one ClusterRbacConfig instance, remove any additional ClusterRbacConfig instances and ensure only one instance is named default.
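The singleton rule above can also be expressed as a small check over a listing of ClusterRbacConfig objects. This is only an illustrative sketch: check_cluster_rbac_config is a hypothetical helper, and it assumes the listing has the usual Kubernetes list shape, an items array whose entries carry metadata/name.

```python
def check_cluster_rbac_config(listing: dict) -> list:
    """Return a list of problems; an empty list means the singleton rule holds."""
    names = [item["metadata"]["name"] for item in listing.get("items", [])]
    problems = []
    if len(names) != 1:
        problems.append(f"expected exactly 1 ClusterRbacConfig, found {len(names)}")
    if "default" not in names:
        problems.append("no ClusterRbacConfig is named default")
    return problems
```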
Ensure Pilot accepts the policies
Pilot converts and distributes your authorization policies to the proxies. The following steps help you ensure Pilot is working as expected:
Run the following command to export the Pilot ControlZ:
Verify you see the following output:
Start your browser and open the ControlZ page at http://127.0.0.1:9876/scopez/.
Change the rbac Output Level to debug.
Use Ctrl+C in the terminal you started in step 1 to stop the port-forward command.
Print the log of Pilot and search for rbac with the following command:
Check the output and verify:
- There are no errors.
- There is a "built filter config for ..." message which means the filter is generated for the target service.
For example, you might see something similar to the following:
It means Pilot generated:
An empty config for sleep.foo.svc.cluster.local, as no authorization policy matched, so Istio denies all requests sent to this service by default.
A config for productpage.default.svc.cluster.local that allows anyone to access it with the GET method.
Ensure Pilot distributes policies to proxies correctly
Pilot distributes the authorization policies to proxies. The following steps help you ensure Pilot is working as expected:
Run the following command to get the proxy configuration dump for the productpage service:
Check the log and verify:
- The log includes an envoy.filters.http.rbac filter to enforce the authorization policy on each incoming request.
- Istio updates the filter accordingly after you update your authorization policy.
The following output means the proxy of productpage has enabled the envoy.filters.http.rbac filter with rules that allow anyone to access it via the GET method. The shadow_rules are not used and you can safely ignore them.
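A configuration dump is large, so looking for the filter by eye is error-prone. The sketch below searches a dump that has already been parsed from JSON for entries named envoy.filters.http.rbac. The function find_filters is a hypothetical helper, and it makes no assumption about where in the dump the filter sits: it walks every nested object and list.

```python
def find_filters(node, name="envoy.filters.http.rbac"):
    """Recursively collect every JSON object whose "name" field equals the given name."""
    found = []
    if isinstance(node, dict):
        if node.get("name") == name:
            found.append(node)
        # Keep descending: filters are nested inside listeners and filter chains.
        for value in node.values():
            found.extend(find_filters(value, name))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_filters(item, name))
    return found
```

An empty result suggests the filter was not distributed to the proxy. Comparing the returned objects before and after a policy change shows whether the filter was updated.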
Ensure proxies enforce policies correctly
Proxies eventually enforce the authorization policies. The following steps help you ensure the proxy is working as expected:
Turn on the authorization debug logging in proxy with the following command:
Verify you see the following output:
Visit the productpage in your browser to generate some logs.
Print the proxy logs with the following command:
Check the output and verify:
- The output log shows either enforced allowed or enforced denied depending on whether the request was allowed or denied respectively.
- Your authorization policy expects the data extracted from the request.
The following output means there is a GET request at path /productpage and the policy allows the request. The shadow denied has no effect and you can ignore it safely.
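When the proxy log is long, a quick tally of the two decision strings mentioned above helps confirm that requests are being evaluated at all. This is a minimal sketch: count_decisions is a hypothetical helper that relies only on the enforced allowed and enforced denied substrings, not on any particular log line layout.

```python
from collections import Counter


def count_decisions(lines):
    """Count log lines containing 'enforced allowed' or 'enforced denied'."""
    counts = Counter()
    for line in lines:
        if "enforced allowed" in line:
            counts["allowed"] += 1
        elif "enforced denied" in line:
            counts["denied"] += 1
        # Lines that only mention "shadow denied" are intentionally not counted.
    return counts
```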
Keys and certificates errors
If you suspect that some of the keys and/or certificates used by Istio aren’t correct, the first step is to ensure that Citadel is healthy.
You can then verify that Citadel is actually generating keys and certificates:
Where my-ns and my-sa are the namespace and service account your pod is running as.
If you want to check the keys and certificates of other service accounts, you can run the following command to list all secrets for which Citadel has generated a key and certificate:
Then check that the certificate is valid with:
Make sure the displayed certificate contains valid information. In particular, the Subject Alternative Name field should be URI:spiffe://cluster.local/ns/my-ns/sa/my-sa.
If this is not the case, it is likely that something is wrong with your Citadel. Try to redeploy Citadel and check again.
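The expected Subject Alternative Name follows directly from the namespace and service account, so the comparison can be scripted. The sketch below is illustrative only: expected_spiffe_uri and san_matches are hypothetical helpers, the trust domain defaults to cluster.local as in the example above, and the SAN field is assumed to be a comma-separated string of entries such as URI:spiffe://...

```python
def expected_spiffe_uri(namespace, service_account, trust_domain="cluster.local"):
    """Build the SPIFFE URI expected for a workload's service account."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def san_matches(san_field, namespace, service_account):
    """Return True if the SAN field contains the expected URI entry."""
    expected = "URI:" + expected_spiffe_uri(namespace, service_account)
    entries = [entry.strip() for entry in san_field.split(",")]
    return expected in entries
```

For the example above, san_matches("URI:spiffe://cluster.local/ns/my-ns/sa/my-sa", "my-ns", "my-sa") returns True.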
Finally, you can verify that the key and certificate are correctly mounted by your sidecar proxy at the directory /etc/certs. You can use this command to check:
Optionally, you could use the following command to check its contents:
Mutual TLS errors
If you suspect problems with mutual TLS, first ensure that Citadel is healthy, and second ensure that keys and certificates are being delivered to sidecars properly.
If everything appears to be working so far, the next step is to verify that the right authentication policy is applied and the right destination rules are in place.
Citadel is not behaving properly
Citadel is not a critical data plane component. The default workload certificate lifetime is 3 months. Certificates will be rotated by Citadel before they expire. If Citadel is disabled for short maintenance periods, existing mutual TLS traffic will not be affected.
If you suspect Citadel isn’t working properly, verify the status of the istio-citadel pod:
If the istio-citadel pod doesn’t exist, try to re-deploy the pod.
If the istio-citadel pod is present but its status is not Running, run the commands below to get more debugging information and check if there are any errors:
If you want to check the certificate lifetime of a workload (with the default service account in the default namespace):