Exploring Kubernetes Health Probes


Probes Mechanism

Kubernetes provides three commonly used probe mechanisms. Any of them can back a liveness, readiness, or startup check:

HTTP GET Probe: This probe performs an HTTP GET request to a specified path and port on the container. If the response code is between 200 and 399, the probe is considered successful.

TCP Socket Probe: This probe tries to open a TCP connection to the specified port on the container. If the connection is established, the probe is considered successful.

Exec Probe: This probe runs a specified command inside the container. If the command exits with a status code of 0, the probe is considered successful.
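
The HTTP GET form appears in the examples throughout this post. For the other two mechanisms, a minimal sketch might look like the following. The port 5432 and the /tmp/healthy file are illustrative assumptions, not values from a real application.

```yaml
# TCP socket check: succeeds if a connection to the port can be opened.
readinessProbe:
  tcpSocket:
    port: 5432            # assumed port, for illustration only
  periodSeconds: 10

# Exec check: succeeds if the command exits with status 0.
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy        # assumed marker file written by the application
  periodSeconds: 5
```

The two fragments belong under a container entry in a pod spec, in the same position as the httpGet probes shown elsewhere.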


Liveness Probe: Ensuring the Container is Alive

What is a Liveness Probe?

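  • A liveness probe checks whether a running container is still healthy, for example that it has not deadlocked or hung.
  • If the probe keeps failing (e.g., an HTTP 500 response or a failed TCP connection) until failureThreshold is reached, Kubernetes kills the container and restarts it according to the pod’s restart policy.
  • Importantly, only the failing container is restarted, not the entire pod. Other containers in the pod continue to run without interruption.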
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

Readiness Probe: Controlling Traffic Routing

What is a Readiness Probe?

  • A readiness probe checks whether a container is ready to accept traffic.
  • This mechanism ensures that only healthy and fully initialized pods (and by extension, all the containers within those pods) serve traffic, improving the reliability of your services.
  • A container that fails its readiness probe is not restarted. The other containers in the pod continue running, but the pod itself is removed from the service endpoints until all readiness probes pass.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Startup Probe: Handling Long Initialization Times

What is a Startup Probe?

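  • A startup probe checks whether the application inside a container has finished starting. Until it succeeds, liveness and readiness probes are disabled.
  • If the startup probe fails failureThreshold times in a row, Kubernetes restarts the container, just as with a liveness probe failure.
  • Once the startup probe succeeds, Kubernetes switches to using the liveness and readiness probes for ongoing health checks.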
startupProbe:
  httpGet:
    path: /startup
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

Advanced Health Probe Configurations

When configuring health probes in Kubernetes (whether liveness, readiness, or startup probes), there are three important timing parameters you can set: initialDelaySeconds, periodSeconds, and timeoutSeconds.

1. initialDelaySeconds

  • Purpose: This parameter specifies the amount of time Kubernetes should wait after the container starts before performing the first probe.
  • Use Case: This is particularly useful if your application needs some time to initialize before it can respond to health checks.
  • Setting an appropriate initialDelaySeconds helps avoid premature probe failures during startup.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5

2. periodSeconds

  • Purpose: This parameter defines how often (in seconds) Kubernetes will perform the probe after the initial delay. It determines the interval between consecutive probes.
  • Use Case: You can configure this to control the frequency of health checks based on your application’s characteristics.
  • For instance, setting a shorter periodSeconds might be useful for critical services where you want frequent health checks.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10

3. timeoutSeconds

  • Purpose: This parameter specifies how long Kubernetes waits for a single probe attempt to complete. If the attempt does not complete within this time, it counts as a failure. The default is 1 second.
  • Use Case: timeoutSeconds is useful when you want to ensure that a probe doesn’t hang indefinitely.
  • By setting this parameter, you control how long Kubernetes waits before recording a failed attempt. The container is treated as unhealthy only after failureThreshold consecutive failures.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 2
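
These timing parameters interact with failureThreshold (default 3), which later examples in this post also set. As a rough sketch, assuming the same /healthz endpoint on port 8080, the comments below estimate how long a hung container might go unnoticed. The figures are approximations, not guarantees.

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5       # one attempt every 5 seconds
  timeoutSeconds: 2      # an attempt with no answer within 2 seconds counts as failed
  failureThreshold: 3    # restart after 3 consecutive failed attempts
  # Approximate detection time for a container that stops responding:
  #   failureThreshold * periodSeconds = 3 * 5 = 15 seconds,
  #   plus up to timeoutSeconds (2 seconds) for the last attempt to time out.
```

Lowering periodSeconds or failureThreshold shortens detection time, at the cost of more probe traffic and a higher chance of restarting a container over a brief stall.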

initialDelaySeconds and Startup Probes

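A common question is why a startup probe is needed when initialDelaySeconds already exists. Both appear to solve the same problem: buying time before a container is checked.

  • initialDelaySeconds: A fixed wait before the first probe. It has to be sized for the slowest expected startup, so a container that starts quickly still waits out the full delay, and one that starts more slowly than expected can still be killed by the liveness probe.
  • Startup Probe: Offers a dedicated and flexible mechanism for applications with long or variable startup times, ensuring they are given enough time to initialize without interference from liveness or readiness checks.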
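
To make the difference concrete, here is a minimal sketch of a startup probe that gives an application a generous startup budget while keeping the liveness probe responsive. The endpoints and the specific numbers are assumptions chosen for illustration.

```yaml
startupProbe:
  httpGet:
    path: /startup
    port: 8080
  periodSeconds: 10
  failureThreshold: 30   # budget: roughly 30 * 10 = 300 seconds to finish starting
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5       # begins only after the startup probe has succeeded
  failureThreshold: 3
```

A container that is ready after 20 seconds passes the startup probe at the next attempt and moves on to liveness checks right away, whereas a fixed initialDelaySeconds of 300 would make every container wait the full five minutes.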

Interaction Between Probes and Pod Restart Policy

Kubernetes handles each container within a pod independently when it comes to health probes. The pod’s restart policy, combined with the probe configurations, dictates how Kubernetes manages container failures:

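  • Liveness Probe Failure: The failing container is killed and then handled according to the pod’s restart policy (Always, OnFailure, or Never). The other containers continue running without interruption.
  • Readiness Probe Failure: The failing container is marked as not ready, and the pod is removed from the service endpoints. No container is restarted; the pod resumes receiving traffic once all readiness probes pass.
  • Startup Probe Failure: The failing container is killed and handled according to the restart policy, as with a liveness probe failure. The other containers continue running without interruption.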
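
As a minimal sketch of where the restart policy sits relative to the probes, the manifest below sets restartPolicy at the pod level and a liveness probe at the container level. The pod name and image are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restart-policy-demo   # hypothetical name
spec:
  restartPolicy: Always       # the default; applies to every container in the pod
  containers:
  - name: app-container
    image: myapp:latest       # placeholder image
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 5
      failureThreshold: 3     # after 3 consecutive failures the container is killed
```

With restartPolicy set to Never, a container killed by a failed liveness probe would not be started again.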

Multi-Container Pod Health Probe Settings

In a multi-container pod, each container’s health is monitored independently:

  • If one container fails a readiness probe, the pod is removed from service endpoints until the container is ready again. The pod remains running, but it does not receive traffic.
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: app-container
    image: myapp:latest
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
    startupProbe:
      httpGet:
        path: /startup
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
  - name: sidecar-container
    image: sidecar:latest
    ports:
    - containerPort: 9090
    livenessProbe:
      httpGet:
        path: /healthz
        port: 9090
      initialDelaySeconds: 10
      periodSeconds: 5
    readinessProbe:
      httpGet:
        path: /ready
        port: 9090
      initialDelaySeconds: 5
      periodSeconds: 5
    startupProbe:
      httpGet:
        path: /startup
        port: 9090
      initialDelaySeconds: 10
      periodSeconds: 5

Best Practices for Kubernetes Health Probe Configuration

  1. Align Probes with Application Behavior:
  • Period (periodSeconds): Adjust the frequency of the probes to balance the need for timely health checks with the overhead of performing them. Critical applications might require more frequent checks, while less critical ones can have longer intervals.
  • Timeout (timeoutSeconds): Ensure this is long enough for the probe to complete under normal conditions, but short enough to detect genuine failures quickly.
  • Simple HTTP GET requests or basic TCP socket checks are often sufficient and minimize the impact on the application’s performance.
  • Readiness Probe: Use to control traffic routing, ensuring that only healthy and ready containers receive traffic. This is crucial during application startup, configuration changes, or temporary downtime.
  • Startup Probe: Use for applications with long or variable startup times to ensure that they are not prematurely killed or marked as not ready.
  • Ensure that the probes accurately reflect the application’s health and readiness in all environments.
  • Adjust the probe configurations based on observed behavior, especially after making changes to the application or infrastructure.
  • Avoid setting timeoutSeconds too short, which may cause the probe to fail before the application has a chance to respond, especially under heavy load.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 2
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 15
  timeoutSeconds: 2
  successThreshold: 1

startupProbe:
  httpGet:
    path: /startup
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 6

Key Points Related to Kubernetes Health Probes

Health Probes are Configured at the Container Level:

  • For simpler applications, probes may not be necessary, whereas more complex or critical applications can benefit significantly from them.
  • The decision to use specific probes should be based on the application’s requirements.
  • For instance, a Startup Probe might be essential for an application with a long initialization time, but not needed for a lightweight service that starts quickly.
  • For example, a sidecar container that performs logging might not need the same level of health monitoring as the main application container.
  • Startup probes run only while the container is starting. In contrast, liveness and readiness probes continue to run throughout the container’s lifecycle. Liveness probes monitor the ongoing health of the container, while readiness probes determine if the container is ready to serve traffic.
  • For instance, a well-tuned readiness probe can prevent a container from being restarted when it’s temporarily unable to serve traffic but is otherwise healthy.
  • Similarly, a startup probe can prevent premature liveness probe failures by allowing the container sufficient time to initialize.
  • This is particularly important in high-availability scenarios where traffic must only be routed to containers that can handle requests effectively.
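
Because probes are set per container, each container in a pod can carry a different set. The sketch below is one possible arrangement, not a recommendation: the main container gets all three probes, while a hypothetical logging sidecar gets only a lightweight TCP liveness check. Names, images, ports, and paths are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mixed-probes-pod      # hypothetical name
spec:
  containers:
  - name: app-container
    image: myapp:latest       # placeholder image
    ports:
    - containerPort: 8080
    startupProbe:             # slow-starting app: allow roughly 30 * 5 = 150 seconds
      httpGet:
        path: /startup
        port: 8080
      periodSeconds: 5
      failureThreshold: 30
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
  - name: log-sidecar         # hypothetical logging sidecar
    image: sidecar:latest     # placeholder image
    livenessProbe:            # only a basic check; no readiness or startup probe
      tcpSocket:
        port: 9090
      periodSeconds: 20
```

Leaving the readiness probe off the sidecar means its state does not gate traffic to the pod, since a container without a readiness probe is treated as ready once it is running.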

Conclusion

Kubernetes health probes — liveness, readiness, and startup — are essential tools for maintaining the health and availability of your applications.
