
Kubernetes 1.36: What's New, What's Gone, and What You Should Care About

Kubernetes 1.36 is the first major release of 2026, and it comes with substance. The focus is clear: better hardware-aware scheduling for AI/ML workloads, deeper DRA integration, improved node-level APIs, and a set of long-overdue security cleanups. This article covers the most impactful changes with enough detail to help you decide what to test, what to migrate, and what to watch.

By k8wiz Team

⚠️ Note: As of writing (April 11, 2026), the official release is scheduled for April 22, 2026. Some alpha features listed here may still shift before the final cut. Always cross-check with the official changelog and the Kubernetes blog.


🔴 Removals — Act Now

gitRepo Volume Plugin Permanently Disabled

The gitRepo volume type has been deprecated since Kubernetes v1.11. Starting with v1.36, it is removed for good (KEP-5040), and there is no feature gate to re-enable it. The security rationale is blunt: the plugin could allow an attacker to run arbitrary code as root on the node (see CVE-2024-10220).

If you still use it, migrate to one of these patterns:

# Option 1: init container with git-sync or a custom git clone
initContainers:
  - name: git-clone
    image: alpine/git:latest
    command: ["git", "clone", "https://github.com/your-org/your-repo.git", "/repo"]
    volumeMounts:
      - name: repo-volume
        mountPath: /repo
# Option 2: use an OCI image as a volume (now Stable in v1.36)
volumes:
  - name: my-config
    image:
      reference: "your-registry/your-config-image:v1.0"
      pullPolicy: IfNotPresent

The OCI VolumeSource (KEP-4639, now stable in v1.36) is exactly the right replacement for most gitRepo use cases: store your files in an OCI image, ship them with your registry workflow, and mount them directly.


IPVS Mode in kube-proxy Deprecated

kube-proxy IPVS mode is deprecated in v1.36 and flagged for removal. If you rely on IPVS for load balancing in kube-proxy (popular in high-connection-count environments), start evaluating the in-tree nftables backend (GA since v1.33) or eBPF-based alternatives such as Cilium's kube-proxy replacement. The default iptables mode is unaffected.
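To see where you stand, check the mode field of your kube-proxy configuration (in kubeadm clusters it lives in the kube-proxy ConfigMap in kube-system). A sketch of the relevant KubeProxyConfiguration fragment, with nftables shown as the in-tree migration target:

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "nftables"   # was "ipvs"; "iptables" remains the default if left empty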


service.spec.externalIPs Deprecated (Removal in v1.43)

The externalIPs field on Service objects is being deprecated starting v1.36 (KEP-5707). It has been a documented security risk since CVE-2020-8554, enabling man-in-the-middle attacks on cluster traffic. You will see deprecation warnings immediately. Full removal is planned for v1.43.

Migration path:

  • externalIPs on a Service → Service of type LoadBalancer with a cloud provider
  • Static IP exposure → NodePort plus an external load balancer
  • Flexible external routing → Gateway API
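As a concrete sketch of the first migration (names and the IP are illustrative), the deprecated pattern next to its LoadBalancer replacement:

# Before (deprecated): traffic to 203.0.113.10 is routed to this Service
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
  externalIPs:
    - 203.0.113.10   # triggers deprecation warnings from v1.36 on
---
# After: let the cloud provider allocate and manage the external IP
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80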

🟡 Features Graduating to Stable (GA)

OCI VolumeSource — Now Stable

Store arbitrary files in OCI images and mount them directly into Pods without baking them into your application image. This removes the need for init containers or startup scripts for config/data distribution. Requires containerd v2.1+ or a compatible runtime.

volumes:
  - name: ml-model
    image:
      reference: "registry.example.com/models/resnet:v2"
      pullPolicy: IfNotPresent
containers:
  - name: inference
    volumeMounts:
      - name: ml-model
        mountPath: /models

User Namespaces in Pods — Now Stable

User namespaces (KEP-127) first shipped as alpha in v1.25. After nearly four years of iteration, the feature is finally GA in v1.36. It lets Pod processes run as root inside the container while mapping to an unprivileged UID on the host, significantly reducing the blast radius of container escapes. Enable it in your Pod spec:

spec:
  hostUsers: false  # enables user namespace isolation
  containers:
    - name: app
      securityContext:
        runAsUser: 0  # root inside, unprivileged outside

Mutating Admission Policies — Now Stable

CEL-based mutating admission policies (KEP-3962) replace the need for webhook servers for many common mutation patterns. Now stable, you can safely use them in production to enforce defaults, inject labels, or normalize resource requests — without the operational overhead of a separate webhook deployment.

apiVersion: admissionregistration.k8s.io/v1
kind: MutatingAdmissionPolicy
metadata:
  name: set-default-resource-limits
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        resources: ["pods"]
        operations: ["CREATE"]
  mutations:
    - patchType: "ApplyConfiguration"
      applyConfiguration:
        expression: |
          Object{
            spec: Object.spec{
              containers: object.spec.containers.map(c,
                Object.spec.containers{
                  name: c.name,
                  resources: Object.spec.containers.resources{
                    limits: c.?resources.?limits.orValue({"cpu": "500m", "memory": "256Mi"})
                  }
                })
            }
          }

🟢 New Alpha Features Worth Watching

These features are disabled by default. Enable them only in non-production environments.

Workload-Aware Preemption (KEP-5710)

Feature gate: WorkloadAwarePreemption

The current scheduler preempts Pods individually. For gang-scheduled AI/ML workloads (where all workers must run together or not at all), this causes partial preemption that breaks the job without freeing enough resources for anything else. KEP-5710 teaches the preemptor to treat a Workload group as a single entity — it preempts all Pods in a lower-priority Workload at once, or none at all.

This builds directly on the Workload API introduced in v1.35 and is the missing piece to make gang scheduling practical at scale without a third-party scheduler.

DRA: Resource Availability Visibility (KEP-5677)

Feature gate: DRAResourcePoolStatus

Until now, answering "how many GPUs are free on my cluster?" required stitching together ResourceSlices and ResourceClaims across namespaces — something most users don't have permissions to do cluster-wide. KEP-5677 introduces a ResourcePoolStatusRequest API:

# Create a request to check available GPUs
kubectl create -f - <<EOF
apiVersion: resource.k8s.io/v1alpha1
kind: ResourcePoolStatusRequest
metadata:
  name: check-gpus
spec:
  driver: example.com/gpu
EOF

# Wait and read the result
kubectl wait --for=condition=Complete rpsr/check-gpus --timeout=30s
kubectl get rpsr/check-gpus -o yaml

The output gives you per-node totalDevices, allocatedDevices, and availableDevices. This is the foundation for real capacity planning with DRA-managed hardware.

DRA: Device Attributes in Downward API (KEP-5304)

Currently, if your workload needs device metadata (PCIe bus address, UUID, vendor info), it has to query the Kubernetes API directly — which adds latency, requires permissions, and breaks in offline scenarios. KEP-5304 lets the DRA driver pass this metadata through kubelet, which mounts it as a JSON file in the container:

/var/run/dra-device-attributes/{claimName}/{requestName}/{driverName}-metadata.json

Your ML training job can now read its GPU topology directly from a local file with zero API calls.
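A minimal sketch of consuming such a file, with the path shortened and the attribute keys invented for illustration (the real keys are defined by your DRA driver, not by Kubernetes):

```shell
# Simulate the JSON file a DRA driver might publish under
# /var/run/dra-device-attributes/... (keys illustrative)
mkdir -p /tmp/dra-demo
cat > /tmp/dra-demo/gpu-metadata.json <<'EOF'
{"uuid": "GPU-0000-1111", "pcieBusAddress": "0000:3b:00.0"}
EOF

# The workload reads its device topology locally -- zero API-server calls
python3 -c "import json; print(json.load(open('/tmp/dra-demo/gpu-metadata.json'))['pcieBusAddress'])"
```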

Pod Level Resource Managers (KEP-5526)

Feature gates: PodLevelResources + PodLevelResourceManagers

Building on the pod-level resource spec (KEP-2837, beta in v1.34), this KEP extends the Topology Manager, CPU Manager, and Memory Manager to understand pod.spec.resources for NUMA alignment decisions. Two modes are available:

  • Pod Scope: The whole Pod gets resources from a single NUMA node — ideal for ML training jobs with multiple sidecars.
  • Container Scope: Each container is managed independently — useful when containers have different resource requirements within the same Guaranteed QoS Pod.

spec:
  resources:          # pod-level budget (KEP-2837)
    requests:
      cpu: "8"
      memory: "32Gi"
  runtimeClassName: high-performance
  containers:
    - name: trainer
      resources:
        requests:
          cpu: "7"
          memory: "28Gi"
    - name: log-sidecar
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
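To actually get pod-scope NUMA alignment for a spec like the one above, the kubelet must be configured for it. A sketch of the relevant KubeletConfiguration fields — both feature gates are alpha, so this belongs in a test cluster only:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodLevelResources: true
  PodLevelResourceManagers: true
cpuManagerPolicy: static
topologyManagerPolicy: single-numa-node
topologyManagerScope: pod   # align the whole Pod to one NUMA node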

CRI List Streaming (KEP-5825)

Feature gate: CRIListStreaming

On nodes running thousands of short-lived containers (dense CronJob clusters, batch workloads), the kubelet's ListContainers CRI call can hit the 16 MB gRPC message limit — roughly 11,000 containers or 14,000 Pods. KEP-5825 adds server-side streaming RPCs to the CRI: instead of building the full list in memory and sending it at once, the runtime streams one container at a time. Kubelet reassembles the stream. This is a backward-compatible addition and requires a runtime update (containerd, CRI-O) to take effect.
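The ~11,000-container ceiling falls out of simple arithmetic: divide the 16 MiB default gRPC message cap by a per-container status size of roughly 1.5 KiB (an illustrative average; real sizes vary with labels and annotations):

```shell
# 16 MiB gRPC limit / ~1.5 KiB per ContainerStatus ≈ max containers per response
python3 -c "print(16 * 1024 * 1024 // 1536)"   # prints 10922
```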

New Kubelet gRPC API for Local Pod Information (KEP-4188)

Feature gate: PodInfoAPI

CNI plugins, monitoring agents, and service mesh sidecars running on a node today must call the API server to get Pod status — which creates load, adds latency, and fails entirely when the node loses connectivity to the control plane. KEP-4188 exposes a new gRPC endpoint directly on the kubelet:

/var/lib/kubelet/pods/kubelet.sock

The API supports ListPods, GetPod, and WatchPods and returns the most current state known to the kubelet, even before it syncs to the API server. Access is restricted to privileged processes on the node.

HPA External Metrics Fallback (KEP-5679)

Feature gate: pending naming — sig-autoscaling

When a Horizontal Pod Autoscaler relies on external metrics (Datadog, cloud queue depths) and the metrics API is temporarily unavailable, the HPA currently behaves unpredictably. KEP-5679 adds a configurable fallback behavior: you can define what the HPA should do when the external metric source is unreachable (scale to a safe value, hold current replicas, etc.). This is a reliability improvement for any HPA-driven workload that uses non-core metrics.
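Since the KEP has no feature gate name yet, no field names are final either. Purely as a hypothetical sketch of where such a policy could attach on an external-metric HPA — the commented fallback block below is invented for illustration and is not a real v1.36 field:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 2
  maxReplicas: 50
  metrics:
    - type: External
      external:
        metric:
          name: queue_depth
        target:
          type: AverageValue
          averageValue: "100"
  # HYPOTHETICAL shape of a KEP-5679 fallback policy; field names not final:
  # fallback:
  #   behavior: HoldCurrentReplicas   # or scale to a fixed safe replica count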


🟡 Features Graduating to Beta

Device Taints and Tolerations in DRA (KEP-5055) — Now Beta

Works like node taints: a DRA device can be tainted (e.g., degraded, maintenance), and only workloads with matching tolerations will get scheduled to it. Now enabled by default as beta. This is essential for GPU fleet management where individual devices can degrade without the node itself being unhealthy.
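A sketch of tainting a single degraded device, based on the alpha DeviceTaintRule shape — the group/version and field names may have shifted for beta, so treat this as illustrative:

apiVersion: resource.k8s.io/v1alpha3
kind: DeviceTaintRule
metadata:
  name: gpu-3-degraded
spec:
  deviceSelector:
    driver: gpu.example.com
    pool: node-1-pool
    device: gpu-3
  taint:
    key: example.com/degraded
    value: "true"
    effect: NoSchedule   # keep new workloads off; NoExecute would also evict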

DRA Support for Partitionable Devices (KEP-4815) — Now Beta

Allows a single physical device (e.g., an A100 GPU with MIG support) to be partitioned and allocated to multiple workloads independently. Also enabled by default as beta. This significantly improves GPU utilization for smaller inference or fine-tuning jobs that don't need a full card.

In-Place Pod-Level Resources Vertical Scaling (feature gate: InPlacePodLevelResourcesVerticalScaling) — Now Beta

Extends the stable In-Place Pod Resize feature (GA in v1.35) to work at the pod-level resource spec. Requires cgroupv2. Enables dynamic resource adjustments for multi-container Pods without restarts.

IP/CIDR Validation Improvements (KEP-4858) — Now Beta

Stricter validation of IP addresses and CIDR strings across the API. Catches malformed addresses earlier and consistently across all fields. If you have any tooling that generates Kubernetes resources with unconventional IP formats, test against v1.36 before upgrading.
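The classic offender is leading zeros in IPv4 octets, which some legacy parsers interpret as octal. Strict parsers reject them outright — Python's ipaddress module (similarly strict about leading zeros) makes for a quick local check:

```shell
python3 - <<'EOF'
import ipaddress
for addr in ["10.1.2.3", "010.001.002.003"]:
    try:
        ipaddress.ip_address(addr)
        print(addr, "-> valid")
    except ValueError:
        print(addr, "-> rejected (ambiguous leading zeros)")
EOF
```

If your tooling emits the second form anywhere in a Kubernetes resource, fix the generator before upgrading.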


🔥 The Ingress NGINX Retirement — What You Must Do

This is not a v1.36 feature, but it lands squarely in the v1.36 timeframe and affects most production clusters. SIG Network and the Security Response Committee officially retired Ingress NGINX on March 24, 2026. No more patches. No security fixes. The GitHub repos are read-only.

Existing deployments still work — which is exactly the problem. A silent, unpatched ingress controller at the edge of your cluster is a ticking compliance and security risk.

Check if you're affected:

kubectl get pods --all-namespaces \
  --selector app.kubernetes.io/name=ingress-nginx

Migration options:

  • Gateway API + Cilium → eBPF-native environments, new setups
  • Gateway API + Envoy Gateway → vendor-neutral, multi-protocol
  • NGINX Gateway Fabric (NGF) → teams already familiar with NGINX
  • Traefik → simpler setups, good CRD model
  • Kong Gateway → combined API gateway and ingress

Use ingress2gateway (now v1.0) to automate the conversion of your Ingress resources to Gateway API resources.

# Install ingress2gateway
go install sigs.k8s.io/ingress2gateway@latest

# Convert existing Ingress resources
ingress2gateway print --providers ingress-nginx

Upgrade Checklist for v1.36

Before upgrading, work through this list in a staging environment first:

  • Remove gitRepo volumes: migrate to init containers or OCI VolumeSource
  • Audit externalIPs usage: kubectl get svc -A -o json | jq '.items[] | select(.spec.externalIPs != null) | .metadata'
  • Check kube-proxy mode: if using IPVS, start evaluating alternatives
  • Migrate away from Ingress NGINX: it's retired — no future patches, ever
  • Test DRA taints/tolerations and partitionable devices if you manage GPU workloads (both now beta/default-on)
  • Validate IP format correctness if you generate Kubernetes resources programmatically (KEP-4858 stricter validation)
  • Enable OCI VolumeSource if you're on containerd v2.1+ — it's GA and worth using


Written April 2026 — based on the official Kubernetes v1.36 Sneak Peek and KEP tracker. The release is scheduled for April 22, 2026.
