Flows stuck in Cancelling state while using Kubernetes work pool #19593

@simone201

Description

Bug summary

Hi Prefect Team,

I'm using a self-hosted Prefect Server setup in Kubernetes, deployed using the official Helm Chart (version 2025.11.24182903). The infrastructure is as follows:

  • AWS EKS (Auto Mode) cluster
  • AWS RDS (PostgreSQL) instance
  • AWS ElastiCache (Redis) instance

The running services are:

  • Prefect Server 3.6.4
  • Background Services split up
  • Single Prefect Worker (via same Helm version)

All the workloads are running in the same namespace (prefect) and have the network policies needed to allow full ingress and egress access across the whole namespace. I can provide the Helm chart values if needed.

I'm using the following prefect.yaml file to deploy my flow (for reference):

# Generic metadata about this project
name: api-extractor
prefect-version: 3.6.4

# build section allows you to manage and build docker images
build:
- prefect_docker.deployments.steps.build_docker_image:
    id: build_image
    requires: prefect-docker>=0.3.1
    image_name: 0000.dkr.ecr.us-xxxx-1.amazonaws.com/extractor/api-extractor
    tag: 0.0.1
    dockerfile: Dockerfile
    platform: linux/amd64

# push section allows you to manage if and how this project is uploaded to remote locations
push:
- prefect_docker.deployments.steps.push_docker_image:
    requires: prefect-docker>=0.3.1
    image_name: '{{ build_image.image_name }}'
    tag: '{{ build_image.tag }}'

# pull section allows you to provide instructions
pull:
- prefect.deployments.steps.set_working_directory:
    directory: /app

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: extractor-deployment
  version: 1.0.0
  description: This deployment orchestrates extractor
  schedule: {cron: "00 18 * * *", slug: "utc-schedule", timezone: "UTC", active: true}
  flow_name: extractor
  entrypoint: flows/extractor.py:extract_from_api
  parameters:
    sources: 
      - users
      - transactions
    targets:
      sessions: "dev-users"
      events: "dev-transactions"
    target_type: "s3"
    output_format: "json"
    start_time: yesterday
    end_time: yesterday
  work_pool:
    name: extractor-work-pool
    work_queue_name: null
    job_variables:
      image: '{{ build_image.image }}'
      finished_job_ttl: 100
      memory: 16Gi
      image_pull_policy: Always
      service_account_name: extractor-account
      node_selector:
        karpenter.sh/capacity-type: on-demand
      env:
        PREFECT_RUNNER_HEARTBEAT_FREQUENCY: "30"

The flow runs smoothly in the K8s cluster as a proper Job with its customizations and parameters, but if I cancel it from the UI, the flow run stays in the Cancelling state even though the Job Pod is properly stopped (and its state updated accordingly).

To reproduce the issue, do the following:

  1. Deploy a flow as usual (with or without a schedule)
  2. Run the flow (manually or on schedule, via Quick run or Custom run)
  3. Wait for the flow to be provisioned and reach the Running state
  4. In the UI (or via the API), click the Cancel flow button
  5. Check that the flow run is in the Cancelling state in the UI/API
  6. Check that the flow Pod (from the K8s Job) is eventually stopped (this sometimes takes a while)
  7. After the Pod is stopped (gracefully), observe that the flow run in Prefect stays in the Cancelling state forever
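The stuck state can also be confirmed programmatically. A stdlib-only polling sketch along these lines works against the server's `GET /api/flow_runs/{id}` endpoint (the server URL and flow run ID below are placeholders):

```python
import json
import time
import urllib.request

# Placeholder: the in-cluster Prefect Server API URL
PREFECT_API_URL = "http://prefect-server.prefect.svc:4200/api"


def flow_run_state(raw: bytes) -> str:
    """Extract the state type (e.g. 'CANCELLING') from a flow-run API response body."""
    return json.loads(raw)["state"]["type"]


def poll_flow_run(flow_run_id: str, interval: int = 30, attempts: int = 20) -> str:
    """Poll GET /flow_runs/{id} until the run leaves CANCELLING, or give up."""
    for _ in range(attempts):
        with urllib.request.urlopen(f"{PREFECT_API_URL}/flow_runs/{flow_run_id}") as resp:
            state = flow_run_state(resp.read())
        if state != "CANCELLING":
            return state
        time.sleep(interval)
    return "CANCELLING"  # still stuck after all attempts
```

In my case the loop never leaves `CANCELLING`, even long after the Pod has terminated.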

I could not work around this with the suggested Automations either, since entering the Cancelling state does not trigger the automation rule, while other state changes do.
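For context, a reactive event trigger along these lines is the kind of rule that does not fire (sketch only: the event name `prefect.flow-run.Cancelling` is assumed from Prefect's standard flow-run state event naming, and the exact trigger schema may differ):

```json
{
  "type": "event",
  "posture": "Reactive",
  "expect": ["prefect.flow-run.Cancelling"],
  "match": { "prefect.resource.id": "prefect.flow-run.*" },
  "threshold": 1,
  "within": 0
}
```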

Am I missing something relevant in the workflow about managing these kinds of states when using Kubernetes work pools? I'll be happy to provide any info the team needs to troubleshoot the issue further.

Thanks in advance!

Version info

Version:              3.6.4
API version:          0.8.4
Python version:       3.12.10
Git commit:           d3c3ed50
Built:                Fri, Nov 21, 2025 06:04 PM
OS/Arch:              darwin/arm64
Profile:              dev
Server type:          server
Pydantic version:     2.12.2
Server:
  Database:           sqlite
  SQLite version:     3.51.0
Integrations:
  prefect-docker:     0.6.6
  prefect-kubernetes: 0.6.5

Additional context

No response
