
Understanding the Kubernetes Scheduler: Queues, Filtering, Scoring, and Plugins

The Kubernetes scheduler continuously watches for unscheduled Pods, places them in a priority queue, filters the feasible Nodes, then scores them and selects the best one using built-in and custom plugins. In large clusters it samples only a percentage of the Nodes, and it can be extended through extenders, multiple schedulers, and the scheduler framework configuration.

Tencent Cloud Developer

The Kubernetes scheduler watches the cluster for newly created, unscheduled Pods and attempts to bind each Pod to a suitable Node using a series of steps: queueing, filtering, scoring, and final selection.

1. Scheduling Queue – Pods waiting to be scheduled are placed in the active queue (a priority queue), and the scheduler pops them one at a time. A background loop flushes the backoff queue every second, and Pods that have stayed unschedulable longer than DefaultPodMaxInUnschedulablePodsDuration (5 minutes) are moved back into the queue for another attempt.
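The ordering inside the active queue can be illustrated with a small heap. This is a minimal sketch, assuming the default ordering of higher priority first and earlier enqueue time second; the queuedPod type, its fields, and popOrder are hypothetical names for illustration, not the scheduler's own types.

```go
package main

import (
	"container/heap"
	"fmt"
)

// queuedPod is a hypothetical stand-in for the scheduler's queued Pod record.
type queuedPod struct {
	name     string
	priority int32 // higher value is scheduled first
	enqueued int64 // stand-in for the enqueue timestamp; lower means earlier
}

// activeQueue implements heap.Interface over queuedPod values.
type activeQueue []queuedPod

func (q activeQueue) Len() int { return len(q) }

// Less puts higher-priority Pods first and breaks ties by earlier enqueue time.
func (q activeQueue) Less(i, j int) bool {
	if q[i].priority != q[j].priority {
		return q[i].priority > q[j].priority
	}
	return q[i].enqueued < q[j].enqueued
}

func (q activeQueue) Swap(i, j int) { q[i], q[j] = q[j], q[i] }
func (q *activeQueue) Push(x any)   { *q = append(*q, x.(queuedPod)) }

func (q *activeQueue) Pop() any {
	old := *q
	n := len(old)
	item := old[n-1]
	*q = old[:n-1]
	return item
}

// popOrder pushes all Pods and returns their names in scheduling order.
func popOrder(pods []queuedPod) []string {
	q := &activeQueue{}
	for _, p := range pods {
		heap.Push(q, p)
	}
	var order []string
	for q.Len() > 0 {
		order = append(order, heap.Pop(q).(queuedPod).name)
	}
	return order
}

func main() {
	pods := []queuedPod{
		{name: "batch-job", priority: 0, enqueued: 1},
		{name: "web-b", priority: 100, enqueued: 3},
		{name: "web-a", priority: 100, enqueued: 2},
	}
	fmt.Println(popOrder(pods))
}
```

Both web Pods are popped before batch-job because of their higher priority, and web-a precedes web-b because it was enqueued earlier.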

2. Single Scheduling Cycle – The flow for a single Pod consists of:

Skip Pods that are being deleted or have already been processed (skipPodSchedule).

Run SchedulePod, which filters the Nodes to obtain the feasible list and then scores them to pick one.

If no Node passes filtering, run the PostFilter plugins (preemption, for example); otherwise invoke the configured plugins from Reserve through PostBind.
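The control flow of one cycle can be sketched as follows. This is an illustrative outline only: the hooks struct, errNoFit, and the step names recorded in the trace are hypothetical stand-ins, and the real scheduler runs the binding steps asynchronously.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoFit is a hypothetical error meaning that no Node passed filtering.
var errNoFit = errors.New("no feasible node")

// hooks bundles hypothetical stand-ins for the real scheduler steps.
type hooks struct {
	skip        func(pod string) bool
	schedulePod func(pod string) (node string, err error)
}

// scheduleOne returns the ordered list of steps taken for a single Pod.
func scheduleOne(h hooks, pod string) []string {
	if h.skip(pod) {
		return []string{"skip"}
	}
	steps := []string{"schedulePod"}
	node, err := h.schedulePod(pod)
	if err != nil {
		// Only a "no fit" failure triggers the PostFilter plugins.
		if errors.Is(err, errNoFit) {
			steps = append(steps, "postFilter")
		}
		return append(steps, "requeue")
	}
	// Tail of the scheduling cycle, followed by the binding cycle.
	for _, s := range []string{"reserve", "permit", "preBind", "bind", "postBind"} {
		steps = append(steps, s+":"+node)
	}
	return steps
}

func main() {
	h := hooks{
		skip:        func(string) bool { return false },
		schedulePod: func(string) (string, error) { return "node-1", nil },
	}
	fmt.Println(scheduleOne(h, "pod-a"))

	h.schedulePod = func(string) (string, error) { return "", errNoFit }
	fmt.Println(scheduleOne(h, "pod-b"))
}
```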

For large clusters the scheduler does not traverse every Node. If the cluster has ≤ 100 Nodes, all Nodes are examined; otherwise, when percentageOfNodesToScore is not set explicitly, the percentage of Nodes to examine (clamped to [5, 100]) is calculated with integer division as:

prePercent = 50 - numAllNodes/125
percent = max(5, prePercent)
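The formula can be turned into a small function to see how the sample size changes with cluster size. This is a sketch rather than the scheduler's code: the function and constant names are chosen for illustration, and the floor of 100 Nodes on the final result is an assumption taken from the upstream implementation, not from the text above.

```go
package main

import "fmt"

const (
	minFeasibleNodesToFind           = 100 // assumed floor on the result
	minFeasibleNodesPercentageToFind = 5   // lower clamp on the percentage
)

// numFeasibleNodesToFind returns how many feasible Nodes to look for.
// configured is the explicitly set percentage; 0 selects the adaptive formula.
func numFeasibleNodesToFind(configured, numAllNodes int32) int32 {
	if numAllNodes < minFeasibleNodesToFind || configured >= 100 {
		return numAllNodes
	}
	percent := configured
	if percent == 0 {
		// Integer division, as in the formula above.
		percent = 50 - numAllNodes/125
		if percent < minFeasibleNodesPercentageToFind {
			percent = minFeasibleNodesPercentageToFind
		}
	}
	n := numAllNodes * percent / 100
	if n < minFeasibleNodesToFind {
		return minFeasibleNodesToFind
	}
	return n
}

func main() {
	for _, total := range []int32{50, 100, 1000, 5000, 6000} {
		fmt.Printf("%d nodes -> examine %d\n", total, numFeasibleNodesToFind(0, total))
	}
}
```

For 5000 Nodes the percentage is 50 - 5000/125 = 10, so 500 Nodes are examined; at 6000 Nodes the raw value of 2 is clamped to 5, giving 300.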

3. Scheduling Process

(a) Filtering – Plugins in the PreFilter phase run first. If any of them fails, scheduling of the Pod stops. The scheduler then takes the full Node list (allNodes), narrows it to the Nodes permitted by the PreFilter result when one is returned, and runs the Filter plugins on the remaining candidates.
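A simplified, sequential sketch of this filtering step is shown here. The filterFunc type and findFeasibleNodes are hypothetical names, and the sketch assumes that a nil PreFilter set means "no restriction"; the real scheduler evaluates Nodes in parallel.

```go
package main

import "fmt"

// filterFunc is a hypothetical stand-in for a Filter plugin:
// it returns true when the Node can run the Pod.
type filterFunc func(node string) bool

// findFeasibleNodes narrows allNodes to the PreFilter result (nil means no
// restriction), runs every filter on each candidate, and stops as soon as
// numToFind feasible Nodes have been collected.
func findFeasibleNodes(allNodes []string, preFilter map[string]bool, filters []filterFunc, numToFind int) []string {
	var feasible []string
	for _, node := range allNodes {
		if len(feasible) >= numToFind {
			break
		}
		if preFilter != nil && !preFilter[node] {
			continue
		}
		ok := true
		for _, f := range filters {
			if !f(node) {
				ok = false
				break
			}
		}
		if ok {
			feasible = append(feasible, node)
		}
	}
	return feasible
}

func main() {
	all := []string{"n1", "n2", "n3", "n4"}
	allowed := map[string]bool{"n1": true, "n2": true, "n4": true}
	notN1 := func(node string) bool { return node != "n1" }
	fmt.Println(findFeasibleNodes(all, allowed, []filterFunc{notN1}, 100))
}
```

Here n3 is dropped by the PreFilter intersection and n1 by the filter, leaving n2 and n4.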

(b) Scoring – The feasible Nodes are ranked by the prioritizeNodes function. If no scoring plugins are enabled, all Nodes are returned with the same score. Otherwise the scoring plugins are invoked in order: RunPreScorePlugins → RunScorePlugins.
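How per-plugin scores combine into one ranking can be sketched as below. The scorePlugin struct and prioritize function are hypothetical, and scaling each plugin's raw scores against its highest value is an assumed normalization for illustration; in the real framework each plugin normalizes its own scores into the 0 to 100 range before weights are applied.

```go
package main

import "fmt"

const maxNodeScore = 100

// scorePlugin is a hypothetical, simplified stand-in for a scoring plugin.
type scorePlugin struct {
	name   string
	weight int64
	score  func(node string) int64 // raw, non-negative score
}

// prioritize returns the weighted total score for every Node.
func prioritize(nodes []string, plugins []scorePlugin) map[string]int64 {
	totals := make(map[string]int64, len(nodes))
	if len(plugins) == 0 {
		// No scoring plugins: every Node gets the same score.
		for _, n := range nodes {
			totals[n] = 1
		}
		return totals
	}
	for _, p := range plugins {
		raw := make([]int64, len(nodes))
		var highest int64
		for i, n := range nodes {
			raw[i] = p.score(n)
			if raw[i] > highest {
				highest = raw[i]
			}
		}
		for i, n := range nodes {
			var normalized int64
			if highest > 0 {
				normalized = raw[i] * maxNodeScore / highest
			}
			totals[n] += normalized * p.weight
		}
	}
	return totals
}

func main() {
	nodes := []string{"a", "b"}
	plugins := []scorePlugin{
		{name: "free-cpu", weight: 1, score: func(n string) int64 { return map[string]int64{"a": 10, "b": 5}[n] }},
		{name: "has-image", weight: 2, score: func(n string) int64 { return map[string]int64{"a": 0, "b": 4}[n] }},
	}
	fmt.Println(prioritize(nodes, plugins))
}
```

Node b wins here even though a has more free CPU, because the second plugin carries twice the weight.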

(c) Selection – The Node with the highest score is chosen; if several Nodes tie for the highest score, one of them is picked at random.
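One way to pick uniformly among tied Nodes in a single pass is reservoir sampling. The sketch below assumes that technique; nodeScore, selectHost, and the injected intn function are illustrative names, and the random source is passed in so the behavior can be checked deterministically.

```go
package main

import (
	"fmt"
	"math/rand"
)

// nodeScore pairs a Node name with its total score.
type nodeScore struct {
	name  string
	score int64
}

// selectHost returns a highest-scoring Node. Ties are broken uniformly at
// random in one pass: the k-th tied Node replaces the current choice with
// probability 1/k. intn(n) must return a value in [0, n).
func selectHost(scores []nodeScore, intn func(n int) int) (string, bool) {
	if len(scores) == 0 {
		return "", false
	}
	selected := scores[0]
	ties := 1
	for _, ns := range scores[1:] {
		switch {
		case ns.score > selected.score:
			selected = ns
			ties = 1
		case ns.score == selected.score:
			ties++
			if intn(ties) == 0 {
				selected = ns
			}
		}
	}
	return selected.name, true
}

func main() {
	scores := []nodeScore{{"a", 10}, {"b", 30}, {"c", 30}, {"d", 5}}
	host, _ := selectHost(scores, rand.Intn)
	fmt.Println("selected:", host) // either b or c
}
```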

4. Plugin Mechanism

Plugins are divided into scheduling and binding phases and are called at specific extension points. Major plugin types include:

type Plugins struct {
    // QueueSort is a list of plugins that should be invoked when sorting pods in the scheduling queue.
    QueueSort PluginSet `json:"queueSort,omitempty"`
    // PreFilter plugins run at the "PreFilter" extension point.
    PreFilter PluginSet `json:"preFilter,omitempty"`
    // Filter plugins run to eliminate Nodes that cannot run the Pod.
    Filter PluginSet `json:"filter,omitempty"`
    // ... other plugin sets omitted for brevity ...
}

Example of a scoring plugin interface:

type ScorePlugin interface {
    Plugin
    // Score is called on each filtered node. It must return success and an integer
    // indicating the rank of the node. All scoring plugins must return success or the pod will be rejected.
    Score(ctx context.Context, state *CycleState, p *v1.Pod, nodeName string) (int64, *Status)
    // ScoreExtensions returns a ScoreExtensions interface if it implements one, or nil if it does not.
    ScoreExtensions() ScoreExtensions
}

Key built-in plugins include ImageLocality (scores Nodes higher when they already hold the Pod's container images), NodeAffinity, TaintToleration, and many others.

5. Scheduler Configuration

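In the legacy Policy format, which Kubernetes v1.23 and later no longer accept (score plugin weights are now set in KubeSchedulerConfiguration profiles), a typical scheduler configuration defines a list of priority functions with weights, e.g.: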

{"name":"BalancedResourceAllocation","weight":1},
{"name":"EvenPodsSpreadPriority","weight":1},
{"name":"NodePreferAvoidPodsPriority","weight":10000},
{"name":"TaintTolerationPriority","weight":1}

6. Customizing Pod Scheduling

Two common approaches:

Extender mode – Implement the extender as an HTTP service and configure it in scheduler-policy-config:

{
  "extenders": [{
    "urlPrefix": "http://xxx/prefix",
    "filterVerb": "filter",
    "weight": 1,
    "bindVerb": "bind",
    "enableHttps": false
  }]
}
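On the extender side, the filter verb is served by an HTTP handler that receives candidate Nodes and returns the ones it accepts. The sketch below is a rough illustration under several assumptions: the request and response structs are simplified stand-ins with assumed JSON field names, only Node names are exchanged (as when the extender is marked node-cache capable), and the gpu- prefix rule is an invented example policy. The handler is exercised in-process with httptest instead of listening on a port.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// extenderArgs and extenderFilterResult are simplified, hypothetical
// stand-ins for the filter request and response bodies.
type extenderArgs struct {
	NodeNames []string `json:"nodenames"`
}

type extenderFilterResult struct {
	NodeNames   []string          `json:"nodenames"`
	FailedNodes map[string]string `json:"failedNodes,omitempty"`
	Error       string            `json:"error,omitempty"`
}

// filterNodes keeps Nodes whose name starts with "gpu-" (example policy only).
func filterNodes(args extenderArgs) extenderFilterResult {
	res := extenderFilterResult{FailedNodes: map[string]string{}}
	for _, n := range args.NodeNames {
		if strings.HasPrefix(n, "gpu-") {
			res.NodeNames = append(res.NodeNames, n)
		} else {
			res.FailedNodes[n] = "node name lacks gpu- prefix"
		}
	}
	return res
}

// filterHandler decodes the request, applies filterNodes, and writes JSON back.
func filterHandler(w http.ResponseWriter, r *http.Request) {
	var args extenderArgs
	if err := json.NewDecoder(r.Body).Decode(&args); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(filterNodes(args)); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

func main() {
	// With urlPrefix ".../prefix" and filterVerb "filter", requests go to <urlPrefix>/filter.
	body := strings.NewReader(`{"nodenames":["gpu-1","cpu-1"]}`)
	req := httptest.NewRequest(http.MethodPost, "/prefix/filter", body)
	rec := httptest.NewRecorder()
	filterHandler(rec, req)
	fmt.Print(rec.Body.String())
}
```

A real deployment would register the handler with http.HandleFunc and serve it at the address named in urlPrefix.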

Multiple schedulers – Set spec.schedulerName on the Pod to point to a custom scheduler deployment.

From Kubernetes 1.19 onward, custom plugins can be added directly via the scheduler framework:

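package main

import (
    "fmt"
    "os"

    scheduler "k8s.io/kubernetes/cmd/kube-scheduler/app"
)

func main() {
    // MyPlugin is the plugin factory for the custom plugin, defined elsewhere.
    command := scheduler.NewSchedulerCommand(
        scheduler.WithPlugin("my-plugin", MyPlugin))
    if err := command.Execute(); err != nil {
        fmt.Fprintf(os.Stderr, "%v\n", err)
        os.Exit(1) // report the failure to the caller
    }
}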

7. Conclusion

A close reading of the scheduler source code shows three strengths: it adapts the number of Nodes it evaluates to the cluster size, it achieves high reliability through leader election, and it is highly extensible through a rich plugin framework.
