| 1 | +# Plugin Lifecycle and Stability Levels |
| 2 | + |
| 3 | +Author(s): @hexfusion |
| 4 | + |
| 5 | +Related issues: |
| 6 | +- https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/2653 |
| 7 | +- https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/1405 |
| 8 | + |
| 9 | +## Proposal Status |
| 10 | +***Draft*** |
| 11 | + |
| 12 | +## Summary |
| 13 | + |
| 14 | +GIE's plugin system is growing. Extension points now support |
| 15 | +multiple implementations, and more plugin types are coming as |
| 16 | +the EPP evolves (data layer sources, parsers, flow control |
| 17 | +policies). This growth is healthy: it lets contributors |
| 18 | +experiment with new approaches and iterate quickly. |
| 19 | + |
| 20 | +Today there is no mechanism to communicate plugin maturity to |
| 21 | +operators. A plugin either exists in the registry or it doesn't. |
| 22 | +There is no way to distinguish "this plugin is experimental and |
| 23 | +may change" from "this plugin is stable and its config API is |
| 24 | +committed." Without a clear support contract, operators can't |
| 25 | +make informed deployment decisions, and maintainers can't iterate |
| 26 | +on plugin designs without risking silent breakage for users who |
| 27 | +adopted them early. |
| 28 | + |
| 29 | +A plugin lifecycle model would let experimentation and stability |
| 30 | +coexist: contributors can ship new plugins without the pressure |
| 31 | +of immediate stability guarantees, and operators can see exactly |
| 32 | +what they're opting into. |
| 33 | + |
| 34 | +## Goals |
| 35 | + |
| 36 | +* Define maturity tiers for EPP plugins (Alpha, Beta, Stable) |
| 37 | + with clear support contracts at each tier |
| 38 | +* Gate experimental plugins behind feature gates so they're |
| 39 | + off by default and require explicit opt-in |
| 40 | +* Reject removed plugins at config validation time with |
| 41 | + actionable error messages |
| 42 | +* Communicate stability to operators at startup via structured |
| 43 | + log messages |
| 44 | + |
| 45 | +## Non-Goals |
| 46 | + |
| 47 | +* Runtime stability negotiation (plugins don't change stability |
| 48 | + while running) |
| 49 | +* Out-of-tree plugin certification, conformance testing, or |
| 50 | + governance of stability declarations |
| 51 | +* CRD-level stability annotations (this proposal covers compiled |
| 52 | + EPP plugins only) |
| 53 | + |
| 54 | +## Prior Art |
| 55 | + |
| 56 | +kube-scheduler gates alpha plugins via feature gates and |
| 57 | +hard-rejects removed plugins at config validation time. Gateway |
| 58 | +API uses [Standard/Experimental channels](https://gateway-api.sigs.k8s.io/concepts/versioning/) with |
| 59 | +formal graduation criteria. Neither system puts stability |
| 60 | +metadata in the plugin interface itself. |
| 61 | + |
| 62 | +## Proposed Design |
| 63 | + |
| 64 | +Stability is managed through the plugin registry, feature gates, |
| 65 | +and config validation, not through the `Plugin` interface. |
| 66 | + |
| 67 | +### Stability Levels |
| 68 | + |
| 69 | +Plugin stability uses three maturity tiers: Alpha, Beta, and |
| 70 | +Stable. These are plugin-specific labels, not Kubernetes API |
| 71 | +versions. There is no separate "Deprecated" level: deprecation |
| 72 | +is a signal (a message indicating replacement), not a maturity |
| 73 | +tier. The plugin's current level determines its removal timeline. |
| 74 | + |
| 75 | +| Level | Default | Config Contract | Removal Policy | |
| 76 | +|-------|---------|-----------------|----------------| |
| 77 | +| **Alpha** | Gated off (requires feature gate) | No compatibility guarantee. Config schema may change between releases. | Can be removed in any release. | |
| 78 | +| **Beta** | Gated on | Config schema is stable. Behavioral changes require release notes. | 2 releases + 6 months after deprecation notice. | |
| 79 | +| **Stable** | Always available | Full backward compatibility within config API version. | Not removed within a config API major version. | |
| 80 | + |
| 81 | +**Deprecation** is orthogonal to level. A plugin at any level |
| 82 | +can carry a deprecation message signaling that it will be |
| 83 | +removed. The level determines how long it must remain available |
| 84 | +after that signal. When the policy window expires, the plugin is |
| 85 | +removed from the registry entirely. A separate validation |
| 86 | +tombstone provides the migration message for stale configs that |
| 87 | +still reference it. |
| 88 | + |
| 89 | +**Removal** is not a stability level. Removed plugins are |
| 90 | +deleted from the registry. A tombstone map in the validation |
| 91 | +layer catches stale configs and returns actionable errors with |
| 92 | +migration guidance. |
| 93 | + |
| 94 | +These tiers and their removal policies are defined by this |
| 95 | +proposal and are specific to GIE's plugin system. They do not |
| 96 | +map to Kubernetes API versions and are independent of the |
| 97 | +`EndpointPickerConfig` API version. |
| 98 | + |
| 99 | +### Key Mechanisms |
| 100 | + |
| 101 | +**Registry metadata.** The existing `plugin.Registry` (a |
| 102 | +`map[string]FactoryFunc`) is extended to carry stability, |
| 103 | +feature gate, and deprecation message alongside the factory |
| 104 | +function. This is the single source of truth for plugin |
| 105 | +maturity. No changes to the `Plugin` interface are needed. |
| 106 | + |
| 107 | +**Feature gate integration.** Alpha plugins require an explicit |
| 108 | +feature gate in `EndpointPickerConfig.FeatureGates`. GIE already |
| 109 | +has a `FeatureGates []string` field on the config; this proposal |
| 110 | +extends its use to cover per-plugin gating. |
| 111 | + |
| 112 | +**Config validation.** At config load time: |
| 113 | +* Alpha plugins without their feature gate enabled are rejected |
| 114 | + with an actionable error |
| 115 | +* Removed plugins are rejected with migration guidance |
| 116 | +* Plugins with a deprecation message are accepted but log a |
| 117 | + warning with the replacement and removal timeline |
| 118 | + |
| 119 | +**Startup logging.** Every loaded plugin is logged with its |
| 120 | +stability level and any deprecation message. This gives |
| 121 | +operators immediate visibility into what they're running. |
| 122 | + |
| 123 | +## Implementation |
| 124 | + |
| 125 | +The implementation is scoped to the GIE framework packages. No |
| 126 | +changes to the `Plugin` interface or individual plugin code are |
| 127 | +required in Phase 1 or 2. |
| 128 | + |
| 129 | +### Current State |
| 130 | + |
| 131 | +Today `plugin.Registry` is a `map[string]FactoryFunc` with no |
| 132 | +metadata. Feature gates are phase-level (`prepareDataPlugins`, |
| 133 | +`experimentalDatalayer`, `flowControl`), not per-plugin. |
| 134 | +Validation checks profile references and gate names but knows |
| 135 | +nothing about plugin maturity. |
| 136 | + |
| 137 | +### Phase 1: Registry Metadata + Startup Logging |
| 138 | + |
| 139 | +**Goal:** Every plugin in the registry carries stability |
| 140 | +metadata. Operators see stability at startup. |
| 141 | + |
| 142 | +**Changes to `pkg/epp/framework/interface/plugin/registry.go`:** |
| 143 | + |
| 144 | +```go |
| 145 | +// StabilityLevel defines the maturity of a registered plugin. |
| 146 | +// Three maturity tiers that define the config contract and |
| 147 | +// removal policy. These are plugin-specific labels, not |
| 148 | +// Kubernetes API versions. Deprecation is orthogonal (a |
| 149 | +// message, not a level). Removal means the plugin leaves |
| 150 | +// the registry entirely. |
| 151 | +type StabilityLevel string |
| 152 | + |
| 153 | +const ( |
| 154 | + // Unknown marks plugins that have not opted into the lifecycle |
| 155 | + // model (registered via the backward-compatible Register()). |
| 156 | + // It is not the Go zero value (""), which IsValid rejects. |
| 157 | + Unknown StabilityLevel = "Unknown" |
| 158 | + Alpha StabilityLevel = "Alpha" |
| 159 | + Beta StabilityLevel = "Beta" |
| 160 | + Stable StabilityLevel = "Stable" |
| 161 | +) |
| 162 | + |
| 163 | +// IsValid reports whether s is a recognized stability level. |
| 164 | +// Unknown is recognized but carries no support contract: it |
| 165 | +// indicates the plugin has not declared its stability. |
| 166 | +func (s StabilityLevel) IsValid() bool { |
| 167 | + switch s { |
| 168 | + case Unknown, Alpha, Beta, Stable: |
| 169 | + return true |
| 170 | + } |
| 171 | + return false |
| 172 | +} |
| 173 | + |
| 174 | +// RegistryEntry holds a plugin factory and its lifecycle |
| 175 | +// metadata. |
| 176 | +type RegistryEntry struct { |
| 177 | + // Factory instantiates the plugin. |
| 178 | + Factory FactoryFunc |
| 179 | + |
| 180 | + // Stability is the maturity level of this plugin. |
| 181 | + // Unknown for plugins registered via Register(); |
| 182 | + // Alpha, Beta, or Stable for plugins registered |
| 183 | + // via MustRegister(). |
| 184 | + Stability StabilityLevel |
| 185 | + |
| 186 | + // FeatureGate is the feature gate name required for |
| 187 | + // Alpha plugins. Must be non-empty when Stability is |
| 188 | + // Alpha. |
| 189 | + FeatureGate string |
| 190 | + |
| 191 | + // DeprecationMessage, if non-empty, signals that this |
| 192 | + // plugin will be removed in a future release. Logged as |
| 193 | + // a warning at startup. The plugin remains fully |
| 194 | + // functional. The removal timeline is determined by the |
| 195 | + // plugin's stability level. |
| 196 | + DeprecationMessage string |
| 197 | +} |
| 198 | + |
| 199 | +// Registry is the global plugin registry, keyed by plugin |
| 200 | +// type string. All registration must complete before |
| 201 | +// LoadRawConfig is called. Concurrent registration is not |
| 202 | +// supported. |
| 203 | +var Registry = map[string]RegistryEntry{} |
| 204 | + |
| 205 | +// Register adds a plugin factory to the registry without |
| 206 | +// stability metadata. Plugins registered this way get Unknown |
| 207 | +// stability and will log a warning at startup prompting the |
| 208 | +// author to migrate to MustRegister. This preserves backward |
| 209 | +// compatibility for out-of-tree plugins that have not yet |
| 210 | +// opted into the lifecycle model. |
| 211 | +func Register(pluginType string, factory FactoryFunc) { |
| 212 | + Registry[pluginType] = RegistryEntry{ |
| 213 | + Factory: factory, |
| 214 | + Stability: Unknown, |
| 215 | + } |
| 216 | +} |
| 217 | + |
| 218 | +// MustRegister adds a plugin factory with explicit lifecycle |
| 219 | +// metadata. It panics if the entry is invalid. |
| 220 | +func MustRegister(pluginType string, entry RegistryEntry) { |
| 221 | + if !entry.Stability.IsValid() { |
| 222 | + panic(fmt.Sprintf( |
| 223 | + "plugin %q: invalid stability level %q", |
| 224 | + pluginType, entry.Stability)) |
| 225 | + } |
| 226 | + if entry.Stability == Alpha && entry.FeatureGate == "" { |
| 227 | + panic(fmt.Sprintf( |
| 228 | + "plugin %q: alpha plugins must specify a FeatureGate", |
| 229 | + pluginType)) |
| 230 | + } |
| 231 | + if entry.Factory == nil { |
| 232 | + panic(fmt.Sprintf( |
| 233 | + "plugin %q: Factory must not be nil", |
| 234 | + pluginType)) |
| 235 | + } |
| 236 | + Registry[pluginType] = entry |
| 237 | +} |
| 238 | +``` |
| 239 | + |
| 240 | +**Startup logging** is a separate pass (`logPluginStability`) |
| 241 | +that runs after validation but before factory calls. It logs |
| 242 | +each plugin's name, type, and stability level. Plugins with a |
| 243 | +`DeprecationMessage` get an additional warning. |
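
The logging pass described above could look roughly like the following. This is a minimal sketch, not the proposed implementation: `pluginSpec`, `stabilityLogLines`, and the message wording are hypothetical, the types are trimmed local copies of the ones proposed earlier, and plain strings stand in for whatever structured logger the EPP actually uses.

```go
package main

import "fmt"

// StabilityLevel and RegistryEntry are trimmed local copies of the proposed
// types so that this sketch compiles on its own.
type StabilityLevel string

const (
	Unknown StabilityLevel = "Unknown"
	Alpha   StabilityLevel = "Alpha"
	Beta    StabilityLevel = "Beta"
	Stable  StabilityLevel = "Stable"
)

type RegistryEntry struct {
	Stability          StabilityLevel
	DeprecationMessage string
}

// pluginSpec is a hypothetical stand-in for one entry of cfg.Plugins.
type pluginSpec struct {
	Name string
	Type string
}

// stabilityLogLines returns one INFO line per configured plugin, plus a WARN
// line for Unknown stability and another for a non-empty deprecation message.
// Returning strings (instead of logging directly) keeps the sketch testable.
func stabilityLogLines(specs []pluginSpec, registry map[string]RegistryEntry) []string {
	var lines []string
	for _, spec := range specs {
		entry, ok := registry[spec.Type]
		if !ok {
			continue // Unregistered types are reported by a later step.
		}
		lines = append(lines, fmt.Sprintf(
			"INFO plugin loaded name=%q type=%q stability=%q",
			spec.Name, spec.Type, entry.Stability))
		if entry.Stability == Unknown {
			lines = append(lines, fmt.Sprintf(
				"WARN plugin has no declared stability name=%q type=%q",
				spec.Name, spec.Type))
		}
		if entry.DeprecationMessage != "" {
			lines = append(lines, fmt.Sprintf(
				"WARN plugin is deprecated name=%q type=%q message=%q",
				spec.Name, spec.Type, entry.DeprecationMessage))
		}
	}
	return lines
}

func main() {
	// All plugin names and types below are made up for illustration.
	registry := map[string]RegistryEntry{
		"example-picker": {Stability: Stable},
		"example-scorer": {Stability: Beta, DeprecationMessage: "use example-picker"},
		"example-legacy": {Stability: Unknown},
	}
	specs := []pluginSpec{
		{Name: "picker", Type: "example-picker"},
		{Name: "scorer", Type: "example-scorer"},
		{Name: "legacy", Type: "example-legacy"},
	}
	for _, line := range stabilityLogLines(specs, registry) {
		fmt.Println(line)
	}
}
```

Iterating over the configured plugins rather than the whole registry keeps the output limited to what the operator is actually running.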
| 244 | + |
| 245 | +**Migration path:** Existing `plugin.Register()` calls continue |
| 246 | +to work with `Unknown` stability. Plugin authors adopt |
| 247 | +`MustRegister()` at their own pace. |
| 248 | + |
| 249 | +### Phase 2: Alpha Gating + Removed Plugin Rejection |
| 250 | + |
| 251 | +**Goal:** Alpha plugins require explicit opt-in. Removed plugins |
| 252 | +produce actionable errors. Stability validation runs before |
| 253 | +plugin factories are called. |
| 254 | + |
| 255 | +```go |
| 256 | +// removedPlugins is a tombstone map for plugins that have been |
| 257 | +// deleted from the registry. When an operator's config |
| 258 | +// references a removed plugin, validation returns an actionable |
| 259 | +// error with migration guidance instead of the generic "not |
| 260 | +// registered" error from instantiatePlugins. Tombstones are |
| 261 | +// permanent and small. |
| 262 | +var removedPlugins = map[string]string{ |
| 263 | + // Populated as plugins are removed. Key is the plugin type, |
| 264 | + // value is the migration message. Example: |
| 265 | + // "old-plugin": "Use new-plugin instead. See https://...", |
| 266 | +} |
| 267 | + |
| 268 | +func validatePluginStability( |
| 269 | + cfg *configapi.EndpointPickerConfig, |
| 270 | +) error { |
| 271 | + enabledGates := sets.New(cfg.FeatureGates...) |
| 272 | + |
| 273 | + for _, spec := range cfg.Plugins { |
| 274 | + // Check tombstones first -- give a useful migration |
| 275 | + // error instead of the generic "not registered" from |
| 276 | + // instantiatePlugins. |
| 277 | + if msg, ok := removedPlugins[spec.Type]; ok { |
| 278 | + return fmt.Errorf( |
| 279 | + "plugin type '%s' has been removed: %s", |
| 280 | + spec.Type, msg, |
| 281 | + ) |
| 282 | + } |
| 283 | + |
| 284 | + entry, ok := fwkplugin.Registry[spec.Type] |
| 285 | + if !ok { |
| 286 | + continue // Will be caught by instantiatePlugins. |
| 287 | + } |
| 288 | + |
| 289 | + // Alpha plugins require their feature gate to be |
| 290 | + // explicitly enabled. |
| 291 | + if entry.Stability == fwkplugin.Alpha { |
| 292 | + if !enabledGates.Has(entry.FeatureGate) { |
| 293 | + return fmt.Errorf( |
| 294 | + "plugin '%s' (type: %s) is alpha and "+ |
| 295 | + "requires feature gate '%s' to be "+ |
| 296 | + "enabled in featureGates", |
| 297 | + spec.Name, spec.Type, entry.FeatureGate, |
| 298 | + ) |
| 299 | + } |
| 300 | + } |
| 301 | + } |
| 302 | + return nil |
| 303 | +} |
| 304 | +``` |
| 305 | + |
| 306 | +**Removed plugins** are deleted from the registry. The |
| 307 | +maintainer removes the `MustRegister` call and adds a tombstone |
| 308 | +to `removedPlugins`. Tombstones are permanent and small. |
| 309 | + |
| 310 | +**Feature gate registration** for alpha plugins is manual via |
| 311 | +`loader.RegisterFeatureGate()`, called alongside |
| 312 | +`plugin.MustRegister()`. |
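
To illustrate how gate registration and plugin registration fit together, here is a self-contained sketch under stated assumptions. `registerFeatureGate` is a hypothetical stand-in for `loader.RegisterFeatureGate()` (its real signature may differ), `mustRegister` and `checkGate` are simplified local versions of the logic shown above, `FactoryFunc` is reduced to a no-argument function, and every plugin and gate name is invented.

```go
package main

import "fmt"

type StabilityLevel string

const (
	Alpha  StabilityLevel = "Alpha"
	Stable StabilityLevel = "Stable"
)

// FactoryFunc is simplified here; the real factory signature differs.
type FactoryFunc func() (any, error)

type RegistryEntry struct {
	Factory     FactoryFunc
	Stability   StabilityLevel
	FeatureGate string
}

var (
	registry   = map[string]RegistryEntry{}
	knownGates = map[string]bool{}
)

// registerFeatureGate is a hypothetical stand-in for
// loader.RegisterFeatureGate(): it makes a gate name known to validation.
func registerFeatureGate(name string) { knownGates[name] = true }

// mustRegister is a trimmed local version of the proposed MustRegister.
func mustRegister(pluginType string, entry RegistryEntry) {
	if entry.Stability == Alpha && entry.FeatureGate == "" {
		panic(fmt.Sprintf("plugin %q: alpha plugins must specify a FeatureGate", pluginType))
	}
	if entry.Factory == nil {
		panic(fmt.Sprintf("plugin %q: Factory must not be nil", pluginType))
	}
	registry[pluginType] = entry
}

// checkGate mirrors the alpha check in validatePluginStability for a single
// plugin type, given the gates enabled in the operator's config.
func checkGate(pluginType string, enabled []string) error {
	for _, g := range enabled {
		if !knownGates[g] {
			return fmt.Errorf("unknown feature gate %q", g)
		}
	}
	entry, ok := registry[pluginType]
	if !ok {
		return fmt.Errorf("plugin type %q is not registered", pluginType)
	}
	if entry.Stability != Alpha {
		return nil
	}
	for _, g := range enabled {
		if g == entry.FeatureGate {
			return nil
		}
	}
	return fmt.Errorf("plugin type %q is alpha and requires feature gate %q",
		pluginType, entry.FeatureGate)
}

// registerExamplePlugins registers the gate next to the alpha plugin that
// needs it, as the paragraph above describes.
func registerExamplePlugins() {
	factory := func() (any, error) { return struct{}{}, nil }
	registerFeatureGate("exampleAlphaScorer")
	mustRegister("example-alpha-scorer", RegistryEntry{
		Factory: factory, Stability: Alpha, FeatureGate: "exampleAlphaScorer",
	})
	mustRegister("example-stable-picker", RegistryEntry{
		Factory: factory, Stability: Stable,
	})
}

func main() {
	registerExamplePlugins()
	fmt.Println(checkGate("example-alpha-scorer", nil))
	fmt.Println(checkGate("example-alpha-scorer", []string{"exampleAlphaScorer"}))
	fmt.Println(checkGate("example-stable-picker", nil))
}
```

Because the two registrations are separate calls, forgetting the gate registration would leave operators unable to enable the plugin without tripping gate-name validation. That coupling may be worth automating later.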
| 313 | + |
| 314 | +## Open Questions |
| 315 | + |
| 316 | +1. Should alpha plugins be completely invisible in the default |
| 317 | + config, or just gated off? |
| 318 | +2. Should graduation criteria be GIE-specific, or adopt Gateway |
| 319 | + API's requirements? |
| 320 | +3. Where does the stability policy live: `docs/plugin-lifecycle.md`, |
| 321 | + `CONTRIBUTING.md`, or a dedicated proposal? |
| 322 | + |