Skip to content

Enhance the evaluation rules of AccessPolicy objects#220

Open
haiyanmeng wants to merge 6 commits intokubernetes-sigs:mainfrom
haiyanmeng:policy-evaluation-order
Open

Enhance the evaluation rules of AccessPolicy objects#220
haiyanmeng wants to merge 6 commits intokubernetes-sigs:mainfrom
haiyanmeng:policy-evaluation-order

Conversation

@haiyanmeng
Copy link
Copy Markdown
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:
The original evaluation rules had several issues:

  1. When an object is targeted by multiple InlineTools AccessPolicy objects, the request is only allowed if all the AccessPolicy objects allow the request.

    • Expected behavior: the request should be allowed if any of the AccessPolicy objects allows the request.

    • Rationale: Be consistent with prior art like Istio.

  2. The evaluation does not specify the evaluation order when both InlineTools and ExternalAuth rules exist for a target object.

    • Expected behavior: When both InlineTools and ExternalAuth rules exist for a target object, ExternalAuth rules should be evaluated first.

    • Rationale: efficiency + be consistent with prior art like Istio.

In addition, to make it easy for users to understand why a tool access is allowed or denied, we will disallow the combination of InlineTools and ExternalAuth in the same AccessPolicy.

Which issue(s) this PR fixes:

Addresses #217

Does this PR introduce a user-facing change?:

[Breaking Change] When an object is targeted by multiple InlineTools AccessPolicy objects, the request is allowed if any of the AccessPolicy objects allows the request.
[Breaking Change] Combining InlineTools and ExternalAuth in the same AccessPolicy is not allowed.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 3, 2026
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 3, 2026

Deploy Preview for kube-agentic-networking ready!

Name Link
🔨 Latest commit ce17e5b
🔍 Latest deploy log https://app.netlify.com/projects/kube-agentic-networking/deploys/69d507ea4c21bc00081dc0e4
😎 Deploy Preview https://deploy-preview-220--kube-agentic-networking.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: haiyanmeng
Once this PR has been reviewed and has the lgtm label, please assign david-martin for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 3, 2026
@haiyanmeng
Copy link
Copy Markdown
Contributor Author

}
```

In addition, to make it easy for users to understand why a tool access is allowed or denied, we will disallow the combination of `InlineTools` and `ExternalAuth` in the same `AccessPolicy`.
Copy link
Copy Markdown
Contributor Author

@haiyanmeng haiyanmeng Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason for this is to keep the complexity of the API under control and make it easier for vendors to implement this.

If folk are okay with this restriction, I will add the proposed API change.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between combining InlineTools and ExternalAuth in the same AccessPolicy versus in different ones? I could argue that, if you have both options, it's easier to manage when they are declared in the same object.

Additionally, should we accept that ExternalAuth is only for when targeting a Gateway now? If we want Backend-level ExternalAuth to trigger first and not to mix with Backend-level InlineTools, then why not setting ExternalAuth up in the HTTPRoute instead? (I can only think of one reason that is probably not a very common use case: triggering multiple callouts.)

@haiyanmeng
Copy link
Copy Markdown
Contributor Author

  1. When an object is targeted by multiple InlineTools AccessPolicy objects, the request is only allowed if all the AccessPolicy objects allow the request.

    • Expected behavior: the request should be allowed if any of the AccessPolicy objects allows the request.
    • Rationale: Be consistent with prior art like Istio.
  2. The evaluation does not specify the evaluation order when both InlineTools and ExternalAuth rules exist for a target object.

    • Expected behavior: When both InlineTools and ExternalAuth rules exist for a target object, ExternalAuth rules should be evaluated first.
    • Rationale: efficiency + be consistent with prior art like Istio.

@guicassolato , do these two changes make sense to you?

@guicassolato
Copy link
Copy Markdown
Contributor

It's a bank holiday where I live so I will only be back on Tuesday. Therefore, I couldn't give this PR the thorough review it deserves yet, but here are my 2¢ based on what I saw at a glance:

  1. The order of evaluation/enforcement of the rules should be the order of declaration - no obscure behaviour like ExternalAuth goes first, InlineTools next. This is what I think makes more sense for users, regardless of what implementation X does. It's simple and straightforward.
  2. There's no reason not to allow ExternalAuth and InlineTools in the same object – if they can be in different objects, so they should be allowed to declare in the same one. Please, check out as well these 5 reasons why NOT to move ExternalAuth to the top level of the AccessPolicy, which effectively is what not allowing using ExternalAuth and InlineTools together is trying to do, I suppose.
  3. ALL RULES must allow access for access to be granted, not AT LEAST ONE RULE, otherwise there may be a rogue policy that grants access while the user struggles to figure out why another policy that denies access seems not to be working as expected. It is the safe approach. Therefore, in bullet point number 2 of this comment, I am of the opinion that the observed behaviour is in fact the correct one.

As I mentioned on Slack, I think it's good a thing that we're trying to fill some definition gaps. My first impression nevertheless was that some of the new definitions proposed here feel like we may be overthinking it a little. This only means I think we should give stakeholders a chance of forming a second impression, hence my suggestion to hold at least a second round of discussion and my commitment to review the PR more thoroughly, trying to better understand the other POVs.

@keithmattix
Copy link
Copy Markdown

ALL RULES must allow access for access to be granted, not AT LEAST ONE RULE, otherwise there may be a rogue policy that grants access while the user struggles to figure out why another policy that denies access seems not to be working as expected. It is the safe approach. Therefore, in bullet point number 2 of #214 (comment), I am of the opinion that the observed behaviour is in fact the correct one.

Under that framework, don't you have the inverse problem where a rogue policy can block traffic to ~any endpoint? Regardless, if I understand the table correctly, the semantics here are the same as Istio's which means DENY rules override any allow rules. So this scenario couldn't even happen.

@guicassolato
Copy link
Copy Markdown
Contributor

guicassolato commented Apr 3, 2026

ALL RULES must allow access for access to be granted, not AT LEAST ONE RULE, otherwise there may be a rogue policy that grants access while the user struggles to figure out why another policy that denies access seems not to be working as expected. It is the safe approach. Therefore, in bullet point number 2 of #214 (comment), I am of the opinion that the observed behaviour is in fact the correct one.

Under that framework, don't you have the inverse problem where a rogue policy can block traffic to ~any endpoint?

Yes, and this is a good thing. That’s what meant by the safe approach.

@haiyanmeng haiyanmeng force-pushed the policy-evaluation-order branch from 9415bde to fbea6f7 Compare April 3, 2026 18:15
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 3, 2026
@haiyanmeng haiyanmeng force-pushed the policy-evaluation-order branch 3 times, most recently from be6c0c8 to d392ed8 Compare April 3, 2026 18:28
The original evaluation rules had several issues:

1. When an object is targeted by multiple InlineTools AccessPolicy
   objects, the request is only allowed if all the AccessPolicy objects
   allow the request.

   * Expected behavior: the request should be allowed if any of the
     AccessPolicy objects allows the request.

   * Rationale: Be consistent with prior art like Istio.

2. The evaluation does not specify the evaluation order when both
   InlineTools and ExternalAuth rules exist for a target object.

   * Expected behavior: When both InlineTools and ExternalAuth rules
     exist for a target object, ExternalAuth rules should be evaluated
     first.

   * Rationale: efficiency + be consistent with prior art like Istio.

In addition, to make it easy for users to understand why a tool access
is allowed or denied, we will disallow the combination of InlineTools
and ExternalAuth in the same AccessPolicy.
@haiyanmeng haiyanmeng force-pushed the policy-evaluation-order branch from d392ed8 to b1afab1 Compare April 3, 2026 18:33
@robscott
Copy link
Copy Markdown
Member

robscott commented Apr 3, 2026

ALL RULES must allow access for access to be granted, not AT LEAST ONE RULE, otherwise there may be a rogue policy that grants access while the user struggles to figure out why another policy that denies access seems not to be working as expected. It is the safe approach. Therefore, in bullet point number 2 of #214 (comment), I am of the opinion that the observed behaviour is in fact the correct one.

Under that framework, don't you have the inverse problem where a rogue policy can block traffic to ~any endpoint?

Yes, and this is a good thing. That’s what meant by the safe approach.

Just to make sure I understand the conversation here, let me provide an example:

policy 1:

  • rule a: identity foo can talk to the github tool

policy 2:

  • rule b: identity foo can talk to the gitlab tool

Taking this guideline literally, it seems like foo would not be able to talk to github or gitlab. That's because although policy 1 allows foo to access github, it does not allow foo to access gitlab. With a similar situation for policy 2. Is that what you're intending to say @guicassolato? I'm assuming I'm misunderstanding here.

@guicassolato
Copy link
Copy Markdown
Contributor

ALL RULES must allow access for access to be granted, not AT LEAST ONE RULE, otherwise there may be a rogue policy that grants access while the user struggles to figure out why another policy that denies access seems not to be working as expected. It is the safe approach. Therefore, in bullet point number 2 of #214 (comment), I am of the opinion that the observed behaviour is in fact the correct one.

Under that framework, don't you have the inverse problem where a rogue policy can block traffic to ~any endpoint?

Yes, and this is a good thing. That’s what meant by the safe approach.

Just to make sure I understand the conversation here, let me provide an example:

policy 1:

  • rule a: identity foo can talk to the github tool

policy 2:

  • rule b: identity foo can talk to the gitlab tool

Taking this guideline literally, it seems like foo would not be able to talk to github or gitlab. That's because although policy 1 allows foo to access github, it does not allow foo to access gitlab. With a similar situation for policy 2. Is that what you're intending to say @guicassolato? I'm assuming I'm misunderstanding here.

No, that is not what I intend to say.

Let’s begin with: Anybody who has the authority to attach a policy at any point in the request path shall have this policy honoured.

Are github and gitlab both in the same request path? A good representation of the network topology would separate them into two different logical paths, with two different points of attachment — one for each policy. github policy only activates in the github path, gitlab policy only activates in the gitlab path.

What’s more interesting about this silly and obvious example though is to think what would happen if the distinction between github and gitlab paths could not be represented in way to create two optional attachment points — unrealistic, but for the sake of argument.

Say, for example, we didn’t have different HTTPRoutes or different Backends for each host (github and gitlab), and both policies were attached to the gateway and were as poorly crafted as in the example. I.e. p1 attached to the gateway says if principal == 'bob' -> allow if host == 'github', p2 also attached to the gateway says if principal == 'bob' -> allow if host == 'gitlab'. If that’s all there is, and of course both policy owners had authority to create such policies, for any given request from 'bob', enforced at the gateway point, at least one of the policies would evaluate to DENY, since host cannot be simultaneously github and gitlab.

Once this mistake is noticed, the two policy owners can talk to each other. They are both attaching policies to the same gateway, they probably work on the same team, maybe they are even the same person. They will realise the policy should be better written as one: if principal == 'bob' -> allow if host in ('github', 'gitlab').

A policy only speaks to what it says, within the jurisdiction it governs. It cannot be read as prohibiting what it simply does not address. Give it enough context so it can exert its power strictly according to what it says and no more. And when two policy owners share jurisdiction over the same point, they bear the responsibility of writing consistent, non-conflicting rules, taking into account the information they know to be available at the time to enforce the policy.

@howardjohn
Copy link
Copy Markdown

I'd be interested to see if there was prior art supporting that pattern. a quick search shows the Istio Authz, k8s NetPol, GCP iam, AWS iam, and OPA all have the "any policy should allow". I actually could not find a single example behaving differently.

@LiorLieberman
Copy link
Copy Markdown
Member

What’s more interesting about this silly and obvious example though is to think what would happen if the distinction between github and gitlab paths could not be represented in way to create two optional attachment points — unrealistic, but for the sake of argument.

Say, for example, we didn’t have different HTTPRoutes or different Backends for each host (github and gitlab), and both policies were attached to the gateway and were as poorly crafted as in the example. I.e. p1 attached to the gateway says if principal == 'bob' -> allow if host == 'github', p2 also attached to the gateway says if principal == 'bob' -> allow if host == 'gitlab'. If that’s all there is, and of course both policy owners had authority to create such policies, for any given request from 'bob', enforced at the gateway point, at least one of the policies would evaluate to DENY, since host cannot be simultaneously github and gitlab.

Once this mistake is noticed, the two policy owners can talk to each other. They are both attaching policies to the same gateway, they probably work on the same team, maybe they are even the same person. They will realise the policy should be better written as one: if principal == 'bob' -> allow if host in ('github', 'gitlab').

I'd be very concerned with the approach. This will likely result in policy authors to create policies that deny previously allowed-by-policy traffic policies.

Both Gateways and Backends are namespaced policies. We generally are not concerned by accidentally granting broad access that another policy is trying to restrict. While its not always the case, we technically dont assume different owners for objects within a namespace.

So for the case where targetRef == Gateway, this is the attachment point (and the policy aggregation point). If something is allowing the request to come to the Gateway it should be allowed (unless we have a explicit DENY policy, which the API currently lacks). No ALLOW policies should cause a DENY of other traffic thats allowed by policy(ies) targeting the same gateway

A good representation of the network topology would separate them into two different logical paths, with two different points of attachment — one for each policy.

Regarding this comment, I dont necessarily think thats true 100% of the time. If the policy author were to own gitlab or github, then yes (maybe?). But remember that when targetRef a Gateway, the policy author is the owner of the namespace. And with this in mind, gitlab and github become just a parameter of the request, just like a header or any other l7 attribute, not an attachment point. Note that this is true for both ingress and egress imo.

@guicassolato
Copy link
Copy Markdown
Contributor

It looks like a majority is in the making in favour of "a single ALLOW suffices".

I stand firm with my opinion that, from a security POV, this is a bad design. It is the perfect example of the classical metaphor of chains that break at the weakest link. (A poor ALLOW breaks all strong DENIES.) But that's fine, as not all decisions have to be a consensus.

I actually could not find a single example behaving differently.

@howardjohn, I respect the decision of other policy engines which may have decided to do it this way. However, it doesn't change the fact that this is not a best practice.

A good representation of the network topology would separate them into two different logical paths, with two different points of attachment — one for each policy.

Regarding this comment, I dont necessarily think thats true 100% of the time. If the policy author were to own gitlab or github, then yes (maybe?). But remember that when targetRef a Gateway, the policy author is the owner of the namespace. And with this in mind, gitlab and github become just a parameter of the request, just like a header or any other l7 attribute, not an attachment point. Note that this is true for both ingress and egress imo.

@LiorLieberman, if it can be represented as a Gateway (ingress or egress), as an xRoute, as a Backend, or as a named section of any of the above, then it can be an isolated attachment point 100% of the times. You don't have to own github or gitlab for that, only the namespace where the attachment point lives, and of course support all those kinds of attachment point in the API.

What you're arguing is that there's a generalisation that includes other L7 attributes beyond what can be represented as an attachment point with the Kubernetes resources we know. Although IMO this is not the case of github and gitlab specifically (representable as hostnames or as backends), overall I agree. In fact, the two POVs are perfectly compatible with each other.

I don't see how this changes the fact that conflicting policies should never exist though. And when it's not possible to guarantee that by availability of having enough points of attachment to isolate policies to their own request path, policy owners need to talk to each other and just write better policies.

@howardjohn
Copy link
Copy Markdown

I respect the decision of other policy engines which may have decided to do it this way. However, it doesn't change the fact that this is not a best practice.

I am not what I would consider an authoritative expert on security or authorization policy design. I don't think many (if not anyone) involved is (though there may be some, including you!). That is why I lean heavily on prior art. If every* existing system is doing it one way, and you propose an alternative, how can I as a non-expert weigh those? Without more evidence explaining why every other system is wrong its hard to accept a proposal that differs away from every other existing authorization service.

Even if we assume the propose approach is better, given how prevalent the other approaches are, we would probably want to be very confident that its much better, as there is always a risk in diverging from established practices, even when its a theoretically better approach (see Haskell vs Go as a silly example).

* I doubt every is true, if we find counter examples would be great

@guicassolato
Copy link
Copy Markdown
Contributor

@howardjohn,

I see your point and I respect that.

People who know me also know I don't fight to get things my way. I fight for democratic and informed decisions.

So, in the spirit of clarity and making sure we're making an informed decision, here's what's at stake IIUC:

Given:

  • Policy A specifies: ALLOW github
  • Policy B specifies: DENY github (via ext_authz)

An agent sends a request to github.

Most people seem to be converging to this request should be allowed.
My opinion is that the request should be denied.

The OASIS XACML standard calls these two opposing behaviours, respectively, permit-overrides and deny-overrides.

Depending on how ext_authz behaves, the example above can be no different than:

  • Policy A specifies: ALLOW github
  • Policy B specifies: ALLOW gitlab (via ext_authz)

A naive ext_authz implementation of Policy B could be (in OPA): default allow := false; allow if { input.request.host == "gitlab" }.

Once again, a request to github would evaluate to:

  • Policy A: ALLOW
  • Policy B: DENY

By permit-overrides, Policy A wins; by deny-overrides, the winner is Policy B.


wrt prior art:

There are a few policy engines that implement permit-overrides. Kubernetes RBAC and NetworkPolicy are probably the two best examples in the wild. By being additive and not supporting DENY rules, policies and role bindings can only expand permissions, never restrict. Therefore, you can short-circuit evaluation as soon as you find an ALLOW match.

I obviously will not pretend to know more about Istio than you do, but I don't see AuthorizationPolicy in the same category. By evaluating CUSTOM, DENY and ALLOW in this order, wouldn't it make a case of deny-override?

I am not a regular user of AWS IAM or GCP IAM, but my understanding is that they also implement deny-override. With AWS IAM, if policies that apply to a request include an ALLOW statement and a DENY statement, I believe the DENY statement will trump the ALLOW statement. Whereas in GCP there's this thing of denying a permission to a principal, effectively invalidating any roles the principal is bound to that otherwise would grant the principal with access.

OPA is a policy language. (Well, Rego is.) OPA lets you express the logic yourself. E.g.: allow if {cond1}; allow if {cond2} is a logical OR (permit-overrides style), while allow if {cond1; cond2} is an AND.

Kuadrant AuthPolicy implements deny-overrides strictly.

I hope these count as counter-examples.


Anyway, I think in the end we are discussing two mental models. The question is which default is safer for our threat model. In networking security, where the consequence of an erroneous ALLOW is a real security breach, deny-overrides is the standard industry recommendation, I reckon. That's at least my experience from working on software security with banks and in the payment processing industry in the past.

(BTW, this is not me trying to play the card of "expert", because I certainly do not consider myself one either. These are just things I heard enough in my ears to continue echoing to this date.)

That said, if we decide to go with permit-overrides instead, it will be consistent with other kinds of policy in Kubernetes. As long as it's an informed decision, I'd be OK with losing the battle.

Comment on lines +24 to +31
- **[ExternalAuth-type](https://github.com/kubernetes-sigs/kube-agentic-networking/blob/cf8c85b85d067657a5dce7b87270f8099f1e302c/api/v0alpha0/accesspolicy_types.go#L169) Rules**:
- Evaluated first.
- If **any** ExternalAuth rule matching the request returns `deny`, the request is denied immediately.
- **[InlineTools-type](https://github.com/kubernetes-sigs/kube-agentic-networking/blob/cf8c85b85d067657a5dce7b87270f8099f1e302c/api/v0alpha0/accesspolicy_types.go#L165) Rules**:
- Evaluated only if all ExternalAuth rules (if any) allowed the request.
- If there are **no** InlineTools rules matching the request, the request is allowed.
- If **any** InlineTools rule matching the request exists and allows it, the request is allowed.
- Otherwise, the request is denied.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I'd still prefer the rules to be evaluated in the order they appear in the policy, I think this is an acceptable compromise if it's to keep the deny-overrides behaviour.

For the reference implementation:

I think this will require configuring two RBAC filters - before and after ExternalAuth. So the config goes:

  1. Gateway-level RBAC filter with shadow rules for ExternalAuth
  2. Gateway-level Ext_authz filter
  3. Gateway-level RBAC filter with normal rules for InlineTools
  4. Backend-level RBAC filter with shadow rules for ExternalAuth
  5. Backend-level Ext_authz filter
  6. Backend-level RBAC filter with normal rules for InlineTools

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will require configuring two RBAC filters - before and after ExternalAuth. So the config goes:

  1. Gateway-level RBAC filter with shadow rules for ExternalAuth
  2. Gateway-level Ext_authz filter
  3. Gateway-level RBAC filter with normal rules for InlineTools

This looks right if there is only one Gateway-level ExternalAuth policy/rule.

If there are multiple Gateway-level ExternalAuth policies/rules, we may need multiple RABC filters for ExternalAuth. For example, if there are two Gateway-level ExternalAuth policies/rules, the HTTP filters in the reference implementation may look like:

Gateway-level RBAC filter with shadow rules for ExternalAuth-1
Gateway-level Ext_authz filter 1
Gateway-level RBAC filter with shadow rules for ExternalAuth-2
Gateway-level Ext_authz filter 2
Gateway-level RBAC filter with normal rules for InlineTools

Let me know if there are better ways to handle this in our reference implementation.

Comment thread docs/proposals/0059-AccessPolicyTargetRefs.md Outdated
Comment on lines +156 to +157
- If **any** allows, proceed to Backend level.
- If **all** deny, the request is rejected immediately.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we only support implicit deny (by having no match), shouldn't it always proceed to Backend-level?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no InlineTools policies matching the request, it proceeds to Backend-level checks.

In this case (at line 157), there are two InlineTools policies matching the request: gateway-policy-inline-tools-1 and gateway-policy-inline-tools-2. Neither of the policies allow the request. Hence the request should be denied immediately, and the Backend-level checks should be skipped.

Copy link
Copy Markdown
Contributor

@guicassolato guicassolato Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "matching the request" mean? Is it the fact that the policies are attached to the gateway that is handling the request, the fact that source matches the request, or both?

Are you saying that, when a Gateway policy exists and it "matches the request", if at least one of the InlineTools rules in the policy doesn't allow, then there's no point in having Backend-level policies possibly allowing the request?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an example:

kind: XAccessPolicy
spec:
  targetRefs:
  - kind: Gateway
  rules:
  - authorization:
      type: InlineTools
      tools: 
      - x
---
kind: XAccessPolicy
spec:
  targetRefs:
  - kind: XBackend
  rules:
  - authorization:
      type: InlineTools
      tools: 
      - y

tools/call request sent to "y".

Should the request be allowed or denied?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an example:

kind: XAccessPolicy
spec:
  targetRefs:
  - kind: Gateway
  rules:
  - authorization:
      type: InlineTools
      tools: 
      - x
---
kind: XAccessPolicy
spec:
  targetRefs:
  - kind: XBackend
  rules:
  - authorization:
      type: InlineTools
      tools: 
      - y

tools/call request sent to "y".

Should the request be allowed or denied?

The request should be denied following the rule If the request is denied at the gateway level, evaluation stops, and Backend-level policies are NOT evaluated.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except that the request is not explicitly denied at the gateway level. Rather, it's not allowed, which can then lead to implicitly denied, or alternatively could lead to deferred to backend level.

I wonder if an implicit deny at the gateway level when there are still other policies to check at backend level is the behaviour users expect.

This behaviour is equivalent to the atomic overrides merge strategy described by GEP-713. AFAIK, it has no counterpart in Istio, it's the exact opposite of the behaviour implemented by Envoy Gateway (atomic defaults), and exists in Kuadrant as an opt-in feature.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if an implicit deny at the gateway level when there are still other policies to check at backend level is the behaviour users expect.

I agree that we need user feedback regarding this.
Before we have user feedback, let us stick to implicit deny. WDYT?

@guicassolato @mikemorris @howardjohn

Copy link
Copy Markdown

@mikemorris mikemorris Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the trickiest part here is distinguishing between an explicitly configured empty set tools: [] field versus an unset/omitted field (which is particularly annoying in Go implementations, and would necessitate using a pointer type).

IMO, strict hierarchical evaluation (Gateway before Backend) means:

  • tools: [y] on a Gateway-targeted policy evaluates to an explicit ALLOW for calling tool y, any applicable Backend-targeted policy is evaluated next.
  • tools: [x] on a Gateway-targeted policy evaluates to an implicit "allow no tools except x" - an attempt to call tool y is not allowed, so any Backend-targeted policy is irrelevant
  • tools: [] on a Gateway-targeted policy implicitly allows no tool calls, any Backend-targeted policy is irrelevant

Critically, the "implicit allow nothing" is not actually modeled as a DENY which would halt further evaluation and reject the request, it's more like implicitly adding a higher-level default cluster-wide "allow nothing". In Istio this is actually created explicitly as an "allow-nothing" policy, whereas NetworkPolicy enforces this implicitly when an ALLOW rule is created (similar to what's being proposed here).

kind: XAccessPolicy
spec:
  targetRefs:
  - kind: Gateway
  rules: []

^ (or similarly, the lack of any AccessPolicy targeting the Gateway) does not specify any InlineTools ALLOW rules, and thus I would expect no implicit deny should exist and any Backend-targeted should be evaluated. Backend-targeted policies would also be evaluated when a Gateway-targeted AccessPolicy does explicitly allow a requested tool.

This is a somewhat contrived case currently, but could be much more relevant if AccessPolicy eventually became an additional ruleset extension of AuthorizationPolicy (which would make broader L4/L7 rules omitting a tools stanza much more common) or added an audit/"shadow" mode (for logging/testing before enforcement) as below:

  rules:
  - authorization:
      type: InlineTools
      action: ALLOW
      mode: AUDIT | ENFORCED (default when mode field is omitted)
      tools: 
      - x

(I've always found AUDIT as a peer enum value alongside ALLOW and DENY as confusing, because the intent of the rule becomes less clear.)

If we add a DENY action, we should consider whether that would also trigger the implicit "allow nothing" layer or not (as a narrow DENY breaking all traffic might be unexpected).

Comment thread docs/proposals/0059-AccessPolicyTargetRefs.md Outdated
Comment thread docs/proposals/0059-AccessPolicyTargetRefs.md
Comment on lines +253 to +258
| Level | Target Resource | Precedence |
|-------|-----------------|------------|
| Highest | Route Rule (via `sectionName`) | 1 (Wins all) |
| High | `HTTPRoute` / `GRPCRoute` | 2 |
| Medium | `Gateway` Listener (via `sectionName`) | 3 |
| Lowest | `Gateway` (Entire resource) | 4 |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly the same as Kuadrant AuthPolicy when the default merge strategy1 applies.

Footnotes

  1. With the overrides merge strategy, it's the exact reverse.

Comment thread docs/proposals/0059-AccessPolicyTargetRefs.md Outdated
Comment thread docs/proposals/0059-AccessPolicyTargetRefs.md Outdated
Comment thread docs/proposals/0059-AccessPolicyTargetRefs.md Outdated

One of the most important rules is how Linkerd handles multiple authentication requirements:

* OR (Between Policies): If you have two different AuthorizationPolicy objects targeting the same Server, the request is allowed if either one is satisfied.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another example of permit-overrides behaviour, although it doesn't seem to support explicit deny – just like the other two examples we have (Kubernetes RBAC and NetworkPolicy).

cc @howardjohn

@guicassolato
Copy link
Copy Markdown
Contributor

  1. When an object is targeted by multiple InlineTools AccessPolicy objects, the request is only allowed if all the AccessPolicy objects allow the request.

    • Expected behavior: the request should be allowed if any of the AccessPolicy objects allows the request.
    • Rationale: Be consistent with prior art like Istio.
  2. The evaluation does not specify the evaluation order when both InlineTools and ExternalAuth rules exist for a target object.

    • Expected behavior: When both InlineTools and ExternalAuth rules exist for a target object, ExternalAuth rules should be evaluated first.
    • Rationale: efficiency + be consistent with prior art like Istio.

@guicassolato , do these two changes make sense to you?

@haiyanmeng,

After giving a better read today, I think this is reasonable.

I also realised there was some confusion before about what I said in reference to an ExternalAuth rule that denies access in conflict with an InlineTools that allows access. In my mind, ExternalAuth is currently the only type of authorisation we support that can explicitly deny access. (InlineTools is only about allowing.)

I never intended to imply that 2 InlineTools rules should be capable of cancelling each other and effectively evaluate to DENY.

There's a difference between choosing to evaluate a rule that denies access (explicit deny) and skipping a rule that otherwise would allow access (implicit deny). A single allow rule that is skipped cannot implicitly deny access if there are other rules to evaluate. Only if all allow rules fail to match the conditions to be evaluated (i.e., are all skipped), we can enforce an implicit deny.

The conflict I was referring to was between an InlineTools rule and an ExternalAuth rule or between two ExternalAuth rules, i.e., for explicit denies only and therefore always involving ExternalAuth, since this is the only type of authorisation today that supports explicit deny. Or basically the "unexpected behaviour" you reported at #214.

Even the github vs gitlab example, I wasn't thinking about anything but how one could naively implement the gitlab check as allow if host == 'gitlab' associated to an ExternalAuth rule.

I already said that I can live with evaluating ExternalAuth before InlineTools. We just need to watch for the required changes in the reference implementation and what it means for policies targeting a Backend (compared to setting ExternalAuth in the HTTPRoute.)

I guess my last remaining concern then is the disallowing of the combination of ExternalAuth and InlineTools in the same object. I still don't understand the rationale for that.

- **ExternalAuth Policies**: `gateway-policy-external-auth-1` and `gateway-policy-external-auth-2` are evaluated.
- If **any** denies, the request is rejected immediately.
- If **all** allow, proceed.
- *Note: Evaluation order among these policies does not matter.*
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this "does not matter" is meant to mean "in the way its specified" or "to users"?

I think its probably important to users the order; usually if they are chaining, they are using external auth not just to give a allow/deny but to augment the request with additional information.

However I agree its a very challenging problem to solve if we allow multiple across resources 🙂 just that I would expect that eventually we see demand for ordering

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its probably important to users the order; usually if they are chaining, they are using external auth not just to give a allow/deny but to augment the request with additional information.

Good point.

Before this PR, the ordering is: the policies with earlier creationTimestamp will be evaluated first. For the policies with the same creationTimestamp, the ones appearing first in alphabetical order by {namespace}/{name} will be evaluated first.

Maybe we can keep this ordering for both ExternalAuth and ALLOW/InlineTools policies (strictly speaking, the evaluation order of ALLOW policies really does not matter). WDYT?

Copy link
Copy Markdown

@mikemorris mikemorris Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually noticed this reading through https://github.com/kubernetes-sigs/kube-agentic-networking/blob/main/docs/proposals/0059-AccessPolicyTargetRefs.md and felt the timestamp behavior was a bit unexpected - in Gateway API timestamps are mostly used to resolve conflicts during the evaluation of a resource by a controller (deciding whether to accept or reject it, such as if only a certain number of Gateway-targeted policies were allowed), and aren't typically relevant during evaluation/enforcement of a resource or policy by the dataplane at runtime.

First, gateway-policy-audit is evaluated (earlier creation timestamp).
Next, gateway-policy-region is evaluated.

I can see how, as @howardjohn mentions, mutations which augment a request could be relevant for subsequent policy decisions, but I think a more explicit ordered configuration would be preferable there, and I'm not sure if the mutating scope extends beyond current expectations for AccessPolicy and starts looking more like https://github.com/kubernetes-sigs/wg-ai-gateway/blob/main/proposals/7-payload-processing.md, which does have specific expectations around ordering and request mutation?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikemorris , thanks for the feedback. Folk seems in favor of limiting the number of ExternalAuth policies/rules per target to 1 (see the GH discussion).

I will update this PR after we finalize the GH dicussion regarding this.

@markfisher1
Copy link
Copy Markdown

This PR is solving real problems:

  • Clarifies ambiguous behavior
  • Aligns with ecosystem patterns
  • Moves toward a usable model

But we’re at a foundational layer.

Decisions made here will:

  • Define how people reason about security
  • Shape future API constraints
  • Either enable or block advanced use cases

Recommendation Before Merge

  • Define a formal evaluation contract (deterministic spec)
  • Clarify ALLOW vs DENY responsibilities
  • Reconsider composition restriction (InlineTools + ExternalAuth)
  • Explicitly define Gateway vs Backend authority model
  • Leave room for ordering + context evolution

If we get this right now, this becomes a reference model for agentic policy in Kubernetes.

If we rush it, we’ll be revisiting this with breaking changes in a few releases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants