Add OWASP Kubernetes resource importers#863

Open
Bornunique911 wants to merge 8 commits into
OWASP:mainfrom
Bornunique911:review/issue-471-kubernetes-importers

Conversation

@Bornunique911
Contributor

@Bornunique911 Bornunique911 commented Apr 4, 2026

Summary

This PR adds Kubernetes-related OWASP resource importer support for issue #471.

Adds importer/data/test support for:

  • OWASP Kubernetes Top Ten 2022
  • OWASP Kubernetes Top Ten 2025 (Draft)

This is the second upstream PR in the stacked #471 review series.

What changed

  • added Kubernetes parsers
  • added bundled source data
  • added parser tests

Validation

./venv/bin/python -m pytest application/tests/owasp_kubernetes_top10_2022_parser_test.py application/tests/owasp_kubernetes_top10_2025_parser_test.py -q

Why this is split out

The full #471 work is too large to review effectively as one PR.

This PR isolates one OWASP resource family so the parser/data model can be reviewed independently before the later Kubernetes, cheat sheet, backend analysis, and frontend changes.

@Bornunique911
Contributor Author

Addition:

Following on from issue #471 and PR #858, this PR adds two more OWASP resources: OWASP Kubernetes Top 10 2022 (https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/) and OWASP Kubernetes Top 10 2025 (Draft) (https://owasp.org/www-project-kubernetes-top-ten/2025/en/src/).

Screenshot of OWASP Kubernetes Top 10 2022:

image

Screenshot of OWASP Kubernetes Top 10 2025 (Draft):

image

@Pa04rth
Collaborator

Pa04rth commented Apr 5, 2026

@Bornunique911 attach the issue first

@Bornunique911
Contributor Author

Bornunique911 commented Apr 7, 2026

Kindly requesting reviews and feedback on this PR from @northdpole, @Pa04rth, and @robvanderveer.

@Bornunique911 Bornunique911 force-pushed the review/issue-471-kubernetes-importers branch from f23755c to 9390511 Compare April 11, 2026 07:58
@Bornunique911 Bornunique911 force-pushed the review/issue-471-kubernetes-importers branch 2 times, most recently from a60a627 to 12a8ba1 Compare April 21, 2026 19:34
@Bornunique911
Contributor Author

The third upstream PR in the stacked #471 review series is #865.

@Bornunique911 Bornunique911 force-pushed the review/issue-471-kubernetes-importers branch 2 times, most recently from 92b2655 to d593747 Compare April 30, 2026 19:09
@Bornunique911 Bornunique911 force-pushed the review/issue-471-kubernetes-importers branch from bd8c5c5 to 241db90 Compare May 7, 2026 09:35
@Bornunique911 Bornunique911 force-pushed the review/issue-471-kubernetes-importers branch from 241db90 to f56d892 Compare June 1, 2026 15:17
@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

Warning

Review limit reached

@Bornunique911, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 21 minutes and 28 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 64b65621-fb26-4186-a34f-a1d5ce6fec4d

📥 Commits

Reviewing files that changed from the base of the PR and between 9f6334d and 5745184.

📒 Files selected for processing (1)
  • application/tests/owasp_aisvs_parser_test.py

Walkthrough

This PR introduces six new OWASP standard parsers (AISVS, API Top 10 2023, Kubernetes Top 10 2022/2025, LLM Top 10 2025, Top 10 2025) with supporting JSON data files and unit tests, refactors the cheatsheets parser to use official OWASP URLs, and extends the CLI with import flags for these standards.

Changes

OWASP Standard Parsers

| Layer / File(s) | Summary |
| --- | --- |
| OWASP AISVS Parser with data and tests: application/utils/external_project_parsers/data/owasp_aisvs_1_0.json, application/utils/external_project_parsers/parsers/owasp_aisvs.py, application/tests/owasp_aisvs_parser_test.py | JSON dataset maps 14 AISVS sections to titles, hyperlinks, and CRE IDs. Parser loads entries, constructs Standard objects, and links CREs from cache. Test validates parser output with seeded CREs. |
| OWASP API Top 10 2023 Parser with data and tests: application/utils/external_project_parsers/data/owasp_api_top10_2023.json, application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py, application/tests/owasp_api_top10_2023_parser_test.py | JSON dataset defines 10 API security sections. Parser transforms entries into Standard objects and links CREs. Test asserts 10 entries with section metadata and first/last link mappings. |
| OWASP Kubernetes Top 10 2022 Parser with data and tests: application/utils/external_project_parsers/data/owasp_kubernetes_top10_2022.json, application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2022.py, application/tests/owasp_kubernetes_top10_2022_parser_test.py | JSON dataset maps 10 Kubernetes sections K01–K10. Parser constructs Standards and resolves CRE links. Test validates entry count and specific section/link metadata. |
| OWASP LLM Top 10 2025 Parser with data and tests: application/utils/external_project_parsers/data/owasp_llm_top10_2025.json, application/utils/external_project_parsers/parsers/owasp_llm_top10_2025.py, application/tests/owasp_llm_top10_2025_parser_test.py | JSON dataset contains 10 LLM security entries. Parser loads entries and attaches CRE links when found in cache. Test validates 10 entries with section ordering and link document IDs for selected positions. |
| OWASP Top 10 2025 Parser with data and tests: application/utils/external_project_parsers/data/owasp_top10_2025.json, application/utils/external_project_parsers/parsers/owasp_top10_2025.py, application/tests/owasp_top10_2025_parser_test.py | JSON dataset maps 10 Web security sections. Parser constructs Standard objects and creates LinkedTo links by resolving CRE IDs from cache. Test asserts 10 results with section metadata and link mappings for first/last entries. |
| OWASP Kubernetes Top 10 2025 Parser with fallback logic: application/utils/external_project_parsers/data/owasp_kubernetes_top10_2025.json, application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2025.py, application/tests/owasp_kubernetes_top10_2025_parser_test.py | Primary and fallback JSON datasets support versioning. Parser loads 2025 entries and 2022 fallback mapping, attempts to link CREs from primary sections, and falls back to alternative sections when primary links are absent. Tests validate normal parsing and fallback behavior when 2025 links are missing. |
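
The primary-then-fallback linking summarized for the Kubernetes Top 10 2025 parser could look roughly like the sketch below. This is not the PR's code: the `fallback_section_id` key, the `resolve_cre_ids` and `link_with_fallback` helpers, the made-up CRE IDs, and the plain dict standing in for the CRE cache are all assumptions for illustration.

```python
from typing import Dict, List

# Stand-in for the CRE cache: maps a CRE external ID to a CRE name.
# The real parser queries the project's cache instead (assumption).
KNOWN_CRES: Dict[str, str] = {"111-111": "Example CRE A", "222-222": "Example CRE B"}


def resolve_cre_ids(cre_ids: List[str], known: Dict[str, str]) -> List[str]:
    """Return only the CRE IDs that exist in the cache, preserving order."""
    return [cre_id for cre_id in cre_ids if cre_id in known]


def link_with_fallback(
    entry: dict, fallback_by_section: Dict[str, List[str]], known: Dict[str, str]
) -> List[str]:
    """Prefer the entry's own CRE IDs; use the fallback mapping if none resolve."""
    primary = resolve_cre_ids(entry.get("cre_ids", []), known)
    if primary:
        return primary
    # Hypothetical key naming the older section this entry falls back to.
    fallback_ids = fallback_by_section.get(entry.get("fallback_section_id", ""), [])
    return resolve_cre_ids(fallback_ids, known)


# Hypothetical 2022 fallback mapping: section ID -> CRE IDs.
fallback_2022 = {"K01": ["222-222"]}
with_primary = {"section_id": "K01", "cre_ids": ["111-111"], "fallback_section_id": "K01"}
without_primary = {"section_id": "K02", "cre_ids": ["999-999"], "fallback_section_id": "K01"}

print(link_with_fallback(with_primary, fallback_2022, KNOWN_CRES))     # ['111-111']
print(link_with_fallback(without_primary, fallback_2022, KNOWN_CRES))  # ['222-222']
```

The second call shows the fallback path: the only primary ID is unknown to the cache, so the IDs mapped from the older section are used instead.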

Cheatsheets Parser Refactor

| Layer / File(s) | Summary |
| --- | --- |
| Cheatsheets parser URL generation refactor: application/utils/external_project_parsers/parsers/cheatsheets_parser.py | Adds cheatsheetseries_base_url constant and official_cheatsheet_url() helper that derives official OWASP HTML URLs from markdown filenames. Updates register_cheatsheets to use this derivation instead of GitHub tree URLs. |
| Cheatsheets parser tests with new filename and assertions: application/tests/cheatsheets_parser_test.py | Updates test fixture to use Secrets_Management_Cheat_Sheet.md filename. Adjusts expected hyperlink, switches link type to AutomaticallyLinkedTo, modifies tags, and changes assertion to assertEqual for exact dictionary equality. |
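
A minimal sketch of the filename-to-URL derivation described above. Only the names `cheatsheetseries_base_url` and `official_cheatsheet_url()` come from the summary; the constant's value and the function signature are assumptions, not the PR's implementation.

```python
from pathlib import PurePosixPath

# Assumed value; the summary names the constant but does not show it.
cheatsheetseries_base_url = "https://cheatsheetseries.owasp.org/cheatsheets"


def official_cheatsheet_url(markdown_filename: str) -> str:
    """Map e.g. 'Foo_Cheat_Sheet.md' to the URL of the rendered '.html' page."""
    stem = PurePosixPath(markdown_filename).stem  # drops any directory and '.md'
    return f"{cheatsheetseries_base_url}/{stem}.html"


print(official_cheatsheet_url("Secrets_Management_Cheat_Sheet.md"))
# https://cheatsheetseries.owasp.org/cheatsheets/Secrets_Management_Cheat_Sheet.html
```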

CLI Integration

| Layer / File(s) | Summary |
| --- | --- |
| CLI argument flags for OWASP standards import: cre.py | Adds six boolean argparse flags (--owasp_top10_2025_in, --owasp_api_top10_2023_in, --owasp_kubernetes_top10_2022_in, --owasp_kubernetes_top10_2025_in, --owasp_llm_top10_2025_in, --owasp_aisvs_in) to enable importing each new OWASP standard collection. |
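
For illustration, the six flags could be registered along these lines. The flag names are the ones listed above; the help strings, the `OWASP_IMPORT_FLAGS` table, and the `build_parser` helper are assumptions, and the real `cre.py` defines many other arguments.

```python
import argparse

# Flag names come from the review summary; help strings are illustrative.
OWASP_IMPORT_FLAGS = {
    "--owasp_top10_2025_in": "import OWASP Top 10 2025",
    "--owasp_api_top10_2023_in": "import OWASP API Top 10 2023",
    "--owasp_kubernetes_top10_2022_in": "import OWASP Kubernetes Top 10 2022",
    "--owasp_kubernetes_top10_2025_in": "import OWASP Kubernetes Top 10 2025 draft",
    "--owasp_llm_top10_2025_in": "import OWASP LLM Top 10 2025",
    "--owasp_aisvs_in": "import OWASP AISVS",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser holding only the new boolean import flags."""
    parser = argparse.ArgumentParser(description="illustrative subset of the importer CLI")
    for flag, help_text in OWASP_IMPORT_FLAGS.items():
        # store_true: the attribute is False unless the flag is passed.
        parser.add_argument(flag, action="store_true", help=help_text)
    return parser


args = build_parser().parse_args(["--owasp_kubernetes_top10_2022_in"])
print(args.owasp_kubernetes_top10_2022_in)  # True
print(args.owasp_aisvs_in)                  # False
```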

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change: adding OWASP Kubernetes importers/parsers for 2022 and 2025 versions. |
| Description check | ✅ Passed | The description is directly related to the changeset, clearly explaining the addition of Kubernetes-related OWASP resource importers with specific versions and rationale for the split. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
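
One rough way to approximate a docstring coverage figure locally is to count the functions and classes that carry a docstring. This is a sketch using the standard `ast` module; the `docstring_coverage` helper is hypothetical, and the check used by the bot may count nodes differently.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Return the percentage of functions and classes in `source` with a docstring."""
    tree = ast.parse(source)
    documentable = (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    nodes = [node for node in ast.walk(tree) if isinstance(node, documentable)]
    if not nodes:
        return 100.0  # nothing to document
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)


sample = '''
def parse(cache):
    return []

def load(path):
    """Load entries from path."""
    return []
'''

print(docstring_coverage(sample))  # 50.0
```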

@Bornunique911
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 1, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Bornunique911
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 2, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (3)
cre.py (1)

183-183: ⚡ Quick win

Align help text to use "Top 10" instead of "Top Ten".

The help text for the Kubernetes flags uses "Top Ten" (lines 183, 188), while other OWASP flags consistently use "Top 10" (line 173). The parser names in the codebase use "Top 10" (numeric), so the help text should match.

📝 Suggested help text corrections
     parser.add_argument(
         "--owasp_kubernetes_top10_2022_in",
         action="store_true",
-        help="import OWASP Kubernetes Top Ten 2022",
+        help="import OWASP Kubernetes Top 10 2022",
     )
     parser.add_argument(
         "--owasp_kubernetes_top10_2025_in",
         action="store_true",
-        help="import OWASP Kubernetes Top Ten 2025 draft",
+        help="import OWASP Kubernetes Top 10 2025 draft",
     )

Also applies to: 188-188

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cre.py` at line 183, Update the help text for the Kubernetes-related flags in
cre.py to use the numeric "Top 10" instead of "Top Ten" so it matches the rest
of the OWASP help strings and parser names; locate the two occurrences where
help="import OWASP Kubernetes Top Ten 2022" (and the similar entry at the other
Kubernetes flag) and change them to help="import OWASP Kubernetes Top 10 2022".
application/utils/external_project_parsers/parsers/owasp_aisvs.py (1)

13-45: ⚖️ Poor tradeoff

Heavy duplication across the six new OWASP parsers.

This parse implementation is essentially identical to owasp_llm_top10_2025.py (and per the stack, the API/K8s/Top10-2025 parsers). Only name and data_file differ. Consider extracting a shared base parser that takes name + data_file and implements the common load/link/ParseResult logic, leaving subclasses to set just the two class attributes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@application/utils/external_project_parsers/parsers/owasp_aisvs.py` around
lines 13 - 45, The parse method in OwaspAisvs duplicates logic used by several
OWASP parsers; extract the common load/link/ParseResult behavior into a shared
base class (e.g., OwaspBaseParser) that implements parse(cache, ph) using
self.name and self.data_file, then have OwaspAisvs (and other OWASP parser
classes) inherit from that base and only set the class attributes name and
data_file; update references to defs.Standard, defs.Link, cache.get_CREs, and
ParseResult inside the base so subclasses remain minimal.
application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py (1)

19-47: ⚖️ Poor tradeoff

Heavy duplication across the five new parsers.

This parse body is essentially identical to owasp_kubernetes_top10_2022.py, owasp_top10_2025.py (and the AISVS/LLM parsers per the stack). Only name and data_file differ. Consider a shared base that takes name/data_file as class attributes and implements parse once, so the per-standard classes shrink to two lines. This reduces drift risk as the JSON→Standard mapping evolves.

♻️ Sketch of a shared base parser
class _JsonStandardParser(ParserInterface):
    name: str
    data_file: Path

    def parse(self, cache: db.Node_collection, ph: prompt_client.PromptHandler):
        with self.data_file.open("r", encoding="utf-8") as handle:
            raw_entries = json.load(handle)
        entries = []
        for entry in raw_entries:
            standard = defs.Standard(
                name=self.name,
                sectionID=entry["section_id"],
                section=entry["section"],
                hyperlink=entry["hyperlink"],
            )
            for cre_id in entry.get("cre_ids", []):
                cres = cache.get_CREs(external_id=cre_id)
                if not cres:
                    continue
                standard.add_link(
                    defs.Link(ltype=defs.LinkTypes.LinkedTo, document=cres[0].shallow_copy())
                )
            entries.append(standard)
        return ParseResult(
            results={self.name: entries},
            calculate_gap_analysis=False,
            calculate_embeddings=False,
        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py`
around lines 19 - 47, Extract the duplicated parse logic into a shared base
class (e.g. _JsonStandardParser) that implements parse (using defs.Standard,
cache.get_CREs, defs.Link, defs.LinkTypes.LinkedTo and returning ParseResult)
and reads self.data_file and self.name; then have the specific parsers like the
class in owasp_api_top10_2023.py inherit that base and only set class attributes
name and data_file, removing the duplicated parse method from each per-standard
file so the JSON→Standard mapping lives in one place.
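
To show how small each per-standard class becomes under this suggestion, here is a self-contained variant that runs without the repository. The `Standard` dataclass, the dict-based cache, `JsonStandardParser`, and `ExampleStandard` are simplified stand-ins invented for this example, not the project's `defs.Standard`, cache, or parser interface.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class Standard:
    """Simplified stand-in for the project's Standard document type."""

    name: str
    sectionID: str
    section: str
    hyperlink: str
    links: List[str] = field(default_factory=list)


class JsonStandardParser:
    """Shared loader; subclasses only set `name` and `data_file`."""

    name: str
    data_file: Path

    def parse(self, known_cres: Dict[str, str]) -> Dict[str, List[Standard]]:
        with self.data_file.open("r", encoding="utf-8") as handle:
            raw_entries = json.load(handle)
        entries = []
        for entry in raw_entries:
            standard = Standard(
                name=self.name,
                sectionID=entry["section_id"],
                section=entry["section"],
                hyperlink=entry["hyperlink"],
            )
            # Keep only the CRE IDs the cache knows about.
            standard.links = [c for c in entry.get("cre_ids", []) if c in known_cres]
            entries.append(standard)
        return {self.name: entries}


# Write a one-entry data file so the example does not need bundled data.
data_path = Path(tempfile.mkdtemp()) / "example_standard.json"
data_path.write_text(
    json.dumps(
        [
            {
                "section_id": "K01",
                "section": "Example section",
                "hyperlink": "https://example.org/K01",
                "cre_ids": ["111-111", "999-999"],
            }
        ]
    ),
    encoding="utf-8",
)


class ExampleStandard(JsonStandardParser):
    name = "Example Standard"
    data_file = data_path


result = ExampleStandard().parse({"111-111": "Example CRE"})
print(result["Example Standard"][0].links)  # ['111-111']
```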
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@application/tests/owasp_aisvs_parser_test.py`:
- Around line 51-62: The test uses an ambiguous single-letter loop variable `l`
in list comprehensions for entries' links which triggers Ruff E741; update both
occurrences to a descriptive name (e.g., `link`) so the assertions read
[link.document.id for link in entries[0].links] and [link.document.id for link
in entries[-1].links], and search/replace any other test references to `l` in
this file to keep naming consistent.

In `@application/tests/owasp_api_top10_2023_parser_test.py`:
- Around line 39-43: Rename the ambiguous loop variable `l` used in the list
comprehensions inside the test assertions to a clearer name (e.g., `link`) to
satisfy Ruff E741; update both occurrences in the assertions that compute
document ids from entries[0].links and entries[-1].links (the comprehensions
currently written as [l.document.id for l in entries[0].links] and
[l.document.id for l in entries[-1].links]) so they read [link.document.id for
link in entries[0].links] and [link.document.id for link in entries[-1].links].

In `@application/tests/owasp_kubernetes_top10_2022_parser_test.py`:
- Around line 41-45: Rename the ambiguous loop variable `l` used in list
comprehensions inside the assertions to `link` to satisfy Ruff E741;
specifically update the expressions [l.document.id for l in entries[0].links]
and [l.document.id for l in entries[-1].links] in
owasp_kubernetes_top10_2022_parser_test.py to use [link.document.id for link in
entries[0].links] and [link.document.id for link in entries[-1].links]
respectively so the test method (the assertions referencing entries[0].links and
entries[-1].links) no longer uses the single-letter variable `l`.

In `@application/tests/owasp_kubernetes_top10_2025_parser_test.py`:
- Around line 45-52: Rename the ambiguous loop variable `l` to `link` in the
list comprehensions used in the assertions inside the test function (the
expressions producing `[l.document.id for l in entries[0].links]` and
`[l.document.id for l in entries[-1].links]`); update both occurrences to
`[link.document.id for link in entries[0].links]` and `[link.document.id for
link in entries[-1].links]` so the variable is clear and consistent with the
usage at Line 102.

In `@application/tests/owasp_llm_top10_2025_parser_test.py`:
- Around line 40-45: The tests use an ambiguous loop variable `l` in list
comprehensions causing Ruff E741; rename `l` to a clearer identifier like `link`
in the three assertions that build lists from entries[...] .links (i.e., replace
`[l.document.id for l in entries[0].links]`, `[l.document.id for l in
entries[4].links]`, and `[l.document.id for l in entries[-1].links]` with
`[link.document.id for link in entries[...].links]`) so the variable is
unambiguous while preserving the expected values and assertions for
`document.id` and `sectionID`.

In `@application/utils/external_project_parsers/data/owasp_aisvs_1_0.json`:
- Line 5: The "hyperlink" JSON entries currently use GitHub "/tree/main/" paths;
update each "hyperlink" value that contains "/tree/main/" (e.g., the string
"https://github.com/OWASP/AISVS/tree/main/1.0/en/0x10-C01-Training-Data-Governance.md")
to use "/blob/main/" instead (replace "/tree/main/" with "/blob/main/") for all
14 entries so the URLs point to the canonical file URLs.
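
The E741 rename requested for the test files amounts to the change sketched below. The `Document` and `Link` dataclasses and the IDs are minimal stand-ins invented for this example, not the project's types or data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    id: str


@dataclass
class Link:
    document: Document


links: List[Link] = [Link(Document("111-111")), Link(Document("222-222"))]

# Flagged by Ruff E741: `l` is easily confused with `1` or `I`.
# ids = [l.document.id for l in links]

# Preferred: same result, descriptive loop variable.
ids = [link.document.id for link in links]
print(ids)  # ['111-111', '222-222']
```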

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: e4804f97-f1a6-4b4e-a426-55447a012ba2

📥 Commits

Reviewing files that changed from the base of the PR and between e93ce92 and f56d892.

📒 Files selected for processing (21)
  • application/tests/cheatsheets_parser_test.py
  • application/tests/owasp_aisvs_parser_test.py
  • application/tests/owasp_api_top10_2023_parser_test.py
  • application/tests/owasp_kubernetes_top10_2022_parser_test.py
  • application/tests/owasp_kubernetes_top10_2025_parser_test.py
  • application/tests/owasp_llm_top10_2025_parser_test.py
  • application/tests/owasp_top10_2025_parser_test.py
  • application/utils/external_project_parsers/data/owasp_aisvs_1_0.json
  • application/utils/external_project_parsers/data/owasp_api_top10_2023.json
  • application/utils/external_project_parsers/data/owasp_kubernetes_top10_2022.json
  • application/utils/external_project_parsers/data/owasp_kubernetes_top10_2025.json
  • application/utils/external_project_parsers/data/owasp_llm_top10_2025.json
  • application/utils/external_project_parsers/data/owasp_top10_2025.json
  • application/utils/external_project_parsers/parsers/cheatsheets_parser.py
  • application/utils/external_project_parsers/parsers/owasp_aisvs.py
  • application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py
  • application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2022.py
  • application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2025.py
  • application/utils/external_project_parsers/parsers/owasp_llm_top10_2025.py
  • application/utils/external_project_parsers/parsers/owasp_top10_2025.py
  • cre.py

Comment thread application/tests/owasp_aisvs_parser_test.py Outdated
Comment thread application/tests/owasp_api_top10_2023_parser_test.py Outdated
Comment thread application/tests/owasp_kubernetes_top10_2022_parser_test.py Outdated
Comment thread application/tests/owasp_kubernetes_top10_2025_parser_test.py
Comment thread application/tests/owasp_llm_top10_2025_parser_test.py Outdated
Comment thread application/utils/external_project_parsers/data/owasp_aisvs_1_0.json Outdated