Add OWASP Kubernetes resource importers #863
Addition: With reference to issue #471 and PR #858, this PR adds further OWASP families: OWASP Kubernetes Top 10 2022 (https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/) and OWASP Kubernetes Top 10 2025 (Draft) (https://owasp.org/www-project-kubernetes-top-ten/2025/en/src/).
Screenshot of OWASP Kubernetes Top 10 2022:
Screenshot of OWASP Kubernetes Top 10 2025 (Draft):
@Bornunique911 attach the issue first

Requesting reviews and feedback on this PR from @northdpole, @Pa04rth, and @robvanderveer.
Walkthrough
This PR introduces six new OWASP standard parsers (AISVS, API Top 10 2023, Kubernetes Top 10 2022/2025, LLM Top 10 2025, Top 10 2025) with supporting JSON data files and unit tests, refactors the cheatsheets parser to use official OWASP URLs, and extends the CLI with import flags for these standards.
Changes
- OWASP Standard Parsers
- Cheatsheets Parser Refactor
- CLI Integration
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 6
🧹 Nitpick comments (3)
cre.py (1)
183-183: ⚡ Quick win: Align help text to use "Top 10" instead of "Top Ten".
The help text for the Kubernetes flags uses "Top Ten" (lines 183, 188), while other OWASP flags consistently use "Top 10" (line 173). The parser names in the codebase use "Top 10" (numeric), so the help text should match.
📝 Suggested help text corrections
     parser.add_argument(
         "--owasp_kubernetes_top10_2022_in",
         action="store_true",
    -    help="import OWASP Kubernetes Top Ten 2022",
    +    help="import OWASP Kubernetes Top 10 2022",
     )
     parser.add_argument(
         "--owasp_kubernetes_top10_2025_in",
         action="store_true",
    -    help="import OWASP Kubernetes Top Ten 2025 draft",
    +    help="import OWASP Kubernetes Top 10 2025 draft",
     )

Also applies to: 188-188
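A minimal, self-contained sketch of how the corrected help strings could be checked. It reproduces only the two Kubernetes flags from the suggestion; `build_parser` and `help_uses_numeric_top10` are hypothetical helper names, not functions from `cre.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Only the two Kubernetes flags from the suggestion are reproduced here;
    # the real cre.py defines many more arguments.
    parser = argparse.ArgumentParser(prog="cre.py")
    parser.add_argument(
        "--owasp_kubernetes_top10_2022_in",
        action="store_true",
        help="import OWASP Kubernetes Top 10 2022",
    )
    parser.add_argument(
        "--owasp_kubernetes_top10_2025_in",
        action="store_true",
        help="import OWASP Kubernetes Top 10 2025 draft",
    )
    return parser


def help_uses_numeric_top10(parser: argparse.ArgumentParser) -> bool:
    # Collapse whitespace so line wrapping in the help output cannot
    # split the phrase being searched for.
    help_text = " ".join(parser.format_help().split())
    return "Top Ten" not in help_text and "Top 10" in help_text
```

A check like this could sit in a unit test so the wording does not drift again when new flags are added.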
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cre.py` at line 183, Update the help text for the Kubernetes-related flags in cre.py to use the numeric "Top 10" instead of "Top Ten" so it matches the rest of the OWASP help strings and parser names; locate the two occurrences where help="import OWASP Kubernetes Top Ten 2022" (and the similar entry at the other Kubernetes flag) and change "Top Ten" to "Top 10" in each.
application/utils/external_project_parsers/parsers/owasp_aisvs.py (1)
13-45: ⚖️ Poor tradeoff: Heavy duplication across the six new OWASP parsers.
This `parse` implementation is essentially identical to `owasp_llm_top10_2025.py` (and, per the stack, the API/K8s/Top10-2025 parsers). Only `name` and `data_file` differ. Consider extracting a shared base parser that takes `name` + `data_file` and implements the common load/link/ParseResult logic, leaving subclasses to set just the two class attributes.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@application/utils/external_project_parsers/parsers/owasp_aisvs.py` around lines 13 - 45, The parse method in OwaspAisvs duplicates logic used by several OWASP parsers; extract the common load/link/ParseResult behavior into a shared base class (e.g., OwaspBaseParser) that implements parse(cache, ph) using self.name and self.data_file, then have OwaspAisvs (and other OWASP parser classes) inherit from that base and only set the class attributes name and data_file; update references to defs.Standard, defs.Link, cache.get_CREs, and ParseResult inside the base so subclasses remain minimal.
application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py (1)
19-47: ⚖️ Poor tradeoff: Heavy duplication across the five new parsers.
This `parse` body is essentially identical to `owasp_kubernetes_top10_2022.py`, `owasp_top10_2025.py` (and the AISVS/LLM parsers per the stack). Only `name` and `data_file` differ. Consider a shared base that takes `name`/`data_file` as class attributes and implements `parse` once, so the per-standard classes shrink to two lines. This reduces drift risk as the JSON→Standard mapping evolves.
♻️ Sketch of a shared base parser
    import json
    from pathlib import Path

    # Project imports (ParserInterface, ParseResult, db, defs, prompt_client)
    # are omitted in this sketch; take them from the existing parser modules.


    class _JsonStandardParser(ParserInterface):
        name: str
        data_file: Path

        def parse(self, cache: db.Node_collection, ph: prompt_client.PromptHandler):
            with self.data_file.open("r", encoding="utf-8") as handle:
                raw_entries = json.load(handle)

            entries = []
            for entry in raw_entries:
                standard = defs.Standard(
                    name=self.name,
                    sectionID=entry["section_id"],
                    section=entry["section"],
                    hyperlink=entry["hyperlink"],
                )
                for cre_id in entry.get("cre_ids", []):
                    cres = cache.get_CREs(external_id=cre_id)
                    if not cres:
                        # Skip CRE ids that are not present in the cache.
                        continue
                    standard.add_link(
                        defs.Link(
                            ltype=defs.LinkTypes.LinkedTo,
                            document=cres[0].shallow_copy(),
                        )
                    )
                entries.append(standard)

            return ParseResult(
                results={self.name: entries},
                calculate_gap_analysis=False,
                calculate_embeddings=False,
            )
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py` around lines 19 - 47, Extract the duplicated parse logic into a shared base class (e.g. _JsonStandardParser) that implements parse (using defs.Standard, cache.get_CREs, defs.Link, defs.LinkTypes.LinkedTo and returning ParseResult) and reads self.data_file and self.name; then have the specific parsers like the class in owasp_api_top10_2023.py inherit that base and only set class attributes name and data_file, removing the duplicated parse method from each per-standard file so the JSON→Standard mapping lives in one place.
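To make the shape of that refactor concrete, here is a self-contained sketch that runs without the project. Everything in it is a stand-in and an assumption: `Standard` is a simplified dataclass rather than `defs.Standard`, `known_cre_ids` replaces the `cache.get_CREs` lookup, and the demo entries, CRE ids, and `example.org` links are made up for illustration.

```python
import json
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Set


@dataclass
class Standard:
    # Simplified stand-in for defs.Standard (hypothetical, not the project class).
    name: str
    section_id: str
    section: str
    hyperlink: str
    linked_cre_ids: List[str] = field(default_factory=list)


class JsonStandardParser:
    # Subclasses set only these two class attributes.
    name: str
    data_file: Path

    def parse(self, known_cre_ids: Set[str]) -> Dict[str, List[Standard]]:
        with self.data_file.open("r", encoding="utf-8") as handle:
            raw_entries = json.load(handle)

        entries = []
        for entry in raw_entries:
            standard = Standard(
                name=self.name,
                section_id=entry["section_id"],
                section=entry["section"],
                hyperlink=entry["hyperlink"],
            )
            for cre_id in entry.get("cre_ids", []):
                # Stand-in for the cache lookup: unknown CRE ids are skipped.
                if cre_id in known_cre_ids:
                    standard.linked_cre_ids.append(cre_id)
            entries.append(standard)
        return {self.name: entries}


DEMO_FILE = Path(tempfile.mkdtemp()) / "demo_kubernetes_top10.json"


class DemoKubernetesParser(JsonStandardParser):
    # The whole per-standard class: a name and a data file.
    name = "Demo Kubernetes Top 10"
    data_file = DEMO_FILE


def run_demo() -> Dict[str, List[Standard]]:
    # Made-up demo data; the real JSON files live under external_project_parsers/data.
    demo_entries = [
        {"section_id": "K01", "section": "Demo section one",
         "hyperlink": "https://example.org/K01", "cre_ids": ["111-111"]},
        {"section_id": "K02", "section": "Demo section two",
         "hyperlink": "https://example.org/K02", "cre_ids": ["999-999"]},
    ]
    DEMO_FILE.write_text(json.dumps(demo_entries), encoding="utf-8")
    return DemoKubernetesParser().parse(known_cre_ids={"111-111"})
```

With this layout, adding another standard means adding one JSON file and one two-attribute subclass, so the JSON-to-Standard mapping is changed in a single place.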
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@application/tests/owasp_aisvs_parser_test.py`:
- Around line 51-62: The test uses an ambiguous single-letter loop variable `l`
in list comprehensions for entries' links which triggers Ruff E741; update both
occurrences to a descriptive name (e.g., `link`) so the assertions read
[link.document.id for link in entries[0].links] and [link.document.id for link
in entries[-1].links], and search/replace any other test references to `l` in
this file to keep naming consistent.
In `@application/tests/owasp_api_top10_2023_parser_test.py`:
- Around line 39-43: Rename the ambiguous loop variable `l` used in the list
comprehensions inside the test assertions to a clearer name (e.g., `link`) to
satisfy Ruff E741; update both occurrences in the assertions that compute
document ids from entries[0].links and entries[-1].links (the comprehensions
currently written as [l.document.id for l in entries[0].links] and
[l.document.id for l in entries[-1].links]) so they read [link.document.id for
link in entries[0].links] and [link.document.id for link in entries[-1].links].
In `@application/tests/owasp_kubernetes_top10_2022_parser_test.py`:
- Around line 41-45: Rename the ambiguous loop variable `l` used in list
comprehensions inside the assertions to `link` to satisfy Ruff E741;
specifically update the expressions [l.document.id for l in entries[0].links]
and [l.document.id for l in entries[-1].links] in
owasp_kubernetes_top10_2022_parser_test.py to use [link.document.id for link in
entries[0].links] and [link.document.id for link in entries[-1].links]
respectively so the test method (the assertions referencing entries[0].links and
entries[-1].links) no longer uses the single-letter variable `l`.
In `@application/tests/owasp_kubernetes_top10_2025_parser_test.py`:
- Around line 45-52: Rename the ambiguous loop variable `l` to `link` in the
list comprehensions used in the assertions inside the test function (the
expressions producing `[l.document.id for l in entries[0].links]` and
`[l.document.id for l in entries[-1].links]`); update both occurrences to
`[link.document.id for link in entries[0].links]` and `[link.document.id for
link in entries[-1].links]` so the variable is clear and consistent with the
usage at Line 102.
In `@application/tests/owasp_llm_top10_2025_parser_test.py`:
- Around line 40-45: The tests use an ambiguous loop variable `l` in list
comprehensions causing Ruff E741; rename `l` to a clearer identifier like `link`
in the three assertions that build lists from entries[...] .links (i.e., replace
`[l.document.id for l in entries[0].links]`, `[l.document.id for l in
entries[4].links]`, and `[l.document.id for l in entries[-1].links]` with
`[link.document.id for link in entries[...].links]`) so the variable is
unambiguous while preserving the expected values and assertions for
`document.id` and `sectionID`.
In `@application/utils/external_project_parsers/data/owasp_aisvs_1_0.json`:
- Line 5: The "hyperlink" JSON entries currently use GitHub "/tree/main/" paths;
update each "hyperlink" value that contains "/tree/main/" (e.g., the string
"https://github.com/OWASP/AISVS/tree/main/1.0/en/0x10-C01-Training-Data-Governance.md")
to use "/blob/main/" instead (replace "/tree/main/" with "/blob/main/") for all
14 entries so the URLs point to the canonical file URLs.
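For the last inline comment, the `/tree/main/` to `/blob/main/` change could be applied mechanically instead of by hand. This is a sketch under the assumption that the data file is a JSON list of objects with a `hyperlink` key (as the shared-parser sketch implies); `use_blob_urls` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def use_blob_urls(entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the entries with GitHub tree URLs rewritten to blob URLs."""
    fixed = []
    for entry in entries:
        updated = dict(entry)  # shallow copy so the input list is left untouched
        link = updated.get("hyperlink")
        if isinstance(link, str):
            # Replace only the first occurrence, which is the branch path segment.
            updated["hyperlink"] = link.replace("/tree/main/", "/blob/main/", 1)
        fixed.append(updated)
    return fixed


sample = [
    {
        "hyperlink": "https://github.com/OWASP/AISVS/tree/main/1.0/en/"
        "0x10-C01-Training-Data-Governance.md"
    }
]
fixed_sample = use_blob_urls(sample)
```

Loading the file with `json.load`, passing the list through this function, and writing it back with `json.dump` would cover all 14 entries in one pass.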
📒 Files selected for processing (21)
- application/tests/cheatsheets_parser_test.py
- application/tests/owasp_aisvs_parser_test.py
- application/tests/owasp_api_top10_2023_parser_test.py
- application/tests/owasp_kubernetes_top10_2022_parser_test.py
- application/tests/owasp_kubernetes_top10_2025_parser_test.py
- application/tests/owasp_llm_top10_2025_parser_test.py
- application/tests/owasp_top10_2025_parser_test.py
- application/utils/external_project_parsers/data/owasp_aisvs_1_0.json
- application/utils/external_project_parsers/data/owasp_api_top10_2023.json
- application/utils/external_project_parsers/data/owasp_kubernetes_top10_2022.json
- application/utils/external_project_parsers/data/owasp_kubernetes_top10_2025.json
- application/utils/external_project_parsers/data/owasp_llm_top10_2025.json
- application/utils/external_project_parsers/data/owasp_top10_2025.json
- application/utils/external_project_parsers/parsers/cheatsheets_parser.py
- application/utils/external_project_parsers/parsers/owasp_aisvs.py
- application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py
- application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2022.py
- application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2025.py
- application/utils/external_project_parsers/parsers/owasp_llm_top10_2025.py
- application/utils/external_project_parsers/parsers/owasp_top10_2025.py
- cre.py


Summary
This PR adds Kubernetes-related OWASP resource importer support for issue #471.
Adds importer/data/test support for:
- OWASP Kubernetes Top 10 2022
- OWASP Kubernetes Top 10 2025 (Draft)
This is the second upstream PR in the stacked #471 review series.
What changed
Validation
Why this is split out
The full #471 work is too large to review effectively as one PR.
This PR isolates one OWASP resource family so the parser/data model can be reviewed independently before the later Kubernetes, cheat sheet, backend analysis, and frontend changes.