Add refresh scripts for OWASP resources for issue 471 #877

Open
Bornunique911 wants to merge 10 commits into OWASP:main from Bornunique911:review/issue-471-refresh-scripts

Conversation

@Bornunique911
Contributor

@Bornunique911 Bornunique911 commented Apr 11, 2026

Issue Reference

#471

Summary

This PR addresses part of issue #471 by adding refresh and update workflow support for the OWASP resources introduced by earlier importer PRs.

This is the fourth upstream PR in the stacked #471 review series.

Problem Fixed

Newly added OWASP resources lacked refresh/update workflow support.

Solution

Added refresh scripts and helper flows for OWASP resource maintenance.

Tests

Validated against the parser/importer coverage introduced by the earlier PRs.

Why this is split out

The full #471 work is too large to review effectively as one PR.

This PR isolates one OWASP resource family so the parser/data model can be reviewed independently before the later Kubernetes, cheat sheet, backend analysis, and frontend changes.

@Bornunique911
Contributor Author

Requesting reviews and feedback on this feature from @northdpole, @Pa04rth, and @robvanderveer.

Bornunique911 force-pushed the review/issue-471-refresh-scripts branch 11 times, most recently from e4dd489 to 0d4d9c0 (April 21, 2026 19:24)
@Bornunique911
Contributor Author

Bornunique911 force-pushed the review/issue-471-refresh-scripts branch 2 times, most recently from 03d62bd to 8a19464 (April 30, 2026 19:33)
@Bornunique911
Contributor Author

The fifth upstream PR in the stacked #471 review series is #900.

Bornunique911 force-pushed the review/issue-471-refresh-scripts branch from 571b932 to 32d02f7 (May 7, 2026 09:39)
Bornunique911 force-pushed the review/issue-471-refresh-scripts branch from 32d02f7 to e0e8b08 (June 2, 2026 17:36)
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 5509a26a-2f7f-44e4-a92f-e79c5942338c

📥 Commits

Reviewing files that changed from the base of the PR and between f2f85f4 and 71a51b1.

📒 Files selected for processing (2)
  • application/utils/external_project_parsers/parsers/cheatsheets_parser.py
  • scripts/update-cheatsheets.sh
🚧 Files skipped from review as they are similar to previous changes (2)
  • scripts/update-cheatsheets.sh
  • application/utils/external_project_parsers/parsers/cheatsheets_parser.py

Summary by CodeRabbit

  • New Features

    • Added support for OWASP Top 10 2025, API Top 10 2023, Kubernetes Top Ten (2022 + 2025 draft with fallback), LLM Top 10 2025, and AISVS mappings; cheatsheets now include supplemental entries, official OWASP HTML links, and deduplication.
  • Tests

    • Added and updated unit tests covering parsing, link mapping, fallback behavior, cheatsheet hyperlink normalization, and CRE link resolution across all new OWASP datasets.
  • Chores

    • New CLI options and scripts to import, refresh, back up, and normalize OWASP datasets in the local cache.

Walkthrough

This PR adds six OWASP parsers with bundled JSON mappings, extends the cheatsheets parser to ingest supplemental JSON and generate official OWASP HTML links, adds parser tests, CLI flags and cre_main registration, and provides scripts to update and normalize the standards cache.

Changes

OWASP Standards Parser Integration

| Layer / File(s) | Summary |
|---|---|
| JSON Standard Mappings (application/utils/external_project_parsers/data/owasp_*.json) | Seven JSON files define AISVS 1.0 (14 sections), API Security Top 10 2023 (10 entries), supplemental cheatsheets, Kubernetes Top Ten 2022 (10 entries), Kubernetes Top Ten 2025 with fallback IDs (10 entries), LLM Top 10 2025 (10 entries), and Web Top 10 2025 (10 entries). |
| Six Parser Implementations (application/utils/external_project_parsers/parsers/owasp_*.py) | OwaspAisvs, OwaspApiTop10_2023, OwaspKubernetesTop10_2022, OwaspKubernetesTop10_2025 (with 2022 fallback), OwaspLlmTop10_2025, and OwaspTop10_2025 each load JSON, construct defs.Standard entries, resolve CREs from the cache by external ID, and attach links. All return ParseResult with gap analysis and embeddings disabled. |
| Cheatsheets Parser Enhancement (application/utils/external_project_parsers/parsers/cheatsheets_parser.py) | Adds official OWASP cheatsheet base URL, supplemental JSON ingestion, official_cheatsheet_url() for HTML links, register_supplemental_cheatsheets() to build and auto-link standards from JSON, and deduplicate_entries() to remove duplicates. parse() now tolerates clone failures and appends supplemental entries. |
| CLI Argument Integration (cre.py) | Adds boolean flags: --owasp_top10_2025_in, --owasp_api_top10_2023_in, --owasp_kubernetes_top10_2022_in, --owasp_kubernetes_top10_2025_in, --owasp_llm_top10_2025_in, --owasp_aisvs_in. |
| Parser Test Suites (application/tests/owasp_*_parser_test.py, application/tests/cheatsheets_parser_test.py) | New/unified tests create a test Flask app and DB, seed CREs into Node_collection, run each parser via PromptHandler, and assert exact entry counts plus specific section IDs, titles, hyperlinks, and linked CRE document IDs. Cheatsheets tests verify official URL mapping and supplemental registration. Kubernetes 2025 test covers fallback mapping. |
| Operational Cache Scripts (scripts/update-*.sh) | update-cheatsheets.sh normalizes cheatsheet links in the SQLite cache to official OWASP HTML URLs. update-owasp-top10-2025-mappings.sh refreshes upstream and reapplies 2025 mappings. update-owasp-top10-standards.sh orchestrates full refresh, selects Kubernetes version by presence of 2025 draft links, and prunes unmapped standard nodes. All scripts manage venv, deps, backups, and environment configuration. |
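
The link normalization and deduplication described in the walkthrough can be illustrated with a minimal sketch. The PR's real `official_cheatsheet_url()` and `deduplicate_entries()` are not shown in this thread, so the helper names `to_official_url` and `dedupe_links` are hypothetical, and both URL prefixes are assumptions about the GitHub and official cheat sheet sites rather than constants taken from the PR.

```python
import posixpath
from urllib.parse import urlparse

# Assumed prefixes; the constants actually used in the PR are not shown here.
GITHUB_PREFIX = "https://github.com/OWASP/CheatSheetSeries/"
OFFICIAL_BASE = "https://cheatsheetseries.owasp.org/cheatsheets/"


def to_official_url(link: str) -> str:
    """Map a GitHub markdown link to the assumed official HTML URL.

    Links that do not match the expected shape are returned unchanged, so an
    unexpected input never produces a malformed URL.
    """
    if not link.startswith(GITHUB_PREFIX):
        return link
    filename = posixpath.basename(urlparse(link).path)
    if not filename.endswith(".md"):
        return link
    return OFFICIAL_BASE + filename[: -len(".md")] + ".html"


def dedupe_links(links: list[str]) -> list[str]:
    """Drop repeated links while keeping first-seen order."""
    seen: set[str] = set()
    unique: list[str] = []
    for link in links:
        if link not in seen:
            seen.add(link)
            unique.append(link)
    return unique
```

Returning unmatched links untouched is a deliberate choice in this sketch: it keeps the rewrite idempotent, so running it twice over the same cache changes nothing on the second pass.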

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding refresh scripts for OWASP resources to address issue 471. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the problem (OWASP resources lacked refresh/update workflow support), the solution (adding refresh scripts), and the context (part of stacked PR series for issue #471). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@Bornunique911
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 3, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8

🧹 Nitpick comments (2)
application/utils/external_project_parsers/parsers/cheatsheets_parser.py (1)

116-130: 💤 Low value

Silent continue hides link failures.

The except Exception: continue swallows every error from add_link without any log, making misconfigured cre_ids invisible during refresh runs. Consider logging at debug/warning level so failures are diagnosable.

♻️ Suggested change
-                    except Exception:
-                        continue
+                    except Exception as exc:
+                        self.logger.debug(
+                            "Failed to link CRE %s to supplemental cheatsheet %s: %s",
+                            cre_id,
+                            entry.get("section"),
+                            exc,
+                        )
+                        continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@application/utils/external_project_parsers/parsers/cheatsheets_parser.py`
around lines 116 - 130, The loop silently swallows all errors from cs.add_link
which hides misconfigured cre_ids; change the bare "except Exception: continue"
to catch the exception as a variable and log a warning or debug message
including the cre_id (from entry.get("cre_ids")), the cre (from cres), and the
exception details before continuing so failures are visible; update the block
around cache.get_CREs / cs.add_link / defs.Link /
defs.LinkTypes.AutomaticallyLinkedTo to log (using the module logger) the
context and exception and then continue.
scripts/update-cheatsheets.sh (1)

22-55: ⚡ Quick win

Back up the cache DB before in-place link rewrites.

Unlike update-owasp-top10-2025-mappings.sh and update-owasp-top10-standards.sh, this script mutates node.link rows in place with no timestamped backup. A bad run (e.g., an unexpected link shape producing a malformed official URL) would corrupt links with no recovery path. Consider adding the same cp "$CACHE_FILE" "$BACKUP_FILE" guard the sibling scripts use.
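
A minimal Python sketch of the backup-then-mutate guard suggested above, for illustration only. The script under review is a shell script with an embedded Python block that is not shown in this thread; the `node` table and its `link` column come from the comment, while the `id` column, the function names, and the prefix arguments are assumptions.

```python
import shutil
import sqlite3
import time
from pathlib import Path


def backup_db(db_path: Path) -> Path:
    """Copy the cache DB to a timestamped .bak file and return the backup path."""
    if not db_path.is_file():
        # Fail before touching anything if the source DB is missing.
        raise FileNotFoundError(f"cache DB not found: {db_path}")
    stamp = time.strftime("%Y%m%d%H%M%S")
    backup = db_path.with_name(f"{db_path.name}.{stamp}.bak")
    shutil.copy2(db_path, backup)
    return backup


def rewrite_links(db_path: Path, old_prefix: str, new_prefix: str) -> Path:
    """Back up the DB, then rewrite matching node.link values in one transaction."""
    backup = backup_db(db_path)  # raises before any mutation if the copy fails
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            rows = conn.execute(
                "SELECT id, link FROM node WHERE link IS NOT NULL"
            ).fetchall()
            for row_id, link in rows:  # `id` column is assumed for this sketch
                if link.startswith(old_prefix):
                    conn.execute(
                        "UPDATE node SET link = ? WHERE id = ?",
                        (new_prefix + link[len(old_prefix):], row_id),
                    )
    finally:
        conn.close()
    return backup
```

Because the backup happens first and raises on failure, a missing or unreadable cache file stops the run before any row is updated, and the returned path can be printed so the operator knows where to restore from.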

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/update-cheatsheets.sh` around lines 22 - 55, Add a timestamped backup
step before mutating the DB so the script copies DB_PATH to a BACKUP_FILE and
aborts on copy failure; specifically, before invoking the embedded Python block,
create a backup like BACKUP_FILE="${DB_PATH}.$(date +%Y%m%d%H%M%S).bak" and run
cp "$DB_PATH" "$BACKUP_FILE" (or the same cp "$CACHE_FILE" "$BACKUP_FILE" guard
used in sibling scripts), check the return code and exit with an error if the
copy fails, and print the backup location so that the Python code (which updates
node.link using github_prefix/official_prefix, cur, conn, rows) runs only after
a successful backup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@application/tests/owasp_aisvs_parser_test.py`:
- Around line 51-53: Rename the ambiguous loop variable `l` used in the list
comprehensions inside the test assertions to a clearer name (e.g., `link`) to
satisfy Ruff E741; specifically update occurrences like `[l.document.id for l in
entries[0].links]` and the similar instance around the second assertion to
`[link.document.id for link in entries[0].links]` (or analogous variable name)
so the test uses `link` instead of `l`.

In `@application/tests/owasp_api_top10_2023_parser_test.py`:
- Around line 39-43: Replace the ambiguous loop variable `l` used in the list
comprehensions inside the test assertions with a descriptive name (e.g., `link`
or `link_obj`) to satisfy Ruff E741; specifically update the expressions
`[l.document.id for l in entries[0].links]` and `[l.document.id for l in
entries[-1].links]` to use the new name consistently (`[link.document.id for
link in entries[0].links]`, etc.), and search for any other occurrences of `l`
in this test to rename similarly.

In `@application/tests/owasp_kubernetes_top10_2022_parser_test.py`:
- Around line 41-45: Rename the ambiguous loop variable `l` in the list
comprehensions inside the test assertions to a clear name like `link` to satisfy
Ruff E741; update both occurrences where you call `[l.document.id for l in
entries[0].links]` and `[l.document.id for l in entries[-1].links]` to use
`[link.document.id for link in entries[0].links]` and `[link.document.id for
link in entries[-1].links]` respectively so the test method (references to
entries and .links) remains the same but the loop variable is no longer
ambiguous.

In `@application/tests/owasp_kubernetes_top10_2025_parser_test.py`:
- Around line 45-52: The list comprehensions in the assertions use the ambiguous
loop variable name "l" (e.g., [l.document.id for l in entries[0].links] and
[l.document.id for l in entries[-1].links]); rename that loop variable to "link"
to match the convention used elsewhere (see the usage on line 102) so the
expressions become [link.document.id for link in entries[0].links] and
[link.document.id for link in entries[-1].links], keeping behavior identical but
removing the Ruff E741 warning.

In `@application/tests/owasp_llm_top10_2025_parser_test.py`:
- Around line 40-45: The loop variable `l` used in the list comprehensions
inside assertions should be renamed to a non-ambiguous identifier like `link` to
satisfy Ruff E741; update the three occurrences within the test assertions that
build lists from entries[*].links (e.g., change [l.document.id for l in
entries[0].links] to [link.document.id for link in entries[0].links], and
similarly for entries[4].links and entries[-1].links) so the test continues to
reference the same attributes (entries, links, document.id) but uses `link`
instead of `l`.

In `@application/utils/external_project_parsers/data/owasp_aisvs_1_0.json`:
- Line 5: Two OWASP AISVS entries have broken "hyperlink" values: the entries
referencing 0x10-C01-Training-Data-Governance.md and
0x10-C02-User-Input-Validation.md; locate the correct GitHub URLs for those two
markdown files in the OWASP/AISVS repository (use the repository browser to find
their current paths) and replace the current "hyperlink" string values with the
canonical GitHub blob URLs (ensure they use /blob/main/.../filename.md and are
not pointing to /tree/...); keep the "hyperlink" key intact and update only the
URL strings for those two entries.

In
`@application/utils/external_project_parsers/data/owasp_kubernetes_top10_2022.json`:
- Around line 50-55: The CRE ID list for section_id "K09" (section
"Misconfigured Cluster Components") duplicates the K01 mapping; update the
"cre_ids" for K09 to the correct, distinct CRE identifiers (or an empty array if
none) instead of ["233-748","486-813"]. Locate the JSON object with
"section_id": "K09" in owasp_kubernetes_top10_2022.json and replace the
duplicated cre_ids value with the verified CRE IDs for K09 (or remove the
entries) so K09 no longer mirrors K01's mapping.

In `@cre.py`:
- Around line 170-199: The new CLI flags (args.owasp_top10_2025_in,
args.owasp_api_top10_2023_in, args.owasp_kubernetes_top10_2022_in,
args.owasp_kubernetes_top10_2025_in, args.owasp_llm_top10_2025_in,
args.owasp_aisvs_in) are never acted on; update cre_main.run to dispatch them:
read each args.<flag> and, when true, call the corresponding import/handler
function (or add them into the existing args-to-action dispatch/map or
vars(args) iteration) so the flags trigger the intended import flow; reference
the same flag names and ensure the handler names or mapping keys match the flags
exactly.
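
One way the flag dispatch described in the cre.py finding could look, as a rough sketch rather than the project's actual code. The flag names are taken from the comment and the parser class names from the walkthrough; pairing them in this table, and the function names `build_arg_parser` and `selected_parsers`, are assumptions made for illustration.

```python
import argparse

# Flag names come from the review comment; class names come from the
# walkthrough. The pairing itself is an assumption for this sketch.
FLAG_TO_PARSER = {
    "owasp_top10_2025_in": "OwaspTop10_2025",
    "owasp_api_top10_2023_in": "OwaspApiTop10_2023",
    "owasp_kubernetes_top10_2022_in": "OwaspKubernetesTop10_2022",
    "owasp_kubernetes_top10_2025_in": "OwaspKubernetesTop10_2025",
    "owasp_llm_top10_2025_in": "OwaspLlmTop10_2025",
    "owasp_aisvs_in": "OwaspAisvs",
}


def build_arg_parser() -> argparse.ArgumentParser:
    """Register one boolean flag per entry in the dispatch table."""
    parser = argparse.ArgumentParser(description="hypothetical importer CLI")
    for flag in FLAG_TO_PARSER:
        parser.add_argument(f"--{flag}", action="store_true")
    return parser


def selected_parsers(args: argparse.Namespace) -> list[str]:
    """Return the parser names whose flags were set, in table order."""
    return [
        name for flag, name in FLAG_TO_PARSER.items() if getattr(args, flag, False)
    ]
```

Driving both flag registration and dispatch from a single table means a flag cannot be defined without also being acted on, which is the gap the comment points out.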

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 8f0f42f6-8050-436f-9c0a-2dfa61804f0d

📥 Commits

Reviewing files that changed from the base of the PR and between e93ce92 and e0e8b08.

📒 Files selected for processing (25)
  • application/tests/cheatsheets_parser_test.py
  • application/tests/owasp_aisvs_parser_test.py
  • application/tests/owasp_api_top10_2023_parser_test.py
  • application/tests/owasp_kubernetes_top10_2022_parser_test.py
  • application/tests/owasp_kubernetes_top10_2025_parser_test.py
  • application/tests/owasp_llm_top10_2025_parser_test.py
  • application/tests/owasp_top10_2025_parser_test.py
  • application/utils/external_project_parsers/data/owasp_aisvs_1_0.json
  • application/utils/external_project_parsers/data/owasp_api_top10_2023.json
  • application/utils/external_project_parsers/data/owasp_cheatsheets_supplement.json
  • application/utils/external_project_parsers/data/owasp_kubernetes_top10_2022.json
  • application/utils/external_project_parsers/data/owasp_kubernetes_top10_2025.json
  • application/utils/external_project_parsers/data/owasp_llm_top10_2025.json
  • application/utils/external_project_parsers/data/owasp_top10_2025.json
  • application/utils/external_project_parsers/parsers/cheatsheets_parser.py
  • application/utils/external_project_parsers/parsers/owasp_aisvs.py
  • application/utils/external_project_parsers/parsers/owasp_api_top10_2023.py
  • application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2022.py
  • application/utils/external_project_parsers/parsers/owasp_kubernetes_top10_2025.py
  • application/utils/external_project_parsers/parsers/owasp_llm_top10_2025.py
  • application/utils/external_project_parsers/parsers/owasp_top10_2025.py
  • cre.py
  • scripts/update-cheatsheets.sh
  • scripts/update-owasp-top10-2025-mappings.sh
  • scripts/update-owasp-top10-standards.sh

Comment thread application/tests/owasp_aisvs_parser_test.py
Comment thread application/tests/owasp_api_top10_2023_parser_test.py Outdated
Comment thread application/tests/owasp_kubernetes_top10_2022_parser_test.py Outdated
Comment thread application/tests/owasp_kubernetes_top10_2025_parser_test.py
Comment thread application/tests/owasp_llm_top10_2025_parser_test.py Outdated
Comment thread application/utils/external_project_parsers/data/owasp_aisvs_1_0.json Outdated
Comment on lines +50 to +55
{
"section_id": "K09",
"section": "Misconfigured Cluster Components",
"hyperlink": "https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/K09-misconfigured-cluster-components",
"cre_ids": ["233-748", "486-813"]
},

@coderabbitai coderabbitai Bot Jun 3, 2026

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Verify K09 CRE mappings — duplicates of K01.

K09 "Misconfigured Cluster Components" carries the exact same cre_ids (["233-748", "486-813"]) as K01 "Insecure Workload Configurations" (line 6). Given the sections cover different topics, this looks like a copy/paste leftover rather than an intentional mapping. Please confirm the correct CRE IDs for K09.
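
A check like the following could surface shared mappings of this kind automatically. It is a sketch under stated assumptions: the entry shape (`section_id`, `cre_ids`) follows the JSON excerpt quoted above, the K01 and K09 values are the ones cited in this comment, and the K02 entry with its `000-000` ID is a made-up placeholder. The function name `find_shared_mappings` is hypothetical.

```python
from collections import defaultdict


def find_shared_mappings(entries: list[dict]) -> dict[tuple[str, ...], list[str]]:
    """Group section IDs by their exact set of cre_ids; keep groups of two or more."""
    groups: dict[tuple[str, ...], list[str]] = defaultdict(list)
    for entry in entries:
        key = tuple(sorted(entry.get("cre_ids", [])))
        if key:  # entries without any mapping are not treated as duplicates
            groups[key].append(entry["section_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


sample = [
    {"section_id": "K01", "cre_ids": ["233-748", "486-813"]},
    {"section_id": "K02", "cre_ids": ["000-000"]},  # placeholder, not real data
    {"section_id": "K09", "cre_ids": ["486-813", "233-748"]},
]

for cre_ids, sections in find_shared_mappings(sample).items():
    print(f"{', '.join(sections)} share cre_ids {list(cre_ids)}")
```

Sorting the IDs before grouping makes the comparison order-insensitive. A shared mapping is not necessarily wrong, so a check like this is better suited to flagging entries for review than to failing a build.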

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@application/utils/external_project_parsers/data/owasp_kubernetes_top10_2022.json`
around lines 50 - 55, The CRE ID list for section_id "K09" (section
"Misconfigured Cluster Components") duplicates the K01 mapping; update the
"cre_ids" for K09 to the correct, distinct CRE identifiers (or an empty array if
none) instead of ["233-748","486-813"]. Locate the JSON object with
"section_id": "K09" in owasp_kubernetes_top10_2022.json and replace the
duplicated cre_ids value with the verified CRE IDs for K09 (or remove the
entries) so K09 no longer mirrors K01's mapping.

Contributor Author

I checked the K09 mapping against the OWASP Kubernetes Top Ten source before changing the data.

K09: Misconfigured Cluster Components does look suspicious at first because it currently shares the same cre_ids as K01. However, after reviewing the upstream OWASP content, I could not verify a clearly better replacement mapping.

What I found:

  • K09 is focused on hardening and misconfiguration of core cluster components such as kubelet, etcd, and kube-apiserver
  • its prevention guidance is centered on secure configuration, CIS benchmark scans, and reducing unsafe defaults
  • the 2025 OWASP Kubernetes Top Ten appears to consolidate this area further under K07: Misconfigured And Vulnerable Cluster Components

Because of that, the current K09 mapping to configuration-focused CREs is not clearly wrong from the source material alone, even if it overlaps with K01.

I did not change the K09 cre_ids in the JSON because I do not have strong enough evidence for a correct distinct replacement, and changing it speculatively would risk making the mapping worse rather than better.

Instead, I documented the intent in the parser test so the duplication is explicit and reviewable. If we want to change the K09 mapping later, I think the next step should be to agree on the target CREs based on a more explicit mapping rationale rather than inferring them from the section title alone.

@Bornunique911, thanks for the thorough investigation. That's a well-reasoned approach — keeping the existing mapping with explicit documentation is preferable to a speculative change that could silently introduce incorrect data. The note in the parser test making the duplication intentional and reviewable is the right call for now.

Would you like me to open a follow-up GitHub issue to track the proper CRE mapping research for K09 "Misconfigured Cluster Components"? It could serve as a clear handoff point for whoever takes on the deeper mapping work against the CRE database, referencing this PR and the 2025 consolidation note.

Contributor Author

Yes, please open a follow-up issue for that.

I think that would be the best way to track the deeper K09 mapping review separately from this PR, since the current decision here was intentionally to avoid introducing a speculative data change.

It would be helpful if the issue notes:

  • that K09 was intentionally left unchanged in this PR
  • that the open question is whether there is a more specific CRE mapping for cluster-component hardening/misconfiguration
  • that the 2025 consolidation into K07 may help guide that review
  • a link back to this PR for context

That would give a clean handoff point for refining the mapping later.

🧩 Analysis chain

🏁 Script executed:

gh issue create \
  --repo OWASP/OpenCRE \
  --assignee Bornunique911 \
  --title "Research correct CRE mapping for K09 (Misconfigured Cluster Components) in OWASP Kubernetes Top Ten 2022" \
  --body "## Context

This issue tracks follow-up research to find the correct, distinct CRE identifiers for **K09: Misconfigured Cluster Components** in the OWASP Kubernetes Top Ten 2022 dataset.

## Background

In PR `#877`, the \`owasp_kubernetes_top10_2022.json\` mapping for K09 was intentionally left unchanged because it currently shares the same \`cre_ids\` as K01 (\`[\"233-748\", \"486-813\"]\`). After reviewing the upstream OWASP source, a clearly correct distinct replacement could not be identified, so a speculative change was avoided to prevent introducing incorrect data.

**K09** is focused on hardening and misconfiguration of core cluster components (e.g., \`kubelet\`, \`etcd\`, \`kube-apiserver\`), with prevention guidance centred on secure configuration, CIS benchmark scans, and reducing unsafe defaults.

## Open question

Is there a more specific CRE mapping for cluster-component hardening/misconfiguration that should replace or augment the current \`[\"233-748\", \"486-813\"]\` entries for K09?

## Guidance for review

- Compare K09's prevention/mitigation content on the [OWASP Kubernetes Top Ten 2022 page](https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/K09-misconfigured-cluster-components) against the CRE database.
- Note that the OWASP Kubernetes Top Ten 2025 appears to consolidate this area under **K07: Misconfigured And Vulnerable Cluster Components** — the 2025 mapping (PR `#877`, \`owasp_kubernetes_top10_2025.json\`) may provide useful pointers for the correct CRE IDs.
- If no better mapping exists, the duplication with K01 should be explicitly documented in the data file (e.g., via a comment or a companion note) so future maintainers understand it is intentional.

## References

- PR `#877`: https://github.com/OWASP/OpenCRE/pull/877
- Comment thread: https://github.com/OWASP/OpenCRE/pull/877#discussion_r3351547169
- OWASP K09 source: https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/K09-misconfigured-cluster-components
- OWASP K07 (2025 consolidation): https://owasp.org/www-project-kubernetes-top-ten/
"

Length of output: 2591


🏁 Script executed:

gh issue create \
  --repo OWASP/OpenCRE \
  --title "Research correct CRE mapping for K09 (Misconfigured Cluster Components) in OWASP Kubernetes Top Ten 2022" \
  --body "## Context

This issue tracks follow-up research to find the correct, distinct CRE identifiers for **K09: Misconfigured Cluster Components** in the OWASP Kubernetes Top Ten 2022 dataset.

## Background

In PR `#877`, the \`owasp_kubernetes_top10_2022.json\` mapping for K09 was intentionally left unchanged because it currently shares the same \`cre_ids\` as K01 (\`[\"233-748\", \"486-813\"]\`). After reviewing the upstream OWASP source, a clearly correct distinct replacement could not be identified, so a speculative change was avoided to prevent introducing incorrect data.

**K09** is focused on hardening and misconfiguration of core cluster components (e.g., \`kubelet\`, \`etcd\`, \`kube-apiserver\`), with prevention guidance centred on secure configuration, CIS benchmark scans, and reducing unsafe defaults.

## Open question

Is there a more specific CRE mapping for cluster-component hardening/misconfiguration that should replace or augment the current \`[\"233-748\", \"486-813\"]\` entries for K09?

## Guidance for review

- Compare K09's prevention/mitigation content on the [OWASP Kubernetes Top Ten 2022 page](https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/K09-misconfigured-cluster-components) against the CRE database.
- Note that the OWASP Kubernetes Top Ten 2025 appears to consolidate this area under **K07: Misconfigured And Vulnerable Cluster Components** — the 2025 mapping (PR `#877`, \`owasp_kubernetes_top10_2025.json\`) may provide useful pointers for the correct CRE IDs.
- If no better mapping exists, the duplication with K01 should be explicitly documented in the data file (e.g., via a comment or a companion note) so future maintainers understand it is intentional.

## References

- PR `#877`: https://github.com/OWASP/OpenCRE/pull/877
- Comment thread: https://github.com/OWASP/OpenCRE/pull/877#discussion_r3351547169
- OWASP K09 source: https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/K09-misconfigured-cluster-components
- OWASP K07 (2025 consolidation): https://owasp.org/www-project-kubernetes-top-ten/
"

Length of output: 2331



@Bornunique911, the follow-up issue has been created: #917

It captures all the context you outlined — the intentional decision not to change K09 in this PR, the open mapping question, the 2025 K07 consolidation pointer, and backlinks to this PR and comment thread.

One note: the integration couldn't set the assignee automatically (GitHub restricts this for public repos). You can assign it to yourself directly on the issue page.

(=^・ω・^=)

Contributor Author

Thanks for opening those.

It looks like #916 and #917 may be tracking the same follow-up work. If that’s the case, I think it would be better to keep one as the canonical tracking issue and close the other as a duplicate, so the K09 mapping follow-up stays in a single place.

I’m happy with either one being kept, as long as the surviving issue links back to this PR for context.

🧩 Analysis chain

🏁 Script executed:

gh issue view 916 --repo OWASP/OpenCRE --json number,title,body,state,url
echo "---"
gh issue view 917 --repo OWASP/OpenCRE --json number,title,body,state,url

Length of output: 4779


🏁 Script executed:

gh issue comment 917 --repo OWASP/OpenCRE --body "Closing as a duplicate of `#916`, which tracks the same K09 CRE mapping follow-up work from PR `#877`. Please continue the discussion there."
gh issue close 917 --repo OWASP/OpenCRE --reason "duplicate"

Length of output: 613



@Bornunique911, confirmed — #916 and #917 are exact duplicates (identical title and body). Unfortunately the integration doesn't have the permissions needed to close issues on this repository, so you'll need to do it manually:

  1. Open issue #917, "Research correct CRE mapping for K09 (Misconfigured Cluster Components) in OWASP Kubernetes Top Ten 2022".
  2. Add a comment: "Duplicate of #916 — closing in favour of the earlier issue."
  3. Click Close issue (or close with reason "duplicate" if available).

#916 is the canonical one to keep — it was created first and already has the full context and backlink to PR #877.

Comment thread cre.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@application/utils/external_project_parsers/parsers/cheatsheets_parser.py`:
- Around line 126-134: The except block inside the supplemental-link ingestion
uses an undefined symbol `logger`; replace it with the instance logger
`self.logger` (the logger defined on the parser class at initialization) so the
call becomes `self.logger.warning(...)`, keeping the same message and `continue`
behavior; search for the offending usage in the CheatsheetsParser class (e.g.,
the method that adds supplemental cheatsheet links where `cre_id`, `cre`, and
`entry` are referenced) and update any other occurrences of `logger` to
`self.logger` to avoid NameError at runtime.

In `@scripts/update-cheatsheets.sh`:
- Around line 18-24: Check for the source DB file before attempting the backup:
add an existence test for "$DB_PATH" (e.g., [[ -e "$DB_PATH" ]] or [[ -f
"$DB_PATH" ]]) and print a clear error and exit if it does not exist, then
proceed to create BACKUP_FILE and cp; remove or keep the post-cp existence check
on "$BACKUP_FILE" as redundant with set -e but prefer keeping a clearer error
message using the captured path variable BACKUP_FILE if cp somehow fails. Ensure
you reference DB_PATH and BACKUP_FILE variables and perform the check before
calling cp.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: eabcb116-2cfe-4742-aa18-40ffd826c9bd

📥 Commits

Reviewing files that changed from the base of the PR and between e0e8b08 and f2f85f4.

📒 Files selected for processing (9)
  • application/cmd/cre_main.py
  • application/tests/owasp_aisvs_parser_test.py
  • application/tests/owasp_api_top10_2023_parser_test.py
  • application/tests/owasp_kubernetes_top10_2022_parser_test.py
  • application/tests/owasp_kubernetes_top10_2025_parser_test.py
  • application/tests/owasp_llm_top10_2025_parser_test.py
  • application/utils/external_project_parsers/data/owasp_aisvs_1_0.json
  • application/utils/external_project_parsers/parsers/cheatsheets_parser.py
  • scripts/update-cheatsheets.sh
✅ Files skipped from review due to trivial changes (1)
  • application/utils/external_project_parsers/data/owasp_aisvs_1_0.json
🚧 Files skipped from review as they are similar to previous changes (5)
  • application/tests/owasp_llm_top10_2025_parser_test.py
  • application/tests/owasp_kubernetes_top10_2022_parser_test.py
  • application/tests/owasp_api_top10_2023_parser_test.py
  • application/tests/owasp_kubernetes_top10_2025_parser_test.py
  • application/tests/owasp_aisvs_parser_test.py

Comment thread scripts/update-cheatsheets.sh