docs: update the docs for cloudberry-site#114
Draft
tuhaihe wants to merge 7 commits into
The PXF documentation is moving to the Apache Cloudberry website at cloudberry.apache.org/docs (built by apache/cloudberry-site, Docusaurus). This repository keeps the Markdown sources under docs/content/ but no longer ships its own build pipeline. Drop the Bookbinder/Middleman scaffolding under docs/book/ that would otherwise become unmaintained. Subsequent commits convert the Markdown sources to a Docusaurus-friendly layout, refresh docs/README.md, and add a docs lint workflow.
Convert the upstream Bookbinder-flavoured PXF documentation under docs/content/ into a layout that the Apache Cloudberry website (apache/cloudberry-site, Docusaurus) can consume directly.

Changes per file:

* Rename .html.md.erb -> .md. The .erb sources contained no ERB template code, so removing both suffixes is a no-op for content while letting Markdown tooling pick the files up.
* Rewrite intra-doc links from `(foo.html)` and `(foo.html#anchor)` to Docusaurus-style relative paths like `(./foo.md)` or `(../administering/cfg_server.md#about-the-pxf-fs-basepath-property)`. Same-page bare anchors `(#foo)` are also remapped, including a fix for the upstream typo `(#procedure.html)` in pxf_kerbhdfs.md.
* Replace heading anchor blocks of the form `## <a id="suppplat"></a> Supported Platforms` with plain headings, since Docusaurus auto-generates slug-based anchors. Cross-file references that used the old IDs are rewritten to the new slug. Stand-alone `<a id>` tags (table captions, mid-section deep links) are preserved as MDX-friendly invisible anchors.
* Add `description` and `sidebar_position` frontmatter to every page so the Docusaurus sidebar can be auto-generated and pages get sensible meta tags. Page titles that referenced "Greenplum® Platform Extension Framework" are rebranded to "Apache Cloudberry Platform Extension Framework".
* Reorganise the previously-flat `docs/content/` into category sub-directories matching the legacy subnav: `intro/`, `administering/`, `access-hadoop/`, `access-objectstores/`, `access-jdbc/`, `access-nfs/`, `troubleshooting/`, `upgrade/`, plus the existing `ref/`. Each carries a `_category_.json` for the Docusaurus sidebar.
* Rebrand prose: "Greenplum Database" -> "Apache Cloudberry" and bare "Greenplum" (where it refers to the deployment, not a specific Greenplum release/version) -> "Apache Cloudberry". Compatibility tables, transition notes, and other historical references that need to keep the original wording are preserved via a small set of guard patterns.
* Tweak raw HTML so MDX v3 can render it: `class="..."` -> `className="..."`, and `<a href="foo.html">` rewritten the same way as Markdown links.

A few warnings remain that reflect pre-existing dead links upstream (e.g. `#s3_override_ext_ext`, `init_pxf.html`, `#topic1` in ref pages); these are flagged for follow-up but kept as-is so the diff is mechanical.
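The link and heading rewrites described above could be approximated with two regular expressions. This is a minimal sketch, not the script used in this PR: the helper names `rewrite_links` and `strip_heading_anchors` are hypothetical, and the sketch only handles same-directory links. It does not move targets into the new category sub-directories or remap old anchor IDs to Docusaurus slugs.

```python
import re

# Hypothetical patterns for illustration; the PR's real conversion
# tooling is not shown in the commit messages.
LINK_RE = re.compile(r"\]\(([^)\s#]*)\.html(#[^)\s]*)?\)")
HEADING_ANCHOR_RE = re.compile(
    r'^(#{1,6})[ \t]*<a id="[^"]*"></a>[ \t]*', re.MULTILINE
)


def rewrite_links(text: str) -> str:
    """Turn `](foo.html#x)` into `](./foo.md#x)`, leaving external URLs alone."""

    def repl(match: re.Match) -> str:
        target = match.group(1)
        anchor = match.group(2) or ""
        if "://" in target:
            # Absolute URL such as https://example.com/page.html: keep as-is.
            return match.group(0)
        prefix = "" if target.startswith(("./", "../")) else "./"
        return f"]({prefix}{target}.md{anchor})"

    return LINK_RE.sub(repl, text)


def strip_heading_anchors(text: str) -> str:
    """Turn `## <a id="x"></a> Title` into `## Title`."""
    return HEADING_ANCHOR_RE.sub(r"\1 ", text)
```

Stand-alone `<a id>` tags that are not on a heading line do not match `HEADING_ANCHOR_RE`, so they pass through untouched, which is consistent with the commit's note that those anchors are preserved.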
The previous README described the now-deleted Bookbinder build under docs/book/. Replace it with a brief authoring guide that points at apache/cloudberry-site as the source of the rendered documentation and documents the conventions used by the Markdown sources (frontmatter, relative links, image placement, category metadata).
Add a lightweight docs-lint workflow that runs on pushes and pull requests touching docs/. It runs:

* markdownlint-cli2 against `docs/content/**/*.md` with a relaxed config tuned for the imported PXF content (legacy inline HTML, long lines, etc.).
* lychee for link validity, both internal relative paths and external URLs.

The goal is to catch broken links early rather than waiting for the cloudberry-site Docusaurus build to fail.
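To make the internal half of the link check concrete, here is a rough Python stand-in for what "relative path validity" means. It is an illustration only and not how lychee works internally; the function name `find_broken_relative_links` and the in-memory `files` mapping are assumptions made for the sketch.

```python
import posixpath
import re

# Matches the target of a Markdown link or image: `](target)`.
LINK_RE = re.compile(r"\]\(([^)\s]+)\)")


def find_broken_relative_links(files: dict[str, str]) -> list[tuple[str, str]]:
    """Return (page, target) pairs whose relative target is not a known file.

    `files` maps each path under the docs root (POSIX style) to its text.
    It should list every file, images included, or those would be reported.
    """
    broken = []
    for page, text in files.items():
        for target in LINK_RE.findall(text):
            path = target.split("#", 1)[0]
            if not path or "://" in path or path.startswith("mailto:"):
                continue  # same-page anchor or external URL: out of scope here
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(page), path)
            )
            if resolved not in files:
                broken.append((page, target))
    return broken
```

The sketch checks only that the target file exists. It does not verify that a `#fragment` matches a heading slug, and it skips external URLs entirely, both of which the real workflow delegates to lychee.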
YAML treats an unquoted ': ' (a colon followed by a space) inside a plain scalar as a key/value separator, which makes gray-matter, the frontmatter parser that Docusaurus uses, bail out with 'incomplete explicit mapping pair'. Wrap the two affected descriptions in double quotes so they parse as ordinary strings.
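The quoting fix can be sketched as a small helper that emits a frontmatter line and adds double quotes only when the value contains the problematic colon. The helper name `frontmatter_line` and the sample descriptions are hypothetical; the two descriptions actually changed in this PR are not reproduced here.

```python
def frontmatter_line(key: str, value: str) -> str:
    """Emit `key: value`, double-quoting values that contain a YAML-significant colon."""
    already_quoted = (
        len(value) >= 2 and value.startswith('"') and value.endswith('"')
    )
    # A colon followed by a space, or a trailing colon, ends a plain scalar.
    needs_quotes = ": " in value or value.endswith(":")
    if needs_quotes and not already_quoted:
        # Escape backslashes and embedded quotes for a double-quoted scalar.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{key}: "{escaped}"'
    return f"{key}: {value}"


# Example with made-up descriptions:
print(frontmatter_line("description", "PXF: reading HDFS data"))
print(frontmatter_line("description", "Reading HDFS data with PXF"))
```

This covers only the colon case named in the commit message. YAML has other constructs that also require quoting (for example a value starting with `#` or `[`), which the sketch deliberately ignores.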
closes: #ISSUE_Number
Change logs
Contributor's checklist
Here are some reminders before you submit your pull request: