ci: gate docs PRs on internal-link validity #179
Merged
PatrickRitchie merged 40 commits on Jun 1, 2026
Rewrite API page links to match docfx's flat MTConnect.<Namespace>.<Type> layout, swap obsolete configure/agent and configure/adapter paths for the canonical *-config slugs, update wire-format references to the json-v2-cppagent file names, and point troubleshooting cross-links at the matching sibling pages and corrected anchor slugs. Links to types that no longer exist as docfx-generated pages are kept as inline code.
Add the agent core, agent application host, every shipped agent module (HTTP server / HTTP adapter / MQTT adapter / MQTT broker / MQTT relay / SHDR adapter), the Python agent processor, the adapter core, the adapter application host, and the shipped adapter modules (MQTT, SHDR) to the docfx metadata input. The narrative docs cross-link to module configuration classes (e.g. MqttRelayModuleConfiguration) and to the agent core (e.g. InputValidationLevel, MTConnectAgentProcessors) that previously had no docfx-generated landing page; with the expanded coverage every referenced public type now resolves to a real API page. Update generate-api-ref.sh to build each project before docfx walks its DLL output, including the agent + adapter projects.
The /configure/index landing page advertised six sub-pages: Install, Configure an agent, Configure an adapter, Run, Connect a consumer, Operate. The last three did not yet exist; every link to them across the docs tree dead-ended at a missing page. Author them with real prose covering:
- Run: CLI verbs (run / debug / trace / install / start / stop / reset), configuration-file resolution, Docker / Windows-service / systemd-unit deployment shapes, and first-boot troubleshooting.
- Connect a consumer: HTTP REST endpoints (/probe, /current, /sample, /asset), polling vs streaming, content-type negotiation, JSON v2 MQTT topic tree, and .NET + Python consumer examples.
- Operate: NLog loggers + file layout, the metrics emitter, health checks, soft-reload via monitorConfigurationFiles, hard restart via OS services, and durable-buffer handling.
Wire all three into the /configure/ sidebar in .vitepress/config.ts.
Reverse the inline-code workarounds the first pass left in place and route every dangling reference at the proper docfx-generated page. The expanded docfx coverage (see prior commit) exposes the agent + adapter module configuration classes, and the new /configure/run, /configure/consumer, /configure/operate pages give every narrative cross-link a real target. Specific corrections:
- MTConnect.Configurations.*ModuleConfiguration: linked to the expanded docfx pages instead of bare code spans.
- MTConnect.Formatters.Xml.XmlFormatter -> XmlResponseDocumentFormatter; MTConnect.Formatters.Json.JsonFormatter -> JsonResponseDocumentFormatter; MTConnect.Formatters.JsonCppAgent.JsonCppAgentFormatter -> JsonHttpResponseDocumentFormatter; MTConnect.Formatters.JsonCppAgentMqtt.JsonCppAgentMqttFormatter -> JsonMqttResponseDocumentFormatter (the originally cited types did not exist; the real formatter classes follow the IResponseDocumentFormatter naming).
- MTConnect.Delegates -> the MTConnect namespace landing page (the delegates live under namespace MTConnect, not in a Delegates container).
- DataItemFilter -> Filter; DataItemSource -> Source (the original references were renamed during the SysML import pass; the on-disk classes are the unprefixed versions).
- ComponentRelationshipType (non-existent enum) -> ComponentRelationship class; the relationship type values live on the inherited Type property.
- /configure/run, /configure/consumer, /configure/operate: linked to the newly-authored pages.
- /api/ placeholder links across the modules pages: deep-linked to the specific type pages (HttpServerModuleConfiguration, MqttTopicStructure, IMTConnectMqttDocumentServerConfiguration, MTConnectHttpServer, MTConnectShdrHttpAgentServer, etc.).
- SysML renderer references (CSharpTemplateRenderer / XmlTemplateRenderer / JsonCppAgentTemplateRenderer): the cited classes do not exist; routed to the MTConnect.SysML namespace + the MTConnectModel and ModelHelper classes that do.
- cookbook/write-a-json-mqtt-consumer.md: rewrote the parsing example to use JsonMqttResponseDocumentFormatter (the real class) with its actual CreateStreamsResponseDocument / CreateAssetsResponseDocument API.
The internal-link checker exits 0 against the resulting tree.
Document the two MQTT layouts the agent publishes—document-server (whole-envelope per device, with Sample carrying an MTConnectStreams delta) and entity-server (per-data-item under Devices/<uuid>/Observations/<id>). Update the mosquitto and Python examples to subscribe to the entity-server tree so the parsing block actually receives the per-observation payloads it expects. Drop the unverified application/mtconnect+json Accept-header row and point operators at the http-server module's documentFormat key for JSON v2 selection.
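As a rough illustration of the entity-server layout described above, the sketch below pulls the device UUID and data-item id out of a topic shaped like Devices/<uuid>/Observations/<id>. The function name parseEntityTopic is hypothetical, and tolerating leading prefix segments before Devices is an assumption; neither is taken from the agent or the docs.

```javascript
// Hypothetical helper: parses an entity-server topic of the form
//   [<prefix>/...]Devices/<uuid>/Observations/<id>
// Allowing prefix segments before "Devices" is an assumption, not confirmed by the PR.
function parseEntityTopic(topic) {
  const parts = topic.split('/');
  const i = parts.indexOf('Devices');

  // Require exactly: Devices / <uuid> / Observations / <id>
  if (i === -1 || parts[i + 2] !== 'Observations') return null;
  if (parts.length !== i + 4) return null;

  const deviceUuid = parts[i + 1];
  const dataItemId = parts[i + 3];
  if (!deviceUuid || !dataItemId) return null;

  return { deviceUuid, dataItemId };
}
```

A consumer could call this on each incoming topic and skip messages for which it returns null, such as topics from the document-server tree.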
…ad claim AgentConfiguration exposes only enableMetrics; the tick interval and window length are constructor-only on MTConnectAgentMetrics. Replace the invented metrics: block with the real switch. NLog hot-reload is gated by autoReload on the <nlog> root, not by internalLoggingLevel; name the correct lever.
The HTTP server module emits 'Listening at <prefix>..', not 'MTConnectAgent : Started on port 5000'. Replace the invented line so operators tailing logs find a match, and update the first-boot troubleshooting bullet that keyed off the same string.
MTConnectHttpClient exposes CurrentReceived (not OnCurrentReceived), Start() (no async overload), ConnectionError (transport) and InternalError (parsing/dispatch); OnError does not exist. Rewrite the snippet against the shipped event names. The device-scoped current path is /<deviceName>/current, not /current?deviceName=<name>—the query parameter is neither in the MTConnect REST spec nor in the .NET server module.
Sweep up six rewrites the earlier pass missed: XmlFormatter / JsonFormatter / JsonCppAgentFormatter -> the shipped ResponseDocumentFormatter family; the XmlAssetsDocument / XmlErrorDocument display strings -> their real *ResponseDocument counterparts; the MTConnect.Formatters.* display segment -> MTConnect.Formatters.Xml.* (the link targets were already correct); the spurious .Shdr. infix on the ShdrAdapter family -> the real MTConnect.Adapters namespace; and retarget the agent-processor-python reference rows at the dedicated MTConnect.Agents.ProcessObservation and MTConnect.Processors pages the text actually names.
behaviour -> behavior, monopolising -> monopolizing, honours -> honors. Also replace the unresolvable parts/2.0/HttpProtocol.md path with the docs.mtconnect.org URL the citation actually points at.
Replace @jsdevtools/rehype-url-inspector (unmaintained since 2021, pulls deprecated url-regex with a ReDoS advisory) with a unist-util-visit walk over rendered link/image HAST nodes: same coverage, one less archived dep, no behaviour change for valid links. Edge-case hardening surfaced by the review pass:
- guard node.position?.start against autolink and plugin-inserted nodes (the throw previously cascaded to an unhandled rejection and exit-2)
- decodeURIComponent the path before stat() so %20-encoded targets resolve
- strip ?query before splitting on # and special-case '' / '#' as the docs-root index / placeholder rather than reporting them broken
- tighten the raw-HTML id sweep to skip data-id / aria-labelledby and to strip fenced code blocks before matching
- contain candidate paths under docs-root so [x](/../../../etc/passwd) cannot stat arbitrary FS locations
- skip symlinks during the walk and bracket the dot-directory skip list so .git / .docfx do not surface scratch markdown
- emit '<file>:<line>:<col>: broken link <url>' rows on a single stderr stream so editor problem-matchers (VS Code, vim :cfile, GHA annotations) recognise the format, and surface the failing-script name + stack on the top-level catch
Two cross-repo testing.md links (../tests/ and ../.github/) that the old script silently followed outside docs/ are rewritten as github.com URLs; the containment guard rejects them by design.
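The percent-decoding and containment steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in check-broken-links.mjs: the function name resolveLinkTarget, its arguments, and its return shape are all hypothetical, and the fragment-then-query split order follows the PR summary rather than this commit message.

```javascript
import path from 'node:path';

// Hypothetical helper; names and return shape are illustrative only.
function resolveLinkTarget(url, sourceFile, docsRoot) {
  // Split off the fragment first, then drop any ?query from the path part.
  const hashIndex = url.indexOf('#');
  const rawPath = hashIndex === -1 ? url : url.slice(0, hashIndex);
  const anchor = hashIndex === -1 ? '' : url.slice(hashIndex + 1);
  const queryIndex = rawPath.indexOf('?');
  const withoutQuery = queryIndex === -1 ? rawPath : rawPath.slice(0, queryIndex);

  // Decode %20 and friends before touching the filesystem.
  let decoded;
  try {
    decoded = decodeURIComponent(withoutQuery);
  } catch {
    return { ok: false, reason: 'malformed percent-encoding' };
  }

  // Bare '' or '#': nothing to stat here; the caller decides how to treat it.
  if (decoded === '') return { ok: true, file: null, anchor };

  const root = path.resolve(docsRoot);
  const candidate = decoded.startsWith('/')
    ? path.resolve(root, '.' + decoded) // site-absolute link, rooted at the docs root
    : path.resolve(path.dirname(sourceFile), decoded); // relative to the linking file

  // Containment guard: reject anything that resolves outside the docs root.
  const rel = path.relative(root, candidate);
  if (rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) {
    return { ok: false, reason: 'escapes docs root' };
  }
  return { ok: true, file: candidate, anchor };
}
```

Comparing path.relative output against '..' avoids the false positives a plain string-prefix check on the absolute path would give for sibling directories such as docs-old/.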
The link check ran serially after the docfx chain in the build job, on the order of four minutes on the 2k-file tree. Split it into its own job so it runs in parallel with the VitePress build—net wall-time drops to the longer of the two, not their sum.
The Windows-service install block defaulted to LocalSystem without naming the security trade-off; add one sentence pointing operators at a dedicated low-privilege account for production. The Docker block exposed -p 5000:5000 without flagging the lack of TLS / auth; add one paragraph pointing at the module-level TLS + auth blocks.
Cross-check against the shipped Dockerfile: ENTRYPOINT is dotnet agent.dll with CMD debug, WORKDIR /app; there is no /config/agent.config.yaml mount point. Rewrite the docker run example with the real /app/ paths and rewrite the prose to match. Pin dotnet run to the explicit MTConnect.NET-Agent.csproj path so the snippet is unambiguous even when the directory gains a second csproj.
Operate did not link back to Install or Configure an adapter; an on-call operator landing on the page had no one-click jump to either. Consumer did not link to Configure an adapter; integrators tracing an SHDR chain back to the equipment had no link back either. Add both.
Replace the fully-qualified MTConnectAgentMetrics link with a link to the MTConnect.Agents.Metrics namespace landing page. If the emitter is ever folded into another class the type link rots; the namespace link is stable across that refactor. The MQTT document-server reference in consumer.md was already retargeted at the root MTConnect namespace page in the earlier topic-shape rewrite.
6c70e55 to a4d3bad
ottobolyos added a commit to ottobolyos/mtconnect.net that referenced this pull request on Jun 1, 2026
…xample
F-075 — operate.md:17-18: the file-pattern column on the modules /
processors rows still pointed at logs/<module-name>-<date>.log after
the F-052 fix updated only the logger-name column. The shipped
NLog.config templates the file as logs\${logger}-${shortdate}.log,
and ${logger} substitutes the full logger name including the
modules. / processors. prefix, so the on-disk file is actually
logs/modules.<module-id>-<date>.log. Update the file-pattern cells
accordingly.
F-076 — add a Common-operational-patterns bullet showing the
per-module and per-processor tail commands so operators do not have
to reconstruct the on-disk path from the logger table.
F-077 — the three configure/ pages (run, consumer, operate) used the spaced em-dash ' — ' while the rest of the docs tree (notably the newer docs-site.md page) uses the closed CMOS form 'word—word'. CONVENTIONS §1.0d-decies makes the closed form canonical, so retrofit the three pages to match. Verified no em-dashes inside fenced code blocks or inline-code spans before the sweep.
…ample
F-078 — the .NET HTTP and MQTT consumer snippets called
observation.GetValue("Result") with a bare string literal where the
shipped library defines ValueKeys.Result = "Result" in
libraries/MTConnect.NET-Common/Observations/ValueKeys.cs:15. Switch
both call sites to observation.GetValue(ValueKeys.Result) and add
'using MTConnect.Observations;' to each snippet so the constant
resolves. Reads as idiomatic .NET and signposts the ValueKeys.*
typed-constant home (Result, Level, NativeCode, …) for readers who
need to access other value keys.
…avior F-079 — actions/upload-artifact@v4 strips the longest common prefix from a multi-path list, so uploading 'docs/api' + 'docs/reference' produces an artifact rooted at 'api/' + 'reference/' rather than the literal upload paths. The downstream download-artifact steps in the build and check-links jobs rely on 'path: docs' to restore the 'docs/' prefix. The behaviour differs from v4's v3 predecessor and is non-obvious on a quick read. Add an inline comment above the upload-artifact step so a future editor of either end of the upload / download pair does not break the path contract.
…ecker
F-080 — runWithConcurrency synthesises a recordBrokenLink entry on
worker failure with url set to '<worker failure: ${message}>'. The
literal '<' / '>' characters propagate to the problem-matcher line as
'<file>:0:0: broken link <worker failure: ENOENT…>'. Editor
problem-matchers (VS Code, vim :cfile) parse the line correctly, but
the angle brackets look like template placeholders in the UI and may
confuse a quick visual read of the CI log. Use parentheses instead.
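A minimal sketch of the pattern F-080 describes: a bounded-concurrency runner that turns a worker failure into one recorded broken-link entry, with the message wrapped in parentheses rather than angle brackets. The function and callback names come from the commit message, but the signatures, the record shape, and the formatBrokenLink helper are assumptions made for illustration.

```javascript
// Hypothetical signature: runs `worker` over `items` with at most `limit` in flight.
async function runWithConcurrency(items, limit, worker, recordBrokenLink) {
  let next = 0;

  // Each lane pulls the next unclaimed item until the list is exhausted.
  async function lane() {
    while (next < items.length) {
      const item = items[next++];
      try {
        await worker(item);
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        // Parentheses, not '<' / '>', so the row does not read like a template placeholder.
        recordBrokenLink({ file: item, line: 0, col: 0, url: `(worker failure: ${message})` });
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
}

// Hypothetical formatter for the '<file>:<line>:<col>: broken link <url>' row shape.
function formatBrokenLink({ file, line, col, url }) {
  return `${file}:${line}:${col}: broken link ${url}`;
}
```

Because the failure is caught inside the lane, one rejected worker degrades to a single reported row and the remaining files are still checked.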
F-081 — CRLF -> LF normalisation was repeated in both computeAnchorSet (line 107-108) and processFile (line 223-224). Extract a single readMarkdownNormalized(file) helper and call it from both sites. Tiny duplication, but it was introduced in one commit and the intent reads clearer with a named helper.
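The extracted helper could look roughly like the sketch below. Only the name readMarkdownNormalized comes from the commit message; the body, and the separate pure normalizeLineEndings step, are assumptions about one reasonable way to write it.

```javascript
import { readFile } from 'node:fs/promises';

// Pure step, split out (hypothetically) so it can be exercised without the filesystem.
function normalizeLineEndings(text) {
  return text.replace(/\r\n/g, '\n');
}

// Reads a markdown file as UTF-8 and converts CRLF line endings to LF.
async function readMarkdownNormalized(file) {
  const raw = await readFile(file, 'utf8');
  return normalizeLineEndings(raw);
}
```

Both computeAnchorSet and processFile would then await readMarkdownNormalized(file) instead of repeating the replace call inline.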
PatrickRitchie approved these changes on Jun 1, 2026
Summary
Adds an internal-link validation gate to the docs CI, expands docfx coverage to include the agent + adapter projects, authors the three previously-missing /configure/ sub-pages (Run, Connect a consumer, Operate), and rewrites every broken internal link across the docs tree to the correct target. The link checker itself is hardened against the edge cases that surfaced during review, runs with bounded concurrency, and lands in a dedicated CI job so the wall-time cost is hidden behind the VitePress build.
Tooling
- docs/scripts/check-broken-links.mjs: new script. Walks every .md under the given root, parses with unified + remark-parse + remark-gfm, walks the rendered HAST for link / image nodes via unist-util-visit, validates filesystem paths (with and without .md extension, with percent-decoding, with index-page fallback), and validates anchor fragments against rehype-slug-derived heading IDs plus a tightened raw-HTML id="…" sweep on the target file. Hardening: position-guards autolink-derived nodes; constrains candidate paths to the docs root; promise-caches in-flight anchor parses; tightens the raw-id pattern to skip data-id, aria-labelledby, and namespaced *:id= attributes; strips fenced code blocks; normalises CRLF line endings before fenced-code matching; splits URLs on # before stripping the query so path?v=1#anchor resolves cleanly; trims candidate-path emission for the bare-/ branch to the docs-root index page only; skips symlinks and dot-prefixed directories; records unreadable directories rather than aborting the walk; catches per-worker rejections so a transient FS hiccup degrades to one recorded broken link rather than an unhandled rejection. Surfaces failures as <file>:<line>:<col>: broken link <url> rows on a single stderr stream so editor problem-matchers (VS Code, vim :cfile, GHA annotations) recognise the format. Bounded concurrency keyed off os.cpus().length cuts the 2k-file pass materially. Exits 1 on any broken link.
- docs/package.json: adds check-links npm script. Replaces the unmaintained @jsdevtools/rehype-url-inspector (last release 2021, deprecated url-regex dep with a ReDoS advisory) with a direct unist-util-visit walk.
- .github/workflows/docs.yml: factors out a shared prepare-docs job that runs the docfx + reference regeneration once and uploads the generated tree as a workflow artefact. Both build and check-links depend on prepare-docs and download the artefact; net wall-time drops to the longer of the two parallel jobs, and the previously-duplicated setup is gone.
Docfx coverage
- docs/.docfx/docfx.json + docs/scripts/generate-api-ref.sh: expand docfx coverage from the 13 MTConnect.NET-* libraries to also include the agent core (MTConnect.NET-Agent, MTConnect.NET-Applications-Agents), every shipped agent module (HTTP server, HTTP adapter, MQTT adapter, MQTT broker, MQTT relay, SHDR adapter), the adapter core (MTConnect.NET-Adapter, MTConnect.NET-Applications-Adapter), and the shipped adapter modules (MQTT, SHDR). The expansion adds new docfx-generated pages including every *ModuleConfiguration type the narrative docs cross-link to.
New configure pages
- docs/configure/run.md: CLI verbs (run / debug / trace / install / install-start / remove / start / stop / reset / help), Docker / Windows-service / systemd-unit deployment shapes (Docker entry-point and volume paths verified against the shipped Dockerfile; Windows-service LocalSystem note; Docker TLS / auth gap flagged), configuration-file resolution order, and first-boot troubleshooting against the real modules.http-server | Info | Listening at <prefix>.. startup line.
- docs/configure/consumer.md: HTTP REST endpoints (/probe, /current, /sample, /asset) with the device-scoped path form, polling vs streaming, content-type negotiation against the shipped formatter rows, the two parallel MQTT topic layouts (document-server envelope publishes keyed by device from MTConnect.MTConnectMqttDocumentServer, and entity-server per-data-item publishes under Devices/<uuid>/Observations/<id> from MTConnect.Clients.MTConnectMqttEntityServer), .NET + Python consumer examples that use the actual shipped event names (CurrentReceived, Start(), ConnectionError, InternalError, EventHandler<IObservation> signatures), and a note that CurrentReceived fires once on stream initialization.
- docs/configure/operate.md: NLog loggers + file layout (modules.<module-id> / processors.<processor-id> keyed off the shipped NLog.config rules), the metrics emitter (gated by enableMetrics: only; interval and window are constructor-only), health-check patterns, soft-reload via monitorConfigurationFiles: true, hard restart via OS services, NLog hot-reload via autoReload="true", durable-buffer handling with the shipped observationBufferSize: 131072 default.
Doc rewrites
- docs/**: fixes every broken internal link the checker reports against the post-#157 tree (docs: add VitePress documentation site). Two categories: page-rename remaps (/configure/agent → /configure/agent-config, /wire-formats/json-cppagent* → /wire-formats/json-v2-cppagent*) and API-page slug remaps (the docfx output uses MTConnect.<Namespace>.<Type>.md rather than MTConnect.<Namespace>/<Type>.html). Where the original doc cited a type that does not exist in the codebase (MTConnect.Formatters.Xml.XmlFormatter, JsonCppAgentFormatter, the SHDR-namespace adapter family, the XmlAssetsDocument / XmlErrorDocument display strings, MTConnect.Delegates, DataItemFilter, DataItemSource, ComponentRelationshipType, the SysML *TemplateRenderer family), the link is rewritten to the type that actually exists. Two namespace-level fallbacks reduce the rot surface when emitters move. Cross-repo ../tests/ and ../.github/ references in testing.md are rewritten as github.com URLs. The .NET HTTP consumer example uses observation.GetValue("Result") against the base IObservation surface so the snippet compiles against the shipped API.
External (HTTP/HTTPS) URLs are deliberately not validated; third-party state is not the gate.