pampa - certain parse errors prevent recovery into subsequent blocks, suppressing all later diagnostics #232

@rundel

The underlying issue appears to be that for certain error types the parser cannot recover to continue parsing subsequent blocks. Once such an error is hit, no useful diagnostics are produced for the remainder of the document, including additional errors in the same block and any errors in later blocks. The first error in the first block typically still gets a specific diagnostic (e.g. Q-2-5 Unclosed Underscore Emphasis), but everything after it is silently dropped.

For inline constructs specifically, this should be recoverable in principle. Inline syntax (emphasis, quotes, code spans, etc.) cannot span a block boundary, so an unclosed inline delimiter is fully resolved at the end of its block: the block ends, the delimiter is left unmatched, and parsing of the next block should be able to proceed normally. Recovery does work this way for some error types but not for others. An unclosed inline emphasis in an early block currently halts useful parsing for the rest of the input (Example 1), whereas other error families recover and the parser progresses into later blocks. The attribute-list ordering error in #230 is an example of the latter: the parser recovers far enough to flag a later block, even though that case surfaces a separate problem with how the resulting diagnostic is dispatched.

All examples below run on main, invoked with --no-prune-errors -t native so no diagnostics are filtered.

Example 1: multiple errors in the same paragraph, plus another error in a second paragraph

This is the _first *paragraph_.

This is the _2nd paragraph

printf "%s\n" "This is the _first *paragraph_." "" "This is the _2nd paragraph" \
  | pampa --no-prune-errors -t native
Error: [Q-2-5] Unclosed Underscore Emphasis
   ╭─[ <stdin>:1:32 ]
   │
 1 │ This is the _first *paragraph_.
   │            ─┬                  ┬
   │             ╰───────────────────── This is the opening '_' mark.
   │                                │
   │                                ╰── I reached the end of the block before finding a closing '_' for the emphasis.
───╯

Only this one diagnostic is emitted: neither the unclosed '*' in the first paragraph nor the unclosed '_' in the second paragraph is reported.

Example 2: control — clean first paragraph, unclosed delimiter in the second paragraph

First good paragraph.

Second *bad paragraph.

printf "%s\n" "First good paragraph." "" "Second *bad paragraph." \
  | pampa --no-prune-errors -t native
Error: [Q-2-12] Unclosed Star Emphasis
   ╭─[ <stdin>:3:23 ]
   │
 3 │ Second *bad paragraph.
   │ ───┬──                ┬
   │    ╰───────────────────── This is the opening '*' mark.
   │                       │
   │                       ╰── I reached the end of the block before finding a closing '*' for the emphasis.
───╯

Here the parser gets past the clean first paragraph and reports the error in the second one, so a later block is only lost when an earlier block has already failed.

POC fix

A proof-of-concept fix exists on the forked branch poc/downstream-block-diagnostics-reparse.

It works around the parser limitation at the pampa level rather than changing the grammar. When a parse already has errors, it re-parses the document once per remaining failing block. Each pass masks the already-handled prefix to blank lines, which preserves byte offsets and rows, so the next still-broken block becomes the document's first block and reports its own diagnostic. The result is one diagnostic per failing block instead of only the first. This feels (very) hacky but works; something cleverer with offsets should be possible but would be more work. The fix is diagnostics-only (no AST recovery), and block resync relies on a blank-line heuristic, which feels fragile but seems to hold up pretty well in practice.
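
To make the masking idea concrete, here is a rough sketch in Python. It is an illustration only, not the code on the branch: `mask_prefix`, `next_block_start`, `collect_error_offsets`, and the `first_error` callback are hypothetical names, and `toy_first_error` is a stand-in that merely imitates a parser which reports only the first failing block.

```python
import re
from typing import Callable, List, Optional

# Blank-line heuristic (assumption): a block ends at a newline followed by
# an empty or whitespace-only line.
BLANK_LINE = re.compile(rb"\n[ \t\r]*\n")


def mask_prefix(source: bytes, end: int) -> bytes:
    """Replace source[:end] with spaces, keeping line breaks.

    The length and every newline position are unchanged, so byte offsets
    and row numbers in later diagnostics still refer to the original input.
    """
    masked = bytes(b if b in (0x0A, 0x0D) else 0x20 for b in source[:end])
    return masked + source[end:]


def next_block_start(source: bytes, offset: int) -> int:
    """Offset just past the first blank line at or after `offset`."""
    match = BLANK_LINE.search(source, offset)
    return match.end() if match else len(source)


def collect_error_offsets(
    source: bytes,
    first_error: Callable[[bytes], Optional[int]],
) -> List[int]:
    """Re-run `first_error` once per failing block, masking handled input."""
    offsets: List[int] = []
    current = source
    masked_until = 0
    while True:
        offset = first_error(current)
        # Stop when the parse is clean, or when the reported offset falls
        # inside the masked prefix (which would otherwise loop forever).
        if offset is None or offset < masked_until:
            break
        offsets.append(offset)
        masked_until = next_block_start(current, offset)
        if masked_until >= len(current):
            break
        current = mask_prefix(current, masked_until)
    return offsets


def toy_first_error(source: bytes) -> Optional[int]:
    """Stand-in parser: offset of the first '*' in the first block that
    contains an odd number of '*'. Later blocks are never inspected."""
    pos = 0
    for piece in re.split(rb"(\n[ \t\r]*\n)", source):
        if piece.count(b"*") % 2 == 1:
            return pos + piece.index(b"*")
        pos += len(piece)
    return None


doc = b"First *bad paragraph.\n\nGood one.\n\nThird *bad paragraph.\n"
error_offsets = collect_error_offsets(doc, toy_first_error)
```

Masking byte by byte keeps byte offsets and rows stable, but a multi-byte character becomes several spaces, so character-based columns inside the masked prefix would shift. That does not matter here because nothing in the masked region is reported again.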

Nothing blows up with the existing tests, and an automated pass through quarto-web did not turn up anything obviously wrong either.

Happy to turn this into an actual PR if it seems worthwhile.
