Skip to content

node: bounds-check and OOM-guard in cmark_set_cstr#609

Open
Alearner12 wants to merge 1 commit into
commonmark:masterfrom
Alearner12:fix/cmark-set-cstr-bounds-oom
Open

node: bounds-check and OOM-guard in cmark_set_cstr#609
Alearner12 wants to merge 1 commit into
commonmark:masterfrom
Alearner12:fix/cmark-set-cstr-bounds-oom

Conversation

@Alearner12
Copy link
Copy Markdown

node: bounds-check and OOM-guard in cmark_set_cstr

cmark_set_cstr in src/node.c backs every cmark string setter — cmark_node_set_url, _set_title, _set_literal, _set_fence_info, _set_on_enter, _set_on_exit. It has two paths to undefined behaviour that I think are worth closing.

1. The realloc result is not checked.

*dst = (unsigned char *)mem->realloc(NULL, len + 1);
memcpy(*dst, src, len + 1);

mem->realloc can return NULL — either stdlib OOM or, more importantly, a caller-supplied allocator (cmark exposes the allocator through cmark_node_new_with_mem). The next line dereferences it, so failure is UB inside memcpy rather than a defined abort. Other unreasonable-state paths in cmark — for example the target_size > INT32_MAX/2 branch in cmark_strbuf_grow (src/buffer.c:43-48) — already abort() with a diagnostic. This change keeps that pattern.

2. The bufsize_t cast of strlen silently truncates.

bufsize_t is int32_t (src/buffer.h:16). For strlen(src) > INT32_MAX the cast wraps; subsequent len + 1 and memcpy(*dst, src, len + 1) operate on a truncated (or, for some wraparounds, negative) count. The function then returns this incorrect length to the caller, which uses it as a size elsewhere. Same INT32_MAX bound and abort treatment as Defect 1.

Verification

Built against current master (b320f40) with -fsanitize=address,undefined. A small reproducer using cmark_node_new_with_mem to inject a failing realloc:

static void *p_calloc(size_t n, size_t sz) { return calloc(n, sz); }
static void *f_realloc(void *p, size_t sz) { (void)p; (void)sz; return NULL; }
static void  p_free(void *p)               { free(p); }

cmark_mem mem = { p_calloc, f_realloc, p_free };
cmark_node *link = cmark_node_new_with_mem(CMARK_NODE_LINK, &mem);
cmark_node_set_url(link, "https://example.com/");

Before:

node.c:275:7: runtime error: null pointer passed as argument 1, which is declared to never be null
==97168==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000
    #0 memcpy (libc.so.6)
    #1 cmark_set_cstr  src/node.c:275
    #2 cmark_node_set_url  src/node.c:534

After this patch:

[cmark] cmark_set_cstr: allocator returned NULL, aborting
Aborted (core dumped)

api_test on the patched build: 556 tests passed, 0 failed, 0 skipped.

cmark_set_cstr backs every cmark string setter (cmark_node_set_url, _set_title, _set_literal, _set_fence_info, _set_on_enter, _set_on_exit). This change closes two paths to undefined behaviour:

1. The realloc result is not checked. mem->realloc can return NULL via stdlib OOM or a caller-supplied allocator (cmark exposes the allocator through cmark_node_new_with_mem). The next line dereferences it, so failure is UB inside memcpy rather than a defined abort.

2. The bufsize_t cast of strlen silently truncates. bufsize_t is int32_t; for strlen(src) > INT32_MAX the cast wraps and subsequent len + 1 and memcpy operate on a truncated count.

Both now abort() with a diagnostic, matching the existing precedent in cmark_strbuf_grow (src/buffer.c).

Co-Authored-By: Claude <noreply@anthropic.com>
@jgm
Copy link
Copy Markdown
Member

jgm commented May 17, 2026

Looks good. @nwellnhof any comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants