Skip to content

feat(opensearch) #34610: ES-DECOMMISSION markers, OpenSearchExceptionMapper, and neutral utility conversions#35824

Open
fabrizzio-dotCMS wants to merge 19 commits into
mainfrom
feat/opensearch-es-coupling-annotation-utilities
Open

feat(opensearch) #34610: ES-DECOMMISSION markers, OpenSearchExceptionMapper, and neutral utility conversions#35824
fabrizzio-dotCMS wants to merge 19 commits into
mainfrom
feat/opensearch-es-coupling-annotation-utilities

Conversation

@fabrizzio-dotCMS
Copy link
Copy Markdown
Member

@fabrizzio-dotCMS fabrizzio-dotCMS commented May 25, 2026

Summary

Advances the ES→OpenSearch migration by introducing greppable // ES-DECOMMISSION: inline comments to mark ES-vendor-coupled classes for Phase 3 decommission review, adds OpenSearchExceptionMapper as the vendor-neutral JAX-RS replacement for ElasticsearchStatusExceptionMapper, and removes ES vendor imports from several neutral utility classes.

Changes

ES-DECOMMISSION markers

7 classes annotated inline with decommission context and rationale:

Class Decommission note
ElasticsearchStatusExceptionMapper Full class; OpenSearchExceptionMapper is the OS replacement
BulkActionListener Full class; migrate callers to vendor-neutral IndexBulkListener
ESContentTool Bridge methods esSearch / esSearchRaw reference ES-specific search types
ContentletAPI Deprecated esSearch / esSearchRaw method signatures
ContentletAPIPreHook Hook methods for deprecated signatures
ContentletAPIPostHook Hook methods for deprecated signatures
ContentletAPIInterceptor Same as ContentletAPI

Greppable via: grep -r "ES-DECOMMISSION"

New: OpenSearchExceptionMapper

  • JAX-RS @Provider counterpart to ElasticsearchStatusExceptionMapper
  • Uses exception.status() for exact HTTP code passthrough instead of string matching
  • NPE-safe: Optional-guards exception.error().reason() for cases with no server-side error body (e.g. transport errors)

WorkflowAPIImpl — simplified catch block

  • Removed the ES-specific QueryPhaseExecutionException instanceof branch (never fires via the REST client; no typed OS equivalent in OpenSearch Java client 3.x)
  • Replaced with a single Logger.warnAndDebug message that covers both the window-limit case and any other unexpected search failure
  • Comment documents why full vendor-neutral detection is deferred to Phase 3 at the factory layer

Quick wins — zero functional change

File Change
HttpRequestDataUtil Replace InetAddresses (ES) with NetUtil.createByteArrayFromIpAddressString (Netty); null-guard for getRemoteAddr()
TotalSizeSiteSearchIndicesMetricType Replace ByteSizeValue (ES) with neutral formatBytes() helper (binary units, 1 decimal place)
SearchHits (domain) Remove unused ES @Nullable import
ContentsWebAPI Remove unused ES SearchHits import and dead variable
CreateJsonWebTokenResource Replace org.elasticsearch.common.collect.Map with java.util.Map
DotRunnableThread Remove ReindexActionListeners inner class (implemented ES ActionListener<BulkResponse>)

Testing

  • TotalSizeSiteSearchIndicesMetricTypeTest.testTruncationNotRounding — documents that formatBytes truncates (not rounds): 1.99 GB → "1.9gb" is intentional
  • ./mvnw compile -pl :dotcms-core -DskipTests — verifies no compilation errors

Breaking Changes

None. All ES-DECOMMISSION markers are comments only; no behavior changed. OpenSearchExceptionMapper is additive.

Closes #34610

🤖 Generated with Claude Code

…eption mapper

- Introduce `@ESCoupled` annotation (com.dotcms.content.index) to mark
  ES-vendor-coupled classes for Phase 3 decommission review; searchable
  via reflection and grep as a structured backlog marker
- Apply @ESCoupled to 8 classes: ElasticsearchStatusExceptionMapper,
  BulkActionListener, WorkflowAPIImpl, ESContentTool, ContentletAPI,
  ContentletAPIPreHook, ContentletAPIPostHook, ContentletAPIInterceptor
- Add OpenSearchExceptionMapper (@Provider) as the OS counterpart of
  ElasticsearchStatusExceptionMapper; uses exception.status() for exact
  HTTP code mapping instead of string matching
- Remove ReindexActionListeners inner class from DotRunnableThread
  (implemented ActionListener<BulkResponse> — ES vendor type)

Part of #34610

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fabrizzio-dotCMS fabrizzio-dotCMS force-pushed the feat/opensearch-es-coupling-annotation-utilities branch from ea8835f to b9dcda1 Compare May 25, 2026 20:47
…ral classes

- CreateJsonWebTokenResource: org.elasticsearch.common.collect.Map → java.util.Map
- HttpRequestDataUtil: remove org.elasticsearch.common.network.InetAddresses;
  replace two-step isInetAddress+createByteArray with single NetUtil.createByteArrayFromIpAddressString
  null check (Netty already on classpath, returns null for invalid addresses);
  also removes now-unused io.vavr.control.Try import
- SearchHits (domain): remove unused org.elasticsearch.common.Nullable import
- ContentsWebAPI: remove unused org.elasticsearch.search.SearchHits import
  and dead variable declaration SearchHits hits = null

Part of #34610

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fabrizzio-dotCMS fabrizzio-dotCMS changed the base branch from master to main May 25, 2026 20:58
…se, add remove[]

The annotation's purpose is Phase 3 decommission review; encoding the phase
as a field was redundant. trackedIn is tracked in issue comments, not here.

New shape:
  @ESCoupled(reason = "...", remove = {"methodA", "methodB"})

- remove[] lists specific methods/inner classes to delete at Phase 3
- omit remove[] when the entire class is to be decommissioned
- Updated all 8 existing usages accordingly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added the Area : Backend PR changes Java/Maven backend code label May 25, 2026
…kflowAPIImpl

Replace vendor-specific QueryPhaseExecutionException instanceof check with
message-based isPaginationLimitReached() helper. Both ES and OS emit
"Result window is too large" when a query exceeds max_result_window (10 000),
so the detection is vendor-neutral. Removes the org.elasticsearch import
and the @ESCoupled annotation — WorkflowAPIImpl is now ES-free.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t null remoteAddr

NetUtil.createByteArrayFromIpAddressString throws NPE when passed null.
request.getRemoteAddr() can be null in test contexts. The original code
was protected by Try.of() which caught the NPE; the new code must null-check
before delegating to Netty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ormatBytes in TotalSizeSiteSearchIndicesMetricType

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…and formatBytes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exclude WorkflowAPIImpl from this PR — the QueryPhaseExecutionException
replacement will be handled in a dedicated PR that introduces a neutral
DotIndexWindowLimitException thrown by the factory layer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dotCMS dotCMS deleted a comment from claude Bot May 25, 2026
Remove @ESCoupled annotation and extra blank line — WorkflowAPIImpl
is excluded from this PR. Vendor-neutral exception handling tracked
separately in #35827.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fabrizzio-dotCMS fabrizzio-dotCMS removed the request for review from a team May 26, 2026 14:44
- Guard exception.error().reason() with Optional to avoid NPE when
  OpenSearchException carries no server-side error body (e.g. transport errors)
- Add testTruncationNotRounding to document that formatBytes truncates (string
  slice), not rounds: 1.99 GB → "1.9gb" is intentional

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dotCMS dotCMS deleted a comment from claude Bot May 26, 2026
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 26, 2026

Claude finished @fabrizzio-dotCMS's task in 3m 9s —— View job


Review

A few items worth a second look — the rest looks clean.


🔴 HttpRequestDataUtil.getIpAddress — silent fallback to 127.0.0.1 is a behavior/security regression

dotCMS/src/main/java/com/dotcms/util/HttpRequestDataUtil.java:78-84

final String addr = request.getRemoteAddr();
final byte[] parsed = addr != null ? NetUtil.createByteArrayFromIpAddressString(addr) : null;
final byte[] remoteAddr = parsed != null ? parsed : new byte[]{127, 0, 0, 1};
return InetAddress.getByAddress(remoteAddr);

Two differences from the previous Guava InetAddresses.forString(...) path:

  1. NetUtil.createByteArrayFromIpAddressString returns null on invalid input — it does not throw. The old code threw IllegalArgumentException.
  2. On null/invalid input, the new code returns InetAddress for 127.0.0.1 instead of letting the exception propagate.

The PR description calls this a "null-guard" but the fallback now also triggers on any unparseable string, not just null. This affects several security-relevant call sites:

  • MonitorHelper.isAccessGranted (MonitorHelper.java:105) uses this for ACL/CIDR checks. The localhost early-return at line 101 operates on the raw string (so a malformed value won't short-circuit there), but the parsed IP then becomes 127.0.0.1 — if the ACL allows loopback (very common), a malformed remoteAddr could effectively bypass the check. Previously, the exception was caught at line 112 → return false (deny). Now: probable allow.
  • Rules-engine conditionlets (UsersCountryConditionlet, VisitorsGeolocationConditionlet, DateTimeConditionlet) — geo/country lookups silently resolve to localhost instead of erroring out.
  • VisitorAPIImpl.lookupIPAddress previously returned null on bad input; now it returns a real-but-wrong InetAddress(127.0.0.1).

Whether reachable depends on the deployment (reverse-proxy header forwarding configs, RemoteIpValve, etc.), but the semantic change from "fail closed" → "fall back to loopback" deserves an explicit decision. Suggested fix: keep the null-guard, but on parse failure either re-throw UnknownHostException (the method already declares it) or return null/sentinel rather than 127.0.0.1.

Fix this →


🟠 WorkflowAPIImpl — error level downgraded for genuine failures

dotCMS/src/main/java/com/dotmarketing/portlets/workflows/business/WorkflowAPIImpl.java:2766-2778

The previous catch split traffic two ways: window-limit case at DEBUG (silent), everything else at ERROR. The new code collapses both into a single Logger.warnAndDebug.

  • Window-limit case went from silent DEBUG → WARN. That's louder in normal operation when paging beyond max_result_window.
  • Unexpected/genuine search bugs went from ERROR → WARN, losing visibility in production dashboards/alerts that key off ERROR.

The justification ("typed QPE never fires via REST client") is reasonable, but the level collapse has its own consequence. Consider: keep Logger.error for the generic branch (drop only the typed-instanceof, not the severity), or detect the window-limit by message substring once if WARN spam becomes a problem.


🟡 TotalSizeSiteSearchIndicesMetricType.format1Decimal — doc/behavior mismatch on edge values

dotCMS/src/main/java/com/dotcms/telemetry/collectors/sitesearch/TotalSizeSiteSearchIndicesMetricType.java:52-57

The string-slice approach truncates the second decimal but only strips the .0 suffix when String.valueOf(value) produces exactly one zero after the decimal. Example: 1.05 GBString.valueOf(1.05) = "1.05" → second-decimal-not-"0" → returns "1.0" → final output "1.0gb". The Javadoc claims "Trailing .0 decimals are dropped"; this case slips through.

Minor — not a regression — but the test testTruncationNotRounding documents the truncation behavior; consider adding a 1.05gb → "1gb" test case or noting the edge in the doc.


Minor

  • ESContentTool decommission note references "R7 dotEvergreen cutover (~Aug 18)" — fine in a tracking comment, but the year isn't pinned. If R7 slips, the comment ages poorly. Same for ContentletAPI.
  • OpenSearchExceptionMapper: switching from substring match ("index_not_found_exception" → 404) to exception.status() is the right call — OpenSearch returns 404 for that case natively. Good.
  • DotRunnableThread.ReindexActionListeners removal verified — no external references.
  • ContentsWebAPI dead hits variable, SearchHits unused @Nullable, CreateJsonWebTokenResource Map import swap — all safe.

Tasks

  • Read changed files
  • Analyze risks/edge cases
  • Post review
    · Branch

…COMMISSION comments

Remove ESCoupled.java annotation class and replace all usages with inline
// ES-DECOMMISSION: comments that preserve the decommission context and remain
greppable via: grep -r "ES-DECOMMISSION"

Affected classes (7):
- ESContentTool — bridge methods esSearch/esSearchRaw
- ElasticsearchStatusExceptionMapper — full class, replaced by OpenSearchExceptionMapper
- BulkActionListener — full class, replaced by IndexBulkListener
- ContentletAPI — deprecated method signatures esSearch/esSearchRaw
- ContentletAPIInterceptor — same as ContentletAPI
- ContentletAPIPostHook — hook methods for deprecated signatures
- ContentletAPIPreHook — hook methods for deprecated signatures

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…IImpl

Add ES-DECOMMISSION comment explaining why QueryPhaseExecutionException was
not migrated to OS 3.x in this branch: no typed equivalent exists in the
OpenSearch Java client — detection at this call-site would require message
parsing. Deferred to Phase 3 alongside ContentletAPI cleanup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fabrizzio-dotCMS fabrizzio-dotCMS changed the title feat(opensearch) #34610: @ESCoupled annotation, OpenSearch exception mapper, and neutral utility conversions feat(opensearch) #34610: ES-DECOMMISSION markers, OpenSearchExceptionMapper, and neutral utility conversions May 27, 2026
fabrizzio-dotCMS and others added 3 commits May 27, 2026 10:37
…ES-specific QPE branch

Replace the if/else that branched on QueryPhaseExecutionException with a single
generic Logger.warnAndDebug call covering both the window-limit and any other
unexpected search failure. Reason: the QPE branch never fires via the REST
client (which wraps all server errors as ElasticsearchStatusException), and no
typed OS equivalent exists in OpenSearch Java client 3.x. A comment documents
the decision to defer full vendor-neutral handling to Phase 3 at the factory layer.
Also removes the now-unused QPE import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[TASK] Migrate Utilities and Supporting Classes

1 participant