Skip to content

Migrate Schema Designer to JAX-RS Apis. Fix some bugs.#4203

Open
epugh wants to merge 57 commits intoapache:mainfrom
epugh:copilot/migrate-schemadesignerapi-to-v2-annotations
Open

Migrate Schema Designer to JAX-RS Apis. Fix some bugs.#4203
epugh wants to merge 57 commits intoapache:mainfrom
epugh:copilot/migrate-schemadesignerapi-to-v2-annotations

Conversation

@epugh
Copy link
Copy Markdown
Contributor

@epugh epugh commented Mar 10, 2026

Summary

Migrate the SchemaDesigner to JAX-RS. Fixed a bug in the analyze feature that wasn't working in main.

Changes

JAX-RS adoption, RESTful style routing, and code review. Had to change up the routes in the JavaScript to match.

The runtime behaviour is unchanged.

Copilot AI and others added 8 commits February 22, 2026 14:50
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…mments

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…y docs guard)

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…ader, sanitize filename

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…/docs map for JS rendering

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
@epugh epugh requested a review from Copilot March 10, 2026 09:50
@epugh epugh changed the title Fix unchecked cast warning in DefaultSampleDocumentsLoader Migrate Schema Designer to Ja Mar 10, 2026
@epugh epugh changed the title Migrate Schema Designer to Ja Migrate Schema Designer to JAX-RS Apis. Fix some bugs. Mar 10, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the Schema Designer v2 API implementation to a Jersey-based resource + OpenAPI-defined interface, updates the admin UI download URL accordingly, and includes a small improvement to JSON-lines parsing to avoid unchecked-cast warnings.

Changes:

  • Migrate SchemaDesignerAPI from the legacy @EndPoint style to a JerseyResource implementing a new SchemaDesignerApi interface, and update tests to call the new method signatures.
  • Adjust the Schema Designer UI download endpoint URL to the new /api/schema-designer/download?configSet=... form.
  • Narrow unchecked-cast suppression in DefaultSampleDocumentsLoader via a helper method using instanceof Map<?, ?>.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
solr/webapp/web/js/angular/controllers/schema-designer.js Updates Schema Designer download URL to new endpoint shape.
solr/core/src/test/org/apache/solr/handler/designer/TestSchemaDesignerAPI.java Reworks tests to call new Jersey-style API methods and response handling.
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerSettings.java Changes settings class visibility to public.
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConstants.java Removes unused constants related to old request-param handling.
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java Minor cleanup/modernization and removal of unused helper logic.
solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerAPI.java Major migration to JerseyResource + SchemaDesignerApi, plus download/query response changes.
solr/core/src/java/org/apache/solr/handler/designer/SampleDocuments.java Uses Stream.toList() instead of Collectors.toList().
solr/core/src/java/org/apache/solr/handler/designer/DefaultSchemaSuggester.java Adjusts field-prop guessing API/signature usage.
solr/core/src/java/org/apache/solr/handler/designer/DefaultSampleDocumentsLoader.java Introduces parseStringToJson() helper and narrows unchecked-cast suppression.
solr/core/src/java/org/apache/solr/core/CoreContainer.java Registers Schema Designer as a Jersey resource class.
solr/api/src/java/org/apache/solr/client/api/endpoint/SchemaDesignerApi.java Adds new OpenAPI/JAX-RS interface defining Schema Designer endpoints.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread solr/webapp/web/js/angular/controllers/schema-designer.js
Comment on lines 822 to 826
if (errorsDuringIndexing != null) {
Map<String, Object> response = new HashMap<>();
rsp.setException(
new SolrException(
SolrException.ErrorCode.BAD_REQUEST,
"Failed to re-index sample documents after schema updated."));
response.put(ERROR_DETAILS, errorsDuringIndexing);
rsp.getValues().addAll(response);
return;
throw new SolrException(
SolrException.ErrorCode.BAD_REQUEST,
"Failed to re-index sample documents after schema updated.");
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query() used to return an error response containing errorDetails when re-indexing failed, which the Schema Designer UI surfaces (see its errorHandler). This new implementation throws a SolrException without attaching errorDetails, so clients will lose the per-doc failure details. Consider instantiating a FlexibleSolrJerseyResponse early and setting an errorDetails top-level property before throwing, or otherwise preserve the prior error payload structure.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I dug in and found some bugs and fixed them in how we surface errors.
image

Comment on lines +1349 to 1354
protected void requireSchemaVersion(Integer schemaVersion) {
if (schemaVersion == null) {
throw new SolrException(
SolrException.ErrorCode.BAD_REQUEST,
SCHEMA_VERSION_PARAM + " is a required parameter for the " + req.getPath() + " endpoint");
SolrException.ErrorCode.BAD_REQUEST, SCHEMA_VERSION_PARAM + " is a required parameter!");
}
return schemaVersion;
}
Copy link

Copilot AI Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requireSchemaVersion() now only rejects null, but previously the API treated -1 as “missing” (since getInt(..., -1) was used) and rejected it. As written, a client can pass schemaVersion=-1 and bypass this validation; it would be safer to also reject negative values to keep the same contract.

Copilot uses AI. Check for mistakes.
Comment thread solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerSettings.java Outdated
Comment thread solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerAPI.java Outdated
@epugh epugh requested review from dsmiley and janhoy March 10, 2026 10:43
Copilot AI and others added 3 commits March 19, 2026 11:57
…sentinel contract)

Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
… github.com:epugh/solr into copilot/migrate-schemadesignerapi-to-v2-annotations
@epugh
Copy link
Copy Markdown
Contributor Author

epugh commented Mar 21, 2026

Discussing with @gerlowskija this ticket, and he brings up the "Shouldn't this be more restFul"? Looking at it, configSet is a query param in ALL of these, and really should be part of the path param.

@epugh
Copy link
Copy Markdown
Contributor Author

epugh commented Mar 30, 2026

image image

@epugh epugh requested a review from gerlowskija March 30, 2026 10:10
@epugh
Copy link
Copy Markdown
Contributor Author

epugh commented Apr 24, 2026

We took a diversion to update ConfigsetsAPI with an idea that we would reuse those APIs in the schema designer. Unfortunantly the only reusable API was teh "download a zipped configset". It turns out the other SchemaDesignerAPI endpoints that look like they could be replace with a ConfigsetsAPI call can't be because of Schema Designer specific business logic.

I'll paste below two examples:

listConfigs vs listConfigSets — cannot replace

listConfigSets returns ListConfigsetsResponse { List configSets } — a flat list of all raw configset names, including ._designer_films.

listConfigs does designer-specific logic the JS UI depends on:

Filters out system/excluded configsets and raw .designer* entries from the top-level list
Maps ._designer_X → X (strips the prefix) and marks it as status=0 (draft in progress)
Returns Map<String, Integer> where 0=draft, 1=designer-disabled, 2=enabled — these status integers drive the color coding and filtering in schema-designer.js
listConfigs must stay.

updateFileContents vs uploadConfigSetFile — cannot replace

uploadConfigSetFile (PUT /configsets/{configSetName}/{filePath}) is a general-purpose file upload — it writes bytes to a configset and returns a plain SolrJerseyResponse.

updateFileContents does three critical designer-specific things that uploadConfigSetFile cannot do:

solrconfig.xml pre-validation: Before saving, it loads and parses the XML using InMemoryResourceLoader / SolrConfig.readFromResourceLoader to catch errors before corrupting the configset. The error is returned in the response body (updateFileError + fileContent fields) without touching ZK.
Temp collection reload: After saving, calls configSetHelper.reloadTempCollection(mutableId, false).
Full schema response: Re-indexes sample docs and returns a complete SchemaDesignerResponse (schema fields, settings, doc count, analysis errors).
Name blocker: uploadConfigSetFile calls SolrIdentifierValidator.validateConfigSetName() which rejects names starting with . like ._designer_films, so it literally cannot be used for designer draft configsets.
updateFileContents must stay.

@epugh
Copy link
Copy Markdown
Contributor Author

epugh commented Apr 24, 2026

I just discovered you can edit a file in a configset via the schemadesigner!
image
I look forward to having a better edit via Solr UI.

@gerlowskija
Copy link
Copy Markdown
Contributor

gerlowskija commented May 7, 2026

other SchemaDesignerAPI endpoints that look like they could be replace with a ConfigsetsAPI call can't be because of Schema Designer specific business logic.

Appreciate your writeup on this, thanks @epugh !

If we were designing this from scratch I'd argue some of this stuff should be built into the configset APIs themselves as a general-purpose concept. "Enabled/disabled" in particular seems like a good candidate in that regard. The configset-validation/testing is a bit trickier or more special-purpose, but still...

Anyway, all of that is definitely out of scope for this PR. But thinking aloud in terms of deduplicating this API stuff down the road...

Starting in on review now, will aim to do that from a purely "migration" perspective and not worry about some of the API-duplication concerns I mentioned above...


/** V2 API definitions for the Solr Schema Designer. */
@Path("/schema-designer")
public interface SchemaDesignerApi {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than commenting on each method or annotation line here, I'll combine all my "cosmetic" suggestions into a single comment

Here's what the PR currently proposes in terms of JAX-RS APIs...

  1. GET /api/schema-designer/{configSet}/info
  2. POST /api/schema-designer/{configSet}/prep
  3. DELETE /api/schema-designer/{configSet}
  4. PUT /api/schema-designer/{configSet}/file
  5. GET /api/schema-designer/{configSet}/sample
  6. GET /api/schema-designer/{configSet}/collectionsForConfig
  7. GET /api/schema-designer/configs
  8. POST /api/schema-designer/{configSet}/add
  9. PUT /api/schema-designer/{configSet}/update
  10. PUT /api/schema-designer/{configSet}/publish
  11. POST /api/schema-designer/{configSet}/analyze
  12. GET /api/schema-designer/{configSet}/query
  13. GET /api/schema-designer/{configSet}/diff

My notes/suggestions:

GET /api/schema-designer/configs

I love the "plural noun" configs here. Very common REST-ful idiom or listing out resources, that we use throughout the v2 API (e.g. /api/collections, /api/aliases). IMO we should add this path-segments to all of these schema-designer APIs

POST /api/schema-designer/{configSet}/add

PUT /api/schema-designer/{configSet}/update

IMO it's more REST-ful and consistent with our other APIs to drop the /add and /update path segments. Doing PUT or POST at a particular path is a pretty common idiom for create/update. Dropping /add and /update would keep things more consistent with our other v2 CRUD APIs.

GET /api/schema-designer/{configSet}/info

Similarly, I'd suggest dropping /info from this path.

throws Exception;

@GET
@Path("/{configSet}/collectionsForConfig")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] This API is kindof interesting - seemingly there's nothing schema-designer specific about it at all? You could imagine it moving to be a part of our existing "list-collections" API exposed by a query param, e.g. GET /api/collections?configSet=foo

Not work for this PR; just thinking aloud a bit....

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I was struggling to not "reimainge things", however I was chatting with Christos and he identified some things that the Solr UI would benefit for editing a configset...

@Operation(
summary = "Update the contents of a file in a configSet being designed.",
tags = {"schema-designer"})
SchemaDesignerResponse updateFileContents(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q/-1] Where are the actual file contents in this API definition? I assume they're in the request-body, but they're not represented in the method signature at all. Is that intentional?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I realized how this happened... All of my testing was via "Fire up Solr, test out the schema designer UI", so course it wasn't using of the openapi annotations.... (Maybe someday we do have a SolrJS client).... So I never really ran into these problems. Just committed a revamp to properly support.

summary = "Add a new field, field type, or dynamic field to the schema being designed.",
tags = {"schema-designer"})
SchemaDesignerResponse addSchemaObject(
@PathParam("configSet") String configSet, @QueryParam("schemaVersion") Integer schemaVersion)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q/-1] Where is the "schema object" being modified or added? I assume it's in the request-body of the API. If that's the case it should be represented in the method signature here as an InputStream param or something similar.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like wise just committed revamp.e

summary = "Update an existing field or field type in the schema being designed.",
tags = {"schema-designer"})
SchemaDesignerResponse updateSchemaObject(
@PathParam("configSet") String configSet, @QueryParam("schemaVersion") Integer schemaVersion)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q/-1] Ditto, re: the schema-object itself isn't represented in the method signature at all.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep..

case "dynamicField", "add-dynamic-field" -> response.dynamicField = value;
case "fieldType", "add-field-type" -> response.fieldType = value;
default -> {
/* unknown action type — silently ignore */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-1] Let's throw an error or have an assertion or something here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, dug in and found some old cases as well after looking at schema-designer.js...

}
}

protected void addErrorToResponse(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] I'm a little confused about the purpose of this method. We're converting indexing errors into a response that can be returned to the user? It takes a SolrException, but also a Map of Throwables keyed off of generic Object?

I know this isn't public, but maybe it's worth some javadocs anyway as the signature is pretty confusing at a glance.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its actually a important method because yeah, it capctures indexing errors and bring sthem back to the user. revamped docs

protected String checkMutable(String configSet, SolrQueryRequest req) throws IOException {
void addSettingsToResponse(
SchemaDesignerSettings settings, final SchemaDesignerInfoResponse response) {
response.languages = settings.getLanguages();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-1] There's gotta be a better way to handle all of these method overloads. I'd guess this gets much much cleaner if we create a parent model class to contain all of these common properties that SchemaDesignerInfoResponse and friends extend from.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice call.. I dug in and discovered that we actually had four emthods and one wasn't even used.. Refactored.

* <p>Note: This test focuses on input validation. Full deletion workflow is tested in integration
* tests like {@code TestConfigSetsAPI} since actual deletion requires ZooKeeper interaction.
*/
public class DeleteConfigSetAPITest extends SolrTestCase {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Q] This is a nice unit test and all, but why is it in this PR? This PR is about Schema-Designer and makes only the tiniest one-line change to DeleteConfigSet...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... I think i noticed the gap in one of the varioutions of all of this... I can seperate it out.

@@ -0,0 +1,181 @@
#!/usr/bin/env bats
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-0] Is there a particular reason to do a BATS test in this case? Is there functionality that can't be tested using a normal Java-based integration test?

IMO our default should be to cover things in a Java-based test, since the BATS tests can't be parallelized, run on Windows, etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pruned it back a lot... I think my thought on BATS is that it would cover the "we call multiple apis" use case.. But honestly, schema designer isnt' really something i care a lot about, and I can totally see removing this. I first wrote the bats test just to try and udnerstand the sequence of api calls.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to just remove it as it sint' a clear need.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to have some sort of documented "best practice" for when something gets unit tests, when it gets integration tests, when it gets BATS, etc. I think the unit/integration thing is a bit looser, but especially given the runtime of the BATS stuff it'd be nice to have clear consensus on.

Anyway, thanks for reconsidering!


local http_code
http_code=$(curl -s -o "${zip_file}" -w "%{http_code}" \
"http://localhost:${SOLR_PORT}/api/configsets/${mutable_id}/files?displayName=${DESIGNER_CONFIGSET}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[-1] displayName? I don't see that as a query param on any of these schema-designer APIs, am I missing something? (Check for other occurrences in this file.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and rmeoved the test

@@ -0,0 +1,8 @@
# See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc
title: Fixed a bug in the analyze sample documents feature of Schema Designer that prevented the designer from working.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] IMO this changelog entry isn't that informative. "Fixed a bug in the Schema Designer" doesn't really tell a user what the bug was, or whether it was one they were likely to run into. Consider wording it like the following:

Schema Designer sample-doc analysis now works correctly when

[0] In addition to the bug fix, this PR adds a ton of API coverage to SolrJ. Idk how many people care about our schema-designer APIs as they're so UI focused. But this is at least theoretically valuable to people...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if it's user oritented, then i think the solrj doesn't actually matter ;-(.

@epugh
Copy link
Copy Markdown
Contributor Author

epugh commented May 9, 2026

@gerlowskija I figured out why the annotaitons were NOT complete... All of my testing was either .bats or by interacting with the schema-designer.js file, which of course uses JS and JSON and doesn't use the SolrJ library, so that was all a blind spot.

I've updated things.. I have one caveat that I need some help... the /analyze endpoint can take in csv, json, xml, jsonl... There is some "magic" in how it grabs the inputstream and parses it, and I can't figure out for the life of me how to get that generic inputstream that is "grabbed" documented. I think maybe we are the point to get this done that maybe a pairing session?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants