Migrate Schema Designer to JAX-RS Apis. Fix some bugs.#4203
Migrate Schema Designer to JAX-RS Apis. Fix some bugs.#4203epugh wants to merge 57 commits intoapache:mainfrom
Conversation
Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…mments Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…y docs guard) Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…ader, sanitize filename Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…/docs map for JS rendering Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR refactors the Schema Designer v2 API implementation to a Jersey-based resource + OpenAPI-defined interface, updates the admin UI download URL accordingly, and includes a small improvement to JSON-lines parsing to avoid unchecked-cast warnings.
Changes:
- Migrate
SchemaDesignerAPIfrom the legacy@EndPointstyle to aJerseyResourceimplementing a newSchemaDesignerApiinterface, and update tests to call the new method signatures. - Adjust the Schema Designer UI download endpoint URL to the new
/api/schema-designer/download?configSet=...form. - Narrow unchecked-cast suppression in
DefaultSampleDocumentsLoadervia a helper method usinginstanceof Map<?, ?>.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| solr/webapp/web/js/angular/controllers/schema-designer.js | Updates Schema Designer download URL to new endpoint shape. |
| solr/core/src/test/org/apache/solr/handler/designer/TestSchemaDesignerAPI.java | Reworks tests to call new Jersey-style API methods and response handling. |
| solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerSettings.java | Changes settings class visibility to public. |
| solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConstants.java | Removes unused constants related to old request-param handling. |
| solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerConfigSetHelper.java | Minor cleanup/modernization and removal of unused helper logic. |
| solr/core/src/java/org/apache/solr/handler/designer/SchemaDesignerAPI.java | Major migration to JerseyResource + SchemaDesignerApi, plus download/query response changes. |
| solr/core/src/java/org/apache/solr/handler/designer/SampleDocuments.java | Uses Stream.toList() instead of Collectors.toList(). |
| solr/core/src/java/org/apache/solr/handler/designer/DefaultSchemaSuggester.java | Adjusts field-prop guessing API/signature usage. |
| solr/core/src/java/org/apache/solr/handler/designer/DefaultSampleDocumentsLoader.java | Introduces parseStringToJson() helper and narrows unchecked-cast suppression. |
| solr/core/src/java/org/apache/solr/core/CoreContainer.java | Registers Schema Designer as a Jersey resource class. |
| solr/api/src/java/org/apache/solr/client/api/endpoint/SchemaDesignerApi.java | Adds new OpenAPI/JAX-RS interface defining Schema Designer endpoints. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| if (errorsDuringIndexing != null) { | ||
| Map<String, Object> response = new HashMap<>(); | ||
| rsp.setException( | ||
| new SolrException( | ||
| SolrException.ErrorCode.BAD_REQUEST, | ||
| "Failed to re-index sample documents after schema updated.")); | ||
| response.put(ERROR_DETAILS, errorsDuringIndexing); | ||
| rsp.getValues().addAll(response); | ||
| return; | ||
| throw new SolrException( | ||
| SolrException.ErrorCode.BAD_REQUEST, | ||
| "Failed to re-index sample documents after schema updated."); | ||
| } |
There was a problem hiding this comment.
query() used to return an error response containing errorDetails when re-indexing failed, which the Schema Designer UI surfaces (see its errorHandler). This new implementation throws a SolrException without attaching errorDetails, so clients will lose the per-doc failure details. Consider instantiating a FlexibleSolrJerseyResponse early and setting an errorDetails top-level property before throwing, or otherwise preserve the prior error payload structure.
| protected void requireSchemaVersion(Integer schemaVersion) { | ||
| if (schemaVersion == null) { | ||
| throw new SolrException( | ||
| SolrException.ErrorCode.BAD_REQUEST, | ||
| SCHEMA_VERSION_PARAM + " is a required parameter for the " + req.getPath() + " endpoint"); | ||
| SolrException.ErrorCode.BAD_REQUEST, SCHEMA_VERSION_PARAM + " is a required parameter!"); | ||
| } | ||
| return schemaVersion; | ||
| } |
There was a problem hiding this comment.
requireSchemaVersion() now only rejects null, but previously the API treated -1 as “missing” (since getInt(..., -1) was used) and rejected it. As written, a client can pass schemaVersion=-1 and bypass this validation; it would be safer to also reject negative values to keep the same contract.
…sentinel contract) Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…emadesignerapi-to-v2-annotations
… github.com:epugh/solr into copilot/migrate-schemadesignerapi-to-v2-annotations
|
Discussing with @gerlowskija this ticket, and he brings up the "Shouldn't this be more restFul"? Looking at it, |
…ner API Agent-Logs-Url: https://github.com/epugh/solr/sessions/3e11a7a9-caca-4333-8a6b-bbcd5a2cd862 Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
Agent-Logs-Url: https://github.com/epugh/solr/sessions/8f08e90a-243e-43eb-9aaf-070c2e9ed092 Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…ddenInConfigSets on ConfigSetService) Agent-Logs-Url: https://github.com/epugh/solr/sessions/8fcb8001-f311-4bcb-bf46-522ebd7a9d05 Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…e download+getFile to SchemaDesignerApi Agent-Logs-Url: https://github.com/epugh/solr/sessions/07fed0d8-5175-46b5-bff4-7111da9bb8c3 Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…emadesignerapi-to-v2-annotations
… DownloadConfigSet.zipConfigSet() Agent-Logs-Url: https://github.com/epugh/solr/sessions/fd2b245f-2e34-45e1-96fe-a5047e48206b Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
…ownload endpoint directly Agent-Logs-Url: https://github.com/epugh/solr/sessions/4f648089-c8e3-4fcd-9ac5-c55c3e9307df Co-authored-by: epugh <22395+epugh@users.noreply.github.com>
|
We took a diversion to update ConfigsetsAPI with an idea that we would reuse those APIs in the schema designer. Unfortunantly the only reusable API was teh "download a zipped configset". It turns out the other SchemaDesignerAPI endpoints that look like they could be replace with a ConfigsetsAPI call can't be because of Schema Designer specific business logic. I'll paste below two examples: listConfigs vs listConfigSets — cannot replacelistConfigSets returns ListConfigsetsResponse { List configSets } — a flat list of all raw configset names, including ._designer_films. listConfigs does designer-specific logic the JS UI depends on: Filters out system/excluded configsets and raw .designer* entries from the top-level list updateFileContents vs uploadConfigSetFile — cannot replaceuploadConfigSetFile (PUT /configsets/{configSetName}/{filePath}) is a general-purpose file upload — it writes bytes to a configset and returns a plain SolrJerseyResponse. updateFileContents does three critical designer-specific things that uploadConfigSetFile cannot do: solrconfig.xml pre-validation: Before saving, it loads and parses the XML using InMemoryResourceLoader / SolrConfig.readFromResourceLoader to catch errors before corrupting the configset. The error is returned in the response body (updateFileError + fileContent fields) without touching ZK. |
…emadesignerapi-to-v2-annotations
…hen I had to fix this. If we used Rails we wouldn't need this!
Appreciate your writeup on this, thanks @epugh ! If we were designing this from scratch I'd argue some of this stuff should be built into the configset APIs themselves as a general-purpose concept. "Enabled/disabled" in particular seems like a good candidate in that regard. The configset-validation/testing is a bit trickier or more special-purpose, but still... Anyway, all of that is definitely out of scope for this PR. But thinking aloud in terms of deduplicating this API stuff down the road... Starting in on review now, will aim to do that from a purely "migration" perspective and not worry about some of the API-duplication concerns I mentioned above... |
|
|
||
| /** V2 API definitions for the Solr Schema Designer. */ | ||
| @Path("/schema-designer") | ||
| public interface SchemaDesignerApi { |
There was a problem hiding this comment.
Rather than commenting on each method or annotation line here, I'll combine all my "cosmetic" suggestions into a single comment
Here's what the PR currently proposes in terms of JAX-RS APIs...
- GET /api/schema-designer/{configSet}/info
- POST /api/schema-designer/{configSet}/prep
- DELETE /api/schema-designer/{configSet}
- PUT /api/schema-designer/{configSet}/file
- GET /api/schema-designer/{configSet}/sample
- GET /api/schema-designer/{configSet}/collectionsForConfig
- GET /api/schema-designer/configs
- POST /api/schema-designer/{configSet}/add
- PUT /api/schema-designer/{configSet}/update
- PUT /api/schema-designer/{configSet}/publish
- POST /api/schema-designer/{configSet}/analyze
- GET /api/schema-designer/{configSet}/query
- GET /api/schema-designer/{configSet}/diff
My notes/suggestions:
GET /api/schema-designer/configs
I love the "plural noun" configs here. Very common REST-ful idiom or listing out resources, that we use throughout the v2 API (e.g. /api/collections, /api/aliases). IMO we should add this path-segments to all of these schema-designer APIs
POST /api/schema-designer/{configSet}/add
PUT /api/schema-designer/{configSet}/update
IMO it's more REST-ful and consistent with our other APIs to drop the /add and /update path segments. Doing PUT or POST at a particular path is a pretty common idiom for create/update. Dropping /add and /update would keep things more consistent with our other v2 CRUD APIs.
GET /api/schema-designer/{configSet}/info
Similarly, I'd suggest dropping /info from this path.
| throws Exception; | ||
|
|
||
| @GET | ||
| @Path("/{configSet}/collectionsForConfig") |
There was a problem hiding this comment.
[0] This API is kindof interesting - seemingly there's nothing schema-designer specific about it at all? You could imagine it moving to be a part of our existing "list-collections" API exposed by a query param, e.g. GET /api/collections?configSet=foo
Not work for this PR; just thinking aloud a bit....
There was a problem hiding this comment.
yeah, I was struggling to not "reimainge things", however I was chatting with Christos and he identified some things that the Solr UI would benefit for editing a configset...
| @Operation( | ||
| summary = "Update the contents of a file in a configSet being designed.", | ||
| tags = {"schema-designer"}) | ||
| SchemaDesignerResponse updateFileContents( |
There was a problem hiding this comment.
[Q/-1] Where are the actual file contents in this API definition? I assume they're in the request-body, but they're not represented in the method signature at all. Is that intentional?
There was a problem hiding this comment.
So, I realized how this happened... All of my testing was via "Fire up Solr, test out the schema designer UI", so course it wasn't using of the openapi annotations.... (Maybe someday we do have a SolrJS client).... So I never really ran into these problems. Just committed a revamp to properly support.
| summary = "Add a new field, field type, or dynamic field to the schema being designed.", | ||
| tags = {"schema-designer"}) | ||
| SchemaDesignerResponse addSchemaObject( | ||
| @PathParam("configSet") String configSet, @QueryParam("schemaVersion") Integer schemaVersion) |
There was a problem hiding this comment.
[Q/-1] Where is the "schema object" being modified or added? I assume it's in the request-body of the API. If that's the case it should be represented in the method signature here as an InputStream param or something similar.
There was a problem hiding this comment.
Like wise just committed revamp.e
| summary = "Update an existing field or field type in the schema being designed.", | ||
| tags = {"schema-designer"}) | ||
| SchemaDesignerResponse updateSchemaObject( | ||
| @PathParam("configSet") String configSet, @QueryParam("schemaVersion") Integer schemaVersion) |
There was a problem hiding this comment.
[Q/-1] Ditto, re: the schema-object itself isn't represented in the method signature at all.
| case "dynamicField", "add-dynamic-field" -> response.dynamicField = value; | ||
| case "fieldType", "add-field-type" -> response.fieldType = value; | ||
| default -> { | ||
| /* unknown action type — silently ignore */ |
There was a problem hiding this comment.
[-1] Let's throw an error or have an assertion or something here.
There was a problem hiding this comment.
Okay, dug in and found some old cases as well after looking at schema-designer.js...
| } | ||
| } | ||
|
|
||
| protected void addErrorToResponse( |
There was a problem hiding this comment.
[0] I'm a little confused about the purpose of this method. We're converting indexing errors into a response that can be returned to the user? It takes a SolrException, but also a Map of Throwables keyed off of generic Object?
I know this isn't public, but maybe it's worth some javadocs anyway as the signature is pretty confusing at a glance.
There was a problem hiding this comment.
its actually a important method because yeah, it capctures indexing errors and bring sthem back to the user. revamped docs
| protected String checkMutable(String configSet, SolrQueryRequest req) throws IOException { | ||
| void addSettingsToResponse( | ||
| SchemaDesignerSettings settings, final SchemaDesignerInfoResponse response) { | ||
| response.languages = settings.getLanguages(); |
There was a problem hiding this comment.
[-1] There's gotta be a better way to handle all of these method overloads. I'd guess this gets much much cleaner if we create a parent model class to contain all of these common properties that SchemaDesignerInfoResponse and friends extend from.
There was a problem hiding this comment.
Nice call.. I dug in and discovered that we actually had four emthods and one wasn't even used.. Refactored.
| * <p>Note: This test focuses on input validation. Full deletion workflow is tested in integration | ||
| * tests like {@code TestConfigSetsAPI} since actual deletion requires ZooKeeper interaction. | ||
| */ | ||
| public class DeleteConfigSetAPITest extends SolrTestCase { |
There was a problem hiding this comment.
[Q] This is a nice unit test and all, but why is it in this PR? This PR is about Schema-Designer and makes only the tiniest one-line change to DeleteConfigSet...
There was a problem hiding this comment.
yeah... I think i noticed the gap in one of the varioutions of all of this... I can seperate it out.
| @@ -0,0 +1,181 @@ | |||
| #!/usr/bin/env bats | |||
There was a problem hiding this comment.
[-0] Is there a particular reason to do a BATS test in this case? Is there functionality that can't be tested using a normal Java-based integration test?
IMO our default should be to cover things in a Java-based test, since the BATS tests can't be parallelized, run on Windows, etc.
There was a problem hiding this comment.
I pruned it back a lot... I think my thought on BATS is that it would cover the "we call multiple apis" use case.. But honestly, schema designer isnt' really something i care a lot about, and I can totally see removing this. I first wrote the bats test just to try and udnerstand the sequence of api calls.
There was a problem hiding this comment.
Going to just remove it as it sint' a clear need.
There was a problem hiding this comment.
It'd be nice to have some sort of documented "best practice" for when something gets unit tests, when it gets integration tests, when it gets BATS, etc. I think the unit/integration thing is a bit looser, but especially given the runtime of the BATS stuff it'd be nice to have clear consensus on.
Anyway, thanks for reconsidering!
|
|
||
| local http_code | ||
| http_code=$(curl -s -o "${zip_file}" -w "%{http_code}" \ | ||
| "http://localhost:${SOLR_PORT}/api/configsets/${mutable_id}/files?displayName=${DESIGNER_CONFIGSET}") |
There was a problem hiding this comment.
[-1] displayName? I don't see that as a query param on any of these schema-designer APIs, am I missing something? (Check for other occurrences in this file.)
There was a problem hiding this comment.
I went ahead and rmeoved the test
| @@ -0,0 +1,8 @@ | |||
| # See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc | |||
| title: Fixed a bug in the analyze sample documents feature of Schema Designer that prevented the designer from working. | |||
There was a problem hiding this comment.
[0] IMO this changelog entry isn't that informative. "Fixed a bug in the Schema Designer" doesn't really tell a user what the bug was, or whether it was one they were likely to run into. Consider wording it like the following:
Schema Designer sample-doc analysis now works correctly when
[0] In addition to the bug fix, this PR adds a ton of API coverage to SolrJ. Idk how many people care about our schema-designer APIs as they're so UI focused. But this is at least theoretically valuable to people...
There was a problem hiding this comment.
Yeah, if it's user oritented, then i think the solrj doesn't actually matter ;-(.
… left over from v1
…ing about this api that others don't have.
|
@gerlowskija I figured out why the annotaitons were NOT complete... All of my testing was either .bats or by interacting with the schema-designer.js file, which of course uses JS and JSON and doesn't use the SolrJ library, so that was all a blind spot. I've updated things.. I have one caveat that I need some help... the /analyze endpoint can take in csv, json, xml, jsonl... There is some "magic" in how it grabs the inputstream and parses it, and I can't figure out for the life of me how to get that generic inputstream that is "grabbed" documented. I think maybe we are the point to get this done that maybe a pairing session? |




Summary
Migrate the SchemaDesigner to JAX-RS. Fixed a bug in the analyze feature that wasn't working in
main.Changes
JAX-RS adoption, RESTful style routing, and code review. Had to change up the routes in the JavaScript to match.
The runtime behaviour is unchanged.