[ENH] V1 -> V2 Migration : Runs#1616
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1616 +/- ##
==========================================
- Coverage 54.64% 54.04% -0.60%
==========================================
Files 63 63
Lines 5124 5173 +49
==========================================
- Hits 2800 2796 -4
- Misses 2324 2377 +53 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…u040/openml-python into runs-migration-stacked # Conflicts: # openml/_api/resources/__init__.py # openml/_api/runtime/core.py
…into runs-migration-stacked
…into runs-migration-stacked
geetu040
left a comment
There was a problem hiding this comment.
sync with base pr
sdk code look good so far, please take a look at #1575 (comment) and make changes accordingly where needed.
all tests (existing and new) should pass to make sure we are retaining the original functionality of the sdk
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…into runs-migration-stacked
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…-11/openml-python into runs-migration-stacked
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
| use_cache = not ignore_cache | ||
| reset_cache = ignore_cache | ||
| return api_context.backend.runs.get( | ||
| run_id, | ||
| use_cache=use_cache, | ||
| reset_cache=reset_cache, | ||
| ) |
There was a problem hiding this comment.
use_cache should be true since the method always supports caching
reset_cache should rely on ignore_cache
|
Hi @geetu040 . As per discussion in final mentoring meeting isn't it ready to merge ? |
geetu040
left a comment
There was a problem hiding this comment.
overall this is close to getting merged, just a few review comments
There was a problem hiding this comment.
following on the last unresolved review comment https://github.com/openml/openml-python/pull/1616/changes#r2988760874
what are these changes? is it a merge fault?
There was a problem hiding this comment.
This was because of this https://github.com/openml/openml-python/actions/runs/25209014637/job/73915466617?pr=1616 . I had updated it . Should I look for a alternate solution ?
Co-authored-by: Armaghan Shakir <raoarmaghanshakir040@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…-11/openml-python into runs-migration-stacked
…rver Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
|
Hi @geetu040 !! Can you check the replies to reviews and the failing tests and let me know if any changes needed ? |
| def test_run_v1_get(run_v1, with_test_cache): | ||
| try: | ||
| run = run_v1.get(run_id=1) | ||
| except OpenMLServerException as e: | ||
| if e.code == 236 or "Run not found" in str(e): | ||
| run = run_v1.get(run_id=25) | ||
| else: | ||
| raise | ||
| _assert_run_shape(run) | ||
|
|
There was a problem hiding this comment.
why simply run_v1.get(run_id=1) doesn't work?
There was a problem hiding this comment.
Because Run Id of 1 is not present in test server and run Id of 25 is not present in local server . It is bit weird I tried Finding common ID Manually but couldn't find it .
There was a problem hiding this comment.
Okay, I remember it now, I mentioned this to @PGijsbers on slack here.
There was a problem hiding this comment.
Didn't we have a way to check whether a local or non-local server configured is being used?
Then I would prefer to use that e.g.,
run_id = 25 if openml config is local else 1
That embeds this knowledge into the code so it's clear for future maintainers.
We probably do not have the time to address this on our end for a while longer :(
There was a problem hiding this comment.
we could actually use the env var OPENML_USE_LOCAL_SERVICES, that should be fine
geetu040
left a comment
There was a problem hiding this comment.
Nicely done @Omswastik-11.
@PGijsbers, could you please review/merge this PR when you get a chance?
There is currently one issue caused by differences between the test-server and local-server database entities, which is temporarily patched here: #1616 (comment).
I had mentioned this earlier on Slack as well here, we can continue discussion there
PGijsbers
left a comment
There was a problem hiding this comment.
small changes or clarifications requested, please see comments.
| path_parts = parsed_url.path.strip("/").split("/") | ||
|
|
||
| filtered_params = {k: v for k, v in params.items() if k != "api_key"} | ||
| params_part = [urlencode(filtered_params)] if filtered_params else [] |
There was a problem hiding this comment.
This is a good remark, but seeing as this code isn't touched by this PR, I would advocate fixing this in a separate PR.
| if response.content.startswith(b"PK\x03\x04"): | ||
| return "body.zip" | ||
|
|
||
| try: | ||
| arff.loads(response.text) | ||
| return "body.arff" | ||
| except arff.ArffException: | ||
| pass |
There was a problem hiding this comment.
Is there no HTTP header data that would allow us to tell what the content (and file name) should be?
Otherwise, at least for ARFF, the spec states that the first non-comment line of the file should be (not case sensitive): @relation <relation name>. So we could look for that instead of parsing the entire file content.
There was a problem hiding this comment.
Is there no HTTP header data that would allow us to tell what the content (and file name) should be?
I tried but didn't find anything
Otherwise, at least for ARFF, the spec states that the first non-comment line of the file should be (not case sensitive):
@relation <relation name>.
sounds good I could give this a try
| OpenMLHashException | ||
| If checksum verification fails. | ||
| """ | ||
| url = urljoin(self.server, path) |
There was a problem hiding this comment.
ignore: If this isn't the case already, this should be normalized when openml.config.server is set, not each site which uses it.
| candidates = [] | ||
| for p in path.iterdir(): | ||
| if p.name.startswith("body.") and len(p.suffixes) == 1: | ||
| candidates.append(p) |
There was a problem hiding this comment.
| candidates = [] | |
| for p in path.iterdir(): | |
| if p.name.startswith("body.") and len(p.suffixes) == 1: | |
| candidates.append(p) | |
| candidates = [p for p in path.glob("body.*") if len(p.suffixes) == 1] |
see glob
| if use_api_key: | ||
| params["api_key"] = self.api_key | ||
|
|
||
| if method.upper() in {"POST", "PUT", "PATCH"}: | ||
| data = {**params, **data} |
There was a problem hiding this comment.
ignore: It raises an exception if api_key is None, it's the statement preceding this line..
| self, | ||
| limit: int, | ||
| offset: int, | ||
| *, | ||
| ids: builtins.list[int] | None = None, |
There was a problem hiding this comment.
please address or explain; i see you have dismissed previous comments about this so presumably there is a reason?
|
|
||
| # Fall back to generic oml:id (used by other resources) | ||
| if "oml:id" in root_value: | ||
| return int(root_value["oml:id"]) |
There was a problem hiding this comment.
If run responses always return oml:run_id, when do we expect this code path to be correct to run?
There was a problem hiding this comment.
@Omswastik-11 since this method is overriden for runs, we shouldn't expect to handle other resources here, therefore logically this path should be unreachable as Pieter has said
| def test_run_v1_get(run_v1, with_test_cache): | ||
| try: | ||
| run = run_v1.get(run_id=1) | ||
| except OpenMLServerException as e: | ||
| if e.code == 236 or "Run not found" in str(e): | ||
| run = run_v1.get(run_id=25) | ||
| else: | ||
| raise | ||
| _assert_run_shape(run) | ||
|
|
There was a problem hiding this comment.
Didn't we have a way to check whether a local or non-local server configured is being used?
Then I would prefer to use that e.g.,
run_id = 25 if openml config is local else 1
That embeds this knowledge into the code so it's clear for future maintainers.
We probably do not have the time to address this on our end for a while longer :(
|
@Omswastik-11 could you go through the above comments, we'd need to close these discussions. |
Metadata
Details
fixes #1624