feat(skill): discourage multi-line stream lambdas #37
Merged
99859bd to 91694e9
martinfrancois commented Jun 23, 2026
b9521d8 to b9d5ff6
ce9a552 to e90b092
Discourage multiline stream lambdas, preserve requested Java artifacts and nested helper types, and keep implementation prompts on direct stream result production when the Java baseline allows it. Strengthen findFirst/findAny exception wording, bounded mapConcurrent guidance, Java 8 fallbacks, null collector handling, and parallel-stream review advice.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
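To make the first point of this commit concrete, here is a minimal sketch of the kind of rewrite the guidance appears to ask for: a block lambda holding branching and a temporary variable is replaced by one-line lambdas (here method references) that delegate to named helpers. The `Shipment` type, the `isOverdue` and `noticeFor` helpers, and the message format are hypothetical and are not taken from the skill or its eval scenarios.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class OverdueNotices {

    // Hypothetical domain type, used only for illustration.
    static final class Shipment {
        final String id;
        final int daysLate;

        Shipment(String id, int daysLate) {
            this.id = id;
            this.daysLate = daysLate;
        }
    }

    // Discouraged shape: a block lambda with branching and a temporary variable.
    static List<String> noticesWithBlockLambda(List<Shipment> shipments) {
        return shipments.stream()
                .filter(s -> s.daysLate > 0)
                .map(s -> {
                    String severity = s.daysLate > 7 ? "URGENT" : "LATE";
                    return severity + ": " + s.id + " (" + s.daysLate + "d)";
                })
                .collect(Collectors.toList());
    }

    // Preferred shape: the pipeline stays short glue, the logic lives in named helpers.
    static List<String> noticesWithHelper(List<Shipment> shipments) {
        return shipments.stream()
                .filter(OverdueNotices::isOverdue)
                .map(OverdueNotices::noticeFor)
                .collect(Collectors.toList());
    }

    static boolean isOverdue(Shipment shipment) {
        return shipment.daysLate > 0;
    }

    static String noticeFor(Shipment shipment) {
        String severity = shipment.daysLate > 7 ? "URGENT" : "LATE";
        return severity + ": " + shipment.id + " (" + shipment.daysLate + "d)";
    }

    public static void main(String[] args) {
        List<Shipment> shipments = Arrays.asList(
                new Shipment("A1", 0), new Shipment("B2", 3), new Shipment("C3", 10));
        System.out.println(noticesWithHelper(shipments));
    }
}
```

Both methods produce the same list; the second keeps each pipeline stage on one line and gives the branching logic a name that can be tested on its own. The sketch uses `Collectors.toList()` so that it also compiles on a Java 8 baseline.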
Add the overdue shipment reference scenario, update multiline-lambda and Java baseline criteria, and mark the uppercase performance review as an explicit skill invocation. Keep suite numbering and agent behavior notes aligned with the expanded reference and regression coverage.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
Update scripts/run_eval_suite.sh for the current Tessl eval CLI, let suite policy choose baseline versus with-context runs, pass --skill java-streams and --force for final readiness evidence, and avoid pinning Sonnet in default commands. Document that Tessl's default solver is used by default because explicit model selection is entitlement-gated; Sonnet 4.6 or better remains recommended when modelSelection is available for representative checks.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
Add an internal pre-submit gate that runs quality first, tracks scenario-level evidence by skill and scenario fingerprints, schedules impact and historical-risk probes, and broadens to balanced remaining batches only after targeted evidence is clean. Document the final hard requirement for runtime skill changes, including 100% quality and 100% with-context evidence across main, reference, and regression for the final skill bundle state.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
Record how future agents should handle Tessl or other CLI deprecation warnings: update runtime scripts to the replacement path, keep compatibility fallbacks when needed, and refresh agent-facing docs in the same change.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
87d33e2 to 5577ced
Switch the Skill Review workflow from the deprecated tessl skill review command to tessl review run with an explicit martinfrancois workspace so CI is non-interactive and uses the review path validated locally. Update the pull request checklist to match the workflow command.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
Require runtime-reference overlap to be classified with explicit evidence types instead of allowing free-form rationale text to bypass validation. Reclassify focused main and reference scenarios that intentionally overlap runtime guidance, and keep explicit invocation metadata aligned with prompts.
Co-Authored-By: marvinbuff <marvinbuff@hotmail.com>
Co-Authored-By: PReimers <preimers@pm.me>
Summary
$java-streams did not explicitly reject multi-line stream lambdas, including block lambdas, body-on-next-line lambdas, and wrapped nested stream callbacks.
Change Type
Choose all that apply.
Linked Issue
User-Visible Behavior
Users of $java-streams now get explicit guidance to keep stream lambdas as short glue and extract branching, loops, temporary variables, formatting, merge rules, and nested stream chains into named helpers. The hard-stop scan now also flags block lambdas, arrows whose body starts on the next line, and nested stream lambda bodies that continue on later lines.
The default hosted eval wrapper no longer pins Sonnet. It uses Tessl's current default solver unless an explicit --agent is supplied. The docs still recommend Sonnet 4.6 or a better frontier model when the account has model-selection entitlement and the goal is a more representative real-world check.
Bug Fix Details
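Two of the lambda shapes that the hard-stop scan is described as flagging under User-Visible Behavior can be approximated with a line-based check. The sketch below is an assumption about how such a scan might work, not the skill's actual implementation: the class name, method names, and regular expressions are hypothetical. It does not attempt the third shape (nested stream lambda bodies that continue on later lines), and a heuristic like this can produce false positives, for example on switch-expression arms such as `case X -> {`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class LambdaShapeScan {

    // Arrow followed by an opening brace: the start of a block lambda.
    private static final Pattern BLOCK_LAMBDA = Pattern.compile("->\\s*\\{");

    // Arrow as the last token on the line: the lambda body starts on the next line.
    private static final Pattern BODY_ON_NEXT_LINE = Pattern.compile("->\\s*$");

    // Returns one finding per offending line, formatted as "<line number>: <reason>".
    static List<String> scan(String source) {
        List<String> findings = new ArrayList<>();
        String[] lines = source.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            String line = stripLineComment(lines[i]);
            if (BLOCK_LAMBDA.matcher(line).find()) {
                findings.add((i + 1) + ": block lambda");
            } else if (BODY_ON_NEXT_LINE.matcher(line).find()) {
                findings.add((i + 1) + ": lambda body starts on the next line");
            }
        }
        return findings;
    }

    // Naive comment stripping; it does not handle "//" inside string literals.
    private static String stripLineComment(String line) {
        int idx = line.indexOf("//");
        return idx >= 0 ? line.substring(0, idx) : line;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "list.stream()",
                "    .map(x -> {",
                "        return x + 1;",
                "    })",
                "    .filter(x ->",
                "        x > 2)",
                "    .map(x -> x * 2)",
                "    .collect(Collectors.toList());");
        scan(source).forEach(System.out::println);
    }
}
```

A text scan of this kind is cheap to run as a gate but cannot see syntax, so a real check would likely need either a parser or an allowlist for the false-positive cases noted above.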
For bug fixes or regressions, explain why the issue happened and what now prevents it from coming back. For other changes, write N/A.
-> { block lambdas.
evals-reference/28-overdue-shipment-notices, added multi-line-lambda criteria to 05-parallel-cpu-review, 15-session-roster-indexes, and 22-java8-version-scan, broadened the hard-stop scan, and reran targeted hosted evals for those scenarios.
Validation
List the commands, manual checks, or hosted checks you ran. Include relevant failures that were fixed
during the PR.
Checks most contributors can run:
python3 scripts/validate_skill.py skills/java-streams
python3 scripts/validate_eval_criteria.py evals evals-reference evals-regression
python3 -m py_compile scripts/*.py
bash -n scripts/*.sh
tessl plugin lint .
Tessl-authenticated checks:
bash scripts/check_publish_dry_run.sh .
tessl plugin publish --dry-run --bump patch .
tessl skill review --threshold 100 skills/java-streams/SKILL.md, if skill text or references changed
scripts/run_eval_suite.sh <main|reference> <scenario-name>, if skill behavior or those evals changed
scripts/run_eval_suite.sh regression <scenario-name>, if regression evals changed
scripts/run_eval_suite.sh reference was run after the final runtime-context change, or the PR links the blocker issue
scripts/run_eval_suite.sh regression was run after the final runtime-context change, or the PR links the blocker issue
scripts/classify_eval_result.py <run-json> --scenario-dir <scenario-dir>, if a scenario was added or moved between suites
scripts/run_eval_suite.sh main, if benchmark claims changed or targeted with-context results are clean
bash scripts/check_publish_dry_run.sh ., tessl skill review, and hosted Tessl evals require Tessl authentication. Hosted evals also require a linked Tessl project. If you can't run one of them, leave it unchecked and explain why in the details.
Details:
Human Verification
Describe what you tried manually and what result you saw. If the change cannot be tried manually,
explain why.
Review Checklist
docs/agents/workflow.md, or any Tessl blocker is documented.
AI Assistance (if used)
AI prompts / session logs (optional)