
[codex] add ERNIE image diffusers runner #1115

Closed
HanFa wants to merge 2 commits into ModelTC:main from sutro-planet:feature/ernie-image-diffusers



HanFa commented Jun 3, 2026


Summary

Adds ERNIE-Image text-to-image support through a Diffusers-backed LightX2V runner.

This PR introduces:

  • ernie_image configs and model pipeline registration
  • ErnieImageRunner with a decomposed generation path instead of defaulting to ErnieImagePipeline.__call__
  • LightX2V wrappers for component access, text/PE encoding, transformer inference, scheduler state, and VAE decode
  • progress callback integration, image save/tensor return handling, CPU offload, and unload_modules cleanup
  • focused unit tests for shape handling, PE/CFG, scheduler/VAE behavior, component loading, progress, save, tensor return, and unload cleanup
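
As a rough illustration of what a "decomposed generation path" with progress reporting can look like, here is a minimal sketch. It is not the PR's implementation: `run_decomposed`, its three callables, and the `(percent, total_steps)` callback signature are all hypothetical stand-ins, and plain Python lists take the place of tensors.

```python
from typing import Callable, List, Optional, Sequence


def run_decomposed(
    encode: Callable[[str], List[float]],
    denoise_step: Callable[[List[float], List[float], float], List[float]],
    decode: Callable[[List[float]], List[float]],
    prompt: str,
    initial_latents: List[float],
    timesteps: Sequence[float],
    progress: Optional[Callable[[float, int], None]] = None,
) -> List[float]:
    """Run encode -> per-step denoise -> decode as separate stages.

    All three callables are hypothetical stand-ins for the text encoder,
    the transformer plus scheduler step, and the VAE decoder.
    """
    context = encode(prompt)
    latents = initial_latents
    total = len(timesteps)
    for index, t in enumerate(timesteps, start=1):
        latents = denoise_step(latents, context, t)
        if progress is not None:
            # Assumed callback shape: (percent complete, total step count).
            progress(100.0 * index / total, total)
    return decode(latents)
```

Splitting the call this way means each stage can be wrapped, offloaded, or timed on its own, which a single opaque pipeline call does not allow.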

Scope

This is still intentionally Diffusers-backed. It uses diffusers.ErnieImagePipeline.from_pretrained as the underlying loader and then routes runtime access through LightX2V wrapper boundaries. It does not yet implement native ERNIE weight mapping, native transformer infer classes, quantization, LoRA, or distributed execution.
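
The idea of routing runtime access through a wrapper boundary, and of releasing everything on unload, can be sketched as follows. `ComponentContainer` and its method names are hypothetical and do not come from the PR; the components here are arbitrary Python objects rather than real Diffusers modules.

```python
import gc
from typing import Any, Dict, List


class ComponentContainer:
    """Hypothetical single access point for pipeline components."""

    def __init__(self, components: Dict[str, Any]) -> None:
        # Copy so later changes to the caller's dict do not leak in.
        self._components = dict(components)

    def get(self, name: str) -> Any:
        """Return a loaded component or fail with a clear message."""
        try:
            return self._components[name]
        except KeyError:
            raise KeyError(f"component {name!r} is not loaded") from None

    @property
    def loaded(self) -> List[str]:
        """Names of the components currently held, sorted."""
        return sorted(self._components)

    def unload(self) -> None:
        """Drop every reference so the objects can be garbage collected."""
        self._components.clear()
        gc.collect()
```

With real GPU modules one would typically also release cached device memory after dropping the references; that step is omitted here to keep the sketch dependency-free.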

Validation

Local checks:

  • python -m unittest test_cases.test_ernie_image_runner -> 17 tests OK
  • ruff check configs/ernie_image lightx2v/infer.py lightx2v/pipeline.py lightx2v/utils/set_config.py lightx2v/models/runners/ernie_image lightx2v/models/input_encoders/hf/ernie_image lightx2v/models/networks/ernie_image lightx2v/models/schedulers/ernie_image lightx2v/models/video_encoders/hf/ernie_image test_cases/test_ernie_image_runner.py -> all checks passed
  • python -m py_compile ... for the ERNIE runner/wrapper/test files -> passed

H100 smoke regression after the component-container migration:

  • PE off, no offload: LightX2V runner vs direct Diffusers was pixel-identical, MSE 0.0, max abs diff 0
  • PE off, CPU offload: LightX2V runner vs direct Diffusers was pixel-identical, MSE 0.0, max abs diff 0
  • PE on, CPU offload + unload_modules=true + tensor return: generation completed, progress reached (100.0, 100), and pipe/components/model/text_encoder/scheduler/vae were all unloaded
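
For readers who want to reproduce a pixel-identity check of this kind, the two reported metrics can be computed as in the sketch below. It assumes both images have already been flattened to raw 8-bit bytes (for example with Pillow's `Image.tobytes()`); `compare_pixels` is a hypothetical helper, not the script used for the H100 runs.

```python
from typing import Tuple


def compare_pixels(a: bytes, b: bytes) -> Tuple[float, int]:
    """Return (mean squared error, max absolute difference) for two
    equally sized buffers of 8-bit pixel values."""
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)} bytes")
    if not a:
        raise ValueError("buffers are empty")
    # Iterating over bytes yields ints in 0..255.
    diffs = [abs(x - y) for x, y in zip(a, b)]
    mse = sum(d * d for d in diffs) / len(diffs)
    return mse, max(diffs)
```

Two outputs are pixel-identical exactly when this returns `(0.0, 0)`.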

gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request introduces support for the ERNIE-Image text-to-image model by adding dedicated configuration files, a text encoder, a transformer model wrapper, a scheduler, a VAE decoder, and a runner pipeline, along with comprehensive unit tests. The review feedback highlights three key areas for improvement: resolving a potential runtime dtype mismatch in the VAE decoder by explicitly casting batch norm statistics to the latent's dtype, replacing an assertion with a proper ValueError for runtime validation in the runner, and correcting an inconsistent variable reference from model_cls to self.model_cls in the pipeline initialization.
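
The second point, preferring an explicit exception over `assert` for runtime validation, matters because assertions are removed when Python runs with the `-O` flag. A minimal sketch of the pattern follows; the function name, the checked condition, and the default `multiple` are hypothetical, since the actual assertion in the runner is not shown in this conversation.

```python
def check_image_size(height: int, width: int, multiple: int = 16) -> None:
    """Validate a requested output size (hypothetical check).

    Raises ValueError instead of using assert, so the check still runs
    when Python is started with -O.
    """
    if height <= 0 or width <= 0:
        raise ValueError(f"height and width must be positive, got {height}x{width}")
    if height % multiple != 0 or width % multiple != 0:
        raise ValueError(
            f"height and width must be multiples of {multiple}, got {height}x{width}"
        )
```

A caller can then catch `ValueError` and report a configuration error, rather than receiving an `AssertionError` or, under `-O`, no error at all.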

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread lightx2v/models/video_encoders/hf/ernie_image/vae.py Outdated
Comment thread lightx2v/models/runners/ernie_image/ernie_image_runner.py Outdated
Comment thread lightx2v/pipeline.py Outdated
@HanFa HanFa marked this pull request as ready for review June 3, 2026 06:47

HanFa commented Jun 3, 2026

Author

@wangshankun could you review this PR? Let me know if there is a contribution guide to follow.

@gushiqiao gushiqiao closed this Jun 12, 2026


2 participants