Issue in reproducing results from the paper #7

@tripathiarpan20

Hi!
Thanks for the open-source work.

I tried training an LLVC model on a custom voice using a pretrained model from weights.gg. However, several discrepancies led to a failure to reproduce the results:

  • The number of epochs is hardcoded to 10000, as opposed to the 53 mentioned in the paper.
  • Training takes more than 1 hour per 250 global steps on an H100 instance, and ~7-8 hours per 250 global steps on an A100 instance (with the default arguments in experiments/llvc/config.json; only log_interval and checkpoint_interval were altered). This is far slower than the paper's reported training time of 3 days on an RTX 3090 (for 500000 steps). Any suggestions for fixing this?
    (image attached)
  • Is there a parameter-efficient way to fine-tune a new voice model from the pretrained G_500000.pth checkpoint that can be open-sourced?
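
To put the training-time gap in numbers, here is a rough sketch of the arithmetic using only the figures quoted above. It assumes "more than 1 hour" on the H100 is exactly 1 hour (so the real rate is lower) and takes 7.5 hours as the midpoint for the A100; the variable and function names are just for illustration.

```python
# Back-of-the-envelope throughput comparison from the figures in this issue.
PAPER_STEPS = 500_000      # total steps reported in the paper
PAPER_HOURS = 3 * 24       # "3 days" on an RTX 3090


def steps_per_hour(steps: float, hours: float) -> float:
    """Average training throughput in global steps per hour."""
    return steps / hours


def days_to_finish(total_steps: float, rate: float) -> float:
    """Days needed to reach total_steps at a constant rate (steps/hour)."""
    return total_steps / rate / 24


paper_rate = steps_per_hour(PAPER_STEPS, PAPER_HOURS)
# Assumption: exactly 1 hour per 250 steps on the H100 (observed was "more than").
h100_rate = steps_per_hour(250, 1.0)
# Assumption: 7.5 hours per 250 steps on the A100 (midpoint of ~7-8 hours).
a100_rate = steps_per_hour(250, 7.5)

for name, rate in [("paper (RTX 3090)", paper_rate),
                   ("observed H100", h100_rate),
                   ("observed A100", a100_rate)]:
    slowdown = paper_rate / rate
    days = days_to_finish(PAPER_STEPS, rate)
    print(f"{name:18s} {rate:8.1f} steps/h  "
          f"{days:7.1f} days for 500k steps  ({slowdown:.1f}x slower than paper)")
```

Under these assumptions, the paper's figure implies roughly 6900 steps per hour, while the observed H100 run would need about 83 days to reach 500000 steps.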
