Added new params to support Fish Audio #298
Pull request overview
This PR extends the SDK’s request type models to support additional Fish Audio / audio-inference parameters (reference voices, additional settings knobs, and more flexible voice selection).
Changes:
- Added new `ISettings` fields for audio chunking/normalization/latency behavior.
- Introduced `IAudioReferenceVoice` and added `IAudioInputs.referenceVoices` with dict-to-dataclass normalization.
- Expanded `IAudioSpeech.voices` to accept voice model ID strings; changed `IAudioSpeech.volume` to `Optional[float]`.
Comments suppressed due to low confidence (2)
runware/types.py:2155
- After normalization, `self.voices` will contain only `str` and `IAudioVoice` entries (dicts are converted). The local variable annotation `List[Union[str, IAudioVoice, Dict[str, Any]]]` is misleading; tightening it to match the post-normalized shape will make the code easier to reason about.

if self.voices is not None and isinstance(self.voices, (list, tuple)):
    normalized_voices: List[Union[str, IAudioVoice, Dict[str, Any]]] = []
    for v in self.voices:
        if isinstance(v, dict):
            normalized_voices.append(IAudioVoice(**v))
        else:
            normalized_voices.append(v)
    self.voices = normalized_voices
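
One way the suggested tightening could look is sketched below. This is only an illustration under stated assumptions: `normalize_voices` is a hypothetical standalone helper (the PR does the work inside `__post_init__`), and `IAudioVoice` is redefined here as a plain dataclass without `SerializableMixin` so the snippet runs on its own.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class IAudioVoice:
    # Stand-in for the SDK dataclass; the real one also mixes in SerializableMixin.
    speaker: str
    voice: str


def normalize_voices(
    voices: Optional[List[Union[str, IAudioVoice, Dict[str, Any]]]],
) -> Optional[List[Union[str, IAudioVoice]]]:
    """Hypothetical helper mirroring the __post_init__ loop with a tighter annotation."""
    if voices is None:
        return None
    # After this loop no dict remains, so the annotation omits Dict[str, Any].
    normalized: List[Union[str, IAudioVoice]] = []
    for v in voices:
        if isinstance(v, dict):
            normalized.append(IAudioVoice(**v))
        else:
            normalized.append(v)
    return normalized
```

The input annotation still accepts dicts, while the local list and the return type describe only what can actually be present after conversion.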
runware/types.py:2155
- New input shapes (`IAudioInputs.referenceVoices` dict-to-dataclass normalization and `IAudioSpeech.voices` accepting `str` IDs) aren’t covered by tests. Since this module already has unit coverage (`tests/test_types.py`), add tests asserting these new coercions/accepted types serialize as expected.

@dataclass
class IAudioInputs(SerializableMixin):
    audio: Optional[str] = None
    audios: Optional[List[str]] = None
    video: Optional[str] = None
    videos: Optional[List[str]] = None
    referenceVoices: Optional[List[Union[IAudioReferenceVoice, Dict[str, Any]]]] = None

    @property
    def request_key(self) -> str:
        return "inputs"

    def __post_init__(self) -> None:
        if self.referenceVoices is not None:
            self.referenceVoices = [
                IAudioReferenceVoice(**ref) if isinstance(ref, dict) else ref
                for ref in self.referenceVoices
            ]


@dataclass
class IAudioVoice(SerializableMixin):
    speaker: str
    voice: str


@dataclass
class IAudioSpeech(SerializableMixin):
    text: Optional[str] = None
    voice: Optional[str] = None
    voices: Optional[List[Union[str, IAudioVoice, Dict[str, Any]]]] = None
    language: Optional[str] = None
    speed: Optional[float] = None
    volume: Optional[float] = None
    pitch: Optional[int] = None
    emotion: Optional[str] = None
    tone: Optional[List[str]] = None

    def __post_init__(self) -> None:
        if self.voices is not None and isinstance(self.voices, (list, tuple)):
            normalized_voices: List[Union[str, IAudioVoice, Dict[str, Any]]] = []
            for v in self.voices:
                if isinstance(v, dict):
                    normalized_voices.append(IAudioVoice(**v))
                else:
                    normalized_voices.append(v)
            self.voices = normalized_voices
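
The tests requested above might look roughly like the following sketch. It makes several assumptions: the classes are trimmed stand-ins defined inline so the snippet runs without the SDK (real tests in `tests/test_types.py` would import them from `runware.types`), the `audio` and `text` fields of `IAudioReferenceVoice` are assumed to be strings, and `dataclasses.asdict` stands in for the SDK's own serialization, which is not shown in this PR excerpt.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class IAudioReferenceVoice:
    audio: str  # field types are assumed; the PR only names `audio` and `text`
    text: str


@dataclass
class IAudioVoice:
    speaker: str
    voice: str


@dataclass
class IAudioInputs:
    referenceVoices: Optional[List[Union[IAudioReferenceVoice, Dict[str, Any]]]] = None

    def __post_init__(self) -> None:
        if self.referenceVoices is not None:
            self.referenceVoices = [
                IAudioReferenceVoice(**ref) if isinstance(ref, dict) else ref
                for ref in self.referenceVoices
            ]


@dataclass
class IAudioSpeech:
    voices: Optional[List[Union[str, IAudioVoice, Dict[str, Any]]]] = None

    def __post_init__(self) -> None:
        if self.voices is not None and isinstance(self.voices, (list, tuple)):
            self.voices = [
                IAudioVoice(**v) if isinstance(v, dict) else v for v in self.voices
            ]


def test_reference_voices_dicts_are_coerced() -> None:
    inputs = IAudioInputs(referenceVoices=[{"audio": "ref-audio", "text": "hello"}])
    assert inputs.referenceVoices == [IAudioReferenceVoice("ref-audio", "hello")]


def test_voices_accept_ids_dicts_and_objects() -> None:
    speech = IAudioSpeech(
        voices=["voice-model-id", {"speaker": "A", "voice": "v1"}, IAudioVoice("B", "v2")]
    )
    assert speech.voices == [
        "voice-model-id",
        IAudioVoice("A", "v1"),
        IAudioVoice("B", "v2"),
    ]
    # asdict is only a placeholder for the SDK's serialization path.
    serialized = asdict(speech)["voices"]
    assert serialized[0] == "voice-model-id"
    assert serialized[1] == {"speaker": "A", "voice": "v1"}
```

Both functions follow the pytest naming convention, so they would be collected automatically if dropped into an existing test module.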
Added
- `IAudioReferenceVoice` (`audio`, `text`) for reference-voice input on `audioInference`
- `IAudioInputs` now includes:
  - `referenceVoices: Optional[List[Union[IAudioReferenceVoice, Dict[str, Any]]]]`
- `IAudioSpeech.voices` now accepts voice model ID strings in addition to `IAudioVoice` objects
- `ISettings` now includes:
  - `chunkLength: Optional[int]`
  - `minChunkLength: Optional[int]`
  - `normalize: Optional[bool]`
  - `normalizeLoudness: Optional[bool]`
  - `latency: Optional[str]`
  - `conditionOnPreviousChunks: Optional[bool]`
  - `earlyStopThreshold: Optional[float]`

Changed
- `IAudioSpeech.volume` type changed from `Optional[int]` to `Optional[float]`
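
As a rough illustration of how the new `ISettings` fields could be populated, the sketch below uses a stand-in dataclass containing only the seven fields listed above. Everything else is an assumption: `to_payload` is a hypothetical helper, dropping unset fields is a guess at how the SDK serializes optional values, and the values (including the `latency` string) are placeholders rather than documented accepted values.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class ISettings:
    # Stand-in listing only the fields this PR adds; the real class has more.
    chunkLength: Optional[int] = None
    minChunkLength: Optional[int] = None
    normalize: Optional[bool] = None
    normalizeLoudness: Optional[bool] = None
    latency: Optional[str] = None
    conditionOnPreviousChunks: Optional[bool] = None
    earlyStopThreshold: Optional[float] = None


def to_payload(settings: ISettings) -> Dict[str, Any]:
    """Hypothetical helper: keep only the fields that were explicitly set."""
    return {k: v for k, v in asdict(settings).items() if v is not None}


# Placeholder values, chosen only to exercise each field type.
settings = ISettings(
    chunkLength=200,
    normalize=True,
    latency="balanced",
    earlyStopThreshold=0.5,
)
payload = to_payload(settings)
```

Because every new field defaults to `None`, existing callers that construct `ISettings` without them are unaffected in this sketch.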