Add Streamlit UI for real-time SER predictions; update README and requirements #1
Saurabh620 wants to merge 1 commit into
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 329129b476
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
    features = extract_mfcc_from_bytes(audio_bytes)
    probs = model.predict(features, verbose=0)[0]
Normalize MFCC features with training scaler
main() feeds raw MFCC vectors directly into model.predict, but the training notebook standardizes inputs (MainProgram.ipynb builds X_normalized with StandardScaler and trains on that tensor). Skipping the same scaler at inference creates a train/serve feature mismatch that can materially distort class probabilities and predicted emotions for normal user uploads; the app should load and apply the exact scaler used during training before prediction.
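One way to close the train/serve gap described above is to persist the fitted scaler next to the model and apply it right before `model.predict`. The sketch below is a minimal illustration under assumptions, not the PR's code: `scale_features`, `load_scaler`, and the `scaler.joblib` path are hypothetical names, and it assumes the notebook fitted `StandardScaler` on a 2-D array of shape `(n_samples, 40)` and saved it with `joblib.dump`.

```python
import numpy as np


def scale_features(features, scaler):
    """Standardize one MFCC tensor with the scaler fitted during training.

    `scaler` is any object exposing a scikit-learn style `transform` method
    (for example a fitted StandardScaler). Hypothetical helper, not in the PR.
    """
    features = np.asarray(features, dtype=np.float32)
    flat = features.reshape(1, -1)  # (1, 40, 1) -> (1, 40), the 2-D layout transform expects
    scaled = scaler.transform(flat)
    # Restore the original tensor shape so the model input is unchanged.
    return np.asarray(scaled, dtype=np.float32).reshape(features.shape)


def load_scaler(path="scaler.joblib"):
    """Load a persisted scaler. The file name is an assumption."""
    import joblib  # imported lazily so scale_features has no extra dependency

    return joblib.load(path)


# Possible use inside main(), assuming the scaler file exists:
#   features = extract_mfcc_from_bytes(audio_bytes)
#   features = scale_features(features, load_scaler())
#   probs = model.predict(features, verbose=0)[0]
```

If the notebook fitted the scaler on a different layout (for example per-frame rather than per-clip features), the reshape step would need to match that layout instead.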
    if not model_exists:
        st.warning(f"Model file not found: `{model_path}`. Upload audio is disabled until model is available.")

    uploaded = st.file_uploader("Upload audio", type=["wav", "mp3", "ogg", "flac", "m4a"])
Remove unsupported uploaded audio extensions
The uploader advertises mp3 and m4a, but decoding is done solely through soundfile.read in extract_mfcc_from_bytes with no fallback or error handling. In environments using the pinned soundfile==0.10.3.post1, these compressed formats are often not decodable (especially m4a), so selecting such a file can raise a runtime decode error instead of producing a prediction; either restrict allowed types to guaranteed codecs or add a robust fallback decoder path.
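The first option mentioned above (restricting the allowed types) could look roughly like the following sketch. The names `SUPPORTED_EXTENSIONS`, `is_supported`, and `read_audio` are hypothetical and not part of the PR, and the extension set is an assumption about what the pinned `soundfile` build decodes reliably; it should be checked against the deployment environment.

```python
import io
from pathlib import Path

# Assumption: container formats that libsndfile decodes without extra codecs.
SUPPORTED_EXTENSIONS = {".wav", ".flac", ".ogg"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the allowed set (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def read_audio(audio_bytes: bytes):
    """Decode audio bytes with soundfile, turning decode failures into ValueError."""
    import soundfile as sf  # imported lazily so is_supported has no dependency

    try:
        data, sample_rate = sf.read(io.BytesIO(audio_bytes), dtype="float32")
    except RuntimeError as exc:  # soundfile signals decode failures with RuntimeError
        raise ValueError(f"Could not decode audio: {exc}") from exc
    return data, sample_rate
```

In the app, the uploader's `type` list would then be limited to the same extensions, and a `ValueError` from `read_audio` could be shown with `st.error` instead of surfacing as an unhandled exception.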
Motivation
Description
- Add streamlit_app.py, which loads a Keras/TensorFlow model, extracts MFCC features from uploaded audio, runs inference, and displays the predicted emotion, class probabilities, and a bar chart.
- Implement extract_mfcc_from_bytes to read audio bytes with soundfile, resample to 22050 Hz, slice/pad a 3 s window, compute 40 MFCCs, and shape the result as (1, 40, 1).
- Use st.cache_resource to cache the loaded model via get_model, and include a robust model-loading fallback between keras and tensorflow.keras.
- Update README.md with step-by-step run instructions and a new "Streamlit UI (optimized)" section, and add streamlit==1.45.1 to requirements.txt.
Testing
Codex Task