TopSBM is a topic modelling approach that infers a hierarchy of topic clusters and word clusters in your Corpus in a non-parametric manner by leveraging stochastic block models.
The approach was developed by E. G. Altmann et al.
The name combines Top (Topic) and SBM (Stochastic Block Model).
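
As a rough illustration of the kind of input such a model works on: the TopSBM approach treats a corpus as a bipartite network that links each document to the words it contains, and the stochastic block model then groups both kinds of node. The sketch below only builds that network for a toy corpus in plain Python. The corpus, the function `build_bipartite_edges`, and all variable names are hypothetical and are not part of the TopSBM or ATAP APIs; no inference is performed here.

```python
from collections import Counter

# Toy corpus (hypothetical data): each document is a list of tokens.
docs = {
    "doc_a": ["network", "topic", "network"],
    "doc_b": ["topic", "model"],
}


def build_bipartite_edges(documents):
    """Return {(doc_id, word): count}, one weighted edge per document-word pair."""
    edges = Counter()
    for doc_id, tokens in documents.items():
        for token in tokens:
            edges[(doc_id, token)] += 1
    return dict(edges)


edges = build_bipartite_edges(docs)

# The two node sets of the bipartite network.
doc_nodes = sorted({doc_id for doc_id, _ in edges})
word_nodes = sorted({word for _, word in edges})
```

In this representation the edge weight is simply the number of times a word occurs in a document; a block model fitted to such a network would cluster `doc_nodes` and `word_nodes` separately.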
This repository integrates TopSBM into the ATAP platform.
This is a demo Jupyter notebook for TopSBM with ATAP Corpus integration. At the end of the notebook, you can download a Corpus containing the TopSBM results, which you may then upload to other ATAP tools for further analysis.
Note: Australian Access Federation (AAF) or REANNZ Tuakiri (NZ) authentication is required. If you do not have access to AAF or Tuakiri, you can use the link below to access the tool (this is a free Binder version, limited to 2 GB of memory).
- TopSBM website: https://topsbm.github.io/
- TopSBM repository: https://github.com/martingerlach/hSBM_Topicmodel
If you are running this repository locally, you'll first need to:
# 1. activate your virtual environment
./scripts/install_topsbm.sh topsbm