tfmindi.tl.evaluate_topic_models
- tfmindi.tl.evaluate_topic_models(adata, n_topics_range=None, alpha=50, eta=0.1, n_iter=150, random_state=123, **kwargs)
Evaluate multiple topic models to find the optimal number of topics.
- Parameters:
adata (AnnData) – AnnData object with cluster assignments and genomic coordinates
n_topics_range (list[int] | None (default: None)) – List of topic numbers to evaluate; if None, [10, 15, 20, 25, 30, 35, 40, 50] is used
alpha (float (default: 50)) – Dirichlet prior for document-topic distribution
eta (float (default: 0.1)) – Dirichlet prior for topic-word distribution
n_iter (int (default: 150)) – Number of LDA iterations
random_state (int (default: 123)) – Random seed for reproducibility
**kwargs – Additional arguments passed to run_topic_modeling
- Return type:
- Returns:
Mapping of n_topics to log-likelihood scores
Note: The best-performing model is automatically stored in adata
Examples
>>> import tfmindi as tm
>>> # Evaluate different numbers of topics
>>> scores = tm.tl.evaluate_topic_models(adata, n_topics_range=[10, 20, 30, 40])
>>> best_n_topics = max(scores, key=scores.get)
>>> print(f"Best number of topics: {best_n_topics}")
>>> # Best model is already stored in adata for plotting
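Because the return value is described as a mapping of n_topics to log-likelihood scores, it can also be ranked to see how close the runner-up models are to the best one. The sketch below does not call tfmindi. It uses a hand-written `scores` dictionary with made-up placeholder values, and `rank_topic_counts` is a hypothetical helper, not part of the library. It assumes, as in the example above, that a higher log-likelihood is better.

```python
# Placeholder values for illustration only; real scores come from
# tm.tl.evaluate_topic_models(adata, n_topics_range=[10, 20, 30, 40]).
scores = {10: -1.52e6, 20: -1.47e6, 30: -1.45e6, 40: -1.46e6}


def rank_topic_counts(scores):
    """Return (n_topics, log_likelihood) pairs sorted best (highest) first.

    Hypothetical helper, not part of tfmindi.
    """
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_topic_counts(scores)
best_n_topics, best_score = ranked[0]

# Print each candidate with its distance from the best score.
for n_topics, log_likelihood in ranked:
    gap = best_score - log_likelihood
    print(f"{n_topics:>3} topics: log-likelihood={log_likelihood:.3e}, gap to best={gap:.3e}")
```

A small gap between the top candidates suggests the choice of topic count is not sharply determined by the score alone, in which case comparing the models by inspection may be worthwhile.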