tfmindi.tl.evaluate_topic_models

tfmindi.tl.evaluate_topic_models(adata, n_topics_range=None, alpha=50, eta=0.1, n_iter=150, random_state=123, **kwargs)

Evaluate multiple topic models to find the optimal number of topics.

Parameters:

adata (AnnData) – AnnData object with cluster assignments and genomic coordinates
n_topics_range (list[int] | None (default: None)) – List of topic numbers to evaluate. If None, [10, 15, 20, 25, 30, 35, 40, 50] is used
alpha (float (default: 50)) – Dirichlet prior for the document-topic distribution
eta (float (default: 0.1)) – Dirichlet prior for the topic-word distribution
n_iter (int (default: 150)) – Number of LDA iterations
random_state (int (default: 123)) – Random seed for reproducibility
**kwargs – Additional arguments passed to run_topic_modeling

Return type:

dict[int, float]

Returns:

Mapping from each evaluated n_topics value to its log-likelihood score

Note: The best-performing model is automatically stored in adata

Examples

>>> import tfmindi as tm
>>> # Evaluate different numbers of topics
>>> scores = tm.tl.evaluate_topic_models(adata, n_topics_range=[10, 20, 30, 40])
>>> best_n_topics = max(scores, key=scores.get)
>>> print(f"Best number of topics: {best_n_topics}")
>>> # Best model is already stored in adata for plotting
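
The returned dictionary can also be ranked to see how close the alternatives are to the best model. The sketch below is self-contained and does not call tfmindi: the scores dictionary holds made-up placeholder values with the same shape as the documented return type (dict[int, float]), and it assumes, as in the example above, that a higher log-likelihood indicates a better fit.

```python
# Placeholder values for illustration only; NOT real tfmindi output.
scores = {10: -152340.0, 20: -148910.5, 30: -147802.3, 40: -147950.8}

# Rank topic counts from highest to lowest log-likelihood.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
best_n_topics, best_score = ranked[0]

# Report each candidate together with its gap to the best score.
for n_topics, score in ranked:
    gap = score - best_score
    print(f"n_topics={n_topics:>3}  log-likelihood={score:.1f}  gap={gap:.1f}")
```

If several topic counts score within a small gap of the best one, the smaller model may be preferable for interpretability; that trade-off is a judgment call rather than something the function decides.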
