Open
Description
Hi,
I wanted to fit the spatial LDA model on ~1.6 million cells (using all cells except ~50000 tumor cells as anchor cells), but I ran into some issues when training the model. Here is a snippet of the code:
spatial_lda_models = {}
difference_penalty = 0.25
N_TOPICS_LIST = [3, 4, 5, 6, 7, 8, 9, 10, 11]
for n in N_TOPICS_LIST:
    path_to_train_model = '_'.join((f'spatialLDA_model/', f'penalty = {difference_penalty}', f'topics={n}', f'trainfrac=0.9')) + '.pkl'
    print(f'Running n_topics = {n}, d = {difference_penalty}\n')
    spatial_lda_model = spatial_lda.model.train(sample_features=train_hodgkin_features,
                                                difference_matrices=train_difference_matrices,
                                                difference_penalty=difference_penalty,
                                                n_topics=n,
                                                n_parallel_processes=20,
                                                verbosity=1,
                                                admm_rho=0.1,
                                                primal_dual_mu=10)
    spatial_lda_models[n] = spatial_lda_model
    with open(path_to_train_model, 'wb') as f:
        pickle.dump(spatial_lda_model, f)
order_topics_consistently(spatial_lda_models.values())
When training the model, I got an exception: stopping in admm.newton_regularized_dirichlet. From the traceback message, it seems to be caused by the hyperparameter primal_dual_mu. When I trained the model on a subset of the data with the same hyperparameters, the code ran successfully. Can you provide some insight into the issue I am facing? Could the large sample size, combined with an improper hyperparameter choice, be causing the optimization algorithm to diverge?
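One way to narrow this down might be to sweep admm_rho and primal_dual_mu and record which combinations raise. The following is only a minimal sketch: `sweep_hyperparameters` and `fake_train` are hypothetical names, not part of spatial_lda, and the failure rule inside `fake_train` is invented purely so the example runs on its own. In practice `train_fn` would presumably be a small wrapper around the spatial_lda.model.train call shown above.

```python
import itertools


def sweep_hyperparameters(train_fn, rhos, mus):
    """Call train_fn for every (admm_rho, primal_dual_mu) pair.

    Returns a dict mapping each pair to "ok" or to "failed: <message>",
    so that diverging settings can be spotted without aborting the sweep.
    """
    results = {}
    for rho, mu in itertools.product(rhos, mus):
        try:
            train_fn(admm_rho=rho, primal_dual_mu=mu)
            results[(rho, mu)] = "ok"
        except Exception as exc:  # record the failure and keep sweeping
            results[(rho, mu)] = f"failed: {exc}"
    return results


def fake_train(admm_rho, primal_dual_mu):
    # Hypothetical stand-in for a real training call.
    # The threshold below is arbitrary and only makes the sketch runnable.
    if primal_dual_mu > 10:
        raise RuntimeError("stopping")
    return {"admm_rho": admm_rho, "primal_dual_mu": primal_dual_mu}


if __name__ == "__main__":
    outcome = sweep_hyperparameters(fake_train, rhos=[0.1, 1.0], mus=[2, 10, 100])
    for pair, status in outcome.items():
        print(pair, status)
```

Running the same sweep on progressively larger subsets of the cells would show whether the failure depends on sample size, on the hyperparameters, or on both.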
