Description
Hello
Could you please help me with the following error: DeepBGC failed with ValueError: Grouper for 'sequence_id' not 1-dimensional
Thank you very much!!!
Commands:
deepbgc prepare --output-tsv cat_saxi_clusters_refs.prepared.tsv cat_saxi_clusters_refs_genomic.fasta
deepbgc train --model deepbgc.json --output SaxiDetector.pkl --config PFAM2VEC ./pfam2vec.csv Saxi_Positives.incluster.pfam.tsv Fake_negatives.pfam.tsv
ERROR 29/07 18:14:08 DeepBGC failed with ValueError: Grouper for 'sequence_id' not 1-dimensional
cat_saxi_clusters_refs_genomic.fasta contains six nucleotide sequences of a BGC from six species.
Using GeneSwap_Negatives.pfam.tsv as a template, I edited cat_saxi_clusters_refs.prepared.tsv to have the same columns, including 'sequence_id', which holds six BGC identifiers in the positive file and 'NEG_FAKE_CLUSTER' in the edited Fake_negatives.pfam.tsv.
Both files have these columns:
sequence_id|contig_id|protein_id|gene_start|gene_end|gene_strand|pfam_id|domain_start|domain_end|bitscore|in_cluster
Saxi_Positives.incluster.pfam.tsv has in_cluster = 1 and six sequence_id values to group by during training.
Fake_negatives.pfam.tsv has in_cluster = 0 and a single sequence_id value.
I got deepbgc.json and pfam2vec.csv from GitHub.
Complete error message:
Traceback (most recent call last):
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/main.py", line 113, in main
run(argv)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/main.py", line 102, in run
args.func.run(**args_dict)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/command/train.py", line 60, in run
train_samples, train_y = util.read_samples(inputs, target)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/deepbgc/util.py", line 561, in read_samples
samples = [sample for sample_id, sample in domains.groupby('sequence_id')]
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/pandas/core/generic.py", line 7632, in groupby
observed=observed, **kwargs)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 2110, in groupby
return klass(obj, by, **kwds)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/pandas/core/groupby/groupby.py", line 360, in __init__
mutated=self.mutated)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/pandas/core/groupby/grouper.py", line 602, in _get_grouper
if not isinstance(gpr, Grouping) else gpr)
File "/home/marlaux/anaconda3/envs/deepbgc/lib/python3.7/site-packages/pandas/core/groupby/grouper.py", line 322, in __init__
"Grouper for '{}' not 1-dimensional".format(t))
ValueError: Grouper for 'sequence_id' not 1-dimensional
ERROR 29/07 18:14:08 ================================================================================
ERROR 29/07 18:14:08 DeepBGC failed with ValueError: Grouper for 'sequence_id' not 1-dimensional
ERROR 29/07 18:14:08 ================================================================================
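One possible cause worth ruling out: pandas raises "Grouper for '...' not 1-dimensional" when the DataFrame has more than one column with the grouped name, so a duplicated 'sequence_id' header introduced while editing the files by hand would produce this message. I have not confirmed that this is what happened here. The sketch below is not part of DeepBGC; the function name duplicated_columns is hypothetical, and it assumes the files are tab-separated with the column names in the first row.

```python
import csv
import io
from collections import Counter


def duplicated_columns(tsv_text, delimiter="\t"):
    """Return the column names that occur more than once in the header row.

    Hypothetical helper, not a DeepBGC function. Assumes the first row of
    the text is the header and that fields are separated by `delimiter`.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter=delimiter)
    header = next(reader, [])  # empty input yields an empty header
    counts = Counter(name.strip() for name in header)
    return [name for name, n in counts.items() if n > 1]


# Made-up header for illustration only, with 'sequence_id' repeated.
example = "sequence_id\tcontig_id\tprotein_id\tsequence_id\tpfam_id\tin_cluster\n"
print(duplicated_columns(example))
```

To check the real inputs, the contents of Saxi_Positives.incluster.pfam.tsv and Fake_negatives.pfam.tsv can be read with open(path).read() and passed to the function. An empty list for both files would rule this cause out.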