Inability to process SMILES with certain SMILES characters

Hello,
My team and I are interested in using your package to generate new molecules that are likely to exhibit a certain type of toxicity. To do that, we explored the package's ability to use a user-defined scoring function as the core of the training protocol, which also requires a new dataset. Although the package is able to train with user-defined functions, it seems to have trouble handling SMILES representations that contain certain characters. After some investigation, it appears that the package fails to process SMILES that contain the `/` and `\` characters, which indicate the **cis** and **trans** arrangement of atoms around double bonds. We were wondering whether there is an easy fix for this problem and, if so, what should be done.


This is the error it keeps showing:

    data processing 0/44
    Traceback (most recent call last):
      File "main.py", line 235, in <module>
        learn(mol_sml, args)
      File "main.py", line 121, in learn
        subgraph_set_init, input_graphs_dict_init = data_processing(smiles_list,   args.GNN_model_path, args.motif)
      File "/home/qspt_user/data_efficient_grammar/grammar_generation.py", line 42, in data_processing
        subgraphs.append(SubGraph(subgraph_i_mapped,  mapping_to_input_mol=subgraph_i_mapped, subfrags=list(cluster)))
      File "/home/qspt_user/data_efficient_grammar/private/molecule_graph.py", line 91, in __init__
        super(SubGraph, self).__init__(mol, is_subgraph=True, mapping_to_input_mol=mapping_to_input_mol)
      File "/home/qspt_user/data_efficient_grammar/private/molecule_graph.py", line 15, in __init__
        self.hypergraph = mol_to_hg(mol, kekulize=True, add_Hs=False)
      File "/home/qspt_user/data_efficient_grammar/private/hypergraph.py", line 744, in mol_to_hg
        bipartite_g = mol_to_bipartite(mol, kekulize)
      File "/home/qspt_user/data_efficient_grammar/private/hypergraph.py", line 692, in  mol_to_bipartite
        mol = standardize_stereo(mol)
      File "/home/qspt_user/data_efficient_grammar/private/hypergraph.py", line 938, in  standardize_stereo
        atom_idx_1 = each_bond.GetStereoAtoms()[0]
    IndexError: Index out of range
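
As a stopgap, one possible workaround is to strip the directional bond markers from each SMILES string before it reaches `data_processing`. The snippet below is only a sketch under that assumption: `strip_bond_direction` is a hypothetical helper that is not part of the package, and it discards the cis/trans information, so it avoids the stereo code path rather than fixing `standardize_stereo`.

```python
def strip_bond_direction(smiles: str) -> str:
    """Remove the '/' and '\\' directional-bond markers from a SMILES string.

    Hypothetical helper, not part of the package. In plain SMILES these two
    characters only mark bond direction, so removing them leaves the same
    molecular graph with the double-bond stereochemistry unspecified.
    """
    return smiles.replace("/", "").replace("\\", "")


# Example input: trans- and cis-1,2-difluoroethene, plus ethanol (no stereo).
smiles_list = ["F/C=C/F", "F/C=C\\F", "CCO"]
cleaned = [strip_bond_direction(s) for s in smiles_list]
print(cleaned)  # ['FC=CF', 'FC=CF', 'CCO']
```

Note that the cis and trans isomers collapse to the same string, so this is only suitable when double-bond stereochemistry does not matter for the scoring function.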
