Description
We have recently noticed that some structural bioinformatics programs (for structure refinement or modelling) generate PDB files in which the element column is missing. The element column sits near the end of each ATOM/HETATM record (columns 77-78) and gives the periodic table element of the Atom.
When such files are parsed with BioJava, it is currently not possible to calculate structural alignments or symmetry (or run any other analysis that uses C-alpha atoms). The reason is that, to extract the C-alpha atoms of a structure, both the name (CA) and the element (C) of each Atom are checked (in StructureTools.getRepresentativeAtoms()), and the element check fails when the column is empty.
The element column is not completely redundant: for a modified amino acid with calcium bound to it, the name CA alone does not distinguish the calcium from the C-alpha carbon, so the element column is needed to tell them apart.
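To make the ambiguity concrete, here is a minimal sketch of a name-plus-element check. It is not the BioJava code: `CalphaCheck`, `SimpleAtom` and `isCalpha` are hypothetical names used only for illustration, and an empty string is assumed to stand for a missing element column.

```java
public class CalphaCheck {

    /** Hypothetical minimal atom: trimmed name plus element symbol ("" when the column is missing). */
    public static final class SimpleAtom {
        public final String name;
        public final String element;

        public SimpleAtom(String name, String element) {
            this.name = name;
            this.element = element;
        }
    }

    /** True only when both the name and the element identify a C-alpha carbon. */
    public static boolean isCalpha(SimpleAtom atom) {
        return "CA".equals(atom.name) && "C".equalsIgnoreCase(atom.element);
    }
}
```

With this check, an atom named CA with element C is accepted and a calcium named CA is correctly rejected, but a genuine C-alpha from a file without the element column is rejected as well. That last case is the failure described above.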
On the other hand, to support these incomplete files we could print a warning when parsing them, then guess the element from the Atom name (at least for the Atoms in amino acids) and fill it in.
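One possible way to do the guessing, sketched below, relies on the PDB convention that the element symbol is right-justified in columns 13-14 of the atom name field: " CA " is a carbon, while "CA  " is a calcium. `ElementGuesser` and `guessElement` are hypothetical names, not existing BioJava API, and the sketch assumes the parser still has the raw, untrimmed 4-character atom name available.

```java
public class ElementGuesser {

    /**
     * Guess the element symbol from the raw atom name field (PDB columns 13-16).
     * Returns "" when no guess can be made.
     */
    public static String guessElement(String rawAtomName) {
        if (rawAtomName == null || rawAtomName.length() != 4) {
            return "";
        }
        char c0 = rawAtomName.charAt(0);
        char c1 = rawAtomName.charAt(1);

        // Column 13 blank or a digit: one-letter element in column 14
        // (" CA " is C-alpha, "1HB " is an old-style hydrogen name).
        if (c0 == ' ' || Character.isDigit(c0)) {
            return Character.isLetter(c1) ? String.valueOf(Character.toUpperCase(c1)) : "";
        }

        // Four-character hydrogen names such as "HD11" fill the whole field
        // and therefore start in column 13.
        if (c0 == 'H' && rawAtomName.indexOf(' ') < 0) {
            return "H";
        }

        // Otherwise treat columns 13-14 as a two-letter element ("CA  " is calcium).
        if (Character.isLetter(c0) && Character.isLetter(c1)) {
            return "" + Character.toUpperCase(c0) + Character.toLowerCase(c1);
        }
        return Character.isLetter(c0) ? String.valueOf(Character.toUpperCase(c0)) : "";
    }
}
```

A drawback of this heuristic is that it only works if the program that wrote the file kept the atom name alignment. If names are left-justified or trimmed, " CA " and "CA  " become indistinguishable, and the guess would have to fall back on the residue type.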
Question: is there any drawback to guessing and filling in the element of the Atoms? Can we use the Chemical Components for that?