Skip to content

Conversation

@sbliven
Copy link
Member

@sbliven sbliven commented Feb 2, 2022

This addresses #1019 and was developed in parallel with @josemduarte's fix #1020. It has been rebased to remove duplicated effort, and now consists mostly of additional tests for the bug.

Fix NullPointer if the structure identifier was missing
Currently `Structure` has 4 identifiers which may or may not be set
depending on the source, file format, and parsing method. These should
be revisited and harmonized in the future, but for now I just document
them in this test.

Summary of results:

| Format | With ID | Parse Method    | PdbId | Identifier | Name | StructureIdentifier |
|--------|---------|-----------------|-------|------------|------|---------------------|
| cif    | No      | fromInputStream | null  | ""         | ""   | null                |
| pdb    | No      | fromInputStream | null  | ""         | ""   | null                |
| cif    | No      | fromUrl         | null  | ""         | ""   | null                |
| cif    | No      | StructureIO     | null  | file:      | 5PTI | file:               |
| pdb    | No      | StructureIO     | null  | file:      | ""   | file:               |
| cif    | Yes     | fromInputStream | 5PTI  | ""         | ""   | null                |
| pdb    | Yes     | fromInputStream | 5PTI  | ""         | ""   | null                |
| cif    | Yes     | fromUrl         | 5PTI  | ""         | ""   | null                |
| cif    | Yes     | StructureIO     | 5PTI  | file:      | 5PTI | file:               |
| pdb    | Yes     | StructureIO     | 5PTI  | file:      | 5PTI | file:               |

- `getPdbId` reflects the ID parsed from the file
- Only StructureIO is setting the StructureIdentifier (by design, but
  indicates StructureIO should be used wherever possible)
- StructureIO does not set the name properly for PDB files. This should
  be fixed.
The core URLIdentifier bug was fixed in biojava#1020, but this improves the
test.

The parser changes are mostly for logging, and are there to indicate that
null PdbId is an expected situation rather than an error.
Copy link
Member

@aalhossary aalhossary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

Copy link
Member

@aalhossary aalhossary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM

Copy link
Contributor

@josemduarte josemduarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Very needed tests!

pdbIdToSet = new PdbId(pdbCode);
} catch (IllegalArgumentException e) {
logger.info("Malformed (or null) PDB ID {}. setting PdbId to null", pdbCode);
if(pdbCode.isBlank()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that String.isBlank() is a java 11 feature that we can't use just yet. At least that's what IntelliJ tells me. Interestingly the tests passed in CI, not sure why.

In any case, I'd like to move to java 11 soon. I think it is long overdue. We'll try to address that separately.

logger.info("Malformed (or null) PDB ID {}. setting PdbId to null", pdbCode);
pdbId = null;
}
if(pdbCode.isBlank()){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment

@@ -0,0 +1,2287 @@
HEADER 01-JUL-21
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make this file smaller? Simply keep 2-3 residues from ATOM and SEQRES

Copy link
Contributor

@josemduarte josemduarte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll approve, isBlank() will be ok once #1053 is merged

@josemduarte
Copy link
Contributor

I'll supersede this PR by a new one so that I can truncate the large PDB file before merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants