GenbankReader process(int max) seems to close the InputStream? #800

@TorkelE

Description

I have several large .gbff files I am trying to read; the biggest is about 1.5 GB.

The code looks like this:

InputStream is = new FileInputStream("filename.gbff.gz");
is = new GZIPInputStream(is);
GenbankReader<DNASequence, NucleotideCompound> dnaReader =
        new GenbankReader<DNASequence, NucleotideCompound>(
                is,
                new GenericGenbankHeaderParser<DNASequence, NucleotideCompound>(),
                new DNASequenceCreator(AmbiguityDNACompoundSet.getDNACompoundSet()));
LinkedHashMap<String, DNASequence> gbFile = dnaReader.process();
...

Since the files got too large, I am trying to read them in smaller chunks.
(I have also tried setting the VM arguments -Xms512M -Xmx4096M, which did not work due to lack of memory.)

I try

InputStream is = new FileInputStream("filename.gbff.gz");
is = new GZIPInputStream(is);
GenbankReader<DNASequence, NucleotideCompound> dnaReader =
        new GenbankReader<DNASequence, NucleotideCompound>(
                is,
                new GenericGenbankHeaderParser<DNASequence, NucleotideCompound>(),
                new DNASequenceCreator(AmbiguityDNACompoundSet.getDNACompoundSet()));
LinkedHashMap<String, DNASequence> gbFile = dnaReader.process(1);
...

which seems to work. However, I want all the entries, but when I call process(1) more than once:

InputStream is = new FileInputStream("filename.gbff.gz");
is = new GZIPInputStream(is);
GenbankReader<DNASequence, NucleotideCompound> dnaReader =
        new GenbankReader<DNASequence, NucleotideCompound>(
                is,
                new GenericGenbankHeaderParser<DNASequence, NucleotideCompound>(),
                new DNASequenceCreator(AmbiguityDNACompoundSet.getDNACompoundSet()));
LinkedHashMap<String, DNASequence> gbFile1 = dnaReader.process(1);
LinkedHashMap<String, DNASequence> gbFile2 = dnaReader.process(1);
...

I receive the error:

Exception in thread "main" org.biojava.nbio.core.exceptions.ParserException: Stream closed
	at org.biojava.nbio.core.sequence.io.GenbankSequenceParser.getSequence(GenbankSequenceParser.java:391)
	at org.biojava.nbio.core.sequence.io.GenbankReader.process(GenbankReader.java:145)
	at Prorgam.main(Prorgam.java:39)

which seems to defeat the purpose of passing a max count (isn't the stream supposed to stay open?).

However,

InputStream is = new FileInputStream("filename.gbff.gz");
is = new GZIPInputStream(is);
GenbankReader<DNASequence, NucleotideCompound> dnaReader =
        new GenbankReader<DNASequence, NucleotideCompound>(
                is,
                new GenericGenbankHeaderParser<DNASequence, NucleotideCompound>(),
                new DNASequenceCreator(AmbiguityDNACompoundSet.getDNACompoundSet()));
LinkedHashMap<String, DNASequence> gbFile1 = dnaReader.process(1);
dnaReader.close();

does work.

(Also, if I manage to get this to work, is there a GenbankReader.hasNext() or some similar way to go through the file in smaller chunks, without reading it all into memory?)
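If it helps, the usage I am hoping for is roughly the sketch below. To be clear, this is what I expect based on the docs, not something I have confirmed: I am assuming an empty map returned from process(int) signals the end of the input, and the chunk size of 100 is arbitrary.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.zip.GZIPInputStream;

import org.biojava.nbio.core.sequence.DNASequence;
import org.biojava.nbio.core.sequence.compound.AmbiguityDNACompoundSet;
import org.biojava.nbio.core.sequence.compound.NucleotideCompound;
import org.biojava.nbio.core.sequence.io.DNASequenceCreator;
import org.biojava.nbio.core.sequence.io.GenbankReader;
import org.biojava.nbio.core.sequence.io.GenericGenbankHeaderParser;

public class ChunkedGenbankRead {
    public static void main(String[] args) throws Exception {
        InputStream is = new GZIPInputStream(new FileInputStream("filename.gbff.gz"));
        GenbankReader<DNASequence, NucleotideCompound> dnaReader =
                new GenbankReader<DNASequence, NucleotideCompound>(
                        is,
                        new GenericGenbankHeaderParser<DNASequence, NucleotideCompound>(),
                        new DNASequenceCreator(AmbiguityDNACompoundSet.getDNACompoundSet()));
        try {
            LinkedHashMap<String, DNASequence> chunk;
            // Assumption: an empty map from process(int) means the input is exhausted.
            while (!(chunk = dnaReader.process(100)).isEmpty()) {
                // Handle this chunk of up to 100 sequences, then drop the
                // reference so the chunk can be garbage-collected.
                System.out.println("read " + chunk.size() + " sequences");
            }
        } finally {
            dnaReader.close(); // close the underlying stream only when fully done
        }
    }
}
```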

Have I misunderstood something here? From the docs:

public LinkedHashMap<String,S> process(int max)
        throws IOException, CompoundNotFoundException
This method tries to parse maximum max records from the open File or InputStream, and leaves the underlying resource open.
Subsequent calls to the same method continue parsing the rest of the file.
This is particularly useful when dealing with very big data files, (e.g. NCBI nr database), which can't fit into memory and will take long time before the first result is available.

It seems like I should be able to use it this way?
Thanks
