<h2 id="how-do-i-list-the-annotations-in-a-sequence">How do I List the Annotations in a Sequence?</h2>
<p>When you read in an annotated sequence file such as GenBank or EMBL, it
contains much more detailed information than just the raw sequence.
If a piece of information has a sensible location in the sequence, it ends up as a Feature.
If it is more generic, such as the species name, it ends
up in the sequence's Annotation.</p>
<p>BioJava Annotation objects are a bit like Map objects: they contain
key-value mappings.</p>
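<p>To make the Map analogy concrete, here is a minimal sketch that uses a plain java.util.Map as a stand-in for an Annotation. It does not use BioJava at all; the class name AnnotationAsMap and the method sampleProperties are hypothetical, and the two entries are simply EMBL-style key-value pairs. Roughly, keySet() plays the role of Annotation's keys() and get(key) the role of getProperty(key).</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: a plain Map standing in for a BioJava Annotation.
class AnnotationAsMap {

    // Builds a map holding two EMBL-style key-value pairs (hypothetical helper).
    static Map<String, Object> sampleProperties() {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("OS", "Homo sapiens (human)");
        props.put("SV", "AY130859.1");
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> props = sampleProperties();
        // Walk the keys and look each value up, as one would with an Annotation.
        for (String key : props.keySet()) {
            System.out.println(key + " : " + props.get(key));
        }
    }
}
```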
<p>Below is the initial portion of an EMBL file:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>ID AY130859 standard; DNA; HUM; 44226 BP.
XX
AC AY130859;
XX
SV AY130859.1
XX
DT 25-JUL-2002 (Rel. 72, Created)
DT 25-JUL-2002 (Rel. 72, Last updated, Version 1)
XX
DE Homo sapiens cyclin-dependent kinase 7 (CDK7) gene, complete cds.
XX
KW .
XX
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN [1]
RP 1-44226
RA Rieder M.J., Livingston R.J., Braun A.C., Montoya M.A., Chung M.-W.,
RA Miyamoto K.E., Nguyen C.P., Nguyen D.A., Poel C.L., Robertson P.D.,
RA Schackwitz W.S., Sherwood J.K., Witrak L.A., Nickerson D.A.;
RT ;
RL Submitted (11-JUL-2002) to the EMBL/GenBank/DDBJ databases.
RL Genome Sciences, University of Washington, 1705 NE Pacific, Seattle, WA
RL 98195, USA
XX
CC To cite this work please use: NIEHS-SNPs, Environmental Genome
CC Project, NIEHS ES15478, Department of Genome Sciences, Seattle, WA
CC (URL: http://egp.gs.washington.edu).
</code></pre>
</div>
<p>The following program reads an EMBL file and lists its Annotation
properties. The output of this program on the above file is listed below
the program.</p>
<java>
import java.io.*;
import java.util.*;

import org.biojava.bio.*;
import org.biojava.bio.seq.*;
import org.biojava.bio.seq.io.*;

public class ListAnnotations {
  public static void main(String[] args) {
    try {
      //read in an EMBL Record
      BufferedReader br = new BufferedReader(new FileReader(args[0]));

      //for each sequence list the annotations
      for (SequenceIterator seqs = SeqIOTools.readEmbl(br); seqs.hasNext(); ) {
        Annotation anno = seqs.nextSequence().getAnnotation();
        //print each key value pair
        for (Iterator i = anno.keys().iterator(); i.hasNext(); ) {
          Object key = i.next();
          System.out.println(key + " : " + anno.getProperty(key));
        }
      }
    }
    catch (Exception ex) {
      ex.printStackTrace();
    }
  }
}
</java>
<p>Program Output</p>
<div class="highlighter-rouge"><pre class="highlight"><code>RN : [1]
KW : .
RL : [Submitted (11-JUL-2002) to the EMBL/GenBank/DDBJ databases., Genome Sciences, University of Washington, 1705 NE Pacific, Seattle, WA, 98195, USA]
embl_accessions : [AY130859]
DE : Homo sapiens cyclin-dependent kinase 7 (CDK7) gene, complete cds.
SV : AY130859.1
AC : AY130859;
FH : Key Location/Qualifiers
XX :
OC : [Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;, Eutheria; Primates; Catarrhini; Hominidae; Homo.]
RA : [Rieder M.J., Livingston R.J., Braun A.C., Montoya M.A., Chung M.-W.,, Miyamoto K.E., Nguyen C.P., Nguyen D.A., Poel C.L., Robertson P.D.,, Schackwitz W.S., Sherwood J.K., Witrak L.A., Nickerson D.A.;]
ID : AY130859 standard; DNA; HUM; 44226 BP.
DT : [25-JUL-2002 (Rel. 72, Created), 25-JUL-2002 (Rel. 72, Last updated, Version 1)]
CC : [To cite this work please use: NIEHS-SNPs, Environmental Genome, Project, NIEHS ES15478, Department of Genome Sciences, Seattle, WA, (URL: http://egp.gs.washington.edu).]
RT : ;
OS : Homo sapiens (human)
RP : 1-44226
</code></pre>
</div>
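<p>In the output above, line types that occur more than once in the record (RL, OC, RA, DT, CC) are printed in square brackets, which suggests their values are collections, while single-line types print as plain strings. The following sketch shows one way to treat both cases uniformly. It assumes the values are either a java.util.List or a single object, which the output hints at but does not confirm; the class AnnotationValues and the method asLines are hypothetical and do not depend on BioJava.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: normalise a property value to a list of lines.
class AnnotationValues {

    // Returns the value as a list of strings, whether it was stored
    // as a List (repeated line type) or as a single object (assumption).
    static List<String> asLines(Object value) {
        if (value == null) {
            return Collections.emptyList();
        }
        if (value instanceof List) {
            List<String> lines = new ArrayList<>();
            for (Object item : (List<?>) value) {
                lines.add(String.valueOf(item));
            }
            return lines;
        }
        return Collections.singletonList(String.valueOf(value));
    }

    public static void main(String[] args) {
        // Values copied from the DT and OS lines of the example record.
        Object dt = Arrays.asList(
            "25-JUL-2002 (Rel. 72, Created)",
            "25-JUL-2002 (Rel. 72, Last updated, Version 1)");
        Object os = "Homo sapiens (human)";

        for (String line : asLines(dt)) {
            System.out.println("DT : " + line);
        }
        for (String line : asLines(os)) {
            System.out.println("OS : " + line);
        }
    }
}
```

<p>With a real Annotation, the argument to asLines would be the object returned by anno.getProperty(key).</p>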