Module Bio.expressions.genbank

Martel based parser to read GenBank formatted files.

This is a huge regular expression for GenBank, built using the 'regular expressions on steroids' capabilities of Martel.

Documentation for GenBank format that I found:

o GenBank/EMBL feature tables are described at: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

o There are also descriptions of different GenBank lines at: http://www.ibc.wustl.edu/standards/gbrel.txt

Function Summary
  define_block(identifier, block_tag, block_data, std_block_tag, std_tag)
Define a Martel grouping which can parse a block of text.

Variable Summary
Group accession = <Martel.Expression.Group instance at 0x41041...
Group accession_block = <Martel.Expression.Group instance at 0...
Group authors_block = <Martel.Expression.Group instance at 0x4...
Group base_count = <Martel.Expression.Group instance at 0x4105...
Group base_count_line = <Martel.Expression.Group instance at 0...
Group base_number = <Martel.Expression.Group instance at 0x410...
Str big_indent_space = <Martel.Expression.Str instance at 0x...
MaxRepeat blank_space = <Martel.Expression.MaxRepeat instance at 0...
Group comment_block = <Martel.Expression.Group instance at 0x4...
Group consrtm_block = <Martel.Expression.Group instance at 0x4...
Group contig_block = <Martel.Expression.Group instance at 0x41...
Group contig_location = <Martel.Expression.Group instance at 0...
Group data_file_division = <Martel.Expression.Group instance a...
Group date = <Martel.Expression.Group instance at 0x4103ac4c>
Group db_source_block = <Martel.Expression.Group instance at 0...
Group definition_block = <Martel.Expression.Group instance at ...
list divisions = [<Martel.Expression.Str instance at 0x4103ac...
Group feature = <Martel.Expression.Group instance at 0x41053d0...
Group feature_block = <Martel.Expression.Group instance at 0x4...
Group feature_key = <Martel.Expression.Group instance at 0x410...
int FEATURE_KEY_INDENT = 5                                                                     
Group feature_key_line = <Martel.Expression.Group instance at ...
int FEATURE_QUALIFIER_INDENT = 21                                                                    
Group features_line = <Martel.Expression.Group instance at 0x4...
ParseRecords format = <Martel.Expression.ParseRecords instance at 0x4...
Group gi = <Martel.Expression.Group instance at 0x4104196c>
Seq header = <Martel.Expression.Seq instance at 0x410599cc>
int INDENT = 12                                                                    
Group journal_block = <Martel.Expression.Group instance at 0x4...
Group keywords_block = <Martel.Expression.Group instance at 0x...
Group location = <Martel.Expression.Group instance at 0x410534...
Group locus = <Martel.Expression.Group instance at 0x4103a76c>
Group locus_line = <Martel.Expression.Group instance at 0x4103...
Group medline_line = <Martel.Expression.Group instance at 0x41...
HeaderFooter ncbi_format = <Martel.Expression.HeaderFooter instance a...
Group nid = <Martel.Expression.Group instance at 0x410416ac>
Group nid_line = <Martel.Expression.Group instance at 0x410417...
Group organism = <Martel.Expression.Group instance at 0x410476...
Group organism_block = <Martel.Expression.Group instance at 0x...
Group origin_line = <Martel.Expression.Group instance at 0x410...
Group pid = <Martel.Expression.Group instance at 0x410417cc>
Group pid_line = <Martel.Expression.Group instance at 0x410418...
Group primary = <Martel.Expression.Group instance at 0x4105336...
Group primary_line = <Martel.Expression.Group instance at 0x41...
Group primary_ref_line = <Martel.Expression.Group instance at ...
Group pubmed_line = <Martel.Expression.Group instance at 0x410...
Group qualifier = <Martel.Expression.Group instance at 0x41053...
Alt qualifier_space = <Martel.Expression.Alt instance at 0x4...
Str quote = <Martel.Expression.Str instance at 0x410537ac>
Group quoted_chars = <Martel.Expression.Group instance at 0x41...
Seq quoted_string = <Martel.Expression.Seq instance at 0x410...
Group record = <Martel.Expression.Group instance at 0x4105996c...
Group record_end = <Martel.Expression.Group instance at 0x4105...
Group reference = <Martel.Expression.Group instance at 0x4104d...
Group reference_bases = <Martel.Expression.Group instance at 0...
Group reference_line = <Martel.Expression.Group instance at 0x...
Group reference_num = <Martel.Expression.Group instance at 0x4...
Group remark_block = <Martel.Expression.Group instance at 0x41...
list residue_prefixes = [<Martel.Expression.Str instance at 0...
Group residue_type = <Martel.Expression.Group instance at 0x41...
list residue_types = [<Martel.Expression.Str instance at 0x41...
Group segment = <Martel.Expression.Group instance at 0x410470e...
Group segment_line = <Martel.Expression.Group instance at 0x41...
Group sequence = <Martel.Expression.Group instance at 0x410590...
Group sequence_entry = <Martel.Expression.Group instance at 0x...
Group sequence_line = <Martel.Expression.Group instance at 0x4...
Group sequence_plus_spaces = <Martel.Expression.Group instance...
Group size = <Martel.Expression.Group instance at 0x4103a82c>
Str small_indent_space = <Martel.Expression.Str instance at ...
Group source_block = <Martel.Expression.Group instance at 0x41...
Group taxonomy = <Martel.Expression.Group instance at 0x410474...
Group title_block = <Martel.Expression.Group instance at 0x410...
Seq unquoted_string = <Martel.Expression.Seq instance at 0x4...
list valid_divisions = ['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'P...
list valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
list valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rR...
Group version = <Martel.Expression.Group instance at 0x410418a...
Group version_line = <Martel.Expression.Group instance at 0x41...

Function Details

define_block(identifier, block_tag, block_data, std_block_tag=None, std_tag=None)

Define a Martel grouping which can parse a block of text.

Many of the GenBank lines we'll want to process are grouped into a block like:

IDENTIFIER Blah blah blah

Here the text after the identifier can wrap over multiple lines. This function makes it easy to define such blocks consistently.

Arguments:

o identifier - The identifier that begins the block (like DEFINITION).

o block_tag - A callback tag for the entire block.

o block_data - A callback tag for the data in the block (i.e. the stuff you are interested in).

o std_block_tag - A Bio.Std Martel tag used to register the entire block as being a "standard" type of information.

o std_tag - A Bio.Std Martel tag used to register just the information in the block as being "standard".
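The block layout described above can be illustrated without Martel. The following is only a minimal sketch using the standard re module, not the Martel expression that define_block builds; the helper name parse_block and the sample record text (including the accession value) are made up for illustration, and continuation lines are assumed to be indented by INDENT (12) spaces.

```python
import re

INDENT = 12  # continuation-line indent, as documented for this module


def parse_block(identifier, text):
    """Return the data of the first `identifier` block in `text`, or None.

    Hypothetical helper: it mimics the "IDENTIFIER data, wrapped over
    indented continuation lines" layout with a plain regular expression.
    """
    pattern = re.compile(
        r"^%s +(?P<first>.*)\n(?P<rest>(?: {%d}.*\n)*)"
        % (re.escape(identifier), INDENT),
        re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    # First line of data, then every continuation line with its indent removed.
    lines = [match.group("first")]
    lines += [line[INDENT:] for line in match.group("rest").splitlines()]
    return " ".join(part.strip() for part in lines)


# Made-up sample text, not a real GenBank record.
sample = (
    "DEFINITION  Blah blah blah\n"
    "            more blah.\n"
    "ACCESSION   X00000\n"
)

definition = parse_block("DEFINITION", sample)
accession = parse_block("ACCESSION", sample)
```

In this sketch the whole match plays the role of block_tag and the joined text plays the role of block_data; the real function produces Martel groups that fire callbacks instead of returning strings.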

Variable Details

accession

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410414cc>                       

accession_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104162c>                       

authors_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41047d4c>                       

base_count

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41053dec>                       

base_count_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41053e6c>                       

base_number

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410590ac>                       

big_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0x4103a68c>                         

blank_space

Type:
MaxRepeat
Value:
<Martel.Expression.MaxRepeat instance at 0x4103a62c>                   

comment_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104decc>                       

consrtm_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41047fec>                       

contig_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105968c>                       

contig_location

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105944c>                       

data_file_division

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4103af0c>                       

date

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4103ac4c>                       

db_source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41041dac>                       

definition_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104146c>                       

divisions

Type:
list
Value:
[<Martel.Expression.Str instance at 0x4103ac2c>,
 <Martel.Expression.Str instance at 0x4103accc>,
 <Martel.Expression.Str instance at 0x4103acec>,
 <Martel.Expression.Str instance at 0x4103ad0c>,
 <Martel.Expression.Str instance at 0x4103ad2c>,
 <Martel.Expression.Str instance at 0x4103ad4c>,
 <Martel.Expression.Str instance at 0x4103ad6c>,
 <Martel.Expression.Str instance at 0x4103ad8c>,
...                                                                    

feature

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41053d0c>                       

feature_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41053d6c>                       

feature_key

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410534ac>                       

FEATURE_KEY_INDENT

Type:
int
Value:
5                                                                     

feature_key_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105374c>                       

FEATURE_QUALIFIER_INDENT

Type:
int
Value:
21                                                                    
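The two feature indents correspond to the GenBank feature table layout, where a feature key starts after 5 spaces and qualifiers start after 21 spaces. As a rough sketch of how these constants separate the two kinds of line (plain Python, not the Martel expressions in this module; the function name classify_feature_line is hypothetical):

```python
FEATURE_KEY_INDENT = 5
FEATURE_QUALIFIER_INDENT = 21


def classify_feature_line(line):
    """Classify one feature-table line by its leading indentation.

    Returns ("key", name, location), ("qualifier", text) or ("other", line).
    Hypothetical helper for illustration only.
    """
    stripped = line.rstrip("\n")
    indent = len(stripped) - len(stripped.lstrip(" "))
    if indent == FEATURE_KEY_INDENT:
        # Key and location are separated by padding spaces.
        key, _, location = stripped.strip().partition(" ")
        return ("key", key, location.strip())
    if indent == FEATURE_QUALIFIER_INDENT:
        return ("qualifier", stripped.strip())
    return ("other", stripped)


key_line = " " * 5 + "source" + " " * 10 + "1..100"
qualifier_line = " " * 21 + '/organism="Homo sapiens"'
```

Note that a location wrapped onto a second line also sits at the qualifier indent, so a complete parser would need to look at the leading "/" as well; this sketch does not handle that case.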

features_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105340c>                       

format

Type:
ParseRecords
Value:
<Martel.Expression.ParseRecords instance at 0x41059acc>                

gi

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104196c>                       

header

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x410599cc>                         

INDENT

Type:
int
Value:
12                                                                    

journal_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104d54c>                       

keywords_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104706c>                       

location

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105348c>                       

locus

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4103a76c>                       

locus_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4103af6c>                       

medline_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104d58c>                       

ncbi_format

Type:
HeaderFooter
Value:
<Martel.Expression.HeaderFooter instance at 0x410599ec>                

nid

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410416ac>                       

nid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104172c>                       

organism

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104768c>                       

organism_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104784c>                       

origin_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105902c>                       

pid

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410417cc>                       

pid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104184c>                       

primary

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105336c>                       

primary_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104dfcc>                       

primary_ref_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105322c>                       

pubmed_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104d72c>                       

qualifier

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41053bac>                       

qualifier_space

Type:
Alt
Value:
<Martel.Expression.Alt instance at 0x4103a6ec>                         

quote

Type:
Str
Value:
<Martel.Expression.Str instance at 0x410537ac>                         

quoted_chars

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105380c>                       

quoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x410539cc>                         

record

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105996c>                       

record_end

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105976c>                       

reference

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104dbec>                       

reference_bases

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104796c>                       

reference_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41047a6c>                       

reference_num

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410478cc>                       

remark_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104db2c>                       

residue_prefixes

Type:
list
Value:
[<Martel.Expression.Str instance at 0x4103a8ec>,
 <Martel.Expression.Str instance at 0x4103a90c>,
 <Martel.Expression.Str instance at 0x4103a92c>]                       

residue_type

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4103aaec>                       

residue_types

Type:
list
Value:
[<Martel.Expression.Str instance at 0x4103a94c>,
 <Martel.Expression.Str instance at 0x4103a96c>,
 <Martel.Expression.Str instance at 0x4103a98c>,
 <Martel.Expression.Str instance at 0x4103a9ac>,
 <Martel.Expression.Str instance at 0x4103a9cc>,
 <Martel.Expression.Str instance at 0x4103a9ec>,
 <Martel.Expression.Str instance at 0x4103aa0c>,
 <Martel.Expression.Str instance at 0x4103aa2c>,
...                                                                    

segment

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410470ec>                       

segment_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104732c>                       

sequence

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105906c>                       

sequence_entry

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410593ec>                       

sequence_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105930c>                       

sequence_plus_spaces

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4105928c>                       

size

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4103a82c>                       

small_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0x4103a58c>                         

source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104762c>                       

taxonomy

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104744c>                       

title_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0x4104d2ac>                       

unquoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0x41053aac>                         

valid_divisions

Type:
list
Value:
['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'PLN', 'BCT', 'RNA', 'VRL']        

valid_residue_prefixes

Type:
list
Value:
['ss-', 'ds-', 'ms-']                                                  

valid_residue_types

Type:
list
Value:
['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA', 'scRNA', 'snRNA', 'snoRNA']
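The valid_* lists hold plain strings, while residue_prefixes and residue_types hold the matching Martel.Expression.Str objects. As an illustration of how such lists can be turned into an alternation, here is a sketch with the standard re module. It assumes a residue type is an optional prefix followed by a type (for example ss-RNA); that composition is an assumption and is not stated on this page, and the names alternation and split_residue_type are hypothetical.

```python
import re

valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA',
                       'scRNA', 'snRNA', 'snoRNA']


def alternation(words):
    """Build a non-capturing alternation, trying longer words first."""
    ordered = sorted(words, key=len, reverse=True)
    return "(?:%s)" % "|".join(re.escape(word) for word in ordered)


# Assumed layout: optional strand prefix, then the residue type.
residue_type_re = re.compile(
    "(?P<prefix>%s)?(?P<type>%s)"
    % (alternation(valid_residue_prefixes), alternation(valid_residue_types))
)


def split_residue_type(text):
    """Return (prefix or None, type), or None if `text` is not valid."""
    match = residue_type_re.fullmatch(text)
    if match is None:
        return None
    return match.group("prefix"), match.group("type")
```

Sorting the alternatives longest first keeps the pattern unambiguous even if it is later used with search instead of fullmatch.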

version

Type:
Group
Value:
<Martel.Expression.Group instance at 0x410418ac>                       

version_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0x41041acc>                       

Generated by Epydoc 2.1 on Sat Jul 16 15:49:03 2005 http://epydoc.sf.net