Genome Classifier
[1]:
import modelseedpy
Pull the genome classifier model
[2]:
from modelseedpy.helpers import get_classifier
[3]:
classifier = get_classifier('knn_filter')
[4]:
type(classifier)
[4]:
modelseedpy.core.msgenomeclassifier.MSGenomeClassifier
Get a Genome and Annotate with RAST
RAST annotation is essential because the classifier was trained on RAST-annotated functions.
[5]:
# Load the E. coli genome from a protein FASTA file
genome = modelseedpy.MSGenome.from_fasta('GCF_000005845.2_ASM584v2_protein.faa', split=' ')
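The `split=' '` argument presumably tells the FASTA reader where the sequence identifier ends in each header line. The sketch below is not the modelseedpy implementation. It is a standalone illustration, using a hypothetical `parse_fasta_headers` helper and a made-up header, of splitting each header once on the first space into an identifier and a description.

```python
def parse_fasta_headers(text, split=" "):
    """Hypothetical helper: return (id, description) pairs from FASTA text.

    Splits each header once on `split`. This is an assumption about what the
    `split` argument means, not modelseedpy's own code.
    """
    records = []
    for line in text.splitlines():
        if not line.startswith(">"):
            continue  # skip sequence lines
        header = line[1:].strip()
        if split and split in header:
            seq_id, description = header.split(split, 1)
        else:
            seq_id, description = header, None
        records.append((seq_id, description))
    return records


# Made-up record for illustration only
example = ">prot_1 hypothetical protein [Example organism]\nMKV\n"
print(parse_fasta_headers(example))
```

With a space as the separator, only the first token of the header becomes the feature identifier, and the rest is kept as free-text description.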
[6]:
modelseedpy.RastClient().annotate_genome(genome)
[6]:
[{'id': '23AFF380-F4F9-11EB-BBBA-BBE5BBF382BD',
  'parameters': ['-a',
                 '-g',
                 200,
                 '-m',
                 5,
                 '-d',
                 '/opt/patric-common/data/kmer_metadata_v2',
                 '-u',
                 'http://pear.mcs.anl.gov:6100/query'],
  'hostname': 'pear',
  'tool_name': 'kmer_search',
  'execution_time': 1628063644.76991},
 {'execution_time': 1628063644.90382,
  'tool_name': 'KmerAnnotationByFigfam',
  'hostname': 'pear',
  'id': '23C46324-F4F9-11EB-BBBA-BBE5BBF382BD',
  'parameters': ['annotate_hypothetical_only=1',
                 'dataset_name=Release70',
                 'kmer_size=8']},
 {'parameters': [],
  'id': '23F64B78-F4F9-11EB-908D-F73BBDF382BD',
  'tool_name': 'annotate_proteins_similarity',
  'hostname': 'pear',
  'execute_time': 1628063645.23091}]
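Run classifier
The classifier assigns the genome to one of the following classes, each identified by a single-letter code:
A: Archaea
C: Cyanobacteria
N: Gram Negative
P: Gram Positive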
[ ]:
classifier.classify(genome)
'N'
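Because `classify` returns only the single-letter code, a small lookup can make the result easier to read. The following is a minimal sketch built from the class codes listed above; `CLASS_NAMES` and `describe_class` are hypothetical names introduced here and are not part of modelseedpy.

```python
# Class codes as listed in this notebook
CLASS_NAMES = {
    "A": "Archaea",
    "C": "Cyanobacteria",
    "N": "Gram Negative",
    "P": "Gram Positive",
}


def describe_class(code):
    """Hypothetical helper: translate a class code into its readable name."""
    try:
        return CLASS_NAMES[code]
    except KeyError:
        # Fail loudly on codes that are not in the list above
        raise ValueError(f"Unknown genome class code: {code!r}") from None


print(describe_class("N"))  # Gram Negative
```

In the notebook this could be called as `describe_class(classifier.classify(genome))`.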