ncbi-taxonomist
Documentation¶
1.2.1+8580b9b :: 2020-11-15
Synopsis¶
$: pip install ncbi-taxonomist --user
$: ncbi-taxonomist collect -n human
ncbi-taxonomist handles and manages phylogenetic data available in NCBI’s Entrez databases.
Functions¶
- Collect
  - collect taxa from the Entrez Taxonomy database
- Map
  - map taxids, names, and accessions to related taxonomic information
- Resolve
  - resolve lineages for taxa (taxids and names) and accessions, e.g. sequence or protein accessions
- Import
  - store obtained results locally in a SQLite database
- Subtree
  - extract a whole lineage, a specific rank, or a range of ranks from a taxid or name
- Group
  - create user-defined groups for taxa, for example:
    - create a group for all taxa specific to a project
    - group taxa without a phylogenetic relationship, e.g. group all taxa representing trees into a group “trees”
The ncbi-taxonomist commands, e.g. map or import, can be chained together using pipes to form more complex tasks. For example, to populate a local database, collect will fetch data remotely from Entrez and print it to STDOUT, where import will read it from STDIN and populate the local database (see below).
$: ncbi-taxonomist collect -n human | ncbi-taxonomist import -db taxo.db
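The same collect-to-import chain can also be driven from a Python script. The following is a minimal sketch and not part of ncbi-taxonomist itself: `pipe` is a hypothetical helper built on the standard library's `subprocess` module, and it only assumes that the first command writes to STDOUT and the second reads from STDIN, as described above.

```python
import subprocess


def pipe(producer_cmd, consumer_cmd):
    """Run `producer_cmd | consumer_cmd`.

    Returns the consumer's exit code and whatever it printed to STDOUT.
    `pipe` is a hypothetical helper name used for illustration only.
    """
    # Start the producer and capture its STDOUT in a pipe.
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    # Connect the producer's STDOUT to the consumer's STDIN.
    consumer = subprocess.Popen(
        consumer_cmd,
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy of the pipe so the producer is notified
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return consumer.returncode, output


# Usage with the commands from this page (requires ncbi-taxonomist
# to be installed and network access to Entrez):
#
#   pipe(["ncbi-taxonomist", "collect", "-n", "human"],
#        ["ncbi-taxonomist", "import", "-db", "taxo.db"])
```

Passing each command as a list of arguments avoids invoking a shell, so names containing spaces or quotes do not need extra escaping.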
Requirements and Dependencies¶
Requirements¶
Required: Python >= 3.8
$: python --version
Optional: To use local databases, SQLite (>= 3.24.0) has to be installed. ncbi-taxonomist works without local databases, but needs to fetch all data remotely for each query.
$: sqlite3 --version
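Both version requirements can also be checked from within Python. This is a minimal sketch using only the standard library; `parse_version` and `check_requirements` are hypothetical helper names, not part of ncbi-taxonomist. Note that `sqlite3.sqlite_version` reports the SQLite library linked into Python's `sqlite3` module, which may differ from the version printed by the `sqlite3` command-line tool.

```python
import sqlite3
import sys

# Minimum versions as stated in the requirements above.
MIN_PYTHON = (3, 8)
MIN_SQLITE = (3, 24, 0)


def parse_version(text):
    """Turn a dotted version string such as '3.24.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_requirements(python_version=None, sqlite_version=None):
    """Return which requirements are met.

    Without arguments, the running interpreter and the SQLite library
    linked to Python's sqlite3 module are inspected.
    """
    py = tuple(python_version or sys.version_info[:2])
    sq = parse_version(sqlite_version or sqlite3.sqlite_version)
    return {"python": py >= MIN_PYTHON, "sqlite": sq >= MIN_SQLITE}


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'too old'}")
```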
Dependencies¶
ncbi-taxonomist has one dependency:
- entrezpy: handles remote requests to NCBI’s Entrez databases
entrezpy is a library I maintain, and it relies solely on the Python standard library. Therefore, ncbi-taxonomist is less prone to dependency hell.
Contact¶
To report bugs and/or errors, please open an issue at https://gitlab.com/ncbi-taxonomist or contact me at: jan.buchmann@mykolab.com. Of course, feel free to fork the code, improve it, and/or open a pull request.