ncbi-taxonomist Documentation

1.2.1+8580b9b :: 2020-11-15

  • License: GPLv3+ (https://www.gnu.org/licenses/gpl-3.0.en.html)
  • Dependency: entrezpy (https://gitlab.com/ncbipy/entrezpy)
  • PyPI: https://pypi.org/project/ncbi-taxonomist
  • Docker image 1.2.0: https://gitlab.com/janpb/ncbi-taxonomist/container_registry
  • Singularity image 1.2.0: https://cloud.sylabs.io/library/jpb/ncbi-taxonomist
  • Source code: https://gitlab.com/janpb/ncbi-taxonomist

Synopsis

$: pip install ncbi-taxonomist --user
$: ncbi-taxonomist collect -n human

ncbi-taxonomist handles and manages phylogenetic data available in NCBI’s Entrez databases.

Functions

  • Collect:
    collect taxa from the Entrez Taxonomy database
  • Map:
    map taxids, names, and accessions to related taxonomic information
  • Resolve:
    resolve lineages for taxa (taxids and names) and accessions, e.g. sequence or protein
  • Import:
    store obtained results locally in a SQLite database
  • Subtree:
    extract a whole lineage, a specific rank, or a range of ranks from a taxid or name
  • Group:
    create user-defined groups for taxa, for example:
      • create a group for all taxa specific to a project
      • group taxa without a phylogenetic relationship, e.g. collect all taxa representing trees into a group “trees”

The ncbi-taxonomist commands, e.g. map or import, can be chained together using pipes to form more complex tasks. For example, to populate a local database, collect fetches data remotely from Entrez and prints it to STDOUT, while import reads it from STDIN and populates the local database (see below).

ncbi-taxonomist collect -n human | ncbi-taxonomist import -db taxo.db
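
The same pipe can be wrapped in a small shell function to populate one database with several taxa. This is only a sketch: the function name `populate_db` and the `TAXONOMIST` override variable are hypothetical and not part of ncbi-taxonomist. It uses only the `collect -n` and `import -db` options shown above, and it runs one pipeline per name because this page does not state whether `collect` accepts several names in a single call.

```shell
# Hypothetical override hook: lets the command be swapped out, e.g. for a dry run.
TAXONOMIST="${TAXONOMIST:-ncbi-taxonomist}"

# populate_db DB NAME [NAME ...]
# Runs one "collect | import" pipeline per taxon name.
populate_db() {
    db="$1"
    shift
    for name in "$@"; do
        # collect prints its results to STDOUT; import reads them from STDIN.
        # Only the exit status of import (the last pipe stage) is checked here.
        "$TAXONOMIST" collect -n "$name" | "$TAXONOMIST" import -db "$db" || return 1
    done
}
```

A call such as `populate_db taxo.db human` would then be equivalent to the single pipeline above.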

Requirements and Dependencies

Requirements

  • Required: Python >= 3.8

    $: python --version

  • Optional: To use local databases, SQLite (>= 3.24.0) has to be installed. ncbi-taxonomist works without local databases, but needs to fetch all data remotely for each query.

    $: sqlite3 --version
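
Both version checks can be scripted. The sketch below is not part of ncbi-taxonomist: `version_ge` and `check_requirements` are hypothetical helper names, the comparison handles purely numeric dotted versions only, and it assumes that the version number is the second word of `python --version` and the first word of `sqlite3 --version`.

```shell
# version_ge A B: succeed if dotted version A >= B (numeric fields only).
version_ge() {
    a="$1"
    b="$2"
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; missing fields count as 0.
        x="${a%%.*}"
        y="${b%%.*}"
        x="${x:-0}"
        y="${y:-0}"
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
        # Drop the field that was just compared.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0
}

# check_requirements: fail if Python is missing or too old; only warn about SQLite.
check_requirements() {
    status=0
    if command -v python >/dev/null 2>&1; then
        py=$(python --version 2>&1 | awk '{print $2}')
        version_ge "$py" 3.8 || { echo "Python $py is older than 3.8" >&2; status=1; }
    else
        echo "python not found" >&2
        status=1
    fi
    if command -v sqlite3 >/dev/null 2>&1; then
        sq=$(sqlite3 --version | awk '{print $1}')
        version_ge "$sq" 3.24.0 || echo "SQLite $sq is older than 3.24.0; local databases will not work" >&2
    else
        echo "sqlite3 not found; all data will be fetched remotely" >&2
    fi
    return "$status"
}
```

Running `check_requirements` before installation returns a non-zero status only when the required Python version is not met, since SQLite is optional.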

Dependencies

ncbi-taxonomist has one dependency: entrezpy (https://gitlab.com/ncbipy/entrezpy)

I maintain this library myself, and it relies solely on the Python standard library. ncbi-taxonomist is therefore less prone to dependency hell.

Contact

To report bugs and/or errors, please open an issue at https://gitlab.com/ncbi-taxonomist or contact me at: jan.buchmann@mykolab.com. Of course, feel free to fork the code, improve it, and/or open a pull request.