ncbi-taxonomist Documentation

1.2.1+8580b9b :: 2020-11-15


$: pip install ncbi-taxonomist --user
$: ncbi-taxonomist collect -n human

ncbi-taxonomist handles and manages phylogenetic data available in NCBI’s Entrez databases.


  • Collect:
    collect taxa from the Entrez Taxonomy database
  • Map:
    map taxids, names, and accessions to related taxonomic information
  • Resolve:
    resolve lineages for taxa (taxids and names) and accessions, e.g. sequence or protein accessions
  • Import:
    store obtained results locally in a SQLite database
  • Subtree:
    extract a whole lineage, a specific rank, or a range of ranks from a taxid or name
  • Group:
    create user-defined groups for taxa, for example:
      • create a group for all taxa specific to a project
      • group taxa without a phylogenetic relationship, e.g. put all taxa representing trees into a group “trees”

The ncbi-taxonomist commands, e.g. map or import, can be chained together using pipes to form more complex tasks. For example, to populate a local database, collect fetches data remotely from Entrez and prints it to STDOUT, from where import reads it via STDIN and populates the local database (see below).

ncbi-taxonomist collect -n human | ncbi-taxonomist import -db taxo.db
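
The same pipe can be wrapped in a small shell function to populate a database with several taxa, one name at a time. This is only a minimal sketch, assuming a POSIX shell: populate_db and the TAXONOMIST variable are hypothetical names introduced for illustration and are not part of ncbi-taxonomist. It uses only the collect -n and import -db options shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ncbi-taxonomist): collect each given name
# remotely and import the result into a local database.
# TAXONOMIST is an assumed override hook; it defaults to the
# ncbi-taxonomist command found on PATH.
TAXONOMIST="${TAXONOMIST:-ncbi-taxonomist}"

populate_db() {
    db="$1"
    shift
    for name in "$@"; do
        # One collect | import pipe per name, as in the example above.
        # The pipe's exit status is that of import; stop on the first failure.
        "$TAXONOMIST" collect -n "$name" | "$TAXONOMIST" import -db "$db" || return 1
    done
}

# Usage: populate_db taxo.db human
```

Running one pipe per name keeps the sketch within the documented options; whether several names can be passed to a single collect call is not assumed here.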

Requirements and Dependencies


  • Required: Python >= 3.8

    $: python --version

  • Optional: To use local databases, SQLite (>= 3.24.0) has to be installed. ncbi-taxonomist works without local databases, but needs to fetch all data remotely for each query.

    $: sqlite3 --version
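
Both version checks can be combined into one script. The following is a minimal sketch, not something shipped with ncbi-taxonomist: version_ge is a hypothetical helper, and it assumes a sort that supports version sorting (-V), as in GNU coreutils.

```shell
#!/bin/sh
# version_ge A B: succeeds if dotted version A is >= version B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# "python --version" prints e.g. "Python 3.8.10"; the second field is the version.
if command -v python >/dev/null 2>&1; then
    py_version=$(python --version 2>&1 | awk '{print $2}')
    if version_ge "$py_version" 3.8; then
        echo "Python $py_version: OK"
    else
        echo "Python $py_version: too old, need >= 3.8"
    fi
else
    echo "python not found on PATH"
fi

# SQLite is optional; "sqlite3 --version" prints the version as its first field.
if command -v sqlite3 >/dev/null 2>&1; then
    sqlite_version=$(sqlite3 --version | awk '{print $1}')
    if version_ge "$sqlite_version" 3.24.0; then
        echo "SQLite $sqlite_version: OK"
    else
        echo "SQLite $sqlite_version: too old for local databases, need >= 3.24.0"
    fi
else
    echo "sqlite3 not found: no local databases, all data is fetched remotely"
fi
```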


ncbi-taxonomist has one dependency:

This is a library that I maintain and that relies solely on the Python standard library. Therefore, ncbi-taxonomist is less prone to dependency hell.


To report bugs and/or errors, please open an issue at or contact me at: Of course, feel free to fork the code, improve it, and/or open a pull request.