Container¶
ncbi-taxonomist comes with a Docker container and a Singularity image. Both include jq to facilitate JSON handling.
Both containers have the /dbs mountpoint to mount host directories, e.g. to use local databases.
Note
The commands shown here assume a current Linux system. Please adjust them to your system accordingly.
Docker¶
The Docker container can be found at https://gitlab.com/janpb/ncbi-taxonomist/container_registry/. Please check the Docker Docs if some commands are unclear.
- The Docker image creates a user named user, which runs all commands in the container
- The container has the mountpoint /dbs to bind host paths
Install¶
The latest ncbi-taxonomist Docker image can be pulled from registry.gitlab.com/janpb/ncbi-taxonomist:latest. It can be run with the command docker run registry.gitlab.com/janpb/ncbi-taxonomist.
If desired, the image can be tagged to a more concise tag name using docker tag registry.gitlab.com/janpb/ncbi-taxonomist ncbi-taxonomist.
$: docker pull registry.gitlab.com/janpb/ncbi-taxonomist:latest
latest: Pulling from janpb/ncbi-taxonomist
cbdbe7a5bc2a: Pull complete
50d9a3e26028: Pull complete
a0e2567dead0: Pull complete
#cut
$: docker tag registry.gitlab.com/janpb/ncbi-taxonomist:latest ncbi-taxonomist
$: docker images
ncbi-taxonomist latest f957b80d1034 22 hours ago 68.3MB
registry.gitlab.com/janpb/ncbi-taxonomist latest f957b80d1034 22 hours ago 68.3MB
The #cut line marks truncated output. The layer IDs, image ID, creation time, and size will likely look different on your system.
Test¶
Assuming the image is tagged ncbi-taxonomist, the following command should print the basic usage:
$: docker run --rm -it ncbi-taxonomist
usage: ncbi-taxonomist [--version] [-v] [--apikey APIKEY] {map,resolve,import,collect,subtree,group} ...
commands:
{map,resolve,import,collect,subtree,group}
map Map taxid to names and vice-versa
#cut
Basic usage¶
The examples assume the image has been tagged ncbi-taxonomist and show representative commands.
Mapping¶
$: docker run --rm -it ncbi-taxonomist map -t 9606
{"mode":"mapping","query":"9606","cast":"taxon","taxon":{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"}}
Resolving¶
$: docker run --rm -it ncbi-taxonomist resolve -t 2 -n 'Arabidopsis'
{"mode":"resolve","query":"Arabidopsis","cast":"taxon","taxon":{"taxid":3701,"rank":"genus","names":{"Arabidopsis":"scientific_name","Cardaminopsis":"Synonym"},"parentid":980083,"name":"Arabidopsis"},"lineage":[{"taxid":3701,"rank":"genus","names":{"Arabidopsis":"scientific_name","Cardaminopsis":"Synonym"},"parentid":980083,"name":"Arabidopsis"},{"taxid":980083,"rank":"tribe","names":{"Camelineae":"scientific_name"},"parentid":3700,"name":"Camelineae"},{"taxid":3700,"rank":"family","names":{"Brassicaceae":"scientific_name"},"parentid":3699,"name":"Brassicaceae"},{"taxid":3699,"rank":"order","names":{"Brassicales":"scientific_name"},"parentid":91836,"name":"Brassicales"},{"taxid":91836,"rank":"clade","names":{"malvids":"scientific_name"},"parentid":71275,"name":"malvids"},{"taxid":71275,"rank":"clade","names":{"rosids":"scientific_name"},"parentid":1437201,"name":"rosids"},{"taxid":1437201,"rank":"clade","names":{"Pentapetalae":"scientific_name"},"parentid":91827,"name":"Pentapetalae"},{"taxid":91827,"rank":"clade","names":{"Gunneridae":"scientific_name"},"parentid":71240,"name":"Gunneridae"},{"taxid":71240,"rank":"clade","names":{"eudicotyledons":"scientific_name"},"parentid":1437183,"name":"eudicotyledons"},{"taxid":1437183,"rank":"clade","names":{"Mesangiospermae":"scientific_name"},"parentid":3398,"name":"Mesangiospermae"},{"taxid":3398,"rank":"class","names":{"Magnoliopsida":"scientific_name"},"parentid":58024,"name":"Magnoliopsida"},{"taxid":58024,"rank":"clade","names":{"Spermatophyta":"scientific_name"},"parentid":78536,"name":"Spermatophyta"},{"taxid":78536,"rank":"clade","names":{"Euphyllophyta":"scientific_name"},"parentid":58023,"name":"Euphyllophyta"},{"taxid":58023,"rank":"clade","names":{"Tracheophyta":"scientific_name"},"parentid":3193,"name":"Tracheophyta"},{"taxid":3193,"rank":"clade","names":{"Embryophyta":"scientific_name"},"parentid":131221,"name":"Embryophyta"},{"taxid":131221,"rank":"subphylum","names":{"Streptophytina":"scientific_name"},"pare
ntid":35493,"name":"Streptophytina"},{"taxid":35493,"rank":"phylum","names":{"Streptophyta":"scientific_name"},"parentid":33090,"name":"Streptophyta"},{"taxid":33090,"rank":"kingdom","names":{"Viridiplantae":"scientific_name"},"parentid":2759,"name":"Viridiplantae"},{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
{"mode":"resolve","query":"2","cast":"taxon","taxon":{"taxid":2,"rank":"superkingdom","names":{"Bacteria":"scientific_name","eubacteria":"GenbankCommonName","bacteria":"BlastName","Monera":"Inpart","Procaryotae":"Inpart","Prokaryota":"Inpart","Prokaryotae":"Inpart","prokaryote":"Inpart","prokaryotes":"Inpart"},"parentid":131567,"name":"Bacteria"},"lineage":[{"taxid":2,"rank":"superkingdom","names":{"Bacteria":"scientific_name","eubacteria":"GenbankCommonName","bacteria":"BlastName","Monera":"Inpart","Procaryotae":"Inpart","Prokaryota":"Inpart","Prokaryotae":"Inpart","prokaryote":"Inpart","prokaryotes":"Inpart"},"parentid":131567,"name":"Bacteria"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
Pipelines¶
$: docker run --rm -i ncbi-taxonomist map -edb bioproject -a PRJNA604394 | \
docker run --rm -i ncbi-taxonomist resolve -m
{"mode":"resolve","query":"PRJNA604394","cast":"accs","accs":{"taxid":573,"accessions":{"project_id":604394,"project_acc":"PRJNA604394","project_name":"Klebsiella pneumoniae strain:S01"},"db":"bioproject","uid":604394},"lineage":[{"taxid":573,"rank":"species","names":{"Klebsiella pneumoniae":"scientific_name","'Klebsiella aerogenes' (Kruse) Taylor et al. 1956":"Synonym","Bacillus pneumoniae":"Synonym","Bacterium pneumoniae crouposae":"Synonym","Hyalococcus pneumoniae":"Synonym","Klebsiella pneumoniae aerogenes":"Synonym","Klebsiella sp. 2N3":"Includes","Klebsiella sp. C1(2016)":"Includes","Klebsiella sp. M-AI-2":"Includes","Klebsiella sp. PB12":"Includes","Klebsiella sp. RCE-7":"Includes","ATCC 13883":"type material","ATCC:13883":"type material","BCCM/LMG:2095":"type material","CCUG 225":"type material","CCUG:225":"type material","CDC 298-53":"type material","CDC:298-53":"type material","CIP 82.91":"type material","CIP:82.91":"type material","DSM 30104":"type material","DSM:30104":"type material","HAMBI 450":"type material","HAMBI:450":"type material","IAM 14200":"type material","IAM:14200":"type material","IFO 14940":"type material","IFO:14940":"type material","JCM 1662":"type material","JCM:1662":"type material","LMG 2095":"type material","LMG:2095":"type material","NBRC 14940":"type material","NBRC:14940":"type material","NCTC 9633":"type material","NCTC:9633":"type material"},"parentid":570,"name":"Klebsiella 
pneumoniae"},{"taxid":570,"rank":"genus","names":{"Klebsiella":"scientific_name"},"parentid":543,"name":"Klebsiella"},{"taxid":543,"rank":"family","names":{"Enterobacteriaceae":"scientific_name"},"parentid":91347,"name":"Enterobacteriaceae"},{"taxid":91347,"rank":"order","names":{"Enterobacterales":"scientific_name"},"parentid":1236,"name":"Enterobacterales"},{"taxid":1236,"rank":"class","names":{"Gammaproteobacteria":"scientific_name"},"parentid":1224,"name":"Gammaproteobacteria"},{"taxid":1224,"rank":"phylum","names":{"Proteobacteria":"scientific_name"},"parentid":2,"name":"Proteobacteria"},{"taxid":2,"rank":"superkingdom","names":{"Bacteria":"scientific_name"},"parentid":131567,"name":"Bacteria"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
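The interactive examples above pass -it, while the piped commands pass only -i. A likely reason is that allocating a pseudo-terminal (-t) does not combine well with pipes. The following sketch wraps that choice in a shell function. It assumes a POSIX shell and an image tagged ncbi-taxonomist; the function name taxonomist and the DRY_RUN variable are hypothetical and not part of ncbi-taxonomist.

```shell
# Hypothetical convenience wrapper; not shipped with ncbi-taxonomist.
# Assumes the image has been tagged "ncbi-taxonomist".
taxonomist() {
  # Request a pseudo-terminal only when stdin and stdout are both terminals,
  # so the same call works interactively and inside a pipeline.
  if [ -t 0 ] && [ -t 1 ]; then
    tty_flags="-it"
  else
    tty_flags="-i"
  fi
  # Prepend the docker arguments to the caller's ncbi-taxonomist arguments.
  set -- docker run --rm "$tty_flags" ncbi-taxonomist "$@"
  if [ -n "${DRY_RUN:-}" ]; then
    # DRY_RUN is an assumption of this sketch: print instead of execute.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage (requires Docker and network access):
#   taxonomist map -edb bioproject -a PRJNA604394 | taxonomist resolve -m
```

Setting DRY_RUN=1 before a call prints the docker command that would be run, which can help when adapting the flags to another system.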
Local database¶
To use local databases with the ncbi-taxonomist Docker container, the path on the host machine needs to be bound to the container’s internal mountpoint /dbs.
To have the proper permissions, the --user argument needs to be set when writing to a local database. On Linux, the required user and group IDs can be obtained via the id command (Listing 2).
$: ls ${PWD}
#empty
$: docker run --rm -i ncbi-taxonomist collect -t 9606 | \
docker run --rm -i --user $(id -u):$(id -g) -v ${PWD}:/dbs ncbi-taxonomist import -db /dbs/dockertaxa.db
{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}
{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"}
{"taxid":33154,"rank":"clade","names":{"Opisthokonta":"scientific_name"},"parentid":2759,"name":"Opisthokonta"}
{"taxid":33208,"rank":"kingdom","names":{"Metazoa":"scientific_name"},"parentid":33154,"name":"Metazoa"}
{"taxid":6072,"rank":"clade","names":{"Eumetazoa":"scientific_name"},"parentid":33208,"name":"Eumetazoa"}
{"taxid":33213,"rank":"clade","names":{"Bilateria":"scientific_name"},"parentid":6072,"name":"Bilateria"}
{"taxid":33511,"rank":"clade","names":{"Deuterostomia":"scientific_name"},"parentid":33213,"name":"Deuterostomia"}
{"taxid":7711,"rank":"phylum","names":{"Chordata":"scientific_name"},"parentid":33511,"name":"Chordata"}
{"taxid":89593,"rank":"subphylum","names":{"Craniata":"scientific_name"},"parentid":7711,"name":"Craniata"}
#cut
$: ls ${PWD}
dockertaxa.db
$: docker run --rm -i -v ${PWD}:/dbs ncbi-taxonomist resolve -t 9606 -db /dbs/dockertaxa.db
{"mode":"resolve","query":"9606","cast":"taxon","taxon":{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"},"lineage":[{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"},{"taxid":9605,"rank":"genus","names":{"Homo":"scientific_name"},"parentid":207598,"name":"Homo"},{"taxid":207598,"rank":"subfamily","names":{"Homininae":"scientific_name"},"parentid":9604,"name":"Homininae"},{"taxid":9604,"rank":"family","names":{"Hominidae":"scientific_name"},"parentid":314295,"name":"Hominidae"},{"taxid":314295,"rank":"superfamily","names":{"Hominoidea":"scientific_name"},"parentid":9526,"name":"Hominoidea"},{"taxid":9526,"rank":"parvorder","names":{"Catarrhini":"scientific_name"},"parentid":314293,"name":"Catarrhini"},{"taxid":314293,"rank":"infraorder","names":{"Simiiformes":"scientific_name"},"parentid":376913,"name":"Simiiformes"},{"taxid":376913,"rank":"suborder","names":{"Haplorrhini":"scientific_name"},"parentid":9443,"name":"Haplorrhini"},{"taxid":9443,"rank":"order","names":{"Primates":"scientific_name"},"parentid":314146,"name":"Primates"},{"taxid":314146,"rank":"superorder","names":{"Euarchontoglires":"scientific_name"},"parentid":1437010,"name":"Euarchontoglires"},{"taxid":1437010,"rank":"clade","names":{"Boreoeutheria":"scientific_name"},"parentid":9347,"name":"Boreoeutheria"},{"taxid":9347,"rank":"clade","names":{"Eutheria":"scientific_name"},"parentid":32525,"name":"Eutheria"},{"taxid":32525,"rank":"clade","names":{"Theria":"scientific_name"},"parentid":40674,"name":"Theria"},{"taxid":40674,"rank":"class","names":{"Mammalia":"scientific_name"},"parentid":32524,"name":"Mammalia"},{"taxid":32524,"rank":"clade","names":{"Amniota":"scientific_name"},"parentid":32523,"name":"Amniota"},{"taxid":32523,"rank":"clade","names":{"Tetrapoda":"scientific_name"},"par
entid":1338369,"name":"Tetrapoda"},{"taxid":1338369,"rank":"clade","names":{"Dipnotetrapodomorpha":"scientific_name"},"parentid":8287,"name":"Dipnotetrapodomorpha"},{"taxid":8287,"rank":"superclass","names":{"Sarcopterygii":"scientific_name"},"parentid":117571,"name":"Sarcopterygii"},{"taxid":117571,"rank":"clade","names":{"Euteleostomi":"scientific_name"},"parentid":117570,"name":"Euteleostomi"},{"taxid":117570,"rank":"clade","names":{"Teleostomi":"scientific_name"},"parentid":7776,"name":"Teleostomi"},{"taxid":7776,"rank":"clade","names":{"Gnathostomata":"scientific_name"},"parentid":7742,"name":"Gnathostomata"},{"taxid":7742,"rank":"clade","names":{"Vertebrata":"scientific_name"},"parentid":89593,"name":"Vertebrata"},{"taxid":89593,"rank":"subphylum","names":{"Craniata":"scientific_name"},"parentid":7711,"name":"Craniata"},{"taxid":7711,"rank":"phylum","names":{"Chordata":"scientific_name"},"parentid":33511,"name":"Chordata"},{"taxid":33511,"rank":"clade","names":{"Deuterostomia":"scientific_name"},"parentid":33213,"name":"Deuterostomia"},{"taxid":33213,"rank":"clade","names":{"Bilateria":"scientific_name"},"parentid":6072,"name":"Bilateria"},{"taxid":6072,"rank":"clade","names":{"Eumetazoa":"scientific_name"},"parentid":33208,"name":"Eumetazoa"},{"taxid":33208,"rank":"kingdom","names":{"Metazoa":"scientific_name"},"parentid":33154,"name":"Metazoa"},{"taxid":33154,"rank":"clade","names":{"Opisthokonta":"scientific_name"},"parentid":2759,"name":"Opisthokonta"},{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
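The bind and permission flags from Listing 2 can be bundled in a small shell function so they are not repeated on every call. This is a minimal sketch under stated assumptions: a POSIX shell, an image tagged ncbi-taxonomist, and the hypothetical names taxonomist_db and DRY_RUN, which are not part of ncbi-taxonomist.

```shell
# Hypothetical helper; not shipped with ncbi-taxonomist.
# First argument: host directory to bind to /dbs.
# Remaining arguments: passed to ncbi-taxonomist inside the container.
taxonomist_db() {
  hostdir=$1
  shift
  # Run as the calling user so files written to /dbs stay owned by that user.
  set -- docker run --rm -i \
    --user "$(id -u):$(id -g)" \
    -v "${hostdir}:/dbs" \
    ncbi-taxonomist "$@"
  if [ -n "${DRY_RUN:-}" ]; then
    # DRY_RUN is an assumption of this sketch: print instead of execute.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage (requires Docker):
#   taxonomist_db "${PWD}" resolve -t 9606 -db /dbs/dockertaxa.db
```

Database paths given to -db still refer to the container side of the bind, i.e. they start with /dbs.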
Docker ncbi-taxonomist and jq¶
To use the included jq, Docker’s run command has to be adjusted with the --entrypoint argument (Listing 3).
$: docker run --rm -i ncbi-taxonomist map -a QZWG01000002.1 MG831203 | \
docker run --rm -i ncbi-taxonomist resolve --mapping | \
docker run --rm -i --entrypoint 'jq' ncbi-taxonomist -r '[.query, .lineage[].name]|@tsv'
MG831203 Deformed wing virus Iflavirus Iflaviridae Picornavirales Pisoniviricetes Pisuviricota Orthornavirae Riboviria Viruses
QZWG01000002.1 Glycine soja Glycine subgen. Soja Glycine Phaseoleae indigoferoid/millettioid clade NPAAA clade 50 kb inversion clade Papilionoideae Fabaceae Fabales fabids rosids Pentapetalae Gunneridae eudicotyledons Mesangiospermae Magnoliopsida Spermatophyta Euphyllophyta Tracheophyta Embryophyta Streptophytina Streptophyta Viridiplantae Eukaryota cellular organisms
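The jq filter used above collects the query and the name of every lineage entry into one array and renders it as a tab-separated row with @tsv. The sketch below applies the same filter to a shortened, hand-written record so the effect can be tried without network access. The record is a hypothetical sample that only mimics the shape of the resolve output, and lineage_tsv and JQ_CMD are illustrative names, not part of ncbi-taxonomist.

```shell
# Hypothetical sample with the same shape as a `resolve` result,
# shortened to two lineage entries.
record='{"query":"2","lineage":[{"name":"Bacteria","rank":"superkingdom"},{"name":"cellular organisms","rank":"no rank"}]}'

# Reads resolve results (one JSON object per line) from stdin and prints
# one tab-separated row per result: the query followed by the lineage names.
lineage_tsv() {
  # JQ_CMD is an assumption of this sketch. It defaults to a local jq and is
  # left unquoted on purpose so that a multi-word command is split into words.
  ${JQ_CMD:-jq} -r '[.query, .lineage[].name]|@tsv'
}

# Usage with a local jq:
#   printf '%s\n' "$record" | lineage_tsv
# Usage with the jq included in the container (requires Docker):
#   JQ_CMD="docker run --rm -i --entrypoint jq ncbi-taxonomist"
#   printf '%s\n' "$record" | lineage_tsv
```

For the sample record, the row consists of 2, Bacteria, and cellular organisms, separated by tabs.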
Singularity¶
The Singularity container can be found at https://cloud.sylabs.io/library/jpb/ncbi-taxonomist/ncbi-taxonomist. Please check the Singularity Docs if some commands are unclear.
- The Singularity image creates a user named user, which runs all commands in the container
- The container has the mountpoint /dbs to bind host paths
Install¶
The latest ncbi-taxonomist Singularity image can be pulled from https://cloud.sylabs.io/library/jpb/ncbi-taxonomist/ncbi-taxonomist using the command singularity pull library://jpb/ncbi-taxonomist/ncbi-taxonomist.
If desired, the image can be renamed to a more concise name.
$: singularity pull library://jpb/ncbi-taxonomist/ncbi-taxonomist
INFO: Downloading library image
23.7MiB / 23.7MiB [==============================================================================] 100 % 545.9 KiB/s 0s
$: mv ncbi-taxonomist_latest.sif ncbi-taxonomist.sif
The download progress line will likely look different on your system.
Build¶
The Singularity container can be built using the definition file container/SINGULARITY.def present in the repository.
For more Singularity build options, check the corresponding man page (man singularity build) or the documentation.
To build locally, you need root permissions or have to use the --remote option of the build command (Listing 4):
$: singularity build ncbi-taxonomist.sif SINGULARITY.def
$: singularity build --remote ncbi-taxonomist.sif SINGULARITY.def
Test¶
Assuming the image is named ncbi-taxonomist.sif, invoking it without arguments shows the basic usage and indicates a successful install (Listing 5):
$: ./ncbi-taxonomist.sif
usage: ncbi-taxonomist [--version] [-v] [--apikey APIKEY] {map,resolve,import,collect,subtree,group} ...
commands:
{map,resolve,import,collect,subtree,group}
map Map taxid to names and vice-versa
#cut
Basic usage¶
The examples assume the image is named ncbi-taxonomist.sif and show representative commands. The image can be used as an executable, i.e. it can be invoked as ./ncbi-taxonomist.sif. This corresponds to the command singularity run ncbi-taxonomist.sif. Listing 6 shows how to use both commands.
Mapping¶
$: ./ncbi-taxonomist.sif map -t 9606
{"mode":"mapping","query":"9606","cast":"taxon","taxon":{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"}}
Resolving¶
$: ./ncbi-taxonomist.sif resolve -t 2 -n 'Arabidopsis'
{"mode":"resolve","query":"Arabidopsis","cast":"taxon","taxon":{"taxid":3701,"rank":"genus","names":{"Arabidopsis":"scientific_name","Cardaminopsis":"Synonym"},"parentid":980083,"name":"Arabidopsis"},"lineage":[{"taxid":3701,"rank":"genus","names":{"Arabidopsis":"scientific_name","Cardaminopsis":"Synonym"},"parentid":980083,"name":"Arabidopsis"},{"taxid":980083,"rank":"tribe","names":{"Camelineae":"scientific_name"},"parentid":3700,"name":"Camelineae"},{"taxid":3700,"rank":"family","names":{"Brassicaceae":"scientific_name"},"parentid":3699,"name":"Brassicaceae"},{"taxid":3699,"rank":"order","names":{"Brassicales":"scientific_name"},"parentid":91836,"name":"Brassicales"},{"taxid":91836,"rank":"clade","names":{"malvids":"scientific_name"},"parentid":71275,"name":"malvids"},{"taxid":71275,"rank":"clade","names":{"rosids":"scientific_name"},"parentid":1437201,"name":"rosids"},{"taxid":1437201,"rank":"clade","names":{"Pentapetalae":"scientific_name"},"parentid":91827,"name":"Pentapetalae"},{"taxid":91827,"rank":"clade","names":{"Gunneridae":"scientific_name"},"parentid":71240,"name":"Gunneridae"},{"taxid":71240,"rank":"clade","names":{"eudicotyledons":"scientific_name"},"parentid":1437183,"name":"eudicotyledons"},{"taxid":1437183,"rank":"clade","names":{"Mesangiospermae":"scientific_name"},"parentid":3398,"name":"Mesangiospermae"},{"taxid":3398,"rank":"class","names":{"Magnoliopsida":"scientific_name"},"parentid":58024,"name":"Magnoliopsida"},{"taxid":58024,"rank":"clade","names":{"Spermatophyta":"scientific_name"},"parentid":78536,"name":"Spermatophyta"},{"taxid":78536,"rank":"clade","names":{"Euphyllophyta":"scientific_name"},"parentid":58023,"name":"Euphyllophyta"},{"taxid":58023,"rank":"clade","names":{"Tracheophyta":"scientific_name"},"parentid":3193,"name":"Tracheophyta"},{"taxid":3193,"rank":"clade","names":{"Embryophyta":"scientific_name"},"parentid":131221,"name":"Embryophyta"},{"taxid":131221,"rank":"subphylum","names":{"Streptophytina":"scientific_name"},"pare
ntid":35493,"name":"Streptophytina"},{"taxid":35493,"rank":"phylum","names":{"Streptophyta":"scientific_name"},"parentid":33090,"name":"Streptophyta"},{"taxid":33090,"rank":"kingdom","names":{"Viridiplantae":"scientific_name"},"parentid":2759,"name":"Viridiplantae"},{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
{"mode":"resolve","query":"2","cast":"taxon","taxon":{"taxid":2,"rank":"superkingdom","names":{"Bacteria":"scientific_name","eubacteria":"GenbankCommonName","bacteria":"BlastName","Monera":"Inpart","Procaryotae":"Inpart","Prokaryota":"Inpart","Prokaryotae":"Inpart","prokaryote":"Inpart","prokaryotes":"Inpart"},"parentid":131567,"name":"Bacteria"},"lineage":[{"taxid":2,"rank":"superkingdom","names":{"Bacteria":"scientific_name","eubacteria":"GenbankCommonName","bacteria":"BlastName","Monera":"Inpart","Procaryotae":"Inpart","Prokaryota":"Inpart","Prokaryotae":"Inpart","prokaryote":"Inpart","prokaryotes":"Inpart"},"parentid":131567,"name":"Bacteria"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
Pipelines¶
$: ./ncbi-taxonomist.sif map -edb bioproject -a PRJNA604394 | \
./ncbi-taxonomist.sif resolve -m
{"mode":"resolve","query":"PRJNA604394","cast":"accs","accs":{"taxid":573,"accessions":{"project_id":604394,"project_acc":"PRJNA604394","project_name":"Klebsiella pneumoniae strain:S01"},"db":"bioproject","uid":604394},"lineage":[{"taxid":573,"rank":"species","names":{"Klebsiella pneumoniae":"scientific_name","'Klebsiella aerogenes' (Kruse) Taylor et al. 1956":"Synonym","Bacillus pneumoniae":"Synonym","Bacterium pneumoniae crouposae":"Synonym","Hyalococcus pneumoniae":"Synonym","Klebsiella pneumoniae aerogenes":"Synonym","Klebsiella sp. 2N3":"Includes","Klebsiella sp. C1(2016)":"Includes","Klebsiella sp. M-AI-2":"Includes","Klebsiella sp. PB12":"Includes","Klebsiella sp. RCE-7":"Includes","ATCC 13883":"type material","ATCC:13883":"type material","BCCM/LMG:2095":"type material","CCUG 225":"type material","CCUG:225":"type material","CDC 298-53":"type material","CDC:298-53":"type material","CIP 82.91":"type material","CIP:82.91":"type material","DSM 30104":"type material","DSM:30104":"type material","HAMBI 450":"type material","HAMBI:450":"type material","IAM 14200":"type material","IAM:14200":"type material","IFO 14940":"type material","IFO:14940":"type material","JCM 1662":"type material","JCM:1662":"type material","LMG 2095":"type material","LMG:2095":"type material","NBRC 14940":"type material","NBRC:14940":"type material","NCTC 9633":"type material","NCTC:9633":"type material"},"parentid":570,"name":"Klebsiella 
pneumoniae"},{"taxid":570,"rank":"genus","names":{"Klebsiella":"scientific_name"},"parentid":543,"name":"Klebsiella"},{"taxid":543,"rank":"family","names":{"Enterobacteriaceae":"scientific_name"},"parentid":91347,"name":"Enterobacteriaceae"},{"taxid":91347,"rank":"order","names":{"Enterobacterales":"scientific_name"},"parentid":1236,"name":"Enterobacterales"},{"taxid":1236,"rank":"class","names":{"Gammaproteobacteria":"scientific_name"},"parentid":1224,"name":"Gammaproteobacteria"},{"taxid":1224,"rank":"phylum","names":{"Proteobacteria":"scientific_name"},"parentid":2,"name":"Proteobacteria"},{"taxid":2,"rank":"superkingdom","names":{"Bacteria":"scientific_name"},"parentid":131567,"name":"Bacteria"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
Local database¶
To use local databases with the ncbi-taxonomist Singularity container, the path on the host machine needs to be bound to the container’s internal mountpoint /dbs via the --bind option, which cannot be used with the executable form (Listing 6). However, the bind options can be stored in the environment variable SINGULARITY_BIND (Listing 7).
$: ls ${PWD}
#empty
$: ./ncbi-taxonomist.sif collect -t 9606 | \
singularity run --bind ${PWD}:/dbs ncbi-taxonomist.sif import -db /dbs/simgtaxa.db
{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}
{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"}
{"taxid":33154,"rank":"clade","names":{"Opisthokonta":"scientific_name"},"parentid":2759,"name":"Opisthokonta"}
{"taxid":33208,"rank":"kingdom","names":{"Metazoa":"scientific_name"},"parentid":33154,"name":"Metazoa"}
{"taxid":6072,"rank":"clade","names":{"Eumetazoa":"scientific_name"},"parentid":33208,"name":"Eumetazoa"}
{"taxid":33213,"rank":"clade","names":{"Bilateria":"scientific_name"},"parentid":6072,"name":"Bilateria"}
{"taxid":33511,"rank":"clade","names":{"Deuterostomia":"scientific_name"},"parentid":33213,"name":"Deuterostomia"}
{"taxid":7711,"rank":"phylum","names":{"Chordata":"scientific_name"},"parentid":33511,"name":"Chordata"}
{"taxid":89593,"rank":"subphylum","names":{"Craniata":"scientific_name"},"parentid":7711,"name":"Craniata"}
#cut
$: ls ${PWD}
simgtaxa.db
$: singularity run --bind ${PWD}:/dbs ncbi-taxonomist.sif resolve -t 9606 -db /dbs/simgtaxa.db
{"mode":"resolve","query":"9606","cast":"taxon","taxon":{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"},"lineage":[{"taxid":9606,"rank":"species","names":{"Homo sapiens":"scientific_name","human":"GenbankCommonName","man":"CommonName"},"parentid":9605,"name":"Homo sapiens"},{"taxid":9605,"rank":"genus","names":{"Homo":"scientific_name"},"parentid":207598,"name":"Homo"},{"taxid":207598,"rank":"subfamily","names":{"Homininae":"scientific_name"},"parentid":9604,"name":"Homininae"},{"taxid":9604,"rank":"family","names":{"Hominidae":"scientific_name"},"parentid":314295,"name":"Hominidae"},{"taxid":314295,"rank":"superfamily","names":{"Hominoidea":"scientific_name"},"parentid":9526,"name":"Hominoidea"},{"taxid":9526,"rank":"parvorder","names":{"Catarrhini":"scientific_name"},"parentid":314293,"name":"Catarrhini"},{"taxid":314293,"rank":"infraorder","names":{"Simiiformes":"scientific_name"},"parentid":376913,"name":"Simiiformes"},{"taxid":376913,"rank":"suborder","names":{"Haplorrhini":"scientific_name"},"parentid":9443,"name":"Haplorrhini"},{"taxid":9443,"rank":"order","names":{"Primates":"scientific_name"},"parentid":314146,"name":"Primates"},{"taxid":314146,"rank":"superorder","names":{"Euarchontoglires":"scientific_name"},"parentid":1437010,"name":"Euarchontoglires"},{"taxid":1437010,"rank":"clade","names":{"Boreoeutheria":"scientific_name"},"parentid":9347,"name":"Boreoeutheria"},{"taxid":9347,"rank":"clade","names":{"Eutheria":"scientific_name"},"parentid":32525,"name":"Eutheria"},{"taxid":32525,"rank":"clade","names":{"Theria":"scientific_name"},"parentid":40674,"name":"Theria"},{"taxid":40674,"rank":"class","names":{"Mammalia":"scientific_name"},"parentid":32524,"name":"Mammalia"},{"taxid":32524,"rank":"clade","names":{"Amniota":"scientific_name"},"parentid":32523,"name":"Amniota"},{"taxid":32523,"rank":"clade","names":{"Tetrapoda":"scientific_name"},"par
entid":1338369,"name":"Tetrapoda"},{"taxid":1338369,"rank":"clade","names":{"Dipnotetrapodomorpha":"scientific_name"},"parentid":8287,"name":"Dipnotetrapodomorpha"},{"taxid":8287,"rank":"superclass","names":{"Sarcopterygii":"scientific_name"},"parentid":117571,"name":"Sarcopterygii"},{"taxid":117571,"rank":"clade","names":{"Euteleostomi":"scientific_name"},"parentid":117570,"name":"Euteleostomi"},{"taxid":117570,"rank":"clade","names":{"Teleostomi":"scientific_name"},"parentid":7776,"name":"Teleostomi"},{"taxid":7776,"rank":"clade","names":{"Gnathostomata":"scientific_name"},"parentid":7742,"name":"Gnathostomata"},{"taxid":7742,"rank":"clade","names":{"Vertebrata":"scientific_name"},"parentid":89593,"name":"Vertebrata"},{"taxid":89593,"rank":"subphylum","names":{"Craniata":"scientific_name"},"parentid":7711,"name":"Craniata"},{"taxid":7711,"rank":"phylum","names":{"Chordata":"scientific_name"},"parentid":33511,"name":"Chordata"},{"taxid":33511,"rank":"clade","names":{"Deuterostomia":"scientific_name"},"parentid":33213,"name":"Deuterostomia"},{"taxid":33213,"rank":"clade","names":{"Bilateria":"scientific_name"},"parentid":6072,"name":"Bilateria"},{"taxid":6072,"rank":"clade","names":{"Eumetazoa":"scientific_name"},"parentid":33208,"name":"Eumetazoa"},{"taxid":33208,"rank":"kingdom","names":{"Metazoa":"scientific_name"},"parentid":33154,"name":"Metazoa"},{"taxid":33154,"rank":"clade","names":{"Opisthokonta":"scientific_name"},"parentid":2759,"name":"Opisthokonta"},{"taxid":2759,"rank":"superkingdom","names":{"Eukaryota":"scientific_name"},"parentid":131567,"name":"Eukaryota"},{"taxid":131567,"rank":"no rank","names":{"cellular organisms":"scientific_name"},"parentid":null,"name":"cellular organisms"}]}
$: export SINGULARITY_BIND="${PWD}:/dbs"
$: echo $SINGULARITY_BIND
/path/to/your/current/working/directory:/dbs
$: ./ncbi-taxonomist.sif collect -t 9606 | \
./ncbi-taxonomist.sif import -db /dbs/simgtaxa.db
#result
$: ls ${PWD}
simgtaxa.db
$: ./ncbi-taxonomist.sif resolve -t 9606 -db /dbs/simgtaxa.db
#result
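Because the executable form relies on SINGULARITY_BIND being set correctly, it can help to set the variable through a small function that first checks the host directory. The following is only a sketch assuming a POSIX shell; the function name bind_dbs is hypothetical and not part of ncbi-taxonomist or Singularity.

```shell
# Hypothetical helper; not shipped with ncbi-taxonomist or Singularity.
# Binds the given host directory (default: current directory) to /dbs
# for all subsequent Singularity calls in this shell.
bind_dbs() {
  hostdir=${1:-$PWD}
  if [ ! -d "$hostdir" ]; then
    echo "bind_dbs: not a directory: $hostdir" >&2
    return 1
  fi
  SINGULARITY_BIND="${hostdir}:/dbs"
  export SINGULARITY_BIND
}

# Usage (requires Singularity and the image ncbi-taxonomist.sif):
#   bind_dbs "${PWD}"
#   ./ncbi-taxonomist.sif resolve -t 9606 -db /dbs/simgtaxa.db
```

The function has to be defined in, or sourced into, the current shell, since an exported variable set in a child process would not reach later commands.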
Singularity ncbi-taxonomist and jq¶
To use the included jq with the Singularity container, the run command has to be used in conjunction with the --app option:
$: singularity run --app jq ncbi-taxonomist.sif
#jq usage
$: ./ncbi-taxonomist.sif map -a QZWG01000002.1 MG831203 | \
./ncbi-taxonomist.sif resolve --mapping | \
singularity run --app jq ncbi-taxonomist.sif -r '[.query, .lineage[].name]|@tsv'
MG831203 Deformed wing virus Iflavirus Iflaviridae Picornavirales Pisoniviricetes Pisuviricota Orthornavirae Riboviria Viruses
QZWG01000002.1 Glycine soja Glycine subgen. Soja Glycine Phaseoleae indigoferoid/millettioid clade NPAAA clade 50 kb inversion clade Papilionoideae Fabaceae Fabales fabids rosids Pentapetalae Gunneridae eudicotyledons Mesangiospermae Magnoliopsida Spermatophyta Euphyllophyta Tracheophyta Embryophyta Streptophytina Streptophyta Viridiplantae Eukaryota cellular organisms