v0.8.0 - 2025-09-10
No changes to the index format (see Index format changelog).
- New commands:
lexicmap utils merge-search-results
: Merge a query's search results from multiple indexes.
lexicmap utils edit-genome-ids
: Edit genome IDs in the index via a regular expression.
This is helpful when users forgot to use the flag -N/--ref-name-regexp
to extract the genome ID from the sequence file during indexing.
This command helps fix that without rebuilding the index.
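As a rough illustration of the kind of edit this enables, the Python sketch below rewrites stored genome IDs with a regular expression. The pattern, the example IDs, and the function name are hypothetical and chosen only for illustration; the sketch does not show the command's actual flags or implementation.

```python
import re

# Hypothetical pattern: capture an assembly accession such as GCF_000005845.2
# from a longer ID that was stored because no ID-extracting regex was given.
ACCESSION = re.compile(r"^(GC[AF]_\d{9}\.\d+)")


def edit_genome_id(genome_id, pattern=ACCESSION):
    """Return the first capture group if the pattern matches, else the ID unchanged."""
    match = pattern.search(genome_id)
    return match.group(1) if match else genome_id


# Hypothetical IDs as they might have been stored at indexing time.
stored_ids = [
    "GCF_000005845.2_ASM584v2_genomic",
    "GCA_000006945.2_ASM694v2_genomic",
    "custom_genome_01",
]
edited_ids = [edit_genome_id(g) for g in stored_ids]
```

IDs that do not match the pattern are left untouched in this sketch, which keeps the edit safe for mixed naming schemes.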
lexicmap index
: - Significantly reduce memory usage (by up to 25%) in the merge step.
    Memory usage is also somewhat reduced for huge datasets, such as long reads or contigs in the Logan project.
lexicmap search
: - Reduce memory usage, particularly for batch searching (by up to 50%).
  - Improve search speed, mainly for batch searching.
  - Support limiting the search by TaxId(s) via -t/--taxids or --taxid-file.
    Only genomes whose TaxIds are the specified ones or their descendants are searched,
    similar to BLAST+ 2.15.0 and later.
    Negative values are allowed and act as a blacklist.
    For example, -t 543,-561 searches all genera of the family Enterobacteriaceae (543) except Escherichia (561).
    Users only need to provide NCBI-format taxdump files (-T/--taxdump;
    these can also be created from any taxonomy data with TaxonKit)
    and a genome-ID-to-TaxId mapping file (-G/--genome2taxid).
    There's no need to rebuild the index.
  - Check whether the output file and the log file are the same file.
  - Reduce the time of seed matching when using -w.
  - Change the default value of --max-query-conc from 12 to 8.
  - New flag --gc-interval (default 64, 0 to disable) for forcing garbage collection every N queries. This considerably decreases memory usage.
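The TaxId filter described for lexicmap search can be pictured with a small Python sketch. This is only an illustration of the stated semantics (keep a genome if its TaxId equals or descends from a positive value, drop it if it equals or descends from a negative one), not LexicMap's implementation. The parent map is a hand-written toy fragment with intermediate ranks omitted, the genome IDs are hypothetical, and the sketch assumes at least one positive TaxId is given.

```python
def parse_taxids(arg):
    """Split a value like '543,-561' into (include, exclude) sets of TaxIds."""
    include, exclude = set(), set()
    for field in arg.split(","):
        value = int(field)
        (exclude if value < 0 else include).add(abs(value))
    return include, exclude


def lineage(taxid, parent):
    """Yield taxid and then its ancestors, following a child -> parent map."""
    seen = set()
    while taxid not in seen:
        seen.add(taxid)
        yield taxid
        # The root maps to itself; unknown TaxIds end the walk.
        taxid = parent.get(taxid, taxid)


def is_searched(genome_taxid, include, exclude, parent):
    """Apply the blacklist first, then require a whitelisted ancestor (or self)."""
    ancestors = set(lineage(genome_taxid, parent))
    if ancestors & exclude:
        return False
    return bool(ancestors & include)


# Toy child -> parent map (assumption: simplified, intermediate ranks omitted).
parent = {
    562: 561,      # Escherichia coli -> Escherichia
    561: 543,      # Escherichia -> Enterobacteriaceae
    28901: 590,    # Salmonella enterica -> Salmonella
    590: 543,      # Salmonella -> Enterobacteriaceae
    543: 1,        # Enterobacteriaceae -> root (shortened)
    287: 286,      # Pseudomonas aeruginosa -> Pseudomonas
    286: 1,        # Pseudomonas -> root (shortened)
    1: 1,          # root
}

# Hypothetical genome-ID-to-TaxId mapping.
genome2taxid = {"ecoli_genome": 562, "salmonella_genome": 28901, "pseudomonas_genome": 287}

include, exclude = parse_taxids("543,-561")
searched = [g for g, t in genome2taxid.items() if is_searched(t, include, exclude, parent)]
```

With the value 543,-561, only the Salmonella genome passes in this toy data: the E. coli genome falls under the blacklisted genus, and the Pseudomonas genome is outside the family.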
lexicmap utils subseq
: - Accept the output file of lexicmap search as the input,
    so one can extract matched sequences (including flanking regions) from the index after alignment with lexicmap search,
    with or without the flag -a/--all.
  - Support extending aligned regions with -U/--upstream and/or -D/--downstream.
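Extending an aligned region with upstream and downstream flanks amounts to simple coordinate arithmetic, sketched below in Python. The function name is hypothetical, and the sketch rests on assumptions that are not stated in these notes: coordinates are 1-based and inclusive, the extended region is clamped to the sequence bounds, and on the minus strand "upstream" lies at higher coordinates.

```python
def extend_region(start, end, seq_len, upstream=0, downstream=0, strand="+"):
    """Extend a 1-based, inclusive region and clamp it to [1, seq_len]."""
    if strand == "-":
        # Assumption: on the minus strand, upstream lies at higher coordinates.
        upstream, downstream = downstream, upstream
    return max(1, start - upstream), min(seq_len, end + downstream)


plus_hit = extend_region(101, 200, 1000, upstream=50, downstream=20)
clamped_hit = extend_region(10, 200, 210, upstream=50, downstream=20)
minus_hit = extend_region(101, 200, 1000, upstream=50, downstream=20, strand="-")
```

Clamping matters for hits near the ends of a contig, where the requested flank would otherwise run past the sequence.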