Changelog

Version 2.3.1 (2026-06-10)

Fixed FASTQ read name fetching error
Fixed seq byte length error when reading sequences > 2 GB

Version 2.3.0 (2026-01-16)

Added support for Python 3.14
Fixed fasta sequence reading error
Fixed fastq sequence reading error
Improved performance for calculating gc content

Version 2.2.0 (2024-12-24)

Added support for Python 3.13
Fixed sub sequence error for non-standard fasta
Upgraded the zlib to 1.3.1 and sqlite 3.47.2

Version 2.1.0 (2024-02-28)

Added support for Python 3.12
Fixed fasta sequence composition error
Fixed fastq continuous reading error

Version 2.0.2 (2023-11-25)

Fixed subsequence return None error

Version 2.0.1 (2023-09-18)

Speedup the gzip index writing to index file

Version 2.0.0 (2023-09-05)

Added support for file name with wide char
Added support for specifying index file path
Added support for more characters in DNA sequence
Added reverse complement function for DNA conversion
Improved the performance of kseq library
Optimized gzip index importing and saving without temp file
Fixed segmentation fault when using sequence composition
Fixed memory leak in Fastq read quality integer
Fixed zlib download url broken error when building

Version 1.1.0 (2023-04-19)

Fixed unicode error when reading fastq file

Version 1.0.1 (2023-03-28)

Fixed invalid uppercase when iterating fastx

Version 1.0.0 (2023-03-24)

Added support for fasta header without space
Fixed some files missing in pypi tar.gz file

Older versions

Version 0.9.1 (2022-12-31)

Fixed unicode decode error when parsing large fasta/q file
Fixed sequence retrival error when using sequence object from loop after break

Version 0.9.0 (2022-12-30)

Added support for Python3.10, 3.11
Added support for aarch64 and musllinux
Added using tab as fasta sequence name splitter
Fixed repeat sequence comment error
Fixed the quality score parsing error from fastq
Fixed the reference of sequence returned from function

Version 0.8.4 (2021-06-30)

Added slice feature to FastaKeys
Fixed FastaKeys and FastqKeys iteration memory leak
Optimized FastaKeys and FastqKeys creation

Version 0.8.3 (2021-04-25)

Fixed Fastx iteration for next function
Fixed Fastx uppercase for reading fasta

Version 0.8.2 (2021-01-02)

Fixed sample segfault error caused by fastq iteration error
Fixed gzip index import error in multiple processes
Fixed fastq iteration segfault error with full_name=True
Fixed all objects iteration to support built-in next function

Version 0.8.1 (2020-12-16)

Fixed pip install error from source code
Removed support for python39 32bit due to dll load error

Version 0.8.0 (2020-12-15)

Added Fastx object as a simple sequence iterator
Added FastqKeys object to obtain read names
Added full_name option to Fastq object
Added support for Python 3.9
Fixed Fasta object error identifier order
Optimized speed of containing test and iteration
Changed Identifier object to FastaKeys object

Version 0.7.0 (2020-09-20)

Added support for extracting flank sequences
Added support for indexing super large gzip file
Reduced memory consumption when building gzip index
Improved the speed of random access to reads from fastq
Fixed sequence dealloc error cuasing no fasta delloc trigger
Fixed fastq max and min quality score return value

Version 0.6.17 (2020-08-31)

Fixed gzip index loading error when no write permission

Version 0.6.16 (2020-08-27)

Increased the buff size of kseq to speedup sequence iteration
Removed warning message from fasta.c when building full index

Version 0.6.15 (2020-08-25)

Fixed key_func error caused by free operation
Fixed full name error when reading sequence without whitespace in names
Fixed a hidden bug in fasta/q iteration when reading attributes (not seq)
Fixed fasta/fastq size and sequence count error on Windows when parsing large file
Fixed zlib 2gb and 4gb limit on windows x64 to support large file
Reduced seek point span size to speedup random access from gzip file

Version 0.6.14 (2020-07-31)

Added support for using full header as identifier without building index
Improved the speed of fasta sequence iteration
Improved the speed of gzipped fastq read iteration
Fixed a bug in fastq read reader

Version 0.6.13 (2020-07-09)

Fixed fastq read iteration error
Fixed fastq cache buffer reader
Added cache for mean, median and N50 length
Speedup fasta iteration by reduced seeks

Version 0.6.12 (2020-06-14)

Fixed DeprecationWarning on py38 caused by ‘#’ formats args
Fixed some memory leak bugs
Cached sequence name to speedup fetch method
Used random string as gzip index temp file to support multiple processes

Version 0.6.11 (2020-05-18)

Fixed iteration error on Windows
Fixed test error on Windows
Fixed fastq composition error on 32bit OS
Improved the speed of fasta identifier sort and filter

Version 0.6.10 (2020-04-22)

Improved the speed of sequence reading
Improved the speed of sequence line iteration
Added avglen, minlen, maxlen, minqual and maxqual to Fastq object
Fixed read retrieval error
Fixed some hidden memory leaks
Changed fastq index file structure to save more information

Version 0.6.9 (2020-04-12)

Added buffreader to improve speed for reading from gzipped file
Added extract subcommand to extract sequences from fasta/q file
Added build subcommand to just build index
Changed info subcommand output to a tab seperated table
Changed Fastq object composition parameter to full_index

Version 0.6.8 (2020-03-14)

Fixed large offset seek error on windows
Fixed PyUnicode_AsUTF8 const char type warning
Changed sequence read line by line function
Changed gzread to fread for fastq information

Version 0.6.7 (2020-03-03)

Added check for fasta/q format when open file
Added benchmark scripts for evaluating performance
Speed up the fasta/q object iteration
Optimzed str length warning caused by strlen

Version 0.6.6 (2020-02-15)

Fixed incorrect sliced sequence name
Fixed seq,identifier,read object memory dealloc
Changed description text into description length in index file

Version 0.6.5 (2020-01-31)

Reduced memory usage when building index for large fasta
Removed rebuild_index method from Fasta object due to segmentation fault
Optimized compatibility between sqlite3 and python GIL

Version 0.6.4 (2020-01-14)

Fixed last sequence fetching error caused by missing n
Improved fasta/q object key error message to make it more human

Version 0.6.3 (2020-01-08)

Added .raw attribute to sequence object to get seq raw string
Added .raw attribute to read object to get read raw string
Added .description to read object to get full header line
Added iteration for sequence object from FASTA object
Added iteration for tuple from FASTQ object
Changed FASTA class parameter composition to full_index

Version 0.6.2 (2020-01-04)

Fixed sample sequence index error
Fixed ci deploy error

Version 0.6.1 (2020-01-03)

Added sample sequences command line
Added get subsequence command line

Version 0.6.0 (2020-01-02)

Fixed FASTA object parameter error
Fixed identifier sprintf warning
Fixed fasta description end r retained
Fixed error byte length when slice sequence
Removed support for python2.7 and python3.4
Removed python2 compat
Disabled export gzip index when building memory index

Version 0.5.10 (2019-11-20)

Added identifier filter function
Remove tp_new for Read, Sequence and Identifier
Fixed module method error

Version 0.5.9 (2019-11-17)

Added get longest and shortest sequence object
Added composition argument to speedup getting GC content
Added memory index to keep index in memory rather than local file
Fixed command line error
Changed sqlite to higher version
Removed journal_mode OFF
Speedup index building

Version 0.5.8 (2019-11-10)

Fixed fasta NL function parameter check
Fixed read id error when fastq iteration

Version 0.5.7 (2019-11-09)

Fixed SystemError caused caused by Python 2.7 seperated int and long type
Fixed String type check on Python 2.7
Fixed objects memory deallocation

Version 0.5.6 (2019-11-08)

Optimized random access from plain file
Reduced memory consumption

Version 0.5.5 (2019-11-07)

Added Support for IUPAC code complement
Speedup reverse complement
Speedup space removing and uppercase

Version 0.5.4 (2019-11-04)

Added guess fasta type (DNA, RNA, protein)
Added support for calculating protein sequence composition
Optimized the speed of index building
Calculate sequence composition when get gc content or composition
Fixed char return in python 2.7

Version 0.5.3 (2019-10-23)

Added support for coverting fastq to fasta
Updated command line interface docs
Fixed command line entry points

Version 0.5.2 (2019-10-18)

Fixed command line interface running error

Version 0.5.1 (2019-10-17)

Added key function for custom sequence identifier
Optimized speed of fasta indexing
Fixed bool args parsing error in py2.7

Version 0.5.0 (2019-10-13)

Added support for python 2.7 and 3.4
Added command line tool to manipulate fasta and fastq file
Added gzip attribute to fasta and fastq object to check whether compressed
Added sort function for identifier object
Fixed python bool argument parsing error caused by uint16_t
Fixed identifier sort key initialization

Version 0.4.1 (2019-10-05)

Fixed fastq quality encoding system guesser
Fixed gzip index insertion error

Version 0.4.0 (2019-09-29)

Added support for parsing FASTQ
Added random access to reads from FASTQ

Version 0.3.10 (2019-09-27)

Fixed GC skew exception caused by mixing unsigned with signed for division

Version 0.3.9 (2019-09-26)

Fixed sequence read line by line error
Fixed last sequence build index error when fasta file ended without n
Fixed GC skew error

Version 0.3.8 (2019-09-25)

Fixed large offset became negative error
Fixed slice step
Fixed uncorrect median length
Fixed strand compare error
Added GC skew calculation
Updated test script

Version 0.3.7 (2019-09-24)

Changed int type to standard type
Added support for processing large fasta file
Added id number for each sequence
Fixed SQL fetch error
Used 50 as default value of nl to calculate N50 and L50

Version 0.3.6 (2019-09-20)

Added support for searching subsequence from a sequence
Added support for checking subsequence weather in a sequence
Fixed gzip index import error
Fixed subsequence parent length for full sequence extraction

Version 0.3.5 (2019-09-08)

Fixed unicode error caused by sqlite3_finalize

Version 0.3.4 (2019-09-07)

Fixed seq description unicode string error

Version 0.3.3 (2019-09-07)

Fixed sequence description encoding error

Version 0.3.2 (2019-09-07)

Deleted

Version 0.3.1 (2019-09-07)

Added support for geting sequence description

Version 0.3.0 (2019-09-07)

Added read sequence from fasta file line by line
Added support for calculating assembly N50 and L50
Added support for calculating median and average length
Added support for getting longest and shortest sequence
Added support for calculating counts of sequence
removed support for Python34

Version 0.2.11 (2019-08-31)

Support for Python 3.4

Version 0.2.10 (2019-08-28)

Changed fseek and fread into gzseek and gzread
Fixed sequence cache name comparision
Fixed last sequence read error without line end
Fixed subsequence slice error in normal FASTA file

Version 0.2.9 (2019-08-27)

Fixed bad line calculate error
Changed rewind to fseek for subsequence extraction

Version 0.2.8 (2019-08-26)

Changed kseq.h library from li to attractivechaos
Improved fasta parser

Version 0.2.7 (2019-08-26)

Fixed no gzip index wrote to sqlite index file

Version 0.2.6 (2019-08-26)

Optimized speed of gzip random access

Version 0.2.5 (2019-08-25)

Fixed segmentation fault raised when loading gzip index
Changed fasta object method get_seq to fetch

Version 0.2.4 (2019-08-25)

Fixed fasta iter error after building new index

Version 0.2.3 (2019-08-24)

Fixed fasta iter error when end of file is not n

Version 0.2.2 (2019-07-19)

Fixed identifier contain error

Version 0.2.1 (2019-07-15)

Fixed sequence name always end with 0
Fixed fasta iterable for flat fasta

Version 0.2.0 (2019-07-09)

First release to PyPI