Releases: biotite-dev/biotite

Biotite 0.28.0

08 Jun 09:12
723c09b

Changelog

Additions

  • Most classes now support the repr() function (#290)
  • Add equality comparison for sequence.CodonTable
  • Added structure.info.all_residues(), that gives all residue names from the
    Chemical Component Dictionary
  • Updated resources from Chemical Component Dictionary in structure.info
  • Add structure.io.mol.MOLFile to support MOL and SDF files for
    small molecule structure data
  • Add database.uniprot for UniProt database support
    • database.uniprot.search() searches for UniProt IDs that match a given
      database.uniprot.Query
    • database.uniprot.fetch() downloads the file corresponding to the given
      UniProt ID
  • Increase performance of structure.pseudoknots() by using NetworkX
    to identify conflicting regions (#289)

Changes

  • Changed structure.BondType enum values for more precise description of
    aromatic bonds
    • BondType.AROMATIC is replaced by
      BondType.AROMATIC_SINGLE and BondType.AROMATIC_DOUBLE
    • New method structure.BondList.remove_aromaticity() converts
      BondType.AROMATIC_SINGLE to BondType.SINGLE and
      BondType.AROMATIC_DOUBLE to BondType.DOUBLE in-place
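
The conversion performed by remove_aromaticity() can be pictured with a small
stand-in. This is a minimal sketch, not Biotite's implementation: the enum
below is a hypothetical replacement for structure.BondType, its integer values
are assumptions, and a plain Python list stands in for a structure.BondList.

```python
from enum import IntEnum


class BondType(IntEnum):
    # Illustrative stand-in for structure.BondType; the integer values
    # are assumptions made for this sketch
    SINGLE = 1
    DOUBLE = 2
    AROMATIC_SINGLE = 5
    AROMATIC_DOUBLE = 6


# Each aromatic bond type maps to its non-aromatic counterpart
_DEAROMATIZED = {
    BondType.AROMATIC_SINGLE: BondType.SINGLE,
    BondType.AROMATIC_DOUBLE: BondType.DOUBLE,
}


def remove_aromaticity(bond_types):
    """Replace aromatic bond types in the given list in-place."""
    for i, bond_type in enumerate(bond_types):
        bond_types[i] = _DEAROMATIZED.get(bond_type, bond_type)


# A six-membered ring with alternating aromatic single/double bonds
ring = [BondType.AROMATIC_SINGLE, BondType.AROMATIC_DOUBLE] * 3
remove_aromaticity(ring)
```

After the call, the list holds alternating SINGLE and DOUBLE entries, so the
bond order survives while the aromaticity flag is dropped.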

Fixes

  • Fixed an error raised when reading a single model from a
    structure.io.PDBFile if the MODEL line is missing in the file
  • Fixed the format of formal charges written to structure.io.PDBFile
    • Previously it was e.g. '+2' instead of the correct '2+'
  • Fixed atom_i parameter when reading a trajectory via
    structure.io.load_structure() (#308)
  • Fixed performance issue of structure.CellList
    • Previously the number of evaluated cells was too large, if the radius
      parameter of structure.CellList.get_atoms() was equal to the
      cell_size parameter of the constructor (#311)
  • Setting an existing annotation array in structure.AtomArray and
    structure.AtomArrayStack preserves the NumPy dtype
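
As an illustration of the formal charge fix above, the following sketch
formats an integer charge with the digit first and the sign second. The
function name format_pdb_charge() is hypothetical and not part of Biotite;
leaving the field blank for a zero charge is an assumption of this sketch.

```python
def format_pdb_charge(charge):
    """Format an integer formal charge as magnitude followed by sign."""
    if charge == 0:
        # Assumption: an uncharged atom gets an empty two-character field
        return "  "
    sign = "+" if charge > 0 else "-"
    return f"{abs(charge)}{sign}"


# The magnitude precedes the sign, e.g. '2+' rather than '+2'
charges = [format_pdb_charge(c) for c in (2, -1, 0)]
```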

Biotite 0.27.0

23 Mar 12:05
ceacc01

Changelog

Additions

  • Added interface to AutoDock Vina
    • application.autodock.VinaApp uses the vina executable to perform
      docking of a ligand to a receptor molecule
    • Uses new structure.io.pdbqt.PDBQTFile class for writing input for
      and reading output from vina
      • An MGLTools installation is not necessary
    • By default the receptor is handled as a rigid structure; however,
      flexible side chains can be defined
  • Added modular system for fast k-mer based sequence searches/mappings
    • sequence.align.KmerAlphabet encodes a sequence.Sequence into
      k-mers
    • sequence.align.KmerTable is able to find k-mer matches between
      sequences in an efficient manner
    • sequence.align.SimilarityRule allows matching similar instead of
      exact k-mer matches via a sequence.align.KmerTable
    • sequence.align.align_banded() performs a heuristic local or
      semi-global sequence alignment within a defined diagonal band
  • Added sequence.align.remove_terminal_gaps() function
  • Added application.sra.FastqDumpApp.get_file_paths() method
  • Increased performance of sequence.Sequence.get_symbol_frequency()
  • Increased performance of sequence.NucleotideSequence.complement()
  • sequence.Sequence.reverse() can optionally create an array view
    instead of a copy
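
The idea behind the k-mer based search can be sketched in plain Python. This
is a toy illustration under stated assumptions, not the implementation of
sequence.align.KmerAlphabet or sequence.align.KmerTable: the helper names, the
fixed DNA alphabet and the dictionary-based table are all hypothetical.

```python
from collections import defaultdict

ALPHABET = "ACGT"
SYMBOL_CODE = {symbol: code for code, symbol in enumerate(ALPHABET)}


def kmer_codes(sequence, k):
    """Encode every overlapping k-mer as a single integer."""
    codes = []
    for start in range(len(sequence) - k + 1):
        code = 0
        for symbol in sequence[start:start + k]:
            # Interpret the k-mer as a number in base len(ALPHABET)
            code = code * len(ALPHABET) + SYMBOL_CODE[symbol]
        codes.append(code)
    return codes


def build_table(sequence, k):
    """Map each k-mer code to the positions where it occurs."""
    table = defaultdict(list)
    for position, code in enumerate(kmer_codes(sequence, k)):
        table[code].append(position)
    return table


def find_matches(query, table, k):
    """Return (query position, reference position) pairs of equal k-mers."""
    matches = []
    for query_pos, code in enumerate(kmer_codes(query, k)):
        for ref_pos in table.get(code, []):
            matches.append((query_pos, ref_pos))
    return matches


reference = "ACGTACGGA"
table = build_table(reference, 3)
matches = find_matches("TACG", table, 3)
```

Matches that share the same difference between reference and query position
lie on one diagonal, which is the kind of information a banded alignment can
be seeded with.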

Changes

  • application.sra.FastqDumpApp.get_file_paths() only parses
    downloaded FASTQ files, if required
  • Running pytest automatically recompiles changed Cython source code

Biotite 0.26.0

04 Mar 12:06

Changelog

Additions

  • Added interface to some programs of the ViennaRNA software package
    • application.viennarna.RNAfoldApp uses RNAfold to predict the minimum
      free energy secondary structure of an RNA sequence
    • application.viennarna.RNAplotApp uses RNAplot to calculate the
      2D coordinates for base symbols in a secondary structure plot
  • Added structure.graphics.plot_nucleotide_secondary_structure() for
    visualization of an RNA secondary structure via Matplotlib
    • Internally uses RNAplot
    • Optional visualization of pseudoknots
  • Increased performance of structure.find_connected()
  • Increased performance of structure.partial_charges()
  • Added structure.BondList.remove_bonds_to()
  • Added molecule-level atom selections
    • structure.get_molecule_indices(), structure.get_molecule_masks() and
      structure.molecule_iter() select atoms belonging to a single molecule,
      i.e. atoms that are connected via bonds
  • Added interface to NetworkX package
    • Added as_graph() method to sequence.phylo.Tree and structure.BondList
      for conversion into a NetworkX Graph
    • The find_rotatable_bonds() function uses NetworkX to identify
      rotatable bonds, i.e. single bonds that are not part of a cycle,
      in structures with a structure.BondList()
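
Selecting atoms at the molecule level amounts to finding connected components
in the bond graph. The sketch below shows that idea with a depth-first search
over plain tuples; the function molecule_indices() is a hypothetical helper
for illustration and does not reproduce the Biotite or NetworkX code.

```python
def molecule_indices(atom_count, bonds):
    """Group atom indices into molecules, i.e. bond-connected components."""
    adjacency = [[] for _ in range(atom_count)]
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)

    seen = [False] * atom_count
    molecules = []
    for start in range(atom_count):
        if seen[start]:
            continue
        # Depth-first search collects all atoms reachable from 'start'
        seen[start] = True
        stack = [start]
        members = []
        while stack:
            atom = stack.pop()
            members.append(atom)
            for neighbor in adjacency[atom]:
                if not seen[neighbor]:
                    seen[neighbor] = True
                    stack.append(neighbor)
        molecules.append(sorted(members))
    return molecules


# Atoms 0-2 form one molecule, atoms 3-4 another, atom 5 is unbonded
bonds = [(0, 1), (1, 2), (3, 4)]
molecules = molecule_indices(6, bonds)
```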

Changes

  • Add support for Python 3.9, remove support for Python 3.6
  • Add networkx package as dependency
  • structure.io.pdbx.set_structure() does not convert the residue ID -1
    to "." anymore

Fixes

  • Fixed missing check for string length of chain ID, residue name
    and atom name when setting a structure in a structure.io.pdb.PDBFile
  • structure.io.pdbx.set_structure() now supports atom IDs larger than
    one million
  • Fixed application.dssp.DsspApp being unable to work with multi-character
    chain identifiers (#264)
  • Fixed application.muscle.MuscleApp sometimes not finishing for long
    alignments (#273)
  • Fixed the creation of an AtomArray from atoms with optional annotations
    (#279)
  • Fixed deletion of annotation arrays in structure.AtomArray
  • Fixed structure.BondList potentially ending in a broken state after
    indexing it with an unordered index array
  • structure.partial_charges() uses bond order instead of number of bond
    partners to calculate correct charges for atoms with positive or negative
    formal charge

Biotite 0.25.0

09 Dec 15:30
7d89828

Changelog

Additions

  • New analysis capabilities for nucleic acid base pairs
    • Added structure.info.nucleotide_names()
    • Increased performance of structure.base_pairs()
    • Support for exotic nucleotides in structure.base_pairs() and
      structure.filter_nucleotides()
    • Added structure.base_pairs_edge() and
      structure.base_pairs_glycosidic_bond() for further
      characterization of base pairs
    • Added structure.base_stacking() for identification of pi-stacking
      of nucleobases
    • Added structure.pseudoknots() for identification of pseudoknots
      in a given list of base pairings
    • Added structure.dot_bracket(),
      structure.dot_bracket_from_structure() and
      structure.base_pairs_from_dot_bracket() for conversion of
      base pairs to dot-bracket-letter notation and vice versa
  • New methods for structure.BondList:
    • Added get_all_bonds() for obtaining the bonded atoms for each atom
      in the structure
    • Added adjacency_matrix() and bond_type_matrix()
  • Added structure.partial_charges() for partial charge calculation
    using the PEOE method
  • Added structure.info.standardize_order(), that reorders atoms in
    residues into the PDB standard atom order for the respective residue
  • Added structure.graphics.plot_ball_and_stick_model()
  • Increased performance of residue level utilities
  • Added structure.get_residue_positions()
  • Added sequence.io.genbank.get_raw_sequence() which returns the
    sequence as string
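
The conversion between base pairs and dot-bracket notation can be sketched as
follows. This is a simplified illustration that assumes the base pairs contain
no pseudoknots, so a single bracket type suffices; the letter extension of the
notation is omitted and the two functions are stand-ins, not the Biotite
functions of similar name.

```python
def dot_bracket(base_pairs, length):
    """Write nested base pairs as a dot-bracket string."""
    notation = ["."] * length
    for i, j in base_pairs:
        opening, closing = min(i, j), max(i, j)
        notation[opening] = "("
        notation[closing] = ")"
    return "".join(notation)


def base_pairs_from_dot_bracket(notation):
    """Recover the base pairs by matching brackets with a stack."""
    stack = []
    pairs = []
    for position, symbol in enumerate(notation):
        if symbol == "(":
            stack.append(position)
        elif symbol == ")":
            # The most recently opened bracket pairs with this position
            pairs.append((stack.pop(), position))
    return sorted(pairs)


# A hairpin: three stacked base pairs enclosing a loop of three bases
pairs = [(0, 8), (1, 7), (2, 6)]
notation = dot_bracket(pairs, 9)
recovered = base_pairs_from_dot_bracket(notation)
```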

Changes

  • structure.hbond() raises a warning if an input structure without
    hydrogen atoms is given (#241)
  • get_sequence() and get_sequences() of biotite.sequence.io.fasta
    and biotite.sequence.io.fastq convert selenocysteine to cysteine (#232)
  • Changed the order of sequence types that
    biotite.sequence.io.fasta.get_sequence() and
    biotite.sequence.io.fasta.get_sequences() try to create (#232):
    • First: sequence.NucleotideSequence
    • If this fails: sequence.ProteinSequence
  • Temporary files used by the application subpackage are removed via
    os.remove() due to issues on Windows (#243)

Fixes

  • Fixed structure.base_pairs() for structures that contain residues
    that are not in the PDB standard order (#237)
  • Fixed slightly incorrect aspect ratio in molecular visualizations
    created via structure.graphics.plot_atoms()
  • Fixed bounds check for input bonds in the structure.BondList constructor
    (related to #252):
    • Previously, the bond type value was not allowed to exceed the number
      of atoms
  • Fixed structure.BondList indexing with an unsorted index array (#238)
  • Fixed the charge annotation of molecules obtained via
    structure.info.residue() (#254)

Biotite 0.24.0

05 Oct 13:39
d5e7207

Changelog

Additions

  • Added sequence.ProteinSequence.get_molecular_weight() method
  • Added application.sra subpackage as interface to NCBI SRA tools
    • FastqDumpApp is used for fetching FASTQ files from the NCBI SRA
  • Added iter_read() static method to sequence.io.fasta.FastaFile
    and sequence.io.fastq.FastqFile
    • This method is used to parse header-sequence-pairs from FASTA/FASTQ
      files without the necessity to keep the entire file in memory.
  • set_sequence() and set_sequences() in sequence.io.fasta
    and sequence.io.fastq support writing RNA sequences with the new
    as_rna parameter
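
Parsing header-sequence pairs without holding the whole file in memory is
typically done with a generator. The sketch below shows the pattern for FASTA
input; iter_fasta() is a hypothetical helper and is not the code behind
iter_read().

```python
import io


def iter_fasta(lines):
    """Yield (header, sequence) pairs while reading line by line."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header ends the previous entry, if there is one
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Any iterable of lines works, e.g. an open file; StringIO is used here
handle = io.StringIO(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")
entries = list(iter_fasta(handle))
```

Only the lines of the entry currently being assembled are kept, so memory use
depends on the longest entry rather than on the file size.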

Fixes

  • Fixed missing whitespace at the end of _loop category labels in
    PDBx/mmCIF files (#224)
  • Fixed inconsistent handling of model IDs over different file formats
    for structures where the first model ID is greater than 1 (#227)
  • Removed warning in structure.density()
  • sequence.io.fastq.get_sequence() and sequence.io.fastq.get_sequences()
    properly handle RNA and ambiguous sequences now
  • Fixed start parameter in structure.renumber_atom_ids and
    structure.renumber_res_ids
  • Updated fetch URL for FASTA files in database.rcsb.fetch()

Biotite 0.23.0

03 Aug 15:30

Changelog

Additions

  • Improved example gallery
    • Added minigalleries in the API reference to get tangible examples for the
      respective function/class
    • Added support for animated Matplotlib plots
    • Using Ammolite for rendering PyMOL images
  • Added support for new RCSB search API
    • New database.rcsb.Query classes, that reflect the entirety of the new
      search API, including sequence, sequence motif and structure searches
      • Multiple database.rcsb.Query objects can be combined/negated using the
        operators |, & and ~
    • Added the return_type, sort_by and range parameter to
      database.rcsb.search()
    • Added database.rcsb.count() function to count the number of results a
      database.rcsb.Query would yield in a less costly way than
      database.rcsb.search()
  • Increased indexing speed in biotite.structure.BondList
  • Added sequence.Sequence.alphabet property, that is equivalent to
    sequence.Sequence.get_alphabet()
  • Added convenience functions fastq.get_sequence(), fastq.get_sequences(),
    fastq.set_sequence() and fastq.set_sequences()
  • Drastically increased writing speed of sequence.io.fasta.FastaFile
  • Increased mapping speed of sequence.AlphabetMapper
  • Added sequence.Alphabet.is_letter_alphabet() method
  • Added general sequence I/O convenience functions
    sequence.io.load_sequence(), sequence.io.load_sequences(),
    sequence.io.save_sequence() and sequence.io.save_sequences() that derive
    the appropriate File class from the suffix of the file name.
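
Combining query objects with |, & and ~ relies on Python operator overloading.
The toy classes below illustrate the mechanism only: they are hypothetical,
they are not the database.rcsb.Query classes, and they evaluate against local
dictionaries, whereas real queries are sent to the RCSB search service.

```python
class ToyQuery:
    """Hypothetical base class; combining queries builds a new query."""

    def matches(self, entry):
        raise NotImplementedError

    def __or__(self, other):
        return ToyComposite(any, self, other)

    def __and__(self, other):
        return ToyComposite(all, self, other)

    def __invert__(self):
        return ToyNegation(self)


class ToyField(ToyQuery):
    def __init__(self, field, value):
        self.field = field
        self.value = value

    def matches(self, entry):
        return entry.get(self.field) == self.value


class ToyComposite(ToyQuery):
    def __init__(self, combine, *queries):
        self.combine = combine  # The built-in any() or all()
        self.queries = queries

    def matches(self, entry):
        return self.combine(q.matches(entry) for q in self.queries)


class ToyNegation(ToyQuery):
    def __init__(self, query):
        self.query = query

    def matches(self, entry):
        return not self.query.matches(entry)


# Made-up entries for demonstration purposes
entries = [
    {"id": "A", "method": "NMR", "source": "human"},
    {"id": "B", "method": "EM", "source": "mouse"},
    {"id": "C", "method": "X-ray", "source": "mouse"},
    {"id": "D", "method": "NMR", "source": "yeast"},
]
query = (
    (ToyField("method", "NMR") | ToyField("method", "EM"))
    & ~ToyField("source", "human")
)
hits = [entry["id"] for entry in entries if query.matches(entry)]
```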

Changes

  • The omit_chain parameter has been removed from database.rcsb.search()
  • The old database.rcsb.Query classes have been removed
  • Removed python setup.py test and python setup.py build_sphinx commands,
    please use pytest and sphinx-build directly instead
  • Renamed sequence.NucleotideSequence.alphabet to
    sequence.NucleotideSequence.alphabet_unamb
  • sequence.io.fastq.FastqFile returns its entries only as str instead of
    sequence.NucleotideSequence for consistency with
    sequence.io.fasta.FastaFile
    • The method sequence.io.fastq.FastqFile.get_sequence() is deprecated
    • The method sequence.io.fastq.FastqFile.get_seq_string() returns the
      sequence as a str instead of a sequence.NucleotideSequence

Fixes

  • Fixed expect_looped parameter in
    structure.io.pdbx.PDBxFile.get_category()
  • Fixed error in structure.io.pdbx.PDBxFile, that was raised, if a PDBx
    field and its single-line value are in separate lines
  • Added check for boolean mask length, when a boolean mask is given as index
    to biotite.structure.BondList
  • Changed chain_id dtype from 'U3' to 'U4' (#215)

Biotite 0.22.0

04 Jun 13:26
8979702

Changelog

Additions

  • Added structure.filter_nucleotides()
  • structure.io.pdbx.get_sequence() is able to parse a
    sequence.NucleotideSequence from a PDBx file in addition to
    sequence.ProteinSequence
  • Added structure.base_pairs() for determining base pairs in nucleic acid
    structures
  • Added structure.get_residue_starts_for()
  • Added structure.check_atom_id_continuity()
  • Added structure.renumber_atom_ids() and structure.renumber_res_ids()
    to fix structures with discontinuous atom/residue IDs
  • Added get_model_count() to structure.io.pdb, structure.io.pdbx,
    structure.io.mmtf and structure.io.gro to obtain the total number
    of models
  • The model parameter in get_structure() in structure.io.pdb,
    structure.io.pdbx, structure.io.mmtf and structure.io.gro supports
    negative values to index models counting from the last model
  • Increased performance of residue and chain-related functions
    (e.g. structure.get_residue_starts())

Changes

  • Revamped altloc ID handling (#194)
    • Instead of choosing each alternate location individually there are three
      options:
      • 'first' always chooses the atoms with the first altloc ID
        for each residue
      • 'occupancy' always chooses the atoms with the highest occupancy
        for each residue
      • 'all' does not filter any altloc IDs and adds the altloc_id
        annotation to the resulting structure.AtomArray or
        structure.AtomArrayStack
  • Renamed structure.check_id_continuity() into
    structure.check_res_id_continuity(); structure.check_id_continuity()
    is still available, but is deprecated
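
The three altloc options can be illustrated with a small filter over plain
tuples. This is a sketch under stated assumptions, not Biotite's code: atoms
are (res_id, atom_name, altloc_id, occupancy) tuples, an empty altloc_id means
that the atom has no alternate location, and for 'occupancy' the altloc ID
with the highest summed occupancy within a residue is chosen.

```python
def filter_altloc(atoms, mode):
    """Select atoms according to the 'first', 'occupancy' or 'all' option."""
    if mode == "all":
        return list(atoms)

    chosen = {}  # Maps each residue ID to the altloc ID that is kept
    if mode == "first":
        for res_id, _, altloc_id, _ in atoms:
            if altloc_id and res_id not in chosen:
                chosen[res_id] = altloc_id
    elif mode == "occupancy":
        totals = {}
        for res_id, _, altloc_id, occupancy in atoms:
            if altloc_id:
                per_altloc = totals.setdefault(res_id, {})
                per_altloc[altloc_id] = (
                    per_altloc.get(altloc_id, 0.0) + occupancy
                )
        for res_id, per_altloc in totals.items():
            chosen[res_id] = max(per_altloc, key=per_altloc.get)
    else:
        raise ValueError(f"Unknown altloc option '{mode}'")

    # Atoms without altloc ID are always kept
    return [
        atom for atom in atoms
        if not atom[2] or chosen[atom[0]] == atom[2]
    ]


atoms = [
    (1, "N", "", 1.0),
    (1, "CA", "A", 0.3),
    (1, "CA", "B", 0.7),
    (2, "CA", "", 1.0),
]
first = filter_altloc(atoms, "first")
by_occupancy = filter_altloc(atoms, "occupancy")
```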

Fixes

  • Fixed structure.BondList being iterable, yielding nonsense values
  • Improved element guesses in structure.io.pdb.PDBFile when the
    element column is missing (#188)
  • Fixed parsing of single models from structure.io.mmtf.MMTFFile (#205)
  • Fixed missing unit cell values in structure.io.pdbx.get_structure()
    raising an error; the box attribute is set to None instead

Biotite 0.21.0

28 Apr 13:48
1c806ce

Changelog

Additions

  • More functionality for structure.BondList
    • __contains__() method to test whether a bond exists
    • find_connected() identifies systems of connected atoms (aka molecules)
  • Added frame wise iteration of trajectory files for saving memory
    • structure.io.TrajectoryFile.read_iter() yields coordinates, box and time for each frame
    • structure.io.TrajectoryFile.read_iter_structure() yields a structure.AtomArray for each frame
  • Added ability to read entire biological assemblies from mmCIF files
    • structure.io.pdbx.list_assemblies() lists the available assemblies
    • structure.io.pdbx.get_assembly() returns the given assembly as
      structure.AtomArray or structure.AtomArrayStack
  • Added the expect_looped parameter to
    structure.io.pdbx.PDBxFile.get_category
  • structure.info.vdw_radius_single() provides VdW radii also for more
    uncommon elements
  • Added structure.get_residue_masks(), which masks all residues to which the
    given atoms belong
  • Added structure.repeat() functions to repeat atoms multiple times in the
    same model with different coordinates
  • Added a bunch of new examples to the gallery

Changes

  • temp_file() and temp_dir() are deprecated, use the Python standard library
    module tempfile instead
  • For all File classes, read() is now a class method,
    e.g. pdbx_file = PDBxFile.read(), the old instance method is deprecated
  • database.rcsb.fetch() and database.entrez.fetch() overwrite an existing
    file if it is empty
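
Turning read() into a class method means that a file object is created and
filled in one call. The following sketch shows the general pattern with a
hypothetical ToyTextFile class; it is not Biotite's File base class, and
accepting only a file-like object is a simplification of this sketch.

```python
import io


class ToyTextFile:
    """Hypothetical file class that stores the lines of a text file."""

    def __init__(self):
        self.lines = []

    @classmethod
    def read(cls, file):
        # 'cls' is the class read() was called on, so subclasses
        # automatically get instances of their own type
        instance = cls()
        instance.lines = file.read().splitlines()
        return instance


class ToyFastaFile(ToyTextFile):
    pass


# One call creates the object and reads the content
fasta_file = ToyFastaFile.read(io.StringIO(">seq1\nACGT\n"))
```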

Fixes

  • A newline character is appended to the end of file, when writing text files
  • Fixed structure.CellList when using the periodic parameter in combination
    with the selection parameter; previously, unallocated memory was
    potentially accessed

Biotite 0.20.1

28 Feb 14:06
04c3c97

Changelog

Fixes

  • Fixed support for msgpack 1.0

Biotite 0.20.0

27 Feb 10:54
a8f7e16

Changelog

Additions

  • Added structure.from_template() to create a structure.AtomArrayStack from an existing atom array (or stack) and coordinates
  • Added ignore parameter to sequence.io.genbank.get_annotation() to ignore the given feature keys
  • Added sequence.graphics.plot_plasmid_map() for visualizing sequence.Annotation objects as a plasmid map
  • Added a bunch of new examples to the gallery
  • Added support for Python 3.8 on Windows

Changes

  • The output of the score_matrix() method of sequence.align.SubstitutionMatrix is not writable anymore, rendering a SubstitutionMatrix truly immutable
  • Renamed environment.yaml to environment.yml
  • A sequence.Feature must have at least one location

Fixes

  • Fixed incorrect centroid calculation in structure.superimpose(), when providing a boolean mask
  • Fixed installation of PyPI source distributions
  • Fixed issues when reading text files with \r\n line breaks (line breaks with carriage return, typical for Windows)