Releases: ilius/pyglossary

3.3.0

21 May 08:19
e7126e1

Changes since 3.2.1

  • Require Python 3.6 or higher (mainly because of f-strings)

  • New format support

    • Add support to write Kobo dictionary, #205
    • Add support to write EPUB-2
    • Add support to read AppleDict Binary (.dictionary)
    • Add support to read and write Aard 2 (slob), #116
  • Glossary: detect and load Writer class from plugins

    • Remove write function from plugin if it has Writer class
  • Glossary: call gc.collect() in indirect mode after reading/writing every 128 entries

    • To free up memory and avoid running out of RAM for large glossaries
  • Glossary: remove empty and duplicate alternate words when converting, using Entry Filter, #188

  • Add command line options to remove html tags:

    • --remove-html=tag1,tag2,tag3
    • --remove-html-all
  • Re-design format-specific options

    • Allow specifying format-specific read/write options in ui_gtk and ui_tk
    • Add much better and cleaner codebase for handling options in option.py
    • Implement validation of options in command line, GTK and Tkinter interfaces
    • Add tests for option.py in option_test.py
    • Avoid using None as default value of option argument
    • Check default value of plugin options and show warning if invalid
    • Add IntOption class, use it in Omnidic plugin
    • Add DictOption, use it for appledict defaultPrefs
    • Add optionsProp to all plugins
      • Containing value type, allowed values and optional comment
    • Remove readOptions and writeOptions from all plugins
      • Detect options from functions' signature and optionsProp variables
      • Avoid using **kwargs in plugin read, Reader.open or write functions
  • Add depends variable to plugins

    • To let GUI install plugin dependencies
    • Type: dict, keys are module names, values are pip's package name
    • Add Glossary.formatsDepends
  • Minor fixes and improvements in Glossary class:

    • Return with error if output file path is an existing directory
    • Fix empty zip when creating DIRECTORY.zip as output glossary
    • Do not uncompress gz/bz2/zip input files automatically
    • Ignore "read" function of plugin if "Reader" class is present
    • Cleaning: Add Glossary.init() classmethod to initialize the class, can be called multiple times
    • Some refactoring and cleaning, and add some logs
    • Small optimization: index % 100 -> index & 0x7f
    • Allow having progressbar by position in file and size of file
      • use for appledict_bin.py
    • Do not write resource file names as entries to text file in Glossary.writeTxt
  • StarDict plugin

    • Always open .ifo file as UTF-8
    • Fix output filenames without .ifo extension creating hidden files, #187
  • Babylon BGL plugin

    • Fix bytes metadata values b'...' and some refactoring in readType3
    • Skip empty info values
    • Fix non-string info values written as empty
    • Prefix 3 info keys with bgl_
    • Fix NameError in debug mode in stripHtmlTags
    • Some refactoring
  • Octopus MDict plugin

  • Change yes/no options in AppleDict and ABBYY Lingvo DSL plugins to boolean

    • To keep compatibility of command line flags, fix yes/no manually in ui_cmd.py
  • AppleDict plugin:

    • Fix echo problem in Makefile (#177)
    • Add dark mode support for AppleDict output (#177)
    • Add comments for optionsProp
    • Use keyword argument features= and fix a warning about from_encoding=
  • Fix misspelled "extension" (as "extention") in plugins

  • Detect entries with span tag as html, #193

  • Refactoring in ui_gtk and ui_tk

  • Fix some deprecated API in ui_gtk

  • Minor bug fixes and improvements in ui_tk and ui_gtk

  • Update setup.py to adapt packaging with wheel, #189

  • Add type hints to codebase and plugins

  • Refactoring and style changes:

    • rename pyglossary.pyw to main.py, add a small pyglossary.pyw for compatibility
    • Switch to f-strings in glossary.py and freedict.py
    • main.py: replace single quotes with double quotes
    • PEP-8 style fixes
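
The periodic gc.collect() call and the "index & 0x7f" optimization listed above can be illustrated together. This is only a minimal sketch, not the actual Glossary code: the function name iter_with_periodic_gc and its mask parameter are hypothetical. Note that the mask 0x7f corresponds to a modulo by 128 (not 100), which agrees with the "every 128 entries" interval mentioned above.

```python
import gc


def iter_with_periodic_gc(entries, mask=0x7F):
    """Yield entries, calling gc.collect() once every (mask + 1) entries.

    Hypothetical helper for illustration only. `mask` must be one less
    than a power of two for the bit trick to behave like a modulo.
    """
    for index, entry in enumerate(entries):
        yield entry
        # index & 0x7f equals index % 128, so this fires at 0, 128, 256, ...
        if index & mask == 0:
            gc.collect()


# The bitmask is equivalent to a modulo by 128 for non-negative integers.
assert all(i & 0x7F == i % 128 for i in range(10_000))

# The wrapper does not change what is yielded.
assert list(iter_with_periodic_gc(range(300))) == list(range(300))
```

Collecting only every 128 entries keeps the cost of gc.collect() small compared to the work done per entry, while still bounding how much garbage can pile up.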

3.2.1

21 Jun 21:11
bb2a66a

Changes since 3.2.0

  • Changes in StarDict plugin:
    • Add sametypesequence write option (PR #162)
    • Fix some bugs
    • Cleaning
  • Disable gzip CRC check for BGL files with Python 3.7
  • Fix a bug in octopus_mdict.py
  • Fix Gtk warnings in ui_gtk
  • Allow seeing/customizing warnings by setting environment variable WARNINGS
  • Fix not being able to run the program when installed inside virtualenv (#168)
  • Show a tip about -h when no UI is found, #169
  • octopus_mdict_source.py: fix #68, add support for inconsecutive links with --read-options=links=True
  • Auto-detect UTF-16 encoding of DSL files
  • Update README.md (fix Archlinux pkg name, add AUR, add instructions for installing python-lzo on Windows, etc)
  • Some clean up
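
The notes above do not say how UTF-16 detection for DSL files is done. One common approach, shown here purely as an assumption, is to look for a byte-order mark at the start of the file. The function name guess_text_encoding is hypothetical and is not part of PyGlossary.

```python
import codecs


def guess_text_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Guess an encoding from a byte-order mark at the start of `raw`.

    Hypothetical helper; a BOM check is an assumed technique, not
    necessarily what the DSL plugin does.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Python's "utf-16" codec reads the BOM and picks the byte order.
        return "utf-16"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return default


sample = '#NAME "Example"\n'.encode("utf-16")  # the encoder prepends a BOM
encoding = guess_text_encoding(sample)
assert encoding == "utf-16"
assert sample.decode(encoding) == '#NAME "Example"\n'
```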

3.2.0

04 Mar 20:51
d42fb0a

Changes since 3.1.0

  • Add read support for CC-CEDICT plugin

    • Pull request #140, with some fixes and improvements by me
  • Fixes in DSL (ABBYY Lingvo) plugin:

    • Fix #136, removing one extra character after #CONTENTS_LANGUAGE:
    • Fix #137, regexp for re_lang_open
  • Improvement in Gtk interface:

    • Avoid changing Format combobox based on file extension if a format is already selected, #141
  • Fix encoding problem with non-UTF-8 system locales

    • Fix #147, give encoding="utf-8" when opening text files, for non-UTF-8 system locales
  • Improvements in Glossary class
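
As a small illustration of the encoding fix above: when open() is called in text mode without an encoding argument, Python uses the locale's preferred encoding, which may not be UTF-8. Passing encoding="utf-8" explicitly makes the result independent of the system locale. The file name and sample text below are made up for the example.

```python
import os
import tempfile

text = "naïve café 词典\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "glossary.txt")
    # Explicit encoding: do not rely on the locale's preferred encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

assert round_trip == text
```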

3.1.0

17 Jun 16:32
9d40c3d

Changes since 3.0.4

  • Refactor StarDict plugin, and improve the performance
  • Detect HTML definitions when reading, and mark them as HTML when converting to StarDict
  • Fix #135 in StarDict writer:
    • Alternates were pointing at a wrong word in case there are resource/image files
  • Refactor AppleDict plugin
  • Refactor and improve BGL plugin
  • Style fixes including pep-8 fixes
  • Change indentations to tabs, and single quote to double quotes
  • Allow --ui=none flag
  • Allow --skip-resources flag
  • SQL plugin: add encoding write option
  • Octopus MDict Source plugin: add encoding read option
  • Drop sqlite3 support, xFarDic support, and read support for Omnidic
  • Improvement and cleaning in the code base and different plugins
  • Introduce DataEntry
    • Allows access to resource files when iterating over entries (words) of Glossary
  • Glossary: write and convert methods return absolute path of output file, or None
  • Changes in master branch since 3.0.4:
    • Update README.md
    • Update pyglossary.spec
    • Fixes in setup.py
    • BGL: add gzip_no_crc.py for Python 3.6 (required for some non-standard BGL files)
    • AppleDict: give encoding='utf8' while opening xml file, fix for #84
    • Avoid lines that require trailing backslash, to avoid bugs like #67
    • babylon_source.py: remove extra %s, fix #92
    • AppleDict: force encoding="utf-8" for plist file, fix #94
    • Fix str/bytes bug in stardict.py (fix #98) and some renames for clarification
    • Fix #102: exception in dict_org.py
    • Fix wrong path of static files when running from dist-packages
    • readmdict.py: change by Xiaoqiang Wang: no encryption if Encrypted is not in header
    • Fix #118, SyntaxError (return with argument inside generator) in Glossary.reverse with Python 3.6

3.0.4

07 Nov 20:09

Changes since 3.0.3

  • Changes in Glossary code base
  • Fix critical bug in Glossary: ZeroDivisionError if wordCount < 500, #61
  • Bug fix in Glossary.progress: make sure ui.progress is not called with a number more than 1.0
  • Fix non-working write to SQL, #67
  • Bug fix & Feature: add newline argument to Glossary.writeTxt
    Because Python's open converts (modifies) newlines automatically, #66
  • Break compatibility about using Glossary.writeTxt method
    Replace argument sep which was a tuple of length two, with two mandatory arguments: sep1 and sep2
  • Changes in plugins
  • Fix in StarDict plugin: fix some Python3-related errors, #71
  • Fix in Dict.org plugin: install was not working
  • Fix in DSL plugin: replace backslash at the end of line with <br/>, #61
  • Fix in SQL plugin: specify encoding='utf-8' while opening file for write, #67
  • Fix in Octopus Mdict Source plugin: specify encoding='utf-8' while opening file for read, #78
  • Fix (probable) bugs of bad newlines in 4 plugins (use newline argument to Glossary.writeTxt), #66
    - Octopus MDict Source
    - Babylon Source (gls)
    - Lingoes Source (LDF)
    - Sdictionary Source (sdct)
  • Feature in Lingoes Source plugin: add newline write option
  • Minor fix in AppleDict plugin: fix beautifulsoup4 error message, #72
  • BGL plugin: better compatibility with Python 3.4
    Fix CRC check failed error for some (rare) glossaries with Python 3.4
  • Other Changes
  • Bug fix in parsing command line options --read-options and --write-options (happened in very rare cases)
  • Fix wrong shebang line in setup.py: must run with python3, fix #75
  • Update pyglossary.spec
  • Change Categories for pyglossary.desktop
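
The newline issue behind #66 can be shown with plain Python. In text mode, open() translates every "\n" written to the platform default unless a newline argument is given. The write_txt function below is a hypothetical stand-in, not Glossary.writeTxt itself; it only borrows the sep1, sep2 and newline argument names from the notes above.

```python
import os
import tempfile


def write_txt(path, entries, sep1, sep2, newline="\n"):
    """Hypothetical helper: write word<sep1>definition<sep2> per entry.

    `newline` is passed straight to open(): "" or "\n" writes "\n"
    unchanged, while "\r" or "\r\n" replaces each "\n" written.
    """
    with open(path, "w", encoding="utf-8", newline=newline) as f:
        for word, defi in entries:
            f.write(word + sep1 + defi + sep2)


entries = [("apple", "a fruit"), ("book", "pages bound together")]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    write_txt(path, entries, sep1="\t", sep2="\n", newline="\r\n")
    # Read back as bytes to see the line endings actually stored.
    with open(path, "rb") as f:
        raw = f.read()

assert raw == b"apple\ta fruit\r\nbook\tpages bound together\r\n"
```

Taking two separate separator arguments instead of one tuple makes each call site explicit about which separator goes between word and definition and which one ends the entry.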

3.0.3

01 Jul 19:53

Changes since 3.0.2

  • Fixes in AppleDict plugin
  • Improve Tkinter interface: fix Not Responding bug, make window icon colorful
  • Fix visual bug in command line Progress Bar (percentage did not become 100.0%)
  • BGL reader: add support for Python < 3.5, with a warning to install Python 3.5
  • Fixes in Reverse feature
  • Update README.md

3.0.2

24 Jun 14:23

Changes since 3.0.1

  • Fix a bug in setup.py that prevented it from working
  • Fix a bug in logger class, occurring when pyglossary is imported as a library
  • Fix a few bugs in Octopus MDict reader
  • Fix a minor bug in BGL reader
  • Update README.md

3.0.1

17 Jun 17:00

Changes since 3.0.0

  • Fix some minor bugs in Glossary class
  • Fix wrong exit status in command line from pyglossary.pyw
  • Fix exception in BGL plugin

3.0.0

09 Jun 01:30

New versioning

  • Using date as the version was a mistake I made 7 years ago
  • From now on, versions are in X.Y.Z format (major.minor.patch)
  • X, Y and Z are each a single digit (0-9) for simplicity, so version strings can be compared alphabetically
  • Starting from 3.0.0
    • Read the 3 as marking the migration to Python 3.x, or to Gtk 3.x, or simply as being alphabetically larger than previous versions (date strings)
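
The point about single-digit components can be checked directly. With one digit per component, plain string comparison orders versions correctly; it would stop working if a component ever reached two digits. The date-style string "2016.03.18" below is only an assumed example of an older version string, and version_key is a hypothetical helper.

```python
versions = ["3.0.4", "3.2.1", "3.0.0", "3.3.0", "3.1.0"]

# Single-digit components: alphabetical order equals version order.
assert sorted(versions) == ["3.0.0", "3.0.4", "3.1.0", "3.2.1", "3.3.0"]

# An assumed date-style version string sorts before 3.0.0.
assert "2016.03.18" < "3.0.0"

# With a two-digit component, string comparison gives the wrong answer...
assert "3.10.0" < "3.9.0"


def version_key(version: str) -> tuple:
    """Hypothetical helper: compare versions numerically, part by part."""
    return tuple(int(part) for part in version.split("."))


# ...while a numeric key does not.
assert version_key("3.10.0") > version_key("3.9.0")
```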

Since I believe this is the first standard version, I'm not sure which code revision I should compare it with. So I just list the most important recent changes, from both the application view and the library view.

Breaking Compatibility

  • Config migration
    • Config file becomes a config directory containing config file
    • Config file format changes from Python (loaded by exec) to JSON
    • Remove some obsolete / unused config parameters, and rename some
    • Remove permanent sort boolean flag
      • Must give --sort in command line to enable sorting for most of output formats
    • Load user-defined plugins from a directory named plugins inside config directory
  • Glossary class
    • Remove some obsolete / unused methods
      • copy, attach, merge, deepMerge, takeWords, getInputList, getOutputList
    • Rename some methods:
      • reverseDic -> reverse
    • Make some public attributes private:
      • data -> _data
      • info -> _info
      • filename -> _filename
    • Clear (reset) the Glossary instance (data, info, etc) after write operation
      • Glossary class is for converting from file(s) to file, not keeping data in memory
    • New methods:
      • convert:
        • convert method is added to be used instead of read and then write
        • Not just for convenience, but it's also recommended,
          • and lets the Glossary class have a better default behavior
          • for example it enables direct mode by default (stay tuned) if sorting is not enabled (by user or plugin)
        • all UI modules (Command line, Gtk3, Tkinter) use Glossary.convert method now
    • Sorting policy
      • sort boolean flag is now an argument to write method
        • sort=True if user gives --sort in command line
        • sort=False if user gives --no-sort in command line
        • sort=None if user does not give either, so write method itself decides what to do
      • Now we allow plugins to specify sorting policy based on output format
        • By sortOnWrite variable in plugin, with allowed values:
          • ALWAYS: force sorting even if sort=False (user gives --no-sort), used only for writing StarDict
          • DEFAULT_YES: enable sorting unless sort=False (user gives --no-sort)
          • DEFAULT_NO: disable sorting unless sort=True (user gives --sort)
          • NEVER: disable sorting even if sort=True (user gives --sort)
        • The default and common value is: sortOnWrite = DEFAULT_NO
        • Plugin can also have a global sortKey function to be used for sorting
        • (like the key argument to list.sort method, See pydoc list.sort)
    • New way of interacting with Glossary instance in plugins:
      • glos.data.append((word, defi)) -> glos.addEntry(word, defi)
      • for item in glos.data: -> for entry in glos:
      • for key, value in glos.info.items(): -> for key, value in glos.iterInfo():
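
The sorting policy described above amounts to a small decision table combining the user's sort flag (True, False or None) with the plugin's sortOnWrite value. The sketch below is one reading of those rules, not the actual Glossary code: should_sort is a hypothetical function, and representing the four policy values as strings is an assumption.

```python
from typing import Optional

# Assumed representation of the four sortOnWrite values.
ALWAYS = "ALWAYS"
DEFAULT_YES = "DEFAULT_YES"
DEFAULT_NO = "DEFAULT_NO"
NEVER = "NEVER"


def should_sort(sort: Optional[bool], sort_on_write: str = DEFAULT_NO) -> bool:
    """Hypothetical helper: decide whether to sort before writing.

    sort is True for --sort, False for --no-sort, None if neither is given.
    """
    if sort_on_write == ALWAYS:
        return True  # forced, even with --no-sort
    if sort_on_write == NEVER:
        return False  # disabled, even with --sort
    if sort_on_write not in (DEFAULT_YES, DEFAULT_NO):
        raise ValueError(f"invalid sortOnWrite value: {sort_on_write!r}")
    if sort is None:
        # The user gave neither flag, so the plugin's default applies.
        return sort_on_write == DEFAULT_YES
    return sort


assert should_sort(False, ALWAYS) is True
assert should_sort(None, DEFAULT_YES) is True
assert should_sort(False, DEFAULT_YES) is False
assert should_sort(None) is False
assert should_sort(True, DEFAULT_NO) is True
assert should_sort(True, NEVER) is False
```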

Gtk2 to Gtk3

  • Replace obsolete PyGTK-based interface with a simpler PyGI-based (Gtk3) interface

Migrating to Python 3

  • Even though the master branch has been based on Python 3 since 2016 Apr 29, there were some problems that are fixed in this release
  • If you still need to use Python 2.7, you can use branch python2.7

Introducing Direct mode

  • --direct command line option
  • reads and writes at the same time, without loading the whole data into memory
  • Partial sorting is supported
    • --sort in command line
    • --sort-cache-size=1000 is optional
  • If plugin defines sortOnWrite=ALWAYS, it means output format requires full sorting, so direct mode will be disabled
  • As mentioned above (using Glossary.convert method), direct mode is enabled by default if sorting is not enabled (by user or plugin)
  • Of course user can manually disable direct mode by giving --indirect option in command line
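
The idea of reading and writing at the same time can be sketched with Python generators: the reader yields one entry at a time and the writer consumes it immediately, so the full glossary is never held in memory. This is only an illustration of the general technique. The read_entries and write_entries functions and the tab-separated input format are hypothetical, and partial sorting is not covered here.

```python
import io
from typing import Iterable, Iterator, TextIO, Tuple

Entry = Tuple[str, str]


def read_entries(src: TextIO) -> Iterator[Entry]:
    """Hypothetical reader: lazily yield (word, definition) pairs."""
    for line in src:
        line = line.rstrip("\n")
        if not line:
            continue
        word, _, defi = line.partition("\t")
        yield word, defi


def write_entries(dst: TextIO, entries: Iterable[Entry]) -> int:
    """Hypothetical writer: write each entry as soon as it arrives."""
    count = 0
    for word, defi in entries:
        dst.write(f"{word}: {defi}\n")
        count += 1
    return count


src = io.StringIO("apple\ta fruit\nbook\tpages bound together\n")
dst = io.StringIO()
# No intermediate list: entries flow from reader to writer one by one.
written = write_entries(dst, read_entries(src))

assert written == 2
assert dst.getvalue() == "apple: a fruit\nbook: pages bound together\n"
```

A full sort needs every entry before the first one can be written, which is why an output format that requires full sorting cannot use this streaming approach.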

Progress Bar

Automatic command line Progress Bar for all input / output formats is now supported

  • Implemented based on plugins' Reader classes
  • Works both for direct mode and indirect mode
    • Only one progress bar for direct mode
    • Two progress bars for indirect mode (one while reading, one while writing)
  • Plugins must not update the progress bar anymore
  • Still no progress bar when both --direct and --sort flags are given, will be fixed later
  • User can disable progress bar by giving --no-progress-bar option (recommended for Windows users)

BGL Plugin

  • BGL plugin works better now (compared to the latest Python 2.7 code), and it's much cleaner too
  • I totally refactored the code, made it fully Python3-compatible, and much easier to understand
  • This fixes bytes/str bugs (like Bug #54), and CRC check problem for some glossaries (Bug #55)
  • I'm a fan of micro-commits and I usually hate single-commit refactoring, but this time I had no choice!

Other Changes

Feature: Add encoding option to read and write drivers of some plain-text formats

Feature: SQL and SQLite: read/write extra information from/to a new table dbinfo_extra, backward compatible

New format invented and implemented for later implementation of a Glossary Editor

  • edlin.py (Editable Linked List of Entries) is optimized for adding/modifying/removing one entry at a time
  • while we can save the changes instantly after each modification
  • Using the ideas of Doubly Linked List, and Git's hash-based object database

Rewrite non-working Reverse functionality

  • The old code was messy, not working by default, slow, and language-dependent
  • It's much faster and cleaner now

Improve and complete command line help (-h or --help)