Skip to content

A set of tools for importing and exporting subtitles in various formats.

License

Notifications You must be signed in to change notification settings

Podnapisi-NET/pysubtools

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

A set of parsers and exporters for handling with subtitle files in various
subtitles. All data is imported into unicode, it uses chardet and some
guessing table (language to encoding) to identify proper encoding of the
file. The Subtitle object also supports save and from_file. It uses it's
own SIF (Subtitle Intermediate Format) which is just a plain YAML predefined
structure with some additional constructors.

Currently only SubRip parser and exporter is implemented, other parts are
yet to come.

A quick example of parsing:
import pysubtools
# Let's detect parser from data
parser = pysubtools.parser.Parser.from_data(open('sub.srt', 'rb'),
                                            encoding = 'windows-1250')
sub = parser.parse()

# Fifth unit
sub[5]

sub[1].start
sub[1].end
# And the lines itself (list of unicode)
sub[1].text

# And direct using SubRip parser
parser = pysubtools.parser.Parser.from_format('SubRip')
sub = parser.parse(open('sub.srt', 'rb'))

# We can also use GzipFile (if using python 2.x, can use patched GzipFile)
from pysubtools.utils import PatchedGzipFile as GzipFile
parser = pysubtools.parser.Parser.from_format('SubRip')
fileobj = open('sub.srt', 'rb')
# Can use guess table for specific language
sub = parser.parse(GzipFile(fileobj = fileobj), language = 'sl')

And an example of export (let's say we have the 'sub' from previous example):

exporter = pysubtools.exporters.Exporter.from_format('SubRip')

# Let us export it into io.BytesIO
import io
buf = io.BytesIO()
exporter.export(buf, sub)
# And we have utf-8 encoded string in buf

# Let us now export it into a file with different encoding
new_exporter = pysubtools.exporters.Exporter.from_format('SubRip',
                                                         encoding = 'cp1250')
new_exporter.export('test.srt', sub)

# And we have a cp-1250 encoded subtitle :). You may also use GzipFile to
# produce compressed subtitles.

About

A set of tools for importing and exporting subtitles in various formats.

Resources

License

Stars

Watchers

Forks

Packages

No packages published