Merged
13 changes: 9 additions & 4 deletions .github/workflows/ci.yml
@@ -1,10 +1,15 @@
 name: CI

 on:
-  push:
-    branches: [master]   # only when master gets new commits
-  pull_request:
-    branches: [master]   # only when a PR is opened/updated that targets master
+  push:
+    branches:
+      - master
+      - 'development-*'   # add pattern for development branches
+  pull_request:
+    branches:
+      - master
+      - 'development-*'   # allow PRs targeting development-* branches

+
 jobs:
   test:
189 changes: 24 additions & 165 deletions README.md
@@ -4,21 +4,19 @@ Japan Association of Radio Industries and Businesses (ARIB) MPEG2 Transport Stre

[![CI](https://github.com/johnoneil/arib/actions/workflows/ci.yml/badge.svg?branch=master&event=push)](https://github.com/johnoneil/arib/actions/workflows/ci.yml)

<p>
<img src="img/gaki-2.jpg" width="200"><img src="img/ace-of-diamond.jpg" width="200"><img src="img/knights-of-sidonia.jpg" width="200">
</p>

## Description
Closed Captions (CCs) are encoded in Japanese MPEG Transport Streams as a separate PES (Packetized Elementary Stream) within the TS. The format of the data within this PES is described by the (Japanese native) ARIB B-24 standard. An English document describing this standard is included in the Arib/docs directory in this repository.

This Python package provides tools to find and parse this ARIB closed caption information in MPEG TS files.

This code can be used in your own applications or used via the arib-ts2ass tool which this package provides.

The image below shows example ARIB closed caption data displayed at runtime on a media player, generated via arib-ts2ass. The text, position and color are all driven by data derived from the MPEG TS Closed Caption elementary stream.
# Description
Closed Captions (CCs) are encoded in Japanese MPEG Transport Streams as a separate PES (Packetized Elementary Stream) within the TS. The format of the data within this PES is described by the (Japanese native) ARIB B-24 standard. An English document describing this standard is included in the Arib/docs directory in this repository.

![example of ass file](img/gaki2.png "Example ass file.")
This Python package provides tools to find and parse this ARIB closed caption information in MPEG TS files. It can be used in your own applications or via the tools which this package provides.

# Installation

Installation should be typical. We recommend a virtual environment.
Installation should be typical. We recommend using a virtual environment.

```
pip install git+https://github.com/johnoneil/arib
@@ -32,179 +30,40 @@ cd arib
pip install -e .
```

The above commands may require ```sudo```, though again I recommend installing into a Python virtualenv instead.

## `arib-ts2ass` tool

This package provides a tool (`arib-ts2ass`) that extracts ARIB-based closed caption information from an MPEG Transport Stream recording and formats it as a standard .ass (Advanced Substation Alpha) subtitle file. The image below shows a resulting .ass subtitle file loaded alongside the video file it was generated from:
![example of ass file](img/haikyu.png "Example ass file.")
Note the ts2ass tool supports (in a basic way) closed caption locations, furigana (pronunciation guide), text size and color.

If no PID is specified, arib-ts2ass will attempt to find the PID of the elementary stream carrying Closed Caption information within the specified MPEG TS file. Alternatively, a PID can be specified if it is known (see below concerning how to find PID values in TS files).
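
For readers unfamiliar with PIDs, the following is a minimal sketch of how a PID is read from a raw TS packet. It relies only on the standard MPEG-2 TS packet layout (188-byte packets, sync byte `0x47`, 13-bit PID in bytes 1 and 2). The function names `packet_pid` and `count_pids` are hypothetical and are not part of this package; the package's own PID detection may work quite differently.

```python
TS_PACKET_SIZE = 188  # fixed MPEG-2 TS packet length in bytes
SYNC_BYTE = 0x47      # every TS packet starts with this byte


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID from a single 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid TS packet")
    # PID = low 5 bits of byte 1 followed by all 8 bits of byte 2.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def count_pids(data: bytes) -> dict:
    """Count packets per PID in a buffer of back-to-back TS packets."""
    counts = {}
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pid = packet_pid(data[offset:offset + TS_PACKET_SIZE])
        counts[pid] = counts.get(pid, 0) + 1
    return counts


# Hypothetical packet whose header carries PID 0x0114 (276 decimal).
example = bytes([SYNC_BYTE, 0x01, 0x14, 0x10]) + bytes(184)
print(packet_pid(example))
```

Knowing which PIDs are present is only the first step; deciding which one carries captions still requires the PAT/PMT tables or inspection of the PES payload.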

Basic command line help is available as below.
```
>arib-ts2ass --help
usage: arib-ts2ass [-h] [-o OUTFILE] [-p PID] [-v] [-q] [-t TMAX] [-m TIMEOFFSET] [--disable-drcs] infile

Remove ARIB formatted Closed Caption information from an MPEG TS file and format the results as a standard .ass
subtitle file.

positional arguments:
infile Input filename (MPEG2 Transport Stream File)

options:
-h, --help show this help message and exit
-o OUTFILE, --outfile OUTFILE
Output filename (.ass subtitle file)
-p PID, --pid PID Specify a PID of a PES known to contain closed caption info (tool will attempt to find the
proper PID if not specified.).
-v, --verbose Verbose output.
-q, --quiet Does not write to stdout.
-t TMAX, --tmax TMAX Subtitle display time limit (seconds).
-m TIMEOFFSET, --timeoffset TIMEOFFSET
Shift all time values in generated .ass file by indicated floating point offset in seconds.
--disable-drcs Disable emitting .ass drawing code for runtime (dynamic) DRCS characters.
```
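
To make the `--timeoffset` option concrete, here is a small sketch of shifting a single .ass timestamp by a floating point offset in seconds. It assumes the usual .ass `H:MM:SS.cc` (centisecond) timestamp format. The helper name `shift_ass_time` is hypothetical, and clamping negative results to zero is an assumption of this sketch, not documented behavior of arib-ts2ass.

```python
def shift_ass_time(stamp: str, offset_seconds: float) -> str:
    """Shift an ASS timestamp (H:MM:SS.cc) by offset_seconds."""
    hours, minutes, seconds = stamp.split(":")
    # Work in whole centiseconds to avoid accumulating float error.
    total_cs = round((int(hours) * 3600 + int(minutes) * 60 + float(seconds)) * 100)
    # Assumption: times that would become negative are clamped to zero.
    total_cs = max(0, total_cs + round(offset_seconds * 100))
    h, rem = divmod(total_cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


print(shift_ass_time("0:00:10.50", 1.25))
```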

### DRCS Support

I've introduced basic DRCS (Dynamically Redefinable Character Set) support, so DRCS characters encountered in the .ts stream are cached and emitted as .ass drawing code wherever they appear in text. See the following image:

![DRCS in a closed caption](img/drcs.png)

This behavior can be turned off if the .ass drawing code is too heavyweight by specifying the `--disable-drcs` command line option. This results in previous behavior whereby the "unknown character" glyph is emitted for DRCS (see below).

![DRCS disabled unknown character](img/no-drcs.png)

# Experiments and Other Info

## `arib-ts-extract` and `arib-es-extract`

This package also installs two additional tools which can be used to draw basic CC information from MPEG ts and es files. These are ```arib-ts-extract``` and ```arib-es-extract```. They skip the usual .ass formatting and show a text representation of the basic ARIB codes present in the .ts or .es file. See the example below:
```
joneil@joneilDesktop ~/code/arib $ arib-es-extract tests/toriko_subs.es

<CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c"><clear screen>
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c"><clear screen>
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c">
<Screen Posiiton to 71,67><世はグルメ時代>
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c">
<Screen Posiiton to 71,65><食の探求者<Medium Text> <Normal Text>美食屋たちは訢
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c">
<Screen Posiiton to 71,65>あまたの食材を追い求める>
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c"><Small Text>
<Screen Posiiton to 76,66><Normal Text><そして<Medium Text> <Normal Text>この世の食材の頂点
<Screen Posiiton to 70,66><Small Text>ゴッド<Medium Text>
<Screen Posiiton to 70,75><Small Text>ほかく<Normal Text>
<Screen Posiiton to 71,65>GODの捕獲を目指す訢
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c">
<Screen Posiiton to 71,66>一人の美食屋がいた!>
<CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c"><clear screen>
<clear screen><CS:"620;480 V"><CS:"170;30 _"><CS:"1;0000 c">
<Screen Posiiton to 71,64>頰〜
```

In the above output, each line is not timestamped, but you can see the cursor movement info (screen positions in character row/col) text size info, and the on screen CC text data.

Interestingly, you can see that furigana (ruby text, a pronunciation guide) is present for many romaji (Latin alphabet) and kanji words. For example, the furigana "ゴッド" is positioned as small text above the normal-sized word "GOD".
# Tools Provided

Timestamp info for the various text and clear screen commands would have to be drawn out of the .TS packet info. This functionality is not present in this package.
## `arib-ts2srt`

Also note that in the example above, screen positions and other textual information was described using the ARIB control character set.
There is another way in which such info is carried: via the ARIB control *sequence* character set. Please refer to the ARIB.control_characters.CS class for more info.
This package provides the `arib-ts2srt` tool, which extracts closed caption data from a `.ts` file and produces a simple `.srt` file. This application also serves as a simple example of how to use the underlying library.

An example of inline control sequences carrying text position and other info follows:
```
<CS:"7 S"><CS:"170;30 _"><CS:"620;480 V"><CS:"36;36 W"><CS:"4 X"><CS:"24 Y"><Small Text><CS:"170;389 a">えいえゅゃ<Normal Text><CS:"170;449 a">栄純が<Medium Text><Small Text><CS:"530;449 a">い<Normal Text><CS:"190;509 a">きのぃとはゃなに言っくら訢
arib-ts2srt <input .ts file> [-o <optional output .srt file>]
```
Refer to the ARIB documentation for descriptions of what these control sequences mean, but some can be summarized here:
* 'S' character indicates the text layout style according to the ARIB std (here 7 indicates horizontal text with geometry based on a screen of 960x540)
* '_' underscore indicates UL corner in pixels of CC area (here at x=170,y=30).
* 'V' indicates the width, height in pixels of the CC area (here 620x480). Note that this is inset inside a standard screen dimension of 960x540.
* 'W' indicates the height and width of a normal sized character in pixels. Japanese characters tend to be square.
* 'X' is the pixel spacing between characters in CCs.
* 'Y' is the pixel spacing between lines in CCs.
* 'a' Positions the cursor to a screen position in pixels. This is in contrast to the dedicated control character APS (Active Position Set) above which positions the cursor to a particular character *line* and *column*. APS style line and column positions can be translated to pixel positions by using the character width and height, space between characters and lines and the UL position of the CC area (see above).
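
The control sequences listed above can be sketched in code. The snippet below parses the textual `<CS:"...">` tokens as printed in the example and converts a character line/column to a pixel position. This is an illustration only: `parse_control_sequences` and `aps_to_pixels` are hypothetical helpers, not part of the package, and the conversion assumes zero-based row/column indices, a `W` parameter order of width then height, and that the returned point is the top-left corner of the character cell. The ARIB standard should be consulted for the exact reference point.

```python
import re

# Matches tokens such as <CS:"170;30 _"> : numeric parameters, a space, a final character.
CS_TOKEN = re.compile(r'<CS:"([0-9;]+) (.)">')


def parse_control_sequences(text):
    """Return a list of (final_char, [int parameters]) for each CS token."""
    return [(final, [int(p) for p in params.split(";")])
            for params, final in CS_TOKEN.findall(text)]


def aps_to_pixels(row, col, layout):
    """Translate a character row/column to pixels (simplified assumption)."""
    area_x, area_y = layout["_"]   # upper-left corner of the CC area
    char_w, char_h = layout["W"]   # normal character size (assumed width, height)
    (char_gap,) = layout["X"]      # spacing between characters
    (line_gap,) = layout["Y"]      # spacing between lines
    x = area_x + col * (char_w + char_gap)
    y = area_y + row * (char_h + line_gap)
    return x, y


header = '<CS:"7 S"><CS:"170;30 _"><CS:"620;480 V"><CS:"36;36 W"><CS:"4 X"><CS:"24 Y">'
layout = {final: params for final, params in parse_control_sequences(header)}
print(layout["V"])
print(aps_to_pixels(row=2, col=3, layout=layout))
```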

# Manually drawing a PID and/or PES from a TS file
I've updated the arib-ts2ass tool above to automatically find the ID (PID) of the elementary stream carrying closed captions (if there is one) in any MPEG TS file. However, if you'd like to find these PID values for yourself, I recommend using the ```tsinfo``` tool as below:
```
joneil@joneilDesktop ~/code/arib/analysis $ tsinfo <filename>.ts
Reading from <filename>.ts
Scanning 1000 TS packets

Packet 452 is PAT
Program list:
Program 2064 -> PID 0101 (257)

Packet 796 is PMT with PID 0101 (257)
Program 2064, version 15, PCR PID 0100 (256)
Program info (15 bytes): 09 04 00 05 e0 31 f6 04 00 0e e0 32 c1 01 84
Conditional access: id 0005 (5) PID 0031 (49) data (9 bytes): f6 04 00 0e e0 32 c1 01 84
Descriptor tag f6 (246) (4 bytes): 00 0e e0 32
Descriptor tag c1 (193) (1 byte): 84
Program streams:
PID 0111 ( 273) -> Stream type 02 ( 2) H.262/13818-2 video (MPEG-2) or 11172-2 constrained video
ES info (6 bytes): 52 01 00 c8 01 47
Descriptor tag 52 ( 82) (1 byte): 00
Descriptor tag c8 (200) (1 byte): 47
PID 0112 ( 274) -> Stream type 0f ( 15) 13818-7 Audio with ADTS transport syntax
ES info (3 bytes): 52 01 10
Descriptor tag 52 ( 82) (1 byte): 10
PID 0114 ( 276) -> Stream type 06 ( 6) H.222.0/13818-1 PES private data (maybe Dolby/AC-3 in DVB)
ES info (8 bytes): 52 01 30 fd 03 00 08 3d
Descriptor tag 52 ( 82) (1 byte): 30
Descriptor tag fd (253) (3 bytes): 00 08 3d
PID 0115 ( 277) -> Stream type 06 ( 6) H.222.0/13818-1 PES private data (maybe Dolby/AC-3 in DVB)
ES info (20 bytes): 52 01 38 09 04 00 05 ff ff f6 04 00 0e ff ff fd 03 00 08 3c
...
```
I recognize the PID 276, (stream type 6) as the PES private CC data from experience. Typically, tsinfo identifies Closed Caption elementary streams as `PES private data (maybe Dolby/AC-3 in DVB)`. The relevant CCs are usually the *first* elementary stream reported as well.
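
The heuristic above (look for stream type 06, prefer the first one) could be scripted roughly as follows. This is a sketch that assumes `tsinfo` prints stream lines in the format shown in the listing; the helper name `candidate_caption_pids` is hypothetical, and a stream of type 06 is only a candidate, not a guaranteed caption stream.

```python
import re

# Matches lines such as: "PID 0114 ( 276) -> Stream type 06 ( 6) ..."
STREAM_LINE = re.compile(
    r"PID\s+([0-9a-fA-F]{4})\s+\(\s*\d+\)\s+->\s+Stream type\s+([0-9a-fA-F]{2})"
)


def candidate_caption_pids(tsinfo_output: str) -> list:
    """Return decimal PIDs of all streams reported with stream type 0x06."""
    pids = []
    for hex_pid, stream_type in STREAM_LINE.findall(tsinfo_output):
        if int(stream_type, 16) == 0x06:
            pids.append(int(hex_pid, 16))
    return pids


sample = """\
PID 0111 ( 273) -> Stream type 02 ( 2) H.262/13818-2 video (MPEG-2) or 11172-2 constrained video
PID 0112 ( 274) -> Stream type 0f ( 15) 13818-7 Audio with ADTS transport syntax
PID 0114 ( 276) -> Stream type 06 ( 6) H.222.0/13818-1 PES private data (maybe Dolby/AC-3 in DVB)
PID 0115 ( 277) -> Stream type 06 ( 6) H.222.0/13818-1 PES private data (maybe Dolby/AC-3 in DVB)
"""
print(candidate_caption_pids(sample))
```

The first entry of the returned list corresponds to the "first elementary stream reported" rule of thumb.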

Note that sometimes an adequate PAT (Program Association Table) may not be within the first 1000 packets of the .TS, so you might have to run tsinfo with an additional argument to scan more packets for a PAT.
```
tsinfo -max 20000 <filename>.ts
```
An option exists to alternately output `.srt` data directly to stdout:

Then, if you wish, you can use ts2es to draw out the ES.
```
ts2es -pid 276 <input>.ts <output>.es
arib-ts2srt --stdou <input .ts file> > output.srt
```
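
For reference, a `.srt` file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range followed by the text and a blank line. The sketch below shows that layout; `srt_timestamp` and `srt_cue` are hypothetical helpers for illustration and are not taken from the `arib-ts2srt` source.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue; cues are separated by a blank line when joined."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


cues = [srt_cue(1, 1.0, 2.5, "世はグルメ時代")]
print("\n".join(cues))
```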

## arib-autosub
This repo also contains some code for an experimental application "arib-autosub" which draws Closed Caption information out of an MPEG TS file and then translates it via Bing Translate.
## `arib-ts2ass`

As I'm no longer installing this tool when this package is installed the description below is only for reference:
This tool outputs ARIB subtitle information in a formatted `.ass` ("Advanced Substation Alpha") file. The advantage is that text position, color and size can be captured and presented as intended in the `.ts` stream. This is especially advantageous in presenting furigana or ruby pronunciation guides correctly.

Command line help is available as below:
```
(arib)joneil@joneilDesktop ~/code/arib $ arib-autosub -h
usage: arib-autosub [-h] infile pid
<img src="img/knights-of-sidonia-2.jpg" width="400">

Auto translate jp CCs in MPEG TS file.
If no subtitle stream identifier (PID) is provided to the tool, arib-ts2ass will attempt to find the PID of the elementary stream carrying Closed Caption information. Alternatively, one can be specified if it is known (see below concerning how to find PID values in TS files).

positional arguments:
infile Input filename (MPEG2 Transport Stream File)
pid Pid of closed caption ES to extract from stream.

optional arguments:
-h, --help show this help message and exit

```

The application requires 2 command line arguments, the name of the input .ts file and the PID of the CC elementary stream. Please see below regarding how to identify a Closed Caption PID in a .ts file using the tsinfo tool.

An example screenshot of a resultant subtitle follows (from a news broadcast):
![example of translated ccs](img/news.png "Example of auto translated Closed Captions.")
### DRCS Support

Currently, text position and color are not carried through the translation process.
This tool now has basic DRCS (Dynamically Redefinable Character Set) support, so DRCS characters encountered in the .ts stream are cached and emitted as .ass drawing code wherever they appear in text. See the following image:

Because this tool uses the Bing Translate API, the user must get their own "Client ID" and "Client secret" credentials from the Windows Azure Marketplace. These need to be defined in the arib.secret_key module.
![DRCS in a closed caption](img/drcs.png)

To find the PES ID of the closed captions stream within any TS (if it exists!) see the section below.
This behavior can be turned off if the .ass drawing code is too heavyweight by specifying the `--disable-drcs` command line option. This results in previous behavior whereby the "unknown character" glyph is emitted for DRCS (see below).

The translation results are not good. In fact, they are often lewd and comical. Still, this is an interesting experiment. To illustrate the deficiencies of the approach, I present the following screenshot, translating the shot from the previous section. You'll notice that despite the simplicity of the original source, the translation is off. It does give a "general sense" of meaning, however.
![example of auto translation](img/haikyu_eng.png "Example poor auto translation.")
![DRCS disabled unknown character](img/no-drcs.png)
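
To give a feel for what ".ass drawing code" means here, the following sketch turns a small 1-bit glyph bitmap into an ASS drawing string using the standard `\p1` drawing mode with `m` (move) and `l` (line) commands. It is not the package's implementation: `bitmap_to_ass_drawing` is a hypothetical helper, and it assumes a 1-bit bitmap given as strings of `0` and `1`, whereas real DRCS glyphs may carry multiple gray levels.

```python
def bitmap_to_ass_drawing(rows, scale=1):
    """Convert rows of '0'/'1' characters into an ASS drawing string.

    Each horizontal run of set pixels becomes one closed rectangle.
    """
    shapes = []
    for y, row in enumerate(rows):
        x = 0
        while x < len(row):
            if row[x] != "1":
                x += 1
                continue
            start = x
            while x < len(row) and row[x] == "1":
                x += 1
            left, right = start * scale, x * scale
            top, bottom = y * scale, (y + 1) * scale
            shapes.append(
                f"m {left} {top} l {right} {top} {right} {bottom} {left} {bottom}"
            )
    # \p1 switches the renderer into drawing mode, \p0 switches back to text.
    return "{\\p1}" + " ".join(shapes) + "{\\p0}"


print(bitmap_to_ass_drawing(["110", "001"]))
```

Even with run merging, a 36x36 glyph can expand to many rectangles per occurrence, which is one reason such drawing code can feel heavyweight.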

# Experiments and Other Info

See [here](./experiments.md)