From d7c331a55f1e4aa8413e748c0cfe086d3e1d9eed Mon Sep 17 00:00:00 2001
From: sunghee-hwang <97494915+sunghee-hwang@users.noreply.github.com>
Date: Thu, 17 Aug 2023 17:46:57 +0900
Subject: [PATCH] 4CC for codec_id should not be linked.
---
index.bs | 44 ++++++++++++++++++++++----------------------
1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/index.bs b/index.bs
index 5a2e21cb..c2e9594c 100644
--- a/index.bs
+++ b/index.bs
@@ -593,21 +593,21 @@ class codec_config() {
codec_config_id defines an identifier for a codec configuration. Within an [=IA Sequence=], there SHALL be one unique [=codec_config_id=] per codec. There SHALL be exactly one [=Codec Config OBU=] with a given identifier in a set of [=Descriptors=]. [=Audio Element=]s use this identifier to indicate that its corresponding [=Audio Substream=]s are coded with this codec configuration.
codec_id indicates a ‘four-character code’ (4CC) to identify the codec used to generate the coded [=Audio Substream=]s. For this version of the specification, it SHALL be set to one of the four [=codec_id=] values defined below:
-- 'Opus': All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL comply with the [[!RFC6716]] specification and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#opus-specific]].
-- 'mp4a': All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL comply with the [[!AAC]] specification and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#aac-lc-specific]].
-- 'fLaC': All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL comply with the [[!FLAC]] specification and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#flac-specific]].
-- 'ipcm': All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL contain linear PCM (LPCM) audio samples and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#lpcm-specific]].
+- Opus: All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL comply with the [[!RFC6716]] specification and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#opus-specific]].
+- mp4a: All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL comply with the [[!AAC]] specification and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#aac-lc-specific]].
+- fLaC: All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL comply with the [[!FLAC]] specification and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#flac-specific]].
+- ipcm: All coded [=Audio Substream=]s referred to by all [=Audio Element=]s with this codec configuration SHALL contain linear PCM (LPCM) audio samples and the [=decoder_config()=] structure SHALL comply with the constraints given in [[#lpcm-specific]].
Parsers compliant with this version of the specification SHOULD ignore [=Codec Config OBU=]s with an unknown [=codec_id=].
-NOTE: 'ipcm' should not be confused with lpcm, which is another 4CC to identify codecs in other container formats (e.g., QuickTime).
+NOTE: ipcm should not be confused with lpcm, which is another 4CC to identify codecs in other container formats (e.g., QuickTime).
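As a minimal sketch of the check described above (not part of the patch), a parser might compare the 4CC against the four values this version defines. The names `KNOWN_CODEC_IDS` and `is_known_codec_id` are hypothetical, and the sketch assumes the 4CC has already been read from the bitstream as four raw bytes.

```python
# Hypothetical helper; names are illustrative and not taken from the specification.
KNOWN_CODEC_IDS = frozenset({b"Opus", b"mp4a", b"fLaC", b"ipcm"})


def is_known_codec_id(fourcc: bytes) -> bool:
    """Return True if the 4CC is one of the four codec_id values listed above."""
    if len(fourcc) != 4:
        raise ValueError("a 4CC is exactly four bytes")
    # Byte-for-byte comparison, so b"lpcm" and b"opus" are both rejected.
    return fourcc in KNOWN_CODEC_IDS
```

With this sketch, `is_known_codec_id(b"lpcm")` is false, which matches the note that lpcm is a different 4CC from ipcm. A parser could use a false result as its cue to ignore the Codec Config OBU.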
num_samples_per_frame indicates the frame length, in samples, of the [=audio_frame()=] provided in the audio_frame_obu. It SHALL NOT be set to zero. If the [=decoder_config()=] structure for a given codec specifies a value for the frame length, the two values SHALL be equal.
audio_roll_distance indicates how many audio frames prior to the current audio frame need to be decoded (and the decoded samples discarded) to set the encoder in a state that will produce the perfect decoded audio signal. It SHALL always be a negative value or zero. For some audio codecs, even if an audio frame can be decoded independently, the decoded signal after decoding only that frame may not represent a perfect, decoded audio signal, even ignoring compression artifacts. This can be due to overlap transforms. While potentially acceptable when starting to decode an [=Audio Substream=], it may be problematic when automatically switching between similar [=Audio Substream=]s of different quality and/or bitrate.
-- It SHALL be set to -R when [=codec_id=] is set to 'Opus', where R is ceil(3840 / [=num_samples_per_frame=]).
-- It SHALL be set to -1 when [=codec_id=] is set to 'mp4a'.
-- It SHALL be set to 0 when [=codec_id=] is set to 'fLaC' or 'ipcm'.
+- It SHALL be set to -R when [=codec_id=] is set to Opus, where R is ceil(3840 / [=num_samples_per_frame=]).
+- It SHALL be set to -1 when [=codec_id=] is set to mp4a.
+- It SHALL be set to 0 when [=codec_id=] is set to fLaC or ipcm.
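The three rules above can be sketched as a small function (illustration only, not part of the patch). The function name and the string representation of codec_id are assumptions; the arithmetic follows the ceil(3840 / num_samples_per_frame) rule as stated.

```python
def expected_audio_roll_distance(codec_id: str, num_samples_per_frame: int) -> int:
    """Return the audio_roll_distance value the rules above require.

    Hypothetical helper: codec_id is passed as a 4-character string.
    """
    if num_samples_per_frame <= 0:
        # The text above says num_samples_per_frame SHALL NOT be zero.
        raise ValueError("num_samples_per_frame must be positive")
    if codec_id == "Opus":
        # Integer ceiling of 3840 / num_samples_per_frame, negated.
        return -((3840 + num_samples_per_frame - 1) // num_samples_per_frame)
    if codec_id == "mp4a":
        return -1
    if codec_id in ("fLaC", "ipcm"):
        return 0
    raise ValueError(f"unknown codec_id: {codec_id!r}")
```

For example, an Opus frame length of 960 samples gives R = 4, so the function returns -4.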
decoder_config() specifies the set of codec parameters required to decode the [=Audio Substream=]. It is byte aligned.
@@ -739,7 +739,7 @@ NOTE: For a given [=audio_element_type=], a future version of the specification
- The type PARAMETER_DEFINITION_MIX_GAIN SHALL NOT be present in [=Audio Element OBU=].
- The type SHALL NOT be duplicated in one [=Audio Element OBU=].
-- When [=codec_id=] = 'fLaC' or 'ipcm', the type PARAMETER_DEFINITION_RECON_GAIN SHALL NOT be present.
+- When [=codec_id=] = fLaC or ipcm, the type PARAMETER_DEFINITION_RECON_GAIN SHALL NOT be present.
- When [=num_layers=] > 1, the type PARAMETER_DEFINITION_RECON_GAIN SHALL be present.
- When the highest [=loudspeaker_layout=] of the (non-)scalable channel audio (i.e., [=num_layers=] = 1) is less than or equal to 3.1.2ch, the type PARAMETER_DEFINITION_DEMIXING SHALL NOT be present.
- When the highest [=loudspeaker_layout=] of the scalable channel audio (i.e., [=num_layers=] > 1) is greater than 3.1.2ch, both PARAMETER_DEFINITION_DEMIXING and PARAMETER_DEFINITION_RECON_GAIN types SHALL be present.
@@ -1621,7 +1621,7 @@ For legacy codecs, [=decoder_config()=] SHALL have exactly the same information
### OPUS Specific ### {#opus-specific}
-[=codec_id=] SHALL be 'Opus'.
+[=codec_id=] SHALL be Opus.
[=decoder_config()=] for OPUS conforms to [=ID Header=] with [=ChannelMappingFamily=] = 0 in [[!RFC7845]] with the following constraints:
- [=Magic Signature=] SHALL NOT be present.
@@ -1636,7 +1636,7 @@ The sample rate used for computing offsets SHALL be 48 kHz.
### AAC-LC Specific ### {#aac-lc-specific}
-[=codec_id=] SHALL be 'mp4a'.
+[=codec_id=] SHALL be mp4a.
[=decoder_config()=] for AAC-LC is the [=DecoderConfigDescriptor()=] from [[!MP4-Systems]], which is a subset of [=ESDBox=] for [[!MP4-Audio]], with the following constraints:
- [=objectTypeIndication=] = 0x40
@@ -1656,7 +1656,7 @@ The sample rate used for computing offsets SHALL be the rate indicated by the [=
### FLAC Specific ### {#flac-specific}
-[=codec_id=] SHALL be 'fLaC', the FLAC stream marker in ASCII, meaning byte 0 of the stream is 0x66, followed by 0x4C 0x61 0x43.
+[=codec_id=] SHALL be fLaC, the FLAC stream marker in ASCII, meaning byte 0 of the stream is 0x66, followed by 0x4C 0x61 0x43.
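The byte values quoted above can be checked with a short sketch (illustration only, not part of the patch); the variable names are hypothetical.

```python
# Encode the 4CC as ASCII and show each byte in hexadecimal.
fourcc = "fLaC".encode("ascii")
flac_marker_hex = " ".join(f"0x{byte:02X}" for byte in fourcc)
print(flac_marker_hex)  # prints: 0x66 0x4C 0x61 0x43
```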
[=decoder_config()=] for FLAC is the [=METADATA_BLOCK=]s of [[!FLAC]] for mono or stereo channels. The [=METADATA_BLOCK_STREAMINFO=] has the following constraints:
- [=minimum block size=] SHALL be set to [=num_samples_per_frame=].
@@ -1676,7 +1676,7 @@ The sample rate used for computing offsets SHALL be the sampling rate indicated
### LPCM Specific ### {#lpcm-specific}
-[=codec_id=] SHALL be 'ipcm'.
+[=codec_id=] SHALL be ipcm.
[=decoder_config()=] for LPCM is as follows:
@@ -1857,7 +1857,7 @@ NOTE: Multiple sample entries may be used in a track, for example when the track
- The 'stts' or 'trun' box SHALL indicate the number of audio samples in an [=IA Sample=] (i.e., the duration of an [=IA Sample=]).
- The duration of an [=IA Sample=] includes audio samples trimmed at the beginning but excludes audio samples trimmed at the end.
- Sample Group
- - When the [=codec_id=] is set to 'Opus' or 'mp4a' in an [=IA Track=], every sample SHALL be associated with a sample group of type 'roll'. The [=roll_distance=] value SHALL equal the value of the [=audio_roll_distance=] field in the [=Codec Config OBU=] stored in the [=configOBUs=] array in the sample entry.
+ - When the [=codec_id=] is set to Opus or mp4a in an [=IA Track=], every sample SHALL be associated with a sample group of type 'roll'. The [=roll_distance=] value SHALL equal the value of the [=audio_roll_distance=] field in the [=Codec Config OBU=] stored in the [=configOBUs=] array in the sample entry.
- Composition Time Stamp (CTS)
- For each [=IA Sample=], CTS = DTS (Decoding Time Stamp), and as a consequence, the 'ctts' box (and similar signaling in movie fragments) SHALL NOT be used.
@@ -1924,25 +1924,25 @@ DASH and other applications require defined values for the 'codecs' parameter sp
- The fourth element and any additional elements, if any, SHALL be the elements of the codecs parameter string if that stream was carried in its own track (i.e., not encapsulated in IAMF).
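A minimal sketch of how such a string might be assembled (illustration only, not part of the patch). The function and table names are hypothetical. The second and third elements are passed through as opaque strings, since their definitions are outside this hunk, and the per-codec suffixes are copied from the examples that follow rather than derived.

```python
# Hypothetical table: fourth (and later) elements, copied from the examples in the text.
CODEC_SUFFIX = {
    "Opus": "Opus",
    "mp4a": "mp4a.40.2",  # taken verbatim from the mp4a example
    "fLaC": "fLaC",
    "ipcm": "ipcm",
}


def iamf_codecs_string(second: str, third: str, codec_id: str) -> str:
    """Join the elements with '.' to form a codecs parameter string."""
    if codec_id not in CODEC_SUFFIX:
        raise ValueError(f"unknown codec_id: {codec_id!r}")
    return ".".join(["iamf", second, third, CODEC_SUFFIX[codec_id]])
```

Calling `iamf_codecs_string("xxx", "yyy", "mp4a")` yields `iamf.xxx.yyy.mp4a.40.2`, matching the placeholder form used in the examples.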
For example,
-- the codecs parameter string for [=codec_id=] = 'Opus' is
+- the codecs parameter string for [=codec_id=] = Opus is
```
iamf.xxx.yyy.Opus
```
-- the codecs parameter string for [=codec_id=] = 'mp4a' is
+- the codecs parameter string for [=codec_id=] = mp4a is
```
iamf.xxx.yyy.mp4a.40.2
```
-- the codecs parameter string for [=codec_id=] = 'fLaC' is
+- the codecs parameter string for [=codec_id=] = fLaC is
```
iamf.xxx.yyy.fLaC
```
-- the codecs parameter string for [=codec_id=] = 'ipcm' is
+- the codecs parameter string for [=codec_id=] = ipcm is
```
iamf.xxx.yyy.ipcm
@@ -2126,7 +2126,7 @@ For example, consider the case where CL #1 = 2ch, CL #2 = 3.1.2ch, CL #3 = 5.1.2
### Recon Gain ### {#processing-scalablechannelaudio-recongain}
-Recon gain is REQUIRED only for [=num_layers=] > 1 and when [=codec_id=] is set to 'Opus' or 'mp4a'.
+Recon gain is REQUIRED only for [=num_layers=] > 1 and when [=codec_id=] is set to Opus or mp4a.
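The condition above reduces to a two-part predicate, sketched here for illustration (not part of the patch). The function name is hypothetical and codec_id is assumed to be available as a 4-character string.

```python
def recon_gain_required(num_layers: int, codec_id: str) -> bool:
    """True only when there is more than one layer and the codec is Opus or mp4a."""
    return num_layers > 1 and codec_id in ("Opus", "mp4a")
```

Under this sketch, a two-layer fLaC or ipcm stream returns false, as does a single-layer Opus stream.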
[=recon_gain=] SHALL only be applied to all audio samples of the de-mixed channels from the De-mixer module.
- [=recon_gain_info_parameter_data()=] indicates each channel of CL #i to which [=recon_gain=] needs to be applied and provides the [=recon_gain=] value for each frame of the channel.
@@ -2144,8 +2144,8 @@ The figure below shows the smoothing scheme of [=recon_gain=].
+- When [=codec_id=] is set to Opus: olen = 60.
+- When [=codec_id=] is set to mp4a: olen = 64.
## Mix Presentation ## {#processing-mixpresentation}
@@ -2784,7 +2784,7 @@ Down-mix paths, which conform to the above rule, SHALL be only allowed for scala
This section RECOMMENDs how to generate [=recon_gain=].
-NOTE: Recon gain generation is not required when the codec is lossless, i.e., when [=codec_id=] is set to 'ipcm' or 'fLaC'.
+NOTE: Recon gain generation is not required when the codec is lossless, i.e., when [=codec_id=] is set to ipcm or fLaC.
Recon gain needs to be applied to de-mixed channels. For this, the IA encoder needs to deliver it to IA decoders.