From 313773fdebb6372e4612d033cdd54fc3812dd2a4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=ED=99=A9=EC=84=B1=ED=9D=AC/=EC=B0=A8=EC=84=B8=EB=8C=80=20?= =?UTF-8?q?Display=20Lab=28SR=29/=EC=82=BC=EC=84=B1=EC=A0=84=EC=9E=90?= Date: Wed, 30 Aug 2023 17:30:33 +0900 Subject: [PATCH 1/6] Fix #736, Section Reordering --- index.bs | 455 +++++++++++++++++++++++++++---------------------------- 1 file changed, 227 insertions(+), 228 deletions(-) diff --git a/index.bs b/index.bs index 4c5806bb..ef4334c1 100644 --- a/index.bs +++ b/index.bs @@ -694,7 +694,7 @@ audio_element_type: The type of audio representation. audio_substream_id indicates the identifier for an [=Audio Substream=] which this [=Audio Element=] refers to. -Let a particular [=Channel Group=]'s [=Audio Substream=]s be indexed as \(\left[c, n_c\right]\), where a [=Channel Group=] format is described in [[#iamfgeneration-scalablechannelaudio-channelgroupformat]] and +Let a particular [=Channel Group=]'s [=Audio Substream=]s be indexed as \(\left[c, n_c\right]\), where a [=Channel Group=] format is described in [[#scalablechannelaudio-channelgroupformat]] and - \(c = \left[1, \ldots, C\right]\) is the [=Channel Group=] index and \(C\) is the number of [=Channel Group=]s. - \(n_c = \left[1, \ldots, N_c\right]\) is the [=Audio Substream=] index in the \(c\)-th [=Channel Group=] and \(N_c\) is the number of [=Audio Substream=]s in the \(c\)-th [=Channel Group=]. @@ -913,11 +913,14 @@ When an [=Audio Element=] is composed of \(G(r)\) number of [=Audio Substream=]s
Immersive Audio Sequence with scalable channel audio (before OBU packing). See [[#standalone]] for related details on OBU ordering within an IA Sequence.
-Each [=Channel Group=] (or scalable audio channel layer) is associated with a different [=loudspeaker_layout=]. The IA decoder SHALL select one of the layers according to the following rules, in order: +Each [=Channel Group=] (or scalable channel audio layer) is associated with a different [=loudspeaker_layout=]. The IA decoder SHALL select one of the layers according to the following rules, in order: - The IA decoder SHOULD first attempt to select the layer with a [=loudspeaker_layout=] that matches the physical playback layout. - If there is no match, the IA decoder SHOULD select the layer with the closest [=loudspeaker_layout=] to the physical layout and then apply up- or down-mixing appropriately, after decoding and reconstruction of the channel audio. Sections [[#iamfgeneration-scalablechannelaudio-downmixmechanism]] and [[#processing-downmixmatrix]] provide examples of dynamic and static down-mixing matrices for some common layouts that MAY be used. +The relationship among all [=Channel Group=]s for an [=Audio Element=] SHALL comply with [[#scalablechannelaudio-channelgroupformat]] and the relationship among all channel layouts indicated by [=loudspeaker_layout=]s specified in an [=Audio Element OBU=] SHALL comply with [[#scalablechannelaudio-channellayoutgenerationrule]]. + + Semantics num_layers indicates the number of [=Channel Group=]s for scalable channel audio. It SHALL NOT be set to zero and its maximum value SHALL be 6. @@ -1018,6 +1021,36 @@ Bit position : Channel Name output_gain indicates the gain value to be applied to the mixed channels which are indicated by [=output_gain_flags=], where each mixed channel is generated by down-mixing two or more input channels. It is computed as \(20 \times \log_{10}(f)\), where \(f\) is the factor by which to scale the mixed channels. It is stored as a 16-bit, signed, two’s complement fixed-point value with 8 fractional bits (i.e., Q7.8)([[Q-Format]]). 
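The Q7.8 storage of output_gain described above can be illustrated with a short sketch. It assumes the field has already been read from the bitstream as an unsigned 16-bit integer. The function names (`q7_8_to_db`, `db_to_q7_8`, `db_to_linear`) are hypothetical and not defined by the specification.

```python
def q7_8_to_db(raw: int) -> float:
    """Interpret a 16-bit field as a signed Q7.8 value (8 fractional bits)."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("expected an unsigned 16-bit field value")
    # Two's complement: values with the top bit set are negative.
    signed = raw - 0x10000 if raw & 0x8000 else raw
    # The implicit scaling factor is 2^(-8).
    return signed / 256.0


def db_to_q7_8(gain_db: float) -> int:
    """Quantize a gain in dB to the 16-bit Q7.8 bit pattern."""
    scaled = int(round(gain_db * 256))
    if not -0x8000 <= scaled <= 0x7FFF:
        raise ValueError("gain is outside the Q7.8 range")
    return scaled & 0xFFFF


def db_to_linear(gain_db: float) -> float:
    """Invert gain_db = 20 * log10(f) to recover the scale factor f."""
    return 10.0 ** (gain_db / 20.0)
```

For example, the bit pattern `0xFE80` decodes to -1.5 dB, and `db_to_linear(-1.5)` gives the factor by which the mixed channels would be scaled.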
+#### Channel Layout Generation Rule (Normative) #### {#scalablechannelaudio-channellayoutgenerationrule} + +This section describes the generation rule for channel layouts for scalable channel audio. + +For a given channel layout (CL #n) of a channel-based input [=3D audio signal=], any list of CLs ({CL #i: i = 1, 2, ..., n}) for scalable channel audio SHALL conform with the following rules: +- Xi ≤ Xi+1 and Yi ≤ Yi+1 and Zi ≤ Zi+1 except Xi = Xi+1, Yi = Yi+1 and Zi = Zi+1 for i = n-1, n-2, ..., 1, where the i-th channel layout CL #i = Xi.Yi.Zi, Xi is the number of surround channels, Yi is the number of LFE channels, and Zi is the number of height channels. +- CL #i is one of the [=loudspeaker_layout=]s supported in this version of the specification. + +Scalable channel audio with [=num_layers=] > 1 SHALL only allow down-mix paths that conform to the rules above, as depicted in the figure below. + +
+
IA Down-mix Path for scalable channel audio
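The first rule above (each count is non-decreasing and consecutive layouts are not identical) can be checked mechanically. The following is a minimal sketch, assuming layouts are written as X.Y.Z triples (so Stereo would be `2.0.0`). The names `parse_layout` and `is_valid_layout_list` are illustrative only. The second rule (membership in the supported [=loudspeaker_layout=]s) is not checked here.

```python
from typing import Sequence, Tuple

# (X surround channels, Y LFE channels, Z height channels)
Layout = Tuple[int, int, int]


def parse_layout(text: str) -> Layout:
    """Parse an 'X.Y.Z' string such as '7.1.4' into a tuple of counts."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError("expected a layout of the form X.Y.Z")
    x, y, z = (int(p) for p in parts)
    return (x, y, z)


def is_valid_layout_list(layouts: Sequence[Layout]) -> bool:
    """Check CL #1 .. CL #n, ordered from lowest to highest layer."""
    for lower, upper in zip(layouts, layouts[1:]):
        # Xi <= Xi+1, Yi <= Yi+1 and Zi <= Zi+1 ...
        non_decreasing = all(a <= b for a, b in zip(lower, upper))
        # ... except that the two layouts must not be identical.
        if not non_decreasing or tuple(lower) == tuple(upper):
            return False
    return True
```

Under these assumptions the list 2.0.0, 3.1.2, 5.1.2, 7.1.4 passes, while 5.1.4 followed by 7.1.2 fails because the number of height channels decreases.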
+
+#### Channel Group Format (Normative) #### {#scalablechannelaudio-channelgroupformat}
+
+The [=Channel Group=] format SHALL conform to the following rules:
+- It consists of C channels structured into n [=Channel Group=]s, where C is the number of channels of the input [=3D audio signal=].
+- [=Channel Group=] #1 (called the BCG): This [=Channel Group=] is the [=down-mixed audio=] itself for CL #1 generated from the input [=3D audio signal=]. It contains C1 channels.
+- [=Channel Group=] #i (called a DCG, i = 2, 3, …, n): This [=Channel Group=] contains (Ci – Ci-1) channels, composed as follows:
+  - (Xi – Xi-1) surround channel(s) if Xi > Xi-1. Let \(S_{\text{set}} = \{x \mid X_{i-1} < x \le X_i\}\), where \(x\) is an integer:
+    - If 2 is an element of \(S_{\text{set}}\), the L2 channel is contained in this CG #i.
+    - If 3 is an element of \(S_{\text{set}}\), the Center channel is contained in this CG #i.
+    - If 5 is an element of \(S_{\text{set}}\), the L5 and R5 channels are contained in this CG #i.
+    - If 7 is an element of \(S_{\text{set}}\), the Lss7 and Rss7 channels are contained in this CG #i.
+  - The LFE channel if Yi > Yi-1.
+  - (Zi – Zi-1) top channels if Zi > Zi-1.
+    - If Zi-1 = 0, the top channels of the [=down-mixed audio=] for CL #i are contained in this [=Channel Group=] #i.
+    - If Zi-1 = 2, the Ltf and Rtf channels of the [=down-mixed audio=] for CL #i are contained in this [=Channel Group=] #i.
+  - Here, Xi.Yi.Zi denotes the channel layout of CL #i, where Xi is the number of surround channels, Yi is the number of LFE channels, and Zi is the number of height channels. The suffix i-1 refers to the same counts for CL #(i-1).
+
 ### Ambisonics Config Syntax and Semantics ### {#syntax-ambisonics-config}
@@ -2270,7 +2303,7 @@ This section defines the renderer to use, given a channel-based [=Audio Element=
 - The output layout of the IA renderer is set to the playback layout (X.Y.Z).
- The IA renderer is selected according to the following rules: - If DemixingParamDefinition() is not present, render according to [[#processing-mixpresentation-rendering-m2l-withoutdemixinfo]]. - - Else, if the playback layout matches a [=loudspeaker_layout=] which can be generated from the highest loudspeaker layout of the [=Audio Element=] according to [[#iamfgeneration-scalablechannelaudio-channellayoutgenerationrule]], + - Else, if the playback layout matches a [=loudspeaker_layout=] which can be generated from the highest loudspeaker layout of the [=Audio Element=] according to [[#scalablechannelaudio-channellayoutgenerationrule]], - If the playback layout has height channels, use [=demixing_info_parameter_data=] or [=default_demixing_info_parameter_data=]. - Else, if the input layout does not have height channels, use [=demixing_info_parameter_data=] or [=default_demixing_info_parameter_data=]. - Else, the EAR Direct Speakers renderer ([[!ITU-2127-0]]) can be used. @@ -2591,7 +2624,77 @@ The 3.1.2ch down-mix matrix for 7.1.4ch is given below, where \(p = 0.707\). \] -# IAMF Generation Process (Informative) # {#iamfgeneration} +# Convention # {#convention} + +## Syntax Description ## {#convention-syntaxstructure} + +All syntax elements conform to the [=Syntactic Description Language=] specified in [[!MP4-Systems]] and the additional [=Syntactic Description Language=] defined in this section. + +### Data types ### {#convention-data-types} + + leb128() syntaxName + + leb128() indicates the type of an unsigned integer. To encode the following unsigned integer syntaxName, it first represents the integer in binary with an N-bit representation, where N is a multiple of 7. Then break the integer up into groups of 7 bits. Output one encoded byte for each 7 bits group, from least significant to most significant group. Each byte will have the group in its 7 least significant bits. Set the most significant bit on each byte except the last byte. 
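The leb128() encoding steps described above can be sketched as follows. This is an illustrative sketch rather than a reference implementation: the function names and the `min_bytes` padding parameter are hypothetical, and the 8-byte cap in the decoder is an assumption borrowed from the AV1 convention, not something stated in this section.

```python
from typing import Tuple


def encode_leb128(value: int, min_bytes: int = 1) -> bytes:
    """Encode an unsigned integer (at most 32 bits) as leb128.

    min_bytes > 1 pads with zero groups, giving a longer but still valid
    encoding of the same value.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    if not 1 <= min_bytes <= 8:
        raise ValueError("min_bytes must be between 1 and 8")
    groups = []
    while True:
        groups.append(value & 0x7F)  # least significant 7-bit group first
        value >>= 7
        if value == 0 and len(groups) >= min_bytes:
            break
    # Set the most significant bit on every byte except the last one.
    return bytes(g | 0x80 for g in groups[:-1]) + bytes([groups[-1]])


def decode_leb128(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, number of bytes consumed) starting at offset."""
    value = 0
    for i in range(8):  # assumed maximum encoded length
        byte = data[offset + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # a clear top bit marks the last byte
            if value > 0xFFFFFFFF:
                raise ValueError("decoded value exceeds 32 bits")
            return value, i + 1
    raise ValueError("leb128 value is longer than 8 bytes")
```

With this sketch, 300 encodes to the two bytes `0xAC 0x02`, and `encode_leb128(5, min_bytes=3)` produces the padded form `0x85 0x80 0x00`, which decodes back to 5.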
+
+syntaxName is an unsigned integer which is encoded by leb128(). Its size is limited to 32 bits.
+
+NOTE: There are multiple ways of encoding the same value depending on how many leading zero bits are encoded. There is no requirement that this syntax descriptor uses the most compressed representation. This can be useful for encoder implementations by allowing a fixed amount of space to be filled in later when the value becomes known.
+
+string syntaxName
+
+string indicates a null-terminated (i.e., ending at the first byte set to 0x00) UTF-8 encoded string, as defined in [[!RFC-3629]], whose length SHALL be limited to 128 bytes.
+
+syntaxName is a human-readable label.
+
+### Function templates ### {#convention-function-templates}
+
+When the template keyword is used to decorate the class declaration, it indicates that the code is a template with a placeholder type that can be reused by other classes. Only classes that use the template are present in the bitstream; the template itself is not present in the bitstream. Classes that use a function template pass a data type that is specified in either [[!MP4-Systems]] or [[#convention-data-types]].
+
+Example
+
+```
+template
+class Foo {
+  T t;
+}
+
+class Bar {
+  Foo f;
+}
+```
+
+## Arithmetic Operators ## {#convention-arithmetic-operators}
+
+<table>
+  <tr><td>\(\left\lfloor{x}\right\rfloor \)</td><td>The largest integer that is smaller than or equal to \(x\).</td></tr>
+  <tr><td>\(\left\lceil{x}\right\rceil \)</td><td>The smallest integer that is greater than or equal to \(x\).</td></tr>
+  <tr><td>\(\text{round}(x)\)</td><td>The integer value closest to \(x\). It may be implemented as \(\left\lfloor{x + 0.5}\right\rfloor \).</td></tr>
+  <tr><td>\(\sqrt{x}\)</td><td>The square root of \(x\).</td></tr>
+  <tr><td>\(\text{Clip3}(x, y, z)\)</td><td>Conforms to [=Clip3=] specified in [[!AV1-Spec]].</td></tr>
+  <tr><td>\(x^y\)</td><td>The value of \(x\) to the power of \(y\).</td></tr>
+</table>
+ +## Q Format ## {#convention-qformat} + +Qx.y + +Qx.y indicates that it is stored as a (x+y+1)-bit, signed, two’s complement fixed-point value with y fractional bits. That is, a (x+y+1)-bit signed (two’s complement) integer, that is implicitly multiplied by the scaling factor 2^(−y). + +# Annex # {#annex} + +## Annex A: IAMF Generation Process (Informative) ## {#iamfgeneration} This section provides a guideline for encoding an [=IA Sequence=] that conforms to the [[#obu-syntax]], given a set of input [=3D audio signal=] and user inputs. @@ -2626,7 +2729,7 @@ The IA encoder is composed of the Pre-Processor, Codec Encoder, and OBU Packetiz - The OBU Packetizer packetizes [=Descriptors=], [=Parameter Substream=]s and [=Audio Substream=]s into OBUs, and outputs an [=IA Sequence=]. - The Temporal Unit Generator generates a [=Temporal Unit=] for each frame by grouping and ordering [=Audio Frame OBU=]s and [=Parameter Block OBU=]s (if present). -## Ambisonics Encoding ## {#iamfgeneration-ambisonics} +### Annex A1: Ambisonics Encoding (Informative) ### {#iamfgeneration-ambisonics} For Ambisonics encoding: @@ -2643,7 +2746,7 @@ For Ambisonics encoding: - The i-th [=Temporal Unit=] is composed of the [=Audio Frame OBU=]s for the i-th frame. - It may have an immediately preceding [=Temporal Delimiter OBU=]. -## Scalable Channel Audio Encoding ## {#iamfgeneration-scalablechannelaudio} +### Annex A2: Scalable Channel Audio Encoding (Informative) ### {#iamfgeneration-scalablechannelaudio} For Scalable Channel Audio encoding: @@ -2698,196 +2801,7 @@ The figure below shows the IA encoding flowchart for Scalable Channel Audio. - It may have the immediately preceding [=Temporal Delimiter OBU=], - The OBU Packetizer outputs an [=IA Sequence=] which is composed of OBUs for [=Descriptors=], followed by OBUs for [=Temporal Unit=]s. 
-## Mix Presentation Encoding ## {#iamfgeneration-mixpresentation} - -The [=Mix Presentation OBU=] for one single channel-based [=Audio Element=] is set as follows: -- [=num_sub_mixes=]: set to 1. -- [=num_audio_elements=]: set to 1. -- [=element_mix_config=]: No [=Parameter Block OBU=]s for [=element_mix_config=] and [=default_mix_gain=] = 0 dB. -- [=output_mix_config=]: No [=Parameter Block OBU=]s for [=output_mix_config=] and [=default_mix_gain=] = 0 dB. -- [=num_layouts=]: set to N, where N is the number of input channel layouts. -- [=loudness_layout=]: set to L(1), L(2), ..., L(N), where L(i) is the measured layout for the i-th layer and i = 1, 2, ..., N. - - [=LoudnessInfo()=] for L(1), [=LoudnessInfo()=] for L(2), ..., [=LoudnessInfo()=] for L(N): loudness information of the audio rendered to to the measured layout L(i). - -NOTE: If the input channel layouts do not include Stereo, then [=num_layers=] is set to N + 1 and the [=loudness_layout=]s includes Stereo. - - -The [=Mix Presentation OBU=] for one single scene-based [=Audio Element=] is set as follows: -- [=num_sub_mixes=]: set to 1 -- [=num_audio_elements=]: set to 1 -- [=element_mix_config=]: set to [=mix_gain=] -- [=output_mix_config=]: set to [=output_mix_gain=] -- [=num_layouts=]: set to M1, the number of layouts for which loudness information is provided. -- [=loudness_layout=]: set to L(1), L(2), ..., L(M1), where L(i) is the measured layout for the i-th loudness information and i = 1, 2, ..., M1. - - One of them is Stereo. -- [=LoudnessInfo()=] on L(1), [=LoudnessInfo()=] on L(2), ..., [=LoudnessInfo()=] on L(M1): loudness information of the audio rendered to the measured layout L(i). -- This [=Mix Presentation=] is authored using the highest [=loudness_layout=]. 
- -The [=Mix Presentation OBU=] for 2 [=Audio Element=]s is set as follows: -- [=num_sub_mixes=]: set to 1 -- [=num_audio_elements=]: set to 2 -- [=element_mix_config=] for each [=Audio Element=]: set to [=mix_gain=] -- [=output_mix_config=]: set to [=output_mix_gain=] -- [=num_layouts=]: set to M2, the number of layouts for which loudness information is provided. -- [=loudness_layout=]: set to L(1), L(2), ..., L(M2), where L(i) is the measured layout for the i-th loudness information and i = 1, 2, ..., M2. - - One of them is Stereo. -- [=LoudnessInfo()=] on L(1), [=LoudnessInfo()=] on L(2), ..., [=LoudnessInfo()=] on L(M2): loudness information of the audio rendered to the measured layout L(i). -- This [=Mix Presentation=] is authored using the highest [=loudness_layout=]. - -### Element Mix Config ### {#iamfgeneration-mixpresentation-mix} - -This section provides a guideline to generate [=element_mix_config=]. - -An IA multiplexer may merge two [=IA Sequence=]s (or two [=Audio Element=]s). In this case, it adjusts the gain values for [=element_mix_config=]s as necessary to describe the desired relative gains between the [=IA Sequence=]s (or two [=Audio Element=]s) when they are summed to generate the final mix. It also ensures that the gains selected do not result in clipping when the final mix is generated. - -## Two Audio Elements Encoding ## {#iamfgeneration-multipleaudioelements} - -This section provides a way to generate an [=IA Sequence=] with two [=Audio Element=]s from two [[#profiles-simple|Simple Profile]] [=IA Sequence=]s. - -### Two Audio Elements with One Codec Config ### {#iamfgeneration-multipleaudioelements-onecodec} - -This section provides a way to generate an [=IA Sequence=] with two [=Audio Element=]s from two [[#profiles-simple|Simple Profile]] [=IA Sequence=]s with the same [=Codec Config OBU=]. The result complies with the [[#profiles-base|Base Profile]]. 
- -Step 1: [=Descriptors=] are generated as follows: -- [=IA Sequence Header OBU=]: take the larger [=primary_profile=] field and the larger [=additional_profile=] field from the two input [=IA Sequence=]s. -- [=Codec Config OBU=]: take the [=Codec Config OBU=] from either of the input [=IA Sequence=]s. -- Two [=Audio Element OBU=]s: take both [=Audio Element OBU=]s from both the input [=IA Sequence=]s and make the following modifications as needed: - - The [=audio_element_obu/codec_config_id=]s in both [=Audio Element OBU=] are updated to indicate the [=codec_config_obu/codec_config_id=] specified in the taken [=Codec Config OBU=]. - - The [=audio_element_obu/audio_element_id=]s are updated to be unique between the two [=Audio Element OBU=]s. - - The [=audio_element_obu/audio_substream_id=]s are updated to be unique between the two [=Audio Element OBU=]s. - - The [=ParamDefinition/parameter_id=]s in [=ParamDefinition()=]s carried in the [=Audio Element OBU=]s are updated to be unique within the new [=IA Sequence=]. -- [=Mix Presentation OBU=]s: generate new ones which are used for mixing the two [=Audio Element=]s. - - The [=mix_presentation_obu/audio_element_id=]s in each [=Mix Presentation OBU=] are set to indicate the [=audio_element_obu/audio_element_id=]s of the referred [=Audio Element OBU=]s. - - The [=ParamDefinition/parameter_id=]s in [=ParamDefinition()=]s carried in each [=Mix Presentation OBU=] are set to refer to their associated [=Parameter Substream=]s. - -Step 2: The i-th [=Temporal Unit=] is generated as follows: -- Place all [=Parameter Block OBU=]s for the i-th frame, followed by the [=Audio Frame OBU=]s for the i-th frame (grouped by [=Audio Element=]s). Make the following modifications as needed: - - The [=obu_type=]s of the [=Audio Frame OBU=]s are updated to be aligned with the [=audio_element_obu/audio_substream_id=]s specified in the [=Audio Element OBU=]s. 
- - The [=parameter_block_obu/parameter_id=]s in the [=Parameter Block OBU=]s are updated to identify their associated [=Parameter Substream=]s based on the [=ParamDefinition/parameter_id=]s carried in the [=Descriptors=]. -- It may have an immediately preceding [=Temporal Delimiter OBU=]. - -Step 3: Generate an [=IA Sequence=] which starts with [=Descriptors=] and is followed by [=Temporal Unit=]s, in order. - -## Post Processing ## {#iamfgeneration-postprocessing} - -This section provides a way to generate metadata for post-processing. - -### Loudness Information ### {#iamfgeneration-postprocessing-loudness} - -This section provides a way to generate [=LoudnessInfo()=], given a [=Mix Presentation OBU=] and a [=loudness_layout=]. - -1. Each [=Audio Element=] specified in the given [=Mix Presentation OBU=] is rendered to the given [=loudness_layout=]. -2. Each rendered [=Audio Element=] specified in the given [=Mix Presentation OBU=] has a gain applied using the value from [=mix_gain=] specified in its [=element_mix_config=]. -3. All rendered and processed [=Audio Element=]s specified in the given [=Mix Presentation OBU=] are summed. -3. The summed audio (i.e., [=Rendered Mix Presentation=]) has a gain applied using the value from [=mix_gain=] specified in [=output_mix_config=]. -4. Generate [=LoudnessInfo()=] for the [=Rendered Mix Presentation=] according to [[#obu-mixpresentation-loudness]]. - - -# Convention # {#convention} - -## Syntax Description ## {#convention-syntaxstructure} - -All syntax elements conform to the [=Syntactic Description Language=] specified in [[!MP4-Systems]] and the additional [=Syntactic Description Language=] defined in this section. - -### Data types ### {#convention-data-types} - - leb128() syntaxName - - leb128() indicates the type of an unsigned integer. To encode the following unsigned integer syntaxName, it first represents the integer in binary with an N-bit representation, where N is a multiple of 7. 
Then break the integer up into groups of 7 bits. Output one encoded byte for each 7 bits group, from least significant to most significant group. Each byte will have the group in its 7 least significant bits. Set the most significant bit on each byte except the last byte. - - syntaxName is an unsigned integer which is encoded by leb128(). Its size is limited to 32 bits. - - NOTE: There are multiple ways of encoding the same value depending on how many leading zero bits are encoded. There is no requirement that this syntax descriptor uses the most compressed representation. This can be useful for encoder implementations by allowing a fixed amount of space to be filled in later when the value becomes known. - - string syntaxName - -string indicates a null-terminated (i.e., ending at the first byte set to 0x00), UTF-8 encoded as defined in [[!RFC-3629]] and whose length SHALL be limited to 128 bytes. - -syntaxName is a human readable label. - -### Function templates ### {#convention-function-templates} - -When the template keyword is used to decorate the class declaration, it indicates that the code is a template with a placeholder type that can be reused by other classes. Only classes that use the template present in the bitstream; the template itself does not present in the bitstream. Classes that use a function template pass a data type that is specified in either [[!MP4-Systems]] or [[#convention-data-types]]. - -Example - -``` -template -class Foo { - T t; -} - -class Bar { - Foo f; -} -``` - -## Arithmetic Operators ## {#convention-arithmetic-operators} - - - - - - - - - - - - - - - - - - - - -
-<table>
-  <tr><td>\(\left\lfloor{x}\right\rfloor \)</td><td>The largest integer that is smaller than or equal to \(x\).</td></tr>
-  <tr><td>\(\left\lceil{x}\right\rceil \)</td><td>The smallest integer that is greater than or equal to \(x\).</td></tr>
-  <tr><td>\(\text{round}(x)\)</td><td>The integer value closest to \(x\). It may be implemented as \(\left\lfloor{x + 0.5}\right\rfloor \).</td></tr>
-  <tr><td>\(\sqrt{x}\)</td><td>The square root of \(x\).</td></tr>
-  <tr><td>\(\text{Clip3}(x, y, z)\)</td><td>Conforms to [=Clip3=] specified in [[!AV1-Spec]].</td></tr>
-  <tr><td>\(x^y\)</td><td>The value of \(x\) to the power of \(y\).</td></tr>
-</table>
- -## Q Format ## {#convention-qformat} - -Qx.y - -Qx.y indicates that it is stored as a (x+y+1)-bit, signed, two’s complement fixed-point value with y fractional bits. That is, a (x+y+1)-bit signed (two’s complement) integer, that is implicitly multiplied by the scaling factor 2^(−y). - -# Annex # {#annex} - -## Annex A: ID Linking Scheme (Informative) ## {#Annex_A} - -The figure below shows the linking scheme among IDs in the obu_header or OBU payload. - -
-
ID Linking Scheme
- -In the figure above, -- The [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0 is providing its [=codec_id=] and [=decoder_config=]. -- The [=Mix Presentation OBU=] with [=mix_presentation_id=] = 21 is saying: - - There are two [=Audio Element=]s (with [=audio_element_obu/audio_element_id=] = 11 and 12) which need to be mixed. The [=mix_presentation_obu/audio_element_id=] = 11 and the [=mix_presentation_obu/audio_element_id=] = 12 are linked to the [=Audio Element OBU=]s with [=audio_element_obu/audio_element_id=] = 11 and [=audio_element_obu/audio_element_id=] = 12, respectively. - - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 32 to be used for mixing the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 11. - - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 33 to be used for mixing the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 12. - - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 34 to be used for mixing the two [=Audio Element=]s. -- The [=Audio Element OBU=] with [=audio_element_obu/audio_element_id=] = 11 is saying: - - This [=Audio Element=] has been coded using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. - - There are two [=Audio Substream=]s ([=audio_substream/audio_substream_id=] = 0 and 1, respectively) in this [=Audio Element=]. They are linked to the [=Audio Frame OBU=]s with [=audio_substream/audio_substream_id=] = 0 and [=audio_substream/audio_substream_id=] = 1 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID0 and [=obu_type=] = OBU_IA_Audio_Frame_ID1), respectively. - - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 31 to be used for demixing this [=Audio Element=]. 
-- The [=Audio Element OBU=] with [=audio_element_obu/audio_element_id=] = 12 is saying: - - This [=Audio Element=] has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. - - There is one [=Audio Substream=] ([=audio_substream/audio_substream_id=] = 2) in this [=Audio Element=]. It is linked to the [=Audio Frame OBU=]s with [=audio_substream/audio_substream_id=] = 2 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID2). - -- The [=Audio Frame OBU=] with [=audio_substream/audio_substream_id=] = 0 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID0) is providing the coded data which has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. -- The [=Audio Frame OBU=] with [=audio_substream/audio_substream_id=] = 1 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID1) is providing the coded data which has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. -- The [=Audio Frame OBU=] with [=audio_substream/audio_substream_id=] = 2 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID2) is providing the coded data which has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. -- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 31 is providing [=demixing_info_parameter_data=] to be applied for demixing the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 11. -- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 32 is providing [=mix_gain_parameter_data=] to be applied to the rendered [=Audio Element=] after rendering according to [=rendering_config=] of the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 11. 
-- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 33 is providing [=mix_gain_parameter_data=] to be applied to the rendered [=Audio Element=] after rendering according to [=rendering_config=] of the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 12. -- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 34 is providing [=mix_gain_parameter_data=] to be applied to the [=Rendered Mix Presentation=] of the two rendered [=Audio Element=]s. - -## Annex B: Rules for Scalable Channel Audio ## {#Annex_B} - -This Annex specifies normative rules for scalable channel audio with [=num_layers=] > 1. - -### Annex B-1: Down-mix parameter and Loudness (Informative) ### {#iamfgeneration-scalablechannelaudio-downmixparameter} +#### Annex A2.1: Down-mix parameter and Loudness (Informative) #### {#iamfgeneration-scalablechannelaudio-downmixparameter} This section describes how down-mix parameters and loudness levels can be generated for a given channel audio and a given list of channel layouts for scalability (i.e., [=num_layers=] > 1). @@ -2909,7 +2823,7 @@ For a given channel-based input [=3D audio signal=] (e.g., 7.1.4ch) and a given - It is not depicted in the figure but the Down-Mixer further generates [=dmixp_mode=] and [=recon_gain=] for each frame to be passed to the OBU Packetizer. - The Loudness module measures the loudness level ([=LKFS=]) of each [=down-mixed audio=] based on [[ITU-1770-4]], and passes them to the OBU Packetizer. -### Annex B-2: Down-mix Mechanism (Informative) ### {#iamfgeneration-scalablechannelaudio-downmixmechanism} +#### Annex A2.2: Down-mix Mechanism (Informative) #### {#iamfgeneration-scalablechannelaudio-downmixmechanism} This section specifies the down-mixing mechanism to generate down-mixed audio for scalable channel audio encoding. 
@@ -2947,20 +2861,7 @@ For example, to get the 3.1.2ch [=down-mixed audio=] from 7.1.4ch: - TF2 of 3.1.2ch is generated by using [=T4to2 encoder=] and [=T2toTF2 encoder=]. -### Annex B-3: Channel Layout Generation Rule (Normative) ### {#iamfgeneration-scalablechannelaudio-channellayoutgenerationrule} - -This section describes the generation rule for channel layouts for scalable channel audio. - -For a given channel layout (CL #n) of a channel-based input [=3D audio signal=], any list of CLs ({CL #i: i = 1, 2, ..., n}) for scalable channel audio SHALL conform with the following rules: -- Xi ≤ Xi+1 and Yi ≤ Yi+1 and Zi ≤ Zi+1 except Xi = Xi+1, Yi = Yi+1 and Zi = Zi+1 for i = n-1, n-2, ..., 1, where the i-th channel layout CL #i = Xi.Yi.Zi, Xi is the number of surround channels, Yi is the number of LFE channels, and Zi is the number of height channels. -- CL #i is one of the [=loudspeaker_layout=]s supported in this version of the specification. - -Scalable channel audio with [=num_layers=] > 1 SHALL only allow down-mix paths that conform to the rules above, as depicted in the figure below. - -
-
IA Down-mix Path for scalable channel audio
- -### Annex B-4: Recon Gain Generation (Informative) ### {#iamfgeneration-scalablechannelaudio-recongaingeneration} +#### Annex A2.3: Recon Gain Generation (Informative) #### {#iamfgeneration-scalablechannelaudio-recongaingeneration} This section provides guidelines about how to generate [=recon_gain=]. @@ -3002,11 +2903,11 @@ Recon_Gain for D_Rtb4: - \(D_k\) is the signal power for frame \(k\) of D_Rtb4. -### Annex B-5: Channel Group Generation Rule (Informative) ### {#iamfgeneration-scalablechannelaudio-channelgroupgenerationrule} +#### Annex A2.4: Channel Group Generation Rule (Informative) #### {#iamfgeneration-scalablechannelaudio-channelgroupgenerationrule} This section describes the generation rule for a [=Channel Group=] (CG). -For a given channel-based input audio and the list of CLs ({CL #i: i = 1, 2, ..., n}), the CG Generation module outputs the transformed audio (i.e., [=Channel Group=]s) which adheres to [[#iamfgeneration-scalablechannelaudio-channelgroupformat]]. +For a given channel-based input audio and the list of CLs ({CL #i: i = 1, 2, ..., n}), the CG Generation module outputs the transformed audio (i.e., [=Channel Group=]s) which adheres to [[#scalablechannelaudio-channelgroupformat]]. An example of a transformation matrix with 4 CGs (2ch/3.1.2ch/5.1.2ch/7.1.4ch) is given below, @@ -3105,19 +3006,117 @@ where \[c(k) = w(k) \times \delta(k) \times \alpha(k),\] \[d(k) = w(k) \times \delta(k) \times \beta(k).\] -### Annex B-6: Channel Group Format (Normative) ### {#iamfgeneration-scalablechannelaudio-channelgroupformat} -The [=Channel Group=] format SHALL conform to the following rules: -- It consists of C number of channels and is structured to n number of [=Channel Group=]s, where C is the number of channels for the input [=3D audio signal=]. -- [=Channel Group=] #1 (as called BCG): This [=Channel Group=] is the [=down-mixed audio=] itself for CL #1 generated from the input [=3D audio signal=]. It contains a C1 number of channels. 
-- [=Channel Group=] #i (as called DCG, i = 2, 3, …, n): This [=Channel Group=] contains (Ci – Ci-1) number of channels. (Ci – Ci-1) channel(s) consists of as follows: - - (Xi – Xi-1) surround channel(s) if Xi > Xi-1 . When \(S_{\text{set}} = \{x \mid \text{Xi}-1 < x \le \text{Xi}\} \) and \(x\) is an integer, - - If 2 is an element of \(S_{\text{set}}\), the L2 channel is contained in this CG #i. - - If 3 is an element of \(S_{\text{set}}\), the Center channel is contained in this CG #i. - - If 5 is an element of \(S_{\text{set}}\), the L5 and R5 channels are contained in this CG #i. - - If 7 is an element of \(S_{\text{set}}\), the Lss7 and Rss7 channels are contained in this CG #i. - - The LFE channel if Yi > Yi-1. - - (Zi - Zi-1) top channels if Zi > Zi-1. - - If Zi-1 = 0, the top channels of the [=down-mixed audio=] for CL #i are contained in this [=Channel Group=] #i. - - If Zi-1 = 2, the Ltf and Rtf channels of the [=down-mixed audio=] for CL #i are contained in this [=Channel Group=] #i. - - Where Xi.Yi.Zi denotes the channel layout in CL #i, where Xi is the number of surround channels, Yi is the number of LFE channels and Zi is the number of height channels. +### Annex A3: Mix Presentation Encoding (Informative) ### {#iamfgeneration-mixpresentation} + +The [=Mix Presentation OBU=] for one single channel-based [=Audio Element=] is set as follows: +- [=num_sub_mixes=]: set to 1. +- [=num_audio_elements=]: set to 1. +- [=element_mix_config=]: No [=Parameter Block OBU=]s for [=element_mix_config=] and [=default_mix_gain=] = 0 dB. +- [=output_mix_config=]: No [=Parameter Block OBU=]s for [=output_mix_config=] and [=default_mix_gain=] = 0 dB. +- [=num_layouts=]: set to N, where N is the number of input channel layouts. +- [=loudness_layout=]: set to L(1), L(2), ..., L(N), where L(i) is the measured layout for the i-th layer and i = 1, 2, ..., N. 
+ - [=LoudnessInfo()=] for L(1), [=LoudnessInfo()=] for L(2), ..., [=LoudnessInfo()=] for L(N): loudness information of the audio rendered to the measured layout L(i). + +NOTE: If the input channel layouts do not include Stereo, then [=num_layers=] is set to N + 1 and the [=loudness_layout=]s include Stereo. + + +The [=Mix Presentation OBU=] for one single scene-based [=Audio Element=] is set as follows: +- [=num_sub_mixes=]: set to 1 +- [=num_audio_elements=]: set to 1 +- [=element_mix_config=]: set to [=mix_gain=] +- [=output_mix_config=]: set to [=output_mix_gain=] +- [=num_layouts=]: set to M1, the number of layouts for which loudness information is provided. +- [=loudness_layout=]: set to L(1), L(2), ..., L(M1), where L(i) is the measured layout for the i-th loudness information and i = 1, 2, ..., M1. + - One of them is Stereo. +- [=LoudnessInfo()=] on L(1), [=LoudnessInfo()=] on L(2), ..., [=LoudnessInfo()=] on L(M1): loudness information of the audio rendered to the measured layout L(i). +- This [=Mix Presentation=] is authored using the highest [=loudness_layout=]. + +The [=Mix Presentation OBU=] for 2 [=Audio Element=]s is set as follows: +- [=num_sub_mixes=]: set to 1 +- [=num_audio_elements=]: set to 2 +- [=element_mix_config=] for each [=Audio Element=]: set to [=mix_gain=] +- [=output_mix_config=]: set to [=output_mix_gain=] +- [=num_layouts=]: set to M2, the number of layouts for which loudness information is provided. +- [=loudness_layout=]: set to L(1), L(2), ..., L(M2), where L(i) is the measured layout for the i-th loudness information and i = 1, 2, ..., M2. + - One of them is Stereo. +- [=LoudnessInfo()=] on L(1), [=LoudnessInfo()=] on L(2), ..., [=LoudnessInfo()=] on L(M2): loudness information of the audio rendered to the measured layout L(i). +- This [=Mix Presentation=] is authored using the highest [=loudness_layout=]. 
+ +#### Annex A3.1: Element Mix Config (Informative) #### {#iamfgeneration-mixpresentation-mix} + +This section provides a guideline to generate [=element_mix_config=]. + +An IA multiplexer may merge two [=IA Sequence=]s (or two [=Audio Element=]s). In this case, it adjusts the gain values for [=element_mix_config=]s as necessary to describe the desired relative gains between the [=IA Sequence=]s (or two [=Audio Element=]s) when they are summed to generate the final mix. It also ensures that the gains selected do not result in clipping when the final mix is generated. + +### Annex A4: Two Audio Elements Encoding (Informative) ### {#iamfgeneration-multipleaudioelements} + +This section provides a way to generate an [=IA Sequence=] with two [=Audio Element=]s from two [[#profiles-simple|Simple Profile]] [=IA Sequence=]s. + +#### Annex A4.1: Two Audio Elements with One Codec Config (Informative) #### {#iamfgeneration-multipleaudioelements-onecodec} + +This section provides a way to generate an [=IA Sequence=] with two [=Audio Element=]s from two [[#profiles-simple|Simple Profile]] [=IA Sequence=]s with the same [=Codec Config OBU=]. The result complies with the [[#profiles-base|Base Profile]]. + +Step 1: [=Descriptors=] are generated as follows: +- [=IA Sequence Header OBU=]: take the larger [=primary_profile=] field and the larger [=additional_profile=] field from the two input [=IA Sequence=]s. +- [=Codec Config OBU=]: take the [=Codec Config OBU=] from either of the input [=IA Sequence=]s. +- Two [=Audio Element OBU=]s: take both [=Audio Element OBU=]s from both the input [=IA Sequence=]s and make the following modifications as needed: + - The [=audio_element_obu/codec_config_id=]s in both [=Audio Element OBU=]s are updated to indicate the [=codec_config_obu/codec_config_id=] specified in the taken [=Codec Config OBU=]. + - The [=audio_element_obu/audio_element_id=]s are updated to be unique between the two [=Audio Element OBU=]s. 
+ - The [=audio_element_obu/audio_substream_id=]s are updated to be unique between the two [=Audio Element OBU=]s. + - The [=ParamDefinition/parameter_id=]s in [=ParamDefinition()=]s carried in the [=Audio Element OBU=]s are updated to be unique within the new [=IA Sequence=]. +- [=Mix Presentation OBU=]s: generate new ones which are used for mixing the two [=Audio Element=]s. + - The [=mix_presentation_obu/audio_element_id=]s in each [=Mix Presentation OBU=] are set to indicate the [=audio_element_obu/audio_element_id=]s of the referred [=Audio Element OBU=]s. + - The [=ParamDefinition/parameter_id=]s in [=ParamDefinition()=]s carried in each [=Mix Presentation OBU=] are set to refer to their associated [=Parameter Substream=]s. + +Step 2: The i-th [=Temporal Unit=] is generated as follows: +- Place all [=Parameter Block OBU=]s for the i-th frame, followed by the [=Audio Frame OBU=]s for the i-th frame (grouped by [=Audio Element=]s). Make the following modifications as needed: + - The [=obu_type=]s of the [=Audio Frame OBU=]s are updated to be aligned with the [=audio_element_obu/audio_substream_id=]s specified in the [=Audio Element OBU=]s. + - The [=parameter_block_obu/parameter_id=]s in the [=Parameter Block OBU=]s are updated to identify their associated [=Parameter Substream=]s based on the [=ParamDefinition/parameter_id=]s carried in the [=Descriptors=]. +- It may have an immediately preceding [=Temporal Delimiter OBU=]. + +Step 3: Generate an [=IA Sequence=] which starts with [=Descriptors=] and is followed by [=Temporal Unit=]s, in order. + +### Annex A5: Post Processing (Informative) ### {#iamfgeneration-postprocessing} + +This section provides a way to generate metadata for post-processing. + +#### Annex A5.1: Loudness Information (Informative) #### {#iamfgeneration-postprocessing-loudness} + +This section provides a way to generate [=LoudnessInfo()=], given a [=Mix Presentation OBU=] and a [=loudness_layout=]. + +1. 
Each [=Audio Element=] specified in the given [=Mix Presentation OBU=] is rendered to the given [=loudness_layout=]. +2. Each rendered [=Audio Element=] specified in the given [=Mix Presentation OBU=] has a gain applied using the value from [=mix_gain=] specified in its [=element_mix_config=]. +3. All rendered and processed [=Audio Element=]s specified in the given [=Mix Presentation OBU=] are summed. +4. The summed audio (i.e., [=Rendered Mix Presentation=]) has a gain applied using the value from [=mix_gain=] specified in [=output_mix_config=]. +5. Generate [=LoudnessInfo()=] for the [=Rendered Mix Presentation=] according to [[#obu-mixpresentation-loudness]]. + +## Annex B: ID Linking Scheme (Informative) ## {#idlinkingscheme} + +The figure below shows the linking scheme among IDs in the obu_header or OBU payload. + +
+
ID Linking Scheme
+ +In the figure above, +- The [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0 is providing its [=codec_id=] and [=decoder_config=]. +- The [=Mix Presentation OBU=] with [=mix_presentation_id=] = 21 is saying: + - There are two [=Audio Element=]s (with [=audio_element_obu/audio_element_id=] = 11 and 12) which need to be mixed. The [=mix_presentation_obu/audio_element_id=] = 11 and the [=mix_presentation_obu/audio_element_id=] = 12 are linked to the [=Audio Element OBU=]s with [=audio_element_obu/audio_element_id=] = 11 and [=audio_element_obu/audio_element_id=] = 12, respectively. + - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 32 to be used for mixing the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 11. + - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 33 to be used for mixing the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 12. + - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 34 to be used for mixing the two [=Audio Element=]s. +- The [=Audio Element OBU=] with [=audio_element_obu/audio_element_id=] = 11 is saying: + - This [=Audio Element=] has been coded using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. + - There are two [=Audio Substream=]s ([=audio_substream/audio_substream_id=] = 0 and 1, respectively) in this [=Audio Element=]. They are linked to the [=Audio Frame OBU=]s with [=audio_substream/audio_substream_id=] = 0 and [=audio_substream/audio_substream_id=] = 1 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID0 and [=obu_type=] = OBU_IA_Audio_Frame_ID1), respectively. + - There are [=Parameter Block OBU=]s with [=parameter_block_obu/parameter_id=] = 31 to be used for demixing this [=Audio Element=]. 
+- The [=Audio Element OBU=] with [=audio_element_obu/audio_element_id=] = 12 is saying: + - This [=Audio Element=] has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. + - There is one [=Audio Substream=] ([=audio_substream/audio_substream_id=] = 2) in this [=Audio Element=]. It is linked to the [=Audio Frame OBU=]s with [=audio_substream/audio_substream_id=] = 2 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID2). + +- The [=Audio Frame OBU=] with [=audio_substream/audio_substream_id=] = 0 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID0) is providing the coded data which has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. +- The [=Audio Frame OBU=] with [=audio_substream/audio_substream_id=] = 1 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID1) is providing the coded data which has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. +- The [=Audio Frame OBU=] with [=audio_substream/audio_substream_id=] = 2 (i.e., [=obu_type=] = OBU_IA_Audio_Frame_ID2) is providing the coded data which has been coded by using the [=Codec Config OBU=] with [=codec_config_obu/codec_config_id=] = 0. +- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 31 is providing [=demixing_info_parameter_data=] to be applied for demixing the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 11. +- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 32 is providing [=mix_gain_parameter_data=] to be applied to the rendered [=Audio Element=] after rendering according to [=rendering_config=] of the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 11. 
+- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 33 is providing [=mix_gain_parameter_data=] to be applied to the rendered [=Audio Element=] after rendering according to [=rendering_config=] of the [=Audio Element=] with [=audio_element_obu/audio_element_id=] = 12. +- The [=Parameter Block OBU=] with [=parameter_block_obu/parameter_id=] = 34 is providing [=mix_gain_parameter_data=] to be applied to the [=Rendered Mix Presentation=] of the two rendered [=Audio Element=]s. From 9a2e1e45051160f5d18aa53a46f3d65a885312f2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=ED=99=A9=EC=84=B1=ED=9D=AC/=EC=B0=A8=EC=84=B8=EB=8C=80=20?= =?UTF-8?q?Display=20Lab=28SR=29/=EC=82=BC=EC=84=B1=EC=A0=84=EC=9E=90?= Date: Wed, 30 Aug 2023 17:33:54 +0900 Subject: [PATCH 2/6] Update the data as of August 30th --- index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.bs b/index.bs index ef4334c1..4011c8b9 100644 --- a/index.bs +++ b/index.bs @@ -7,7 +7,7 @@ Editor: Felicia Lim, Google, flim@google.com Repository: AOMediaCodec/iamf Shortname: iamf URL: https://aomediacodec.github.io/iamf/ -Date: 2023-07-17 +Date: 2023-08-30 Abstract: This document specifies the Immersive Audio (IA) model, the standalone IA Sequence format, and the [[!ISO-BMFF]]-based IA container format. Local Boilerplate: footer yes From bd86abd5a18577c4901c519e5a498e6dcc15b265 Mon Sep 17 00:00:00 2001 From: sunghee-hwang <97494915+sunghee-hwang@users.noreply.github.com> Date: Wed, 30 Aug 2023 17:43:38 +0900 Subject: [PATCH 3/6] Improvement on CG relationship --- index.bs | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/index.bs b/index.bs index 4011c8b9..848122f7 100644 --- a/index.bs +++ b/index.bs @@ -918,8 +918,7 @@ Each [=Channel Group=] (or scalable channel audio layer) is associated with a di - The IA decoder SHOULD first attempt to select the layer with a [=loudspeaker_layout=] that matches the physical playback layout. 
- If there is no match, the IA decoder SHOULD select the layer with the closest [=loudspeaker_layout=] to the physical layout and then apply up- or down-mixing appropriately, after decoding and reconstruction of the channel audio. Sections [[#iamfgeneration-scalablechannelaudio-downmixmechanism]] and [[#processing-downmixmatrix]] provide examples of dynamic and static down-mixing matrices for some common layouts that MAY be used. -The relationship among all [=Channel Group=]s for an [=Audio Element=] SHALL comply with [[#scalablechannelaudio-channelgroupformat]] and the relationship among all channel layouts indicated by [=loudspeaker_layout=]s specified in an [=Audio Element OBU=] SHALL comply with [[#scalablechannelaudio-channellayoutgenerationrule]]. - +The relationship among all [=Channel Group=]s for the given scalable channel audio representation SHALL comply with [[#scalablechannelaudio-channelgroupformat]] and the relationship among all channel layouts indicated by [=loudspeaker_layout=]s specified in an [=Audio Element OBU=] SHALL comply with [[#scalablechannelaudio-channellayoutgenerationrule]]. Semantics From 457c945d7ac19f81e8a23ed57ca035dcbdd80d2f Mon Sep 17 00:00:00 2001 From: sunghee-hwang <97494915+sunghee-hwang@users.noreply.github.com> Date: Thu, 31 Aug 2023 07:39:25 +0900 Subject: [PATCH 4/6] Apply the suggestion from code review Co-authored-by: Felicia Lim --- index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.bs b/index.bs index 848122f7..37d9b43a 100644 --- a/index.bs +++ b/index.bs @@ -2689,7 +2689,7 @@ class Bar { Qx.y -Qx.y indicates that it is stored as a (x+y+1)-bit, signed, two’s complement fixed-point value with y fractional bits. That is, a (x+y+1)-bit signed (two’s complement) integer, that is implicitly multiplied by the scaling factor 2^(−y). +Qx.y indicates that it is stored as a (x+y+1)-bit, signed, two’s complement fixed-point value with y fractional bits. 
That is, a (x+y+1)-bit signed (two’s complement) integer, that is implicitly multiplied by the scaling factor \(2^{−y}\). # Annex # {#annex} From 4f2435d311b7bb55d1591cb9028e8be72500d705 Mon Sep 17 00:00:00 2001 From: sunghee-hwang <97494915+sunghee-hwang@users.noreply.github.com> Date: Thu, 31 Aug 2023 10:08:25 +0900 Subject: [PATCH 5/6] Apply the suggestion from code review --- index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.bs b/index.bs index 591fb9e6..7504ae7a 100644 --- a/index.bs +++ b/index.bs @@ -1034,7 +1034,7 @@ Bit position : Channel Name output_gain indicates the gain value to be applied to the mixed channels which are indicated by [=output_gain_flags=], where each mixed channel is generated by down-mixing two or more input channels. It is computed as \(20 \times \log_{10}(f)\), where \(f\) is the factor by which to scale the mixed channels. It is stored as a 16-bit, signed, two’s complement fixed-point value with 8 fractional bits (i.e., Q7.8)([[Q-Format]]). -#### Channel Layout Generation Rule (Normative) #### {#scalablechannelaudio-channellayoutgenerationrule} +#### Channel Layout Generation Rule #### {#scalablechannelaudio-channellayoutgenerationrule} This section describes the generation rule for channel layouts for scalable channel audio. From 330cbf75692634f62b9d316cc8d92cb6503fd490 Mon Sep 17 00:00:00 2001 From: sunghee-hwang <97494915+sunghee-hwang@users.noreply.github.com> Date: Thu, 31 Aug 2023 10:08:32 +0900 Subject: [PATCH 6/6] Apply the suggestion from code review --- index.bs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.bs b/index.bs index 7504ae7a..e6f0c409 100644 --- a/index.bs +++ b/index.bs @@ -1047,7 +1047,7 @@ Scalable channel audio with [=num_layers=] > 1 SHALL only allow down-mix paths t
IA Down-mix Path for scalable channel audio
-#### Channel Group Format (Normative) #### {#scalablechannelaudio-channelgroupformat} +#### Channel Group Format #### {#scalablechannelaudio-channelgroupformat} The [=Channel Group=] format SHALL conform to the following rules: - It consists of C number of channels and is structured to n number of [=Channel Group=]s, where C is the number of channels for the input [=3D audio signal=].
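Reviewer note (not part of the patch): the Qx.y wording touched in patch 4/6 and the Q7.8 storage of output_gain can be illustrated with a minimal Python sketch. The function names below (`q7_8_to_float`, `float_to_q7_8`, `gain_db_to_linear`) are hypothetical and do not come from the specification; the sketch only assumes what the text states, namely a 16-bit two's complement integer implicitly scaled by \(2^{-8}\), and a gain in dB equal to \(20 \times \log_{10}(f)\).

```python
FRACTIONAL_BITS = 8          # y in Qx.y; Q7.8 has 8 fractional bits
TOTAL_BITS = 16              # x + y + 1 = 7 + 8 + 1


def q7_8_to_float(raw: int) -> float:
    """Interpret a 16-bit field (given as 0..0xFFFF) as a Q7.8 value."""
    if not 0 <= raw < (1 << TOTAL_BITS):
        raise ValueError("raw must fit in 16 bits")
    # Undo two's complement: values with the top bit set are negative.
    signed = raw - (1 << TOTAL_BITS) if raw & (1 << (TOTAL_BITS - 1)) else raw
    # Apply the implicit scaling factor 2^(-y).
    return signed / (1 << FRACTIONAL_BITS)


def float_to_q7_8(value: float) -> int:
    """Quantize a value to Q7.8 and return the 16-bit field (0..0xFFFF)."""
    signed = round(value * (1 << FRACTIONAL_BITS))
    lo, hi = -(1 << (TOTAL_BITS - 1)), (1 << (TOTAL_BITS - 1)) - 1
    signed = max(lo, min(hi, signed))   # clamp to the representable range (an assumption)
    return signed & 0xFFFF


def gain_db_to_linear(gain_db: float) -> float:
    """Invert gain_db = 20 * log10(f) to recover the scale factor f."""
    return 10.0 ** (gain_db / 20.0)
```

For instance, under these assumptions a stored field of 0xFA00 decodes to -6.0 dB, which `gain_db_to_linear` maps to a scale factor of roughly 0.5. Clamping out-of-range inputs in `float_to_q7_8` is a choice made for this sketch; the specification text quoted here does not say how an encoder should handle them.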