TVyVideo + Radio

Audio for UHD TV

Audio quality matters greatly today. The human ear is highly discerning, yet the audibility and clarity of dialogue remain a challenge.

Carlos Pantsios Markhauser PhD *

In recent years a new television technology known as Ultra High Definition Television (UHDTV) has entered the world market. It has four times as many pixels per image (8.3 Mpx) as 1080p HDTV (2 Mpx). UHDTV also has other outstanding features, such as:

1) a significantly greater dynamic range,
2) better temporal reproduction of images (by means of a higher frame rate),
3) substantially better colour rendering (thanks to an expanded colour space), and
4) more detail (resolution) in the reproduced images.

With so much attention on these video improvements, it is easy to overlook that a major change is also taking place in the sound system that accompanies UHDTV video.

UHDTV brings a new sound experience
First of all, it is important to highlight how differently human beings perceive audio and video, that is, the difference between the experiences that audio and video produce. For example, in practice it is perfectly possible to watch two or more images on the same television screen simultaneously. Television images are bounded by nature and usually two-dimensional.

Intervals of lost information, caused by transmission or video processing errors, do not completely prevent the viewer from understanding the distorted images, although such losses are undoubtedly annoying. By contrast, it is very difficult to follow several audio streams presented simultaneously.

Stereo audio is an unbounded experience (provided the listener is sitting in the right place), and intervals of lost information in the audio quickly reduce the listener's ability to understand what is happening.

Moreover, distorted audio can cause the listener physical pain.

Factors that improve the audio experience
These differences in perception show that a number of factors must be considered in order to improve the audio experience significantly. Three areas are discussed below:

Area 1: Audiences are known to value interactivity highly, but an audio equivalent of the second screen does not work. How, then, can richer interaction be created beyond conventional volume control?

Area 2: Audio today is already "immersive", but can this experience be improved? Can a true 3D audio experience work satisfactorily even where stereoscopic 3D images have failed to do so?

It is equally important to ask whether this more immersive experience can be delivered without burdening the production and distribution of finished programs with a great deal of added complexity and cost. Finally, can it be done in a way that remains accessible to users who listen to programs in mono, in stereo or on headphones?

Area 3: Audio quality matters greatly today. The human ear is highly discerning, yet the audibility and clarity of dialogue remain a challenge. An important question here is how the audio experience can be adapted and customized so that it works well for different preferences, for a range of technologies and for a variety of listening environments.

Great efforts are currently being made to find techniques that satisfactorily address these three areas:
1) interactivity,
2) immersion and
3) adaptation (also known as customization).

The technology that has shown the best results so far, offering backward compatibility with current channel-based technologies, is object-based audio (audio-objects).

In the conventional world, the audio content of a program is delivered in a channel-based format. Here, a number of signals stored in a file are distributed in streams, and each corresponds to a program. The Broadcast Wave Format (BWF) does not currently define what each stream in the file represents, nor does the Microsoft WAVE format on which it is based.

The speaker arrangement is inferred from the number of channels available, and the speaker positions are likewise inferred from the channel number. For example, a program with two audio channels implies a stereo format: the signals correspond to the left and right speakers, which must be placed 60 degrees apart. With this system, problems arise quickly when there are more than two channels.

For 5.1 content there are several conventions for ordering the channels, and there is no reliable way to tell from the file alone which one was used. RF64, a BWF-compatible multichannel format, uses a channel mask to map channels to speaker positions by means of descriptive tags, e.g. SPEAKER_FRONT_LEFT. This allows the speaker positions to be determined, but the channels themselves are described by their order identifiers together with metadata stored in an XML file. A metadata set named EBUCore allows the content of a given file to be defined more precisely.
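As a minimal sketch of how a channel mask resolves the ordering problem, the Python snippet below decodes a mask into an ordered list of speaker tags. The bit values are those of the Microsoft WAVE_FORMAT_EXTENSIBLE channel mask (only a subset is listed); the function name decode_channel_mask is hypothetical and chosen only for illustration.

```python
# Subset of the WAVE_FORMAT_EXTENSIBLE speaker flags, lowest bit first.
# Channels appear in the file in the order of the bits set in the mask.
SPEAKER_FLAGS = [
    (0x1, "SPEAKER_FRONT_LEFT"),
    (0x2, "SPEAKER_FRONT_RIGHT"),
    (0x4, "SPEAKER_FRONT_CENTER"),
    (0x8, "SPEAKER_LOW_FREQUENCY"),
    (0x10, "SPEAKER_BACK_LEFT"),
    (0x20, "SPEAKER_BACK_RIGHT"),
    (0x200, "SPEAKER_SIDE_LEFT"),
    (0x400, "SPEAKER_SIDE_RIGHT"),
]


def decode_channel_mask(mask: int) -> list[str]:
    """Return the speaker tag of each channel, in file order (hypothetical helper)."""
    return [name for bit, name in SPEAKER_FLAGS if mask & bit]


# Two common 5.1 conventions give different masks, so a reader can tell them apart.
print(decode_channel_mask(0x3F))   # 5.1 using back left/right
print(decode_channel_mask(0x60F))  # 5.1 using side left/right
```

With the mask present, a six-channel file no longer has to be interpreted by guesswork: the two masks above both describe six channels, but with different surround speaker positions.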

For many years, researchers have been working on audio formats that are independent of the speaker configuration. One of them is the object-based format, which describes the components of a scene with time-varying metadata and so provides maximum flexibility. For the broadcaster this solution is very attractive, since programs can be produced only once and then distributed in different formats that are generated automatically. The new BWF allows scenes and audio objects to be represented, which makes it possible for broadcasters to transport and exchange programs produced in these formats.

This audio technology has evolved rapidly in recent times, giving rise to new standards. Object-based audio describes an overall audio presentation structured into individual elements (or objects), each with its own metadata describing its relationships, behavior and associations. The metadata tells an "assembler" in the AV system how best to assemble the audio objects into the desired presentation on the available speaker array.
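The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any standard's renderer: the AudioObject class, its azimuth_deg and gain fields, and the render function are all hypothetical, positive azimuth is assumed to mean "to the left", and the stereo speakers are assumed to sit at +30 and -30 degrees. The same objects are assembled for two different speaker arrays using constant-power panning.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical audio object: samples plus positional metadata."""
    name: str
    samples: list[float]
    azimuth_deg: float  # assumed convention: +30 = left speaker, -30 = right speaker
    gain: float = 1.0


def stereo_gains(azimuth_deg: float, half_angle_deg: float = 30.0) -> tuple[float, float]:
    """Constant-power (left, right) gains for a source between the two speakers."""
    az = max(-half_angle_deg, min(half_angle_deg, azimuth_deg))
    theta = (az + half_angle_deg) / (2 * half_angle_deg) * (math.pi / 2)
    return math.sin(theta), math.cos(theta)


def render(objects: list[AudioObject], layout: str) -> list[list[float]]:
    """Mix the objects into one output buffer per speaker of the given layout."""
    if layout not in ("mono", "stereo"):
        raise ValueError(f"unsupported layout: {layout}")
    length = max((len(o.samples) for o in objects), default=0)
    channels = 2 if layout == "stereo" else 1
    out = [[0.0] * length for _ in range(channels)]
    for obj in objects:
        gains = stereo_gains(obj.azimuth_deg) if layout == "stereo" else (1.0,)
        for ch, g in enumerate(gains):
            for i, s in enumerate(obj.samples):
                out[ch][i] += obj.gain * g * s
    return out


dialogue = AudioObject("dialogue", [1.0, 0.5], azimuth_deg=0.0)
effect = AudioObject("effect", [0.2, 0.2], azimuth_deg=30.0)
stereo_mix = render([dialogue, effect], "stereo")  # two buffers: left, right
mono_mix = render([dialogue, effect], "mono")      # one buffer
```

The program is authored once as objects plus metadata; only the final rendering step depends on the listener's speakers, which is what makes delivery to mono, stereo or larger arrays possible from a single production.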

Conceptually, this approach is very powerful and flexible, but a practical implementation requires deciding which problems should be tackled first.

Proposing concepts and solutions
One of the most important concepts in object-based audio technology is that of the "renderer". It is defined by the Forum for Advanced Media in Europe (FAME), an organization that deals with research and development in Ultra High Definition (UHD), Virtual Reality (VR) and other new technologies.

In practice it will most likely be necessary to transcode between different object-based presentations. High-end drama productions will require a very large number of objects (possibly hundreds or more), whereas actual workflows typically operate on smaller subsets, and bandwidth constraints will force the use of even fewer objects if productions are to be delivered to households properly and economically.
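One very simple way to picture such a reduction is sketched below in Python. It is an assumption made for illustration, not a method described by any of the standards mentioned here: the Obj class, its importance field and the reduce_objects function are hypothetical. The most important objects are kept as they are, and the remainder are mixed down into a single "bed" object, which loses their individual positional metadata.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """Hypothetical audio object with an importance ranking."""
    name: str
    importance: int       # assumed field: higher means more important
    samples: list[float]


def reduce_objects(objects: list[Obj], max_objects: int) -> list[Obj]:
    """Keep the top objects and mix the rest into one 'bed' object."""
    if max_objects < 1:
        raise ValueError("max_objects must be at least 1")
    if len(objects) <= max_objects:
        return list(objects)
    ranked = sorted(objects, key=lambda o: o.importance, reverse=True)
    kept, rest = ranked[:max_objects - 1], ranked[max_objects - 1:]
    bed = [0.0] * max(len(o.samples) for o in rest)
    for o in rest:
        for i, s in enumerate(o.samples):
            bed[i] += s
    return kept + [Obj("bed", 0, bed)]


scene = [
    Obj("dialogue", 10, [1.0, 1.0]),
    Obj("crowd", 2, [0.25, 0.25]),
    Obj("footsteps", 1, [0.5, 0.0]),
]
delivery = reduce_objects(scene, max_objects=2)  # dialogue plus one mixed bed
```

A real transcoder would have to weigh many more factors, such as how far apart the merged objects are in space, but the sketch shows why the object count can be traded against bandwidth.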

It is also necessary to be able to evaluate the quality of the audio renderings produced by different implementations, and so far no technique exists for doing so. Established techniques such as MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) do not work here, because the aim is now to evaluate the "immersiveness" of the production material rather than the errors that may appear in it.

The above definition also makes it clear that for the renderer to render, both audio and metadata are required.

The real strength of such a flexible approach is that renderers can be developed to take a single published version and reproduce it in the best possible way across a range of platforms, devices and situations. This, however, creates a new challenge: the creative team will have only a very remote idea of how the audio program will sound in the home.

This raises the question of whether reference renderers and monitoring arrangements are needed so that a given production can be evaluated representatively. On top of object-based playback over professionally configured speakers, the renderer designer faces the even harder challenge of producing great sound over the asymmetrical speaker arrangements commonly found in the home.

In the consumer market today, the new generation of 4K TVs (UHDTV) continues to be equipped with conventional broadcast audio technology. The newest audio solutions, however, are not tied to UHDTV and can also be applied to ordinary TV receivers and standard optical discs.

As a consequence, object-based audio technologies are emerging in many places. For example, Dolby has objects at the heart of its Atmos solution for cinema (including home cinema) and is introducing its object-based technology as part of the AC-4 standard. DTS has released its Multi-Dimensional Audio (MDA) format. Fairlight has implemented Atmos and MDA tools in its 3DAW audio tools.

At the 2014 IBC exhibition, the BBC demonstrated several examples of immersion, personalization and interaction based on audio objects, and MPEG-H has been built to be "object ready" for the delivery of 3D audio not only for broadcasting but also for gaming and video conferencing.

Big changes in audio await us in the near future, and we must prepare for them properly.

*Carlos Pantsios Markhauser is a Telecommunications Engineer with a Master's in Communications from Simón Bolívar University, a specialization in satellite and network telecommunications from The George Washington University School of Engineering & Applied Science, and a specialization in digital telecommunications from the University of Colorado Boulder. He works as a full professor of postgraduate studies in the telecommunications schools of Simón Bolívar University and Andrés Bello Catholic University. He is also a professional consultant on TV projects based in Argentina.

Richard Santa, RAVT

Editor

Journalist from the Universidad de Antioquia (2010), with experience covering technology and economics. Editor of the magazines TVyVideo+Radio and AVI Latinoamérica. Academic coordinator of TecnoTelevisión&Radio.
