
TI811X-HDVPSS-01.00.01.44 Feature Performance Guide

From Texas Instruments Wiki

Feature performance guide for HDVPSS release 01.00.01.44


TI811x HD-VPSS Drivers

This section provides a brief overview of the device drivers supported in the HDVPSS release. Drivers are mainly classified into three categories:

  • Display Drivers
  • Capture Drivers
  • Memory-to-Memory (M2M) Drivers

HDVPSS Driver Features

  1. Most of the drivers run on the VPSS-M3 core with the BIOS operating system and the FVID2 interface.
  2. Display (V4L2) and fbdev drivers are supported on the Cortex-A8 core, with Linux as the operating system, using a proxy server.
  3. The release ships with sample applications and documentation.


VPDMA List Usage

VPDMA has 8 lists, which are shared across all drivers:

VPDMA usage

| Driver | DMA usage |
|---|---|
| Display | One list for each TV output used |
| M2M | Depends on the path used (1-6 lists) |
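
The list budget described above can be sanity-checked with a small sketch. This is only an illustration under stated assumptions: the function names are hypothetical (they are not HDVPSS APIs), list demand is assumed to add up linearly, and any lists consumed by drivers not shown in the table are ignored.

```python
# Hypothetical VPDMA list budget check. Names are illustrative only and
# are not part of the HDVPSS driver API.
VPDMA_TOTAL_LISTS = 8  # from the text: 8 lists shared across all drivers


def lists_needed(tv_outputs, m2m_path_lists):
    """Return the total number of VPDMA lists a configuration would need.

    tv_outputs: number of TV outputs in use (one list each, per the table).
    m2m_path_lists: list count for each M2M path in use (1-6 per the table).
    """
    for count in m2m_path_lists:
        if not 1 <= count <= 6:
            raise ValueError("an M2M path is expected to use 1-6 lists")
    return tv_outputs + sum(m2m_path_lists)


def fits_in_vpdma(tv_outputs, m2m_path_lists):
    """True if the configuration stays within the 8 shared lists."""
    return lists_needed(tv_outputs, m2m_path_lists) <= VPDMA_TOTAL_LISTS


# Example: two TV outputs plus two M2M paths using 1 and 3 lists.
print(lists_needed(2, [1, 3]), fits_in_vpdma(2, [1, 3]))
```

A configuration that exceeds the budget, such as three TV outputs plus an M2M path needing six lists, would be reported as not fitting.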

Setup Details

| Details | TI811X |
|---|---|
| SoC Details: Core | VPSS-M3 |
| Operating speed of Core | 200 MHz |
| Operating speed of HD-VPSS | 200 Mpixels/sec |
| EVM Configuration | Ducati, HDVPSS, EMIF, DDR2 |
| Optimization Details: Is the Ducati cache enabled? | Yes |
| Profile | whole program debug |
| Is the code and data placed in L2/L3 memory? | No |
| Is the L3 interconnect optimized? | No |


Video Display Drivers

This section describes the display drivers' performance numbers: throughput and CPU load.

Introduction

Display drivers take video buffers from the application and display the video on the VENCs at the specified frame rate and resolution. Display drivers follow the FVID2 interface.

Bypass Path 0/1 and Secondary 1 Path Display Driver

The bypass path display driver controls the two bypass paths in the hardware. It configures the hardware only up to the muxes. The rest of the hardware below the mux/switch, such as CIG, COMP and VENC, is controlled by the display controller driver.


Setup Details

  • TI811x EVM
  • TV
  • DVD Player

Video Display performance values (TI811x, from VPSS-M3)

| Output Display (Resolution) | Frame Rate (in Frames/sec) | CPU Load (in %) |
|---|---|---|
| Off-Chip HDMI - DVO1 (With Hardware Mosaic) | 60 FPS for 1080I60, 1080P60 and 50 FPS for 1080P50 | 2% |
| Off-Chip HDMI - DVO1 (With Hardware Mosaic) | 50 FPS for 1080P50 and 30 FPS for 1080P30 | 1% |
| DVO2 (With Hardware Mosaic) | NRY | NRY |


Graphics Path 0/1/2 Driver

The graphics path display driver controls the three graphics paths in the hardware to display graphics planes, including multi-region support. The rest of the hardware below, such as COMP and VENC, is controlled by the display controller driver.

Graphics Planes performance values

| Output Display (Resolution) | TI811x VPSS-M3: Frame Rate (in Frames/sec) | TI811x VPSS-M3: CPU Load (in %) | TI811x Cortex-A8: Frame Rate (in Frames/sec) | TI811x Cortex-A8: CPU Load M3 (in %) | TI811x Cortex-A8: CPU Load A8 (in %) |
|---|---|---|---|---|---|
| DVO1 | 60 FPS for 1080P60, 1080I60, 720P60 and 50 FPS for 1080P50, 1080I50, 720P50 | 2% | NRY | NRY | NRY |
| DVO1 | 30 FPS for 1080P30 | 1% | NRY | NRY | NRY |
| DVO2 | NRY | NRY | NRY | NRY | NRY |




Video Capture Driver

This section describes the video capture driver performance numbers: throughput and CPU load.

Introduction

The VIP capture driver uses the VIP hardware block in HDVPSS to capture data from an external video source such as a video decoder (for example, TVP5158 or TVP7002). The video data is captured from the external video source by the VIP Parser sub-block in the VIP block. The VIP Parser then sends the captured data for further processing in the VIP block, which can include color space conversion, scaling and chroma downsampling. Finally, the video data is written to external DDR memory.

Setup Details

  • TI811x EVM
  • TV
  • DVD Player

Video Capture NTSC VIP input performance values (TI811x M3 Core)

| Output Display (Resolution) | Frame Rate (in Frames/sec) | CPU Load (in %) |
|---|---|---|
| NTSC single in | 60 | 3% |



Memory to Memory Drivers

This section describes the memory-to-memory drivers' performance numbers: throughput and CPU load.

Introduction

M2M drivers take a video buffer from memory, optionally process it (the processing done depends on the specific M2M driver) and put it back in memory. M2M drivers expose the FVID2 interface to applications.

Secondary 0 Or Bypass path 0/1 to SC5 and Sec 0/1 to SC3/SC4 M2M driver

This driver takes video data from one of the three paths (SEC0/BP0/BP1), scales it (SC5) and writes the output video to memory. Other variants take data from secondary path 0/1 (SEC0/SEC1), scale it via the VIP path scalars (SC3/SC4) and write the output video to memory.

Setup Details:

  • The time required for a single scaling operation is calculated. For CPU load, scaling operations are issued in a contiguous loop, queuing a buffer for each resize.

Scalar Driver Performance values (TI811x VPSS-M3)

| Scaling Factor (Resolution) | Frames per Sec | CPU Load (in %) |
|---|---|---|
| SEC0-SC5 Single Ch (720x480 YUV420 => 720x480 YUV422 interleaved) | NRY | NRY |
| SEC0-SC5 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV422 interleaved) | 94 | 2% |
| SEC0-SC5 1/4x (1920x1080 YUV420 => 720x480 YUV422 interleaved) | 94 | 1% |
| BP0-SC5 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV422 interleaved) | NRY | NRY |
| BP1-SC5 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV422 interleaved) | NRY | NRY |
| SEC0-SC3-VIP0 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV420) | NRY | NRY |
| SEC1-SC4-VIP1 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV420) | 94 | 2% |
| MultiCh - 3Ch (720x480 => 720xXXX) | NRY | NRY |
| 8D1@60fps | NRY | NRY |
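
The measurement approach in the setup details, and the conversion of a frames-per-second figure into pixel throughput, might be sketched as follows. This is a rough illustration, not the test code used for the table: `process_one_frame` is a hypothetical stand-in for queuing one buffer and waiting for the resize to complete, and the throughput is assumed to be counted on the larger of the input and output frame sizes.

```python
import time


def measure_fps(process_one_frame, iterations):
    """Run the operation back to back and return operations per second.

    process_one_frame is a hypothetical callable standing in for
    "queue one buffer and wait for the resize to complete".
    """
    start = time.perf_counter()
    for _ in range(iterations):
        process_one_frame()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


def throughput_mpps(width, height, fps):
    """Pixel throughput in megapixels per second (1 MP = 1e6 pixels)."""
    return width * height * fps / 1e6


# Table row "SEC0-SC5 Single Ch 4x": 94 frames/sec with a 1920x1080 output.
print(f"{throughput_mpps(1920, 1080, 94):.1f} MP/s")  # prints 194.9 MP/s
```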


SubFrame level processing in Secondary 0 Or Bypass path 0/1 to SC5 M2M driver

This driver takes video data from one of the three paths (SEC0/BP0/BP1), scales it (SC5) sub-frame by sub-frame and writes the output video to memory. Each frame is divided into multiple sub-frames, which are processed in turn.

Setup Details:

  • The time required for a single scaling operation is calculated. For CPU load, scaling operations are issued in a contiguous loop, queuing a buffer for each resize.

Scalar Driver Performance values (TI811x VPSS-M3)

| Scaling Factor (Resolution) | Frames per Sec | CPU Load (in %) |
|---|---|---|
| SEC0-SC5 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV422 interleaved) | 83 | 14% |
| BP0-SC5 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV422 interleaved) | NRY | NRY |
| BP1-SC5 Single Ch 4x (720x480 YUV420 => 1920x1080 YUV422 interleaved) | NRY | NRY |


DEI M2M Driver

This driver takes YUYV422/YUV420 interlaced/progressive input via the DEI path and provides one or two scaled versions of the deinterlaced/bypassed output: one via writeback path 0/1 and another via VIP 0/1.

Setup Details

  • CPU Idle - Disabled
  • Tool Used for measurement - LFTB
  • The time required for a single resize operation is calculated. For CPU load, resize operations are issued in a contiguous loop, queuing a buffer for each resize.

DEI Scalar Driver Performance values (TI811x VPSS-M3)

| Scaling Factor (Resolution) | Frames per Sec | CPU Load (in %) |
|---|---|---|
| DEI-WB1 - Single Ch 720x240 YUV420 => scaled to 360x240 YUYV422 via WB1 | NRY | NRY |
| DEI-WB1-VIP1 - Single Ch 720x240 YUV420 => dual scaled to 360x240 YUYV422 via WB1 and 720x480 YUV420 via VIP1 | NRY | NRY |
| Single o/p writeback path 4x (720x480 => 1920x1080) | NRY | NRY |
| Single o/p VIP path 4x (720x480 => 1920x1080) | NRY | NRY |
| Single o/p writeback path 1/4x (1920x1080 => 720x480) | NRY | NRY |
| Single o/p VIP path 1/4x (1920x1080 => 720x480) | NRY | NRY |


Noise Filter (NSF) M2M Driver

The noise filter driver allows the user to filter noise from video data by processing it through the noise filter hardware. This driver can also be used solely for YUV422 to YUV420 chroma downsampling.

Noise filter Driver Performance values (TI811x VPSS-M3)

| Mode | Frames per Sec | CPU Load (in %) |
|---|---|---|
| Single Ch Chroma downsampling (640x480 YUV422 interleaved => 640x480 YUV420 Semiplanar) | 502 | 5% |
| NF spatial (1080P input) | NRY | NRY |
| NF temporal (1080P input) | NRY | NRY |
| MultiCh NF spatial (480P input) | NRY | NRY |
| MultiCh NF temporal (480P input) | NRY | NRY |
| 16 Ch Chroma downsampling (720x240 YUV422 interleaved => 720x240 YUV420 Semiplanar) | NRY | NRY |


Calculating Performance for different Memory to memory paths

The description below is based on performance measured with the SW drivers on actual silicon.

Performance of Scalar (SC) Path

This is applicable to all SCs in TI811x.
Here the DEI, wherever applicable, is assumed to be in bypass mode.
Performance when the DEI is not in bypass mode is described in a subsequent section.


Each SC operates at a 200 MHz clock.
In theory it can process 1 pixel per clock, i.e. about 200 megapixels per second (MP/s).

However, because of the inherent overheads of the overlapping needed for the various filtering operations, the practical standalone speed (i.e. with only the SC running in the system) would be about 180-190 MP/s.

When the SC is run together with other modules, such as other drivers or codecs, the performance may drop further due to DDR bandwidth.

SW overheads will also reduce SC performance, but with the TI HDVPSS driver we see very little impact from SW overheads. With SW overheads, the SC can safely process about 130 MP/s.

The number of pixels processed when scaling one D1 channel of 720x480 at 30 frames per second is 720 x 480 x 30 (frames per second) = 10.4 MP/s.

Here the output from the SC is <= 720x480.

Thus the SC can safely handle about 12 D1 channels when its output size is <= 720x480, i.e. when only downscaling is done in the scaler.
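
The arithmetic above can be reproduced with a short sketch. The function names are illustrative only; the 130 MP/s budget and the D1 parameters are taken from the text.

```python
SC_SAFE_BUDGET_MPPS = 130.0  # safe SC throughput quoted in the text


def d1_rate_mpps(width=720, height=480, fps=30):
    """Pixel rate of one channel in megapixels per second."""
    return width * height * fps / 1e6


def safe_channels(budget_mpps, per_channel_mpps):
    """Whole number of channels that fit inside the throughput budget."""
    return int(budget_mpps // per_channel_mpps)


rate = d1_rate_mpps()
print(f"{rate:.3f} MP/s per D1 channel")           # 10.368 MP/s per D1 channel
print(safe_channels(SC_SAFE_BUDGET_MPPS, rate))   # 12
```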

In practice, with HDVPSS-only applications, we found that the measured SC performance is about 13 D1 channels (about 140 MP/s).

With other activity, such as codecs, performance should drop, but each SC will safely give 12-channel D1 performance (130 MP/s).

When scalar upsampling is used, the results are a bit different.
For the use case of scaling 720x480 to a 1920x1080 output size, the performance for one channel would be
1920 x 1080 (since 1920x1080 > 720x480) x 30 (frames per second) = 62.2 MP/s

In TI811x, assuming SC performance is 130 MP/s, that is about 2 channels.
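
The upsampling example implies that the pixel count is taken from the larger of the input and output frames. A minimal sketch of that rule, with hypothetical helper names and the 130 MP/s figure from the text, might look like this:

```python
def effective_pixels(in_size, out_size):
    """Pixels counted per frame: the larger of the input and output frames.

    This rule is inferred from the worked example in the text.
    """
    in_px = in_size[0] * in_size[1]
    out_px = out_size[0] * out_size[1]
    return max(in_px, out_px)


def channel_rate_mpps(in_size, out_size, fps):
    """Per-channel pixel rate in megapixels per second."""
    return effective_pixels(in_size, out_size) * fps / 1e6


upscale = channel_rate_mpps((720, 480), (1920, 1080), 30)
print(f"{upscale:.1f} MP/s")   # 62.2 MP/s
print(int(130.0 // upscale))   # 2 channels within a 130 MP/s budget
```

With a downscaling output such as 360x240, the same helper falls back to the 720x480 input size, which matches the D1 downscaling case.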


Performance of DEI

Each DEI operates at a 200 MHz clock in TI811x.

In theory it can process 1 pixel per clock, i.e. about 200 megapixels per second (MP/s).

However, because of the inherent overheads of the overlapping needed for the various filtering operations, the practical standalone speed (with only the DEI running in the system) would be about 150-160 MP/s.

When the DEI is run together with other modules, such as other drivers or codecs, the performance may drop further due to DDR bandwidth.

SW overheads will also reduce DEI performance, but with the TI HDVPSS drivers we see very little impact from SW overheads. With SW overheads, the DEI can safely process about 130 MP/s.

The number of pixels processed when deinterlacing one D1 channel of 720x240 at 60 fields per second is

720 x 240 x 2 (since DEI results in 1 line becoming two lines) x 60 (fields per second) = 20.7 MP/s

Here the output from the DEI is <= 720x480.

Thus the DEI can safely handle about 6 D1 channels in TI811x when its output size is <= 720x480, i.e. when only downscaling is done in the scaler after the DEI.

In practice, with HDVPSS-only applications, we found that the measured DEI performance is about 6-7 D1 channels (about 140 MP/s).

With other activity, such as codecs, performance should drop, but each DEI will safely give 6-channel D1 performance.

The above applies when scalar downsampling is used after the DEI.

When scalar upsampling is used, the results are a bit different.
For the use case of a 704x480 output size, the performance for one channel would be

704 x 480 (since 704x480 > 720x240) x 60 (fields per second) = 20.3 MP/s

Assuming DEI performance is 130 MP/s, that is about 6 channels.
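
The two DEI cases above can be checked numerically. The sketch below simply mirrors the two formulas given in the text; the function names are illustrative and the 130 MP/s budget is the safe figure quoted earlier in this section.

```python
DEI_SAFE_BUDGET_MPPS = 130.0  # safe DEI throughput quoted in the text


def dei_rate_mpps(field_width, field_height, fields_per_sec):
    """Pixel rate when each input field line becomes two output lines."""
    return field_width * field_height * 2 * fields_per_sec / 1e6


def output_rate_mpps(out_width, out_height, fields_per_sec):
    """Pixel rate counted on the scaled output size, as in the second case."""
    return out_width * out_height * fields_per_sec / 1e6


deinterlace = dei_rate_mpps(720, 240, 60)      # 20.736 MP/s
scaled_out = output_rate_mpps(704, 480, 60)    # 20.2752 MP/s

for rate in (deinterlace, scaled_out):
    channels = int(DEI_SAFE_BUDGET_MPPS // rate)
    print(f"{rate:.1f} MP/s per channel, {channels} channels")
```

Both cases come out at 6 channels within the 130 MP/s budget.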


Performance of Noise Filter (NF)

The NF operates at a 200 MHz clock.
In theory it can process 1 pixel per clock, i.e. about 200 megapixels per second (MP/s).

However, because of the inherent overheads of the overlapping needed for the various filtering operations, the practical standalone speed (with only the NF running in the system) would be about 130-140 MP/s.

When the NF is run together with other modules, such as other drivers or codecs, the performance may drop further due to DDR bandwidth.

SW overheads will also reduce NF performance, but with our driver we see very little impact from SW overheads. With SW overheads, the NF can safely process about 130 MP/s.

The number of pixels processed when noise filtering one D1 channel of 720x240 at 60 fields per second is

720 x 240 x 60 (fields per second) = 10.4 MP/s

Thus the NF can safely handle about 12 D1 channels in TI811x.

In practice, with HDVPSS-only applications, we found that the measured NF performance is also about 12 D1 channels (about 130 MP/s).

With other activity, such as codecs, performance should drop, but each NF will safely give 12-channel D1 performance (130 MP/s).
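
To see how close 12 D1 channels come to the quoted NF budget, the load can be expressed as a fraction of 130 MP/s. This is only an illustrative calculation; the helper name is hypothetical.

```python
NF_SAFE_BUDGET_MPPS = 130.0  # safe NF throughput quoted in the text


def nf_load(channels, width, height, fields_per_sec,
            budget_mpps=NF_SAFE_BUDGET_MPPS):
    """Return (pixel rate in MP/s, fraction of the budget used)."""
    used = channels * width * height * fields_per_sec / 1e6
    return used, used / budget_mpps


used, fraction = nf_load(12, 720, 240, 60)
print(f"{used:.1f} MP/s, {fraction:.0%} of budget")  # 124.4 MP/s, 96% of budget
```

A thirteenth channel would bring the total to about 134.8 MP/s, which is above the 130 MP/s figure.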

Overall System Performance

The HDVPSS BIOS package includes a Links and Chains example. It shows typical use cases that exercise many different HDVPSS drivers. The table below shows the performance numbers for different combinations of the HDVPSS drivers. Details of each combination can be found in the Links and Chains UserGuide.

System Performance Values (TI811x VPSS-M3)

| Mode | CPU Load (in %) |
|---|---|
| Single CH Capture + Scale + Display [Option 1] | 4% |
| Multi CH Capture + Scale + Display [Option 2] | 6% |
| Multi CH Capture + NSF + Scale + Display [Option 3] | 7% |
| Multi CH Capture + DEI + Scale + Display [Option 4] | 32% |
| Single CH Capture + NSF + DEI + Display (Full screen DEI) [Option 6] | 6% |

