Author: rosn

Boost your GStreamer pipeline with the GPU plugin

Embedded devices like the Nvidia Tegra X1/2 offer tremendous video processing capabilities. Often, however, bottlenecks keep you from taking advantage of their full potential. One solution to this problem is to employ the general purpose compute capabilities of the GPU (GPGPU). For this purpose, we have developed a GStreamer Plug-In that lets you add customized video processing functionality to any pipeline with full GPU support.

A possible application is shown in the image below. Two video inputs are combined into a single video output as a picture-in-picture video stream. A 4K image forms the background, and a downscaled FullHD input is streamed on top of it.
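The picture-in-picture operation itself can be illustrated with a few lines of plain Python. This is only a CPU sketch of the idea on tiny stand-in frames, not the CUDA code of the plugin; the function names, the nearest-neighbour downscaling and the inset position are assumptions made for illustration.

```python
def downscale_nearest(frame, factor):
    """Shrink a frame (list of pixel rows) by an integer factor, nearest-neighbour."""
    return [row[::factor] for row in frame[::factor]]


def picture_in_picture(background, inset, top, left):
    """Return a copy of `background` with `inset` pasted at row `top`, column `left`."""
    if top < 0 or left < 0:
        raise ValueError("inset position must be non-negative")
    if top + len(inset) > len(background) or left + len(inset[0]) > len(background[0]):
        raise ValueError("inset does not fit inside the background")
    out = [list(row) for row in background]  # leave the input frame untouched
    for y, inset_row in enumerate(inset):
        out[top + y][left:left + len(inset_row)] = inset_row
    return out


# Tiny stand-ins: the "4K" background holds 0, the "FullHD" input holds 1.
background = [[0] * 8 for _ in range(4)]
full_hd = [[1] * 4 for _ in range(4)]

inset = downscale_nearest(full_hd, 2)  # 4x4 -> 2x2
combined = picture_in_picture(background, inset, top=1, left=5)
```

On the GPU, each output pixel can be computed independently, which is why this kind of operation maps well to CUDA.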

To cope with the huge amount of data, the video processing is offloaded to the GPU. CUDA lets you create new algorithms from scratch or integrate existing libraries. The plugin lets you benefit from the unique architecture of the TX1/2, where CPU and GPU share access to the same memory. As a result, memory access time is reduced and unnecessary copies are avoided. The next image shows a pipeline for the example mentioned above.

At the beginning of the pipeline, where the data rates are highest, the GPU and the internal hardware encoders are used. The CPU can then easily handle the compressed data and gives access to the large number of existing GStreamer Plug-Ins. For example, it can prepare a live video stream for clients.

The GStreamer Plug-In can also serve as a basis for other applications like format conversion, debayering or video filters.

Feel free to contact us on this topic.

Open Source drivers for HDMI2CSI module updated to support TX1 and TX2

The HDMI2CSI board for capturing 4K HDMI now supports both TX1 and TX2. Video capturing is fully supported for resolutions up to 2160p30 on Input A and 1080p60 on Input B.

Driver development will continue on L4T 28.1. The previous 24.2.1 branch is considered deprecated.

Get started with the Readme:
and find detailed instructions (for building the Kernel etc.) on the Wiki:

Main changes:

  • Driver for tc358840: Now using the updated version that is already in the 28.1 kernel (with a small modification)
  • Device tree: Adapted to be compatible with 28.1 (if you come from a previous L4T release, please note the new way of flashing a device tree in U-Boot! The structure is also different, with separate repositories for kernel and device tree)
  • VI driver: Using the new version from Nvidia instead of our implementation, since it now supports “ganged mode” for combining multiple VI ports
  • Custom resolutions: The EDID can be read and written from the Linux userspace (see [1]) to support different resolutions/timings on the fly
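When you edit an EDID before writing it back, keep in mind that every 128-byte EDID block ends with a checksum byte that makes the block sum to 0 modulo 256. The following is a minimal sketch of checking and repairing that checksum. The helper names are our own for illustration, and how the bytes are read from or written to the driver is not shown here.

```python
EDID_BLOCK_SIZE = 128
# Fixed 8-byte pattern at the start of every EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_checksum(block):
    """Return the last byte that makes the 128-byte block sum to 0 modulo 256."""
    if len(block) != EDID_BLOCK_SIZE:
        raise ValueError("an EDID block must be exactly 128 bytes")
    return (-sum(block[:-1])) % 256


def is_valid_base_block(block):
    """Check size, fixed header and checksum of an EDID base block."""
    return (
        len(block) == EDID_BLOCK_SIZE
        and bytes(block[:8]) == EDID_HEADER
        and sum(block) % 256 == 0
    )


def fix_checksum(block):
    """Return a copy of the block with a recomputed checksum byte."""
    fixed = bytearray(block)
    fixed[-1] = edid_checksum(fixed)
    return bytes(fixed)


# Usage: start from a (here mostly empty) block, change a byte, repair the checksum.
block = bytearray(EDID_BLOCK_SIZE)
block[:8] = EDID_HEADER
block[8] = 0x12  # placeholder modification, e.g. an edited field
repaired = fix_checksum(block)
```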

If you want to use userptr/dmabuf mode in the GStreamer v4l2src element, you still need to rebuild GStreamer. By default, GStreamer uses libv4l for the VIDIOC_REQBUFS ioctl, and the libv4l implementation of this ioctl does NOT support userptr/dmabuf. If you build GStreamer without libv4l, it uses the correct ioctl implementations and these modes work.
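As a rough sketch, the buffer mode is selected with the io-mode property of v4l2src. The small helper below only assembles a pipeline description string; the device path /dev/video0 and the fakesink element are placeholders, and the helper name is our own.

```python
# Values accepted by the io-mode property of the GStreamer 1.x v4l2src element.
V4L2SRC_IO_MODES = ("auto", "rw", "mmap", "userptr", "dmabuf", "dmabuf-import")


def capture_pipeline(device="/dev/video0", io_mode="userptr", sink="fakesink"):
    """Build a gst-launch style pipeline description (device and sink are placeholders)."""
    if io_mode not in V4L2SRC_IO_MODES:
        raise ValueError(f"unknown io-mode: {io_mode}")
    return f"v4l2src device={device} io-mode={io_mode} ! {sink}"


# Print a command line that could be tried on the target.
print("gst-launch-1.0 " + capture_pipeline())
```

With a GStreamer build that still uses libv4l, a pipeline like this is expected to fail during buffer allocation, as described above.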

Original release:


MIPI CSI/DSI Interface for General Purpose Data Acquisition

Modern SoC devices offer high performance for data analysis and processing. Yet the choice of high-speed general purpose interfaces that can transfer correspondingly high data rates is limited. The first that comes to mind is PCIe, which is available in most high-performance SoCs. However, PCIe requires a relatively complex controller on both data source and sink. In addition, because PCIe is such a commonly used interface, all of the SoC's PCIe controllers may already be occupied by peripherals.

Coming from the mobile market, some SoCs additionally offer MIPI Camera Serial Interface (CSI) / Display Serial Interface (DSI) [1] interfaces, for example the Nvidia Tegra K1 / X1 or Qualcomm Snapdragon 820. These interfaces were designed for high-bandwidth video input (CSI) and output (DSI). These state-of-the-art SoCs provide CSI-2 D-PHY interfaces with a transmission rate of 1.5 to 2.5 Gbps per lane. One such interface consists of a maximum of 4 data lanes and one clock lane. Typically, one to three interfaces are available, allowing up to six different devices to be connected (depending on the SoC model).

Figure 1: MIPI CSI-2 D-PHY interface

Instead of restricting the use of the CSI/DSI interfaces to video only, we propose to use them for transferring general purpose data. The theoretical maximum bandwidth of such an implementation is 30 Gbps (using three 4-lane MIPI CSI/DSI interfaces at 2.5 Gbps per lane). For a data acquisition application, this corresponds to a sampling rate of 1.875 GSps at 16 bits per sample. A comparable PCIe x4 v2 interface provides a maximum throughput of 16 Gbps, resulting in a sampling rate of 1 GSps. We successfully implemented and tested digital audio data transmission over CSI/DSI and will continue to explore this interesting interface.
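The figures above can be reproduced with a short calculation. The sample width of 16 bits is an assumption inferred from the ratio of bandwidth to sampling rate, and the CSI figure ignores protocol overhead. The PCIe figure uses the 5 GT/s lane rate and 8b/10b encoding of PCIe 2.0.

```python
BITS_PER_SAMPLE = 16  # assumption: inferred from 30 Gbps / 1.875 GSps


def aggregate_gbps(lanes, gbps_per_lane):
    """Raw aggregate bandwidth of several lanes in Gbps."""
    return lanes * gbps_per_lane


def max_sample_rate_gsps(bandwidth_gbps, bits_per_sample=BITS_PER_SAMPLE):
    """Highest sampling rate (GSps) that fits into the given bandwidth."""
    return bandwidth_gbps / bits_per_sample


# Three 4-lane CSI/DSI interfaces at 2.5 Gbps per lane.
csi_gbps = aggregate_gbps(lanes=3 * 4, gbps_per_lane=2.5)

# PCIe 2.0 x4: 5 GT/s per lane, 8b/10b encoding leaves 8 payload bits per 10.
pcie_gbps = aggregate_gbps(lanes=4, gbps_per_lane=5.0 * 8 / 10)

print(f"CSI/DSI: {csi_gbps} Gbps -> {max_sample_rate_gsps(csi_gbps)} GSps")
print(f"PCIe x4 v2: {pcie_gbps} Gbps -> {max_sample_rate_gsps(pcie_gbps)} GSps")
```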

Audio Video Regression Test System

For our test-driven way of development, we built a regression test system for our high-performance video and audio transmission. The system is used to schedule and run tests and to monitor the results in real time. For this, it provides a wide range of interfaces to interact with the system under test. These include interfaces to monitor and manipulate the network traffic as well as interfaces to generate and analyse video and audio signals.



The system is based on a Linux OS and can therefore be used on many different hardware platforms. The tests are written in Python and can be run automatically or manually. An interface to Jenkins allows the test system to be combined with the build flow.

The regression test system provides the following advantages:

 – Improved quality due to regression testing

 – Automation of the testing process

 – Simplification of the test implementation

 – Individual adaptions depending on the test dependencies

Improved quality due to regression testing

With regression tests, the system is tested against a large number of test cases. Some cases are based on the expected behavior of the system; others are based on reports from customers and partners. Before new software is released, it has to pass all these test cases. In this way, each release is at least as good as the last one, and the software improves continuously with each release.


Automation of the testing process

The InES regression test system provides an interface to Jenkins, which allows the tests to be included directly in the build flow. The newest software can be built and downloaded to the target system, which is then tested with all the regression test cases. The Jenkins web interface lets the user see the current progress at any time and change or interrupt individual steps if required.

Simplification of the test implementation

The InES regression test system provides the required interfaces to the device under test as well as the tools to schedule and execute the tests. The user just has to describe the test cases in Python. The test system can be set up on a PC or an embedded system. It is also possible to split the test system over multiple platforms.
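To give an idea of what describing a test case in Python can look like, here is a minimal sketch based on the standard unittest module. It does not use the actual interfaces of the InES test system: FakeDeviceUnderTest and its methods are hypothetical stand-ins, and the tested behavior is invented for illustration.

```python
import unittest


class FakeDeviceUnderTest:
    """Hypothetical stand-in for a device interface; not the InES API."""

    def __init__(self):
        self.resolution = None

    def set_input_resolution(self, width, height):
        self.resolution = (width, height)

    def read_output_resolution(self):
        # A real interface would measure the output signal; the stub echoes the input.
        return self.resolution


class ResolutionRegressionTest(unittest.TestCase):
    """Expected behavior: the output resolution follows the input resolution."""

    def setUp(self):
        self.dut = FakeDeviceUnderTest()

    def test_output_follows_input_resolution(self):
        for width, height in [(1920, 1080), (3840, 2160)]:
            with self.subTest(width=width, height=height):
                self.dut.set_input_resolution(width, height)
                self.assertEqual(self.dut.read_output_resolution(), (width, height))


def run_suite():
    """Run the test case programmatically, as a scheduler or build job might."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ResolutionRegressionTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
```

Cases derived from customer reports can be added as further test methods, so that a fixed problem stays covered in every later release.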

Individual adaptions depending on the test dependencies

The regression test system has a modular design. Unused interfaces can be deactivated to reduce the requirements on the platform, and new interfaces specifically adapted to the device under test can be added. In this way, the test system can be tailored both to the device under test and to the platform it runs on.

HDMI Real-Time Analyzer and Tester

The High-Performance Multimedia Group has developed an HDMI Real-Time Analyzer and Tester which allows logging and real-time modifications of the HDMI stream between source and sink.


  • Compliance testing of HDMI devices
  • Simulating non-compliant communication behavior to test your device’s robustness




  • User-defined insertion and simulation of Enhanced Extended Display Identification Data (E-EDID) communication
  • Automatic simulation of several HDMI devices in a batch testing-procedure
  • Real-time modification of the HDMI communication stream, such as corrupting or delaying the responses of HDMI devices
  • Logging of DDC communication
  • Real-time image analysis for automated testing

The HDMI Real-Time Analyzer and Tester is implemented on an Altera FPGA with NIOS II softcore processor running Linux. The device taps into the DDC channel to allow logging and modifying the communication in real-time according to user-defined behavior rules.  
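The idea of user-defined behavior rules can be modelled in software as a chain of functions applied to each logged DDC response. The sketch below is purely illustrative and does not reflect the FPGA implementation: DdcResponse, the rule functions and apply_rules are hypothetical names. The only fixed fact used is that EDID is read over DDC at the 7-bit I2C address 0x50.

```python
from dataclasses import dataclass, replace

EDID_I2C_ADDRESS = 0x50  # 7-bit DDC address used for EDID reads


@dataclass(frozen=True)
class DdcResponse:
    address: int       # 7-bit I2C address the sink answered on
    data: bytes        # payload returned by the sink
    delay_ms: int = 0  # extra delay before the response is forwarded


def corrupt_first_byte(response):
    """Rule: invert the first payload byte of EDID responses."""
    if response.address != EDID_I2C_ADDRESS or not response.data:
        return response
    corrupted = bytes([response.data[0] ^ 0xFF]) + response.data[1:]
    return replace(response, data=corrupted)


def delay_by(ms):
    """Rule factory: forward the response `ms` milliseconds later."""
    def rule(response):
        return replace(response, delay_ms=response.delay_ms + ms)
    return rule


def apply_rules(response, rules, log):
    """Log the unmodified response, then apply each rule in order."""
    log.append(response)
    for rule in rules:
        response = rule(response)
    return response


log = []
original = DdcResponse(EDID_I2C_ADDRESS, bytes([0x00, 0xFF, 0xFF]))
forwarded = apply_rules(original, [corrupt_first_byte, delay_by(20)], log)
```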

Open source driver for HDMI2CSI module released


The open source driver for the HDMI2CSI module is now available. The driver was developed within the Video4Linux2 (V4L2) framework and consists of three main components:

  • A host driver for controlling the Video Interface (VI2) on the Nvidia Jetson TX1 host processor
  • A subdevice driver that interacts with the Toshiba TC358840 CSI-HDMI bridge IC
  • A video buffer driver

The driver supports the standard device tree and currently runs on the Linux4Tegra (L4T) R24.1 operating system. The video stream can be accessed from user space using GStreamer, the versatile multimedia framework.

Fig. 1: Hardware and software components for using HDMI2CSI with an Nvidia Jetson TX1 host processor

Documentation containing instructions on how to use the drivers for the HDMI2CSI module is available at

Embedded Computing Conference



Meet us at the Embedded Computing Conference on May 31st on the campus of the ZHAW School of Engineering, Technikumstrasse 9, 8401 Winterthur. We are presenting new technology highlights based on embedded processors with integrated GPU.

4K HDMI to CSI Interface for TX1 Evalboard

The High-Performance Multimedia Group has developed a High Definition Multimedia Interface (HDMI®) to MIPI® Camera Serial Interface Type 2 (CSI-2) converter module (HDMI2CSI) as a plug-in for the NVIDIA Jetson TX1 development kit.
The HDMI2CSI module supports 4K video resolution for next-generation embedded Ultra High Definition video applications. It converts two 4K/2K HDMI video and audio streams simultaneously into MIPI CSI-2 video and TDM audio formats that can be processed by the Jetson TX1 processor.
The Jetson TX1 board is equipped with three four-lane MIPI high-speed camera serial interfaces (CSI-2), which the HDMI2CSI board uses to input HDMI video. The module uses two MIPI CSI-2 ports of the Jetson TX1 board (8 lanes) to input a 4K HDMI video stream. The remaining MIPI CSI-2 port (4 lanes) is used for a second 2K HDMI video stream.
Eight channels of HDMI audio streams per HDMI input are also supported and can be transmitted over TDM or I2S.
4K-capable drivers for the HDMI2CSI are available as open source.



Fig. 1: The HDMI2CSI Board




Fig. 2: The HDMI2CSI Board attached to the NVIDIA TX1 Evaluation Board


Technical Data:

  • Based on Toshiba TC358840 Camera Serial Interface converter ICs
  • HDMI 1.4b
  • 4096 x 2160 (4Kx2K) @ 24 fps
  • 3840 x 2160 @ 30 fps
  • 4 x I2S Audio Interface with 16, 18, 20 or 24-bit
  • 8 Channel TDM (Time Division Multiplexed Audio Interface) with 16, 18, 20 or 24-bit
  • HDCP 1.4 support
  • EDID support via I2C
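
To get a feeling for the data rates behind the listed video modes, the raw pixel data rate can be estimated as follows. This is a rough sketch that counts active pixels only (no blanking or protocol overhead) and assumes 16 bits per pixel, as in 8-bit YUV 4:2:2; the actual pixel format depends on the configuration.

```python
def raw_video_gbps(width, height, fps, bits_per_pixel=16):
    """Raw pixel data rate in Gbps; 16 bpp is an assumption (8-bit YUV 4:2:2)."""
    return width * height * fps * bits_per_pixel / 1e9


modes = {
    "4096 x 2160 @ 24 fps": (4096, 2160, 24),
    "3840 x 2160 @ 30 fps": (3840, 2160, 30),
}

for name, (width, height, fps) in modes.items():
    print(f"{name}: {raw_video_gbps(width, height, fps):.2f} Gbps")
```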

For purchase orders, please contact Pender Electronic Design


(c) ZHAW Institute of Embedded Systems

Blog is online

Welcome to the new High-Performance Multimedia Blog. Our research group is part of the Institute of Embedded Systems (InES) at the Zurich University of Applied Sciences (ZHAW). Our main activities are audio/video processing and data acquisition systems. We are involved in a number of different projects with industry partners in medical, professional video and audio, high-speed signal analysis as well as other research domains.