
Nvidia Jetson Orin NX Modular Vision System

The Institute of Embedded Systems (InES) at ZHAW, with extensive experience in hardware and software development for NVIDIA Jetson computing modules, presents the successor to its NVIDIA Jetson AGX Xavier modular vision system, based on the new high-performance NVIDIA Jetson Orin NX.

Like the Jetson AGX Xavier system, this new prototyping platform consists of a greatly reduced baseboard which can be extended with different types of M.2-footprint modules to add functions such as HDMI in, FPD-Link III, USB-C, etc. Thanks to the modular architecture, a personalized system can be configured. The low complexity and flexibility of the provided interfaces also allow the development of custom, application-specific extension boards.

Figure 1: Anyvision Orin NX Carrier Board

The system consists of a minimal motherboard which includes the circuitry necessary to start the Orin NX and flash the system onto an external NVMe SSD.

The baseboard is powered from a common 12 V DC power supply. To keep size and costs low, the available peripheral interfaces on the mainboard are limited to 2x USB-A 3.2 Gen 2, 1x Micro-USB for flashing, 1x HDMI out, and 1x Gigabit Ethernet.

Figure 2: Interfaces and Features of Orin NX Motherboard (example configuration with a Dual HDMI module and a CAN module)

All additional interfaces are exposed via M.2-footprint sockets. The Networking interface, SSD interface, and PCIe x4 slot meet industry standards. The Video Input and General-Purpose interfaces, on the other hand, implement a custom pinout defined by ZHAW InES and share dedicated Orin NX interfaces like MIPI CSI-2 (2x 4-lane), USB, I2C, SPI, and UART, as well as GPIO functionality.

Compared to the original Jetson AGX Xavier Anyvision baseboard, only one Video Input Interface (instead of two) is available. Furthermore, the Ethernet interface (on the new module, Ethernet is accessible directly through the baseboard Ethernet port) as well as the Video Output interface were omitted.

An overview of possible configurations and currently available extension modules is given in the table below (for more information about the extension modules see Nvidia Xavier-AGX Modular Vision System).

Interface Name                    | Orin Dedicated Lines                                       | Available Modules
Video Input Interface (M.2 M)     | 2x 4-lane (or 4x 2-lane) CSI-2, 1x I2C, 1x I2S, 6x GPIO    | Dual HDMI (4k30) Module, FPD-Link III Module, Dual RPi Camera Module
General Purpose Interface (M.2 E) | 2x USB 3.2 Gen2, 1x I2C, 1x CAN, 1x SPI, 1x UART, 6x GPIO  | Dual USB-C Module, CAN Module (3x CAN)
Table 1: Interfaces for custom Anyvision modules
Slot                  | PCIe    | Available Modules
M.2 M (WWAN, SSD)     | 2x PCIe | occupied by NVMe SSD
M.2 E (WiFi, BT, NFC) | 1x PCIe | generic WiFi or BT modules
PCIe Slot             | 4x PCIe | generic PCIe cards
Table 2: Available industry-compliant PCIe interfaces

In the future, we plan to expand the selection of available extension modules, for example with an FPD-Link IV or an HDMI 4k60 module.

For more information about the baseboard or the extension modules, feel free to contact us!

Running Artificial Intelligence Algorithms Directly on Jetson Nano and Microcontrollers

Running artificial intelligence (AI) algorithms, such as neural networks, directly on embedded devices has many advantages compared to running them in the cloud: one can save significant amounts of cloud storage, reduce power consumption and enable real-time applications. In addition, privacy is increased and the required bandwidth is reduced because only the AI algorithm's results are forwarded to the cloud, not the full data. However, setting up the environments for custom neural networks on embedded devices can be difficult. That's why the HPMM team provides a fully "dockerized" reference workflow for the Nvidia Jetson Nano. It includes:

  • Container to convert TensorFlow and PyTorch models to .onnx models
  • Container to cross-compile a C++ TensorRT application for a Jetson Nano, including OpenCV
  • Container to run TensorRT networks on the Jetson Nano with the C++ and Python APIs

Please find the link to the reference workflow here.
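As an illustration of the first step in this workflow, the sketch below exports a placeholder PyTorch model to ONNX so it can later be parsed by TensorRT. The model, input shape, file name and opset version are assumptions for illustration; the containerized workflow linked above handles this in a reproducible environment.

```python
# Minimal sketch: export a (placeholder) PyTorch model to ONNX for later use
# with TensorRT. Model, input shape and opset version are assumptions.
import torch
import torchvision.models as models

model = models.mobilenet_v2(weights=None)   # placeholder network
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)   # batch, channels, height, width

torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",               # output file later consumed by TensorRT
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
print("Exported model.onnx")
```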

If you are interested in running AI algorithms on microcontrollers, such as the Cortex-M4, we provide reference workflows for several frameworks such as TensorFlow Lite for Microcontrollers, CMSIS-NN and ST CUBE AI here.

Feel free to contact us about your custom application or project involving embedded artificial intelligence!

Jamulus-Direct: Low-Latency Music Performance Application for Raspberry Pi 4

The Institute of Embedded Systems (InES) High Performance Multimedia group has created a low-latency version of the classic Jamulus music rehearsal application for the Raspberry Pi 4.

The classic Jamulus is an open source application for music groups who want to rehearse over the internet. Especially with the global pandemic, Jamulus is a great solution for bands and choirs to rehearse from home.
With the Jamulus-Direct solution, the Institute of Embedded Systems reduces the latency of the audio connection compared to classic Jamulus. The low audio latency is achieved through multiple peer-to-peer connections: each participant is connected to the other members of the group via a dedicated connection. The audio no longer needs to be sent via an audio server, which reduces the latency.

Peer-to-Peer communication in Jamulus Direct reduces latency

The figure below shows the audio transmission in a classic server-based session with three clients. Three computers each run the client software; computer 0 additionally starts the server. Each client sends its own audio to the server, where the audio is mixed together and sent back to the clients. This server topology introduces latency because all data has to make a detour over the server.

No peer-to-peer communication in classical Jamulus introduces latency


In Jamulus-Direct, a peer-to-peer topology therefore eliminates the detour via the server (see figure below). Each participant exchanges its data directly with every other participant, which makes peer-to-peer the preferred structure for achieving the lowest possible latency.

Peer to peer connections between three clients

This setup ensures the lowest possible latency between each pair of devices. To achieve it, a few challenges had to be overcome. For instance, peer-to-peer connections are usually blocked by the firewall of the network router and therefore need a mechanism to open the required ports. A further issue is the management of the session: in server-based systems, the server usually manages the session and serves as the contact point for new clients to register when joining and to unregister when leaving.
The peer-to-peer audio transmission showed audio latencies under 30 milliseconds between two locations with a ping time of 13 milliseconds.
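The port-opening problem mentioned above is typically solved with UDP hole punching: both peers send packets to each other's public address and port, so each NAT/firewall creates an outbound mapping that then lets the peer's packets in. The sketch below only illustrates the idea; the addresses and ports are placeholders, and the actual signalling used by Jamulus-Direct is described in the paper.

```python
# Conceptual UDP hole-punching sketch (not the Jamulus-Direct implementation).
# Both peers run this with the other side's public IP/port, exchanged
# beforehand over a signalling channel. Addresses/ports are placeholders.
import socket
import threading
import time

LOCAL_PORT = 22124                    # port this peer binds to
PEER_ADDR = ("198.51.100.23", 22124)  # other peer's public endpoint (placeholder)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", LOCAL_PORT))

def receive():
    while True:
        data, addr = sock.recvfrom(2048)
        print(f"received {len(data)} bytes from {addr}")

threading.Thread(target=receive, daemon=True).start()

# Repeatedly send to the peer: the first outgoing packets open a mapping in
# the local NAT/firewall, after which the peer's packets can get through.
for _ in range(10):
    sock.sendto(b"punch", PEER_ADDR)
    time.sleep(0.5)
```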
A detailed paper on Jamulus-Direct can be downloaded here.

Jamulus-Direct is written for the Raspberry Pi 4 and achieves high-quality audio using a USB audio card connected via the RPi 4 USB interface. Any audio card with available Linux drivers is suitable (tested with the Focusrite Scarlett 2i2 and the Behringer U-Phoria UM2).

There are two ways you can use Jamulus-Direct:
1.) Burn Jamulus-Direct onto an SD card and dedicate your Raspberry Pi to Jamulus-Direct
(Instructions for dedicated SD-card installation)

2.) Use Jamulus-Direct on your existing Raspberry Pi 4 with Raspberry Pi OS already installed
(Instructions with already installed Raspberry Pi OS)

Nvidia Xavier-AGX Modular Vision System

The Institute of Embedded Systems (InES) at ZHAW, with extensive experience in hardware development for NVIDIA computing modules, is now shipping its modular vision system based on the high-performance NVIDIA Xavier AGX.

To shorten the time to market, the prototyping system consists of a greatly reduced motherboard that can be equipped with different types of M.2-footprint modules to add functions like HDMI in and out, FPD-Link III in and out, USB-C, etc.

Due to the modular architecture, the user may configure a personalized system by choosing from a range of different modules. The low complexity and flexibility of the provided interfaces also make it easy to develop additional custom-made modules.

AGXC512: Nvidia-Xavier-AGX Carrier Board
(Configuration Example)
HDMI4KI2: 4k HDMI card with 2 Inputs
HDMI4KO2: 4k HDMI card with 2 outputs
FPDI4: FPD-Link III deserializer card with 4 inputs
USBC2: USB-C card with 2 x USB-C
NWG1: GBit Ethernet card

The system consists of a minimal motherboard, which only includes the circuitry necessary to start up and program the Xavier.

To keep size and costs low, the mainboard includes USB 2.0, HDMI and one 16-lane PCIe standard card slot. All other interfaces are provided by nine M.2-footprint sockets which share not only PCIe lanes, but also dedicated Xavier AGX interfaces like MIPI CSI, USB and I2C. An overview of possible configurations and currently available devices is given in the table below.

Front view with Nvidia-Xavier-AGX on top
Slot connections to Xavier and currently available Modules
Modules may easily be exchanged
  • The Video-Out socket provides two HDMI or DisplayPort lines, as well as the required I2C lines. It is also possible to configure one DisplayPort and one HDMI output. This slot also carries USB 2.0, in case the user wants to implement HDBaseT.
  • The USB socket allows two USB interfaces with USB connectors of choice (USB-A, USB-B, USB-C, USB On-The-Go). I2C, CAN bus and serial UART are also supported.
  • Video-In Modules 1 and 2 each support 2 CSI ports with 4 lanes or 4 CSI ports with 2 lanes, which allows connecting 4k cameras or HDMI-to-CSI converters.
  • The Network Module allows the connection of 1 Gbit Ethernet over RGMII or provides another PCIe lane for high-speed PCIe PHYs.
Available Modules for Xavier prototyping board

For more information, availability and pricing, please see contacts on the right.

Linux Driver for the LT6911UXC HDMI to MIPI CSI-2 Converter

The ZHAW Institute of Embedded Systems (InES), High Performance Multimedia Group, developed a 4k Video4Linux driver for the Lontium LT6911UXC HDMI to MIPI CSI-2 converter IC.
The driver was written for NVIDIA Jetson processors and enables the following features of the LT6911UXC:

  • Supports 4k HDMI 2.0 to MIPI CSI-2, requiring only one CSI port
  • Up to 4k resolution
  • Only 4 CSI lanes (one port) are required to receive 4k@30fps
  • Converts 4:2:2 YCbCr to CSI-2 YUV streams 1)
  • Converts RGB to RGB CSI-2 streams 1)

A driver for the more advanced LT6911GX, with HDMI 2.1 support and 4k@60fps on a single CSI-2 port, is in the pipeline at the InES HPMM group.
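Once the driver is loaded, the converter can be opened like any other V4L2 capture device. The snippet below is only a hypothetical quick test: the device node (/dev/video0), the UYVY pixel format and the 1080p30 mode are assumptions that depend on the board wiring and the Lontium firmware, and it requires an OpenCV build with GStreamer support (as typically found on Jetson).

```python
# Hypothetical capture test via OpenCV + GStreamer on a Jetson.
# /dev/video0, UYVY and 1920x1080@30 are assumptions, not fixed by the driver.
import cv2

pipeline = (
    "v4l2src device=/dev/video0 ! "
    "video/x-raw, format=UYVY, width=1920, height=1080, framerate=30/1 ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)

cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
if not cap.isOpened():
    raise RuntimeError("could not open capture pipeline")

ok, frame = cap.read()
if ok:
    cv2.imwrite("hdmi_frame.png", frame)  # save one frame as proof of capture
cap.release()
```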

Download:
The driver can be downloaded from here:
https://github.com/InES-HPMM/Lontium_lt6911uxc

1) Whether RGB or YUV streams are accepted depends on the corresponding LT6911UXC firmware provided by Lontium. Contact Lontium for the firmware.

Keywords: TC358743, HDMI to CSI converter, HDMI to CSI bridge, LT6911UXC, LT6911GX, MIPI, Jetson, Xavier

Secure Boot Concept for the Zynq Ultrascale+ MPSoC

The complexity of today's multiprocessor System-on-Chip (MPSoC) can lead to major security risks in embedded designs, as the available security functions are often unused or insufficiently utilized.

InES (Institute of Embedded Systems at ZHAW) developed a reference design which demonstrates a concept of a secure boot implementation and runtime system on a Xilinx Zynq Ultrascale+.

The security concept includes dedicated on-chip security features such as the AES, RSA and hashing cores. The reference design also describes how to implement voltage and temperature tamper detection. In addition, secure key storage and various methods to minimize key usage are provided. The demonstrator implements the Arm TrustZone technology with OP-TEE as a secure operating system.

Implementation examples and a usage description of the Linux Crypto-API, using the dedicated cryptographic cores, are also included in the documentation.
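As a small illustration of using the Linux Crypto-API from user space, the sketch below encrypts one block with AES-CBC through an AF_ALG socket. Whether the request is served by the Zynq's hardware AES core or a software implementation depends on the enabled kernel drivers and their priorities; the key and IV shown are dummy values.

```python
# Minimal user-space sketch of the Linux kernel Crypto-API via AF_ALG (Linux only).
# Which backend handles cbc(aes) depends on the kernel configuration;
# key and IV here are dummy values for illustration.
import socket

key = bytes(32)            # 256-bit dummy key
iv = bytes(16)             # dummy IV
plaintext = b"secret data padded to block size"  # 32 bytes, multiple of 16

alg = socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET)
alg.bind(("skcipher", "cbc(aes)"))
alg.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, key)

op, _ = alg.accept()
op.sendmsg_afalg([plaintext], op=socket.ALG_OP_ENCRYPT, iv=iv)
ciphertext = op.recv(len(plaintext))
print(ciphertext.hex())

op.close()
alg.close()
```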

The modular open-source reference design, which contains implementation examples for all of the above features, is provided on GitHub.

Please find the link to our secure boot reference design here:

https://github.com/InES-HPMM/ZYNQ_USplus_secure_boot_reference_design


Linux Driver for TI DS90UB95x FPD-Link III serializer and deserializer

The Institute of Embedded Systems at ZHAW developed a driver for the DS90UB954 deserializer and the DS90UB953 serializer from Texas Instruments. The driver was tested on the Raspberry Pi 4 as well as on NVIDIA Jetson Nano and Jetson Xavier modules.

FPD-Link III is a cost-effective solution for high-speed video transmission. The video data, the bidirectional configuration signal and the power supply are all transmitted over a single coaxial cable. At lengths of up to 15 m, the coaxial cable can support data rates of up to 6 Gbps.

In order to use our FPD-Link III driver (link driver) on different hardware, the driver was designed to be highly configurable. Additionally, the driver can be used with various FPD-Link III cameras. Instead of integrating the FPD-Link III part into already existing camera sensor drivers, the link driver is standalone and creates a transparent CSI and I2C link to the data source. This means that after the driver has set up the FPD-Link III connection, the sensor driver can be used without modifications.

Transparent Link

The following figure shows an example of a sensor driver which controls a camera sensor directly over I2C and configures the video interface.

Camera sensor pipeline

Adding an FPD-Link III connection means that the I2C interface and the video channel are routed through the coaxial cable. The deserializer and serializer are responsible for the conversion and forwarding of I2C and video data. The following figure shows the link driver configuring the deserializer and serializer over I2C. Once the setup is done, the sensor driver can configure the camera sensor over the I2C interface. Since the link is transparent, no changes have to be made to the sensor driver.

Camera sensor pipeline with FPD-Link III
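To make the transparency of the link concrete: once the deserializer and serializer are configured, the remote camera sensor simply appears on the local I2C bus. The sketch below reads a register through the link using smbus2; the bus number, sensor alias address and register offset are hypothetical and depend on the device tree configuration described below.

```python
# Hypothetical read of a camera sensor register *through* the FPD-Link III
# bridge. Bus number, sensor alias address and register offset are placeholders
# that depend on the device tree setup; the point is that no special API is
# needed - the remote sensor looks like a local I2C device.
from smbus2 import SMBus

I2C_BUS = 1          # I2C bus the deserializer is attached to (assumption)
SENSOR_ALIAS = 0x10  # alias address mapped by the link driver (assumption)
CHIP_ID_REG = 0x00   # example register offset (assumption)

with SMBus(I2C_BUS) as bus:
    chip_id = bus.read_byte_data(SENSOR_ALIAS, CHIP_ID_REG)
    print(f"sensor register 0x{CHIP_ID_REG:02x} = 0x{chip_id:02x}")
```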

Configurability of the Driver

The following configurations can be done in the device tree:
– I2C address of deserializer/serializer
– Number of MIPI CSI lanes (the camera sensor and the hardware do not need to have the same number of lanes)
– MIPI CSI lane speed
– Enable/disable continuous clock
– Enable/disable test pattern of deserializer/serializer
– Virtual channel ID mapping
– Configure GPIOs of deserializer/serializer
– Set I2C alias addresses

Follow this link for the driver source code and documentation. A detailed description of the device tree configurations can be found in ds90ub95.txt.

A hardware reference design for the Raspberry Pi can be found in our blog: Power over Coax FPD-Link III Link Streaming Adapter for Raspberry PI CSI-Interface

Power over Coax FPD-Link III Link Streaming Adapter for Raspberry PI CSI-Interface

The Institute of Embedded Systems at ZHAW has developed an open-source adapter which allows streaming from a CSI-2 camera interface to a Raspberry Pi. This makes it possible to connect cameras with a CSI interface via a long-distance cable (up to 15 m) to the CSI-2 input of a Raspberry Pi.


The long-range adapter uses FPD-Link III high-speed video transmission technology by utilizing the existing MIPI CSI-2 interfaces of the camera and the Raspberry Pi. A deserializer, based on the DS90UB954 from Texas Instruments (TI), converts the FPD-Link III signal back to CSI. The counterpart, located at the camera, is based on the DS90UB953 serializer from TI. With these two components it is possible to transmit high-speed video data over a single coaxial cable which can be up to 15 meters long. Another advantage of FPD-Link III is its power-over-coax (PoC) capability, which supplies the power required for the camera sensor directly from the Raspberry Pi.
The schematics and PCB designs are open source and available here.
A driver for the Texas Instruments DS90UB95x serializer and deserializer can be found in our blog Linux Driver for TI DS90UB95x FPD-Link III serializer and deserializer

Deep Learning for Classifying Food Waste

Amin Mazloumian
Hans-Joachim Gelke
Matthias Rosenthal

Institute of Embedded Systems, Zurich University of Applied Sciences, Zurich, Switzerland
amin.mazloumian@zhaw.ch

One third of the food produced in the world for human consumption – approximately 1.3 billion tons – is lost or wasted every year. By classifying the food waste of individual consumers and raising awareness, avoidable food waste can be significantly reduced. In this research, we use deep learning to classify food waste in half a million images captured by cameras installed on top of food waste bins. We specifically designed a deep neural network that classifies food waste each time food waste is thrown into the waste bins. Our method shows how deep learning networks can be tailored to best learn from the available training data.

In this paper, a more informative view of food waste production behavior at the consumption stage is achieved by classifying food waste in waste bins. The classification task is made feasible by processing images of the food waste in the bins, captured by cameras installed on top of the waste bins that monitor the top surfaces of the food waste. This study focuses on classifying food waste in half a million such images. The system design of a smart garbage system that uses our classification is outside the scope of this study.

The automatic classification of food waste in waste bins is technically a difficult computer vision task for the following reasons:
a) It is visually hard to differentiate between edible and non-edible food waste. As an example, consider distinguishing between eggs and empty eggshells.
b) The same food classes come in a wide variety of textures and colors when cooked or processed.
c) Liquid food waste, e.g. soups and stews, and soft food waste, e.g. chopped vegetables and salads, can largely hide and cover the visual features of other food classes.

In this research, we adopt a deep convolutional neural network approach for classifying food waste in waste bins. Deep convolutional neural networks are supervised machine learning algorithms that are able to perform complicated tasks on images, videos, sound, text, etc. The networks are composed of tens of convolutional layers (deep) that train on labelled data (supervised training) to learn target tasks. Labelled training data is composed of thousands of input-output pairs. In the training phase, the networks learn to produce the expected training output (labels) given the training input data. The training is performed by calculating millions of parameter values for the feature-extraction convolutional filters. In image processing, the first layers of trained deep convolutional networks detect simple features, e.g. edges and corners. Based on the low-level features extracted in the first layers, deeper layers detect higher-level features such as contours and shapes.
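For readers unfamiliar with this kind of setup, the sketch below shows a generic transfer-learning baseline in PyTorch. It is not the network designed in the paper: a pretrained backbone is reused as a feature extractor and only the final classification layer is replaced and trained, with a hypothetical number of food-waste classes and dummy data standing in for labelled bin images.

```python
# Generic transfer-learning baseline (illustration only, not the paper's model).
# A pretrained ResNet-18 backbone supplies the low-level features; only the
# final fully connected layer is replaced and trained for food-waste classes.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 8  # hypothetical number of food-waste classes

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # torchvision >= 0.13
for param in model.parameters():          # freeze the pretrained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable head

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

# One illustrative training step on a dummy batch standing in for bin images.
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (4,))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.3f}")
```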

For more information please read our paper:
Deep Learning for Classifying Food Waste

Artificial Intelligence on Microcontrollers

Using artificial intelligence algorithms, specifically neural networks, on microcontrollers offers many possibilities but also poses challenges: limited memory, low computing power and no operating system. In addition, an efficient workflow to port neural network algorithms to microcontrollers is required. Currently, several frameworks that can be used to port neural networks to microcontrollers are available. We evaluated and compared four of them.

The frameworks differ considerably in terms of workflow, features and performance. Depending on the application, one has to select the best-suited framework. On our GitHub page we offer guides and example applications which can help you get started with these frameworks!
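As an example of a typical step in such a workflow, the sketch below converts a small placeholder Keras model into a fully int8-quantized TensorFlow Lite flatbuffer, which could then be embedded into firmware (for example as a C array) and executed with TensorFlow Lite for Microcontrollers. The model architecture and the calibration data are assumptions for illustration only.

```python
# Sketch: post-training int8 quantization of a placeholder Keras model for
# TensorFlow Lite for Microcontrollers. Model and calibration data are dummies.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),
])

def representative_data():
    # Calibration samples used to determine quantization ranges (dummy data).
    for _ in range(100):
        yield [np.random.rand(1, 32, 32, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)  # flatbuffer to embed into the firmware
```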

The neural networks generated with all of these frameworks are static. This means that once they are integrated into the firmware they can't be changed anymore. However, it would be beneficial if the neural network running on the microcontroller could adapt itself to a changing domain. We developed an algorithm (emb-adta) which can be used for unsupervised domain adaptation on microcontrollers. The prototype Python implementation is also available on GitHub!
