As presented in a prior post, Singer.io is a modern, open-source ETL (Extract, Transform and Load) framework for integrating data from various sources, including online datasets and services, with a focus on being simple and light-weight. The basics of the framework were explored in our last post on the topic, so we will refer you to that if you are unfamiliar.
This post is about our process for deploying Singer to the cloud, more specifically, to the Cloud Foundry open source cloud application platform. This was done in the context of researching the maturity of data transformation tools in a cloud-native environment. We will explore the options for deploying Singer taps and targets to a cloud provider and discuss our implementation and deployment process in detail.
Many providers of hosted services, including cloud applications, are subject to a contradiction in handling log data. On the one hand, storing logs consumes resources and should be minimised or avoided altogether to save resource cost. On the other hand, regulatory constraints such as keeping the data for the purpose of future audits exist. A smart solution to encode the data appropriately needs to be found. The coding encompasses both compression, to keep resource use low, and encryption, to prevent leaking information to unauthorised parties, for instance when logging for the purpose of intrusion detection. On an algorithmic level, the encoded data should still be usable for computation, in particular comparison and search. In this blog post, based on the didactic log example shown in the figure below, we present algorithms and architectures to handle cloud log files in a smart way.
In Switzerland, opendata.swiss is the go-to location for any open dataset resulting from federal, cantonal or municipal sources. From a societal and economics perspective, the portal is an important asset following the “protect private data, make use of public data” mantra, and has already led to digital innovation through the availability of many third-party applications. In this research blog post, we look at some numbers associated with the portal.
Singer.io is an open-source JSON-based data shifting (ETL: extract, transform, load) framework, designed to bring simplicity when moving data between a source and a destination service on the Internet. In this post, we present the framework as entry point into the world of SaaS-level data exchange and some associated research questions.
Docker images have become the valuta franca in the cloud and container platform world. Although on the path to vendor-neutral standardisation (e.g. with OCI also being in Docker Hub for a year now), developers for now have settled on plain Docker as de-facto standard due to the vast ecosystem of base images and dependency images which speed up the rapid prototyping of complex scalable applications. From a production-grade DevOps perspective, a key concern is then to be assured that the containers used are of high quality, not infected by security vulnerabilities, and still containing the latest features available. In this blog post, a novel approach to visualise the situation around a particular container image is presented.
The MAO-MAO research collaboration aims to provide metrics, analytics and quality control for microservice artefacts of all kinds, including but not limited to, Docker containers, Helm charts and AWS Lambda functions. As such, an integral part of prior research has been the various periodic data collection experiments, gathering metadata and conducting automatic code analysis.
However, the ambition of the project to collect data consistently, combined with the need for the collaborators to be able to use each other’s tools and access each other’s data, have created a need for a collaboration framework and distributed execution platform.
In response to this need, we present the first release of the MAO Orchestrator, a tool designed to run these experiments in a smart way and on a schedule, within a federated cluster across research sites. As a plus, there is nothing implementation-wise tying it to the existing assessment tools, so it is reusable for any use-case that requires collaboratively running periodic experiments.
When cloud application developers are working with docker-compose to combine multiple microservices into a single manageable entity, they can make some easy mistakes. To prevent these mistakes, they can rely on internal validation logic, which however does not catch many of the typical issues. Therefore, researchers at the Service Prototyping Lab at Zurich University of Applied Sciences wrote a dedicated quality check and assessment tool targeting developers, but also students trying to learn the technology, which has a wider range of checks. The DCValidator tool is available as a web application (see demo instance) or command-line interface. This blog post describes how to check that docker-compose files are free of issues.
The occasion of the 3.3.2 release of the rating-charging-billing solution for cloud software and platform providers, Cyclops, is a good opportunity for a deep dive into the new forecasting engine, the how and the why of its functionality and how to use it.
First, a bit of news. Active maintenance and further updates to Cyclops will now be found under the repository https://github.com/serviceprototypinglab/cyclops. The primary new addition is the forecasting engine. It helps SaaS/PaaS/CaaS/…XaaS providers to not only charge customers for their services, but also predict a revenue flow for deciding about future investments.