As we have recently been granted Google Cloud Research Credits for the investigation of Serverless Data Integration, we continue our exploration of open and public data. This HOWTO-style blog post presents the application domain of financial analytics and explains how to run a cloud function to achieve elastically scalable analytics. Although there are no research results to report yet, it raises a couple of interesting challenges that we or other computer scientists should work on in the future.
Self-management is an important property of software services, as it lets them better exploit the beneficial characteristics of the underlying runtime systems. Whether such services run in a managed cloud environment, on a device or somewhere else in the computing continuum, there may always be limitations in the managing runtime platform that a complementary or overarching application-level management can help to overcome. Using a Python Flask-based web service as an example, this research blog post reports on our ongoing investigations into two specific self-management aspects: runtime resilience and feistiness.
As presented in a prior post, Singer.io is a modern, open-source ETL (Extract, Transform and Load) framework for integrating data from various sources, including online datasets and services, with a focus on being simple and light-weight. The basics of the framework were explored in our last post on the topic, so we will refer you to that if you are unfamiliar.
This post is about our process for deploying Singer to the cloud, more specifically, to the Cloud Foundry open source cloud application platform. This was done in the context of researching the maturity of data transformation tools in a cloud-native environment. We will explore the options for deploying Singer taps and targets to a cloud provider and discuss our implementation and deployment process in detail.
This post was first published on Medium by Leonardas (Badrie) Persaud, one of the students involved in this project. It is republished here because the project was carried out as part of the Software Maintenance and Evolution course run by Sebastiano, and the project itself was supervised by Seán. The students involved were UZH CS Master’s students Badrie L. Persaud, Bill Bosshard and William Martini; all project-related content is in the project’s GitHub repo.
Many providers of hosted services, including cloud applications, are subject to a contradiction in handling log data. On the one hand, storing logs consumes resources and should be minimised or avoided altogether to save resource cost. On the other hand, regulatory constraints such as keeping the data for the purpose of future audits exist. A smart solution to encode the data appropriately needs to be found. The coding encompasses both compression, to keep resource use low, and encryption, to prevent leaking information to unauthorised parties, for instance when logging for the purpose of intrusion detection. On an algorithmic level, the encoded data should still be usable for computation, in particular comparison and search. In this blog post, based on the didactic log example shown in the figure below, we present algorithms and architectures to handle cloud log files in a smart way.
In Switzerland, opendata.swiss is the go-to location for any open dataset resulting from federal, cantonal or municipal sources. From a societal and economic perspective, the portal is an important asset following the “protect private data, make use of public data” mantra, and has already led to digital innovation through the availability of many third-party applications. In this research blog post, we look at some numbers associated with the portal.
Singer.io is an open-source JSON-based data shifting (ETL: extract, transform, load) framework, designed to bring simplicity when moving data between a source and a destination service on the Internet. In this post, we present the framework as an entry point into the world of SaaS-level data exchange and some associated research questions.
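To give a flavour of what "JSON-based" means here, the following is a minimal sketch of a Singer-style tap: a program that writes one JSON message per line, using the SCHEMA, RECORD and STATE message types from the Singer specification. The stream name `users`, the sample rows and the helper names `emit` and `run_tap` are assumptions made purely for illustration; a real tap would typically read from a live source and write to standard output so that a target can consume the stream.

```python
import io
import json


def emit(message, out):
    """Write a single Singer message as one line of JSON."""
    out.write(json.dumps(message) + "\n")


def run_tap(rows, out, stream="users"):
    """Emit a SCHEMA message, one RECORD per row, then a STATE bookmark.

    The stream name and the schema are hypothetical examples.
    """
    emit({
        "type": "SCHEMA",
        "stream": stream,
        "schema": {
            "type": "object",
            "properties": {
                "id": {"type": "integer"},
                "name": {"type": "string"},
            },
        },
        "key_properties": ["id"],
    }, out)
    for row in rows:
        emit({"type": "RECORD", "stream": stream, "record": row}, out)
    if rows:
        # The STATE value is free-form; here it remembers the last id seen.
        emit({"type": "STATE", "value": {"last_id": rows[-1]["id"]}}, out)


# Illustrative run against an in-memory buffer instead of stdout.
buffer = io.StringIO()
run_tap([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}], buffer)
lines = buffer.getvalue().splitlines()
messages = [json.loads(line) for line in lines]
for line in lines:
    print(line)
```

Passing `sys.stdout` as `out` instead of the buffer would let the output be piped into any program that reads the same line-delimited messages, which is the composition model that taps and targets rely on.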
Docker images have become the valuta franca in the cloud and container platform world. Although on the path to vendor-neutral standardisation (e.g. with OCI also being in Docker Hub for a year now), developers have for now settled on plain Docker as the de facto standard due to the vast ecosystem of base images and dependency images, which speeds up the rapid prototyping of complex scalable applications. From a production-grade DevOps perspective, a key concern is then to be assured that the containers used are of high quality, free of security vulnerabilities, and still contain the latest features available. In this blog post, a novel approach to visualise the situation around a particular container image is presented.
The MAO-MAO research collaboration aims to provide metrics, analytics and quality control for microservice artefacts of all kinds, including but not limited to, Docker containers, Helm charts and AWS Lambda functions. As such, an integral part of prior research has been the various periodic data collection experiments, gathering metadata and conducting automatic code analysis.
However, the ambition of the project to collect data consistently, combined with the need for the collaborators to be able to use each other’s tools and access each other’s data, has created a need for a collaboration framework and distributed execution platform.
In response to this need, we present the first release of the MAO Orchestrator, a tool designed to run these experiments in a smart way and on a schedule, within a federated cluster across research sites. As a plus, there is nothing implementation-wise tying it to the existing assessment tools, so it is reusable for any use-case that requires collaboratively running periodic experiments.