A common narrative in practice sounds something like this: “people claim data protection is important to them, but in reality they give away everything on the internet anyway”. There are also some scientific studies that seem to confirm this again and again: that we are generally careless with our own and other people’s personal data, and that we consider data protection important but neglect it in everyday life. For example, a “pizza experiment” with 3,000 students at a US university in 2017 concluded that a free pizza was enough of an incentive to reveal the email addresses of three fellow students (Athey et al. 2017).
Many Internet users inside and outside the European Union are very familiar with cookie banners: they pop up on websites, they are often annoying, and it is tedious to really deal with them. Having to state our data sharing and protection preferences over and over again is a questionable concept in itself. But even if we accept the concept of cookie banners as a given, our behavior towards them seems paradoxical at first glance.
ANNPR, the “International Workshop on Artificial Neural Networks in Pattern Recognition”, is a biennial academic conference where researchers come together to discuss the most recent advances in the fields of neural networks, deep learning and artificial intelligence as applied to pattern recognition. Pattern recognition is the field of computer science concerned with making sense of data such as images (“What do we see in the picture?”), audio data (for example, to recognize spoken words) or time-dependent inputs such as weather or stock-market data. This year’s edition was organized by Frank-Peter Schilling and Thilo Stadelmann from ZHAW’s Institute of Applied Informatics (InIT) and took place from 2 to 4 September.
We concluded a compelling interdisciplinary project on the topic of digitalization, in which we applied a selection of fundamental data science methods: web scraping, data wrangling with Elasticsearch and Kibana, data cleaning, counting, posing questions and searching for answers in the data. We would like to share some results on this blog.
The project was called “DIGITAL COMMUNICATION STRATEGIES FOR THE CULTURAL SECTOR IN THE BODENSEE REGION”. Its data analysis module dealt with the question of how digitalization was actually implemented in the Lake Constance region, using the example of cultural providers such as museums, galleries, exhibitions and theatres in the region. In this article, we use the terms “Lake Constance region” and “Bodensee region” interchangeably, since Bodensee is the German name for Lake Constance.
Can a prisoner be released early, or released on
bail? A judge who decides this should also consider the risk of
recidivism of the person to be released. Wouldn’t it be an
advantage to be able to assess this risk objectively and reliably?
This was the idea behind the COMPAS system developed in the US. The
system makes an individual prediction of the chance of recidivism for
imprisoned offenders, based on a wide range of personal data. The
result is a risk score between 1 and 10, where 10 corresponds to a
very high risk of recidivism. This system has been used for many
years in various U.S. states to support decision making of judges –
more than one million prisoners have already been evaluated using
COMPAS. The advantages are obvious: the system produces an objective
risk prediction that has been developed and validated on the basis of
thousands of cases.
In May 2016, however, the journalists’ association ProPublica published
the results of research suggesting that this software systematically
discriminates against black people and overestimates their risk
(Angwin et al. 2016): 45 percent of black offenders who did not
reoffend after their release were identified as high-risk. In the
corresponding group of whites, however, only 23 percent were
attributed a high risk by the algorithm. This means that the
probability of being falsely assigned a high risk of recidivism is
twice as high for a black person as for a white person.
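
The comparison above is a comparison of false positive rates: among people who did not reoffend, the share that was nevertheless rated high-risk. The following minimal sketch illustrates that calculation. The counts are hypothetical, chosen only so that they reproduce the 45 and 23 percent quoted above; they are not the actual ProPublica data, and the names `GroupCounts` and `false_positive_rate` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    """Outcome counts among people who did NOT reoffend."""
    false_positives: int  # rated high-risk, but did not reoffend
    true_negatives: int   # rated low-risk, and did not reoffend


def false_positive_rate(counts: GroupCounts) -> float:
    """Share of non-reoffenders who were wrongly rated high-risk."""
    non_reoffenders = counts.false_positives + counts.true_negatives
    return counts.false_positives / non_reoffenders


# Hypothetical counts per 100 non-reoffenders in each group,
# set to match the percentages reported in the text.
groups = {
    "black": GroupCounts(false_positives=45, true_negatives=55),
    "white": GroupCounts(false_positives=23, true_negatives=77),
}

rates = {name: false_positive_rate(c) for name, c in groups.items()}
ratio = rates["black"] / rates["white"]

for name, rate in rates.items():
    print(f"{name}: false positive rate = {rate:.0%}")
print(f"ratio = {ratio:.2f}")
```

With these illustrative numbers the ratio comes out at roughly 1.96, which is where the statement “twice as high” comes from.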
Kurt Stockinger was invited to contribute a blog post to ACM SIGMOD, the leading worldwide community for database research. The post discusses recent technological advances in natural language interfaces to databases. The ultimate goal is to talk to a database (almost) as one would to a human.
The full blog post can be found at the following ACM SIGMOD link:
In a lecture for the Fair Data Forum, I dealt with the question “What value does data protection have for individuals and what are they willing to pay for it?”
The three data privacy types
As always, there is not one “individual”: everyone has different data protection preferences and thus attributes a different value to having personal data safeguarded. Therefore, in order to classify individuals, there are different “typologies”. For example, Westin distinguishes between data protection fundamentalists, data protection pragmatists and completely unconcerned individuals. Sheehan (2002) surveyed 889 people in the USA and classified them using a questionnaire. Conclusion: 16% of the respondents were completely unconcerned about data protection, 81% were classified as pragmatists, and 3% as fundamentalists.
The aim of the PhD Network in Data Science is to offer students with a master’s degree (including degrees from a university of applied sciences) the opportunity to obtain a PhD in cooperation between a university of applied sciences and a university.
The PhD Network in Data Science is supported by Swissuniversities. It is a cooperation between three departments of ZHAW Zurich University of Applied Sciences (School of Management and Law, Life Science and Facility Management, School of Engineering), three departments of the University of Zurich (Faculty of Science, Faculty of Business, Economics and Informatics, Faculty of Arts and Social Sciences), the Faculty of Science at the University of Neuchatel and the Department of Innovative Technologies at SUPSI University of Applied Sciences and Arts of Southern Switzerland.
PhD students work on applied research projects at the university of applied sciences and are supervised jointly by a supervisor at the university and a co-supervisor at the university of applied sciences. They are enrolled in the regular PhD programs of the partner universities and have to go through the standard admission procedure. After successful completion, they receive the doctorate of the respective partner university (UZH or UNIBE). The PhD Network is also open to students with a master’s degree from a university of applied sciences. They, however, have to go through convergence programs (specific to the respective faculty) for admission to the partner universities.
The final results of an interdisciplinary study on “Quantified Self”, funded by “TA Swiss” and carried out with the participation of the Datalab, have been published. The study was performed by three ZHAW departments (School of Health Professions, School of Management and Law, School of Engineering) in cooperation with the Institute for Futures Studies and Technology Assessment, Berlin. The focus of the Datalab was on legal and Big Data aspects of quantified self.
In 2014, ZHAW Datalab started the SDS conference series. It was the year with only one Swiss data scientist identifiable on LinkedIn (at Postfinance…). The year when we talked about “Big Data”, and not “Digitization”. The year when we were unsure whether such a thing as a Swiss data science community existed, and whether it would actually come to such an event.
SDS grew from a local workshop to a conference with over 200 participants and international experts as keynote speakers in 2016. This was the year when a Swiss-wide network of strong partners from academia and industry finally emerged to push innovation in data-driven value creation: the Swiss Alliance for Data-Intensive Services (www.data-service-alliance.ch). We as datalabbers were instrumental in founding this alliance, and then found it to be the perfect partner to take this event to the next level of professionalism.