Category: Blog (page 1 of 3)

Algorithmic Fairness – Algorithms and Social Justice

By Christoph Heitz (ZHAW)

translated from the original German-language version published at Inside IT

Can a prisoner be released early, or released on bail? A judge who decides this should also consider the risk of recidivism of the person to be released. Wouldn't it be an advantage to be able to assess this risk objectively and reliably? This was the idea behind the COMPAS system developed by the US company Northpointe.

The system makes an individual prediction of the probability of recidivism for imprisoned offenders, based on a wide range of personal data. The result is a risk score between 1 and 10, where 10 corresponds to a very high risk of recidivism. The system has been used for many years in various U.S. states to support judges' decision making – more than one million prisoners have already been evaluated with COMPAS. The advantages are obvious: the system produces an objective risk prediction that has been developed and validated on the basis of thousands of cases.

In May 2016, however, the investigative journalism organization ProPublica published research suggesting that this software systematically discriminates against black people and overestimates their risk (Angwin et al. 2016): 45 percent of black offenders who did not reoffend after their release had been classified as high-risk. In the corresponding group of white offenders, only 23 percent were assigned a high risk by the algorithm. This means that the probability of being falsely assigned a high risk of recidivism is about twice as high for a black person as for a white person.
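ProPublica's criterion can be stated precisely: it compares false positive rates, i.e. the share of non-reoffenders who were nevertheless labelled high-risk, across groups. As a minimal sketch in R, with randomly generated data rather than the actual COMPAS figures, such a check might look like this:

```r
# Group-wise false positive rate: among people who did NOT reoffend,
# what share was nevertheless labelled high-risk?
# Synthetic data for illustration only - not the actual COMPAS figures.
set.seed(1)
n <- 1000
df <- data.frame(
  group      = sample(c("A", "B"), n, replace = TRUE),
  reoffended = sample(c(TRUE, FALSE), n, replace = TRUE),
  high_risk  = sample(c(TRUE, FALSE), n, replace = TRUE)
)

fpr <- function(d) mean(d$high_risk[!d$reoffended])
sapply(split(df, df$group), fpr)   # one false positive rate per group
```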

Continue reading

Twistbytes Approach to the Hierarchical Classification Shared Task at GermEval 2019

by Fernando Benites (ZHAW and SpinningBytes)

cross-posted from GitHub

We explain here, step by step, how to reproduce the results of our approach and discuss parts of the paper. The approach aimed at building a strong baseline for the task – one that deep learning approaches should beat. Our deep learning models did not manage to do so, so we submitted the baseline itself and placed second in the flat problem and first in the hierarchical task (subtask B). The baseline builds on strong placements in several previous shared tasks, and although it is essentially a clever form of keyword spotting, it does a very good job. Code and data can be accessed in the repository GermEval_2019.

Continue reading

Twist Bytes @Vardial 2018

by Fernando Benites (ZHAW and SpinningBytes)

cross-posted from the SpinningBytes blog

schwiiz ja* (Swiss German for "Switzerland, yes")

This year, the SpinningBytes team participated in the VarDial competition, where we achieved second place in the German Dialect Identification shared task. The task's goal was to identify which region the speaker of a given sentence comes from, based on the dialect he or she speaks. Dialect identification is an important NLP task; for instance, in a speech-to-text context, identifying the dialect makes it possible to load a specialized model. In this blog post, we give a step-by-step walkthrough of how to create the model in Python, comparing it to previous years' approaches.
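The full walkthrough builds the model in Python; as a toy illustration of the underlying idea, here is a minimal sketch in R (invented sentences and labels, not the shared-task data or model) that classifies a sentence by character trigram overlap:

```r
# Toy dialect identification: character trigrams plus a simple
# best-overlap classifier. Sentences and labels are invented.
train <- data.frame(
  text  = c("grüezi mitenand", "griaß eich beinand", "moin moin ihr lüüd"),
  label = c("CH", "BY", "NORD"),
  stringsAsFactors = FALSE
)

trigrams <- function(s) {
  s <- tolower(s)
  n <- nchar(s)
  substring(s, 1:(n - 2), 3:n)   # all character trigrams of s
}

# Score a new sentence by trigram overlap with each training sentence
classify <- function(s) {
  scores <- sapply(train$text, function(t)
    length(intersect(trigrams(s), trigrams(t))))
  train$label[which.max(scores)]
}

classify("grüezi wohl")   # "CH" - overlaps most with the Swiss sentence
```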

Continue reading

ZHAW Datalab organizes Data Science Event in Silicon Valley

By Kurt Stockinger (ZHAW)

As part of "Zürich meets San Francisco – A Festival of Two Cities", the ZHAW Datalab co-organized the event Data Science and Beyond: Technical, Economic and Societal Challenges, which took place on the campus of San José State University (SJSU), in the heart of Silicon Valley. One interesting fact about SJSU: among all US universities, it has the highest number of graduates who go on to jobs at Apple or Cisco.

Continue reading

Book Review: Paul D. Ellis, The Essential Guide to Effect Sizes

Reviewed by Thoralf Mildenberger (ZHAW)

  • Paul D. Ellis, The Essential Guide to Effect Sizes. Statistical Power, Meta-Analysis and the Interpretation of Research Results. Cambridge University Press, Cambridge 2010. Link to book on publisher's website.

In the last few years, statistical hypothesis testing – with the p-value still being THE standard for reporting results in many fields of science – has increasingly been criticized. Many researchers have even called for abandoning the "NHST" (Null Hypothesis Significance Testing) approach altogether. I think this is going too far, as many problems are due to misapplication of the techniques and – perhaps even more importantly – misinterpretation of the results. There is also no consensus on how to replace hypothesis testing with a better methodology: some of the more moderate critics suggest using confidence intervals, but while these are often more informative, they are essentially equivalent to hypothesis tests and share some of the same problems. This makes it all the more important to highlight difficulties in the correct application and interpretation of statistical methodology.
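One of the book's central points, that statistical significance and practical importance are different things, is easy to demonstrate. A minimal R sketch with made-up numbers:

```r
# With a large enough sample, even a negligible effect yields a tiny
# p-value. The numbers here are made up for illustration.
set.seed(42)
n <- 100000
x <- rnorm(n, mean = 0)
y <- rnorm(n, mean = 0.02)   # true difference: 0.02 standard deviations

t.test(x, y)$p.value         # highly "significant" despite the tiny effect

# Cohen's d, a standardized effect size, shows how small the effect is:
(mean(y) - mean(x)) / sqrt((var(x) + var(y)) / 2)   # approximately 0.02
```

Continue reading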

Study on “Quantified Self” Published: Links to Book and Summary

By Kurt Stockinger (ZHAW)

The final results of an interdisciplinary study on "Quantified Self", funded by "TA Swiss" and carried out with participation of the Datalab, have been published. The study was performed by three ZHAW departments (School of Health Professions, School of Management and Law, School of Engineering) in cooperation with the Institute for Futures Studies and Technology Assessment, Berlin. The Datalab's focus was on the legal and Big Data aspects of the quantified self.

The results are available in various forms:

Enjoy reading, and maybe it will encourage you to "quantify yourself" a bit better 😉

PhD Network in Data Science

By Dirk Wilhelm (ZHAW)

Reposted from https://blog.zhaw.ch/industrie4null/2018/01/10/phd-network-in-data-science/

Students can now pursue a doctorate in Data Science at ZHAW, in cooperation with the University of Zurich or the University of Neuchâtel. Continue reading

Artificial Intelligence in Industry and Finance

2nd European COST Conference on Mathematics for Industry in Switzerland
September 7, 2017
Zurich University of Applied Sciences, Technikumstr. 71, 8400 Winterthur

By Jörg Osterrieder (ZHAW)

Below you will find a short recap and an outlook on our next conference on September 6, 2018.

Aim of the conference

The aim of this conference was to bring together European academics, young researchers, students and industrial practitioners to discuss the application of Artificial Intelligence to various practical fields. In a broader context, we wanted to promote «Mathematics for Industry» in Switzerland as part of the European COST (Cooperation in Science and Technology) Action "Mathematics for Industry", in which members of ZHAW serve on the management committee for Switzerland. Continue reading

R: Reduce() Part 2 – some pitfalls using Reduce

By Matthias Templ (ZHAW), Thoralf Mildenberger (ZHAW)

By way of example, the functionality of Reduce() was shown in https://blog.zhaw.ch/datascience/r-reduce-applys-lesser-known-brother/. It's great to learn how to use this function on interesting problems. If you are ready (that is, if you have read the first blog post on Reduce()), we want to push you further towards writing efficient code. Continue reading

R: Reduce() – apply’s lesser known brother

By Thoralf Mildenberger (ZHAW)

Everybody who knows a bit about R knows that loops are generally said to be evil and should be avoided, both for efficiency and for code readability, although one could argue about both points.

The usual advice is to use vector operations and apply() and its relatives. sapply(), vapply() and lapply() work by applying a function to each element of a vector or list and return a vector, matrix, array or list of the results. apply() applies a function along one of the dimensions of a matrix or array and returns a vector, matrix or array. These are very useful, but they only work if the function can be applied to each element independently of the others.
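As a quick illustration of this elementwise pattern (a minimal sketch, not taken from the post itself):

```r
# sapply(): apply a function to each element of a vector independently
sapply(1:5, function(x) x^2)   # 1 4 9 16 25

# apply(): apply a function along one dimension of a matrix
m <- matrix(1:6, nrow = 2)
apply(m, 1, sum)               # row sums: 9 12
```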

There are cases, however, where we would still use a for loop because the result of applying our operation to an element of the list depends on the results for the previous elements. The R base package provides a function Reduce() which can come in handy here. It is of course inspired by functional programming, and it actually does something similar to the Reduce step in MapReduce, although it is not intended for big data applications. Since it seems to be little known even to long-time R users, we will look at a few examples in this post.
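As a small taste of what the full post covers, a minimal sketch of Reduce() in action:

```r
# Reduce() folds a binary function over a vector; each step uses the
# result of the previous one
Reduce(`+`, 1:5)                      # 15, same as sum(1:5)
Reduce(`+`, 1:5, accumulate = TRUE)   # 1 3 6 10 15, same as cumsum(1:5)

# The same pattern replaces a for loop with sequential dependence,
# e.g. compounding a starting capital of 100 over three yearly rates:
rates <- c(0.01, 0.02, 0.015)
Reduce(function(capital, r) capital * (1 + r), rates, init = 100)
```

Continue reading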

« Older posts