AMW Summer School 2020

The AMW Summer School is a two-day event (May 18 - 19) preceding the Alberto Mendelzon Workshop on Foundations of Data Management 2020. The event consists of multiple tutorials aimed at a mixed audience of students and other interested attendees.

Summer School Speakers

Leopoldo Bertossi (Universidad Adolfo Ibáñez, Chile)

Explanations in Data Management and Artificial Intelligence

Explainable AI is one of the most active areas of research in AI and machine learning these days. In this tutorial we will review some recent approaches to providing explanations in data management, knowledge representation and machine learning. In particular, causality-based approaches, model-based diagnosis, and score-based explanations will be described and applied to query answering and consistency assessment in databases, and to the outcomes of classification models.

Speaker Bio

Leopoldo Bertossi was Full Professor at the School of Computer Science, Carleton University (Ottawa, Canada) from 2001 to 2019. In September 2019 he took up a full professorship at Universidad Adolfo Ibáñez (UAI, Chile), the oldest and most prestigious fully private university in Chile. Since August 2018 he has also been a Senior Computer Scientist at RelationalAI Inc., and since 2019 a senior member of the Millennium Institute for Foundational Research on Data (IMFD, Chile), a 10-year initiative funded by the Government of Chile. He obtained a PhD in Mathematics from the Pontifical Catholic University of Chile (PUC) in 1988, with a PhD thesis on mathematical logic (model theory) under the supervision of Prof. Joerg Flum (University of Freiburg, Germany). Prof. Bertossi's research interests include data science, database theory, data management, the semantic web, intelligent information systems, data management for business intelligence, knowledge representation, uncertain reasoning, logic programming, computational logic, and statistical relational learning.


Marco Calautti (University of Edinburgh, UK)

Tutorial title (TBA)

Uncertain and imprecise data occur increasingly often in today's information systems, and can arise for different reasons, including the integration of data originating from different sources. This phenomenon is widely recognized nowadays, to the point that it has been included, under the name "Veracity", as one of the 4 V's of Big Data (the others being Volume, Velocity and Variety). Dealing with such data poses different challenges, and a fundamental one is obtaining meaningful answers to our queries. The goal of this tutorial is to discuss the notion of inconsistent data, the challenges that this kind of data raises, and approaches to deal with such issues, with a particular focus on query answering. We will first overview the classical Consistent Query Answering (CQA) approach, by introducing the framework and some fundamental results. We will then identify some of the limitations of classical CQA, and discuss how one can overcome them by means of more refined approaches. We will see that obtaining refined query answers comes at a price, as the complexity of query answering increases, in general, w.r.t. classical CQA. Thus, we will discuss how query answers can be approximately computed under the new frameworks, by means of efficient approximation algorithms.
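To make the CQA idea concrete, here is a minimal, self-contained sketch (a toy illustration, not from the tutorial itself): a relation violating a key constraint has several "repairs" (maximal consistent subsets), and a tuple is a consistent answer exactly when it is an answer in every repair. The relation name `Emp`, the key, and the data are all hypothetical.

```python
from itertools import combinations

# Toy instance of a relation Emp(name, salary) with key "name".
# The two tuples for "alice" violate the key constraint.
emp = [("alice", 50), ("alice", 60), ("bob", 40)]

def is_consistent(instance):
    # Key constraint: no two tuples share the same name.
    names = [t[0] for t in instance]
    return len(names) == len(set(names))

def repairs(instance):
    # Repairs = maximal subsets of the instance satisfying the constraint.
    out = []
    for r in range(len(instance), 0, -1):
        for subset in combinations(instance, r):
            if is_consistent(subset) and not any(set(subset) < set(s) for s in out):
                out.append(subset)
    return out

def consistent_answers(instance, query):
    # A tuple is a consistent answer iff it is an answer in every repair.
    answer_sets = [set(query(rep)) for rep in repairs(instance)]
    return set.intersection(*answer_sets)

# Query 1: names of all employees -- "alice" survives in every repair.
q_names = lambda inst: [t[0] for t in inst]
print(sorted(consistent_answers(emp, q_names)))           # ['alice', 'bob']

# Query 2: full tuples -- alice's salary differs across repairs,
# so only bob's tuple is a consistent answer.
q_pairs = lambda inst: list(inst)
print(sorted(consistent_answers(emp, q_pairs)))           # [('bob', 40)]
```

This brute-force enumeration of repairs is exponential in general, which is exactly why the complexity and approximation questions discussed in the tutorial matter.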

Speaker Bio

(TBA)


Fatma Ozcan (IBM Almaden Research Center, USA)

Natural Language Interfaces to Databases

Users need to learn and master a complex query language like SQL or SPARQL to access data in knowledge bases and databases. A more natural alternative is to query and explore the data through natural language interfaces. The main challenge in natural language querying of data is identifying the user's intent. To interpret the user's natural language query, many systems try to identify the entities in the query and the relationships between them. There are many entity-based solutions in the literature, with varying complexity of the queries that they can generate. Recently, machine learning and deep learning based techniques have also become popular. In this tutorial, we will review state-of-the-art natural language interface solutions in terms of their interpretation approach, as well as the complexity of the queries they can generate.

Speaker Bio

(TBA)


Jorge Pérez (Universidad de Chile, Chile)

Vectorial Representation of Words based on Neural Networks

Word Embeddings (WE) are vectorial representations of words in a low-dimensional space. WE are usually constructed by training neural networks on huge text corpora and have revolutionized the field of Natural Language Processing. In this tutorial we will survey the main methods for constructing WE, as well as some applications to text classification. We will also present code and pretrained WE models for the Spanish language that the audience can later use in their own applications. Finally, we will discuss new ways of constructing contextualized WE with the now very famous BERT model. BERT code and models in Spanish will also be presented in the tutorial.
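The core idea behind WE can be sketched in a few lines: each word is a vector, and semantically similar words end up close together, typically measured by cosine similarity. The 3-dimensional vectors below are made up purely for illustration; real embeddings (e.g. word2vec or GloVe) have hundreds of dimensions learned from large corpora.

```python
import math

# Hypothetical toy embeddings, hand-picked so that "king" and "queen"
# point in similar directions while "apple" points elsewhere.
embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    # Cosine similarity: dot product divided by the product of norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Related words score high; unrelated words score low.
print(cosine(embeddings["king"], embeddings["queen"]))  # close to 1
print(cosine(embeddings["king"], embeddings["apple"]))  # much smaller
```

In practice one would load pretrained vectors rather than hand-craft them; the geometric intuition, however, is exactly this.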

Speaker Bio

Jorge Pérez is an Associate Professor in the Department of Computer Science at Universidad de Chile, and an Associate Researcher at the Millennium Institute for Foundational Research on Data (IMFD). His research interests include data exchange and integration, Web data, and the theory of modern neural network architectures. He has received several awards for his research, including best paper awards at five international conferences, the Microsoft Research PhD Fellowship, and the SWSA Ten-Year Award for his work on query languages for Web data. His interests also include the analysis of social, medical and political text data, particularly in Spanish.