Guest Speaker: Yu Su
480 Dreese Labs
2015 Neil Ave, Columbus, Ohio 43210
Bridging the Gap between Human and Data with AI
Data-driven problem solving and decision making are among the most common activities in daily life. For example, doctors make diagnostic decisions by gathering information from patient inquiry and examination. The rise of big data, such as electronic medical records and digitized scientific literature, promises unprecedented opportunities for better-informed decision making. However, as data becomes more massive and heterogeneous, the gap between users and data is growing quickly, in stark contrast to this promise: accessing and analyzing even very simple data requires extensive training, which is not economical for casual users who use data only occasionally and on demand.
In this talk, I will discuss our research efforts on developing AI techniques to bridge the gap between users and data. In the first part of the talk, I will discuss how to construct knowledge bases, which contain structured knowledge about entities and their relationships, from massive unstructured text, and present a new approach to building more robust relation extractors using global statistics collected from the entire corpus. In the second part of the talk, I will discuss how to construct natural language interfaces to various kinds of data, such as knowledge bases and relational databases, so that users can query data using natural language questions instead of writing SQL-like formal queries. I will discuss in depth our explorations of the benchmarking and portability issues of natural language interfaces.
Bio: Yu Su is a Ph.D. candidate in Computer Science at the University of California, Santa Barbara. He obtained his bachelor's degree in Computer Science from Tsinghua University in 2012. His research interests are in the areas of data mining and natural language processing, towards the overarching goal of enabling seamless access to massive and heterogeneous data. His recent research includes natural language interfaces (to knowledge bases, relational databases, APIs, etc.) and knowledge base construction from text. He has regularly published in and served at top venues in both data mining and natural language processing. His recent service includes being PC co-chair of the first workshop on Knowledge Base Construction, Reasoning and Mining at WSDM'18. He has interned at Microsoft Research Redmond, IBM T.J. Watson Research Center, and the U.S. Army Research Laboratory.
Host: Arnab Nandi