Research has shown that internet users care about their privacy, but they do not have the time or legal expertise to understand the privacy policies of all the websites they visit or all the mobile apps they use. Closing this gap in online notice and choice is the goal of the Usable Privacy Policy Project, an NSF-funded project to extract salient details from privacy policies and present them to internet users in ways that respond to their needs. I will present results from our work showing that crowdworkers can answer questions about privacy policies with high accuracy and that automated methods can identify important details in policy texts, such as statements about data collection and users’ privacy options. I will then present some spinoff work on identifying the visual organization of text in documents from the web.

Dr. Shomir Wilson is an Assistant Professor in the College of Information Sciences and Technology at Penn State. He received his Ph.D. in Computer Science from the University of Maryland in 2011 and held postdoctoral positions in Carnegie Mellon University’s School of Computer Science and the University of Edinburgh’s School of Informatics. Dr. Wilson’s research interests span natural language processing (NLP), privacy, and artificial intelligence. He is director of the Human Language Technologies Lab at Penn State and a member of the Usable Privacy Policy Project. His recent work with the project uses a combination of crowdsourcing, text classification, and machine learning to extract key details from the privacy policies of websites and mobile apps. He is also interested in computational discourse, particularly in the roles that metalanguage and visual organization play in structuring text for readers. Additionally, he collaborates with researchers to apply NLP methods to a variety of problems.

Goldberg Computer Science Building, Dalhousie University

Date: 05/12/2018
Time: 11:30 am - 12:30 pm