Textual Hydraulics: Mining Online Newspapers to Detect Physical, Social, and Institutional Water Management Infrastructure

John T. Murphy
Seminar

Abstract:
Water issues have always been crucial in the U.S. Southwest and are increasingly pressing for other areas of the U.S. and the world. Water management is undertaken in a rich physical and social context, which includes an important historical component. Infrastructure decisions, such as the establishment of towns and cities and the construction of dams and canals, can initially open possibilities but later constrain the adaptive options available to a town, city, or region that begins to suffer increased water-related stresses. The historically situated social context, including the ways that water rights are legally defined, jurisdictional and institutional boundaries, and social, economic, or cultural factors, can shape the way that decisions are made about water acquisition, distribution, and conservation. The mechanisms through which these decisions play out can therefore vary across locales and lead to different effects with different costs and benefits; optimal decisions may be rare.

A project currently being undertaken in collaboration with researchers at the University of Alaska Anchorage and the University of New Hampshire is examining these issues using open-source textual data. Data mining of newspaper articles obtained freely via the Internet is being used to ask questions about water management issues in several initial test cases, all in the U.S. Southwest. Four comparative cases from Colorado (Grand Junction), Nevada (Las Vegas), and Arizona (Flagstaff, Tucson) are examined. A mixture of rule-based, supervised, and unsupervised data mining approaches is employed to determine the degree to which the physical and social water-management infrastructure and the cognitive, culturally mediated perceptions of water and water issues in these areas are reflected in these text sources. The initial efforts focus, first, on determining which of the source documents are genuinely related to the water topics of interest; with these documents singled out, other analyses can be conducted that relate published discussions to real-world events (e.g., floods). Still other efforts extract lists of each study area's water management authorities (institutions and individuals), from which further analyses (e.g., network analyses) can be conducted. Future efforts will include incorporating local decision making and environmental characteristics into agent-based models of coupled social and biophysical systems. An overview of the goals and current status of this project will be presented.

Bio:
John T. Murphy received his PhD in Anthropology from the University of Arizona in 2009 and is currently a Computational Postdoctoral Fellow at Argonne National Laboratory and a researcher at the University of Chicago Computation Institute. He is an affiliated researcher at the Complexity Center in the University of Arizona School of Anthropology, and is the Communications Officer for the Computational Social Science Society of the Americas (CSSSA). His work focuses on computational modeling of complex adaptive social systems, especially archeologically attested irrigation systems, and on social science modeling on high-performance computing platforms.