Big Data and Public History

Clavert, Frédéric <University of Luxembourg>; Wieneke, Lars <University of Luxembourg>

Please use this identifier to cite or link to this item: http://elea.unisa.it/xmlui/handle/10556/6158

Title:	Big Data and Public History
Authors:	Clavert, Frédéric <University of Luxembourg>; Wieneke, Lars <University of Luxembourg>
Keywords:	Big data;Digitization;Born digital;Distant reading;Metadata;Visualization;Crowdsourcing;Collective memory
Issue Date:	2022
Citation:	Frédéric Clavert, Lars Wieneke, "Big Data and Public History", in Handbook of Digital Public History, edited by Serge Noiret, Mark Tebeau and Gerben Zaagsma, Berlin, Boston: De Gruyter Oldenbourg, 2022, pp. 447-458
Abstract:	In this chapter, we define big data in history in three ways: (1) big data implies the use of an amount of data that the historian’s personal computer cannot deal with; (2) the data historians are using must be either directly linked to primary sources (digitized primary sources) or must be a primary source itself (born-digital primary sources); (3) big data implies a redefinition of some aspects of historians’ methodologies. We then try to identify the challenges and uses of big data in public history, which we consider as multifaceted ways to study, deepen and empower public connections to the past. For instance, the study of social networks online can help us better understand public uses of the past while the provision and application of large databases for historical primary and secondary sources can help us establish links with a wide non-academic audience. In particular, we discuss two opportunities offered by big data to public historians: the study of artefacts of collective memory as well as crowdsourcing, especially to collect primary sources that would not be accessible otherwise, or to improve them by collaboratively transcribing images of sources into texts. Born-digital sources are now quite numerous online (newsgroups, web archives, social media), and we illustrate the potential that born-digital primary sources offer public historians through the #ww1 project, which is based on a large database of Tweets written during the centenary of World War I. Using distant reading techniques and by looking at the specific case of the French “dead for France” database and how it was tweeted, we show how we can study the way that people engage with their own national collective memory and national history. We argue that this kind of study does not necessarily require a strong digital infrastructure. The chapter then focuses on en masse digitization projects, particularly when they involve crowdsourcing as part of the digitization effort. 
Going beyond very well-known examples (Google Books, Gallica, and Europeana newspapers), we look at the case of “What’s on the menu?”, a project to digitize the New York Public Library’s historical collections of the city’s restaurants’ menus. Looking at the crowdsourcing part of the project, we show that it overcomes the pitfalls of the most well-known large digitization projects, in particular poor OCR quality. It also invites citizens to participate in the development of their town’s historical narrative, together with historians. In conclusion, we draw the reader’s attention to the limitations of big data. Focusing particularly on two of its pitfalls – inequality of access and what is outside big data – we emphasize that big data does not mean complete or representative data. Data, datasets, and big data remain socially constructed objects. Inequality of access is here understood in two ways: data itself is not always accessible, and not everybody can access online services. For instance, big data does not document life in rich Western urban areas the same way as it documents life in the same countries’ poorer rural areas. To put it another way: inequalities in data accessibility undermine the possibilities of creating public history projects that are not mainstream or based on mainstream data. Furthermore, many aspects of our lives are not documented by big data. In other words, the large-scale digitization of sources casts shadows on and influences both our research and the way that citizens empower themselves to develop their own historical narratives. Nevertheless, the pitfalls of big data should not prevent us from making use of it in public history, but the historian’s and citizen’s critical thinking is key to its proper use.
URI:	http://dx.doi.org/10.14273/unisa-4250 http://elea.unisa.it:8080/xmlui/handle/10556/6158 https://doi.org/10.1515/9783110430295-040
ISBN:	e-ISBN: 978-3-11-043029-5 978-3-11-043922-9
Appears in Collections:	Contributi in volume / Contributions in books

