In two previous blogposts (1, 2) I introduced the amazing Connected Island project Iza and I have been working on recently. This third blogpost about the Connected Island project will introduce our method for analysing publications and their citations. We will briefly discuss how citation network analysis works and the issues surrounding its applications. Finally, we will look at the very first results of this project: an analysis of publications about the Middle and Lower Palaeolithic in Hungary.
Hungarian Houses of Parliament
Citation network analysis
Recently, the wider availability of powerful computational resources, bibliometric software (e.g. HISTCITE; PAJEK; PUBLISH OR PERISH) and large bibliographic datasets in the sciences as well as the humanities has resulted in significant progress in the analysis of citation networks. In such networks vertices represent publications and a directed edge (or arc) between two vertices indicates a citation (Eom and Fortunato, 2011).
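To make the vertex-and-arc description concrete, here is a minimal Python sketch of such a network. The publication IDs and citation pairs are invented for illustration and are not taken from the project's dataset, and the function name `build_citation_network` is our own.

```python
from collections import defaultdict

# Hypothetical toy data: each pair is (citing publication, cited publication).
citations = [
    ("C1970", "A1955"),
    ("C1970", "B1962"),
    ("B1962", "A1955"),
    ("D1985", "C1970"),
]

def build_citation_network(pairs):
    """Return (vertices, arcs); arcs maps each citing publication to the set it cites."""
    vertices = set()
    arcs = defaultdict(set)
    for citing, cited in pairs:
        vertices.update((citing, cited))
        arcs[citing].add(cited)  # directed arc: citing -> cited
    return vertices, dict(arcs)

vertices, arcs = build_citation_network(citations)
print(sorted(vertices))
print({citing: sorted(cited) for citing, cited in sorted(arcs.items())})
```

A publication that cites nothing within the network (here the oldest one) simply has no entry in `arcs`, but it still appears in `vertices`.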
The foundations of citation network analysis were laid by Garfield et al. (1964) and the application of graph theory for citation network analysis was subsequently explored by Garner (1967). Despite this long tradition, its use in an archaeological context has not yet been thoroughly explored. In a number of studies researchers used simple counts of citations or other bibliometric data to track trends in the archaeological sciences and compare the impact and evolution of archaeological journals (e.g. Butzer, 2009; Marriner, 2009; Rehren et al., 2008; Rosenswig, 2005; Sterud, 1978), or to evaluate the impact of gender differentiation in archaeology (e.g. Beaudry and White, 1994; Hutson, 2002; 2006; Victor and Beaudry, 1992).
Citation network analyses in the Arts and Humanities are rare (Leydesdorff et al., 2011). The main reason for this is that the available citation databases for the Arts and Humanities (in particular the Institute for Scientific Information’s Arts and Humanities Citation Index) have significant limitations (Nederhof, 2006): books were until recently not indexed and publications in languages other than English are rare. However, monographs (rather than peer-reviewed journal articles) are often the dominant format of cited sources in the Humanities. Disciplines in the Arts and Humanities also show very different citation patterns and should therefore be considered separately (Knievel and Kellsey, 2005). Despite these shortcomings, citation analyses in the Arts and Humanities should not be discarded out of hand: they can still provide an alternative look at scientific practice through large aggregated datasets, as long as the nature of the datasets and their limitations are thoroughly understood.
We came across some of these obstacles very early on during data collection for this project. Existing citation databases, like Web of Knowledge, contained only a fraction of the publications we were interested in. The publications that are indexed there are mostly written in English by Western European researchers (with a few exceptions); publications in Hungarian, Polish, Czech, Slovakian, or Russian are only rarely included. Manual data collection was therefore necessary.
A first test: the Lower and Middle Palaeolithic in Hungary
As a test-case we explored a small part of the project’s dataset, containing the 31 synthetic publications about the Lower and Middle Palaeolithic in Hungary we found in Budapest’s libraries. This collection of publications was written by nine Hungarian archaeologists between 1945 and 1990. This case-study aims to explore the citation patterns between them.
Chronological plot of citation network of Hungarian Palaeolithic researchers. Nodes are publications and directed lines are citations. Colours reflect publication language.
One would expect the older publications to be the most prominent since these had the time to accumulate the largest number of citations, and the results do show this process to some extent. Using the input domain measure (de Nooy et al., 2005: p. 193) we found that a few publications from the 50’s and early 60’s can be reached from a larger number of nodes than any of the publications from the late 60’s and later, which indicates that these few publications influenced (directly or indirectly) the largest number of other publications. All of these publications with a high input domain were in fact written by a single author, László Vértes, who, although very often cited by his colleagues, is guilty of quite a bit of self-citation as well. Although self-citation is common in academia and completely understandable (one always builds on one’s previous research), we needed to evaluate to what extent this affects the analytical techniques used. In this case the input domain seems to reflect largely the citation behaviour of one scholar who was extremely active throughout several decades.
Input domain score of publications: the number of publications that can be connected to a certain publication via a sequence of citations. This reflects the potential field of influence of a publication.
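The input domain defined in this caption can be sketched in a few lines of Python. This is only an illustration of the definition, not the software routine we ran: the citation pairs are hypothetical, the function name `input_domain` is our own, and we assume that a publication is not counted in its own input domain.

```python
from collections import defaultdict, deque

# Hypothetical toy data: each pair is (citing publication, cited publication).
pairs = [
    ("C1970", "A1955"),
    ("C1970", "B1962"),
    ("B1962", "A1955"),
    ("D1985", "C1970"),
]

def input_domain(pairs):
    """Map each publication to the number of others that reach it via citations."""
    cited_by = defaultdict(set)  # reversed arcs: cited -> set of citing publications
    vertices = set()
    for citing, cited in pairs:
        cited_by[cited].add(citing)
        vertices.update((citing, cited))

    scores = {}
    for start in vertices:
        # Breadth-first search backwards along the citations.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for citer in cited_by.get(node, ()):
                if citer not in seen:
                    seen.add(citer)
                    queue.append(citer)
        scores[start] = len(seen) - 1  # exclude the publication itself
    return scores

scores = input_domain(pairs)
print(sorted(scores.items()))
```

In this toy network the oldest publication has the largest input domain because every later publication reaches it directly or through a chain of citations, which is the same effect of age described above.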
Another way of evaluating the relative prominence of old and more recent publications is to look at the number of citations they received. It is interesting to note that the oldest as well as the recent publications receive a relatively small number of citations compared to a few publications from the mid- to late-60’s. One of these is a monograph edited by Vértes on one of the most important Middle Palaeolithic sites in Hungary, Tata, which also received a high input domain score. The second highly cited publication was a book about the Middle Palaeolithic in Hungary also written by Vértes. The third most frequently cited work is a monograph about another prominent Middle Palaeolithic site, Érd, written by Veronika Gábori-Csánk.
In citation network analysis authoritative sources are often defined as publications that receive a high number of citations, particularly from so-called hubs. Hubs are defined as publications that cite a lot of other works, especially authorities. Given these definitions we can identify the site monographs of Tata and Érd as well as the second highly cited book by Vértes as such authorities. The hubs in this network are three publications by the same author: Miklós Gábori. All three of these publications are reviews of the Hungarian Palaeolithic and due to their very nature will include a lot of references, especially to key site reports.
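The mutually dependent definitions of hubs and authorities are commonly operationalised with Kleinberg's HITS iteration. The Python sketch below assumes that approach for illustration; it is not a record of the exact procedure used for the results above. The publication IDs are hypothetical (two "reviews" and a "paper" citing three "site reports"), and the names `hits` and `normalise` are our own.

```python
import math

# Hypothetical toy data: each pair is (citing publication, cited publication).
pairs = [
    ("Review1", "SiteA"), ("Review1", "SiteB"), ("Review1", "SiteC"),
    ("Review2", "SiteA"), ("Review2", "SiteB"),
    ("Paper1", "SiteA"),
]

def normalise(scores):
    """Scale the scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(value * value for value in scores.values()))
    if norm == 0:
        return dict(scores)
    return {key: value / norm for key, value in scores.items()}

def hits(pairs, iterations=50):
    """Return (hub, authority) score dictionaries after a fixed number of iterations."""
    arcs = set(pairs)  # ignore duplicate citation pairs
    vertices = sorted({pub for arc in arcs for pub in arc})
    hub = {v: 1.0 for v in vertices}
    auth = {v: 1.0 for v in vertices}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the publications citing it.
        new_auth = {v: 0.0 for v in vertices}
        for citing, cited in arcs:
            new_auth[cited] += hub[citing]
        # Hub score: sum of the authority scores of the publications it cites.
        new_hub = {v: 0.0 for v in vertices}
        for citing, cited in arcs:
            new_hub[citing] += new_auth[cited]
        auth = normalise(new_auth)
        hub = normalise(new_hub)
    return hub, auth

hub, auth = hits(pairs)
print("top authority:", max(auth, key=auth.get))
print("top hub:", max(hub, key=hub.get))
```

In this toy network the most cited site report becomes the top authority and the review with the longest reference list becomes the top hub, which mirrors why review articles tend to surface as hubs.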
The above measures very much over-emphasize the most cited publications and the work of the most active authors. We should note, however, that six works in this citation network are not cited or do not cite any others. These include publications from the 60’s by Vértes and Gábori, a few publications from the 50’s that seem to have been ignored by all those who followed, and the most recent publications from 1988 and 1990 that could not have been cited by others in this network.
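Counting citations received and given, and listing the publications with none, takes only a few lines of Python. The sketch below uses hypothetical publication IDs and citation pairs, and the function name `citation_counts` is our own. The full list of publications is passed in separately, because a work with no citations at all never appears in a citation pair.

```python
from collections import Counter

# Hypothetical toy data; "E1990" neither cites nor is cited within the network.
publications = ["A1955", "B1962", "C1970", "D1985", "E1990"]
pairs = [
    ("C1970", "A1955"),
    ("C1970", "B1962"),
    ("B1962", "A1955"),
    ("D1985", "C1970"),
]

def citation_counts(publications, pairs):
    """Map each publication to (citations received, citations given)."""
    arcs = set(pairs)  # ignore duplicate citation pairs
    received = Counter(cited for _, cited in arcs)
    given = Counter(citing for citing, _ in arcs)
    return {pub: (received[pub], given[pub]) for pub in publications}

counts = citation_counts(publications, pairs)
never_cited = sorted(p for p, (r, g) in counts.items() if r == 0)
cites_nothing = sorted(p for p, (r, g) in counts.items() if g == 0)
isolated = sorted(set(never_cited) & set(cites_nothing))

print("never cited:", never_cited)
print("cites nothing:", cites_nothing)
print("isolated:", isolated)
```

Keeping the three lists apart is useful because they mean different things: the most recent works are never cited simply because nothing later is in the network, while the oldest works cannot cite anything inside it.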
Language of Publication
On the basis of the small sample of publications gathered in Budapest we can say that the widely held assumption that archaeological data from Central Europe was published only in local languages is incorrect (Table 2). At least half, if not more, of Central European archaeology publications from this period were published in German, French or English alongside the national language. The image that all countries under the influence of the former Soviet Union published in Russian is equally incorrect.
Hungarian researchers in the case study, number of publications per language, and publishing date of publications included in the case study.
We can conclude that although the effects of self-citation were definitely felt in this analysis, especially for those authors for whom we included multiple publications, like Vértes or Gábori-Csánk, there are a number of publications that can be considered most pivotal in Hungarian Palaeolithic studies. These include the site reports of Tata and Érd.
Contrary to popular belief, Hungarian authors rarely published exclusively in their own language. Key site reports and synthetic works in particular were written in German, French or English, making them accessible to Western European archaeologists.
This blog post has explored the citation behaviour within a subset of the project’s dataset, and has concluded that Hungarian Palaeolithic archaeologists cited Central European and famous Western European scholars almost equally. Publications were almost always written in English, French or German, in addition to Hungarian, making most of them accessible to Western European archaeologists. But did the latter build on the work done by their Hungarian colleagues to improve their understanding of the European Lower and Middle Palaeolithic? Future work in this project will focus on the interactions between Western and Central European researchers.
Beaudry, M., & White, J. 1994. Cowgirls with the Blues? A Study of Women’s Publication and the Citation of Women’s Work in Historical Archaeology. In C. Claassen (ed) Women in Archaeology, 138–158. Philadelphia: University of Pennsylvania Press.
Butzer, K.W. 2009. Evolution of an interdisciplinary enterprise: the Journal of Archaeological Science at 35 years. Journal of Archaeological Science 36(9): p.1842–1846.
Eom, Y.-H., & Fortunato, S. 2011. Characterizing and Modeling Citation Dynamics M. Perc (ed). PLoS ONE 6(9): p.e24926.
Garfield, E., Sher, I.H., & Torpie, R.J. 1964. The use of citation data in writing the history of science. Philadelphia: Institute for Scientific Information.
Garner, R. 1967. A computer-oriented graph theoretic analysis of citation index structures. In B. Flood (ed) Three Drexel information science research studies, 3–46. Philadelphia: Drexel Press.
Hutson, S. 2002. Gendered citation practices in American Antiquity and other archaeology journals. American Antiquity 67(2): p.331–342.
Hutson, S.R. 2006. Self-Citation in Archaeology: Age, Gender, Prestige, and the Self. Journal of Archaeological Method and Theory 13(1): p.1–18.
Knievel, J.E., & Kellsey, C. 2005. Citation analysis for collection development: a comparative study of eight humanities fields. The Library Quarterly 75(2): p.142–168.
Leydesdorff, L., Hammarfelt, B., & Salah, A. 2011. The structure of the Arts & Humanities Citation Index: A mapping on the basis of aggregated citations among 1,157 journals. Journal of the American Society for Information Science and Technology 62(12): p.2414–2426.
Marriner, N. 2009. Currents and trends in the archaeological sciences. Journal of Archaeological Science 36(12): p.2811–2815.
Nederhof, A. 2006. Bibliometric monitoring of research performance in the Social Sciences and the Humanities: a review. Scientometrics 66(1): p.81–100.
Nooy, W. de, Mrvar, A., & Batagelj, V. 2005. Exploratory social network analysis with Pajek. Cambridge; New York: Cambridge University Press.
Rehren, T., Grattan, J., & Klein, R. 2008. Going strong, and growing. Journal of Archaeological Science 35: p.94305.
Rosenswig, R. 2005. A tale of two antiquities: Evolving editorial policies of the SAA journals. The SAA Archeological Record 5(1): p.15–21.
Sterud, E. 1978. Changing Aims of Americanist Archaeology: A Citations Analysis of American Antiquity, 1946–1975. American Antiquity 43(2): p.294–302.
Victor, K., & Beaudry, M. 1992. Women’s Participation in American Prehistoric and Historic Archaeology: A Comparative Look at the Journals American Antiquity and Historical Archaeology. In C. Claassen (ed) Exploring Gender through Archaeology, 11–22. Madison, Wisconsin: Prehistory Press.