Connected Island: Citation Network Analysis

In two previous blogposts (1, 2) I introduced the amazing Connected Island project Iza and I have been working on recently. This third blogpost about the Connected Island project will introduce our method for analysing publications and their citations. We will briefly discuss how citation network analysis works and the issues surrounding its applications. Finally, we will look at the very first results of this project: an analysis of publications about the Middle and Lower Palaeolithic in Hungary.

Hungarian Houses of Parliament

Citation network analysis

In recent years, the wider availability of powerful computational resources, bibliometric software (e.g. HISTCITE; PAJEK; PUBLISH OR PERISH) and large bibliographic datasets in the sciences as well as the humanities has led to significant progress in the analysis of citation networks. In such networks vertices represent publications, and a directed edge (or arc) between two vertices indicates a citation (Eom and Fortunato, 2011).
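
To make the vertex-and-arc description concrete, here is a minimal Python sketch of how such a network could be stored and how citations received could be counted. The publication labels are invented for illustration and are not taken from the project's dataset, and the function names are our own rather than those of any of the software packages mentioned above.

```python
# Hypothetical toy data: each (citing, cited) pair is one arc of the network.
citations = [
    ("C_1970", "A_1955"),
    ("C_1970", "B_1964"),
    ("B_1964", "A_1955"),
    ("D_1982", "B_1964"),
]


def build_citation_network(arcs):
    """Return a dict mapping each publication to the set of works it cites."""
    cites = {}
    for citing, cited in arcs:
        cites.setdefault(citing, set()).add(cited)
        # Publications that are only ever cited must still appear as vertices.
        cites.setdefault(cited, set())
    return cites


def citations_received(cites):
    """Count the incoming arcs (citations received) of every publication."""
    counts = {pub: 0 for pub in cites}
    for cited_set in cites.values():
        for cited in cited_set:
            counts[cited] += 1
    return counts


network = build_citation_network(citations)
received = citations_received(network)
```

Storing the arcs as "who cites whom" keeps the direction of each citation explicit, which matters for every measure discussed later in this post.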

The foundations of citation network analysis were laid by Garfield et al. (1964) and the application of graph theory for citation network analysis was subsequently explored by Garner (1967). Despite this long tradition, its use in an archaeological context has not yet been thoroughly explored. In a number of studies researchers used simple counts of citations or other bibliometric data to track trends in the archaeological sciences and compare the impact and evolution of archaeological journals (e.g. Butzer, 2009; Marriner, 2009; Rehren et al., 2008; Rosenswig, 2005; Sterud, 1978), or to evaluate the impact of gender differentiation in archaeology (e.g. Beaudry and White, 1994; Hutson, 2002; 2006; Victor and Beaudry, 1992).

Citation network analyses in the Arts and Humanities are rare (Leydesdorff et al., 2011). The main reason for this is that the available citation databases for the Arts and Humanities (in particular the Institute for Scientific Information’s Arts and Humanities Citation Index) have significant limitations (Nederhof, 2006): books were until recently not indexed and publications in languages other than English are rare. However, monographs (rather than peer-reviewed journal articles) are often the dominant format of cited sources in the Humanities. Disciplines in the Arts and Humanities also show very different citation patterns and should therefore be considered separately (Knievel and Kellsey, 2005). Despite these shortcomings, citation analyses in the Arts and Humanities should not be discarded out of hand: they can still provide an alternative look at scientific practice through large aggregated datasets, as long as the nature of the datasets and their limitations are thoroughly understood.

We came across some of these obstacles very early on during data collection for this project. Existing citation databases, like Web of Knowledge, contained only a fraction of the publications we were interested in. Those that are indexed there were mostly written in English by Western European researchers (with a few exceptions); publications in Hungarian, Polish, Czech, Slovak, or Russian are only rarely included. Manual data collection was therefore necessary.

A first test: the Lower and Middle Palaeolithic in Hungary

As a test-case we explored a small part of the project’s dataset, containing the 31 synthetic publications about the Lower and Middle Palaeolithic in Hungary we found in Budapest’s libraries. This collection of publications was written by nine Hungarian archaeologists between 1945 and 1990. This case-study aims to explore the citation patterns between them.

Chronological plot of citation network of Hungarian Palaeolithic researchers. Nodes are publications and directed lines are citations. Colours reflect publication language.

One would expect the older publications to be the most prominent, since they have had the most time to accumulate citations, and the results do show this process to some extent. Using the input domain measure (de Nooy et al., 2005: p. 193) we found that a few publications from the 1950s and early 1960s can be reached from a larger number of nodes than any of the publications from the late 1960s onwards, which indicates that these few publications influenced (directly or indirectly) the largest number of other publications. All of these publications with a high input domain were in fact written by a single author, László Vértes, who was very often cited by his colleagues but is guilty of quite a bit of self-citation as well. Although self-citation is common in academia and completely understandable (one always builds on one’s previous research), we needed to evaluate to what extent it affects the analytical techniques used. In this case the input domain seems largely to reflect the citation behaviour of one scholar who was extremely active over several decades.
Input domain score of publications: the number of publications that can be connected to a certain publication via a sequence of citations. This reflects the potential field of influence of a publication.
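
As a rough illustration of the measure described in the caption above, the sketch below counts how many publications can reach a given publication by following citations. It is a plain breadth-first search written for this post, not the routine Pajek itself uses, and the toy network is invented.

```python
from collections import deque

# Hypothetical toy network: publication -> set of publications it cites.
toy_cites = {
    "A_1955": set(),
    "B_1964": {"A_1955"},
    "C_1970": {"B_1964"},
    "D_1982": {"C_1970"},
    "E_1960": set(),
}


def input_domain(cites, target):
    """Number of publications that can reach `target` via a chain of citations."""
    # Reverse the arcs so that we can walk from a work to the works citing it.
    cited_by = {}
    for citing, cited_set in cites.items():
        for cited in cited_set:
            cited_by.setdefault(cited, set()).add(citing)

    seen = {target}
    queue = deque([target])
    while queue:
        pub = queue.popleft()
        for citing in cited_by.get(pub, ()):
            if citing not in seen:
                seen.add(citing)
                queue.append(citing)
    return len(seen) - 1  # the target itself is not part of its own domain


domain_sizes = {pub: input_domain(toy_cites, pub) for pub in toy_cites}
```

In this toy example A_1955 is cited directly only once, yet its input domain is three, because C_1970 and D_1982 reach it indirectly. This is also why a long chain of self-citations by one prolific author can inflate the score.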

Another way of evaluating the relative prominence of old and more recent publications is to look at the number of citations they received. It is interesting to note that the oldest as well as the most recent publications received relatively few citations compared to a few publications from the mid- to late 1960s. One of these is a monograph edited by Vértes on Tata, one of the most important Middle Palaeolithic sites in Hungary; it also received a high input domain score. The second highly cited publication is a book about the Middle Palaeolithic in Hungary, also written by Vértes. The third most frequently cited work is a monograph about another prominent Middle Palaeolithic site, Érd, written by Veronika Gábori-Csánk.

In citation network analysis, authoritative sources are often defined as publications that receive a high number of citations, particularly from so-called hubs. Hubs are defined as publications that cite a lot of other works, especially authorities. Given these definitions we can identify the site monographs of Tata and Érd, as well as the second highly cited book by Vértes, as authorities. The hubs in this network are three publications by the same author, Miklós Gábori. All three are reviews of the Hungarian Palaeolithic and by their very nature include a lot of references, especially to key site reports.
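
The mutually dependent definitions of hubs and authorities can be made concrete with a short iterative sketch. The version below follows the general idea of Kleinberg's hubs-and-authorities (HITS) iteration; that the software used for this analysis computes its scores in exactly this way is an assumption on our part, and the toy publications are invented.

```python
import math

# Hypothetical toy network: publication -> set of publications it cites.
# Every cited publication must also appear as a key.
toy_network = {
    "site_report_1": set(),
    "site_report_2": set(),
    "review_1": {"site_report_1", "site_report_2"},
    "review_2": {"site_report_1", "site_report_2"},
    "short_note": {"site_report_1"},
}


def normalise(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {pub: v / norm for pub, v in scores.items()}


def hubs_and_authorities(cites, iterations=50):
    """Return (hub, authority) score dicts after a fixed number of iterations."""
    hub = {pub: 1.0 for pub in cites}
    auth = {pub: 1.0 for pub in cites}
    for _ in range(iterations):
        # A good authority is cited by good hubs.
        auth = {pub: 0.0 for pub in cites}
        for citing, cited_set in cites.items():
            for cited in cited_set:
                auth[cited] += hub[citing]
        auth = normalise(auth)
        # A good hub cites good authorities.
        hub = normalise({pub: sum(auth[c] for c in cites[pub]) for pub in cites})
    return hub, auth


hub_scores, authority_scores = hubs_and_authorities(toy_network)
top_authority = max(authority_scores, key=authority_scores.get)
```

In the toy data the two reviews end up as the strongest hubs and the first site report as the strongest authority, which mirrors the pattern of reviews citing key site reports described above.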

The above measures very much over-emphasize the most cited publications and the work of the most active authors. We should note, however, that six works in this citation network are not cited or do not cite any others. These include publications from the 1960s by Vértes and Gábori, a few publications from the 1950s that seem to have been ignored by all those who followed, and the most recent publications from 1988 and 1990, which could not have been cited by others in this network.

Language of Publication

On the basis of the small sample of publications gathered in Budapest we can say that the widely held assumption that archaeological data from Central Europe was published in local languages is incorrect (Table 2). At least half, if not more, of Central European archaeology publications from this period were published in German, French or English alongside the national language. The image that all countries under the influence of the former Soviet Union published in Russian is incorrect.

Hungarian researchers in the case study, number of publications per language, and publishing date of publications included in the case study.

Conclusions

We can conclude that, although the effects of self-citation were definitely felt in this analysis, especially for authors such as Vértes or Gábori-Csánk of whom we included multiple publications, a number of publications can be considered pivotal in Hungarian Palaeolithic studies. These include the site reports of Tata and Érd.

Contrary to popular belief, Hungarian authors rarely published in their own language alone. Key site reports and synthetic works in particular were written in German, French or English, making them accessible to Western European archaeologists.

This blog post has explored the citation behaviour within a subset of the project’s dataset, and has concluded that Hungarian Palaeolithic archaeologists cited Central European and famous Western European scholars almost equally. Publications were almost always written in English, French or German, in addition to Hungarian, making most of them accessible to Western European archaeologists. But did the latter build on the work done by their Hungarian colleagues to improve their understanding of the European Lower and Middle Palaeolithic? Future work in this project will focus on the interactions between Western and Central European researchers.

Bibliography

Beaudry, M., & White, J. 1994. Cowgirls with the Blues? A Study of Women’s Publication and the Citation of Women’s Work in Historical Archaeology. In C. Claassen (ed) Women in Archaeology, 138–158. Philadelphia: University of Pennsylvania Press.

Butzer, K.W. 2009. Evolution of an interdisciplinary enterprise: the Journal of Archaeological Science at 35 years. Journal of Archaeological Science 36(9): p.1842–1846.

Eom, Y.-H., & Fortunato, S. 2011. Characterizing and Modeling Citation Dynamics. M. Perc (ed). PLoS ONE 6(9): p.e24926.

Garfield, E., Sher, I.H., & Torpie, R.J. 1964. The use of citation data in writing the history of science. Philadelphia: Institute for Scientific Information.

Garner, R. 1967. A computer-oriented graph theoretic analysis of citation index structures. In B. Flood (ed) Three Drexel information science research studies, 3–46. Philadelphia: Drexel Press.

Hutson, S. 2002. Gendered citation practices in American Antiquity and other archaeology journals. American Antiquity 67(2): p.331–342.

Hutson, S.R. 2006. Self-Citation in Archaeology: Age, Gender, Prestige, and the Self. Journal of Archaeological Method and Theory 13(1): p.1–18.

Knievel, J.E., & Kellsey, C. 2005. Citation analysis for collection development: a comparative study of eight humanities fields. The Library Quarterly 75(2): p.142–168.

Leydesdorff, L., Hammarfelt, B., & Salah, A. 2011. The structure of the Arts & Humanities Citation Index: A mapping on the basis of aggregated citations among 1,157 journals. Journal of the American Society for Information Science and Technology 62(12): p.2414–2426.

Marriner, N. 2009. Currents and trends in the archaeological sciences. Journal of Archaeological Science 36(12): p.2811–2815.

Nederhof, A. 2006. Bibliometric monitoring of research performance in the Social Sciences and the Humanities: a review. Scientometrics 66(1): p.81–100.

Nooy, W. de, Mrvar, A., & Batagelj, V. 2005. Exploratory social network analysis with Pajek. Cambridge ; New York: Cambridge University Press.

Rehren, T., Grattan, J., & Klein, R. 2008. Going strong, and growing. Journal of Archaeological Science 35: p.94305.

Rosenswig, R. 2005. A tale of two antiquities: Evolving editorial policies of the SAA journals. The SAA Archaeological Record 5(1): p.15–21.

Sterud, E. 1978. Changing Aims of Americanist Archaeology: A Citations Analysis of American Antiquity. 1946-1975. American Antiquity 43(2): p.294–302.

Victor, K., & Beaudry, M. 1992. Women’s Participation in American Prehistoric and Historic Archaeology: A Comparative Look at the Journals American Antiquity and Historical Archaeology. In C. Claassen (ed) Exploring Gender through Archaeology, 11–22. Madison, Wisconsin: Prehistory Press.

‘A Connected Island?’: measuring academic influence

By Iza Romanowska and Tom Brughmans

This second blog post about the Connected Island project, funded by a sotonDH small award, discusses the relative influence of Central European Palaeolithic researchers using the H-index measure.


Figure 1: H-index scores of Central European Palaeolithic researchers (left) versus Iron Age (right) researchers.

It has been claimed that Central European archaeologists specializing in Stone Age studies are quite well-known in the West compared to their colleagues leading research in later epochs. To test this anecdotal supposition we analysed the H-index of Central European Palaeolithic researchers.

The H-index (Hirsch 2005) is a measure of an author’s academic impact that takes into account both the number of papers published by the author and the number of citations to these papers (Bornmann and Daniel 2005; 2007): an author has an H-index of h if h of their papers have each been cited at least h times. Its main advantage is that it balances the effects of a small number of highly cited papers and a large number of rarely cited publications. Neither a researcher with a one-hit-wonder paper, nor one producing hundreds of mediocre publications will score high. The H-index therefore favours enduring performance both in terms of quality and quantity. We used publications and citations recorded in Google Scholar as it covers a higher number of publications than ISI Web of Knowledge, especially for the fields of Social Sciences and Arts and Humanities (Kousha and Thelwall 2008). In contrast to ISI Web of Knowledge, however, Google’s bibliographic indexing is automated and not routinely edited by hand, which makes it prone to inconsistencies and duplication. We noticed that the H-index results for archaeologists were unrealistically low when only publications in Web of Knowledge were taken into account, and Google Scholar was therefore considered the lesser of two evils.
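
For readers who want to see the calculation itself, here is a small Python sketch of the H-index following Hirsch's definition: the largest number h such that h of an author's papers have at least h citations each. The citation counts in the examples are made up and do not correspond to any researcher in our sample.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation records for three imaginary authors.
one_hit_wonder = h_index([250])             # a single, very highly cited paper
many_minor_papers = h_index([1] * 300)      # hundreds of papers cited once each
steady_performer = h_index([10, 8, 5, 4, 3])
```

The first two imaginary authors both end up with an H-index of 1, while the third reaches 4, which is exactly the balancing behaviour described above.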

To provide a benchmark, we compared the results with a large sample of Central European Iron Age researchers. Central European Iron Age research is quite extensive and well-studied, and some of its main proponents are well-known internationally. Arguably, the fact that we are using Iron Age researchers for this benchmark is irrelevant: any sub-discipline within archaeology would have done the job. For the anecdotal statement we are testing to be true, however, the H-index scores of the Palaeolithic researchers should be close to or higher than those of the Iron Age researchers.

The results strongly confirm the intuitive observation (see Table 1 and Figure 1). Compared to a test sample of Iron Age specialists, Central European Palaeolithic researchers have been quoted more extensively and their papers were more influential abroad (as reflected in Google Scholar), indicating that they had a higher direct impact (as measured by the H-index) on the discipline globally.

Palaeolithic researchers    H-index  Iron Age researchers      H-index
Karel Absolon               9        Kazimierz Bielenin        4
Viola Dobosi                5        Anna Bitner-Wróblewska    2
Boleslaw Ginter             7        Éva Bónis                 3
Jan Fridrich                4        Jaroslav Böhm             6
Bohuslav Klíma              10       Miloš Čižmář              3
Michal Kobusiewicz          8        Jana Čižmářová            1
Janusz Krzysztof Kozlowski  10       Sylwester Czopek          2
Stefan Kozlowski            9        Petr Drda                 4
Gábori Miklós               5        Jan Filip                 11
Martin Oliva                9        Kazimierz Godlowski       7
Romuald Schild              22       Eszter Istvánovits        2
Josef Skutil                5        Libuše Jansová            3
Jiří Svoboda                14       Fitz Jenő                 9
Karel Valoch                14       Piotr Kaczanowski         5
László Vértes               11       Andrzej Kokowski          3
                                     Jerzy Kmieciński          4
                                     Valéria Kulcsár           2
                                     Karel Ludikovský          1
                                     Henryk Machajewski        2
                                     Renata Madyda-Legutko     3
                                     Magdalena Mączyńska       3
                                     Jiří Meduna               4
                                     Szabó Miklós              5
                                     Karla Motyková-Šneidrová  2
                                     Jerzy Okulicz-Kozaryn     2
                                     Emanuel Šimek             4
                                     Jaroslav Tejral           8
                                     Andrea Vaday              3
                                     Natalie Venclová          5
                                     Jiří Waldhauser           3
                                     Ryszard Wołągiewicz       2
Table 1: all Palaeolithic and Iron Age researchers included in the analysis with their H-index scores.

The Matthew effect?

We suspect that we are dealing here with a good example of the “Matthew effect” in science. Coined by Robert K. Merton (1968), the term refers to a passage from the Gospel of Matthew: “For to all those who have, more will be given, and they will have an abundance; but from those who have nothing, even what they have will be taken away.” – Matthew 25:29.

In simple terms it can be referred to as the “rich get richer” effect. Applied to academia it describes the phenomenon of more established, better-known scholars receiving disproportionately more credit than their lesser-known colleagues for equal or even smaller contributions to the research. Thus, they are more likely to spread their results wider and to have a higher impact on the discipline. Lower Palaeolithic archaeology had an additional boost when it came to creating a strong Matthew effect. The few irregularly distributed Lower Palaeolithic sites could be studied and published by only a handful of specialists. As a result, only a limited number of archaeologists were drawn into Palaeolithic studies, and those who were remained exempt from the fierce competition that their colleagues working on later epochs faced.

This also meant that invitations to conferences, scientific collaboration and co-authoring would be shared within a smaller cluster of scholars creating a self-propelling positive feedback loop and strengthening the natural Matthew effect. Combined with the nature of Palaeolithic data which is of global relevance and the high demand for Palaeolithic researchers in the second half of the 20th century, this could have contributed to a better recognition of Central European Palaeolithic researchers in the West, giving them more opportunities to collaborate, publish and spread their results in the international research community. Such a process could account for the higher H-index compared to their colleagues specializing in later epochs.

Bibliography

Bornmann, L., H.-D. Daniel. 2005. “Does the h-index for ranking of scientists really work?” Scientometrics 65 (3): 391-392. doi:10.1007/s11192-005-0281-4.
Bornmann, L., H.-D. Daniel. 2007. “What do we know about the h-index?” Journal of the American Society for Information Science and Technology 58 (9): 1381-1385. doi:10.1002/asi.20609.
Hirsch, J. E. 2005. “An index to quantify an individual’s scientific research output.” Proceedings of the National Academy of Sciences of the United States of America 102 (46) (November 15): 16569-16572. doi:10.1073/pnas.0507655102.
Kousha, K., M. Thelwall. 2008. “Sources of Google Scholar citations outside the Science Citation Index: A comparison between four science disciplines.” Scientometrics 74 (2): 273-294. doi:10.1007/s11192-008-0217-x.
Merton, Robert K. 1968. “The Matthew Effect in Science.” Science 159 (3810): 56-63.
Merton, Robert K. 1988. “The Matthew Effect in Science, II: Cumulative Advantage and the Symbolism of Intellectual Property.” Isis 79 (4): 606-623.

Happy New Year! and CAAUK

Happy New Year all! There are a couple of events in 2013 I am really looking forward to, including the CAA conference in Perth and the SAAs in Hawaii, more about those later. The first conference of the year for me will be CAAUK in London, on 22-23 February. The programme sounds great, with a keynote by Mark Lake discussing the special issue of World Archaeology he recently edited on Open Archaeology. Registration is now open but almost full, so hurry up if you wanna be part of it!

I will present a poster on a project Iza Romanowska and I have set up: ‘A Connected Island?: how the Iron Curtain affected archaeologists’. We are touring Central Europe’s libraries for this project, collecting publications by Central European Palaeolithic archaeologists. We hope to be able to evaluate the interactions between Western and Central European archaeologists, and we hope our methodology of citation network analysis will help us do this. More about the project in later posts! The poster will be presented by Iza at the Unravelling the Palaeolithic conference in Cambridge this weekend. Here is the abstract:

‘A Connected Island?’: How the Iron Curtain affected Palaeolithic Archaeologists in Central Europe

Iza Romanowska (Centre for the Archaeology of Human Origins, University of Southampton)
Tom Brughmans (Archaeological Computing Research Group, University of Southampton)

After the Second World War the Iron Curtain sliced through the very centre of Europe. The Soviet regime introduced a new structure to the academic institutions in countries like Poland, Hungary and former Czechoslovakia, including restrictions on contacts with the Western world and ideological pressure. How did this situation affect researchers on both sides? Was Central European Academia really isolated from western influences?

It is difficult to quantitatively determine to what degree these limitations affected archaeologists. The project team argues that citation data might allow (at least in part) for such a quantitative evaluation. Citations are handy formal proxies for tracing lines of knowledge dissemination and academic influence: obviously not fully representative of these very complex processes, but well suited to quantifying the ‘awareness’ of other people’s research.

The project will initially focus on the Lower and Middle Palaeolithic of Poland, former Czechoslovakia and Hungary. Citations have been extracted from publications of a synthetic nature (i.e. not field reports) and a citation network analysis has been performed on that data. Our preliminary results indicate that a lot of common presumptions regarding the research behind the Iron Curtain, like the dominance of Russian or national languages in Academic writing, are in fact false.

Introducing ‘A Connected Island?’: how the Iron Curtain affected Archaeologists

Eötvös Loránd University (University of Budapest)
After the Second World War the Iron Curtain sliced through the very centre of Europe, forming a very real divide in both political and daily lives. In the second half of the 20th century the Soviet regime introduced a new structure to the academic institutions in countries like Poland, Hungary and former Czechoslovakia, including restrictions on contacts with the Western world and ideological pressure previously unknown in these parts of Europe. How did this situation affect researchers on both sides? Was Central European Academia really isolated from western influences? A new project funded by SotonDH aims to address this issue using Palaeolithic archaeology as a case study. In ‘A connected island? Evaluating influence and isolation of Central European Palaeolithic researchers during communism’ ACRG members Iza Romanowska and Tom Brughmans combine a traditional historiography with novel citation network analysis techniques to approach this issue from a new angle.

Isolated or not?

A heated debate has been taking place in Central European archaeology in the last two decades regarding the issue of isolation (or the lack thereof) from western influences during the second half of the 20th century. Difficulties related to obtaining the necessary passports and visas, the disparity in the values of currencies, and only limited formal international links between research institutions restricted research visits, data collection, literature review, and conference attendance. Equally hindering was the limited circulation of Western archaeological journals within the Soviet Bloc countries, and restricted accessibility to archaeological publications in general. This could have been further aggravated by language barriers and, to some extent, different disciplinary interests. All this does not necessarily mean that Central European researchers were completely unaware of what was happening in the West, as if living on an island unconnected to the rest of the world and immune to external influences.

It is difficult to quantitatively determine to what degree these limitations affected Central European researchers. The project team argues that citation data might allow (at least in part) for such a quantitative evaluation. When a researcher cites the work of another scholar they express in a very formalised way that they were influenced by this person. Citations are handy proxies for tracing lines of knowledge dissemination and academic influence: obviously not fully representative of these very complex processes, but well suited to quantifying the ‘awareness’ of other people’s research.

‘A connected island?’ will collect and explore citation data for Central European Stone Age studies, a relatively small but highly international research field that forms a well-defined case-study suitable for quantitative analysis. The project will initially focus on the Lower and Middle Palaeolithic of Poland, former Czechoslovakia and Hungary. The citation behaviour of scholars working in these countries will be confronted with that of Western European Palaeolithic researchers. The proposed project therefore aims to explore the degree of interaction and academic influence between Central and Western European researchers in Lower and Middle Palaeolithic archaeology during communism (1945-1989) through citation network analysis, in order to evaluate the hypothesis that Central European researchers worked in strong academic isolation.

Data collection in Hungary

Digital citation datasets are available online through services like Google Scholar or Web of Knowledge. However, neither of these is very comprehensive for the books and local non-peer-reviewed journals in which a lot of the Palaeolithic archaeology of Central Europe was published. So despite the revolution in digital data collection brought about by the World Wide Web, a critical analysis could not be performed without visiting key libraries and research institutions in Central Europe. So the ‘connected island’ team hit the road.

The first phase of data collection was conducted in May 2012. In this first round a visit was paid to the Institute of Archaeology at the Jagiellonian University in Krakow. Thanks to a bursary from SotonDH the team could perform the second phase of data collection at the Hungarian Academy of Sciences and the Institute of Archaeological Science at the Eötvös Loránd University, both in Budapest. The third phase will aim to collect relevant literature from the Ústav pro pravěk a ranou dobu dějinnou at Charles University in Prague.

A few preliminary results

The widely held assumption that archaeological data from Central Europe was published in local languages is false. At least half, if not more, of Central European archaeology publications from this period were published in either German, French or English alongside the national language. The image that all countries under the influence of the former Soviet Union published in Russian is incorrect.

Just like their Western European or American colleagues, Central European Palaeolithic researchers worked within a very strong ‘Bordean’ framework (named after the famous French researcher François Bordes and his wife Denise de Sonneville-Bordes).

From a cursory check, one gets the impression that Palaeolithic researchers in Central Europe were well informed of developments on the other side of the Iron Curtain and cited western authors extensively. The same cannot be said of their Western European colleagues, who occasionally cite one or two Central European sites but do not seem to have been aware of the full scope of the research happening in the region.

A second blog post on this project will follow soon featuring the first results of the citation network analysis, aimed at exploring this notion of unbalanced citation behaviour between the Eastern and Western researchers. Watch this space.

Iza Romanowska and Tom Brughmans
