## Network Analysis with Visone Tutorial

This tutorial is a step-by-step guide to network creation, visualisation and analysis using the free-to-use software Visone, through an archaeological case study on Maya obsidian networks in Mesoamerica.

Download the tutorial files as a .zip archive

Please cite this tutorial as:

Weidele, D., and Brughmans, T. (2015) Network Analysis with Visone Tutorial. https://archaeologicalnetworks.wordpress.com/resources/#Visone

The archaeological case study and data used here is published as (we thank the authors of this paper for allowing us to use this study as an example and for providing feedback on the tutorial):

Golitko, M., Meierhoff, J., Feinman, G. M., & Williams, P. R. (2012). Complexities of collapse: the evidence of Maya obsidian as revealed by social network graphical analysis. Antiquity, 86: 507–23.

## Network science and statistical techniques for dealing with uncertainties in archaeological datasets

This tutorial created by Matt Peeples provides an overview of applying statistical techniques in R to perform sensitivity analyses of network metric results in the context of uncertainties in archaeological datasets. This tutorial requires basic knowledge of network science and R.

Please cite this tutorial as:

Peeples, M. A. (2017). Network Science and Statistical Techniques for Dealing with Uncertainties in Archaeological Datasets. [online]. Available: http://www.mattpeeples.net/netstats.html

## Network Science with Netlogo Tutorial

This tutorial provides an introduction to creating agent-based network models with Netlogo. By working through this tutorial you will learn how to create nodes, create edges, perform layouts, introduce probability in edge creation, create a trade process working on the network, and derive network measures.

Download the Netlogo tutorial as a .pdf file

Please cite this tutorial as:

Brughmans, T. (2016). Network Science with Netlogo Tutorial, https://archaeologicalnetworks.wordpress.com/resources/#netlogo

## Network Analysis with Cytoscape Tutorial

This tutorial is a step-by-step guide to network creation, visualisation and analysis using the free-to-use software Cytoscape, through an archaeological and geographical case study on inter-visibility in Iron Age and Roman Southern Spain.

Files:

Download the tutorial in PDF format

Download the network spreadsheet

Download the network attributes spreadsheet

Please cite this tutorial as:

Brughmans, T. (2013). Network Analysis with Cytoscape Tutorial, November 2013. https://archaeologicalnetworks.wordpress.com/resources/#cytoscape

The archaeological case study and data used here is published as:

Brughmans, T., Keay, S., & Earl, G. (2015). Understanding Inter-settlement Visibility in Iron Age and Roman Southern Spain with Exponential Random Graph Models for Visibility Networks. Journal of Archaeological Method and Theory, 22: 58–143. DOI: 10.1007/s10816-014-9231-x

Brughmans, T., Keay, S., & Earl, G. P. (2014). Introducing exponential random graph models for visibility networks. Journal of Archaeological Science, 49: 442–54. DOI: 10.1016/j.jas.2014.05.027

## Network Science Software

Pajek

Up-to-date set of analysis techniques, can handle large networks, good manual and supporting documentation, less easy to use than UCINET, includes some specific features not included in UCINET (triad counts; nice matrix graphs; graphs automatically separating components).

UCINET

Up-to-date set of analysis techniques, good supporting documentation, great for converting to different network data formats. See handout network analysis practical 2.

Cytoscape

Poor documentation, user-friendly interface, easy analysis. See handout network analysis practical 1.

Gephi

Very pretty visualisation, poor documentation, user-friendly interface, manual modification of layout algorithm settings.

GIS

GRASS and ArcGIS (networkAnalyzer) have some network features. Can be used to produce visibility networks, least-cost paths, etc.

Processing

Visualisation and animation, programming skills needed.

Mathematica

Tools in the Wolfram language for analysing, modelling and visualising networks.

See also this iGraph interface for Mathematica.

Matlab: http://www.levmuchnik.net/Content/Networks/ComplexNetworksPackage.html

R Language

You can do basically anything you want if you can be bothered to code it. There are some great network analysis libraries (network, sna, Rnetworks, igraph, ergm, networkDynamic, RSiena, statnet, tnet).

Excel and NodeXL

Everyone knows how Excel works, now you can use it to make networks.

TRACER

A text re-use tracing software.

VISONE

Many functions, and also runs RSiena.

Blocks

Free separate program for blockmodeling: very good at it, does nothing else. Result pages somewhat cumbersome, but very good documentation.

## Network Science Glossary

### Introduction

This glossary contains definitions of concepts commonly used in network science. For each concept we first provide a formal definition, often followed by a description of the main use of the concept or its implications. The networks represented in the figures accompanying this glossary are used to illustrate a number of concepts. Where examples drawn from figures are provided, we refer to connected nodes by their number separated by a hyphen (e.g. 1-2 indicates that node 1 is connected to node 2). A key reference work for most of the concepts described here is that by Wasserman and Faust (1994), in which more elaborate descriptions, mathematical formulations, and additional bibliographic resources can be found. A limited number of additional primary sources are given in this glossary and included in a separate bibliography below. The terms included in this glossary are underlined.

### Acknowledgements

The glossary presented here benefited greatly from discussions with members of the algorithmics group at the Department of Computer and Information Science of the University of Konstanz, John M. Roberts Jr., and the contributors to the special issue of Journal of Archaeological Method and Theory where it was first published. It is co-authored with Habiba. The authors of this paper are solely responsible for any remaining mistakes in this glossary.

### Citation

Please refer to this glossary by citing the paper where it originally appeared:

Collar, A., Coward, F., Brughmans, T., & Mills, B. J. (2015). Networks in Archaeology: Phenomena, Abstraction, Representation. Journal of Archaeological Method and Theory, 22, 1–32. doi:10.1007/s10816-014-9235-6

### Glossary

Actor: See __node__.

Acyclic network: Defined as a __directed network__ with no __cycle__s.

For example, the __directed network__ in figure 1b is not __acyclic__ because it includes the cycle 2-3-4-2. Examples of __acyclic network__s include citation __network__s and dendrograms.
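As a minimal sketch of how acyclicity can be checked, the example below applies Kahn's algorithm, which repeatedly removes nodes without incoming edges. The edge list is an assumption: it is reconstructed from the worked examples in this glossary (the cycle 2-3-4-2, the path 1-2-3 and the geodesic 1-3), not taken from the figure itself, and the function name is purely illustrative.

```python
# Directed edges assumed for figure 1b, reconstructed from the worked
# examples in this glossary (the figure itself is not reproduced here).
NODES = [1, 2, 3, 4, 5]
DIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)]


def is_acyclic(nodes, edges):
    """Kahn's algorithm: repeatedly remove nodes that have no incoming edge."""
    indegree = {node: 0 for node in nodes}
    successors = {node: [] for node in nodes}
    for source, target in edges:
        successors[source].append(target)
        indegree[target] += 1
    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for target in successors[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    # Nodes that can never be removed lie on, or downstream of, a cycle.
    return removed == len(nodes)


FIGURE_1B_ACYCLIC = is_acyclic(NODES, DIRECTED_EDGES)
WITHOUT_EDGE_4_2 = is_acyclic(NODES, [e for e in DIRECTED_EDGES if e != (4, 2)])
print(FIGURE_1B_ACYCLIC, WITHOUT_EDGE_4_2)
```

Under these assumptions the network is reported as not acyclic, and it becomes acyclic once the edge from node 4 to node 2 is dropped.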

Adjacency matrix: Defined as a way of representing a __network__ where there is a row and a column for each __node__, and the values in the cells indicate whether an __edge__ exists between a pair of __node__s.
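The sketch below builds such a matrix as a list of rows. The node and edge lists are an assumed reconstruction of the undirected network of figure 1a, pieced together from the examples elsewhere in this glossary, and `adjacency_matrix` is an illustrative helper rather than a function from any particular package.

```python
# Undirected edges assumed for figure 1a (reconstructed from the glossary
# examples); node 5 has no edges.
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def adjacency_matrix(nodes, edges, directed=False):
    """Return a matrix with one row and one column per node (1 = edge present)."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
        if not directed:
            # An undirected edge is recorded in both directions.
            matrix[index[target]][index[source]] = 1
    return matrix


MATRIX = adjacency_matrix(NODES, EDGES)
for row in MATRIX:
    print(row)
```

For an undirected network the matrix is symmetric; passing `directed=True` records each edge only in the row of its starting node.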

Affiliation network: See __two-mode network__.

Arc: See __directed edge__.

Average degree: See __degree__.

Average shortest path: See __geodesic__.

Betweenness centrality: A __node__’s betweenness centrality is defined as the fraction of the number of __geodesic__s passing through this __node__ over the number of __geodesic__s between all pairs of __node__s in the __network__.

__Node__s with a high betweenness centrality are often considered to be important intermediaries for controlling the flow of resources between other __node__s, because they are located on __path__s between many other __node__ pairs. The concept of brokerage is often mentioned in relation to __betweenness centrality__. __Node__s which are incident to the *only* __edge__ connecting two subsets of __node__s in the __network__ are in a position to broker the relationship between these __node__s. These __node__s will typically have a high __betweenness centrality__ but not necessarily a high __degree__ or __closeness centrality__. Betweenness centrality was first quantitatively expressed by Anthonisse (1971) and Freeman (1977).
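The following sketch takes the definition above literally: it counts the geodesics that pass through a node as an intermediate step and divides by the number of geodesics between all pairs of nodes. This is only one possible reading; many software packages instead sum, for every pair of nodes, the fraction of that pair's geodesics passing through the node, so scores may differ between tools. The edge list is an assumed reconstruction of figure 1a and all function names are illustrative.

```python
from collections import deque
from itertools import combinations

# Undirected edges assumed for figure 1a (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def bfs_counts(adjacency, source):
    """Return geodesic lengths and geodesic counts from source to reachable nodes."""
    distance = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                count[neighbour] = 0
                queue.append(neighbour)
            if distance[neighbour] == distance[node] + 1:
                count[neighbour] += count[node]
    return distance, count


def betweenness(nodes, edges):
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    info = {node: bfs_counts(adjacency, node) for node in nodes}
    through = {node: 0 for node in nodes}
    total = 0
    for s, t in combinations(nodes, 2):
        dist_s, count_s = info[s]
        if t not in dist_s:
            continue  # no path between s and t
        total += count_s[t]
        for v in nodes:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, count_v = info[v]
            # v lies on a geodesic from s to t if the two legs add up exactly.
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                through[v] += count_s[v] * count_v[t]
    return {node: (through[node] / total if total else 0.0) for node in nodes}


BETWEENNESS = betweenness(NODES, EDGES)
print(BETWEENNESS)
```

With the assumed edges, only nodes 2 and 3 sit on a geodesic between other nodes (1-2-4 and 1-3-4), so they are the only nodes with a non-zero score.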

Bipartite network: See __two-mode network__.

Blockmodel: Defined as a partitioning into blocks of __structurally equivalent__ nodes, where the blocks are connected by hypothesised __edge__s.

In blockmodelling the rows and columns of __adjacency matrices__ are arranged so that __structurally equivalent__ nodes are in adjacent positions in the matrix, and the __edge__s between different blocks can be studied. First introduced by White, Boorman, and Breiger (1976).

Brokerage: See __betweenness centrality__.

Centrality: Defined as a family of measures of the __node__’s position within the __network__, which represent a ranking of __node__s (see __betweenness__, __closeness__, __degree__, and __eigenvector centralities__).

Centrality measures are used to identify the most important or prominent __node__s in the __network__, depending on the different definitions of importance or prominence implemented in the __network__ measure used.

Clique: A clique is a subset of __node__s in a __network__, where every pair of __node__s is connected by an __edge__.

In the social sciences only cliques of three __node__s or more are usually considered. For example, in figure 1a __node__s 2, 3, 4 form a clique of size 3. The definition of a clique is independent of whether it applies to the whole __network__ or not. For example, a network can consist of multiple cliques, or an entire __network__ can be one clique. The latter can also be called a complete network.

Closeness centrality: The closeness centrality of a __node__ is defined as the inverse of the sum of the __geodesic__s of that __node__ to all other __node__s divided by the number of __node__s in the __network__.

The closeness centrality of a __node__ gives an indication of how close this __node__ is to all other __node__s in the __network__, represented as the number of steps in the __network__ that are necessary on average to reach another __node__. __Node__s with a high closeness centrality score could be considered important or prominent, since they can share and obtain resources in fewer steps than other __node__s. Early quantitative implementations of closeness centrality are reviewed by Freeman (1979).
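A minimal sketch of one common reading of this measure is given below: the inverse of the average geodesic length from a node to all other nodes, i.e. (n - 1) divided by the sum of geodesic lengths. It assumes a connected undirected network, so the isolate is left out of the assumed figure 1a edge list; the helper names are illustrative.

```python
from collections import deque

# Connected part of the undirected network assumed for figure 1a.
NODES = [1, 2, 3, 4]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def geodesic_lengths(nodes, edges, source):
    """Breadth-first search returning the geodesic length to every reachable node."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def closeness(nodes, edges, node):
    """Inverse of the average geodesic length (assumes a connected network)."""
    distance = geodesic_lengths(nodes, edges, node)
    total = sum(length for other, length in distance.items() if other != node)
    return (len(nodes) - 1) / total


CLOSENESS = {node: closeness(NODES, EDGES, node) for node in NODES}
print(CLOSENESS)
```

Nodes 2 and 3 reach every other node in one step and therefore score highest; nodes 1 and 4 need two steps to reach each other.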

Clustering coefficient: Defined as the number of closed triplets over the total number of triplets in a __network__, where a triplet is a set of three __node__s with two (open triplet) or three (closed triplet) __undirected edge__s between them.

The clustering coefficient represents the average probability that two __node__s connected to a third __node__ are themselves connected, and it is commonly used for this purpose since the publication of the __‘small-world’ network__ model (Watts and Strogatz 1998). The clustering coefficient of a __network__ is closely related to the concept of transitivity in the social sciences, which captures the notion that “a friend of a friend is a friend”. Transitivity refers to the tendency of an open triplet to become a closed triplet.
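The calculation can be sketched as follows, under the assumption that triplets are counted per centre node, so that each triangle contributes three closed triplets (the convention described by Newman 2010). Other counting conventions give different values. The edge list is again an assumed reconstruction of figure 1a.

```python
from itertools import combinations

# Undirected edges assumed for figure 1a (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def clustering_coefficient(nodes, edges):
    """Closed triplets over all triplets, counting one triplet per centre node."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    closed = 0
    total = 0
    for centre in nodes:
        # Every pair of neighbours of the centre forms a triplet.
        for a, b in combinations(sorted(adjacency[centre]), 2):
            total += 1
            if b in adjacency[a]:
                closed += 1  # the two neighbours are themselves connected
    return closed / total if total else 0.0


CLUSTERING = clustering_coefficient(NODES, EDGES)
print(CLUSTERING)
```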

Cohesion: See __density__.

Complete network: See __clique__.

Connected component: Defined as a subset of an __undirected network__ in which any pair of __node__s can be connected to each other via at least one __path__, and where there can be no __path__s to any nodes outside this subset.

For example, in the __undirected network__ in figure 1a there are two connected components: __node__ 5, and __node__s 1, 2, 3, 4.
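Connected components can be found with a simple breadth-first search, as in the sketch below. The edge list is an assumption reconstructed from the examples in this glossary (nodes 1 to 4 connected, node 5 without edges), and the function name is illustrative.

```python
from collections import deque

# Undirected edges assumed from the glossary examples; node 5 has no edges.
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def connected_components(nodes, edges):
    """Group nodes into components by exploring outward from each unseen node."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


COMPONENTS = connected_components(NODES, EDGES)
print(COMPONENTS)
```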

Connection: See __edge__.

Cycle: Defined as a __path__ in a __directed network__ in which the starting __node__ and ending __node__ are the same. It is also called a closed __path__.

For example, in figure 1b the path 2-3-4-2 is a cycle. See also __acyclic network__.

Degree: The degree of a __node__ is defined as the number of __edge__s incident to this __node__.

The average degree of a __network__ is the sum of the degrees of all __node__s in this __network__ divided by the number of __node__s. In a directed __network__, the indegree of a __node__ refers to the number of incoming incident __edge__s of a __node__. In a __directed network__, the outdegree of a __node__ refers to the number of outgoing incident __edge__s of a __node__. For example, node 2 in figure 1a has a degree of 3, whilst in figure 1b the same node can be said to have an indegree of 2 and an outdegree of 1.
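These counts are straightforward to compute from an edge list. The sketch below uses edge lists assumed for figures 1a and 1b, reconstructed from the examples in this glossary; the helper names are illustrative.

```python
from collections import Counter

NODES = [1, 2, 3, 4, 5]
# Assumed reconstructions of figure 1a (undirected) and figure 1b (directed).
UNDIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
DIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)]


def degrees(nodes, edges):
    """Number of edges incident to each node in an undirected network."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return {node: counts[node] for node in nodes}


def in_out_degrees(nodes, edges):
    """(indegree, outdegree) of each node in a directed network."""
    indegree = Counter(target for _, target in edges)
    outdegree = Counter(source for source, _ in edges)
    return {node: (indegree[node], outdegree[node]) for node in nodes}


DEGREE = degrees(NODES, UNDIRECTED_EDGES)
AVERAGE_DEGREE = sum(DEGREE.values()) / len(NODES)
IN_OUT = in_out_degrees(NODES, DIRECTED_EDGES)
print(DEGREE[2], AVERAGE_DEGREE, IN_OUT[2])
```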

Degree centrality: Defined as the __centrality__ of a __node__ based on the number of __edge__s incident to this __node__.

According to the degree __centrality__ measure, a __node__ is important or prominent if it has __edge__s to a high number of other __node__s.

Degree distribution: Defined as the probability distribution of all degrees over the whole network.

The measure is commonly used to compare the structure of networks since the publication of the ‘scale-free’ network structure (Albert & Barabási 2002, p. 49; Barabási & Albert 1999; Newman 2010, p. 243-247). In ‘scale-free’ networks the degree distribution follows a __power-law__.
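As a small illustration, the degree distribution can be tabulated as the share of nodes having each degree. The edge list below is an assumed reconstruction of figure 1a; a network this small says nothing about 'scale-free' structure and only shows the bookkeeping.

```python
from collections import Counter

# Undirected edges assumed for figure 1a (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def degree_distribution(nodes, edges):
    """Map each observed degree k to the fraction of nodes with that degree."""
    degree = Counter({node: 0 for node in nodes})
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    frequency = Counter(degree.values())
    return {k: frequency[k] / len(nodes) for k in sorted(frequency)}


DISTRIBUTION = degree_distribution(NODES, EDGES)
print(DISTRIBUTION)
```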

Density: Defined as the fraction of the number of __edge__s that are present to the maximum possible number of __edge__s in the __network__.

Cohesion is a commonly used concept which is often operationalised using the density measure. For other cohesion measures see Wasserman and Faust (1994, 249-290).
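A minimal sketch of the density calculation follows. It assumes networks without self-loops or multiple edges, so the maximum number of edges is n(n-1)/2 for an undirected network and n(n-1) for a directed one; the edge lists are assumed reconstructions of figures 1a and 1b.

```python
NODES = [1, 2, 3, 4, 5]
# Assumed reconstructions of figure 1a (undirected) and figure 1b (directed).
UNDIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
DIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)]


def density(nodes, edges, directed=False):
    """Edges present divided by the maximum possible number of edges."""
    n = len(nodes)
    possible = n * (n - 1) if directed else n * (n - 1) / 2
    return len(edges) / possible if possible else 0.0


UNDIRECTED_DENSITY = density(NODES, UNDIRECTED_EDGES)
DIRECTED_DENSITY = density(NODES, DIRECTED_EDGES, directed=True)
print(UNDIRECTED_DENSITY, DIRECTED_DENSITY)
```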

Diameter: Defined as the length of the longest __geodesic__ in the __network__.

Directed edge: Defined as an ordered pair of __nodes__, which is often graphically represented as an arrow drawn from a starting __node__ to an end __node__.

A directed __edge__ is asymmetric. It connects a starting __node__ with an ending __node__, and cannot be traversed in the other direction. For example, all __edge__s in figure 1b are directed __edge__s.

Directed network: Defined as a set of __node__s and a set of __directed edge__s.

A __path__ through a directed network will need to follow the direction of the __directed edge__s. For example, the __network__ in figure 1b is a directed network.

Distance: See __path length__.

Dyad: Defined as any pair of __node__s in a __network__ that may or may not have an __edge__ between them.

For __undirected edges__ there are two possible dyadic relationships: connected, or not connected. For __directed edge__s, there are four: unconnected; connected in one direction; connected in the other direction; connected in both directions.

Edge: Defined as a line between a pair of __node__s, representing some kind of relationship between them.

Many synonyms exist to refer to an edge, including tie, arc, relationship, link, connection and line. An edge can be __directed__ or __undirected__, and __weighted__ or __unweighted__. The concept ‘arc’ is often used to refer to a directed edge.

Ego-network: Defined as a __network__ consisting of a __node__ (called ego), the __node__s it is directly connected to, and the __edge__s between these __node__s.
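Extracting an ego-network from an edge list can be sketched as below, using an assumed reconstruction of the undirected network of figure 1a. The function name `ego_network` is illustrative.

```python
# Undirected edges assumed for figure 1a (reconstructed from the glossary).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def ego_network(edges, ego):
    """Return ego plus its direct neighbours, and the edges among those nodes."""
    alters = {b if a == ego else a for a, b in edges if ego in (a, b)}
    members = alters | {ego}
    ego_edges = [(a, b) for a, b in edges if a in members and b in members]
    return sorted(members), ego_edges


EGO_NODES, EGO_EDGES = ego_network(EDGES, ego=1)
print(EGO_NODES, EGO_EDGES)
```

Note that the edge 2-3 is kept even though it does not involve ego, because both of its end nodes are directly connected to ego.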

Eigenvector centrality: The eigenvector __centrality__ of a __node__ is defined in terms of the eigenvector __centrality__ of __node__s incident on it.

More descriptively, instead of assigning a single __centrality__ score to a __node__, a __node__’s eigenvector __centrality__ is defined in terms proportional to the __node__s incident on it. A __node__ with a high eigenvector __centrality__ is a __node__ that is connected to other __node__s with a high eigenvector __centrality__. See Newman (2010, 169-172) for the procedure to calculate eigenvector __centrality__.
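One simple way to approximate eigenvector centrality is power iteration: start with equal scores, repeatedly replace each node's score by the sum of its neighbours' scores, and rescale. The sketch below is not the procedure of any specific package; it assumes an undirected network (the assumed figure 1a edge list), scales scores so the highest is 1, and uses a fixed number of iterations rather than a convergence test.

```python
# Undirected edges assumed for figure 1a (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def eigenvector_centrality(nodes, edges, iterations=100):
    """Power iteration; scores are rescaled so that the largest equals 1."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Each node's new score is the sum of its neighbours' current scores.
        updated = {node: sum(score[nb] for nb in adjacency[node]) for node in nodes}
        largest = max(updated.values())
        if largest == 0:
            return updated  # network without edges
        score = {node: value / largest for node, value in updated.items()}
    return score


EIGENVECTOR = eigenvector_centrality(NODES, EDGES)
print(EIGENVECTOR)
```

Nodes 2 and 3 score highest because they are connected to each other as well as to nodes 1 and 4; the isolate scores 0.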

Embeddedness: A polyvalent concept that comprises two variants. The first is the structural integration of a __node__ or any group of __node__s within the __network__. Different measures for structural integration exist. The E/I index is one example of this, which is calculated as a ratio of the number of __edge__s within a group of __node__s (internal) and between groups of __node__s (external). The second variant, as popularized in economic theory through the work of Polanyi (1944) and Granovetter (1985), relates to the intertwined nature of social, economic, political, religious, and cultural interactions. See Borck et al. (this issue) and Hess (2004) for overviews of this concept.

Equivalence: See __structural equivalence__.

Geodesic: Defined as the __path__ between a pair of __node__s with the shortest __path length__.

Sometimes referred to as the shortest __path__ between a pair of __node__s. For example, the geodesic between __node__s 1 and 3 in figure 1b is the __path__ 1-3 with a length of 1. The average shortest __path length__ is the average length of all geodesics in a __network__.
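In an unweighted network a geodesic can be found with a breadth-first search that remembers how each node was first reached. The sketch below follows edge direction and uses an assumed reconstruction of the directed network of figure 1b; `geodesic` is an illustrative name and returns only one geodesic when several exist.

```python
from collections import deque

# Directed edges assumed for figure 1b (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
DIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)]


def geodesic(nodes, edges, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    successors = {node: [] for node in nodes}
    for a, b in edges:
        successors[a].append(b)
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt in successors[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


GEODESIC_1_3 = geodesic(NODES, DIRECTED_EDGES, 1, 3)
GEODESIC_1_4 = geodesic(NODES, DIRECTED_EDGES, 1, 4)
GEODESIC_2_1 = geodesic(NODES, DIRECTED_EDGES, 2, 1)
print(GEODESIC_1_3, GEODESIC_1_4, GEODESIC_2_1)
```

The path length of a geodesic is the number of nodes in the returned list minus one.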

Graph: See __network__.

Heterophily: Defined as a tendency of __node__s to become connected to other __node__s that are dissimilar under a certain definition of dissimilarity.

For example, in figure 2a a __node__ with an attribute value represented in grey will have a tendency of being connected to a __node__ with a different attribute value represented in white.

Homophily: Defined as a tendency of __node__s to become connected to other __node__s that are similar under a certain definition of similarity.

For example, in figure 2b a __node__ with an attribute value represented in grey will have a tendency of being connected to a __node__ with the same attribute value.

Indegree: See __degree__.

Isolates: Defined as __node__s in a __network__ which have no incident __edge__s.

For example, node 5 in figure 1a is an isolate.

Line: See __edge__.

Link: See __edge__.

Network: Defined as a set of __node__s and a set of __edge__s.

In mathematics a network is referred to as a graph, whilst in the social sciences networks often consist of social __node__s and __edge__s, and are referred to as social networks.

Node: Defined as an atomic discrete entity representing a network concept.

A vertex (plural vertices) is a commonly used synonym to refer to a node. The term actor is sometimes used as a synonym for nodes in the social sciences.

One-mode network: See __two-mode network__.

Outdegree: See __degree__.

Path: Defined as a __walk__ between a pair of __node__s in which no __node__s and __edge__s are repeated.

For example, __node__s 1 and 3 in figure 1b are connected by the path 1-2-3.

Path length: Defined as the number of __edge__s in a __path__.

For example, __node__s 1 and 3 in figure 1b are connected by the path 1-2-3, which has a path length of 2.

Power-law: Defined as a mathematical relationship between two entities where the frequency of one entity varies as a power of the second entity. More formally, the probability of a node with degree *k* is proportional to *k^a*.

Commonly used to describe the __degree distribution__ of __network__s with a ‘scale-free’ structure (Barabási & Albert 1999). When a __network__’s __degree distribution__ follows a power-law, it implies that few __node__s have a much higher __degree__ than all other __node__s in the __network__ and most __node__s have a very low __degree__. __Node__s with a very high __degree__, sometimes referred to as “hubs” in the __network__, significantly reduce the average __shortest path length__ of the __network__.
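Written out, the relationship and one of its practical consequences look as follows. The symbol *c* for the normalising constant is introduced here for illustration only, and the exponent *a* is assumed to be negative, as it is for the decaying degree distributions of 'scale-free' networks.

```latex
\[
  P(k) \propto k^{a}
  \quad\Longrightarrow\quad
  \log P(k) = a \log k + c , \qquad a < 0
\]
```

Taking logarithms on both sides turns the power-law into a straight line with slope *a*, which is why such degree distributions are often inspected on a plot with logarithmic axes.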

Relationship: See __edge__.

Shortest path: See __geodesic__.

‘Small-world’ network: A ‘small-world’ __network__ is defined as a __network__ in which the average __shortest path length__ is almost as small as that of a uniformly random __network__ with the same number of __node__s and __density__, whereas the __clustering coefficient__ is much higher than in a uniformly random __network__ (a uniformly random __network__ is defined as a __network__ in which each __edge__ exists with a fixed probability *p*).

The ‘small-world’ __network__ structure as described here was first published by Watts and Strogatz (1998). This structure illustrates that relatively few __edge__s between clusters of __node__s are needed to significantly reduce the average __shortest path length__. It implies that resources can flow between any pairs of __node__s in the __network__ relatively efficiently, whilst maintaining a high degree of clustering.

Social network: See __network__.

Strongly connected component: Defined as a __connected component__ in a __directed network__, that is, a subset of __node__s in which every __node__ can be reached from every other __node__ by a __path__ that follows the direction of the __directed edge__s.

In a __directed network__ a __connected component__ is always either strongly or weakly connected. For example, in figure 1b node 5 is a strongly connected component, whilst the set of __node__s 1, 2, 3, and 4 is not because __node__ 1 cannot be reached by a __path__ from the other __node__s.
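Strong connectivity can be checked by testing mutual reachability: two nodes belong together if each can be reached from the other along the direction of the edges. The sketch below does this in a deliberately simple way that suits small networks; dedicated algorithms are faster on large ones. The edge list is an assumed reconstruction of figure 1b and the function names are illustrative.

```python
# Directed edges assumed for figure 1b (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
DIRECTED_EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)]


def reachable(successors, start):
    """All nodes that can be reached from start by following edge direction."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in successors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def strongly_connected_components(nodes, edges):
    successors = {node: [] for node in nodes}
    for a, b in edges:
        successors[a].append(b)
    reach = {node: reachable(successors, node) for node in nodes}
    components = []
    assigned = set()
    for node in nodes:
        if node in assigned:
            continue
        # Keep the nodes that node can reach and that can reach node back.
        component = sorted(m for m in nodes if m in reach[node] and node in reach[m])
        components.append(component)
        assigned.update(component)
    return components


STRONG_COMPONENTS = strongly_connected_components(NODES, DIRECTED_EDGES)
print(STRONG_COMPONENTS)
```

Under the assumed edges, node 1 ends up on its own because no edge leads back to it, whereas nodes 2, 3 and 4 can all reach each other through the cycle 2-3-4-2.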

Strong tie: A number of theoretical network models used in the social sciences rely on a distinction between strong and __weak ties__ (particularly those drawing on Granovetter 1973). The distinctions between the two, however, are rarely formally defined. In general, strong ties are used to describe frequently activated relationships (such as family/kin ties) whereas __weak ties__ are used to describe infrequently accessed connections (acquaintances). Strong ties tend to be among actors with similar sets of overlapping relationships whereas weak ties more often connect sets of actors who would otherwise be unconnected. In __weighted network__s, thresholds on the distribution of weights across a network as a whole are often used to define strong vs. __weak ties__ though there are no consistent rules used for this distinction.

Structural equivalence: Two __node__s are defined as structurally equivalent if they have identical __edge__s to and from all other __node__s in the __network__ (see Lorrain and White 1971).

Structural equivalence is used to identify __node__s which have the same position in a __network__. It can be used to inform __blockmodelling__. In the social sciences, the structural similarities identified through structural equivalence are used to study social positions and social roles.
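For an undirected network, the definition can be sketched as a comparison of neighbour sets: two nodes are structurally equivalent if, leaving the pair itself aside, they are connected to exactly the same other nodes. The edge list is an assumed reconstruction of figure 1a; a directed version would compare incoming and outgoing neighbours separately.

```python
from itertools import combinations

# Undirected edges assumed for figure 1a (reconstructed from the glossary).
NODES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]


def structurally_equivalent_pairs(nodes, edges):
    """Pairs of nodes with identical edges to all other nodes."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    pairs = []
    for a, b in combinations(nodes, 2):
        # Ignore the edge between a and b themselves when comparing.
        if adjacency[a] - {b} == adjacency[b] - {a}:
            pairs.append((a, b))
    return pairs


EQUIVALENT = structurally_equivalent_pairs(NODES, EDGES)
print(EQUIVALENT)
```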

Tie: See __edge__.

Transitivity: See __clustering coefficient__.

Two-mode network: Defined as a __network__ in which two sets of __node__s are defined as modes. In a two-mode __network__, __node__s of one mode can only be connected to __node__s of the other mode.

Two-mode __network__s are sometimes referred to as bipartite __network__s. The definition of modes depends on the research context. In the social sciences two-mode __network__s are often used as a representation of affiliation __network__s, where one mode represents individuals and the other mode represents institutions or other concepts these individuals are affiliated with (given the definition of affiliation within the research context). For example, individuals may be affiliated to political parties, or be members on different boards of directors. The most common example of the use of two-mode __network__s in archaeology is to represent sites as one mode and the artefact types found on sites as a second mode. Two-mode __network__s can be transformed into two different one-mode __network__s by focusing on either one of the two modes. In a one-mode __network__, only the __node__s of one of the two modes are included, and pairs of __node__s are connected by an __edge__ if both have a connection to at least one __node__ of the other mode in the two-mode __network__. For example, the two-mode __network__ in figure 3a (where two different modes are represented as __node__s with a different colour) can be transformed into a one-mode __network__ of only grey __node__s (Fig. 3b) or a one-mode __network__ of only white __node__s (Fig. 3c).
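The transformation into one-mode networks can be sketched as follows. The sites and artefact types are entirely hypothetical and do not correspond to figure 3 or to any real dataset; `project` and `transpose` are illustrative helper names.

```python
from itertools import combinations

# Hypothetical two-mode data: artefact types found on each site.
AFFILIATIONS = {
    "site A": {"type 1", "type 2"},
    "site B": {"type 2"},
    "site C": {"type 3"},
}


def project(affiliations):
    """Connect two nodes of one mode if they share a node of the other mode."""
    return [
        (a, b)
        for a, b in combinations(sorted(affiliations), 2)
        if affiliations[a] & affiliations[b]
    ]


def transpose(affiliations):
    """Swap the modes: map each artefact type to the sites it occurs on."""
    flipped = {}
    for node, others in affiliations.items():
        for other in others:
            flipped.setdefault(other, set()).add(node)
    return flipped


SITE_EDGES = project(AFFILIATIONS)
TYPE_EDGES = project(transpose(AFFILIATIONS))
print(SITE_EDGES, TYPE_EDGES)
```

In this toy example sites A and B become connected because both yield type 2, and types 1 and 2 become connected because both occur on site A.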

Undirected edge: Defined as an unordered pair of __node__s, which is often graphically represented as a line drawn between the pair of __node__s.

An undirected __edge__ is symmetric. Typically just called an __edge__. For example, all __edge__s in figure 1a are undirected __edge__s.

Undirected network: Defined as a set of __node__s and a set of __undirected edge__s.

An undirected __network__ is symmetric. For example, the __network__ in figure 1a is an undirected __network__.

Unweighted edge: Defined as an __edge__ which is not weighted.

Typically just called an __edge__. See also __weighted edge__.

Unweighted network: Defined as a set of __nodes__ and a set of __unweighted edge__s.

Valued network: See __weighted network__.

Vertex: See __node__.

Walk: A walk between a pair of __node__s is defined as any sequence of __node__s connected through __edge__s which has that pair of __node__s as endpoints.

In contrast to a __path__, __node__s and __edge__s can be repeated in a walk. For example, in figure 1b a walk between __node__s 2 and 3 could be 2-3-4-2-3.
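The difference between a walk and a path can be checked mechanically, as in the sketch below. It follows edge direction and uses an assumed reconstruction of the directed network of figure 1b; the function names are illustrative.

```python
# Directed edges assumed for figure 1b (reconstructed from the glossary).
DIRECTED_EDGES = {(1, 2), (1, 3), (2, 3), (3, 4), (4, 2)}


def is_walk(sequence, edges):
    """True if every consecutive pair of nodes is joined by a directed edge."""
    return all((a, b) in edges for a, b in zip(sequence, sequence[1:]))


def is_path(sequence, edges):
    """A walk without repeated nodes (which also rules out repeated edges)."""
    return is_walk(sequence, edges) and len(set(sequence)) == len(sequence)


WALK_EXAMPLE = [2, 3, 4, 2, 3]
PATH_EXAMPLE = [1, 2, 3]
print(is_walk(WALK_EXAMPLE, DIRECTED_EDGES), is_path(WALK_EXAMPLE, DIRECTED_EDGES))
print(is_walk(PATH_EXAMPLE, DIRECTED_EDGES), is_path(PATH_EXAMPLE, DIRECTED_EDGES))
```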

Weakly connected component: Defined as a connected component in a __directed network__ where the directionality of __edge__s is ignored.

In a directed network a connected component is always either strongly or weakly connected. For example, in figure 1b there are two weakly connected components: node 5, and nodes 1, 2, 3, 4.

Weak tie: See __strong tie__.

Weighted edge: Defined as an __edge__ with a value associated to it.

These values are often real numbers but they can also be any concept connecting the end __node__s of the __edge__. The definition of an __edge__ weight depends on the research context. Weights could be represented as an attribute of an __edge__. Thresholding can be applied to select a subset of __edge__s with a given __edge__ weight.
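Thresholding can be sketched as a simple filter on edge weights. The weights below are hypothetical, and the choice to keep edges at or above the cut-off (rather than strictly above it) is an assumption that would depend on the research context.

```python
# Hypothetical weighted edges, e.g. counts of shared artefact types.
WEIGHTED_EDGES = {
    ("A", "B"): 5.0,
    ("A", "C"): 1.0,
    ("B", "C"): 3.0,
}


def threshold(weighted_edges, minimum):
    """Keep only the edges whose weight is at least the given minimum."""
    return {edge: weight for edge, weight in weighted_edges.items() if weight >= minimum}


KEPT = threshold(WEIGHTED_EDGES, minimum=3.0)
print(KEPT)
```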

Weighted network: Defined as a set of __node__s and a set of __weighted edge__s.

### References cited in glossary

Albert, R., & Barabási, A.-L. (2002). Statistical mechanics of complex networks. *Reviews of Modern Physics*, *74*(January), 47–97.

Anthonisse, J. M. (1971). *The Rush in a graph*. Amsterdam: Mathematisch Centrum.

Barabási, A.-L., & Albert, R. (1999). Emergence of Scaling in Random Networks. *Science*, 286/5439: 509–12. DOI: 10.1126/science.286.5439.509

Borck, L., Mills, B. J., Peeples, M. A., & Clark, J. J. (*this issue*). Are Social Networks Survival Networks? An Example from the Late Prehispanic U.S. Southwest. *Journal of Archaeological Method and Theory*.

Freeman, L. C. (1977). A Set of Measures of Centrality Based on Betweenness. *Sociometry*, 40/1: 35–41.

Freeman, L. C. (1979). Centrality in Social Networks. I. Conceptual Clarification. *Social Networks*, 1: 215–39.

Granovetter, M. (1973). The strength of weak ties. *American Journal of Sociology*, *78*(6), 1360–1380.

Granovetter, M. (1985). Economic action and social structure: The problem of embeddedness. *The American Journal of Sociology, 91*(3), 481–510.

Hess, M. (2004). “Spatial” relationships? Towards a reconceptualization of embeddedness. *Progress in Human Geography, 28*(2), 165–186.

Lorrain, F., & White, H. C. (1971). Structural equivalence of individuals in social networks. *Journal of Mathematical Sociology*, 1: 49–80.

Newman, M. E. J. (2010). *Networks: An introduction*. Oxford: Oxford University Press.

Polanyi, K. (1944). *The great transformation. The political and economic origins of our time*. Boston: Beacon Press.

Wasserman, S., & Faust, K. (1994). *Social network analysis: methods and applications*. Cambridge: Cambridge University Press.

Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of “small-world” networks. *Nature*, *393*(6684), 440–2. doi:10.1038/30918

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks. I. Blockmodels of roles and positions. *American Journal of Sociology*, 81/4: 730–79.

## External resources

There are some great resources available on other websites. Definitely check out the following:

### Tutorials

First steps on Historical Network Research

Agent-based modelling tutorials on Simulating Complexity

### Software

Tools list on Historical Network Research

Resources list on Netplexity

### Related projects and blogs

Electric Archaeology

Hestia Project

Historical Network Research

Netplexity

Réseaux et Histoire

Simulating Complexity

Six Degrees of Spaghetti Monsters

Recorded presentations from the Historical Network Research events.