3 - Landscape graph construction and analysis with Graphab and graph4lg

Introduction

The rationale of graph4lg package in R is to make easier the construction and analysis of genetic and landscape graphs in landscape genetic studies (hence the name graph4lg, meaning Graphs for Landscape Genetics). This package provides users with tools for:

  • Landscape and genetic data processing
  • Genetic graph construction and analysis
  • Landscape graph construction and analysis
  • Landscape and genetic graph comparisons

Each one of the included tutorials focuses on one of these points. This third tutorial will focus on landscape graph construction and analysis. It will describe the package functions allowing users to:

  • Create a landscape graph
  • Analyse the landscape graph
  • Import the landscape graph to perform landscape genetic analyses

NOTA BENE: The package graph4lg integrates functions making possible the construction and analysis of landscape graphs but most of them are only wrappers that launch computations with Graphab software.

What is Graphab?

Graphab is an open source software tool dedicated to the modelling of landscape networks (Foltête, Clauzel, and Vuidel 2012). It was developed in ThéMA laboratory (Besançon, France) to make possible the creation of landscape graphs and to integrate a complete set of connectivity analysis functions in a single application.

Integrating Graphab functionalities into graph4lg aims at facilitating the import of Graphab output into R for subsequent analyses, including the comparison of genetic graphs and landscape graphs. All the functions calling Graphab software are described in graph4lg help files. However, users are invited to read the complete documentation manuals as well as other resources on Graphab project website. Some computations are not possible when using graph4lg but are possible when using directly Graphab software which can be downloaded for free on the same website. Foltête, Clauzel, and Vuidel (2012) paper also provides users with a synthetic description of the tool.

The function get_graphab from graph4lg package downloads automatically Graphab in the user’s machine, provided Java is installed on the machine. If it has already been downloaded, a message is displayed. Once Graphab has been downloaded in a machine, it is not necessary to download it again.

get_graphab()

In the table below, we present all the functions from grapha4lg calling Graphab software:

Function Description
get_graphab Check if Graphab software (graphab-2.6.jar) is in the user’s machine and downloads it if necessary, provided Java is installed.
graphab_project Creates a Graphab project from a raster file (.tif)
graphab_link Creates a set of links (Euclidean or least cost paths, planar or complete)
graphab_graph Creates a graph from a set of links
graphab_metric Computes graph-theoretic metrics from a graph
graphab_modul Partition the graph into modules
graphab_pointset Imports a set of points in the project and relates them to habitat patches and corresponding metrics
get_graphab_metric Imports a table with the metrics computed and the patch properties
get_graphab_linkset Imports a table with the link properties for a given link set
graphab_to_igraph Imports a graph created with Graphab as an igraph graph object with node and link attributes

Data used in this tutorial

Here, we do not rely on genetic data sets and will only use Graphab projects already created for the vignettes.

Landscape graph construction

Project creation

When using Graphab, input data consists of an 8-bit raster file (.tif) with discrete cell values. One (or several) of these values will correspond to habitat cells on the raster. Contiguous habitat cells will form habitat patches (8-neighbourhood criterion), as soon as they exceed the minimum patch area when specified

This is the first step of habitat network modelling with Graphab. At this stage, a project is created. This project is given a name (proj_name), which will be the name of a directory containing geographical information layers relative to the project as well as a .xml file (“proj_name.xml”), which is the main project file.

The function graphab_project creates the project directory called proj_name either in the current working directory (default) or in the directory whose path is given as a proj_path argument. The function takes as arguments:

  • proj_name: Name of the project (project and .xml file directory)
  • raster: Path to the raster file on which habitat patches are identified
  • habitat: Cell value(s) corresponding to habitat patches (possibly a vector with several values (e.g. c(1,2))
  • minarea (optional): Minimum habitat patch size in hectares
  • nodata (optional): Cell value corresponding to no data values, thereafter ignored
  • alloc_ram (optional): Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with user’s machine settings
  • proj_path(optional): Path to the directory that contains the project directory. It should be used when the project directory is not in the current working directory. Default is NULL. When 'proj_path = NULL', the project directory is equal to getwd().

The two latter arguments are included in almost every function calling Graphab.

As an example, we will create a landscape graph from a simulated landscape raster with 5 land use categories (see map below).

load(file = paste0(system.file('extdata', package = 'graph4lg'), 
                               "/", "rast_simul50.RDa"))

r.spdf <- as(rast, "SpatialPixelsDataFrame")
#> Registered S3 method overwritten by 'proxy':
#>   method   from   
#>   dim.dist ecodist
r.df <- as.data.frame(r.spdf)

r.df$layer <- as.factor(r.df$rast_simul50)

g <- ggplot(r.df, aes(x=x, y=y)) + geom_tile(aes(fill = layer)) + coord_equal()+
  theme_bw()+
  #scale_fill_brewer(palette="Dark2")+
  scale_fill_manual(values = c("#396D35", "#FB9013", "#EDC951", "#80C342", "black", "#396D35"),
                    labels = c("0 - Forest", "1 - Shrublands", "2 - Crops", 
                               "3 - Grasslands","4 - Artificial areas", "5 - Forest"),
                    name = "Land use type")+
  labs(x="Longitude",y="Latitude")
g

Habitat patches will correspond to contiguous patches of raster cells equal to 0 or 5, as soon as they are larger than 200 hectares. They correspond to forests.

proj_name <- "graphab_example"

graphab_project(proj_name = proj_name,
                raster = "rast_simul50.tif",
                habitat = c(0, 5),
                minarea = 200)

Graph construction

We can now create a landscape graph with graphab_graph. This function takes as arguments:

  • proj_name: Name of the project
  • linkset: Name of the link set used to create the graph
  • name: A character string indicating the name of the graph created, if only one is created
  • thr: An integer or numeric value indicating the maximum distance associated with the links of the created graph. It allows users to create a pruned graph based on a distance threshold. Note that when the link set used has a planar topology, the graph is necessarily a pruned graph (not complete) and adding this threshold parameter can remove other links. When the link set has been created with cost-distances, the parameter is expressed in cost-distance units whereas when the link set is based upon Euclidean distances, the parameter is expressed in meters.
  • cost_conv: Logical (TRUE or FALSE) indicating whether numeric thr values are converted from cost-distance into Euclidean distance using a log-log linear regression. See also convert_cd function presented in the first tutorial.

proj_name is the only mandatory argument. Without any other argument, a non-thresholded graph is created for every link set present in the project. Names are created automatically.

In our example, we can create a planar graph with the following command:

graphab_graph(proj_name = proj_name,
              linkset = "lkst1",
              name = "graph")

Landscape graph analysis

Once the graph has been created, it can be analysed by computing graph-theoretic connectivity metrics or by partitioning its nodes.

Metric calculation

Many connectivity metrics can be computed from a landscape graph (see Baranyi et al. (2011) and Rayfield, Fortin, and Fall (2011) among other reviews on the subject). Graphab software includes a large range of connectivity metrics and its manual provides users with a comprehensive description of every one of them (see Graphab 2.6 manual).

The function graphab_metric computes these metrics. It takes as arguments:

  • proj_name: Name of the project

  • graph: Name of the graph on which the metric is computed

  • metric: Name of the metric among:

    • Probability of Connectivity ("PC"),
    • Integral Index of Connectivity ("IIC"),
    • Flux ("F"),
    • Interaction Flux ("IF"),
    • Degree ("Dg"),
    • Clustering Coefficient ("CCe"),
    • Current Flow ("CF")
    • delta Probability of Connectivity ("dPC").
  • dist: A numeric or integer value specifying the distance at which dispersal probability is equal to prob. This argument is mandatory for weighted metrics (PC, F, IF, BC, dPC, CCe, CF) but not used for others. It is used to set α for computing dispersal probabilities associated with all inter-patch distances such that dispersal probability between patches i and j is pij = eαdij.

  • prob: A numeric or integer value specifying the dispersal probability at distance dist. By default, code=0.05. It is used to set α (see argument dist above).

  • beta: A numeric or integer value between 0 and 1 specifying the exponent associated with patch areas in the computation of metrics weighted by patch area. By default, beta=1. When beta=0, patch areas do not have any influence in the computation.

  • cost_conv: Logical (TRUE or FALSE) indicating whether numeric dist values are converted from cost-distance into Euclidean distance using a log-log linear regression.

  • return_val: Logical (default = TRUE) indicating whether metric values are returned in R (TRUE) or only stored in the patch attribute layer (FALSE)

Metrics fall into different categories. PC and IIC are global metrics and take only one value for the entire graph, which is returned in R environment when return_val=TRUE.

The other metrics are computed at the node level. When return_val=TRUE, a data.frame is returned in R environment specifying the value of the metric for every graph node.

The description of every metric is beyond the scope of this tutorial. We again invite users to read the help file (?graphab_metric) as well as the Graphab 2.6 user manual.

We will compute metrics on the graph "graph" in our example. First, we will compute the probability of connectivity at the global level. We set the dist and prob parameter such that: p(10km) = eα × dij = 0.05. We convert 10 km in cost-distance units for the computation.

# Global metric: PC
graphab_metric(proj_name = proj_name,
               graph = "graph",
               metric = "PC",
               dist = 10000,
               prob = 0.05,
               beta = 1,
               cost_conv = TRUE)
#> [1] "Project : graphab_example"    "Graph : graph"               
#> [3] "Metric : PC"                  "Dist : 10000"                
#> [5] "Prob : 0.05"                  "Beta : 1"                    
#> [7] "Value : 0.000903573300776615"

We obtain the value in an object of class list.

Now, using the same parameters, we compute the local metric Flux.

f <- graphab_metric(proj_name = proj_name,
               graph = "graph",
               metric = "F",
               dist = 10000,
               prob = 0.05,
               beta = 1,
               cost_conv = FALSE)
#> [[1]]
#> [1] "Project : graphab_example" "Graph : graph"            
#> [3] "Metric : F"                "Dist : 10000"             
#> [5] "Prob : 0.05"               "Beta : 1"
#>   Id F_d1889.4356657600988_p0.05_beta1_graph
#> 1  1                              26657076.8
#> 2  2                              16294717.6
#> 3  3                               2621667.5
#> 4  4                                705995.2
#> 5  5                              18215710.3
#> 6  6                              21918505.4

Every time a local metric is computed, it can be returned in R environment (here stored in f) but is also stored in the habitat patch attribute table, such that at the end this table contains every metric computed. We will see later that it can be very useful.

Graph partitioning

Another way to analyse landscape graph connectivity and topology is to make a partitioning. The principle is the same as that of modularity analyses of genetic graphs described in the second tutorial.

graphab_modul function carries out such an analysis, but before we perform some modifications in the current code, we recommend doing this analysis directly in Graphab software.

For the moment, it only creates a shapefile polygon layer with the Voronoi polygons corresponding to the groups of patches forming modules.

It takes as arguments:

  • proj_name: Name of the project
  • graph: Name of the graph on which the partition is performed
  • dist: A numeric or integer value specifying the distance at which dispersal probability is equal to prob. This argument is used to weight the links when modularity is computed.
  • prob: A numeric or integer value specifying the dispersal probability at distance dist. By default, code=0.05. It is used to set α (see argument dist above).
  • beta: A numeric or integer value between 0 and 1 specifying the exponent associated with patch areas in the computation of the modularity. By default, beta=1. When beta=0, patch areas do not have any influence in the computation.
  • nb: Optional argument indicating the number of modules to create. By default, the number of modules maximising the modularity index is used.

For example:

graphab_modul(proj_name = proj_name,
              graph = "graph",
              dist = 10000,
              prob = 0.05,
              beta = 1)

NOTA BENE: All these analyses can be performed for several graphs created in the same project, provided their name is correctly indicated in the function arguments.

Using landscape graphs in landscape genetic analyses

Landscape graphs have been created and analysed by computing connectivity metrics and/or by partitioning them. The output of these analyses can be used in landscape genetic analyses, e.g. to assess the relationship between genetic variables measured in populations inhabiting the study landscape and the connectivity of its habitat patches. To do that, two approaches are here presented:

  • 1 - Import external point data in Graphab project to link every point to the nearest habitat patch
  • 2 - Import Graphab output in R environment to use it in subsequent (statistical) analyses

Point set import

Imagine that we have sampled several populations in the forests of the study landscape. We want to know in which landscape graph node lives every population, to get the corresponding connectivity metrics and compare them with the genetic response.

To do that, the function graphab_pointset imports a set of points into the Graphab project. It takes as arguments:

  • proj_name: Name of the project

  • linkset: Name of the link set used to compute the distance from each point to the nearest patch (in cost-distance or Euclidean distance units depending on the way the link set was created)

  • pointset: it can be either:

    • A character string indicating the path (absolute or relative) to a shapefile point layer
    • A character string indicating the path to a .csv file with three columns: ID, x and y, respectively indicating the point ID, longitude and latitude
    • A data.frame with three columns: ID, x and y, respectively indicating the point ID, longitude and latitude.
    • A SpatialPointsDataFrame

Point coordinates must be in the same coordinate reference system as the habitat patches (and initial raster layer). The function returns a data.frame with the properties of the nearest patch to every point in the point set, as well as the distance from each point to the nearest patch.

For example, we can add the point data.frame pts_pop_simul to the project with the following command.

# Point data frame
head(pts_pop_simul)
#>   ID     x     y
#> 0  1 25650 54150
#> 1  2 22550 53650
#> 2  3  7250 51750
#> 3  4 32350 50550
#> 4  5 27350 49650
#> 5  6 15950 49550
graphab_pointset(proj_name = proj_name,
                 linkset = "lkst1",
                 pointset = pts_pop_simul)
#>   Nearest_patch_ID            Point_ID Dist_to_patch    Area Perim Capacity
#> 1                9  file15143e6c31ff.2             0 5250000 19200  5250000
#> 2               11  file15143e6c31ff.1             0 4730000 15800  4730000
#> 3               14 file15143e6c31ff.10           121 3750000 19200  3750000
#> 4               17  file15143e6c31ff.6             0 3120000 12600  3120000
#> 5               18  file15143e6c31ff.4          1065 3660000 15000  3660000
#> 6               18  file15143e6c31ff.5             0 3660000 15000  3660000
#>   F_d1889.4356657600988_p0.05_beta1_graph
#> 1                                 6430186
#> 2                                 5682508
#> 3                                 6774869
#> 4                                11837650
#> 5                                 2221650
#> 6                                 2221650

Note that when pointset is not the path to a shapefile point layer, such a layer will be created in the temp files of the user’s machine. It is therefore preferable to give the path to a shapefile layer as a pointset argument.

Landscape graph import

Finally, users can also create a graph from a Graphab link set and convert it into a graph object of class igraph. The imported graph has weighted links. Nodes attributes present in the Graphab project are included, including connectivity metrics when computed. A figure can be displayed automatically representing the graph on a map. The graph is given the topology of the selected link set.

The function graphab_to_igraph takes as arguments:

  • proj_name: Name of the project
  • linkset: Name of the link set used to compute the distance from each point to the nearest patch (in cost-distance or Euclidean distance units depending on the way the link set was created)
  • nodes: A character string indicating whether the nodes of the created graph are given all the attributes (nodes="patches", default) or metrics computed in Graphab or only those specific to a given graph previously created (nodes="graph name").
  • weight: A character string (weight="euclid" or weight="cost") indicating whether to weight the links with Euclidean distance or cost-distance (default) values.
  • fig: Logical (TRUE or FALSE) indicating whether to plot a figure of the resulting spatial graph with plot_graph_lg.
  • crds: Logical (TRUE or FALSE) indicating whether to create an object of class data.frame with the node centroid spatial coordinates.
land_graph <- graphab_to_igraph(proj_name = proj_name,
                                linkset = "lkst1",
                                nodes = "patches",
                                weight = "cost",
                                fig = TRUE,
                                crds = TRUE)

crds_patches <- land_graph[[2]]
land_graph <- land_graph[[1]]

The function returns a list of two objects:

  • The graph in itself (land_graph[[1]])
  • The patch centroid coordinates (land_graph[[2]])

This graph can be plotted on a map with plot_graph_lg with node sizes proportional to habitat patch area and link width inversely proportional to cost-distances:

plot_graph_lg(land_graph,
              crds = crds_patches,
              mode = "spatial",
              node_size = "Area")
#> This graph was created by an old(er) igraph version.
#> ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
#> For now we convert it on the fly...

Conclusion

We have seen in this third tutorial how to construct, analyse and import landscape graphs with Graphab and graph4lg. In the next and last tutorial, we will see how to compare genetic graphs and landscape graphs.

References

Baranyi, Gabriella, Santiago Saura, János Podani, and Ferenc Jordán. 2011. “Contribution of Habitat Patches to Network Connectivity: Redundancy and Uniqueness of Topological Indices.” Ecological Indicators 11 (5): 1301–10.
Foltête, Jean-Christophe, Céline Clauzel, and Gilles Vuidel. 2012. “A Software Tool Dedicated to the Modelling of Landscape Networks.” Environmental Modelling & Software 38: 316–27.
Rayfield, Bronwyn, Marie-Josée Fortin, and Andrew Fall. 2011. “Connectivity for Conservation: A Framework to Classify Network Measures.” Ecology 92 (4): 847–58.