graph4lg
The rationale of graph4lg
package in R is to make easier
the construction and analysis of genetic and landscape graphs in
landscape genetic studies (hence the name graph4lg
, meaning
Graphs for Landscape Genetics). This package provides users with tools
for:
Each one of the included tutorials focuses on one of these points. This third tutorial will focus on landscape graph construction and analysis. It will describe the package functions allowing users to:
NOTA BENE: The package graph4lg
integrates functions making possible the construction and analysis of
landscape graphs but most of them are only wrappers that launch
computations with Graphab software.
Graphab is an open source software tool dedicated to the modelling of landscape networks (Foltête, Clauzel, and Vuidel 2012). It was developed in ThéMA laboratory (Besançon, France) to make possible the creation of landscape graphs and to integrate a complete set of connectivity analysis functions in a single application.
Integrating Graphab functionalities into graph4lg
aims
at facilitating the import of Graphab output into R for subsequent
analyses, including the comparison of genetic graphs and landscape
graphs. All the functions calling Graphab software are described in
graph4lg
help files. However, users are invited to read the
complete documentation manuals as well as other
resources on Graphab
project website. Some computations are not possible when
using graph4lg
but are possible when using directly Graphab
software which can be downloaded for free on the same website. Foltête, Clauzel, and Vuidel (2012) paper also
provides users with a synthetic description of the tool.
The function get_graphab
from graph4lg
package downloads automatically Graphab in the user’s
machine, provided Java is installed on the machine. If it has already
been downloaded, a message is displayed. Once Graphab has been
downloaded in a machine, it is not necessary to download it again.
In the table below, we present all the functions from
grapha4lg
calling Graphab software:
Function | Description |
---|---|
get_graphab |
Check if Graphab software (graphab-2.6.jar ) is in the
user’s machine and downloads it if necessary, provided Java is
installed. |
graphab_project |
Creates a Graphab project from a raster file (.tif) |
graphab_link |
Creates a set of links (Euclidean or least cost paths, planar or complete) |
graphab_graph |
Creates a graph from a set of links |
graphab_metric |
Computes graph-theoretic metrics from a graph |
graphab_modul |
Partition the graph into modules |
graphab_pointset |
Imports a set of points in the project and relates them to habitat patches and corresponding metrics |
get_graphab_metric |
Imports a table with the metrics computed and the patch properties |
get_graphab_linkset |
Imports a table with the link properties for a given link set |
graphab_to_igraph |
Imports a graph created with Graphab as an igraph graph
object with node and link attributes |
Here, we do not rely on genetic data sets and will only use Graphab projects already created for the vignettes.
When using Graphab, input data consists of an 8-bit raster file (.tif) with discrete cell values. One (or several) of these values will correspond to habitat cells on the raster. Contiguous habitat cells will form habitat patches (8-neighbourhood criterion), as soon as they exceed the minimum patch area when specified
This is the first step of habitat network modelling with Graphab. At
this stage, a project is created. This project is given
a name (proj_name
), which will be the name of a directory
containing geographical information layers relative to the project as
well as a .xml file (“proj_name.xml”), which is the main project
file.
The function graphab_project
creates the project
directory called proj_name
either in the current working
directory (default) or in the directory whose path is given as a
proj_path
argument. The function takes as arguments:
proj_name
: Name of the project (project and .xml file
directory)raster
: Path to the raster file on which habitat
patches are identifiedhabitat
: Cell value(s) corresponding to habitat patches
(possibly a vector with several values (e.g. c(1,2
))minarea
(optional): Minimum habitat patch size in
hectaresnodata
(optional): Cell value corresponding to no data
values, thereafter ignoredalloc_ram
(optional): Integer or numeric value
indicating RAM gigabytes allocated to the java process. Increasing this
value can speed up the computations. Too large values may not be
compatible with user’s machine settingsproj_path
(optional): Path to the directory that
contains the project directory. It should be used when the project
directory is not in the current working directory. Default is
NULL
. When 'proj_path = NULL'
, the project
directory is equal to getwd()
.The two latter arguments are included in almost every function calling Graphab.
As an example, we will create a landscape graph from a simulated landscape raster with 5 land use categories (see map below).
load(file = paste0(system.file('extdata', package = 'graph4lg'),
"/", "rast_simul50.RDa"))
r.spdf <- as(rast, "SpatialPixelsDataFrame")
#> Registered S3 method overwritten by 'proxy':
#> method from
#> dim.dist ecodist
r.df <- as.data.frame(r.spdf)
r.df$layer <- as.factor(r.df$rast_simul50)
g <- ggplot(r.df, aes(x=x, y=y)) + geom_tile(aes(fill = layer)) + coord_equal()+
theme_bw()+
#scale_fill_brewer(palette="Dark2")+
scale_fill_manual(values = c("#396D35", "#FB9013", "#EDC951", "#80C342", "black", "#396D35"),
labels = c("0 - Forest", "1 - Shrublands", "2 - Crops",
"3 - Grasslands","4 - Artificial areas", "5 - Forest"),
name = "Land use type")+
labs(x="Longitude",y="Latitude")
g
Habitat patches will correspond to contiguous patches of raster cells equal to 0 or 5, as soon as they are larger than 200 hectares. They correspond to forests.
The project has been created and we will now create a link
set between the habitat patches. We will consider least
cost paths between habitat patches on the raster. To that
purpose, we need to create a data.frame
specifying the
cost values associated with every raster cell value. In
the following example, these values are:
Code | Description | Cost value |
---|---|---|
0 | Forests (habitat) | 1 |
1 | Shrublands | 5 |
2 | Crops | 60 |
3 | Grasslands | 40 |
4 | Artificial areas | 1000 |
5 | Forest (habitat) | 1 |
They are stored into the cost
object
(data.frame
):
cost <- data.frame(code = 0:5,
cost = c(1, 5, 60, 40, 1000, 1))
cost
#> code cost
#> 1 0 1
#> 2 1 5
#> 3 2 60
#> 4 3 40
#> 5 4 1000
#> 6 5 1
The link set is created with graphab_link
function with
the following arguments:
proj_name
: Name of the projectdistance
: Type of distance computed. By default,
distance="cost"
, meaning that cost
distances are computed between patches according the cost
values specified in cost
data.frame. Alternatively,
distance="euclid"
, meaning that straight line
Euclidean distances are computed.name
: A character string indicating the name of link
setcost
: A data.frame
with the cost values
(cost
in our example)topo
: A character string indicating the
topology of the created link set. It can be
planar (topo='planar'
(default)). It
speeds up the computation but will prevent from creating complete graphs
with graphab_graph
, or it can be complete
(topo='complete'
).See more details in the help file (?graphab_link
).
In our example, we create a planar link set in cost-distances:
graphab_link(proj_name = proj_name,
distance = "cost",
cost = cost,
name = "lkst1",
topo = "planar")
NOTA BENE: Several link sets can be created in the same project.
We can now create a landscape graph with
graphab_graph
. This function takes as arguments:
proj_name
: Name of the projectlinkset
: Name of the link set used to create the
graphname
: A character string indicating the name of the
graph created, if only one is createdthr
: An integer or numeric value indicating the
maximum distance associated with the links of the created
graph. It allows users to create a pruned
graph based on a distance threshold. Note that when the link
set used has a planar topology, the graph is necessarily a pruned graph
(not complete) and adding this threshold parameter can remove other
links. When the link set has been created with cost-distances, the
parameter is expressed in cost-distance units whereas when the link set
is based upon Euclidean distances, the parameter is expressed in
meters.cost_conv
: Logical (TRUE
or
FALSE
) indicating whether numeric thr
values
are converted from cost-distance into Euclidean
distance using a log-log linear regression. See also
convert_cd
function presented in the first tutorial.proj_name
is the only mandatory
argument. Without any other argument, a non-thresholded graph
is created for every link set present in the project. Names are created
automatically.
In our example, we can create a planar graph with the following command:
Once the graph has been created, it can be analysed by computing graph-theoretic connectivity metrics or by partitioning its nodes.
Many connectivity metrics can be computed from a landscape graph (see Baranyi et al. (2011) and Rayfield, Fortin, and Fall (2011) among other reviews on the subject). Graphab software includes a large range of connectivity metrics and its manual provides users with a comprehensive description of every one of them (see Graphab 2.6 manual).
The function graphab_metric
computes these metrics. It
takes as arguments:
proj_name
: Name of the project
graph
: Name of the graph on which the metric is
computed
metric
: Name of the metric among:
"PC"
),"IIC"
),"F"
),"IF"
),"Dg"
),"CCe"
),"CF"
)"dPC"
).dist
: A numeric or integer value specifying the
distance at which dispersal probability is equal to prob
.
This argument is mandatory for weighted metrics (PC, F, IF, BC, dPC,
CCe, CF) but not used for others. It is used to set α for computing dispersal
probabilities associated with all inter-patch distances such that
dispersal probability between patches i and j is pij = e−αdij.
prob
: A numeric or integer value specifying the
dispersal probability at distance dist
. By default,
code=0.05
. It is used to set α (see argument dist
above).
beta
: A numeric or integer value between 0 and 1
specifying the exponent associated with patch areas in the computation
of metrics weighted by patch area. By default, beta=1
. When
beta=0
, patch areas do not have any influence in the
computation.
cost_conv
: Logical (TRUE
or
FALSE
) indicating whether numeric dist
values
are converted from cost-distance into Euclidean distance using a log-log
linear regression.
return_val
: Logical (default = TRUE
)
indicating whether metric values are returned in R (TRUE
)
or only stored in the patch attribute layer
(FALSE
)
Metrics fall into different categories. PC and IIC are global
metrics and take only one value for the entire graph, which is
returned in R environment when return_val=TRUE
.
The other metrics are computed at the node level.
When return_val=TRUE
, a data.frame
is returned
in R environment specifying the value of the metric for every graph
node.
The description of every metric is beyond the scope of this tutorial.
We again invite users to read the help file
(?graphab_metric
) as well as the Graphab
2.6 user manual.
We will compute metrics on the graph "graph"
in our
example. First, we will compute the probability of connectivity at the
global level. We set the dist
and prob
parameter such that: p(10km) = e−α × dij = 0.05.
We convert 10 km in cost-distance units for the computation.
# Global metric: PC
graphab_metric(proj_name = proj_name,
graph = "graph",
metric = "PC",
dist = 10000,
prob = 0.05,
beta = 1,
cost_conv = TRUE)
#> [1] "Project : graphab_example" "Graph : graph"
#> [3] "Metric : PC" "Dist : 10000"
#> [5] "Prob : 0.05" "Beta : 1"
#> [7] "Value : 0.000903573300776615"
We obtain the value in an object of class list
.
Now, using the same parameters, we compute the local metric Flux.
f <- graphab_metric(proj_name = proj_name,
graph = "graph",
metric = "F",
dist = 10000,
prob = 0.05,
beta = 1,
cost_conv = FALSE)
#> [[1]]
#> [1] "Project : graphab_example" "Graph : graph"
#> [3] "Metric : F" "Dist : 10000"
#> [5] "Prob : 0.05" "Beta : 1"
#> Id F_d1889.4356657600988_p0.05_beta1_graph
#> 1 1 26657076.8
#> 2 2 16294717.6
#> 3 3 2621667.5
#> 4 4 705995.2
#> 5 5 18215710.3
#> 6 6 21918505.4
Every time a local metric is computed, it can be returned in R
environment (here stored in f
) but is also stored in the
habitat patch attribute table, such that at the end this table contains
every metric computed. We will see later that it can be very useful.
Another way to analyse landscape graph connectivity and topology is to make a partitioning. The principle is the same as that of modularity analyses of genetic graphs described in the second tutorial.
graphab_modul
function carries out such an analysis, but
before we perform some modifications in the current code, we recommend
doing this analysis directly in Graphab software.
For the moment, it only creates a shapefile polygon layer with the Voronoi polygons corresponding to the groups of patches forming modules.
It takes as arguments:
proj_name
: Name of the projectgraph
: Name of the graph on which the partition is
performeddist
: A numeric or integer value specifying the
distance at which dispersal probability is equal to prob
.
This argument is used to weight the links when modularity is
computed.prob
: A numeric or integer value specifying the
dispersal probability at distance dist
. By default,
code=0.05
. It is used to set α (see argument dist
above).beta
: A numeric or integer value between 0 and 1
specifying the exponent associated with patch areas in the computation
of the modularity. By default, beta=1
. When
beta=0
, patch areas do not have any influence in the
computation.nb
: Optional argument indicating the number of modules
to create. By default, the number of modules maximising the modularity
index is used.For example:
NOTA BENE: All these analyses can be performed for several graphs created in the same project, provided their name is correctly indicated in the function arguments.
Landscape graphs have been created and analysed by computing connectivity metrics and/or by partitioning them. The output of these analyses can be used in landscape genetic analyses, e.g. to assess the relationship between genetic variables measured in populations inhabiting the study landscape and the connectivity of its habitat patches. To do that, two approaches are here presented:
Imagine that we have sampled several populations in the forests of the study landscape. We want to know in which landscape graph node lives every population, to get the corresponding connectivity metrics and compare them with the genetic response.
To do that, the function graphab_pointset
imports a set
of points into the Graphab project. It takes as arguments:
proj_name
: Name of the project
linkset
: Name of the link set used to compute the
distance from each point to the nearest patch (in cost-distance or
Euclidean distance units depending on the way the link set was
created)
pointset
: it can be either:
data.frame
with three columns: ID, x and y,
respectively indicating the point ID, longitude and latitude.SpatialPointsDataFrame
Point coordinates must be in the same coordinate reference system as
the habitat patches (and initial raster layer). The function returns a
data.frame
with the properties of the nearest patch to
every point in the point set, as well as the distance from each point to
the nearest patch.
For example, we can add the point data.frame
pts_pop_simul
to the project with the following
command.
# Point data frame
head(pts_pop_simul)
#> ID x y
#> 0 1 25650 54150
#> 1 2 22550 53650
#> 2 3 7250 51750
#> 3 4 32350 50550
#> 4 5 27350 49650
#> 5 6 15950 49550
#> Nearest_patch_ID Point_ID Dist_to_patch Area Perim Capacity
#> 1 9 file15143e6c31ff.2 0 5250000 19200 5250000
#> 2 11 file15143e6c31ff.1 0 4730000 15800 4730000
#> 3 14 file15143e6c31ff.10 121 3750000 19200 3750000
#> 4 17 file15143e6c31ff.6 0 3120000 12600 3120000
#> 5 18 file15143e6c31ff.4 1065 3660000 15000 3660000
#> 6 18 file15143e6c31ff.5 0 3660000 15000 3660000
#> F_d1889.4356657600988_p0.05_beta1_graph
#> 1 6430186
#> 2 5682508
#> 3 6774869
#> 4 11837650
#> 5 2221650
#> 6 2221650
Note that when pointset
is not the path to a shapefile
point layer, such a layer will be created in the temp files of the
user’s machine. It is therefore preferable to give the path to a
shapefile layer as a pointset
argument.
Users can also import in R environment the link set properties in a
data.frame
with an edge list format with the function
get_graphab_linkset
which takes as arguments the project
name and the link set name.
#> Id ID1 ID2 Dist DistM
#> 1 6-1 6 1 956.73717 5704.1631
#> 2 7-1 7 1 82.24264 641.4214
#> 3 2-1 2 1 206.00000 900.0000
#> 4 7-2 7 2 313.62742 3679.8990
#> 5 9-2 9 2 1508.91084 6311.2698
#> 6 8-3 8 3 122.82843 1982.8427
Similarly, users can import the table in which are stored all the
metrics computed in the project as well as the node properties
(including their areas). The function get_graphab_metric
takes as argument the project name.
#> Id Area Perim Capacity F_d1889.4356657600988_p0.05_beta1_graph
#> 1 1 2610000 13000 2610000 26657076.8
#> 2 2 8800000 23000 8800000 16294717.6
#> 3 3 3270000 10800 3270000 2621667.5
#> 4 4 5330000 18800 5330000 705995.2
#> 5 5 6580000 18200 6580000 18215710.3
#> 6 6 2620000 8800 2620000 21918505.4
Finally, users can also create a graph from a Graphab link set and
convert it into a graph object of class igraph
. The
imported graph has weighted links. Nodes attributes present in the
Graphab project are included, including connectivity metrics when
computed. A figure can be displayed automatically representing the graph
on a map. The graph is given the topology of the selected link set.
The function graphab_to_igraph
takes as arguments:
proj_name
: Name of the projectlinkset
: Name of the link set used to compute the
distance from each point to the nearest patch (in cost-distance or
Euclidean distance units depending on the way the link set was
created)nodes
: A character string indicating whether the nodes
of the created graph are given all the attributes
(nodes="patches"
, default) or metrics computed in Graphab
or only those specific to a given graph previously created
(nodes="graph name"
).weight
: A character string
(weight="euclid"
or weight="cost"
) indicating
whether to weight the links with Euclidean distance or cost-distance
(default) values.fig
: Logical (TRUE
or FALSE
)
indicating whether to plot a figure of the resulting spatial graph with
plot_graph_lg
.crds
: Logical (TRUE
or FALSE
)
indicating whether to create an object of class data.frame
with the node centroid spatial coordinates.land_graph <- graphab_to_igraph(proj_name = proj_name,
linkset = "lkst1",
nodes = "patches",
weight = "cost",
fig = TRUE,
crds = TRUE)
crds_patches <- land_graph[[2]]
land_graph <- land_graph[[1]]
The function returns a list of two objects:
land_graph[[1]]
)land_graph[[2]]
)This graph can be plotted on a map with plot_graph_lg
with node sizes proportional to habitat patch area and link width
inversely proportional to cost-distances:
We have seen in this third tutorial how to construct, analyse and
import landscape graphs with Graphab and graph4lg
. In the
next and last tutorial, we will see how to compare genetic
graphs and landscape graphs.