Model family | Assumptions | Data/process | ‘Limitation’ | Network type | Key reference |
---|---|---|---|---|---|
null | Links are randomly distributed within a network | parameter assumptions, species agnostic | structural network | ||
neutral | Network structure is random, but species abundance determines links between nodes | abundance | parameter assumptions | structural network | Canard et al. (2012) |
resource | Networks are interval, species can be ordered on a ‘niche axis’ | parameter assumptions, species agnostic | structural network | Williams and Martinez (2008) | |
generative | Networks are determined by their structural features | need real world networks | structural network | ||
energetic | Interactions are determined by energetic costs | abundance + energy | does not account for forbidden links in terms of evolutionary compatibility | ‘energy’ network | |
graph embedding | Interactions can be predicted from the latent traits of networks | evolutionary compatibility | need real world networks | metaweb | Strydom et al. (2023) |
trait matching | Interactions can be inferred by a mechanistic framework/relationships | evolutionary compatibility | well studied species/communities | metaweb | Morales-Castilla et al. (2015) |
binary classifiers | Interactions can be predicted by learning the relationship between interactions and ecologically relevant predictors | evolutionary compatibility | need real world networks | metaweb | Pichler et al. (2020) |
expert knowledge | ‘Boots on the ground’ ecological knowledge and observations | evolutionary compatibility | well studied species/communities | metaweb | |
data scavenging | Webscraping to create networks from online databases | need real world networks | metaweb | Poisot, Gravel, et al. (2016) (if you squint?) | |
co-occurrence | co-occurrence patterns arise from interactions so we can use these patterns to reverse engineer the interactions | co-occurrence | does not account for forbidden links in terms of evolutionary compatibility or account for energy constraints | co-occurrence network |
Topology generators
This broader group of model families share one fundamental feature and that is that they are parametrised (i.e., delimited) by the expected structure of a network i.e., the links between the nodes and how they might be distributed.
Null models
The interactions between species occurs regardless of the identity of the species (i.e., species have no agency) and links are randomly distributed throughout the network. This family of models is often used as a ‘null hypothesis’ to ask questions about network structure (e.g., Banville, Gravel, and Poisot 2023; Strydom, Dalla Riva, and Poisot 2021). Broadly there are two different approaches; Type I (Fortuna and Bascompte 2006), where interactions happen proportionally to connectance and Type II (Bascompte et al. 2003), where interactions happen proportionally to the joint degree of the two species involved.
Neutral models
Can be tied to Hubble’s neutral theory (Hubbell 2001) where it is assumed that the interactions that occur between species are due to the abundance of species within the community (Pomeranz et al. 2019).
Resource models
Based on the idea that networks follow a trophic hierarchy and that network structure can be determined by distributing interactions along single dimension [the “niche axis”; Allesina, Alonso, and Pascual (2008)]. Essentially these models can be viewed as being based on the idea of resource partitioning (niches) along a one-dimensional resource which will result in the standard ‘trophic pyramid’ to ensure that all species can ‘fit’ along this resource (which has strong ties back to the idea of intervality), importantly there is a strong assumption that the resulting network structure is constrained by connectance.
Cascade model (Joel E. Cohen, Briand, and Newman 1990): Much like the name suggests the cascade model rests on the idea that species feed on one another in a hierarchical manner. This rests on the assumption that the links within a network are variably distributed across the network; with the proportion of links decreasing as one moves up the trophic levels (i.e., ‘many’ prey and ‘few’ predators). This is achieved by assigning all species a random rank, this rank will then determine both the predators and prey of that species. A species will have a particular probability of being fed on by any species with a higher ranking than it, this probability is constrained by the specified connectance of the network. Interestingly here ‘species’ are treated as any individual that consume and are consumed by the same ‘species’, i.e., these are not taxonomical species (Joel E. Cohen, Newman, and Steele 1985). The original cascade model has altered to be more ‘generalised’ (Stouffer et al. 2005), which altered the probability distribution of the prey that could be consumed by a species.
Niche models (Williams and Martinez 2000): The niche model introduces the idea that species interactions are based on the ‘feeding niche’ of a species. Broadly, all species are randomly assigned a ‘feeding niche’ range and all species that fall in this range can be consumed by that species (thereby allowing for cannibalism). The niche of each species is randomly assigned and the range of each species’ niche is (in part) constrained by the specified connectance of the network. The niche model has also been modified, although it appears that adding to the ‘complexity’ of the niche model does not improve on its ability to generate a more ecologically ‘correct’ network (Williams and Martinez 2008).
Nested hierarchy model This model attempts to add some concept of phylogenetic identity to the species.(Cattin et al. 2004)
Generative models
(this is maybe a bit of a bold term to use)
Structural representation of the interactions between species based on the mathematical properties of a network e.g., MaxEnt (Banville, Gravel, and Poisot 2023), (maybe) stochastic block (Xie et al. 2017).
Interaction predictors
This broader group of model families can be thought of as only being able to predict interactions but are unable to recover structural features. This is because these models are often concerned with the feasibility of interactions between a species pair (e.g. does species \(a\) have the correct traits to be able to kill and consume species \(b\)) but fail to account if the interaction is being realised ‘in the field’.
Energetic models
Broadly this family of models is rooted in feeding theory and allocates the links between species based on energetics, which predicts the diet of a consumer based on energy intake. This means that the model is focused on predicting not only the number of links in a network but also the arrangement of these links based on the diet breadth of a species. See also DeAngelis, Goldstein, and O’Neill (1975)
In order to construct realised networks models need to incorporate both the feasibility of interactions (i.e., determine the entire diet breadth of a species) as well as then determine which interactions are realised (i.e., incorporate the ‘cost’ of interactions). As far as we are aware there is no model that explicitly accounts for both of these ‘rules’ (although see Olivier et al. (2019)) and rather only account for processes that determine the realisation of an interaction (i.e., abundance, predator choice, or non-trophic interactions). Although the use of allometry i.e., body size (e.g., Valdovinos et al. 2023; Beckerman, Petchey, and Warren 2006; Yodzis and Innes 1992; White et al. 2007) may represent a first step in capturing ‘evolutionary compatibility’ alongside more energy (predator choice) driven processes we still need to account for other traits that determine feeding compatibility (e.g., Van De Walle et al. 2023 show how incorporating prey defensive properties alongside body size improves predictions). As realised networks are are build on the concept of dynamic processes (the abundance of species will always be in flux) these networks are valuable for understanding the behaviour of networks over time or their response to change (Lajaaiti et al. 2024; Delmas et al. 2017; Curtsdotter et al. 2019). However, they are ‘costly’ to construct (requiring data about the entire community as it is the behaviour of the system that determines the behaviour of the part) and also lack the larger diet niche context afforded by metawebs.
DBM (Beckerman, Petchey, and Warren 2006): Feeding ‘rules’ are based on the energetic content of species, which is theoretically and conceptually embedded within the foraging ecology space
ADBM (Petchey et al. 2008): A version of the DBM that adds the idea of body size (allometry) to the equation
Trait hierarchy
Interactions are determined by a series of ‘feeding rules’ that are based on the traits of species, whereby the interaction between a species pair will only occur if all feeding rules are met. These rules are determined on an a priori basis using expert/ecological knowledge to determine the underlying feeding hierarchy by using ecological proxies (see Morales-Castilla et al. 2015 for a more details on the idea of using this approach). What sets this family of models apart from expert knowledge ones is that there is a formalisation of the feeding rules and thus there is some ability to transfer these rules to different communities.
PFIM (Shaw et al. 2024): uses a series of rules for a set of trait categories (such as habitat and body size) to determine if an interaction can feasibly occur between a species pair.
Graph embedding
This family of approaches has been extensively discussed in Strydom et al. (2023). Broadly speaking this family of models uses the structural features of a network for the construction of networks, specifically the latent traits/spaces/features that can be ‘extracted’ during the embedding process. However, these models are still arguably ‘interaction predictors’ and not ‘topology generators’ because they are still linking the process of ‘network construction’ to the prediction/inference of interactions based on the location that species are occupying in the latent spaces.
Transfer learning/RDPG (Strydom et al. 2022): uses a transfer learning framework (specifically using a random dot product graph for embedding) based around the idea that interactions are evolutionarily conserved and that we can use known networks, and phylogenetic relationships, to predict interactions for a given species pool.
Log-ratio (Rohr et al. 2010): Interestingly often used in paleo settings (at least that’s what it currently looks like in my mind… Pires et al. (2020))
Binary classifiers
The task of predicting if an interaction will occur between a species pair is treated as a binary classification task, where the task is to correlate ‘real world’ interaction data with a suitable ecological proxy for which data is more widely available (e.g., traits). Model families often used include generalised linear models (e.g., Caron et al. 2022), random forest (e.g., Llewelyn et al. 2023), trait-based k-NN (e.g., Desjardins-Proulx et al. 2017), and Bayesian models (e.g., Eklöf, Tang, and Allesina 2013; Cirtwill et al. 2019). See Pichler et al. (2020) for a more detailed overview on the performance of machine learning and statistical approaches for inferring trait-feeding relationships.
Expert knowledge
This approach involves having a group of experts come together to assess and assign the likelihood of feeding interactions being able to occur for a specified community. This is done in a pairwise manner where the experts will assign a value of how confident they are that a specific species pair are likely to interact (e.g., Dunne et al. 2008) This has the added advantage that interactions can be scored in a more categorical (or probabilistic) as opposed to binary fashion, e.g., Maiorano et al. (2020) score interactions as either obligate (typical food resources) or occasional (opportunistic feeding) interactions.
Data scavenging
There are also a lot of published pairwise interaction data e.g., the Global Biotic Interactions (GloBI) database (Poelen, Simons, and Mungall 2014) or network e.g., Mangal (Poisot, Baiser, et al. 2016) datasets, these can be mined to look for interactions for specific species pairs. This is done by matching species pairs against those within a dataset of trophic interactions to determine if an interaction is present between the two species (e.g., the WebBuilder tool developed by Gray et al. 2015). It is important to note that this methodology is only going to be able to infer observations that have been recorded and will thus be prone to many false negatives (missing pairwise interactions) being generated using this approach.
Pattern finders
This group of families uses the co-occurrence patterns of species to presume that there are interactions (more accurately links) between the species. To put it perhaps too bluntly this approach rests strongly on the assumption that co-occurrence = interaction.
Co-occurrence
Trying to infer interactions from the co-occurrence patterns of species pairs within the community e.g., the geographical lasso (Ohlmann et al. 2018). This (for me) seems like a fundamentally flawed assumption to make and Blanchet, Cazelles, and Gravel (2020) seems to agree with me at least a little bit.
Ovaskainen et al. (2016)
Expectation-maximisation (EM) algorithms (Momal, Robin, and Ambroise 2020):