Some quick first notes on the selection of methods used in our EI exercises.
First of all, it should be noted that our approach is model-based: a predictive model will be fitted to the available data that we believe are related to Ecosystem Integrity. The approach is fundamentally supervised, meaning our models need evidence (data) of different states of Ecosystem Integrity.
It follows that the basis of our modelling will be a data table that includes one variable serving as a reference for the EI levels and other variables used to explain those levels. For example, we could have a table like the following, with only two reference EI levels (1 = high and 0 = low) and two explanatory variables.
Figure 1: A toy example of an EI model training table.
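For concreteness, a table of this shape can be sketched directly in code. The variable names (tree_cover, human_footprint) and all values below are invented for this sketch; they are not our actual explanatory variables.

```python
# A hypothetical version of the training table in Figure 1: one reference
# EI label (1 = high, 0 = low) plus two explanatory variables. Names and
# values are invented for illustration.
table = [
    # (EI, tree_cover, human_footprint)
    (1, 0.92, 0.05),
    (1, 0.78, 0.15),
    (0, 0.20, 0.80),
    (0, 0.10, 0.95),
]

# The supervised setting: each row pairs the reference label (evidence of
# an EI state) with the explanatory variables used to predict it.
labels = [row[0] for row in table]
features = [row[1:] for row in table]
```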
EI, like human health, is a latent variable: its true state cannot be directly observed. Even so, we have chosen to continue on the path of a supervised EI model. We use (discrete) proxies for EI as evidence and hope to achieve a more detailed, continuous EI estimate.
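As a sketch of how discrete evidence can still yield a continuous estimate, consider any probabilistic classifier trained on 0/1 EI labels: its posterior P(EI = high | observation) is a continuous score between 0 and 1. The example below hand-rolls a one-variable Gaussian naive Bayes; the class parameters, priors, and the single variable are all made up for illustration, not taken from our actual model.

```python
import math

# Sketch: a classifier trained on discrete EI proxies (labels 0/1) can
# still output a continuous integrity score, namely the posterior
# probability P(EI = high | observation).

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical class-conditional (mean, std) of one explanatory variable
# per EI class, plus uniform class priors.
params = {1: (0.85, 0.10), 0: (0.20, 0.10)}
prior = {1: 0.5, 0: 0.5}

def p_high_ei(x):
    """Continuous EI score in (0, 1) from a model trained on 0/1 labels."""
    weighted = {c: prior[c] * gaussian_pdf(x, *params[c]) for c in (0, 1)}
    return weighted[1] / (weighted[0] + weighted[1])
```

The discrete labels only enter at training time; at prediction time the score varies smoothly with the input, which is the behaviour we are after.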
We have chosen to use Bayesian networks to model EI. From the previous table it should be clear that any predictive (classification) model could be used, from linear regression to tree-based methods and deep neural networks. So why Bayesian networks?
Well, machine learning, especially deep learning, has become incredibly good at signal processing.
Figure 2: Deep learning has achieved good results in identifying fauna in digital media.
[previously] In the 1st wave of AI, systems followed clear (hard-coded) rules.
[currently] In the 2nd wave, systems rely on ML to replicate simple human tasks, e.g. computer vision.
In the 3rd wave of AI, systems will be more than tools that execute human-programmed rules or generalize from human-curated data sets; they will function as partners rather than as tools. AI models will therefore have to be auditable, and the predictions they make will have to be transparent and explainable.
As AI takes over more decision-making tasks, it will be ever more important to think about the end user of these systems: Who will be using the system? What do they want to know? How do we communicate that to them?
This will not only increase the probability that the system is actually used by the end user; it may, arguably, be a basic human right in itself:
In the regulation of algorithms, particularly artificial intelligence and its subfield of machine learning, a right to explanation (or right to an explanation) is a right to be given an explanation for an output of the algorithm. Such rights primarily refer to individual rights to be given an explanation for decisions that significantly affect an individual, particularly legally or financially. For example, a person who applies for a loan and is denied may ask for an explanation.
Bayesian networks are among the most interpretable models available, and they also allow analysing the interconnected relations within a group of variables, something naturally interesting when studying ecosystems.
Figure 3: BNs allow proposing a complex criss-cross structure of relations between variables.
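To make the idea concrete, here is a minimal discrete Bayesian network coded by hand, with an invented structure (Climate feeding Vegetation and Fauna, which both feed EI) and made-up probability tables; inference is done by brute-force enumeration of the joint distribution. This is only a sketch of the mechanics, not our actual EI network.

```python
import itertools

# A minimal discrete Bayesian network. Structure and probabilities are
# invented for illustration:
#   Climate -> Vegetation, Climate -> Fauna, Vegetation -> EI, Fauna -> EI
# Every variable is binary (1 = favourable/high, 0 = unfavourable/low).

p_climate = {1: 0.6, 0: 0.4}
p_veg = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.3, 0: 0.7}}    # P(Vegetation | Climate)
p_fauna = {1: {1: 0.7, 0: 0.3}, 0: {1: 0.2, 0: 0.8}}  # P(Fauna | Climate)
p_ei = {(1, 1): 0.9, (1, 0): 0.6, (0, 1): 0.5, (0, 0): 0.1}  # P(EI=1 | Veg, Fauna)

def joint(c, v, f, e):
    """P(Climate=c, Vegetation=v, Fauna=f, EI=e) from the factorisation."""
    pe = p_ei[(v, f)] if e == 1 else 1 - p_ei[(v, f)]
    return p_climate[c] * p_veg[c][v] * p_fauna[c][f] * pe

def posterior_ei(evidence):
    """P(EI = 1 | evidence) by brute-force enumeration of the joint."""
    num = den = 0.0
    for c, v, f, e in itertools.product((0, 1), repeat=4):
        assign = {"Climate": c, "Vegetation": v, "Fauna": f, "EI": e}
        if any(assign[k] != val for k, val in evidence.items()):
            continue
        p = joint(c, v, f, e)
        den += p
        if e == 1:
            num += p
    return num / den
```

Any query can be answered against the same structure, e.g. posterior_ei({"Climate": 1}), and every answer can be traced back through the explicit conditional probability tables, which supports exactly the kind of auditability and explainability argued for above.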