In recent years, severe drought and heat waves have caused significant damage to deciduous trees in the forests of northern Bavaria, as well as devastating bark beetle outbreaks on spruce. Given that temperatures are likely to rise further and extreme weather events to become more frequent, forestry practitioners are calling for efficient monitoring and assessment systems, so that damaged trees and forest stands can be detected and precisely localised as early as possible. In the BeechSAT and IpsSAT research projects, scientists at the LWF, together with the company IABG, therefore set about investigating the potential of optical remote sensing techniques and AI methods for damage assessment.
Remote sensing data offer a far more extensive view of forest areas than field surveys can. The view from above gives a good overview of the condition of forest stands (see Figure 1).
BeechSAT and IpsSAT research projects
The focus of the investigations in the BeechSAT project (2019-2020) was on developing and testing a methodology for detecting damaged (i.e. defoliated) common beech as efficiently as possible. Two project areas in Lower and Upper Franconia were selected for this purpose. From 2020, the IpsSAT project then focussed on the possibilities and limitations of assessing and monitoring bark beetle damage to spruce. This work concentrated on detecting later stages of infestation, when the spruce trees have already turned reddish-brown or grey and are thus visible in optical remote sensing data; early detection is not possible with this method.
Image data from several Earth observation satellites of different spatial and spectral resolutions were compared. A comparison was also made with aerial image data taken using a measurement camera from an aeroplane:
- Aerial images (4 spectral bands)
- WorldView-3 (8 spectral bands)
- SkySat (4 spectral bands)
- PlanetScope (4 spectral bands)
- Sentinel-2 (10 spectral bands)
Figure 2 illustrates how important the spatial resolution of the image data is for recognising individual trees and assessing crown structures. In addition to the remote sensing data listed above, a very high-resolution aerial image from an unmanned aerial vehicle (UAV/drone) is shown here. Two badly damaged, defoliated beech trees have been outlined in orange. Only in the high-resolution image data from the UAV and from the aircraft are the damage features “defoliation” and “crown deadwood” clearly recognisable. With decreasing spatial resolution, it becomes increasingly difficult to assess the crown structure. In the PlanetScope and Sentinel-2 data, individual tree crowns can no longer be identified at all.
Machine learning for the detection of damaged trees
The research field of machine learning (ML) is a sub-field of artificial intelligence. ML methods are now used for numerous applications in the field of image processing - for example for image classification or object recognition - and they are increasingly also being used for the automated evaluation of remote sensing data.
A key objective of the BeechSAT and IpsSAT projects was to test different ML methods for the automated detection of damaged trees in the above-mentioned image data. Supervised learning methods were used for this purpose. This means that the ML procedure first has to be “trained” with a manually created learning data set - often referred to as “labels” in the specialist literature - before it can be used. During this process, the algorithm is taught what vigorous and damaged trees look like in the remote sensing data. By applying a sufficiently trained model to the entire aerial image or satellite image scene, each image element or pixel can then be assigned to one of the two categories, “vigorous” or “damaged”.
Classic machine learning and deep learning
Two approaches were tested in BeechSAT and IpsSAT:
- Random Forest (Breiman 2001): Random Forest was selected as an example of a classic machine learning method for image classification. Based on training examples for the target classes, this method generates many (uncorrelated) decision trees, a so-called ensemble. The trained Random Forest model can then be used to make predictions based on new data.
- U-Net (Ronneberger et al. 2015): Deep learning is a sub-field of machine learning in which deep artificial neural networks (deep neural networks) are trained with learning data. In recent years, research on the evaluation of remote sensing data using deep learning models has increased significantly. Variants of the Convolutional Neural Network (CNN) in particular are used to process image data, as they are well suited to extracting information from images. In the BeechSAT and IpsSAT projects, the U-Net architecture was used for this.
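As an illustration of the first, classic approach, per-pixel classification with Random Forest can be sketched as follows. This is a minimal sketch using scikit-learn; the band values, class means and the fifth feature (an NDVI-like index) are purely illustrative assumptions, not data from the projects.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Illustrative per-pixel spectral features (e.g. 4 bands plus a vegetation
# index) for the two target classes: 0 = "vigorous", 1 = "damaged".
n_per_class = 200
vigorous = rng.normal(loc=[0.05, 0.10, 0.08, 0.45, 0.70],
                      scale=0.03, size=(n_per_class, 5))
damaged = rng.normal(loc=[0.10, 0.12, 0.15, 0.25, 0.30],
                     scale=0.03, size=(n_per_class, 5))

X = np.vstack([vigorous, damaged])
y = np.array([0] * n_per_class + [1] * n_per_class)

# An ensemble of 100 decision trees, each grown on a bootstrap sample of the
# training pixels with a random subset of features per split.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# The trained model is then applied pixel by pixel to a new scene
# (here: two flattened example pixels).
new_pixels = np.array([[0.05, 0.10, 0.08, 0.45, 0.70],   # vigorous-looking
                       [0.10, 0.12, 0.15, 0.25, 0.30]])  # damaged-looking
pred = model.predict(new_pixels)
print(pred)
```

In a real workflow, the feature array would be built by flattening the image bands into one row per pixel, and the predicted labels reshaped back into a map.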
For training classic machine learning methods, a specifically selected, limited number of pixels or image regions that are as representative as possible of the target classes are usually used. In comparison, deep learning models require a much larger amount of training data, due to their significantly larger number of learnable parameters. The training data for a U-Net are often required in the form of small image sections (with a size of 256×256 pixels, for example). For each of these image sections, a binary mask must be generated that demarcates the objects of the target class (in this case damaged tree crowns). Creating these masks required both the manual marking of a large number of damaged trees and an automatic segmentation of the tree crowns, which then had to be manually adjusted again in places. The preparation of the training data is thus a time-consuming step in the process. To improve the robustness of the deep learning model, the available training data set was artificially enlarged using data augmentation techniques. Due to the large amount of data and the associated computationally intensive learning phase, deep learning models require powerful computers.
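The preparation of training tiles and simple data augmentation can be sketched as follows; the scene size, tile size and augmentation operations (flips and 90-degree rotations) are illustrative assumptions, not the exact pipeline used in the projects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative scene: a 1024x1024 image with 4 spectral bands, plus a binary
# mask marking damaged tree crowns (1) against everything else (0).
image = rng.random((1024, 1024, 4)).astype(np.float32)
mask = (rng.random((1024, 1024)) > 0.95).astype(np.uint8)

TILE = 256

def tile_pairs(image, mask, tile=TILE):
    """Cut the scene and its mask into non-overlapping tile x tile sections."""
    pairs = []
    for r in range(0, image.shape[0] - tile + 1, tile):
        for c in range(0, image.shape[1] - tile + 1, tile):
            pairs.append((image[r:r + tile, c:c + tile],
                          mask[r:r + tile, c:c + tile]))
    return pairs

def augment(img_tile, mask_tile):
    """Simple augmentation: flips and 90-degree rotations, applied
    identically to the image section and its mask."""
    out = [(img_tile, mask_tile),
           (np.flip(img_tile, axis=0), np.flip(mask_tile, axis=0)),
           (np.flip(img_tile, axis=1), np.flip(mask_tile, axis=1))]
    for k in (1, 2, 3):
        out.append((np.rot90(img_tile, k), np.rot90(mask_tile, k)))
    return out

pairs = tile_pairs(image, mask)
augmented = [aug for p in pairs for aug in augment(*p)]
print(len(pairs), len(augmented))  # 16 tiles -> 96 training examples
```

The key point is that every geometric operation must be applied to the image section and its mask in exactly the same way, otherwise the labels no longer match the pixels.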
In classical ML, relevant features or explanatory variables for distinguishing damaged from vigorous trees must be selected from the remote sensing data in order to create the model. In addition to the original spectral bands, vegetation indices derived from them, such as the Normalised Difference Vegetation Index (NDVI), were used for this purpose. By contrast, for a CNN model to extract and learn features for separating damaged and vigorous trees, the original spectral bands are usually sufficient.
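The NDVI mentioned above is calculated from the red and near-infrared (NIR) bands. A brief sketch with illustrative, made-up reflectance values:

```python
import numpy as np

# Illustrative reflectance values for one vigorous and one damaged pixel.
# Vigorous vegetation reflects strongly in the NIR and absorbs red light,
# so its NDVI is high; defoliated or discoloured crowns show lower values.
red = np.array([0.05, 0.15])  # [vigorous pixel, damaged pixel]
nir = np.array([0.45, 0.25])

# NDVI = (NIR - Red) / (NIR + Red), bounded to the range [-1, 1]
ndvi = (nir - red) / (nir + red)
print(np.round(ndvi, 2))
```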
Achievable accuracies
In order to assess the accuracy of the trained models for the automated detection of damaged trees, small test areas were demarcated within each study area. The image content in these test areas was used exclusively for validation purposes and not to train the models.
The so-called Intersection over Union (IoU), also referred to as the Jaccard coefficient, was used for the quantitative evaluation of the models; it describes the similarity of sets. In this case, the IoU expresses the spatial agreement of automatically detected damaged trees with a manually created reference data set. The agreement is expressed as a value between 0 and 1: the closer it is to 1, the better the agreement. Figure 3 is intended to facilitate the interpretation of different IoU values.
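The IoU calculation itself is straightforward. The following minimal sketch uses two small, made-up binary masks in place of a real classification and reference:

```python
import numpy as np

# Two binary masks of the same test area: the automated classification and a
# manually created reference ("damaged" = 1, everything else = 0).
prediction = np.array([[1, 1, 0],
                       [1, 0, 0],
                       [0, 0, 0]])
reference = np.array([[1, 1, 0],
                      [0, 1, 0],
                      [0, 0, 0]])

# IoU = |prediction AND reference| / |prediction OR reference|
intersection = np.logical_and(prediction, reference).sum()  # pixels in both
union = np.logical_or(prediction, reference).sum()          # pixels in either
iou = intersection / union
print(iou)  # 2 overlapping pixels out of 4 in the union -> 0.5
```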
Figure 4 shows the calculated IoU values for both the Random Forest and the U-Net (deep learning) models for all tested remote sensing data, separately for the BeechSAT and IpsSAT projects. It can be seen that consistently higher accuracies were achieved with deep learning in the test areas. The difference in IoU values between Random Forest and U-Net is greatest for the aerial image data with the highest spatial resolution. Furthermore, the spatial resolution of the image data has a significant influence on the achievable accuracies. Especially in the case of the U-Net, the best results were achieved with the high-resolution remote sensing data. As Figure 2 shows, only very high-resolution remote sensing data can reveal the distinct structures of damaged and vigorous tree crowns that a deep learning method can potentially learn. When lower-resolution image data are used, in which individual tree features are no longer recognisable, the differentiation between damaged and vigorous vegetation is based primarily on differences in spectral values. The Sentinel-2 images have a lower spatial resolution than PlanetScope; nevertheless, slightly better IoU values were calculated with Sentinel-2, which is attributed here to the higher spectral resolution of the Sentinel-2 images.
Figure 5 shows an example of a small aerial image section from the IpsSAT project with discoloured spruce trees. The corresponding automatically generated classifications from the trained Random Forest and U-Net models are shown, along with a manually created reference data set. Compared with the Random Forest classification, the U-Net has outlined more compact and homogeneous areas for the red-attack and grey-attack damage categories, i.e. spruce trees discoloured reddish-brown or grey. In addition, undesirable confusion with background classes such as soil and strongly shaded areas is less likely when the U-Net is used.

Fig. 5: Example of an image evaluation in the IpsSAT project. The automated classification of discoloured spruce trees is shown, with the damage categories red-attack and grey-attack, using Random Forest and a U-Net (deep learning) model for an aerial image section.
The bottom line
When classical machine learning methods such as Random Forest are used, specific models are usually developed for a particular situation, i.e. the models are adapted to a defined study area or a single remote sensing data set with conditions as homogeneous as possible in terms of spectral values. Representative training regions for the target classes in the image are selected. When adapting the models, several aspects often have to be taken into account, such as the lighting situation in the images, the phenology at the time of recording, the topography, or the different age classes in the study area. For many forestry applications, classic machine learning models can barely be transferred directly from one remote sensing data set to another without retraining, if at all.
With deep learning, the use of much larger training data sets is intended to improve model transferability. In the BeechSAT and IpsSAT projects, the highest accuracies in the detection of damaged trees were achieved using deep learning. Nevertheless, it cannot currently be assumed that the training data set used is sufficient to allow the created models to be transferred to other image data sets without further training. As large training data sets currently have to be created manually using aerial image interpretation, this step is very labour-intensive and time-consuming, and is thus a limiting factor in the practical use of deep learning.
The study showed that a U-Net model trained with heterogeneous training data from different image data sets resulted in more reliable damage classification than Random Forest. There is thus a need for further research into the possibilities and limitations of deep learning methods for remote sensing-based damage assessment.
To assist the Offices for Food, Agriculture and Forestry in Upper Franconia with the assessment of the massive damage caused by bark beetle outbreaks and the resulting bare areas, large-scale aerial surveys have been commissioned by the LWF and carried out since 2021 (Straub et al. 2023). This flight data is currently analysed manually. To make the image analysis more efficient in future, an automated localisation of discoloured spruce trees using a deep learning model is to be tested.
Summary
Against the background of damage to deciduous trees and bark beetle damage in northern Bavaria, the BeechSAT and IpsSAT projects focussed on the automated detection of damaged common beech and spruce trees in aerial and satellite image data. Both classic machine learning methods and deep learning models were tested for damage detection. The highest levels of accuracy were achieved using deep learning. However, this success is offset by the need to create large learning data sets to train a deep learning model. The training data are not currently available in finished form; instead, they must be prepared manually in large quantities using aerial image interpretation. For the recognition of individual tree crowns and an assessment of their level of defoliation and discolouration, image data with high spatial resolution are required.