Tutorial Part I: Train Your First Model with YAPiC
Installation and Preparation of Training Data
-
Install YAPiC as explained here
- The command line tool
yapic
should be available now:yapic --help
- Download and unzip leaf example data. It contains following files:
leaves_example_data/ ├── leaf_labels_ilastik133.ilp ├── leaves_1.tif ├── leaves_2.tif ├── leaves_3.tif ├── leaves_4.tif ├── leaves_5.tif ├── leaves_6.tif └── leaves_7.tif
The tif files are RGB images containing photographs of different leaf types.The images were saved with Fiji. Make sure to always convert your pixel images with Fiji. Large image series can be conveniently converted with Fiji by using batch processing.
The ilp file is a so called ilastik Project File and contains training labels. Next, we will have a look at the labels.
-
For looking at the label data, download and install Ilastik. We created the label data with Ilastik version 1.3.3.
-
Launch Ilastik and open the ilastik project leaf_labels_ilastik133.ilp
You see manually drawn labels for the leaves_7.tif image. There are labels for 6 different classes: 5 leaf types and the backgound class (red). In current view you can select one of the other images.
Ilastik comes with built in functionality for classifier training and pixel classification. It is very convenient to use and much faster than the deep-learing based classificaton of YAPiC. However this leaf classification task, can not be solved with Ilastik’s built in classification. For this reason, we use Ilastik in this case just as a tool for labeling and will train a classifier with YAPiC.
-
Add some more training data.
Choose some other images with less labels and use the brush to paint more image regions. The more labels you have, the better will be your training result. Amount of labels should be more or less balanced between all classes.
- Save your updated Ilastik project:
Project>>Save Project...
Model Training
- Now you can start a training session with YAPiC command line tool:
yapic train unet_2d "path/to/leaves_example_data/*.tif" path/to/leaves_example_data/leaf_labels_ilastik133.ilp -e 800 --gpu=0
unet_2d
defines the type of deep learning model to train. We choose the original U-Net architecture as described in this paper.- Next, we define the pixel data source with a wildcard. With wildcards you have to use quotation marks.
- Next, we have to define the label data source. In our case the ilastik project file
path/to/leaves_example_data/leaf_labels.ilp
- The optional argument
e
defines the number of training epochs, i.e. the length of the training process. - If you have multiple GPU cards available, you can select a specific GPU
with the optional
--gpu
argument. - Use
yapic --help
to get an overview about all arguments.
- Training progress can be observed via command line output. Training 2500 epochs will take several hours.
Epoch 5/500 50/50 [==============================] - 63s 1s/step - loss: 1.7317 - accuracy: 5.3919 - val_loss: 1.6949 - val_accuracy: 3.7496 Epoch 6/500 50/50 [==============================] - 64s 1s/step - loss: 1.7241 - accuracy: 5.0915 - val_loss: 1.6836 - val_accuracy: 3.8097 Epoch 7/500 50/50 [==============================] - 64s 1s/step - loss: 1.7246 - accuracy: 5.4757 - val_loss: 1.6919 - val_accuracy: 3.6405 Epoch 8/500 17/50 [=========>....................] - ETA: 27s - loss: 1.7165 - accuracy: 5.5103
- Training progress is also logged to loss.csv.
- The best performing model (the model with the lowest validation loss) is repeatedly saved as model.h5.
The loss of training data and validation data is continuously written to loss.csv file. You can open this file in any spreadsheet software (e.g. MS Excel) and plot loss and validation_loss: You see that the model initially learns to predict the data (the loss decreases). But from time to time, the curve falls back to a higher loss and training process starts again. You can also see, that training loss tends to be a bit lower than validation loss. The shape of the loss curve is very dependent on the dataset you process.
Please note that at the end of the 500 interations the validation loss is higher than some iterations earlier. YAPiC only stores the best performing model parameters, i.e. the model with lowest validation loss over the whole training period.
Apply your model
After the training process, the model.h5 file contains the best performing model, i.e. the model with best performance on the validation dataset. You have two options how to apply your model: Either you can run your model on a set of tif images by using YAPiC command line tool (the one you used for training) or you can export your model to run it in ImageJ/Fiji by using DeepimageJ plugin. We tested YAPiC trained models with DeepimageJ versions 1.0.1. and 1.2.0.
Apply model using YAPiC command line tool
Apply your model to the images
yapic predict model.h5 "path/to/leaves_example_data/*.tif" path/to/results
Predictions will be saved as 32 bit tif images in path/to/results
.
You can open the result files in Fiji. Each channel represents one class (i.e. one of 5 leaf types or background).
Apply model in Fiji using DeepImageJ plugin
Go to part II for learning how to use you custom made leaf classifier in Fiji by using DeepImageJ plugin.