ScanG16-train&data:
-------------------

This example should be run once you understand how to prepare and analyze
images coming from a flat-bed scanner (see ScanG16-example).

This is a collection of 9 samples of zooplankton from Tulear, Madagascar.
Five samples were collected on the same date (20 October 2004): three
replicates sampled in the lagoon, plus two samples at different sites
(North Pass and South Pass). The remaining 4 samples were all collected in
the lagoon and form a short time series monitoring plankton in the lagoon
before and after a hurricane (on 23-24 January 2005) followed by a big
tropical storm until the end of January. Note the bloom of diatoms and the
drop in copepods after this major perturbation.

Two example manual training sets are also provided. The first one,
'train-detailed', is an original training set as one gets it after the
taxonomists have done their work. You should first try this training set
to see the limits of the computer's ability to separate some groups
(mainly because there are not enough objects to learn how to separate
them correctly). The second training set, 'train-reworked', is similar to
the first one, but partway through a reworking of the groups (done simply
by moving vignettes into different folders, for instance to merge groups
that the computer does not separate correctly). This is NOT the optimal
training set for this kind of data, and we encourage you to experiment
with other groupings here. Once you have a satisfactory training set, you
can save it for further reuse (Objects -> Save), and you can process all
9 samples in this series.

To run this example:
--------------------

- Download and unzip "ScanG16-train&data.zip" in your
  'ZooPhytoImage Examples' directory, or anywhere you like to place it.
- Start Zoo/PhytoImage and import the detailed training set (sixth button).
  Select the _train-detailed subdirectory.
  Importing takes a couple of seconds. Wait until you see a table of
  summary statistics about this training set.
- Train a classifier with it (seventh button). Experiment with the
  different algorithms provided, and display the "confusion matrix" using
  the eighth button after the training. As you can see, some groups are
  hard to identify with these data. You should rework the groups at this
  stage.
- To show you how to do it, a second training set, '_train-reworked', is
  provided. Do the same operations as with the previous one, and compare
  the results. See what was changed by exploring both training sets with
  XnView (Apps -> Image viewer (XnView)). The latter training set is not
  optimal yet, but it starts to give interesting results, especially with
  the "random forest" algorithm.
- Train a random forest classifier with the '_train-reworked' training set
  and save the classifier. Inspect its performance. For the sake of this
  tutorial, we will assume that you are satisfied with it.
- Now that you have a classifier, you can predict all particles in your 9
  samples, and calculate total/partial abundances, total/partial biomasses
  and total/partial size spectra. Usually, you need to document your
  series in a 'description.zis' file (ninth button) before you can process
  it. In the present case, we provide an example description file. Inspect
  it to learn what you are supposed to put in it. Take a look at the way
  the three different stations are encoded in this file.
- To process all 9 samples, click on the tenth button and select the
  provided 'description.zis' file. Each sample is processed in turn. If
  you checked the option for saving individual measurements, a table for
  each sample with the ECD, identification and calculated individual
  biomass of each particle is saved in the same directory as the
  'description.zis' file.
  Once this is done, you can visualize the results by clicking on the
  eleventh button (you can select, for instance, 'Abd total', 'Abd diatom',
  'spectrum of MTLG.2004-12-17.H1' and 'spectrum of MTLG.2005-02-01.H1',
  and then select 'diatom' for the partial spectrum, in order to visualize
  the bloom of diatoms just after the hurricane/tropical storm).
- Finally, you can further analyse your series by using all the
  statistical or graphical tools available in R. See
  Help -> Manuals -> An Introduction to R from the 'R Console' menu if you
  are not familiar with R. All the data are in a 'ZIRes' object (called
  'ZIres' by default), which is a 'data frame' in R's terminology. Since
  this is the basic data structure in R, you can easily use any function
  (like plot(), for instance) on it to continue your analysis with R.
- If you prefer to use other software for your analysis (Matlab? Excel?
  other?), you just need to export your data (twelfth button). Data are
  exported as tab-separated ASCII files, a format easily readable by any
  external software.
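
As an illustration of reading such an exported file outside R, the sketch
below parses a tab-separated table with Python's standard csv module and
computes the share of diatoms in the total abundance. The file layout is an
assumption: the column names 'Id', 'Abd.total' and 'Abd.diatom' and the two
sample rows are hypothetical placeholders, not the actual output of
Zoo/PhytoImage. Check the header line of your own exported file and adapt
the names accordingly.

```python
import csv
import io

# Hypothetical stand-in for an exported file: the column names and the
# values below are invented for illustration only.
EXAMPLE_EXPORT = (
    "Id\tAbd.total\tAbd.diatom\n"
    "sample-A\t100.0\t10.0\n"
    "sample-B\t80.0\t40.0\n"
)


def read_export(handle):
    """Read a tab-separated table and return a list of dicts, one per row."""
    return list(csv.DictReader(handle, delimiter="\t"))


def fraction(rows, part_col, total_col, id_col="Id"):
    """Return {sample id: part / total} for each row, skipping zero totals."""
    result = {}
    for row in rows:
        total = float(row[total_col])
        if total > 0:
            result[row[id_col]] = float(row[part_col]) / total
    return result


rows = read_export(io.StringIO(EXAMPLE_EXPORT))
diatom_share = fraction(rows, "Abd.diatom", "Abd.total")
for sample, share in diatom_share.items():
    print(f"{sample}: {share:.2f}")
```

To read a real export instead of the in-memory example, open the file with
open(path, newline="") and pass the resulting handle to read_export().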