Random Forest for Toxicity of Chemical Emissions: Features Selection and Uncertainty Quantification


Journal of Environmental Accounting and Management, 2015, Issue 3

Antonino Marvuglia¹, Michael Leuenberger², Mikhail Kanevski², Enrico Benetto¹

¹ Luxembourg Institute of Science and Technology (LIST), 41, rue du Brill, L-4422 Belvaux, Luxembourg

² University of Lausanne, Faculty of Geosciences and Environment, Institute of Earth Surface Dynamics, Geopolis building, CH-1015 Lausanne, Switzerland


Submission Info

Communicated by Beniamino Murgante

Keywords: Toxicity characterization; Life Cycle Assessment; USEtox; Random Forest; Variables selection

Toxicity characterization of chemical emissions is a complex task that proceeds via multimedia fate and exposure models coupled to models of dose-response relationships. Several environmental multimedia models exist, but all of them require a vast amount of data on the properties of the chemical compounds being assessed. This paper deals with the selection of informative variables for deriving characterization factors for the ecotoxicity and human toxicity of chemical compounds starting from molecular-based properties. The Random Forest algorithm has been applied to single out the most relevant variables when modelling one toxicity factor at a time. The set of variables retained varies according to the modelled output factor, but certain variables are almost always retained among the top three most important ones, regardless of the output factor considered. The modelling performed in this paper is one of the first applications of nonlinear techniques to the database of organic substances made available by the multimedia fate and exposure model USEtox, widely used by the Life Cycle Assessment (LCA) community.

1 Introduction

One of the ultimate goals of Life Cycle Assessment (LCA) is the evaluation of the potential impacts on humans and ecosystems of the anthropogenic activities that take place in many different parts of the world and support the production of any good or service or the implementation of a policy. The result of this evaluation, which takes a life-cycle perspective, is the environmental profile of the studied good, service or policy. This profile comprises different categories of environmental impacts (the so-called "impact category indicators"), ranging from planetary-scale ones (such as global warming) to local effects like acidification and local toxicity-related impacts.

Toxicity characterization is carried out in LCA using multimedia environmental models. Commonly used models include USES-LCA (Huijbregts et al., 2000; Van Zelm et al., 2009), CalTOX (Hertwich et al., 2001), IMPACT 2002 (Pennington et al., 2005), USEtox (Rosenbaum et al., 2008) and GLOBOX (Wegener Sleeswijk and Heijungs, 2010). They have different scopes and modelling principles and hence produce different characterization factors (CF), i.e. factors that, once multiplied by the corresponding inventory inputs (quantity of released chemical), translate them into directly comparable impact indicators.

In life cycle impact assessment (LCIA) the CF is built from a number of separate elements (Rosenbaum et al., 2008):

· The fate factor (FF), which takes into account the successive steps of fate. It represents the persistence of the chemical in the environment;

· The aspect of exposure or intake, symbolised by the exposure factor (XF). It expresses the bioavailability of a chemical, represented by the fraction of it dissolved in the environmental compartment at stake;

· The aspect of effect, symbolised by the effect factor (EF);

· The combined aspect of fate and human exposure, represented by the intake fraction (iF).

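This results in a set of scale-specific CFs, given as the product:

$$\mathrm{CF} = \mathrm{EF} \cdot \mathrm{XF} \cdot \mathrm{FF} = \mathrm{EF} \cdot \mathrm{iF} \qquad (1)$$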

To calculate the factors in Eq. (1), several fate and exposure equations have to be solved within the chosen multimedia environmental model for each chemical substance concerned. The input of the model is a set of molecular-based properties of the substance at stake and the output is the CF, decomposed into its components, as shown in Eq. (1).

The aim of this paper is to screen the importance of the variables in the input-output mapping process, in order to select those which, in principle, deserve particular attention in data collection. In other words, we perform here a feature (or variable) selection task. The database used to develop the case study shown in this paper is the one available with the model USEtox (Rosenbaum et al., 2008), which is the result of a scientific consensus process involving comparison of and harmonization among existing environmental multimedia fate models. In particular, the data set of organic substances, consisting of 3,073 data records, has been used. In USEtox two geographical scales are specified:

· the continental scale with the following compartments: urban air, rural air, freshwater, sea, natural soil and agricultural soil;

· the global scale with the following compartments: air, freshwater, ocean, natural soil and agricultural soil.

The continental scale is nested in the global scale, i.e. chemicals can be transported from one scale to a higher scale and vice versa (Rosenbaum et al., 2008). In USEtox there are a total of 11 compartments which, in the most general case, can be at the same time emitting and receiving compartments. The full list of compartments comprises: 1. Urban air at the continental scale (airU); 2. Rural air at the continental scale (airC); 3. Freshwater at the continental scale (fr.waterC); 4. Coastal sea water at the continental scale (seawaterC); 5. Natural soil at the continental scale (nat.soilC); 6. Agricultural soil at the continental scale (agr.soilC); 7. Rural air at the global scale (airG); 8. Freshwater at the global scale (fr.waterG); 9. Ocean at the global scale (oceanG); 10. Natural soil at the global scale (nat.soilG); 11. Agricultural soil at the global scale (agr.soilG).

In the case of the FFs, only the first six compartments act as emission compartments, so there are 66 (= 11 × 6) parameters for each chemical. USEtox also allows the calculation of 42 exposure factors, 1 eco-exposure factor, 1 eco-effect factor on freshwater and 4 human health effect factors, for a total of 114 factors for each substance. In the USEtox naming convention, the name of a factor carries the acronym of the emission compartment first and that of the receiving compartment after. For example, FF_airC_fr.waterC is the FF for an emission to continental rural air which reaches freshwater at the continental scale. Several substance-specific molecular-based input properties are required by USEtox to calculate the full set of factors. These properties are known for all the substances contained in the USEtox database (since they have been measured or estimated), and chemical properties can relatively easily be obtained from the Estimation Program Interface (EPI) Suite for at least 30,000 substances. (The EPI Suite, http://www.epa.gov/opptintr/exposure/pubs/episuite.htm, is a suite of physical/chemical property and environmental fate estimation programs developed by the US Environmental Protection Agency, EPA.) Nevertheless, not all of these properties have the same explanatory power with respect to each of the output factors (FF, XF, EF, iF).
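As a minimal sketch of the naming convention and the factor count just described, the 66 fate-factor names can be enumerated by pairing each of the six emission compartments with each of the 11 receiving compartments. The compartment acronyms are those listed in the text; the helper name `fate_factor_names` is hypothetical and is not part of USEtox.

```python
# Compartment acronyms as listed in the text (continental scale first).
COMPARTMENTS = [
    "airU", "airC", "fr.waterC", "seawaterC", "nat.soilC", "agr.soilC",
    "airG", "fr.waterG", "oceanG", "nat.soilG", "agr.soilG",
]
EMISSION = COMPARTMENTS[:6]  # only the first six act as emission compartments


def fate_factor_names(emission=EMISSION, receiving=COMPARTMENTS):
    """Return names of the form FF_<emission>_<receiving> (hypothetical helper)."""
    return [f"FF_{e}_{r}" for e in emission for r in receiving]


names = fate_factor_names()
print(len(names))                    # 6 * 11 = 66
print("FF_airC_fr.waterC" in names)  # True
# Total number of factors per substance quoted in the text.
print(66 + 42 + 1 + 1 + 4)           # 114
```

The same pattern would extend to the other factor families, but their exact compartment pairings are not spelled out in the text and are therefore not enumerated here.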

Based on the same USEtox database of CFs, Birkved and Heijungs (2011) derived multidimensional bilinear models for emission-compartment-specific fate characterisation of chemical emissions by applying Partial Least Squares Regression (PLSR). However, the authors did not perform a model-based feature selection, but "decided to let data availability determine the grouping of the variables" (Birkved and Heijungs, 2011).

In Marvuglia et al. (2014a) the non-linear technique known as the Gamma Test (Stefánsson et al., 1997) was applied to rank the input variables according to their contribution to the output's variance. Single-task modelling was carried out for 5 out of the 114 factors, meaning that the Gamma Test was applied separately each time with only one of these 5 output variables. Choosing a few representative output variables out of the full set was necessary because a full embedding search of the feature space was performed, checking all the possible combinations of inputs. This means that for m inputs and n outputs, n(2^m - 1) models were explored.
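To make the size of such a full embedding search concrete, the short sketch below counts the non-empty input subsets by explicit enumeration and compares the result with 2^m - 1. The values m = 9 and n = 5 are taken here only for illustration, and the function name `count_embeddings` is hypothetical.

```python
from itertools import combinations


def count_embeddings(m):
    """Count the non-empty subsets of m inputs by explicit enumeration."""
    return sum(1 for size in range(1, m + 1)
               for _ in combinations(range(m), size))


m, n = 9, 5  # illustrative values: 9 inputs, 5 outputs
per_output = count_embeddings(m)
print(per_output == 2 ** m - 1)  # True
print(per_output)                # 511
print(n * per_output)            # 2555
```

The count doubles with every additional input, which is why an exhaustive search quickly becomes impractical when many outputs have to be modelled.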

After a preliminary data analysis, 9 inputs were retained from the original set of 18 variables available in the USEtox database. Some variables were not used because of scarce data coverage (up to 99% of missing data for certain variables, such as the biotransfer factor for meat); other variables were discarded because a perfectly linear relation was found with other variables (namely between the degradation rate in sediment, KdegSd, and the degradation rate in water, KdegW, and between the degradation rate in soil, KdegSl, and the two variables KdegW and KdegSd). Since those preliminary analyses remain valid in the context of the present paper, the same set of input variables was used here, i.e.: 1. Molecular weight (MW); 2. Partitioning coefficient between octanol and water (KOW); 3. Partitioning coefficient between organic carbon and water (KOC); 4. Henry's law coefficient at 25°C (KH25C); 5. Vapour pressure at 25°C (Pvap25); 6. Solubility at 25°C (Sol25); 7. Degradation rate in air (KdegA); 8. Degradation rate in water (KdegW); 9. Bioaccumulation factor in fish/biota (BAFfish).

In Marvuglia et al. (2015) the modelling (still single-task) is performed on 13 output variables instead of 5. More precisely, the following 9 fate factors and 4 intake fractions were chosen for illustrative purposes: Output 1 = FF from urban air to continental air (FF_airC_airC); Output 2 = FF from urban air to continental freshwater (FF_airC_frwaterC); Output 3 = FF from urban air to continental natural soil (FF_airC_natsoilC); Output 4 = FF from continental freshwater to continental air (FF_frwaterC_airC); Output 5 = FF from continental freshwater to continental freshwater (FF_frwaterC_frwaterC); Output 6 = FF from continental freshwater to continental natural soil (FF_frwaterC_natsoilC); Output 7 = FF from continental natural soil to continental air (FF_natsoilC_airC); Output 8 = FF from continental natural soil to continental freshwater (FF_natsoilC_frwaterC); Output 9 = FF from continental natural soil to continental natural soil (FF_natsoilC_natsoilC); Output 10 = iF from continental freshwater to air (iF_frwater_air); Output 11 = iF from continental freshwater to drinking water (iF_frwater_drwater); Output 12 = iF from continental freshwater to meat (iF_frwater_meat); Output 13 = iF from continental freshwater to fish (iF_frwater_fish).

This choice made it possible to cover toxicity mechanisms of increasing complexity, going from simple dissociation in water to the bioaccumulation mechanisms influencing the intake fractions from freshwater to meat and fish. In Marvuglia et al. (2015) the application of a set of linear models, based on partial least squares (PLS) regression, as well as a nonlinear model (general regression neural network, GRNN) was explored in the search for an automatic selection strategy of the most informative variables according to the modelled output (USEtox factor). As in Marvuglia et al. (2015), we consider here that, given the extensive computational effort required by the simulations, a complete analysis of all the toxicity factors present in USEtox is outside the scope of the paper.

2 Materials and methods

2.1.1 Random forest

Random forest (RF) is a classification and regression algorithm based on an ensemble of decision trees (Breiman, 2001). It increases diversity among the trees by resampling the data with replacement and by randomly changing the sets of predictive variables over the different tree induction processes. During the training process, it also provides a measure of variable importance based on a permutation test. The basic idea is that when a specific variable is not important, destroying its information content does not degrade the accuracy of the regression. In this way, RF allows assessing the importance and the related predictive power of each single variable with respect to a target output through the following steps: (1) generating k bootstrap subsets X_i of the original data set X, by iteratively resampling the original data set; (2) growing a decision tree for each bootstrap subset, where each node of the tree is split using the best split variable among a subset of m randomly selected predictive variables (Liaw and Wiener, 2002); (3) computing the errors for each tree; and (4) comparing these errors with the errors obtained after shuffling the values of one variable. Iterating step (4) for each variable and for each tree leads to the so-called percentage increase of mean squared error (%IncMSE). This index expresses the contribution of each variable to the prediction of the output factor, so that the higher the %IncMSE, the higher the ranking of the corresponding variable.

More precisely, the algorithm for growing a RF of k trees is based on the following steps:

(1) for i = 1 to k do:

(a) draw a bootstrap subset X_i containing approximately 2/3 of the elements of the original data set X;

(b) use X_i to grow an unpruned tree to the maximum depth, selecting at each node m predictive variables and choosing the best split among these variables;

(2) predict new data according to the majority vote of the ensemble of k trees (or, for regression, the average of the k tree predictions).

The number of trees (k) and the number of predictive variables used to split the nodes (m) are two user-defined parameters required to grow a RF.
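The steps above can be sketched in a few lines of plain Python. This is only an illustrative toy under stated assumptions, not the implementation used in the study: the data are synthetic, the trees stop at a minimum leaf size (`min_leaf`) instead of being grown to full depth, and the importance returned is the raw mean increase in out-of-bag MSE after shuffling one input, without the normalisation that particular RF packages apply when reporting %IncMSE. All function names are hypothetical.

```python
import random


def sse(rows):
    """Sum of squared deviations of the target (last column) from its mean."""
    ys = [r[-1] for r in rows]
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def grow_tree(rows, m_try, rng, min_leaf=3):
    """Grow a regression tree, testing m_try randomly chosen inputs per node."""
    best = None
    for f in rng.sample(range(len(rows[0]) - 1), m_try):
        for thr in sorted({r[f] for r in rows})[1:]:
            left = [r for r in rows if r[f] < thr]
            right = [r for r in rows if r[f] >= thr]
            if min(len(left), len(right)) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, thr, left, right)
    if best is None:  # no admissible split: return a leaf (mean target)
        return sum(r[-1] for r in rows) / len(rows)
    _, f, thr, left, right = best
    return (f, thr, grow_tree(left, m_try, rng, min_leaf),
            grow_tree(right, m_try, rng, min_leaf))


def predict(tree, x):
    """Follow the splits down to a leaf value."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if x[f] < thr else right
    return tree


def mse(tree, rows):
    """Mean squared error of one tree on the given rows."""
    return sum((predict(tree, r) - r[-1]) ** 2 for r in rows) / len(rows)


def permutation_importance(data, k=20, m_try=2, seed=0):
    """Mean increase in out-of-bag MSE when each input is shuffled in turn."""
    rng = random.Random(seed)
    n, p = len(data), len(data[0]) - 1
    increase = [0.0] * p
    for _ in range(k):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap, with replacement
        in_bag = set(idx)
        oob = [data[i] for i in range(n) if i not in in_bag]
        tree = grow_tree([data[i] for i in idx], m_try, rng)
        base = mse(tree, oob)
        for j in range(p):
            col = [r[j] for r in oob]
            rng.shuffle(col)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(oob, col)]
            increase[j] += (mse(tree, shuffled) - base) / k
    return increase


# Synthetic data (an assumption, not USEtox): x0 strong, x1 weak, x2 irrelevant.
gen = random.Random(42)
data = []
for _ in range(100):
    x = [gen.random() for _ in range(3)]
    data.append(x + [3.0 * x[0] + 1.0 * x[1] + gen.gauss(0.0, 0.1)])

importance = permutation_importance(data)
print([round(v, 3) for v in importance])
```

In this toy setting the first input should receive by far the largest importance, while the irrelevant third input should stay close to zero.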

Since each tree is constructed using a different bootstrap sample X_i from the original data set X, and X_i contains only about 2/3 of the elements of X, at each run there are elements of the original data set which are not included in X_i. These elements are called out-of-bag elements.

At the end of the run, each element of the original data set X is, on average, out-of-bag in one-third of the k tree-construction iterations. In other words, each element of the original data set is predicted, as an out-of-bag element, by about one-third of the k trees. The proportion of misclassifications (in percentage) over all out-of-bag elements is called the global out-of-bag (Oob) error; in regression, the corresponding quantity is the mean squared error over the out-of-bag predictions.
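As a brief check on the "one-third" figure, consider the standard bootstrap in which n elements are drawn with replacement from a data set of size n (an assumption about the sampling scheme, which the text does not spell out). The probability that a given element j is never drawn is then:

```latex
P(\text{element } j \text{ is out-of-bag})
  = \left(1 - \frac{1}{n}\right)^{n}
  \;\xrightarrow[\,n \to \infty\,]{}\; e^{-1} \approx 0.368
```

Under this assumption roughly 63% of the distinct elements end up in each bootstrap sample and roughly 37% are out-of-bag, which is consistent with the approximate fractions of 2/3 and 1/3 quoted above.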

This Oob error is an unbiased estimate of the generalization error. Breiman (2001) proved that RFs produce a limiting value of the generalization error: as the number of trees increases, the generalization error converges. The number of trees (k) needs to be set sufficiently high to allow for this convergence. Consequently, RFs do not overfit the data as more trees are added. For this study the number of trees k and the number of selected variables m were fixed to 1000 and 3 respectively.

2.1.2 Experimental procedure

The experimental protocol adopted here follows, with the necessary adaptations, the procedure described in Kanevski et al. (2009) and extended in Kanevski (2013). A schematic representation of the adopted procedure is depicted in Fig. 1. For each of the following analyses, two separate subsets (training and testing) were randomly generated from the original data set, containing 75% and 25% of the data respectively. One RF model was built using the training set and evaluated on the testing set. This process was iterated 50 times and the average over all iterations was retained and is presented in the results and discussion section.
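The repeated random-split protocol described above can be sketched as follows. This is a hedged illustration only: the learner `fit_mean` is a trivial stand-in (it is not a random forest), the data are synthetic, and the function names are hypothetical.

```python
import random
import statistics


def repeated_holdout(rows, fit, n_iter=50, train_frac=0.75, seed=0):
    """Average test MSE (and its spread) over n_iter random train/test splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iter):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = int(round(train_frac * len(shuffled)))
        train, test = shuffled[:cut], shuffled[cut:]
        model = fit(train)  # model is a callable: x -> prediction
        scores.append(sum((model(x) - y) ** 2 for x, y in test) / len(test))
    return statistics.mean(scores), statistics.stdev(scores)


def fit_mean(train):
    """Stand-in learner (assumption): always predicts the training mean."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


# Synthetic one-input data set, used only to exercise the protocol.
gen = random.Random(1)
xs = [gen.random() for _ in range(200)]
rows = [(x, 2.0 * x + gen.gauss(0.0, 0.1)) for x in xs]

avg_mse, sd_mse = repeated_holdout(rows, fit_mean)
print(round(avg_mse, 3), round(sd_mse, 3))
```

Any learner exposing the same `fit(train) -> model` shape could be plugged in; reporting the standard deviation alongside the mean gives a simple indication of how stable the result is across splits.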

As in Marvuglia et al. (2014a), data have been logarithmically transformed before processing. To check the model's behaviour in the presence of data with no explanatory structure, simulated (shuffled) data have also been added to the original ones. Shuffling means randomly permuting the values of a raw variable, so that its original global distribution is preserved but its structure (i.e. its relationship with the other variables) is destroyed.
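A small sketch may help to see what shuffling does: the permuted variable keeps exactly the same set of values, but its correlation with the input it depended on collapses. The data below are synthetic and the helper names are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def shuffled_copy(values, rng):
    """Return a randomly permuted copy, leaving the original untouched."""
    out = list(values)
    rng.shuffle(out)
    return out


rng = random.Random(0)
x = [rng.random() for _ in range(500)]
y = [2.0 * v + rng.gauss(0.0, 0.05) for v in x]
y_sh = shuffled_copy(y, rng)

print(sorted(y_sh) == sorted(y))  # True: the global distribution is unchanged
print(round(pearson(x, y), 2), round(pearson(x, y_sh), 2))
```

The first correlation should be close to 1 and the second close to 0, which is the behaviour exploited when shuffled outputs or inputs are fed to the model as a sanity check.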

In order to check whether the nature of the data had a relevant influence on the results of the variable selection, we performed a first set of experiments using the same organization of the data as in Marvuglia et al. (2014a). Some of the substance-related data contained in the USEtox database come in fact from experimental measurements, while others are estimated. In particular, when a piece of data is not available from measurements, a set of inter-relationships (see Huijbregts et al., 2010, page 12) has been used in USEtox to derive it. In Marvuglia et al. (2014a), for each input variable, we split the original database into two subsets: one containing only those substances for which the variable at stake is known from measurements, and the other containing only those substances for which it was estimated. We repeated the procedure for all 8 variables, thus obtaining 16 different subsets of data. It was not possible to select a database containing only measured values for every variable, since the resulting set would have contained too few data.

Fig. 1. Schematic representation of the modelling procedure adopted in the paper.

Table 1 shows the variables which occupied the first four positions of the ranking for RF models run on a data subset in which one of the variables comes from estimations and all the other variables come from measurements. Table 2 is analogous to Table 1, i.e. it shows the results of the RF run on subsets of data where one of the variables comes from measurements and all the remaining variables come from estimations.

For each of the output variables whose name appears as a column heading in Table 1, eight RF models have thus been run (one for each estimated variable at a time, considering that MW is always measured). For each of the output variables whose name appears as a column heading in Table 2, only seven RF models have been run instead (one for each measured variable at a time, excluding KdegW because the data subset where KdegW is measured contained too few data to run a model on it). The variable shown in Table 1 and Table 2 at each ranking position is the one which was ranked in that position by most of the RF models (as explained above, eight models for Table 1 and seven models for Table 2). In some cases two variables appear in the same ranking position; this happens when the two variables have been ranked in that position the same number of times.
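The rule used to fill each ranking position (the variable placed there by most models, with ties reported together) can be sketched as below. The rankings are invented purely for illustration and are not taken from Table 1 or Table 2; `modal_ranking` is a hypothetical helper name.

```python
from collections import Counter


def modal_ranking(rankings, top=4):
    """For each position, return the variable(s) ranked there most often."""
    result = []
    for pos in range(top):
        counts = Counter(r[pos] for r in rankings)
        best = max(counts.values())
        # Several variables are kept when they tie for the highest count.
        result.append(sorted(v for v, c in counts.items() if c == best))
    return result


# Hypothetical rankings produced by four RF models (illustrative only).
rankings = [
    ["KH25C", "Pvap25", "KdegA", "MW"],
    ["KH25C", "KdegA", "Pvap25", "MW"],
    ["Pvap25", "KH25C", "KdegA", "Sol25"],
    ["KH25C", "Pvap25", "MW", "Sol25"],
]
print(modal_ranking(rankings))
# [['KH25C'], ['Pvap25'], ['KdegA'], ['MW', 'Sol25']]
```

In the fourth position two variables are each chosen by two models, so both are reported, mirroring the tied entries mentioned in the text.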

As one can see, there are some variables, like KH25C and Pvap25, which are very often ranked in the first two positions regardless of the subset of data on which the models have been run. Besides this, there are several cases in which the ranking order of the variables changes only slightly when switching from the subset of data obtained using estimated values to the one obtained using measured values for a certain variable. However, these results have to be taken with care, because there is a certain risk that they were influenced by the number of data contained in the subset used, since each of the 15 subsets had a different cardinality (Marvuglia et al., 2014a). Furthermore, extensive analyses carried out using different variations of partial least squares regression (PLSR) models have shown that the coefficients of determination (R²) obtained in the two cases (estimated vs measured data of a certain variable) are always very close (with about 13% discrepancy, on average), whatever the output that one tries to model (Marvuglia et al., 2014b). The only exception is the case of the subsets obtained using KdegA as the discriminating variable. In this case, differences of up to 46% in the R² have been obtained for certain output variables between the model built on data with measured KdegA and the model built on data with estimated KdegA.

Generally speaking, we therefore do not consider this splitting procedure to have a strong influence on the results obtained or on the general conclusions that one can draw. For this reason we decided not to perform any further analysis using this data splitting procedure. This choice seems to be in line with the findings of Birkved and Heijungs (2011), who, however, use only linear models. They state in fact that "whether the linear models have been derived from fate factors calculated from real independent data (measured compounds specific input) or from estimated data sets will most likely not influence the parameterisation of the derived linear meta-models significantly". They also argue that "the lack of importance of data origin is caused by the fact that USEtox as any other model treats estimated and measured data sets the same way".

3 Results and discussion

In this section the results of the RF in terms of informative variable selection (importance ranking) and uncertainty quantification are presented. The %IncMSE of each input variable is the measure used to rank the variables, while the standard deviation of the %IncMSE, computed over the 50 iterations, gives a measure of the level of confidence one can have in the results. The boxplot diagram in the left-hand part of Fig. 2 shows the increase of mean squared error for each variable in the model having FF_airC_airC as the output. It can be observed that in this case the variable KdegA is the most important, followed by KH25C and Pvap25. Adding other input variables to the model does not improve its explanatory power. One can also notice in this case that the variance of the %IncMSE (distance between the whiskers of the boxes in the boxplot) is low. Because bootstrap sampling is used, the variable importance values can vary slightly each time RF is run, but the ranking position of each variable typically remains unchanged (Breiman, 2001).

The box in the middle of Fig. 2 shows the same kind of boxplot, but for a model where the output variable (still FF_airC_airC) has been shuffled. One can notice now that there is no clear ranking of the variables' importance as in the previous case: the values of %IncMSE range from 0 to 0.04 (instead of 0 to 0.4) and the variance of the %IncMSE is much larger.

The model is therefore very sensitive to the nature of the output and "recognizes" corrupted outputs (i.e. outputs which are not correlated with the inputs, but are just the effect of noise). Finally, the right-hand part of Fig. 2 describes the case in which the output variable is restored to the correct one, but three additional noisy variables have been added to the set of inputs. In particular, the noisy variables have been obtained by shuffling the variable previously identified as the most important (KdegA) and two variables previously identified among the least important (BAFfish and KOC). As one can observe, the model recognizes these three additional variables (respectively named KdegA_sh, BAFfish_sh and KOC_sh) as noisy variables which do not contribute to the explanation of the output.

These %IncMSE values, on which the importance ranking of the variables is based, incorporate not only the effects of individual parameter uncertainties, but also all the higher-order interactions among the input parameters that best explain the variance in model predictions. This ability to consider all higher-order parameter interactions in calculating relative importance among input parameters is particularly useful in global sensitivity analysis for models based on complex mechanisms (Harper et al., 2011).

Fig. 2. %IncMSE of the RF run on Output 1 (left), on Output 1 shuffled (middle) and on Output 1 with three shuffled variables added to the set of input variables (right).

Table 1 Variables in the first four ranking positions according to the RF run on the subsets of estimated data. The values of the variables are logarithmically transformed.

Table 2 Variables in the first four ranking positions according to the RF run on the subsets of measured data. The values of the variables are logarithmically transformed.

In order to further test the model, the same procedure was applied to two other output variables involving more complex environmental paths: Output 8 = FF from continental natural soil to continental freshwater (FF_natsoilC_frwaterC) and Output 12 = iF from continental freshwater to meat (iF_frwater_meat). The results are shown in Fig. 3 for the first case and in Fig. 4 for the second case.

The remaining boxplots (for all the other 10 output variables explored) are shown in the Appendix. The obtained MSE, the Oob MSE and the ranking of the variables in order of importance (measured via the corresponding %IncMSE) are reported in Table 3. The variables whose position is marked with an asterisk are those for which the ranking position can be affected by randomness in the algorithm, such that in a new run of the RF this ranking position could change slightly. The ones with no asterisk are the variables whose position is clearly not subject to change as an effect of randomness. As one can see, in most cases the first three or four most important variables are determined without ambiguity.

The next step in the analysis is the investigation of the models' residuals, both in the presence and in the absence of shuffled variables. Figure 5 shows the residuals obtained with the first output variable (FF_airC_airC).

As one can see, the predicted values are in good agreement with the true (target) values, on both the training and the test sets. As one can expect, when the structure of the output is destroyed by shuffling, the predictive performance of the model is poor. In particular, the training data (black circles) tend towards the diagonal line, which is due to the learning process of RF, but the generalization ability, represented by the testing data (red triangles), is of poor quality (as expected). It is in fact known that the best model when there is no structure in the data is the mean value of the data, which in this case is around -0.2.

For illustration, Fig. 6 shows the model residuals for two more output variables, namely Output 8 and Output 12, which have been chosen because they represent more complex fate and effect mechanisms than the transfer of chemicals within the same compartment, like urban air at the continental scale in the case of Output 1.

Fig. 3. %IncMSE of the RF run on Output 8 (left), on Output 8 shuffled (middle) and on Output 8 with three shuffled variables added to the set of input variables (right).

Fig. 4. %IncMSE of the RF run on Output 12 (left), on Output 12 shuffled (middle) and on Output 12 with three shuffled variables added to the set of input variables (right).

Fig. 5. Model residuals of the RF run on Output 1 (FF_airC_airC) with (right) and without (left) shuffling.

Table 3 Model errors assessment obtained on models' residuals; variables' importance ranking (variables logarithmically transformed).

Summary statistics for the models' residuals are shown in Table 4. As one can observe, when the output variable is shuffled the residuals are characterized by a high standard deviation and a low kurtosis.
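For readers who wish to reproduce this kind of summary, a minimal sketch is given below. It uses population moments and the non-excess (Pearson) definition of kurtosis, which equals 3 for a normal distribution; which definition underlies Table 4 is not stated in the text, so this choice is an assumption, as are the toy residual vectors and the function name.

```python
import math


def residual_summary(residuals):
    """Mean, standard deviation and Pearson kurtosis (population moments)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n  # second central moment
    m4 = sum((r - mean) ** 4 for r in residuals) / n  # fourth central moment
    return {"mean": mean, "std": math.sqrt(m2), "kurtosis": m4 / m2 ** 2}


# Toy residuals: a flat, spread-out set and a peaked set with two outliers.
flat = [-1.0, 1.0, -1.0, 1.0]
peaked = [0.0] * 8 + [-2.0, 2.0]

print(residual_summary(flat))    # std 1.0, kurtosis 1.0
print(residual_summary(peaked))  # std about 0.894, kurtosis about 5.0
```

The flat set combines a larger standard deviation with a lower kurtosis, which is the qualitative pattern reported for the residuals of the shuffled-output models.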

Fig. 6. Model residuals of the RF run on Output 8 (left) and Output 12 (right).

Table 4 Summary statistics computed on models' residuals (TRN = training set; TST = testing set)

4 Conclusions

The paper deals with the application of the RF algorithm to the study of input-output relationships between molecular-based properties and human and eco-toxicology impact factors of organic compounds obtained from the USEtox toxicity model.

The application of the RF algorithm allowed at the same time: i) a ranking of the variables in order of importance as to their contribution to the prediction of the target output; ii) an assessment of the uncertainty related to this ranking. It is worth noting that single-task modelling has been carried out here, meaning that a RF was run using one output variable at a time. In particular, 9 FFs and 4 iFs (out of the full set of factors calculated by USEtox) have been chosen to perform the modelling.

The results obtained show that whether a particular variable in the input set comes from a measurement or is estimated using default QSAR models (see Huijbregts et al., 2010, page 12) does not particularly influence the results of the ranking procedure.

Moreover, it is important to remark that changes and adaptations of the USEtox database (as of any other multimedia fate and exposure model) are likely to happen as frequently as new measurement data become available. Since the quality and type of data are very influential on the type of results obtained with a data-driven model (and on the conclusions drawn therein), the results of our analysis could change significantly if dramatic changes in the substance-specific properties database occurred. However, this would not undermine the validity of the overall methodology.

The RF models show good performance, as demonstrated by the analysis of residuals of training and test sets, and are robust with respect to noise injection, as demonstrated through a variable shuffling procedure.

The main conclusion that can be drawn from the study is that there is no fixed set of input variables which is selected as informative for all the outputs. This means that, even though only a few (at most 4) variables are detected every time as important to perform the modelling for a specific output (factor), these variables may change when one decides to predict another factor. This is the reason why the study of the entire output space at the same time, termed multi-task learning (Kanevski, 2012), is envisioned as a promising new research direction in order to derive a generally applicable variable selection procedure and identify a minimum set of inputs which are really crucial to globally explain the entire output space. This will require a careful (machine-learning based) analysis of the output space, including the identification and filtering of possibly redundant information.

The approach could also be extended to other input-output datasets and models, e.g. USES-LCA 2.0 (van Zelm et al., 2009), which covers 3,396 chemicals, or using the physical-chemical, fate and toxicity data made available by REACH (ECHA Chemical inventory database: http://echa.europa.eu/web/guest/information-on-chemicals/cl-inventorydatabase) for a large number of substances (about 12,000 substances today and about 30,000 by the end of 2018). In particular, it would be interesting to start from the modelling of human effect factors, for which recognized data gaps are present in USEtox (especially in median effective doses, ED50). This task is indeed very challenging and it will first require a data gap filling procedure to handle missing values.

Acknowledgements

This work has been carried out in the framework of the project UNIC (Using Machine Learning for toxicological characterization of chemical emissions) under a research visiting grant provided by the Herbette Foundation, Lausanne, Switzerland.

References

Birkved, M. and Heijungs, R. (2011), Simplified fate modelling in respect to ecotoxicological and human toxicological characterisation of emissions of chemical compounds. International Journal of Life Cycle Assessment, 16(8), 739-747.

Breiman, L. (2001), Random Forests. Machine Learning, 45, 5-32.

Harper, E.B., Stella, J.C. and Fremier, A.K. (2011), Global sensitivity analysis for complex ecological models: a case study of riparian cottonwood population dynamics. Ecological Applications, 21(4), 1225-1240.

Hertwich, E.G., Mateles, S.F., Pease, W.S. and McKone, T.E. (2001), Human toxicity potentials for life-cycle assessment and toxics release inventory risk screening. Environmental Toxicology and Chemistry, 20(4), 928-939.

Huijbregts, M., Hauschild, M., Jolliet, O., Margni, M., McKone, T., Rosenbaum, R.K. and van de Meent, D. (2010), USEtox User manual. http://www.usetox.org/sites/default/files/support-tutorials/user_manual_usetox.pdf.

Huijbregts, M.A.J., Thissen, U.M.J., Guinée, J.B., Jager, T., Kalf, D., Van de Meent, D., Ragas, A.M.J., Wegener Sleeswijk, A. and Reijnders, L. (2000), Priority assessment of toxic substances in life cycle assessment. Part I: calculation of toxicity potentials for 181 substances with the nested multi-media fate, exposure and effects model USES-LCA. Chemosphere, 41, 541-573.

Kanevski, M. (2012), Multitask Learning of Environmental Spatial Data. In: Seppelt et al. (Eds.): Proceedings of the Sixth Biennial Meeting of the International Environmental Modelling and Software Society (iEMSs 2012): Managing Resources of a Limited Planet, Leipzig, Germany, 2012.

Kanevski, M. (2013), A Methodology for Automatic Analysis and Modeling of Spatial Environmental Data. GEOProcessing 2013: The Fifth International Conference on Advanced Geographic Information Systems, Applications, and Services.

Kanevski, M., Pozdnoukhov, A. and Timonin, V. (2009), Machine Learning for Spatial Environmental Data. Theory, Applications, and Software, EPFL Press: Lausanne, Switzerland.

Liaw, A. and Wiener, M. (2002), Classification and regression by random forest. R News, 2/3, 18-22.

Marvuglia, A., Kanevski, M., Leuenberger, M. and Benetto, E. (2014a), Variables selection for ecotoxicity and human toxicity characterization using Gamma Test. In: B. Murgante et al. (Eds.): ICCSA 2014, Part III, LNCS 8581, pp. 640-652, 2014.

Marvuglia, A., Kanevski, M., Leuenberger, M. and Benetto, E. (2014b), Using machine learning for human toxicity and freshwater ecotoxicity characterization of chemical emissions, SETAC Europe 24th Annual Meeting, Basel, Switzerland, 11-15 May 2014.

Marvuglia, A., Kanevski, M. and Benetto, E. (2015), Machine learning for toxicity characterization of organic chemical emissions using USEtox database: learning the structure of the input space. Environment International, 83, 72-85.

Pennington, D.W., Margni, M., Ammann, C. and Jolliet, O. (2005), Multimedia fate and human intake modeling: spatial versus nonspatial insights for chemical emissions in Western Europe. Environmental Science & Technology, 39(4), 1119-1128.

Rosenbaum, R.K., Bachmann, T.M., Gold, L.S., Huijbregts, M., Jolliet, O., Juraske, R., Köhler, A., Larsen, H.F., MacLeod, M., Margni, M., McKone, T.E., Payet, J., Schuhmacher, M., van de Meent, D. and Hauschild, M.Z. (2008), USEtox - The UNEP-SETAC toxicity model: recommended characterisation factors for human toxicity and freshwater ecotoxicity in Life Cycle Impact Assessment. International Journal of Life Cycle Assessment, 13(7), 532-546.

Stefánsson, A., Končar, N. and Jones, A.J. (1997), A note on the Gamma Test. Neural Computing and Applications, 5, 131-133.

Van Zelm, R., Huijbregts, M.A.J. and Van de Meent, D. (2009), USES-LCA 2.0 - a global nested multimedia fate, exposure, and effects model. International Journal of Life Cycle Assessment, 14, 282-284.

Wegener Sleeswijk, A. and Heijungs, R. (2010), GLOBOX: A spatially differentiated global fate, intake and effect model for toxicity assessment in LCA. Science of the Total Environment, 408, 2817-2832.