A multi-loss super regression learner (MSRL) with application to survival prediction using proteomics



Brain and Mind Institute


Many regression techniques have been proposed over the years to handle large numbers of regressors. However, given the complex nature of data emerging from recent high-throughput experiments, it is unlikely that any single technique will succeed in modeling all data types. Multiple algorithms from the collection of modern regression techniques capable of handling high-dimensional regressors should therefore be considered when analyzing such data. A novel approach to building a super regression learner is proposed; it can be fit to a training data set in order to predict future values of a continuous outcome. The resulting super regression model is multi-objective in nature and mimics the performance of the best component regression models irrespective of the data type. This is accomplished by combining elements of bootstrap-based risk calculation, rank aggregation, and stacking. The utility of this approach is demonstrated through its use on mass spectrometry data.
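
The abstract names three ingredients (bootstrap-based risk calculation, rank aggregation, and stacking) without giving the procedure. The Python sketch below is one plausible way to combine them and should not be read as the authors' algorithm. Everything specific in it is an assumption made for illustration: the component learners (ridge regression at three penalty strengths), the two losses (squared and absolute error), the Borda-style sum of ranks, the unconstrained least-squares stacking weights, and all function names, including the hypothetical `fit_msrl_sketch`.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression; returns a prediction function."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return lambda X_new: (X_new - x_mean) @ beta + y_mean

# Assumed component learners: ridge at three penalty strengths.
LEARNERS = {f"ridge_{lam}": (lambda X, y, lam=lam: ridge_fit(X, y, lam))
            for lam in (0.1, 10.0, 1000.0)}

# Assumed loss functions, giving the "multi-loss" view of each learner.
LOSSES = {
    "mse": lambda y, p: np.mean((y - p) ** 2),
    "mae": lambda y, p: np.mean(np.abs(y - p)),
}

def bootstrap_risks(X, y, n_boot=25, seed=0):
    """Average out-of-bag loss for every (learner, loss) pair."""
    rng = np.random.default_rng(seed)
    n = len(y)
    total = np.zeros((len(LEARNERS), len(LOSSES)))
    used = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # bootstrap sample
        oob = np.setdiff1d(np.arange(n), idx)   # rows left out of it
        if oob.size == 0:
            continue
        used += 1
        for i, fit in enumerate(LEARNERS.values()):
            pred = fit(X[idx], y[idx])(X[oob])
            for j, loss in enumerate(LOSSES.values()):
                total[i, j] += loss(y[oob], pred)
    return total / max(used, 1)

def aggregate_ranks(risks):
    """Borda-style aggregation: rank learners per loss, then sum the ranks."""
    ranks = risks.argsort(axis=0).argsort(axis=0)    # 0 = lowest risk
    return ranks.sum(axis=1).argsort(kind="stable")  # learner indices, best first

def fit_msrl_sketch(X, y, top_k=2, n_folds=5, seed=0):
    """Keep the top_k learners by aggregated rank and stack them."""
    names = list(LEARNERS)
    order = aggregate_ranks(bootstrap_risks(X, y, seed=seed))
    keep = [names[i] for i in order[:top_k]]
    # Out-of-fold predictions, so stacking weights are not fit on in-sample fits.
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), n_folds)
    Z = np.zeros((len(y), len(keep)))
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        for k, name in enumerate(keep):
            Z[test, k] = LEARNERS[name](X[train], y[train])(X[test])
    weights, *_ = np.linalg.lstsq(Z, y, rcond=None)  # unconstrained weights
    models = [LEARNERS[name](X, y) for name in keep]  # refit on all data
    predict = lambda X_new: np.column_stack([m(X_new) for m in models]) @ weights
    return predict, keep, weights

# Synthetic data for illustration only; not the mass spectrometry data.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 30))
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=80)
predict, kept, weights = fit_msrl_sketch(X, y)
print(kept, weights.round(3))
```

Summing ranks across losses is only one of several rank aggregation schemes, and super learners often constrain the stacking weights to be non-negative; both choices are kept deliberately simple here.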


This work was published before the author joined Aga Khan University.


Computational Statistics