Efficient Multi-Task Feature Learning with Calibration
published: Oct. 7, 2014, recorded: August 2014, views: 2180
Report a problem or upload filesIf you have found a problem with this lecture or would like to send us extra material, articles, exercises, etc., please use our ticket system to describe your request and upload the data.
Enter your e-mail into the 'Cc' field, and we will keep you updated with your request's status.
Multi-task feature learning has been proposed to improve the generalization performance by learning the shared features among multiple related tasks and it has been successfully applied to many real-world problems in machine learning, data mining, computer vision and bioinformatics. Most existing multi-task feature learning models simply assume a common noise level for all tasks, which may not be the case in real applications. Recently, a Calibrated Multivariate Regression (CMR) model has been proposed, which calibrates different tasks with respect to their noise levels and achieves superior prediction performance over the non-calibrated one. A major challenge is how to solve the CMR model efficiently as it is formulated as a composite optimization problem consisting of two non-smooth terms. In this paper, we propose a variant of the calibrated multi-task feature learning formulation by including a squared norm regularizer. We show that the dual problem of the proposed formulation is a smooth optimization problem with a piecewise sphere constraint. The simplicity of the dual problem enables us to develop fast dual optimization algorithms with low per-iteration cost. We also provide a detailed convergence analysis for the proposed dual optimization algorithm. Empirical studies demonstrate that, the dual optimization algorithm quickly converges and it is much more efficient than the primal optimization algorithm. Moreover, the calibrated multi-task feature learning algorithms with and without the squared norm regularizer achieve similar prediction performance and both outperform the non-calibrated ones. Thus, the proposed variant not only enables us to develop fast optimization algorithms, but also keeps the superior prediction performance of the calibrated multi-task feature learning over the non-calibrated one.
Link this pageWould you like to put a link to this lecture on your homepage?
Go ahead! Copy the HTML snippet !