A unified approach to multi-block and multi-group data analysis

Last modified: 2013-06-16

#### Abstract

On the one hand, multi-block data analysis concerns the analysis of several sets of variables (blocks) observed on the same set of individuals. On the other hand, multi-group data analysis concerns the analysis of one set of variables observed on a set of individuals, taking into account a group structure at the level of the individuals. Two types of partition of an individuals × variables data matrix $\mathbf{X}$ are then defined. In the multi-block framework, the column partition $\mathbf{X} = [\mathbf{X}_1, \ldots, \mathbf{X}_j, \ldots, \mathbf{X}_J]$ is considered. In this case, each block $\mathbf{X}_j$ is an $n \times p_j$ data matrix and represents a set of $p_j$ variables observed on a set of $n$ individuals. The number and the nature of the variables differ from one block to another, but the individuals must be the same across blocks. In the multi-group framework, the row partition $\mathbf{X} = [\mathbf{X}_1^t, \ldots, \mathbf{X}_i^t, \ldots, \mathbf{X}_I^t]^t$ is considered. In this framework, the same set of variables is observed on different groups of observations. Each matrix $\mathbf{X}_i$ is an $n_i \times p$ data matrix, called a group in this paper, and represents a set of $p$ variables observed on a set of $n_i$ individuals. The number of observations may differ from one group to another. Many methods exist for multi-block and multi-group data analysis. Regularized Generalized Canonical Correlation Analysis (RGCCA) was proposed in Tenenhaus & Tenenhaus (2011) and turned out to include a remarkably large number of criterion-based multi-block data analysis methods as particular cases. In this paper, we intend to extend RGCCA so that it can also serve as a unifying tool for multi-group data analysis. Only first-dimension components are discussed in this paper. Components for further dimensions can be obtained by applying the same procedures to blocks or groups deflated with respect to the components of the previous dimensions.
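The two partitions can be made concrete with a small NumPy sketch. The matrix sizes, the block and group sizes, and the helper `deflate` are hypothetical choices for illustration. The regression-based deflation shown is one common choice and is assumed here, since the abstract does not give a formula. The component used is the first principal component of a block, a placeholder rather than an RGCCA component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data matrix: n = 6 individuals, p = 5 variables.
X = rng.normal(size=(6, 5))

# Multi-block: column partition X = [X_1, ..., X_J].
# Every block keeps all n rows; block j has p_j columns.
block_sizes = [2, 3]  # p_1, p_2 (assumed sizes)
blocks = np.split(X, np.cumsum(block_sizes)[:-1], axis=1)

# Multi-group: row partition X = [X_1^t, ..., X_I^t]^t.
# Every group keeps all p columns; group i has n_i rows.
group_sizes = [4, 2]  # n_1, n_2 (assumed sizes)
groups = np.split(X, np.cumsum(group_sizes)[:-1], axis=0)


def deflate(block, component):
    """Remove from `block` the part explained by `component` (length n).

    Assumed deflation rule: residual of regressing each column on the component.
    """
    y = component.reshape(-1, 1)
    return block - y @ (y.T @ block) / (y.T @ y)


# Placeholder first component for block 1: its first principal component.
# This is NOT the RGCCA component, only a stand-in for illustration.
centered = blocks[0] - blocks[0].mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
y1 = centered @ vt[0]
deflated = deflate(centered, y1)

print([b.shape for b in blocks])  # [(6, 2), (6, 3)]
print([g.shape for g in groups])  # [(4, 5), (2, 5)]
```

After deflation, the block is orthogonal to the component just extracted, so repeating the same procedure on `deflated` would yield a component for the next dimension.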