
MEMBERS

Program Members

Statistical Science and its Applications

Ryuei Nishii
Degree: Doctor of Science (Hiroshima University)
Research Interests: Statistics, Pattern Recognition, Image Analysis
Unit: Uncertainty

Report
Pattern recognition of high-dimensional data

Thanks to recent advances in measurement techniques, data with several thousand or more variables can now be obtained. For example, the expression levels of more than 20,000 (= p) human genes can be observed, whereas the sample size n is typically only a few dozen. This situation is known in statistics as the "n << p" problem, and it has become an active research topic in recent years.

Various pattern recognition techniques have been proposed for such high-dimensional data. For example, support vector machines (SVM) and artificial neural networks (ANN) are known to be effective. My laboratory has studied AdaBoost, which builds its final classifier as a linear combination of a large number of randomly generated weak classifiers. The choice of base classifiers is important because it determines how powerful the final classifier is. We construct each base classifier from a random linear combination of randomly selected variables. Furthermore, we adopted bagging to overcome overfitting and instability. The proposed method outperformed SVM and ANN on problems from various fields, including remote sensing. (Figure 1 shows a two-class classification problem in which the true boundaries between the two categories are concentric circles. The proposed method (bottom right) is seen to be superior to SVM and ordinary AdaBoost.)
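The idea of boosting randomly generated base classifiers and then bagging the result can be illustrated with a minimal sketch. This is not the laboratory's implementation: the function names, the threshold rule applied to each random projection, the number of candidate classifiers per round, and all default parameters are assumptions made for illustration. Labels are assumed to be +1 or -1.

```python
import math
import random


def fit_random_stump(X, y, w, rng, k=2):
    """Fit one weak classifier on a random linear combination of k variables.

    Assumes the weights w sum to 1 and the labels y are +1 / -1.
    Returns (weighted_error, stump).
    """
    p = len(X[0])
    idx = rng.sample(range(p), min(k, p))       # randomly selected variables
    coef = [rng.gauss(0.0, 1.0) for _ in idx]   # random linear combination
    proj = [sum(c * x[j] for c, j in zip(coef, idx)) for x in X]
    vals = sorted(set(proj))
    # Candidate cuts: one below all points, then midpoints between neighbours.
    cuts = [vals[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(vals, vals[1:])]
    best_err, best_cut, best_sign = 2.0, cuts[0], 1
    for t in cuts:
        # Weighted error of the rule "predict +1 if z > t, else -1".
        err = sum(wi for wi, z, yi in zip(w, proj, y)
                  if (1 if z > t else -1) != yi)
        # Flipping the rule turns an error of err into 1 - err.
        for sign, e in ((1, err), (-1, 1.0 - err)):
            if e < best_err:
                best_err, best_cut, best_sign = e, t, sign
    return best_err, (idx, coef, best_cut, best_sign)


def stump_predict(stump, x):
    idx, coef, cut, sign = stump
    z = sum(c * x[j] for c, j in zip(coef, idx))
    return sign if z > cut else -sign


def adaboost(X, y, rounds, rng, candidates=5):
    """Discrete AdaBoost; each round keeps the best of a few random stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, stump = min((fit_random_stump(X, y, w, rng)
                          for _ in range(candidates)), key=lambda r: r[0])
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def score(model, x):
    return sum(alpha * stump_predict(stump, x) for alpha, stump in model)


def bagged_adaboost(X, y, bags=5, rounds=20, seed=0):
    """Train one AdaBoost model per bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(bags):
        pick = [rng.randrange(n) for _ in range(n)]
        models.append(adaboost([X[i] for i in pick], [y[i] for i in pick],
                               rounds, rng))
    return models


def predict(models, x):
    """Majority vote over the bagged models (use an odd number of bags)."""
    votes = sum(1 if score(m, x) > 0 else -1 for m in models)
    return 1 if votes > 0 else -1
```

For example, `models = bagged_adaboost(X, y)` followed by `predict(models, x)` returns +1 or -1 for a feature vector `x`. Averaging over bootstrap resamples is one simple way to reduce the instability of a single boosted model; other combinations of boosting and bagging are possible.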

In addition, we studied image classification based on Markov random fields, which model the spatial dependence among pixels. We have shown that this modeling classifies pixel labels successfully.
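How a Markov random field uses spatial dependence can be sketched with a small contextual classifier. The specific choices here are assumptions for illustration and are not taken from the report: a Potts-type prior on the four nearest neighbours, Gaussian class-conditional densities with known means, and iterated conditional modes (ICM) as the optimiser. The function name and parameters are hypothetical.

```python
def icm_segment(image, means, sigma=1.0, beta=1.0, sweeps=5):
    """Label each pixel by ICM under a Potts-type Markov random field.

    image : list of rows of grey values
    means : class means (assumed known); labels are indices into this list
    beta  : strength of the spatial smoothing (beta = 0 ignores neighbours)
    """
    rows, cols = len(image), len(image[0])
    classes = range(len(means))
    # Start from the non-contextual rule: nearest class mean per pixel.
    labels = [[min(classes, key=lambda c: (v - means[c]) ** 2) for v in row]
              for row in image]
    for _ in range(sweeps):
        changed = False
        for i in range(rows):
            for j in range(cols):
                nbrs = [labels[a][b]
                        for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= a < rows and 0 <= b < cols]

                def energy(c):
                    # Data term plus a reward for agreeing with neighbours.
                    data = (image[i][j] - means[c]) ** 2 / (2.0 * sigma ** 2)
                    return data - beta * sum(1 for n in nbrs if n == c)

                new = min(classes, key=energy)
                if new != labels[i][j]:
                    labels[i][j] = new
                    changed = True
        if not changed:  # a local minimum of the energy has been reached
            break
    return labels
```

With `beta = 0` the result reduces to pixel-wise classification; larger values of `beta` favour spatially smooth label maps, so an isolated noisy pixel tends to take the label of its neighbours.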

Cooperative research with industry

My laboratory started collaborative research with industry in 2008. Because physical models provide the basic equations in the manufacturing industry, statistical models have not been regarded as important there. Through this cooperative research, I found that introducing statistical approaches in this setting is very important. I hope that the cooperative research contributes not only to industry but also to society as a whole (cf. Figure 2).
