Events and Meetings of Italian Statistical Society, Advances in Latent Variables - Methods, Models and Applications

Mining the ambiguity: correspondence and network analysis for discovering word sense
Simona Balbi, Agnieszka Elzbieta Stawinoga

Assuming that language can be modelled as a network of words, mining knowledge from textual databases is difficult because of their high dimensionality and the ambiguity that characterises words and their use. From a methodological viewpoint, we propose a strategy that highlights the differences between the manifest relations emerging from Network Analysis (NA) and the latent relations obtained by lexical Correspondence Analysis (CA). The aim of this paper is to deal with the word-sense disambiguation problem not in the usual pre-processing step, but during the analysis itself. Results from the analysis of a management commentary are presented in order to propose some statistical lexical sources useful in the specific domain of business information.

