A general fuzzy-based framework for text representation and its application to text categorization
In this paper we develop a general framework for text representation based on fuzzy set theory. This work extends our original ideas [5],[4], in which a document is represented by a set of fuzzy concepts. The importance degrees of these fuzzy concepts characterize the semantics of a document and are calculated by a specified aggregation function over its index terms. Based on this representation, a general framework is proposed and applied to the text categorization problem. An algorithm for choosing fuzzy concepts is given in detail. Experiments on a real-world data set show that the proposed method is superior to the conventional method for text representation in text categorization. © Springer-Verlag Berlin Heidelberg 2006.
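The representation described above can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the concept names and membership degrees in `CONCEPTS`, the max-normalized term frequency in `term_weights`, and the algebraic sum used as the aggregation function in `fuzzy_or`. The abstract only states that some specified aggregation function of index terms is used, so any other fuzzy aggregation could be substituted.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical fuzzy concepts: each maps index terms to a membership
# degree in [0, 1]. These values are illustrative, not from the paper.
CONCEPTS: Dict[str, Dict[str, float]] = {
    "sports": {"football": 1.0, "match": 0.6, "team": 0.8},
    "finance": {"bank": 1.0, "market": 0.7, "team": 0.1},
}


def term_weights(tokens: List[str]) -> Dict[str, float]:
    """Weight each index term by its count divided by the largest count."""
    counts: Dict[str, int] = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    peak = max(counts.values(), default=1)
    return {term: count / peak for term, count in counts.items()}


def fuzzy_or(values: Iterable[float]) -> float:
    """Algebraic-sum aggregation, 1 - prod(1 - v), assumed here as one option."""
    result = 0.0
    for value in values:
        result = result + value - result * value
    return result


def represent(
    tokens: List[str],
    concepts: Dict[str, Dict[str, float]] = CONCEPTS,
    aggregate: Callable[[Iterable[float]], float] = fuzzy_or,
) -> Dict[str, float]:
    """Map a tokenized document to an importance degree per fuzzy concept."""
    weights = term_weights(tokens)
    return {
        name: aggregate(
            membership * weights.get(term, 0.0)
            for term, membership in terms.items()
        )
        for name, terms in concepts.items()
    }


if __name__ == "__main__":
    document = ["football", "team", "football", "match"]
    for concept, degree in represent(document).items():
        print(f"{concept}: {degree:.3f}")
```

The resulting dictionary of concept degrees plays the role of the document's feature vector, which could then be passed to any standard classifier for text categorization. Passing a different callable as `aggregate` (for example `max`) changes the aggregation without touching the rest of the sketch.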