Transformation rule learning without rule templates: A case study in part of speech tagging

Part-of-speech (POS) tagging is an important problem and one of the first steps in many natural language processing tasks. It directly affects the accuracy of many other tasks, such as syntactic parsing, word sense disambiguation, and machine translation. Stochastic models solve this problem relatively well, but they still make mistakes. Transformation-based learning (TBL) can be used to improve stochastic taggers by learning a set of transformation rules. However, its rule learning algorithm has two disadvantages: rule templates must be prepared by hand, and only rules that are instances of those templates can be generated. In this paper, we propose a model that learns transformation rules without rule templates. The model treats rule learning as a feature selection problem. Experiments on the Penn Treebank showed that the proposed model reduces the errors of stochastic taggers for some tags. © 2008 IEEE.
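
For readers unfamiliar with transformation rules, the following is a minimal sketch of the classic template-style rule used in TBL ("change tag A to tag B when the previous tag is C"), applied to the output of a baseline tagger. It illustrates the general idea only and is not the model proposed in the paper; the names `apply_rule`, `prev_tag_is`, and `count_errors`, as well as the toy sentence, are hypothetical.

```python
from typing import Callable, List, Tuple

Tagged = List[Tuple[str, str]]  # sequence of (word, tag) pairs
Condition = Callable[[Tagged, int], bool]


def prev_tag_is(wanted: str) -> Condition:
    """Build a context condition: the token before position i carries `wanted`."""
    def condition(tagged: Tagged, i: int) -> bool:
        return i > 0 and tagged[i - 1][1] == wanted
    return condition


def apply_rule(tagged: Tagged, from_tag: str, to_tag: str,
               condition: Condition) -> Tagged:
    """Rewrite from_tag to to_tag wherever the condition holds.

    Conditions are checked against the input tags, so one rewrite
    never influences another within the same pass.
    """
    out = list(tagged)
    for i, (word, tag) in enumerate(tagged):
        if tag == from_tag and condition(tagged, i):
            out[i] = (word, to_tag)
    return out


def count_errors(tagged: Tagged, gold: Tagged) -> int:
    """Number of positions where the predicted tag differs from the gold tag."""
    return sum(1 for (_, t), (_, g) in zip(tagged, gold) if t != g)


# Toy example (hypothetical data): the baseline tagger mislabels "race".
baseline = [("to", "TO"), ("race", "NN")]
gold = [("to", "TO"), ("race", "VB")]

corrected = apply_rule(baseline, "NN", "VB", prev_tag_is("TO"))
errors_before = count_errors(baseline, gold)
errors_after = count_errors(corrected, gold)
```

In template-based TBL, a learner would score many such candidate rules by how many errors they remove and keep the best ones; the abstract's point is that the space of candidates is limited by the hand-written templates.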


 Bach N.X., Cuong L.A., Ha N.V., Binh N.N.
  Keywords: Artificial intelligence; Computational linguistics; Computer aided language translation; Education; Feature extraction; Information technology; Information theory; Laws and legislation; Learning algorithms; Learning systems; Linguistics; Mathematical models; Natural language processing systems; Speech; Speech processing; Speech transmission; Stochastic programming; Technology; Case studies; Feature selection; International conferences; Language processing; Machine translation; Natural language processing; Part-of-Speech tagging; Rule learning; Transformation rules; Transformation-based learning; Treebank; Web information; Word-sense disambiguation; Stochastic models