Concept Extraction on Quranic Translation Text
Abstract
Ontology learning technology for acquiring semantic knowledge was introduced to reduce the overall time of ontology construction. The ontology construction process includes several aspects and layers, and domain concept extraction is one of the most important. This step is a prerequisite for developing an ontology and serves as the seed for the subsequent steps. In this paper, we carried out several experiments based on linguistic, statistical, and hybrid approaches to identify which techniques best extract terms from Quranic translation text. For the linguistic approach, we used POS patterns; for the statistical approach, we chose seven frequency-based techniques to select frequent terms; and for the hybrid approach, we combined the linguistic and statistical approaches. The results show that the hybrid approach is the best at identifying and filtering relevant concepts in the Quranic domain corpus.
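The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the POS pattern (a run of adjectives and nouns ending in a noun), the Penn Treebank-style tags, the raw-frequency threshold (standing in for the seven frequency-based techniques, which the abstract does not name), and the toy pre-tagged sentences are all assumptions made for illustration.

```python
from collections import Counter

# Assumed Penn Treebank-style tags; the abstract does not state the tagset.
NOUN_TAGS = {"NN", "NNS"}
ADJ_TAGS = {"JJ"}

# Hypothetical pre-tagged toy sentences (not corpus data). A real pipeline
# would obtain the tags from a POS tagger run over the translation text.
TAGGED_SENTENCES = [
    [("the", "DT"), ("merciful", "JJ"), ("lord", "NN"), ("guides", "VBZ"),
     ("the", "DT"), ("believers", "NNS")],
    [("the", "DT"), ("merciful", "JJ"), ("lord", "NN"), ("forgives", "VBZ"),
     ("the", "DT"), ("believers", "NNS")],
    [("the", "DT"), ("day", "NN"), ("of", "IN"), ("judgment", "NN"),
     ("comes", "VBZ")],
]


def linguistic_candidates(tagged_sentence):
    """Linguistic step: collect runs of adjectives/nouns that end in a noun."""
    candidates, run = [], []
    # The sentinel token flushes a run that reaches the end of the sentence.
    for word, tag in list(tagged_sentence) + [("", "END")]:
        if tag in NOUN_TAGS or tag in ADJ_TAGS:
            run.append((word, tag))
            continue
        # Run ended: drop trailing adjectives so the phrase ends in a noun.
        while run and run[-1][1] not in NOUN_TAGS:
            run.pop()
        if run:
            candidates.append(" ".join(w for w, _ in run))
        run = []
    return candidates


def statistical_filter(candidates, min_freq=2):
    """Statistical step: keep candidates seen at least min_freq times."""
    counts = Counter(candidates)
    return {term: n for term, n in counts.items() if n >= min_freq}


def hybrid_extract(tagged_sentences, min_freq=2):
    """Hybrid step: POS-pattern candidates, then a frequency filter."""
    candidates = [c for s in tagged_sentences for c in linguistic_candidates(s)]
    return statistical_filter(candidates, min_freq)


if __name__ == "__main__":
    print(hybrid_extract(TAGGED_SENTENCES))
```

On the toy input, the linguistic step proposes "merciful lord", "believers", "day", and "judgment", and the frequency filter keeps only the candidates that occur at least twice. Any other frequency-based weighting could replace `statistical_filter` without changing the linguistic step.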