Text mining : extract numerical measures to identify documents attributes

Author

Salman, Mahdi Abd

Source

Journal of Babylon University : Journal of Applied and Pure Sciences

Issue

Vol. 18, Issue 3 (30 Sep. 2010), 6 p.

Publisher

University of Babylon

Publication Date

2010-09-30

Country of Publication

Iraq

No. of Pages

6

Main Subjects

Information Technology and Computer Science

Abstract AR

The purpose of text mining is to process unstructured information, extract meaningful numbers from texts, and make the information contained in the text accessible to the various mining algorithms.

Based on preprocessing of the text files, a text mining method was used to extract and identify the important words in the text, which are later fed into classification algorithms.

Abstract EN

The purpose of Text Mining is to process unstructured (textual) information, extract meaningful numeric indices from the text, and, thus, make the information contained in the text accessible to the various data mining (statistical and machine learning) algorithms.

We describe here an approach to text mining that is based on preprocessing documents to identify significant words and phrases to be used as attributes in the classification algorithm.
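The abstract does not say which numeric measure the paper uses, so the following is only an illustrative sketch of the general idea: preprocess each document, score its words numerically, and keep the highest-scoring words as candidate attributes. It assumes TF-IDF as the measure; the stop-word list, the sample documents, and the function names (`tokenize`, `tfidf`, `top_terms`) are hypothetical and not taken from the article.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on", "are"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document, using TF-IDF weights."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    weights = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weight_map, k=3):
    """Pick the k highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(weight_map.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Made-up sample documents, for illustration only.
docs = [
    "The cat sat on the mat",
    "The dog chased the cat",
    "Stock prices are rising in the market",
]
weights = tfidf(docs)
for doc, w in zip(docs, weights):
    print(doc, "->", top_terms(w, 2))
```

Words that appear in fewer documents receive a higher weight, so "cat", which occurs in two of the three sample documents, ranks below words unique to one document. The selected terms could then serve as attributes for a classifier, which is the role the abstract assigns to the significant words.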

American Psychological Association (APA)

Salman, Mahdi Abd. 2010. Text mining : extract numerical measures to identify documents attributes. Journal of Babylon University : Journal of Applied and Pure Sciences, Vol. 18, no. 3.
https://search.emarefa.net/detail/BIM-287505

Modern Language Association (MLA)

Salman, Mahdi Abd. Text mining : extract numerical measures to identify documents attributes. Journal of Babylon University : Journal of Applied and Pure Sciences Vol. 18, no. 3 (2010).
https://search.emarefa.net/detail/BIM-287505

American Medical Association (AMA)

Salman, Mahdi Abd. Text mining : extract numerical measures to identify documents attributes. Journal of Babylon University : Journal of Applied and Pure Sciences. 2010. Vol. 18, no. 3.
https://search.emarefa.net/detail/BIM-287505

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references.

Record ID

BIM-287505