The text document processing in the information retrieval from web pages

Joint Authors

Chova, K.
Tak, V.
Ednar, P.

Source

International Journal of Intelligent Computing and Information Sciences

Issue

Vol. 6, Issue 2 (31 Jul. 2006), 11 p.

Publisher

Ain Shams University Faculty of Computer and Information Sciences

Publication Date

2006-07-31

Country of Publication

Egypt

No. of Pages

11

Main Subjects

Information Technology and Computer Science

Abstract EN

The paper describes possible representation models and weighting schemes for text documents found on the Internet.

The focus is on automatic extraction of information from texts, including pre-processing of text documents.

The paper also presents the results of experiments carried out on the 20 News Groups collection of documents.

These experiments concern the influence of training set cardinality and of a suitable weighting of text documents on the precision of document classification.

The results of experiments with k-means clustering and k-means clustering with controlled initialization are also presented.

American Psychological Association (APA)

Chova, K., Tak, V., & Ednar, P. (2006). The text document processing in the information retrieval from web pages. International Journal of Intelligent Computing and Information Sciences, Vol. 6, no. 2.
https://search.emarefa.net/detail/BIM-284200

Modern Language Association (MLA)

Chova, K., et al. The text document processing in the information retrieval from web pages. International Journal of Intelligent Computing and Information Sciences, Vol. 6, no. 2 (Jul. 2006).
https://search.emarefa.net/detail/BIM-284200

American Medical Association (AMA)

Chova, K., Tak, V., Ednar, P. The text document processing in the information retrieval from web pages. International Journal of Intelligent Computing and Information Sciences. 2006. Vol. 6, no. 2.
https://search.emarefa.net/detail/BIM-284200

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references.

Record ID

BIM-284200