An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling

Co-authors

Devi, R. Suganya
Manjula, D.
Siddharth, R. K.

Source

The Scientific World Journal

Issue

Vol. 2015, no. 2015 (31 December 2015), pp. 1-9, 9 p.

Publisher

Hindawi Publishing Corporation

Publication date

2015-06-07

Country of publication

Egypt

Number of pages

9

Main subjects

Medicine
Information Technology and Computer Science

Abstract EN

Web Crawling has acquired tremendous significance in recent times and it is aptly associated with the substantial development of the World Wide Web.

Web Search Engines face new challenges due to the availability of vast amounts of web documents, thus making the retrieved results less applicable to the analysers.

However, current Web Crawling focuses solely on obtaining the links of the corresponding documents.

Today, various algorithms and software tools crawl links from the web, but those links must be processed further before they can be used, which increases the load on the analyser.

This paper concentrates on crawling the links and retrieving all information associated with them to facilitate easy processing for other uses.

First, the links are crawled from the specified uniform resource locator (URL) using a modified version of the Depth First Search algorithm, which allows complete hierarchical scanning of the corresponding web links.
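
The abstract does not say how the authors modified Depth First Search, so the following is only a minimal sketch of a plain depth-first link crawl, assuming a depth limit and a set of already-seen URLs. The names `LinkParser`, `crawl_dfs`, `fetch`, and `max_depth` are hypothetical and do not come from the paper; `fetch` is a caller-supplied function that maps a URL to its HTML text.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl_dfs(start_url, fetch, max_depth=3):
    """Return the URLs reached from start_url in depth-first order.

    fetch is an assumed, caller-supplied function (URL -> HTML text),
    for example a thin wrapper around urllib.request.urlopen.
    """
    visited = []   # URLs in the order they were first visited
    seen = set()   # guards against cycles and repeated links

    def visit(url, depth):
        if url in seen or depth > max_depth:
            return
        seen.add(url)
        visited.append(url)
        try:
            html = fetch(url)
        except Exception:
            return  # unreachable page: keep the link, skip its children
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            child, _fragment = urldefrag(urljoin(url, href))
            visit(child, depth + 1)  # go deeper before trying siblings

    visit(start_url, 0)
    return visited
```

Passing `fetch` as a parameter keeps the traversal logic separate from network access, so the same function can be exercised against an in-memory dictionary of pages or against live HTTP requests.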

Each link is then accessed via its source code, and its metadata, such as title, keywords, and description, are extracted.

This content is essential for any analysis to be carried out on the Big Data obtained as a result of Web Crawling.
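
As an illustration of this extraction step, here is a minimal sketch that reads the title, keywords, and description from a page's HTML source using Python's standard `html.parser` module. It assumes the metadata lives in the `<title>` element and in `<meta name="keywords">` and `<meta name="description">` tags; the names `MetadataParser` and `extract_metadata` are hypothetical, and this is not the authors' implementation.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Extracts <title> text and the keywords/description <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": "", "keywords": "", "description": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Meta names are matched case-insensitively.
            name = (attr.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.metadata[name] = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.metadata["title"] += data


def extract_metadata(html):
    """Return a dict with 'title', 'keywords', and 'description' keys."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    parser.metadata["title"] = parser.metadata["title"].strip()
    return parser.metadata
```

Pages that omit a field yield an empty string for it, so every crawled link produces a record with the same three keys.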

American Psychological Association (APA) citation style

Devi, R. Suganya & Manjula, D. & Siddharth, R. K. 2015. An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling. The Scientific World Journal, Vol. 2015, no. 2015, pp. 1-9.
https://search.emarefa.net/detail/BIM-1079073

Modern Language Association (MLA) citation style

Devi, R. Suganya…[et al.]. An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling. The Scientific World Journal No. 2015 (2015), pp. 1-9.
https://search.emarefa.net/detail/BIM-1079073

American Medical Association (AMA) citation style

Devi, R. Suganya & Manjula, D. & Siddharth, R. K. An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling. The Scientific World Journal. 2015. Vol. 2015, no. 2015, pp. 1-9.
https://search.emarefa.net/detail/BIM-1079073

Data type

Articles

Text language

English

Notes

Includes bibliographical references

Record number

BIM-1079073