Exploiting Wikipedia to support exploratory Arabic search on the web

Other Title(s)

استغلال الويكيبيديا لتدعيم البحث الاستكشافي العربي في الويب

Dissertant

Abid, Ahmad Muhammad Abd al-Aziz

Thesis advisor

al-Agha, Iyad Muhammad

Committee Members

Abu-Shaban, Yusuf Nabil
al-Halis, Ala Mustafa

University

Islamic University

Faculty

Faculty of Information Technology

Department

Information Technology

University Country

Palestine (Gaza Strip)

Degree

Master

Degree Date

2016

English Abstract

Due to the huge amount of data published on the Web, searching the Web has become more difficult, and it is sometimes hard to get the expected results, especially in exploratory search, when users are unfamiliar with the search domain.

Many approaches have been proposed to support exploratory search on the Web by using different knowledge sources such as DBpedia and Linked Open Data (LOD).

However, these knowledge sources have limited support for Arabic content, and thus they can hardly be used with queries expressed in Arabic.

In this research, we propose a fully automated approach that runs at query time to enrich search results for Arabic-language queries by exploiting the Wikipedia link structure.

It aims to use the Arabic version of Wikipedia to extract complementary knowledge that is relevant to the search query submitted by the user.

We propose ArabXplore, a system that extracts key entities from search snippets and Wikipedia pages and ranks them with a new ranking algorithm derived from the traditional PageRank algorithm.

Finally, a graph is built to visually represent highly ranked topics and their relations to the end user.
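For readers unfamiliar with the baseline, the conventional PageRank that the ranking step builds on can be sketched as power iteration over a link graph. This is not ArabXplore's own ranking algorithm, whose details the abstract does not give; the function name `pagerank`, the damping factor of 0.85, the iteration count, and the toy graph are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Conventional PageRank by power iteration (illustrative sketch).

    links maps each node to the list of nodes it links to. The node
    names and parameter defaults are assumptions, not values from the thesis.
    """
    # Collect every node that appears as a source or as a link target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every node receives the same "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph between four entity pages.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_links)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

In this toy graph, node C collects the most incoming links and ends up ranked highest, while D, which nothing links to, receives only the random-jump share.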

Our proposed system was evaluated on a dataset of 100 Arabic search queries covering different domains, and the results were rated by a human expert.

The underlying ranking algorithm was also compared with the conventional PageRank.

Results showed that our ranking algorithm outperformed the PageRank algorithm.

Our ranking algorithm achieved 87.7 nDCG and 68.2 MAP while the conventional PageRank achieved 84.5 nDCG and 50.3 MAP.
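As a reminder of what the two reported metrics measure, the sketch below computes nDCG and MAP from relevance judgments. It assumes the common log2-discounted form of DCG and an average precision normalized by the number of relevant items in the ranked list; the abstract does not state which variants the thesis used, and the scores above appear to be reported on a 0 to 100 scale. The function names and sample judgments are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def average_precision(relevant_flags):
    """Mean of precision values taken at each relevant position."""
    hits = 0
    total = 0.0
    for pos, flag in enumerate(relevant_flags, start=1):
        if flag:
            hits += 1
            total += hits / pos
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Average precision averaged over all queries."""
    return sum(average_precision(run) for run in runs) / len(runs)


# Hypothetical expert ratings (0-3) for one query's ranked topics.
print(round(ndcg([3, 1, 2, 0]), 3))
# Hypothetical binary relevance judgments for two queries.
print(round(mean_average_precision([[True, False, True], [False, True]]), 3))
```

nDCG rewards placing highly rated topics near the top of the list, while MAP only considers whether each topic is relevant, which is one reason the two metrics can differ noticeably for the same ranking.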

The source code, test dataset, and complete experimental results are available online at: https://github.com/aabed91/ArabXplore

Main Subjects

Information Technology and Computer Science

No. of Pages

77

Table of Contents

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Background and related works.

Chapter Three : Methodology.

Chapter Four : Results and discussion.

Chapter Five : Conclusions.

References.

American Psychological Association (APA)

Abid, Ahmad Muhammad Abd al-Aziz. (2016). Exploiting Wikipedia to support exploratory Arabic search on the web (Master's thesis). Islamic University, Palestine (Gaza Strip).
https://search.emarefa.net/detail/BIM-727250

Modern Language Association (MLA)

Abid, Ahmad Muhammad Abd al-Aziz. Exploiting Wikipedia to support exploratory Arabic search on the web. Master's thesis. Islamic University, 2016.
https://search.emarefa.net/detail/BIM-727250

American Medical Association (AMA)

Abid, Ahmad Muhammad Abd al-Aziz. (2016). Exploiting Wikipedia to support exploratory Arabic search on the web (Master's thesis). Islamic University, Palestine (Gaza Strip).
https://search.emarefa.net/detail/BIM-727250

Language

English

Data Type

Arab Theses

Record ID

BIM-727250