An approach for instance based schema matching with google similarity and regular expression

Joint Authors

Ibrahim, Hamidah
Mahdi, Usamah
Afandi, Lili

Source

The International Arab Journal of Information Technology

Issue

Vol. 14, Issue 5 (30 Sep. 2017), 10 p.

Publisher

Zarqa University

Publication Date

2017-09-30

Country of Publication

Jordan

No. of Pages

10

Main Subjects

Information Technology and Computer Science

Abstract EN

Instance based schema matching is the process of comparing instances from heterogeneous data sources to determine the correspondences between schema attributes.

It is an alternative when schema information is unavailable, or is available but of little use for matching.

Different strategies have been used by various instance based schema matching approaches for discovering correspondences between schema attributes.

These strategies include neural networks, machine learning, information theoretic discrepancy, and rule based methods.

Most of these approaches treat all instances, including those with numeric values, as strings, which prevents them from discovering common patterns or performing statistical computations on the numeric instances.

As a consequence, matches go unidentified, especially for numeric instances.
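
To illustrate the point about numeric instances, the following is a minimal sketch of how regular expressions can expose a shared pattern between two columns whose values never coincide as strings. The pattern catalogue, the function names, and the sample values are illustrative assumptions, not the patterns or data used in the paper.

```python
import re

# Hypothetical pattern catalogue; names and expressions are assumptions
# made for illustration only.
PATTERNS = {
    "integer": re.compile(r"[+-]?\d+"),
    "decimal": re.compile(r"[+-]?\d+\.\d+"),
    "iso_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "phone": re.compile(r"\d{3}-\d{3}-\d{4}"),
}


def pattern_profile(instances):
    """Return the fraction of instances that fully match each pattern."""
    total = len(instances)
    profile = {}
    for name, regex in PATTERNS.items():
        hits = sum(1 for value in instances if regex.fullmatch(value.strip()))
        profile[name] = hits / total if total else 0.0
    return profile


def best_pattern(instances):
    """Return the pattern matched by the most instances, or None if none match."""
    profile = pattern_profile(instances)
    name, score = max(profile.items(), key=lambda item: item[1])
    return name if score > 0 else None


# Two attributes from different sources with no string values in common.
prices = ["19.99", "5.00", "120.50"]
costs = ["7.25", "42.10", "3.99"]
phones = ["555-123-4567", "555-987-6543"]

print(best_pattern(prices), best_pattern(costs), best_pattern(phones))
```

Under these assumptions, `prices` and `costs` share the same dominant pattern even though a plain string comparison finds no overlap, while `phones` falls under a different pattern.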

In this paper, we propose an approach that addresses the above limitation of the previous approaches.

Because we rely solely on the instances of the schemas for this task, we use strategies that combine the strengths of Google as a source of web semantics and regular expressions for pattern recognition.
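
The abstract does not state which Google-based measure is used. One widely known candidate is the Normalized Google Distance (NGD), which compares two terms through their search hit counts. The sketch below assumes that measure; the function name and the hit counts are hypothetical, and no real search API is called.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Compute NGD from hit counts.

    fx, fy: number of pages containing each term on its own
    fxy:    number of pages containing both terms
    n:      assumed total number of indexed pages
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # The terms never occur, or never co-occur: treat as maximally distant.
        return float("inf")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


# Hypothetical hit counts, chosen only to make the arithmetic easy to follow.
same_usage = normalized_google_distance(1000, 1000, 1000, 10**9)
partial_overlap = normalized_google_distance(1000, 100, 10, 10**6)
print(same_usage, partial_overlap)
```

A distance of 0 means the two terms always appear together; larger values indicate weaker semantic association. How such a distance is turned into a match decision is left open here.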

The results show that our approach finds 1-1 schema matches with high accuracy, in the range of 93%-99% in terms of Precision (P), Recall (R), and F-measure (F).

Furthermore, the results show that our proposed approach outperforms the previous approaches even though it uses only a sample of instances, rather than the full set of instances considered in previous works.
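
For readers unfamiliar with the three measures, the following sketch shows how P, R, and F are conventionally computed for a set of 1-1 attribute matches. The attribute pairs are invented for illustration and are not taken from the paper's experiments.

```python
def precision_recall_f(found, expected):
    """Return (precision, recall, F-measure) for found vs. expected match pairs."""
    found, expected = set(found), set(expected)
    true_positives = len(found & expected)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical reference matches and matcher output.
expected = {("price", "cost"), ("tel", "phone"), ("dob", "birth_date"), ("zip", "postcode")}
found = {("price", "cost"), ("tel", "phone"), ("dob", "birth_date"), ("zip", "city")}

p, r, f = precision_recall_f(found, expected)
print(p, r, f)
```

Here three of the four reported matches are correct and three of the four reference matches are recovered, so all three measures come out equal.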

American Psychological Association (APA)

Mahdi, Usamah & Ibrahim, Hamidah & Afandi, Lili. 2017. An approach for instance based schema matching with google similarity and regular expression. The International Arab Journal of Information Technology, Vol. 14, no. 5.
https://search.emarefa.net/detail/BIM-852236

Modern Language Association (MLA)

Mahdi, Usamah…[et al.]. An approach for instance based schema matching with google similarity and regular expression. The International Arab Journal of Information Technology Vol. 14, no. 5 (Sep. 2017).
https://search.emarefa.net/detail/BIM-852236

American Medical Association (AMA)

Mahdi, Usamah & Ibrahim, Hamidah & Afandi, Lili. An approach for instance based schema matching with google similarity and regular expression. The International Arab Journal of Information Technology. 2017. Vol. 14, no. 5.
https://search.emarefa.net/detail/BIM-852236

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-852236