A hybrid technique for annotating book tables

Joint Authors

Ahmad, Nasir
Latif, Asima
Khusro, Shah
Ullah, Irfan

Source

The International Arab Journal of Information Technology

Issue

Vol. 15, Issue 4 (31 Jul. 2018), 7 p.

Publisher

Zarqa University

Publication Date

2018-07-31

Country of Publication

Jordan

No. of Pages

7

Main Subjects

Information Technology and Computer Science

Abstract EN

Table extraction is usually complemented by table annotation to find the hidden semantics in a particular document or book.

These hidden semantics are determined by identifying a type for each column, finding the relationships between the columns, if any, and identifying the entities in each cell.

Although such approaches have been used for small documents and web pages, they have not been extended to table extraction and annotation in books.

This paper focuses on detecting, locating and annotating entities in book tables.

More specifically, it contributes algorithms for identifying and locating tables in books and for annotating the table entities using the online knowledge source DBpedia Spotlight.

Entities missing from DBpedia Spotlight are then annotated using Google Snippets.

It was found that the combined results give higher accuracy and superior performance compared with using DBpedia alone.

The approach complements existing table annotation approaches, as it enables the discovery and annotation of entities that are not present in the catalogue.

We tested our scheme on Computer Science books and obtained promising results in terms of accuracy and performance.
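
The two-stage annotation described in the abstract (a primary knowledge source first, then a second source for entities the first one misses) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function `annotate_cells`, the `Annotator` type, and the toy lookup tables are all hypothetical, and the stand-in annotators do not call DBpedia Spotlight or Google.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical annotator type: maps a cell string to an entity label,
# or returns None when the source has no entry for that cell.
Annotator = Callable[[str], Optional[str]]


def annotate_cells(
    cells: List[str], primary: Annotator, fallback: Annotator
) -> Dict[str, Optional[str]]:
    """Annotate each cell with the primary source, falling back when it misses."""
    annotations: Dict[str, Optional[str]] = {}
    for cell in cells:
        label = primary(cell)
        if label is None:
            # Only cells the primary source cannot annotate reach the fallback.
            label = fallback(cell)
        annotations[cell] = label
    return annotations


# Toy stand-ins for the two sources (invented data, for illustration only).
TOY_PRIMARY = {"Java": "ProgrammingLanguage"}
TOY_FALLBACK = {"Quicksort": "Algorithm"}


def toy_primary(cell: str) -> Optional[str]:
    return TOY_PRIMARY.get(cell)


def toy_fallback(cell: str) -> Optional[str]:
    return TOY_FALLBACK.get(cell)


result = annotate_cells(["Java", "Quicksort", "xyzzy"], toy_primary, toy_fallback)
```

In this sketch, a cell found by neither source keeps the label `None`, so unannotated cells remain visible for later inspection rather than being silently dropped.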

American Psychological Association (APA)

Latif, Asima & Khusro, Shah & Ullah, Irfan & Ahmad, Nasir. 2018. A hybrid technique for annotating book tables. The International Arab Journal of Information Technology, Vol. 15, no. 4.
https://search.emarefa.net/detail/BIM-839058

Modern Language Association (MLA)

Latif, Asima…[et al.]. A hybrid technique for annotating book tables. The International Arab Journal of Information Technology Vol. 15, no. 4 (Jul. 2018).
https://search.emarefa.net/detail/BIM-839058

American Medical Association (AMA)

Latif, Asima & Khusro, Shah & Ullah, Irfan & Ahmad, Nasir. A hybrid technique for annotating book tables. The International Arab Journal of Information Technology. 2018. Vol. 15, no. 4.
https://search.emarefa.net/detail/BIM-839058

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-839058