Information-Balance-Aware Approximated Summarization of Data Provenance

Joint Authors

Pei, Jisheng
Ye, Xiaojun

Source

Scientific Programming

Issue

Vol. 2017, Issue 2017 (31 Dec. 2017), pp.1-11, 11 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2017-09-12

Country of Publication

Egypt

No. of Pages

11

Main Subjects

Mathematics

Abstract EN

Extracting useful knowledge from data provenance information has been challenging because provenance information is often too voluminous for users to understand.

Recently, it has been proposed that we may summarize data provenance items by grouping semantically related provenance annotations so as to achieve concise provenance representation.

Users may provide their intended use of the provenance data in terms of provisioning, and the quality of provenance summarization could be optimized for smaller size and a smaller distance between the provisioning results derived from the summarization and those from the original provenance.

However, apart from the intended provisioning use, we notice that more dedicated and diverse user requirements can be expressed and considered in the summarization process by assigning importance weights to provenance elements.

Moreover, we introduce the information balance index (IBI), an entropy-based measure, to dynamically evaluate the amount of information retained by the summary and check how well it suits user requirements.

An alternative provenance summarization algorithm that supports manipulation of information balance is presented.

Case studies and experiments show that, in the summarization process, information balance can be effectively steered towards user-defined goals, and requirement-driven variants of the provenance summarization can be achieved to support a series of interesting scenarios.

American Psychological Association (APA)

Pei, Jisheng & Ye, Xiaojun. 2017. Information-Balance-Aware Approximated Summarization of Data Provenance. Scientific Programming, Vol. 2017, no. 2017, pp.1-11.
https://search.emarefa.net/detail/BIM-1203411

Modern Language Association (MLA)

Pei, Jisheng & Ye, Xiaojun. Information-Balance-Aware Approximated Summarization of Data Provenance. Scientific Programming No. 2017 (2017), pp.1-11.
https://search.emarefa.net/detail/BIM-1203411

American Medical Association (AMA)

Pei, Jisheng & Ye, Xiaojun. Information-Balance-Aware Approximated Summarization of Data Provenance. Scientific Programming. 2017. Vol. 2017, no. 2017, pp.1-11.
https://search.emarefa.net/detail/BIM-1203411

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references

Record ID

BIM-1203411