Speech-based CALL system to evaluate the meaning and grammar errors in English spoken utterance

Dissertant

Atiq, Muhammad

Thesis advisor

Hanani, Abu al-Suud

University

Birzeit University

Faculty

Faculty of Engineering and Technology

Department

Department of Computer Science

University Country

Palestine (West Bank)

Degree

Master

Degree Date

2019

English Abstract

This research develops a CALL (Computer-Assisted Language Learning) system that evaluates spoken English sentences for grammar and meaning errors.

The user is given a prompt written in their native language, and the spoken English response is recorded as an audio file.

The spoken response is converted to text using a baseline English DNN-HMM ASR system and two commercial ASR services (Google and Microsoft Bing).

The produced transcription is assessed in terms of language and meaning errors.

Grammatical errors are detected using an English grammar checker, part-of-speech analysis, and incorrect bi-grams extracted from grammatically incorrect responses.
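The bi-gram step can be illustrated with a small sketch. It assumes whitespace tokenization and treats a bi-gram as "incorrect" when it occurs in rejected responses but in no accepted response. That criterion and the function names are assumptions made for illustration, not the procedure stated in the thesis.

```python
def bigrams(sentence):
    """Return the set of adjacent word pairs in a lowercased sentence."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def incorrect_bigrams(rejected, accepted):
    """Bi-grams seen in rejected responses but in no accepted response (assumed criterion)."""
    good = set()
    for sentence in accepted:
        good |= bigrams(sentence)
    bad = set()
    for sentence in rejected:
        bad |= bigrams(sentence)
    return bad - good


def has_incorrect_bigram(response, bad):
    """True if the response contains at least one known incorrect bi-gram."""
    return bool(bigrams(response) & bad)
```

A response would then be flagged when `has_incorrect_bigram` returns `True` for the set learned from training data.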

Meaning errors are detected using novel approaches that measure the similarity between the given response and a stored set of reference responses.
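As a rough illustration of scoring a response against reference responses, the sketch below uses cosine similarity over word counts. This is a simple stand-in chosen for clarity; the abstract does not specify the similarity measures, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_similarity(response, references):
    """Highest similarity between the response and any stored reference response."""
    r = Counter(response.lower().split())
    return max(
        (cosine(r, Counter(ref.lower().split())) for ref in references),
        default=0.0,
    )
```

A meaning error could then be signalled when `best_similarity` falls below a chosen threshold.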

The training and test datasets of the Spoken CALL Shared Task 2017 and 2018 were used in all experiments presented in this thesis.

We propose three main approaches to build this CALL system.

The first approach is rule-based: it makes the final decision about a response (accept or reject) by passing the ASR transcription through a sequence of pipelined stages and rules.

Each rule checks if the response has a language error or not.

If a rule cannot detect any errors, it passes the response to the next rule, and so on.
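The pipelined control flow described here can be sketched as follows. The two example rules are hypothetical placeholders, not rules from the thesis; only the pass-to-next-rule structure reflects the description.

```python
def rule_pipeline(response, rules):
    """Run rules in order; reject at the first rule that detects an error.

    Each rule is a (name, check) pair where check(response) returns True
    when it detects an error. If no rule fires, the response is accepted.
    """
    for name, check in rules:
        if check(response):
            return "reject", name
    return "accept", None


# Hypothetical rules, for illustration only.
rules = [
    ("empty", lambda r: not r.strip()),
    ("too_short", lambda r: len(r.split()) < 2),
]
```

For example, `rule_pipeline("i want tea", rules)` passes both rules and is accepted, while an empty transcription is rejected by the first rule.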

In the second approach, a genetic algorithm is combined with the first approach to tune the parameters and thresholds used in each rule.
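A minimal sketch of threshold tuning with a genetic algorithm is given below. It assumes thresholds lie in [0, 1] and that a caller-supplied `fitness` function scores a threshold vector (in this setting it would plausibly be the D-score on training data). The selection, crossover, and mutation choices are generic assumptions, not the configuration used in the thesis.

```python
import random


def tune_thresholds(fitness, n_params, pop_size=20, generations=40,
                    mutation_sd=0.1, seed=0):
    """Search for a threshold vector in [0, 1]^n_params that maximizes fitness."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitism).
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, mutation_sd)))
                     for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

Because the best individuals are carried over unchanged, the best fitness in the population never decreases from one generation to the next.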

The third approach is a machine learning model which predicts the final decision, accept or reject.

Different types of features were extracted from the response and used in these approaches.

The Universal Sentence Encoder was used to encode each sentence into a 512-dimensional vector representing the semantic features of the response.

We also propose a binary embedding approach that produces a vector of 438 binary features from the response.
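The abstract does not say how the 438 binary features are defined. One common way to obtain a fixed-length binary vector is a presence/absence encoding over a fixed vocabulary, sketched below. This is an assumption for illustration only and may differ from the proposed binary embedding.

```python
def binary_vector(response, vocabulary):
    """Encode a response as 0/1 flags, one per vocabulary word (assumed scheme)."""
    tokens = set(response.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]
```

With a 438-word vocabulary, this encoding would yield a 438-dimensional binary vector per response.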

To assess grammatical errors, a set of features was extracted from the text response using the grammar checker tool and part-of-speech analysis.

Finally, the two best DNN models were fused to enhance system performance.

D-score was used as a performance metric in all of our experiments.
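The abstract does not define the D-score. In the Spoken CALL Shared Task it is, to the best of our understanding, the ratio of the rejection rate on incorrect responses to the rejection rate on correct responses, with gross false accepts (meaning errors that were accepted) weighted more heavily than plain false accepts. The sketch below follows that reading; the formula and the weight `k = 3` should be treated as assumptions and checked against the task description.

```python
def d_score(cr, ca, fr, pfa, gfa, k=3.0):
    """D-score from decision counts (assumed shared-task definition).

    cr:  correct rejects        ca:  correct accepts
    fr:  false rejects          pfa: plain false accepts
    gfa: gross false accepts    k:   weight on gross false accepts
    """
    fa = pfa + k * gfa                      # weighted false accepts
    reject_rate_incorrect = cr / (cr + fa)  # share of incorrect responses rejected
    reject_rate_correct = fr / (fr + ca)    # share of correct responses rejected
    return reject_rate_incorrect / reject_rate_correct
```

Under this reading a higher value is better: the system rejects many incorrect responses while rejecting few correct ones. The function does not guard against zero denominators, for example when there are no false rejects.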

The D-scores of our three proposed systems are 6.5, 14.4, and 13.87, respectively.

Compared with the results of similar systems (Spoken CALL Shared Task 2018) published at Interspeech 2018, our second and third systems outperform them.

Main Subjects

Languages & Comparative Literature
Information Technology and Computer Science

No. of Pages

69

Table of Contents

Table of contents.

Abstract.

Chapter One : Introduction.

Chapter Two : Overview of the CALL shared task.

Chapter Three : Optimization techniques background.

Chapter Four : Automatic speech recognition (ASR) background.

Chapter Five : Introduction to machine learning methods.

Chapter Six : Methodology.

Chapter Seven : Experiments and results.

Chapter Eight : Conclusion and future work.

References.

American Psychological Association (APA)

Atiq, Muhammad. (2019). Speech-based CALL system to evaluate the meaning and grammar errors in English spoken utterance. (Master's thesis). Birzeit University, Palestine (West Bank)
https://search.emarefa.net/detail/BIM-977613

Modern Language Association (MLA)

Atiq, Muhammad. Speech-based CALL system to evaluate the meaning and grammar errors in English spoken utterance. (Master's thesis). Birzeit University. (2019).
https://search.emarefa.net/detail/BIM-977613

American Medical Association (AMA)

Atiq, Muhammad. (2019). Speech-based CALL system to evaluate the meaning and grammar errors in English spoken utterance. (Master's thesis). Birzeit University, Palestine (West Bank)
https://search.emarefa.net/detail/BIM-977613

Language

English

Data Type

Arab Theses

Record ID

BIM-977613