The role of machine learning in health insurance industry : a case study

Dissertant

Rashmawi, Ruba

Thesis advisor

Abu Hasan, Hasan

University

Birzeit University

Faculty

Faculty of Engineering and Technology

Department

Department of Computer Science

University Country

Palestine (West Bank)

Degree

Master

Degree Date

2022

Arabic Abstract (English translation)

Insurance companies around the world are developing products that use large amounts of data to assess, identify, and predict risk, and are exploring how systems based on artificial intelligence and machine learning algorithms can be applied to improve customer satisfaction, reduce operational costs, and increase profitability. Risk prediction and fraud detection are among the most challenging topics in the insurance field.

The first goal of the study is to use predictive analytics to predict the risk level of health insurance subscribers using machine learning algorithms.

The study included about 10 thousand subscribers classified into 3 risk levels (high risk, medium risk, low risk).

The main objective is to predict the risk level accurately and thereby help the insurance company offer new customers premium rates commensurate with their risk.

Several classification algorithms were analyzed and compared using the R statistical programming language.

The Random Forest algorithm achieved the highest accuracy, about 94%, with a sensitivity of about 72% for the high-risk level and 81% for the medium-risk level.

The second goal of the study is to detect outliers and anomalies in medical claims using machine learning algorithms, to support fraud detection and help insurance specialists investigate and uncover medical claims suspected of fraudulent activity.

Outlier detection algorithms help uncover abnormal behavior and patterns in medical claims with the aim of discovering cases of fraud. Three techniques were compared: clustering (PAM), LOF, and IF.

A set of outliers with abnormal behavior or patterns was detected; some had very high costs compared with the groups they belong to, and others had a very short interval between medical visits, which may be suspicious behavior.

Claims audits consume a great deal of time and effort and are very costly for insurance companies, so using machine learning algorithms to flag claims suspected of fraud for expert review will save the company large amounts in operational costs, because attention will focus on the most suspicious claims instead of on reviewing and auditing large volumes of medical claims.

English Abstract

Insurance companies worldwide are exploring how machine learning (ML) can improve customer satisfaction, reduce operational costs and increase profitability.

The greatest opportunity lies in risk prediction and fraud detection, which are also the two most challenging topics in the insurance domain.

The first goal of the study is to predict the risk level of health insurance subscribers using machine learning classification algorithms.

The study included around 10 thousand subscribers, who were classified into 3 risk levels (High-risk, Mid-risk, and Low-risk).

The main objective is to predict the risk level accurately and thereby assist the company in setting an accurate premium rate for new customers.

Several supervised classification algorithms were analyzed and compared using the R statistical programming language. The Random Forest algorithm had the highest accuracy, around 94%, with a sensitivity of around 72% for the High-risk level and 81% for the Mid-risk level.
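
Overall accuracy and per-class sensitivity measure different things, which is why a single model can report both a high accuracy and a lower sensitivity for one risk level. The following is a minimal Python sketch of how the two metrics are computed from actual and predicted labels. It is not the thesis code (the study used R), and the function name and the ten toy labels are invented purely for illustration.

```python
from collections import Counter


def accuracy_and_sensitivity(actual, predicted):
    """Return overall accuracy and per-class sensitivity (recall).

    Sensitivity of a class = correctly predicted members of that class
    divided by all actual members of that class.
    """
    totals = Counter(actual)
    correct = Counter(a for a, p in zip(actual, predicted) if a == p)
    accuracy = sum(correct.values()) / len(actual)
    sensitivity = {label: correct[label] / totals[label] for label in totals}
    return accuracy, sensitivity


# Toy labels (hypothetical, not data from the study).
actual = ["High"] * 4 + ["Mid"] * 2 + ["Low"] * 4
predicted = ["High", "High", "High", "Mid",   # one High subscriber missed
             "Mid", "Low",                    # one Mid subscriber missed
             "Low", "Low", "Low", "Low"]      # all Low subscribers found

accuracy, sensitivity = accuracy_and_sensitivity(actual, predicted)
print(accuracy)      # 0.8
print(sensitivity)   # {'High': 0.75, 'Mid': 0.5, 'Low': 1.0}
```

In this toy case every misclassified subscriber belongs to the High or Mid class, so those classes have lower sensitivity than the overall accuracy suggests.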

The second goal of the study is to detect outliers in medical claims, to help predict which claims are suspected of fraudulent activity and to assist insurance specialists in investigating and uncovering cases of fraud.

Unsupervised outlier detection algorithms help uncover abnormal behavior patterns in medical claims and, through them, possible cases of fraud.

Three outlier detection techniques were compared: the first uses a clustering algorithm, Partitioning Around Medoids (PAM); the second uses the density-based Local Outlier Factor (LOF); and the third uses Isolation Forests (IF).
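
To make the density-based idea concrete, here is a minimal brute-force Python sketch of the standard Local Outlier Factor definition. It is an illustration only, not the thesis implementation (the study used R): the function name, the neighborhood size `k`, and the five toy points are assumptions, and the sketch assumes the data contains no more than `k` identical points, since that would give a zero reachability sum.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor of every point (brute force, Euclidean distance).

    Scores near 1 mean a point is about as dense as its neighbors;
    scores well above 1 mean it lies in a much sparser region.
    """
    n = len(points)
    k_distance = []   # distance from each point to its k-th nearest neighbor
    neighbors = []    # (distance, index) pairs within the k-distance
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        kd = dists[k - 1][0]
        k_distance.append(kd)
        # Ties at the k-distance are kept, as in the original definition.
        neighbors.append([(d, j) for d, j in dists if d <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbors divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Toy data (hypothetical): four points in a tight square and one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
scores = lof_scores(points, k=2)
print(scores)  # the last point receives by far the largest score
```

A cutoff on the score then decides which records are passed on for review; the choice of cutoff and of `k` is data dependent and is not specified in this record.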

Outliers with abnormal behavior were detected: some had very high costs compared to the clusters or neighbors they belong to, and others had a very short time interval between clinical visits, which may indicate suspicious behavior.
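
The visit-interval signal can be illustrated with a simple rule-based check. The Python sketch below is not the outlier detection method used in the study; it only shows how the interval between consecutive visits of one subscriber could be derived and compared with a threshold. The function name, the `min_days` threshold, and the sample records are hypothetical.

```python
from collections import defaultdict
from datetime import date


def short_visit_gaps(visits, min_days):
    """Flag consecutive visits of the same subscriber closer than min_days.

    visits: iterable of (subscriber_id, visit_date) pairs.
    Returns a list of (subscriber_id, earlier, later, gap_in_days).
    """
    by_subscriber = defaultdict(list)
    for subscriber_id, visit_date in visits:
        by_subscriber[subscriber_id].append(visit_date)

    flagged = []
    for subscriber_id, dates in by_subscriber.items():
        dates.sort()
        # Compare each visit with the next one in chronological order.
        for earlier, later in zip(dates, dates[1:]):
            gap = (later - earlier).days
            if gap < min_days:
                flagged.append((subscriber_id, earlier, later, gap))
    return flagged


# Toy records (hypothetical, not data from the study).
visits = [
    ("A", date(2021, 1, 1)),
    ("A", date(2021, 1, 2)),
    ("A", date(2021, 3, 1)),
    ("B", date(2021, 1, 1)),
    ("B", date(2021, 2, 1)),
]
print(short_visit_gaps(visits, min_days=3))
```

Only the two visits of subscriber "A" that are one day apart are flagged here. A fixed threshold like this is a crude stand-in: the techniques named in the abstract judge each claim relative to its cluster or neighbors instead.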

Claims audits consume a lot of time and are very costly for insurance companies, so having machine learning algorithms detect the suspicious claims that require review will save the company substantial operational costs and increase work efficiency.

Main Subjects

Information Technology and Computer Science

No. of Pages

Table of contents.

Abstract.

Abstract in Arabic.

Chapter One : Introduction.

Chapter Two : Literature review.

Chapter Three : Methodology.

Chapter Four : Results and discussion.

Chapter Five : Conclusions and recommendations.

References.

American Psychological Association (APA)

Rashmawi, Ruba. (2022). The role of machine learning in health insurance industry : a case study (Master's thesis). Birzeit University, Palestine (West Bank).
https://search.emarefa.net/detail/BIM-1429254

Modern Language Association (MLA)

Rashmawi, Ruba. The role of machine learning in health insurance industry : a case study. Master's thesis. Birzeit University, 2022.
https://search.emarefa.net/detail/BIM-1429254

Language

English

Data Type

Arab Theses

Record ID

BIM-1429254
