ConstanzeArnold
Fraud detection is an essential problem in the banking industry. Fraud causes financial losses and can do massive harm to the reputation of financial institutions, which makes it a prevalent and influential research area. The goal is to train a classifier that separates transactions into two classes: fraudulent and regular. Fraudulent transactions are rare events, which leads to highly imbalanced data, and training a classifier on an imbalanced data set raises issues that remain unsolved. Given a data set of transactions, we suggest splitting the classification process into several parts: the training data set is clustered, and a separate sub-classifier is trained on each cluster. We chose XGBoost as the transaction classifier. At test time, the decision for a given point is made by the sub-classifier whose training-set cluster center is closest to that point. In our case, the F1 score is the appropriate classification criterion because it is the harmonic mean of precision and recall. For the experimental evaluation of the suggested strategy, we use the credit card transaction database (https://data.world/ealtman/synthetic-credit-card-transactions) representing actual transactions of credit card users living in the United States. The experiments show a significant increase in the F1 score compared with the case without clustering.
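The routing rule and the F1 criterion described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration only: the two toy features (amount, hour), the hand-made clusters, and the simple threshold rules that stand in for the XGBoost sub-classifiers. The abstract does not name the clustering algorithm, so the clusters are fixed by hand rather than computed.

```python
import math


def f1_score(y_true, y_pred):
    """F1 for the positive (fraud) class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def route(point, centers):
    """Index of the cluster center nearest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def predict(point, centers, sub_classifiers):
    """Classify a point with the sub-classifier of its nearest cluster."""
    return sub_classifiers[route(point, centers)](point)


# Hypothetical training clusters over two made-up features (amount, hour).
clusters = [
    [(10.0, 9.0), (12.0, 10.0), (11.0, 11.0)],
    [(500.0, 2.0), (520.0, 3.0), (480.0, 1.0)],
]
centers = [centroid(c) for c in clusters]

# Stand-ins for per-cluster models (assumption: the talk uses XGBoost here).
sub_classifiers = [
    lambda x: int(x[0] > 11.5),   # toy rule for cluster 0
    lambda x: int(x[0] > 510.0),  # toy rule for cluster 1
]

# Hypothetical test points with made-up labels (1 = fraudulent, 0 = regular).
test_points = [(11.8, 10.0), (9.0, 9.5), (515.0, 2.0), (490.0, 2.5)]
y_true = [1, 0, 1, 1]

y_pred = [predict(p, centers, sub_classifiers) for p in test_points]
score = f1_score(y_true, y_pred)
print(y_pred, round(score, 3))
```

In a real experiment, each toy rule would be replaced by a model trained on the transactions of its own cluster, and the F1 score of the routed predictions would be compared against a single classifier trained on the whole data set.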
1 Comment
November 3, 2022 01:00:44 AM UTC
It is an interesting presentation..so insightful