Use Case Financial

Unsupervised fraud detection

Financial institutions often face an overwhelming number of cases that require manual review to determine potential fraudulent activity. Traditionally, the identification of possibly fraudulent clients relies on basic business rules, which can be heavily influenced by human factors. Our project addresses this challenge by reorganizing transactional and sociodemographic customer data to model behavior using unsupervised algorithms. The approach involves two main phases: clustering followed by anomaly detection.

Challenges

This project focuses on improving fraud detection by aggregating customer data, predicting fraudulent cases based on business criteria, and creating a focused list of the most anomalous transactions across different categories like international transfers, income, and expenses.

Customer data aggregation

Aggregate all the information the bank holds about its customers.

Fraud detection modeling

Model and predict fraudulent cases validated by business criteria.

Identification of anomalous transactions

Build a shortlist of the most anomalous cases along several transactional axes (international transfers, income, expenses…).

Solution

We generate multiple indicators for each customer, covering both transactional data (e.g., total income, transfers to risk areas) and sociodemographic data (e.g., number of accounts or products, age, sector…).
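
As an illustration of this aggregation step, here is a minimal sketch in Python with pandas. The table layouts, column names (`customer_id`, `amount`, `to_risk_area`, `age`, `n_accounts`) and the `build_indicators` helper are hypothetical, chosen only to show how per-customer indicators could be derived from a transaction log and joined to sociodemographic data. It is not the project's actual feature set.

```python
import pandas as pd

# Hypothetical transaction log: one row per transaction.
# Positive amounts are income, negative amounts are expenses (an assumption).
transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "amount": [1200.0, -300.0, -150.0, 5000.0, -4200.0, 800.0],
    "to_risk_area": [False, True, False, False, True, False],
})

# Hypothetical sociodemographic table: one row per customer.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "age": [34, 51, 27],
    "n_accounts": [2, 4, 1],
})


def build_indicators(transactions, customers):
    """Return one row per customer with transactional and sociodemographic indicators."""
    tx = transactions.assign(
        income=transactions["amount"].clip(lower=0),
        expense=(-transactions["amount"]).clip(lower=0),
    )
    # Outgoing amount counts as a risk transfer only when flagged.
    tx["risk_transfer"] = tx["expense"].where(tx["to_risk_area"], 0.0)

    per_customer = tx.groupby("customer_id").agg(
        total_income=("income", "sum"),
        total_expense=("expense", "sum"),
        transfers_to_risk_areas=("risk_transfer", "sum"),
        n_transactions=("amount", "size"),
    ).reset_index()

    # Left join keeps customers with no transactions; their indicators become 0.
    return customers.merge(per_customer, on="customer_id", how="left").fillna(0)


indicators = build_indicators(transactions, customers)
print(indicators)
```

Keeping one row per customer makes the output directly usable as the input matrix for the clustering and anomaly detection phases.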

These indicators are log-transformed and scaled for normalization. Unsupervised clustering models then segment the customers, with custom metrics used to check cluster stability and to determine the optimal number of clusters.
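
The sketch below shows one way this phase could look with scikit-learn. The project's custom metrics are not described here, so the example swaps in two standard stand-ins: the silhouette score to pick the number of clusters, and the mean adjusted Rand index between k-means runs with different seeds as a rough stability check. The choice of k-means, the helper names and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def preprocess(X):
    """Log-transform non-negative indicators, then standardize each column."""
    return StandardScaler().fit_transform(np.log1p(X))


def seed_stability(X, k, seeds=(0, 1, 2)):
    """Mean pairwise adjusted Rand index between runs with different seeds."""
    labelings = [
        KMeans(n_clusters=k, n_init=10, random_state=s).fit_predict(X)
        for s in seeds
    ]
    scores = [
        adjusted_rand_score(a, b)
        for i, a in enumerate(labelings)
        for b in labelings[i + 1:]
    ]
    return float(np.mean(scores))


def choose_k(X, candidates=range(2, 7)):
    """Return the k with the best silhouette, plus (silhouette, stability) per k."""
    results = {}
    for k in candidates:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        results[k] = (silhouette_score(X, labels), seed_stability(X, k))
    best_k = max(results, key=lambda k: results[k][0])
    return best_k, results


# Synthetic, skewed, positive indicators drawn around three centres in log space.
rng = np.random.default_rng(0)
centers = np.array([[3.0, 3.0], [6.0, 9.0], [9.0, 4.0]])
raw = np.vstack([np.exp(rng.normal(c, 0.2, size=(100, 2))) for c in centers])

X = preprocess(raw)
best_k, results = choose_k(X)
print(best_k, results[best_k])
```

The log transform matters because monetary indicators are typically heavy-tailed: without it, a handful of very large values would dominate the distance computations that clustering relies on.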

Anomaly detection models are then applied to identify outliers within each segment. The results are validated with business feedback, and SHAP values computed on synthetic data are used to ensure model transparency.
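
A minimal sketch of per-segment anomaly detection follows, assuming an Isolation Forest as the detector. The text does not say which anomaly detection models the project uses, so this choice, the helper names and the synthetic data with two planted outliers are illustrative assumptions. The SHAP explanation step is not shown.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def anomaly_scores_by_segment(X, labels, random_state=0):
    """Fit one IsolationForest per segment; a higher score means more anomalous."""
    scores = np.empty(len(X))
    for segment in np.unique(labels):
        mask = labels == segment
        forest = IsolationForest(n_estimators=200, random_state=random_state)
        forest.fit(X[mask])
        # score_samples is higher for normal points, so negate it.
        scores[mask] = -forest.score_samples(X[mask])
    return scores


def top_anomalies(scores, n=10):
    """Indices of the n most anomalous rows, most anomalous first."""
    return np.argsort(scores)[::-1][:n]


# Two synthetic segments with different typical behavior.
rng = np.random.default_rng(42)
seg0 = rng.normal(0.0, 1.0, size=(200, 3))
seg1 = rng.normal(10.0, 1.0, size=(200, 3))
X = np.vstack([seg0, seg1])
labels = np.repeat([0, 1], 200)

# Plant one outlier per segment, far from its own segment's centre.
X[0] = [8.0, -8.0, 8.0]
X[200] = [18.0, 2.0, 18.0]

scores = anomaly_scores_by_segment(X, labels)
shortlist = top_anomalies(scores, n=2)
print(shortlist)
```

Scoring within each segment means a customer is compared only with peers that behave similarly, so a value that is ordinary in one segment can still be flagged as unusual in another.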

Results

This approach offers a comprehensive view of customer behavior, enabling a deeper understanding of interactions and trends across various touchpoints. It introduces a new statistical methodology specifically designed for fraud detection, allowing for more accurate identification of suspicious activities based on data patterns. Additionally, it presents an innovative way to detect misclassified clients, helping to refine customer categorization and improve targeted strategies. Together, these advancements contribute to more effective risk management and better customer segmentation.
