DataGlue

The DATA GLUE Artificial Intelligence and Machine Learning (AI/ML) practice group recently developed a fraud detection solution using a set of machine learning (ML) models. Starting from a simple data set of credit card transactions, the team trained a model to recognize fraud patterns. The model is self-learning, which lets it adapt to new, unknown fraud patterns in similar but distinct data sets. The team deployed the solution on AWS cloud-native serverless services, which scale easily and cut costs by charging only for the time used. By implementing cost optimization mechanisms, the team reduced the costs associated with fraud detection by more than 15 percent.

Fraud is an ongoing problem for financial institutions. It can cost businesses billions of dollars annually and damage customer trust. Some 15.4 million consumers were victims of identity theft or fraud last year, according to a new report from Javelin Strategy & Research. That’s up 16 percent and the highest figure recorded since the firm began tracking fraud instances in 2004.

Many companies use a rule-based approach to detect fraudulent activity, in which fraud patterns are defined as rules. But implementing and maintaining rules can be a complex, time-consuming process: fraud is constantly evolving, rules require fraud patterns to be known in advance, and rules can lead to false positives or false negatives. The team wanted to develop a machine learning solution that did not depend on a rule set that must be maintained and constantly updated.

To generate the machine learning model for fraud detection, the team used a Linear Learner algorithm, which fits a linear model to the features in the data and produces a predicted score for each transaction. The score is then compared against a threshold to determine whether the transaction is fraudulent or not. The score depends only on factors available in the data. The team used a set of credit card transactions with features such as the price of the item purchased, the time of purchase, the recipient of funds, the frequency of transactions, and other data points commonly found in such sets. No additional information was generated or incorporated.
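The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the team's model: the feature names, weights, bias, and threshold are hypothetical values chosen for the example, whereas a trained model would learn its weights from labeled transactions.

```python
import math

# Hypothetical weights and bias, for illustration only.
# A real linear model learns these from labeled transactions.
WEIGHTS = {
    "amount": 0.004,                # price of the item purchased
    "hour_of_day": -0.05,           # time of purchase (0-23)
    "transactions_last_hour": 0.6,  # frequency of transactions
}
BIAS = -4.0
THRESHOLD = 0.5  # assumed cut-off; in practice tuned on validation data


def fraud_score(tx):
    """Return a score between 0 and 1 from a weighted sum of the features."""
    z = BIAS + sum(WEIGHTS[name] * tx[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function maps z into (0, 1)


def is_fraud(tx, threshold=THRESHOLD):
    """Flag the transaction when its score reaches the threshold."""
    return fraud_score(tx) >= threshold


typical = {"amount": 40, "hour_of_day": 14, "transactions_last_hour": 1}
suspicious = {"amount": 900, "hour_of_day": 3, "transactions_last_hour": 6}

print(is_fraud(typical), is_fraud(suspicious))
```

Raising the threshold trades false positives for false negatives, which is the same trade-off that hand-written rules struggle with, except that here it is controlled by a single number.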

Fully Managed - No Need to Administer Servers

The fraud detection model was first trained on the data set with the assistance of a human labeler. Once the model was able to label the fraud cases without human intervention, it was trained further on the same data set. Once training and tuning finished, the model was validated against a new set of data it had not previously seen. The new set contained features similar to those of the training set, with some variations. The model identified fraud cases with a high degree of certainty, and the variation in data features did not affect its performance.

The team used Amazon SageMaker to train the model on a larger instance than the one hosting the model. By separating the training environment from the hosting environment, the team decreased the costs associated with the model. The team also optimized training by using a larger instance for a shorter period of time, which generated further cost savings. Once deployed, the model is hosted in a scalable environment that is billed only for the time used, which in turn generates ongoing savings.
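The cost argument is simple arithmetic: an instance costs its hourly rate multiplied by the hours it runs. The sketch below compares training on a large instance and hosting on a small one against keeping the large instance for both. All rates and durations are made-up figures for illustration, not actual AWS prices or the team's numbers.

```python
def instance_cost_cents(rate_cents_per_hour, hours):
    """Cost of running one instance: hourly rate times hours used."""
    return rate_cents_per_hour * hours


# Hypothetical figures, for illustration only (not real AWS pricing).
LARGE_RATE = 100    # cents per hour for a large training instance
SMALL_RATE = 10     # cents per hour for a small hosting instance
TRAINING_HOURS = 2  # short training run on the large instance
HOSTING_HOURS = 720 # roughly one month of hosting

# Separated: train briefly on the large instance, host on the small one.
separated = (instance_cost_cents(LARGE_RATE, TRAINING_HOURS)
             + instance_cost_cents(SMALL_RATE, HOSTING_HOURS))

# Combined: keep the large instance running for both training and hosting.
combined = instance_cost_cents(LARGE_RATE, TRAINING_HOURS + HOSTING_HOURS)

print(f"separated: ${separated / 100:.2f}, combined: ${combined / 100:.2f}")
```

With these assumed figures the large instance is paid for only during the two training hours, so nearly all of the monthly cost comes from the cheaper hosting instance.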

It pays to catch fraud

The DATA GLUE team has shown how machine learning models can detect fraud and continue to produce highly accurate detections on previously unseen data sets. Using cloud-native serverless applications, the team decreased the costs associated with training and deploying a machine learning model.

Carlo Rostant
