P2.T9.903. Machine Learning in Risk Management (van Liebergen)

Nicole Seaman

Chief Admin Officer
Staff member
Learning objectives: Describe the process of machine learning and compare machine learning approaches. Describe the application of machine learning approaches within the financial services sector and the types of problems to which they can be applied. Analyze the application of machine learning in three use cases: credit risk and revenue modeling; fraud; and surveillance of conduct and market abuse in trading.


903.1. Peter is an analyst who is using Microsoft Azure to conduct drag-and-drop (that is, without coding!) machine learning analytics on his company's dataset of consumer loans. The dataset includes the response variable (aka, dependent variable) in a column that indicates the historical performance of each consumer loan as either "defaulted" or "repaid in full." Peter wants to use a training set to predict whether future loans will default, and he expects the relationship to be non-linear. Which of the following machine learning approaches is probably best?

a. Any unsupervised approach
b. Either ridge or LASSO non-penalized regression
c. Either principal components, or K-means or X-means clustering
d. Either decision trees, support vector machines or deep learning

903.2. Barbara has developed a model to detect fraudulent transactions at her bank. Her primary dataset consists of a table that contains millions of rows (aka, observations), one per customer transaction, and several dozen columns; each column is already a "feature" (aka, attribute, parameter) in the model. Her goal is to increase the predictive power of the model. The model does perform well when applied to the historical database (aka, in-sample), but she is greatly concerned about overfitting. Each of the following techniques is a possible mitigant (or remedy) EXCEPT which is unlikely to help cure her overfitting problem?

a. Bootstrap aggregation; aka, bagging
b. Build a random forest or ensemble of tree-based models
c. Increase the number of features; i.e., add parameters to the model
d. Boosting; i.e., overweight scarcer observations in the training dataset
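
For readers who want to see the in-sample versus out-of-sample distinction that 903.2 relies on, here is a minimal sketch in plain Python. It is not taken from van Liebergen's paper. The data are synthetic, the two features are hypothetical stand-ins for transaction attributes, and the 1-nearest-neighbor classifier is chosen only because it memorizes its training data, which makes the symptom of overfitting easy to see.

```python
import random

def make_transactions(n, rng, noise=0.2):
    """Build n synthetic rows of ((feature_1, feature_2), label).

    Assumption for illustration only: the true label is 1 ("fraud") when
    feature_1 > 0.5, and a fraction `noise` of labels is flipped at random.
    """
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        label = 1 if x[0] > 0.5 else 0
        if rng.random() < noise:
            label = 1 - label  # label noise that a flexible model can memorize
        rows.append((x, label))
    return rows

def predict_1nn(train, x):
    """Return the label of the training row closest to x (squared Euclidean distance)."""
    nearest = min(
        train,
        key=lambda row: (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2,
    )
    return nearest[1]

def accuracy(train, data):
    """Share of rows in `data` whose label the 1-NN model predicts correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)               # fixed seed so the run is repeatable
train = make_transactions(300, rng)  # the "historical database"
holdout = make_transactions(300, rng)  # transactions the model has never seen

in_sample = accuracy(train, train)
out_of_sample = accuracy(train, holdout)
print(f"in-sample accuracy:     {in_sample:.2f}")
print(f"out-of-sample accuracy: {out_of_sample:.2f}")
```

The model scores perfectly on the rows it was fitted to because each row is its own nearest neighbor, but it does worse on the holdout rows because it has memorized the noisy labels. That gap between in-sample and out-of-sample performance is the overfitting that worries Barbara.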

903.3. Bart van Liebergen writes that "Financial institutions (FIs) are looking to more powerful analytical approaches in order to manage and mine increasing amounts of regulatory reporting data and unstructured data, for purposes of compliance and risk management (applying machine learning as RegTech) or in order to compete effectively with other FIs and FinTechs." He explains that machine learning approaches are well-positioned to deliver this analytical power due to their natural ability to cope with extremely large datasets while offering a high granularity and depth of predictive analysis. He presents three use cases: Credit risk and revenue modeling; Fraud; and Surveillance of conduct and market abuse in trading.

With regard to these three use cases, each of the following statements is true (according to Bart van Liebergen) EXCEPT which is inaccurate?

a. Clustering is an unsupervised learning method that is applicable to anti-money laundering and counter terrorism financing (AML/CTF)
b. Machine learning has been more successful in credit card fraud than anti-money laundering and counter terrorism financing (AML/CTF)
c. To facilitate the surveillance of conduct breaches by traders, supervised learning approaches are difficult to apply because there is often no labeled training data
d. Widespread adoption of machine learning is limited by two practical constraints: regulations require supervised (i.e., national supervisor) learning; and machine learning's black box character implies that applications in the financial sector are not context-dependent

Answers here: