What techniques can we use to understand the often opaque "computer says no" neural networks produced using ML?
For example, if a bank uses ML to decide whether to approve a loan, a rejected customer may want to know the main factor behind the rejection.
Is it a case of taking the customer's data and re-running the trained model on several small variations of it, to see whether the loan would have been approved?
For example, asking the model to score the same loan but pretending that the customer has a higher salary, or that the customer's postcode is different?
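Roughly, I imagine something like the sketch below, where `score_loan` is just a stand-in for whatever trained model the bank actually uses, and the feature names and values are made up:

```python
# Stand-in for the bank's trained model: in practice this would be the
# model's predict_proba (or equivalent), not a hand-written formula.
def score_loan(applicant: dict) -> float:
    score = 0.00002 * applicant["salary"] - 0.00003 * applicant["existing_debt"]
    return min(max(score, 0.0), 1.0)  # clamp to an "approval probability"

applicant = {"salary": 28_000, "existing_debt": 15_000, "postcode": "AB1 2CD"}
baseline = score_loan(applicant)

# Re-score the same applicant, changing one attribute at a time.
variations = {
    "salary +10k": {**applicant, "salary": applicant["salary"] + 10_000},
    "half the debt": {**applicant, "existing_debt": applicant["existing_debt"] / 2},
    "different postcode": {**applicant, "postcode": "ZZ9 9ZZ"},
}
for label, variant in variations.items():
    print(f"{label}: approval probability {baseline:.2f} -> {score_loan(variant):.2f}")
```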
What you're referring to here is model interpretability, or rather the lack of it. Deep neural networks are notorious for this, which is one reason institutions that deal directly with customers (e.g. banks and insurers) like to stick to simple models such as linear regression. If the inputs are properly normalized, the weights of a linear model can be read as feature importances. The method you describe (changing the input features and analysing how the output changes) is another approach: applied to a single customer it is essentially a what-if analysis, and averaged over the whole dataset it is known as a partial dependence plot.
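For instance, here is a minimal sketch of the linear-model route, using synthetic loan data and hypothetical feature names (logistic regression rather than plain linear regression, since approve/reject is a binary decision):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic loan data, purely for illustration.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "salary": rng.normal(30_000, 8_000, 1_000),
    "existing_debt": rng.normal(10_000, 5_000, 1_000),
    "years_at_address": rng.integers(0, 20, 1_000),
})
# Made-up approval rule: salary helps, debt hurts.
y = (X["salary"] - 1.5 * X["existing_debt"] + rng.normal(0, 5_000, 1_000) > 15_000).astype(int)

# Standardize the inputs so the learned weights are comparable across features.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

coefs = pd.Series(pipe[-1].coef_[0], index=X.columns).sort_values(key=abs, ascending=False)
print(coefs)  # larger magnitude => stronger influence on the standardized scale
```

For the perturbation-based version, scikit-learn's `sklearn.inspection.partial_dependence` and `PartialDependenceDisplay` automate the perturb-and-average analysis for any fitted estimator.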
In this book I pay special attention to model interpretability where it's needed. For example, in Chapter 06 I cover Grad-CAM, a useful interpretation technique for image classification models.
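Purely as a rough illustration of the idea (not the code from the chapter), Grad-CAM can be sketched in PyTorch with a forward hook on the last convolutional block; the model, target layer, and input below are placeholders:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Placeholder model; in practice you would load pretrained weights
# (e.g. resnet18(weights="IMAGENET1K_V1")) and a real preprocessed image.
model = resnet18(weights=None).eval()

# Capture the activations of the last convolutional block (layer4 in ResNet-18).
activations = {}
def save_activation(module, inputs, output):
    activations["value"] = output
model.layer4.register_forward_hook(save_activation)

x = torch.randn(1, 3, 224, 224)            # stand-in for a preprocessed image
scores = model(x)                           # (1, num_classes)
class_idx = scores.argmax(dim=1).item()     # class to explain

# Gradient of the chosen class score w.r.t. the captured feature maps.
grads = torch.autograd.grad(scores[0, class_idx], activations["value"])[0]

# Grad-CAM: weight each feature map by its average gradient, sum, then ReLU.
weights = grads.mean(dim=(2, 3), keepdim=True)                      # (1, C, 1, 1)
cam = F.relu((weights * activations["value"]).sum(dim=1)).detach()  # (1, h, w)
cam = F.interpolate(cam.unsqueeze(1), size=x.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)            # heatmap in [0, 1]
```

The resulting heatmap can be overlaid on the input image to show which regions most influenced the predicted class.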