“Why Should I Trust You?”
Explaining the Predictions of Any Classifier

Link to Paper

What did the authors try to accomplish?

The authors introduce a new way to interpret models using a technique they call LIME. They have devised both a local explanation technique and a global explanation technique. The key part of their technique is that their approach is model agnostic and that it will work with any kind of model and various types of data - tabular, text, and image.

What were the key elements of the approach?

Their approach takes random samples around the specific data point to explain, then uses a weighted regression (weighted by how close the samples are to the original data point to explain) and then uses the coefficients from the regression to show the impact of each feature on the data. This can be seen in the visual, from the paper, below: LIME Image from paper

In this example, we are classifying data points as either being red X’s or blue circles. The original black box classifier found a non-linear decision boundary based on the red and blue shading. LIME then takes random samples around the point to explain (the bolded red X), used the black box model to make predictions, and then fits a linear model to separate the data. The downside of this approach is the results it returns can only be applied to that specific sample. If the model says a specific feature has a big influence on predicting the outcome, that may only apply to that specific instance and does not mean that is an important feature globally.

The authors also describe SP-LIME, which is used for coming up with global explanations. SP-LIME works by taking LIME and applying it to multiple data points with the idea that you take enough data points to thoroughly explain the entire model, without taking too long. Then looking at the results of all different points lime describes to get a sense of the global feature importance of each feature.

What can you use yourself?

LIME already has a great python package, I have previously used the package in tutorials but would like to incorporate it in more use cases. I have never previously used SP-LIME and would like to try that out as well.

What other references do you want to follow?

The same authors came up with a follow-up to LIME called Anchors. Anchors addresses some of the limitations of LIME (namely that the results do not generalize to the entire model).