How To - Find a Counterfactual

The Datapoint Editor supports a variety of datapoint-level analyses and visualizes the individual datapoints in the loaded dataset. One of these analyses is finding counterfactuals for a selected datapoint. In the What-If Tool, a counterfactual is the most similar datapoint that has a different classification (for classification models) or whose prediction differs by more than a specified threshold (for regression models).
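The definition above can be sketched in a few lines of Python for the classification case. This is a minimal illustration, not the What-If Tool's own implementation: the function names, the toy data, and the restriction to numeric features are all assumptions made for the example.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_counterfactual(selected, points, labels, distance=l1_distance):
    """Return the index of the closest point whose label differs from the
    selected point's label, or None if no such point exists.

    Hypothetical helper for illustration only.
    """
    best_index, best_distance = None, float("inf")
    for i, (point, label) in enumerate(zip(points, labels)):
        # Skip the selected point itself and any point with the same class.
        if i == selected or label == labels[selected]:
            continue
        d = distance(points[selected], point)
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index


# Toy data (assumed): two numeric features per datapoint and a predicted class.
points = [[1.0, 1.0], [1.5, 1.0], [4.0, 4.0], [2.0, 2.0]]
labels = ["low", "low", "high", "high"]

counterfactual_index = nearest_counterfactual(0, points, labels)
print(counterfactual_index)
```

Here the datapoint at index 1 is closest to the selection overall, but it shares the same class, so the search skips it and settles on the nearest datapoint with a different class.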

Select a datapoint of interest in the custom Datapoints visualization by clicking on it. A list of all features and values associated with that datapoint will appear in the Edit module.
In the Visualize module, turn on the counterfactual toggle by clicking on it:
a. In the custom Datapoints visualization, the nearest counterfactual datapoint will be highlighted.
b. In the Edit module, a list of feature values associated with the counterfactual will appear alongside the selected datapoint. Feature values that are different from the selected datapoint are displayed in green.
c. In the Infer module, the prediction values associated with the counterfactual are displayed alongside the selected datapoint.

Above: Find different counterfactuals to a selected data point by model and distance.

Change the similarity metric by selecting from the options provided:
a. You can select between L1 Norm distance and L2 Norm distance between datapoints. More information on how these distances are calculated will be included in a follow-on tutorial.
b. When using the What-If Tool in notebook mode, you can provide a custom distance metric to calculate the distance between datapoints. In that case, it will be used instead of the L1/L2 Norm to find the closest counterfactual.
If you are comparing multiple models, use the dropdown menu to select which model's predictions are used to find counterfactuals.
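To see why the choice of metric matters, the sketch below compares the standard L1 and L2 norms on two candidate datapoints and then swaps in a custom metric. It is only an illustration under stated assumptions: the features are purely numeric, `closest` and `weighted_l1` are hypothetical names, and the example does not show how a custom metric is registered with the tool in notebook mode or how the tool scales features internally.

```python
import math


def l1_distance(a, b):
    """L1 norm: sum of absolute per-feature differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """L2 norm: square root of the sum of squared per-feature differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_l1(a, b, weights=(1.0, 0.1)):
    """Hypothetical custom metric: L1 distance with per-feature weights."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def closest(selected, candidates, distance):
    """Index of the candidate with the smallest distance to the selection."""
    return min(range(len(candidates)),
               key=lambda i: distance(selected, candidates[i]))


# Toy data (assumed): both candidates already have a different classification.
selected = [0.0, 0.0]
candidates = [[3.0, 0.0], [2.0, 2.0]]

by_l1 = closest(selected, candidates, l1_distance)      # distances 3.0 vs 4.0
by_l2 = closest(selected, candidates, l2_distance)      # distances 3.0 vs ~2.83
by_custom = closest(selected, candidates, weighted_l1)  # distances 3.0 vs 2.2
print(by_l1, by_l2, by_custom)
```

The same selection yields a different nearest counterfactual depending on the metric, which is why it is worth trying more than one option before drawing conclusions.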

Above: Using the similarity modal to create a new similarity feature and use it in the Datapoints visualization.

For regression models, change the threshold value using the counterfactual threshold slider. By default, the threshold for finding a counterfactual datapoint is set to the standard deviation of the prediction scores.
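The regression case can be sketched the same way: a datapoint only qualifies as a counterfactual if its prediction differs from the selected datapoint's prediction by more than the threshold. The code below is an illustration, not the tool's implementation. The function name and toy data are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) for the default is an assumption, since the text above does not say which variant is used.

```python
import statistics


def l1_distance(a, b):
    """Sum of absolute differences between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_regression_counterfactual(selected, points, predictions,
                                      threshold=None):
    """Return the index of the closest point whose prediction differs from
    the selected point's prediction by more than `threshold`, or None.

    Hypothetical helper for illustration only.
    """
    if threshold is None:
        # Assumed default: population standard deviation of all predictions.
        threshold = statistics.pstdev(predictions)
    best_index, best_distance = None, float("inf")
    for i, point in enumerate(points):
        if i == selected:
            continue
        if abs(predictions[i] - predictions[selected]) <= threshold:
            continue  # Prediction is too similar to count as a counterfactual.
        d = l1_distance(points[selected], point)
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index


# Toy data (assumed): two numeric features and a predicted value per datapoint.
points = [[1.0, 1.0], [1.2, 1.0], [3.0, 2.0], [2.0, 1.5]]
predictions = [30.0, 31.0, 45.0, 52.0]

default_result = nearest_regression_counterfactual(0, points, predictions)
loose_result = nearest_regression_counterfactual(0, points, predictions,
                                                 threshold=0.5)
print(default_result, loose_result)
```

Lowering the threshold lets closer datapoints qualify, so the nearest counterfactual moves from index 3 to index 1 in this toy data.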

Make a selected datapoint the center of your visualization

You can evaluate how similar all datapoints are to a given selection by creating a similarity feature. Click the “Create similarity feature” button to open a window. There you can rename the feature, choose which distance type to use, and apply it directly to the Datapoints visualization. This feature is particularly useful when you want to find clusters of datapoints that are near a datapoint of interest.
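Conceptually, a similarity feature is a new column that holds each datapoint's distance to the selected datapoint. The sketch below shows that idea under stated assumptions: datapoints are plain dictionaries of numeric features, and `add_similarity_feature`, its parameters, and the toy data are hypothetical names chosen for the example rather than anything provided by the tool.

```python
import math


def add_similarity_feature(datapoints, selected_index, feature_names,
                           name="similarity", metric="L1"):
    """Return copies of the datapoints with an extra feature holding the
    distance from each datapoint to the selected one.

    Hypothetical helper for illustration only.
    """
    selected = datapoints[selected_index]
    result = []
    for point in datapoints:
        diffs = [abs(point[f] - selected[f]) for f in feature_names]
        if metric == "L1":
            distance = sum(diffs)
        elif metric == "L2":
            distance = math.sqrt(sum(d * d for d in diffs))
        else:
            raise ValueError("metric must be 'L1' or 'L2'")
        # Copy the datapoint so the input list is left unchanged.
        result.append({**point, name: distance})
    return result


# Toy data (assumed): two numeric features per datapoint.
datapoints = [
    {"age": 30, "hours": 40},
    {"age": 32, "hours": 40},
    {"age": 60, "hours": 20},
]

with_similarity = add_similarity_feature(datapoints, 0, ["age", "hours"])
# Sorting by the new feature lists the datapoints nearest the selection first.
nearest_first = sorted(with_similarity, key=lambda p: p["similarity"])
print([p["similarity"] for p in with_similarity])
```

Once the distance exists as a feature, it can be used like any other feature, for example as an axis or a color, which is what makes nearby clusters easy to spot.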

time to read

7 minutes

use with

Classification models
Multi-class models
Regression models

before you begin

N/A

related demos

Binary Classification Model: UCI Census Income Prediction

Multi-class Classification Model: Flowers Species Identification

Regression Model: UCI Census Age Prediction

takeaways

Learn to find a counterfactual for a datapoint.

Configure metrics and models used when calculating counterfactuals.

what-if questions

What needs to be different in a datapoint to be classified differently?

What differences between two data points cause models to behave differently?

Which two data points are most similar but have different classifications?