Counterfactual Fairness

A fairness metric that checks whether a classification model produces the same result for one individual as it does for another individual who is identical to the first, except with respect to one or more sensitive attributes. Evaluating a classification model for counterfactual fairness is one method for surfacing potential sources of bias in a model.

See either of the following for more information:

Fairness: Counterfactual fairness in Machine Learning Crash Course.

When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness

Real-world uses

Illustrative examples created for this library:

1. A bank's model risk team evaluates counterfactual fairness by checking whether a credit decision changes if only the applicant's race is changed in the model's inputs.
2. A hiring-tech vendor tests counterfactual fairness on its resume ranker by altering candidate names typically associated with protected attributes.
3. A health-tech startup audits its triage model for counterfactual fairness across simulated patient profiles that differ only in demographic attributes.
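
The kind of check described in the first example can be sketched in a few lines of Python. This is a minimal illustration, not a production audit. The model (`toy_credit_model`), the field names (`income`, `debt`, `group`), and the helper `counterfactual_flips` are all hypothetical and invented for this sketch. It also changes only the sensitive attribute itself and leaves every other input untouched, so it does not account for features that may be causally influenced by that attribute.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def toy_credit_model(applicant: Record) -> int:
    """Hypothetical stand-in classifier: 1 = approve, 0 = deny."""
    score = applicant["income"] / 1000 - applicant["debt"] / 500
    # Deliberately biased term so the check below has something to find.
    if applicant["group"] == "B":
        score -= 20
    return 1 if score >= 30 else 0


def counterfactual_flips(
    model: Callable[[Record], int],
    records: List[Record],
    attribute: str,
    values: List[Any],
) -> List[Tuple[Record, Any]]:
    """Return (record, new_value) pairs where changing only `attribute`
    to `new_value` changes the model's prediction."""
    flagged = []
    for record in records:
        original = model(record)
        for value in values:
            if value == record[attribute]:
                continue
            # Copy the record and alter only the sensitive attribute.
            counterfactual = {**record, attribute: value}
            if model(counterfactual) != original:
                flagged.append((record, value))
                break
    return flagged


# Made-up applicants, used only to exercise the check.
applicants = [
    {"income": 80000, "debt": 5000, "group": "A"},
    {"income": 45000, "debt": 2500, "group": "A"},
    {"income": 20000, "debt": 1000, "group": "B"},
]

flagged = counterfactual_flips(toy_credit_model, applicants, "group", ["A", "B"])
flip_rate = len(flagged) / len(applicants)

for record, new_value in flagged:
    print(f"Prediction changes when group becomes {new_value}: {record}")
print(f"Share of applicants with a changed decision: {flip_rate:.2f}")
```

Any flagged record is an individual for whom the decision depends on the sensitive attribute alone, which is the kind of potential bias this metric is meant to surface. A flip rate of zero on a sample does not by itself show the model is fair, since the result depends on which records and attribute values are tested.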
