Table of Contents

1 Executive Summary

In this part 2 we build a binary acceptance model: do we accept the customer or not? First without protected features, then with protected features (such as age, marital status, gender, etc.).

2 Introduction

The car insurance data provides a wealth of information and points to discuss. What is fair? What is ethical? We will explore questions like:

  • Is it fair to use gender?
  • No? Well, can we use age then? That is a good proxy for driving experience, maturity, etc., and it indeed has a high information value.
  • Also not? Well, let’s just look at the past track record then (did the person have an accident in the last 5 years?).
  • But wait, isn’t that somehow discriminatory against people who have driven for less than 5 years, because only those with at least 5 years behind the wheel could build up such a track record?
  • That leaves us wealth-related parameters such as income, the value of the car, etc. They would do the same job … but isn’t that discriminatory against poor people?
  • So, we are left with things related to car use (private or commercial use, daily average travel time).
  • This model won’t be that strong. The insurance company now needs to increase prices … which will impact the poor …

The dilemma is this: when we believe that there is a protected feature \(S\) (say income), then there are at least two groups for which using the feature results in a more desirable or a less desirable outcome. Leaving out the feature will make the model weaker and increase prices for all … hitting hardest the very group that we set out to protect.

Workflow

The workflow is inspired by (De Brouwer 2020) and was prepared by Philippe De Brouwer to illustrate the concepts explained in the book.

We will approach the problem in a few phases:

  1. Prepare the data: previous file
  2. Build a naive model (this file)
  3. Build a classification model with CLAIM_FLAG as dependent variable (this variable is 0 if no claim was filed)
  4. Now we need to find a price for the insurance policy. We can use a simple average or build a model. We will build a regression model to determine the fee that the customer should pay, based on CLM_AMT (the amount claimed) for those customers that are expected to have an accident
  5. Decide on Protected Features
  6. Remove protected features and proxy features
  7. Repeat both models
  8. Draw conclusions

3 Loading the Data

We need to load the data as prepared in the previous file.

setwd("/home/philippe/Documents/teaching/courses/ethicsAI/3-bias_data/car_insurance")

# import functions:

# Read in the functions:
source("ethics_functions.R")

# List the functions defined in this file:
tmt.env <- new.env()
sys.source("ethics_functions.R", envir = tmt.env)
utils::lsf.str(envir=tmt.env)
## dollars_to_numeric : function (input)  
## expand_factor2bin : function (data, col)  
## fAUC : function (model, data, ...)  
## get_best_cutoff : function (fpr, tpr, cutoff, cost.fp = 1)  
## make_dependency_hist_plot : function (data, x, y, title, method = "loess", q_ymax = 1, q_xmax = 1)  
## make_dependency_plot : function (data, x, y, title, method = "loess", q_ymax = 1, q_xmax = 1)  
## make_WOE_table : function (df, y)  
## opt_cut_off : function (perf, pred, cost.fp = 1)  
## pct_table : function (x, y, round_digits = 2)  
## space_to_underscore : function (input)  
## str_clean : function (x)
# Remove the temporary environment:
rm(tmt.env)


# Read in the data:
d_fact <- readRDS('./data/d_fact.R')

# We replace CLAIM_FLAG with a variable that is 1 when the outcome is positive
d_fact <- d_fact                         %>%
         mutate(isGood = 1 - as.numeric(paste(CLAIM_FLAG))) %>%
         select(-CLAIM_FLAG)
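The paste() inside mutate() above is deliberate: applying as.numeric() directly to a factor returns its internal level codes, not its labels. A quick base-R illustration:

```r
# as.numeric() on a factor yields the level codes (1, 2, ...), not the labels;
# paste() first converts the factor to its character labels.
f <- factor(c("0", "1", "1"))
as.numeric(f)         # 1 2 2  (level codes)
as.numeric(paste(f))  # 0 1 1  (labels, as intended)
```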

4 Naive Model Without Protected Features

4.1 Test Model

First try all variables

m0 <- glm(isGood ~ ., d_fact, family = 'binomial')
summary(m0)
## 
## Call:
## glm(formula = isGood ~ ., family = "binomial", data = d_fact)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.3400  -0.6272   0.3692   0.7039   2.4279  
## 
## Coefficients: (5 not defined because of singularities)
##                                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                     -0.395711   0.313416  -1.263 0.206743    
## KIDSDRIV                        -0.363335   0.278051  -1.307 0.191308    
## HOMEKIDS1                       -0.140183   0.108144  -1.296 0.194886    
## HOMEKIDS>=2                     -0.038919   0.094897  -0.410 0.681719    
## PARENT1No                        0.232937   0.112631   2.068 0.038626 *  
## MSTATUSNo                       -0.547889   0.080533  -6.803 1.02e-11 ***
## GENDERF                          0.118503   0.262555   0.451 0.651740    
## EDUCATIONz_High_School          -0.003395   0.088557  -0.038 0.969422    
## EDUCATIONBachelors               0.426812   0.108692   3.927 8.61e-05 ***
## EDUCATIONMasters                 0.477329   0.169691   2.813 0.004909 ** 
## EDUCATIONPhD                     0.331784   0.185936   1.784 0.074358 .  
## OCCUPATIONjunior                -0.220583   0.120845  -1.825 0.067951 .  
## OCCUPATIONLawyer                 0.004206   0.179401   0.023 0.981295    
## OCCUPATIONProfessional          -0.073459   0.144009  -0.510 0.609983    
## OCCUPATIONsenior                 0.718145   0.152230   4.717 2.39e-06 ***
## OCCUPATIONz_Blue_Collar         -0.218401   0.145881  -1.497 0.134364    
## TRAVTIME25--40                  -0.306511   0.068309  -4.487 7.22e-06 ***
## TRAVTIME>40                     -0.621968   0.069355  -8.968  < 2e-16 ***
## CAR_USEPrivate                   0.727758   0.081388   8.942  < 2e-16 ***
## BLUEBOOK10--15K                  0.191409   0.075113   2.548 0.010826 *  
## BLUEBOOK15--20K                  0.230293   0.088199   2.611 0.009026 ** 
## BLUEBOOK>20K                     0.403673   0.094565   4.269 1.97e-05 ***
## TIF2--5                          0.254580   0.074264   3.428 0.000608 ***
## TIF6--8                          0.440816   0.072632   6.069 1.29e-09 ***
## TIF>8                            0.635556   0.076650   8.292  < 2e-16 ***
## CAR_TYPEPickup                  -0.488389   0.093274  -5.236 1.64e-07 ***
## CAR_TYPEPTruck_Van              -0.526232   0.113422  -4.640 3.49e-06 ***
## CAR_TYPESports_Car              -0.946210   0.115749  -8.175 2.97e-16 ***
## CAR_TYPEz_SUV                   -0.681845   0.100024  -6.817 9.31e-12 ***
## RED_CARyes                       0.062462   0.083032   0.752 0.451888    
## OLDCLAIM>0                      -0.552550   0.083395  -6.626 3.46e-11 ***
## CLM_FREQ1                        0.113506   0.095338   1.191 0.233825    
## CLM_FREQ2                        0.129216   0.092369   1.399 0.161839    
## CLM_FREQ>2                             NA         NA      NA       NA    
## REVOKEDYes                      -0.730897   0.074565  -9.802  < 2e-16 ***
## MVR_PTS1                        -0.160456   0.085488  -1.877 0.060525 .  
## MVR_PTS2                        -0.231751   0.090765  -2.553 0.010670 *  
## MVR_PTS3--4                     -0.303138   0.078661  -3.854 0.000116 ***
## MVR_PTS>4                       -0.486111   0.086909  -5.593 2.23e-08 ***
## URBANICITYz_Highly_Rural/ Rural  2.455352   0.104746  23.441  < 2e-16 ***
## AGE30--40                        0.484143   0.210297   2.302 0.021325 *  
## AGE40--55                        0.958097   0.206768   4.634 3.59e-06 ***
## AGE>55                           0.461342   0.230491   2.002 0.045332 *  
## INCOME25--55K                   -0.021786   0.092082  -0.237 0.812975    
## INCOME55--75K                   -0.088236   0.115748  -0.762 0.445876    
## INCOME>75K                       0.268401   0.123270   2.177 0.029455 *  
## YOJ>1                            0.427722   0.113624   3.764 0.000167 ***
## HOME_VAL100--200K                0.255601   0.080120   3.190 0.001422 ** 
## HOME_VAL200--300K                0.280117   0.092913   3.015 0.002571 ** 
## HOME_VAL>300K                    0.390850   0.139332   2.805 0.005029 ** 
## CAR_AGE2--7                      0.087812   0.078542   1.118 0.263559    
## CAR_AGE8--11                     0.056959   0.079792   0.714 0.475325    
## CAR_AGE12--15                    0.021582   0.106894   0.202 0.839993    
## CAR_AGE>=16                     -0.063907   0.134840  -0.474 0.635540    
## KIDSDRV1                        -0.334213   0.297440  -1.124 0.261169    
## KIDSDRV>=2                      -0.159423   0.628694  -0.254 0.799821    
## GENDER.AGEF.>55                 -0.304965   0.289877  -1.052 0.292776    
## GENDER.AGEF.30--40               0.209611   0.270226   0.776 0.437934    
## GENDER.AGEF.40--55              -0.058893   0.259330  -0.227 0.820348    
## GENDER.AGEM.<30                        NA         NA      NA       NA    
## GENDER.AGEM.>55                        NA         NA      NA       NA    
## GENDER.AGEM.30--40                     NA         NA      NA       NA    
## GENDER.AGEM.40--55                     NA         NA      NA       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 11181.6  on 9636  degrees of freedom
## Residual deviance:  8431.6  on 9579  degrees of freedom
## AIC: 8547.6
## 
## Number of Fisher Scoring iterations: 5

4.2 Fit the Model With only Relevant Variables

Leave out the least relevant ones (the coefficients without stars) and build a first usable model \(m_1\).

We will no longer use:

  • KIDSDRV
  • HOMEKIDS
  • GENDER
  • RED_CAR
  • CAR_AGE
  • CLM_FREQ (too strongly correlated with OLDCLAIM)
  • GENDER.AGE
frm <- formula(isGood ~ MSTATUS + EDUCATION + OCCUPATION + TRAVTIME +
                 CAR_USE + BLUEBOOK + TIF + CAR_TYPE + OLDCLAIM + 
                 REVOKED + MVR_PTS + URBANICITY + AGE  + INCOME + 
                 YOJ + HOME_VAL)

m1 <- glm(frm, d_fact, family = 'binomial')
summary(m1)
## 
## Call:
## glm(formula = frm, family = "binomial", data = d_fact)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.2743  -0.6660   0.3856   0.7128   2.4101  
## 
## Coefficients:
##                                  Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                     -0.107127   0.224487  -0.477 0.633214    
## MSTATUSNo                       -0.593818   0.065057  -9.128  < 2e-16 ***
## EDUCATIONz_High_School          -0.007295   0.086237  -0.085 0.932584    
## EDUCATIONBachelors               0.438288   0.099852   4.389 1.14e-05 ***
## EDUCATIONMasters                 0.435534   0.149740   2.909 0.003630 ** 
## EDUCATIONPhD                     0.347441   0.168960   2.056 0.039749 *  
## OCCUPATIONjunior                -0.244750   0.118566  -2.064 0.038994 *  
## OCCUPATIONLawyer                 0.013269   0.176246   0.075 0.939985    
## OCCUPATIONProfessional          -0.073681   0.141143  -0.522 0.601649    
## OCCUPATIONsenior                 0.698354   0.149224   4.680 2.87e-06 ***
## OCCUPATIONz_Blue_Collar         -0.239132   0.143192  -1.670 0.094916 .  
## TRAVTIME25--40                  -0.296752   0.067463  -4.399 1.09e-05 ***
## TRAVTIME>40                     -0.596040   0.068454  -8.707  < 2e-16 ***
## CAR_USEPrivate                   0.693804   0.080302   8.640  < 2e-16 ***
## BLUEBOOK10--15K                  0.194169   0.074122   2.620 0.008804 ** 
## BLUEBOOK15--20K                  0.232693   0.085412   2.724 0.006443 ** 
## BLUEBOOK>20K                     0.389974   0.086922   4.487 7.24e-06 ***
## TIF2--5                          0.252219   0.073316   3.440 0.000581 ***
## TIF6--8                          0.450104   0.071800   6.269 3.64e-10 ***
## TIF>8                            0.626743   0.075750   8.274  < 2e-16 ***
## CAR_TYPEPickup                  -0.484687   0.091825  -5.278 1.30e-07 ***
## CAR_TYPEPTruck_Van              -0.508646   0.106474  -4.777 1.78e-06 ***
## CAR_TYPESports_Car              -0.898498   0.095765  -9.382  < 2e-16 ***
## CAR_TYPEz_SUV                   -0.639593   0.076944  -8.312  < 2e-16 ***
## OLDCLAIM>0                      -0.486383   0.060328  -8.062 7.48e-16 ***
## REVOKEDYes                      -0.746171   0.073641 -10.133  < 2e-16 ***
## MVR_PTS1                        -0.178206   0.084428  -2.111 0.034795 *  
## MVR_PTS2                        -0.231409   0.089502  -2.586 0.009724 ** 
## MVR_PTS3--4                     -0.327117   0.077671  -4.212 2.54e-05 ***
## MVR_PTS>4                       -0.514532   0.085991  -5.984 2.18e-09 ***
## URBANICITYz_Highly_Rural/ Rural  2.361003   0.102937  22.936  < 2e-16 ***
## AGE30--40                        0.513866   0.132334   3.883 0.000103 ***
## AGE40--55                        0.891984   0.128534   6.940 3.93e-12 ***
## AGE>55                           0.322565   0.145418   2.218 0.026541 *  
## INCOME25--55K                   -0.016250   0.090983  -0.179 0.858248    
## INCOME55--75K                   -0.060066   0.114426  -0.525 0.599628    
## INCOME>75K                       0.261078   0.121693   2.145 0.031922 *  
## YOJ>1                            0.409499   0.112193   3.650 0.000262 ***
## HOME_VAL100--200K                0.248116   0.079048   3.139 0.001696 ** 
## HOME_VAL200--300K                0.269736   0.091436   2.950 0.003178 ** 
## HOME_VAL>300K                    0.357513   0.137360   2.603 0.009248 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 11181.6  on 9636  degrees of freedom
## Residual deviance:  8589.2  on 9596  degrees of freedom
## AIC: 8671.2
## 
## Number of Fisher Scoring iterations: 5
AUC1 <- fAUC(m1, d_fact, type="response")

The AUC of model \(m_1\) is: 0.817

# Build a prediction object from model m1 on the dataset d_fact:
pred1 <- prediction(predict(m1, type = "response"), d_fact$isGood)

# Visualize the ROC curve:
#plot(performance(pred, "tpr", "fpr"), col="blue", lwd = 3)
#abline(0, 1, lty = 2)

AUC1 <- attr(performance(pred1, "auc"), "y.values")[[1]]
#paste("AUC:", AUC)

perf1 <- performance(pred1, "tpr", "fpr")
ks <- max(attr(perf1,'y.values')[[1]] - attr(perf1,'x.values')[[1]])
#paste("KS:", ks)


#predScores <- modelr::add_predictions(df, m)$pred  # not correct?
predScores1 <- predict(m1, type = "response")

# Visualize the KS with the package InformationValue:
InformationValue::ks_plot(actuals = d_fact$isGood, predictedScores = predScores1)
The ROC (receiver operating characteristic) curve for our model

InformationValue::plotROC(actuals = d_fact$isGood, predictedScores = predScores1)
The lift of the model (bottom): the cumulative percentage of responders (ones) captured by the model

4.2.1 Optimal Cut-Off

# The optimal cutoff if the false positives cost 3 times more than the false negatives:
cutoff_vals <- opt_cut_off(perf1, pred1, cost.fp = 3)
cutoff_vals # print the values related to the cutoff
##                  [,1]
## sensitivity 0.5695683
## specificity 0.8748056
## cutoff      0.8301915
cutoff1 <- cutoff_vals[3,1]

# Binary predictions based on that cutoff:
bin_pred1 <- if_else(pred1@predictions[[1]] > cutoff1, 1, 0)

We accept around 45.09% of the customers. However, note that the data was re-balanced (rows of good customers were probably deleted at random), so it is impossible to say what percentage of customers we would actually reject. In any case, the true rejection rate should be much lower than this calculation seems to imply.
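The idea behind opt_cut_off() (defined in ethics_functions.R) can be sketched as a grid search over cut-off values that penalises false positives three times as heavily as false negatives. The code below is an illustration on synthetic scores, not the actual implementation of that helper:

```r
# Sketch: choose the cut-off that minimises a cost in which a false positive
# (accepting a bad customer) costs 3x a false negative (rejecting a good one).
set.seed(42)
score   <- runif(1000)                  # predicted probabilities of being good
actual  <- rbinom(1000, 1, score)      # synthetic outcomes linked to the score
cutoffs <- seq(0.01, 0.99, by = 0.01)
cost.fp <- 3
cost <- sapply(cutoffs, function(ct) {
  pred <- as.numeric(score > ct)
  fp <- sum(pred == 1 & actual == 0)   # false positives: accepted but bad
  fn <- sum(pred == 0 & actual == 1)   # false negatives: rejected but good
  cost.fp * fp + fn
})
best <- cutoffs[which.min(cost)]
best                                   # cut-off minimising the weighted cost
```

With a high cost on false positives the optimal cut-off shifts upwards, which is why the cut-off reported above (0.83) is well above 0.5.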

4.3 Fairness of This First Model

First we calculate false negatives and false positives for further use:

FN1 <- if_else(bin_pred1 == 0 & d_fact$isGood == 1, 1, 0)
FP1 <- if_else(bin_pred1 == 1 & d_fact$isGood == 0, 1, 0)

4.3.1 GENDER

In the data men and women were roughly equally represented and in the model we did not use gender.

The following table illustrates Equal Opportunity: compare False Negative Rates:

summarytools::ctable(d_fact$GENDER, FN1, headings=FALSE)
GENDER   FN1 = 0         FN1 = 1         Total
M        2937 (69.4%)    1293 (30.6%)    4230 (100.0%)
F        3658 (67.7%)    1749 (32.3%)    5407 (100.0%)
Total    6595 (68.4%)    3042 (31.6%)    9637 (100.0%)

Predictive Equality: compare False Positive Rates

summarytools::ctable(d_fact$GENDER, FP1, headings=FALSE)
GENDER   FP1 = 0         FP1 = 1       Total
M        4081 (96.5%)    149 (3.5%)    4230 (100.0%)
F        5234 (96.8%)    173 (3.2%)    5407 (100.0%)
Total    9315 (96.7%)    322 (3.3%)    9637 (100.0%)
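The two criteria compared above can also be computed directly as group-wise means of the FN/FP indicator vectors. A self-contained toy illustration (the group, actual and pred vectors here are made up, not the insurance data):

```r
# Equal Opportunity compares false-negative shares per group;
# Predictive Equality compares false-positive shares per group.
group  <- rep(c("M", "F"), each = 4)
actual <- c(1, 1, 0, 0,  1, 1, 0, 0)   # true outcome (1 = good customer)
pred   <- c(1, 0, 1, 0,  1, 1, 1, 0)   # model decision (1 = accept)
fn <- as.numeric(pred == 0 & actual == 1)
fp <- as.numeric(pred == 1 & actual == 0)
tapply(fn, group, mean)  # Equal Opportunity: FN share per group
tapply(fp, group, mean)  # Predictive Equality: FP share per group
```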

4.3.2 AGE

The variable AGE has, unlike GENDER, been used in our model. So, it would be normal to see a different demographic parity for younger and older people.

summarytools::ctable(d_fact$AGE, bin_pred1, headings=FALSE)
AGE      bin_pred1 = 0    bin_pred1 = 1    Total
<30      321 (85.8%)      53 (14.2%)       374 (100.0%)
30–40    1523 (66.0%)     783 (34.0%)      2306 (100.0%)
40–55    2723 (47.7%)     2981 (52.3%)     5704 (100.0%)
>55      725 (57.9%)      528 (42.1%)      1253 (100.0%)
Total    5292 (54.9%)     4345 (45.1%)     9637 (100.0%)

Indeed, a young person has a 17.4% probability of being accepted, while an older person has around 50%. The difference, amounting to a 187% higher probability, is indeed significant. Due to the model design this is a desired outcome rather than unexpected bias.

Old and young people are under-represented, which makes it harder for the model to correctly predict their behaviour. That could be a hidden bias (a bias that is not by design).
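One way to quantify the lack of demographic parity shown above is the ratio of the lowest to the highest acceptance rate per group (the "four-fifths rule" heuristic from US employment practice). Using the acceptance rates copied from the table above:

```r
# Demographic-parity ratio over the age groups (rates from the table above).
acc <- c("<30" = 0.142, "30-40" = 0.340, "40-55" = 0.523, ">55" = 0.421)
min(acc) / max(acc)   # ~0.27, far below the 0.8 four-fifths heuristic
```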

The following table illustrates Equal Opportunity: compare False Negative Rates:

summarytools::ctable(d_fact$AGE, FN1, headings=FALSE)
AGE      FN1 = 0         FN1 = 1         Total
<30      242 (64.7%)     132 (35.3%)     374 (100.0%)
30–40    1482 (64.3%)    824 (35.7%)     2306 (100.0%)
40–55    4042 (70.9%)    1662 (29.1%)    5704 (100.0%)
>55      829 (66.2%)     424 (33.8%)     1253 (100.0%)
Total    6595 (68.4%)    3042 (31.6%)    9637 (100.0%)

Younger people are discriminated against compared to older people: the false negative rates are about 5% higher for younger people.

Predictive Equality: compare False Positive Rates

summarytools::ctable(d_fact$AGE, FP1, headings=FALSE)
AGE      FP1 = 0         FP1 = 1       Total
<30      371 (99.2%)     3 (0.8%)      374 (100.0%)
30–40    2232 (96.8%)    74 (3.2%)     2306 (100.0%)
40–55    5507 (96.5%)    197 (3.5%)    5704 (100.0%)
>55      1205 (96.2%)    48 (3.8%)     1253 (100.0%)
Total    9315 (96.7%)    322 (3.3%)    9637 (100.0%)

This effect is even more pronounced when comparing the false positives: here the oldest group has a 162% higher probability of getting an advantage.

4.3.3 MSTATUS

summarytools::ctable(d_fact$MSTATUS, FN1, headings=FALSE)
MSTATUS   FN1 = 0         FN1 = 1         Total
Yes       4120 (70.8%)    1700 (29.2%)    5820 (100.0%)
No        2475 (64.8%)    1342 (35.2%)    3817 (100.0%)
Total     6595 (68.4%)    3042 (31.6%)    9637 (100.0%)
summarytools::ctable(d_fact$MSTATUS, FP1, headings=FALSE)
MSTATUS   FP1 = 0         FP1 = 1       Total
Yes       5620 (96.6%)    200 (3.4%)    5820 (100.0%)
No        3695 (96.8%)    122 (3.2%)    3817 (100.0%)
Total     9315 (96.7%)    322 (3.3%)    9637 (100.0%)

While there are differences across marital status, they are not big and do not indicate significant bias.

4.4 Conclusions for model1

The first consideration is what we regard as protected features. If we consider AGE, GENDER, MSTATUS, etc. as protected features, then we conclude that there are two types of cases:

  • variables that have been used in the model: obviously some of the groups have a lower probability of a favourable outcome (there will be no demographic parity) – false positive or negative rates can show a difference, but do not necessarily do so. Typically:
    • when one group is under-represented, that group is less well understood by the model (e.g. younger people compared to middle-aged people);
    • when both groups are equally represented, as for GENDER, this bias does not seem to occur.
  • variables that have not been used in the model: typically the FNR and FPR differ, but not necessarily.

5 Model where Protected Features are Left Out

The protected features \(S_j\) are chosen to be:

  • Variables related to gender and family:
    • HOMEKIDS
    • PARENT1
    • MSTATUS
    • GENDER
  • Education and profession:
    • EDUCATION
    • OCCUPATION
  • Wealth related variables:
    • BLUEBOOK (0.4 correlation to INCOME)
    • HOME_VAL

or any variables that are too strongly correlated with those. [we skip this]

That leaves us with the following \(X_i\):

  • Car use related
    • TRAVTIME
    • CAR_USE
    • URBANICITY
  • Car type related
    • CAR_TYPE
  • Behaviour on our books
    • TIF
    • OLDCLAIM
  • Third party driving behaviour indicators:
    • REVOKED
    • MVR_PTS
  • Other behavioural indicators
    • YOJ
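The proxy check hinted at above ("variables that are too strongly correlated with those") could, for categorical features like these, be sketched with Cramér's V derived from a chi-squared test. The cramers_v helper and the data below are illustrative, not part of ethics_functions.R:

```r
# Sketch: measure how strongly a candidate feature X is associated with a
# protected feature S; a value near 1 would flag X as a proxy for S.
cramers_v <- function(x, y) {
  tab  <- table(x, y)
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE))$statistic
  k    <- min(nrow(tab), ncol(tab))
  as.numeric(sqrt(chi2 / (sum(tab) * (k - 1))))
}
set.seed(1)
S <- sample(c("young", "old"), 200, replace = TRUE)   # protected feature
X <- ifelse(S == "young",                             # candidate proxy for S
            sample(c("A", "B"), 200, replace = TRUE, prob = c(0.8, 0.2)),
            sample(c("A", "B"), 200, replace = TRUE, prob = c(0.2, 0.8)))
cramers_v(X, S)
```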

5.1 Fitting the Model

frm <- formula(isGood ~ TRAVTIME + CAR_USE + TIF + CAR_TYPE + OLDCLAIM + REVOKED + MVR_PTS + URBANICITY +  YOJ)

m2 <- glm(frm, d_fact, family = 'binomial')
summary(m2)
## 
## Call:
## glm(formula = frm, family = "binomial", data = d_fact)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.9651  -0.8090   0.4747   0.7671   2.2669  
## 
## Coefficients:
##                                 Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                      0.53254    0.12742   4.179 2.92e-05 ***
## TRAVTIME25--40                  -0.24938    0.06428  -3.880 0.000105 ***
## TRAVTIME>40                     -0.55331    0.06525  -8.480  < 2e-16 ***
## CAR_USEPrivate                   0.90615    0.06017  15.061  < 2e-16 ***
## TIF2--5                          0.21861    0.06999   3.123 0.001788 ** 
## TIF6--8                          0.39730    0.06845   5.804 6.46e-09 ***
## TIF>8                            0.54335    0.07216   7.530 5.09e-14 ***
## CAR_TYPEPickup                  -0.52777    0.08414  -6.273 3.55e-10 ***
## CAR_TYPEPTruck_Van              -0.10433    0.09246  -1.128 0.259193    
## CAR_TYPESports_Car              -0.97359    0.08881 -10.963  < 2e-16 ***
## CAR_TYPEz_SUV                   -0.72451    0.07177 -10.094  < 2e-16 ***
## OLDCLAIM>0                      -0.57691    0.05754 -10.027  < 2e-16 ***
## REVOKEDYes                      -0.78603    0.07020 -11.197  < 2e-16 ***
## MVR_PTS1                        -0.20309    0.08053  -2.522 0.011671 *  
## MVR_PTS2                        -0.22278    0.08485  -2.626 0.008649 ** 
## MVR_PTS3--4                     -0.37063    0.07411  -5.001 5.69e-07 ***
## MVR_PTS>4                       -0.65506    0.08222  -7.967 1.62e-15 ***
## URBANICITYz_Highly_Rural/ Rural  1.96552    0.10001  19.653  < 2e-16 ***
## YOJ>1                            0.78966    0.08681   9.096  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 11181.6  on 9636  degrees of freedom
## Residual deviance:  9286.4  on 9618  degrees of freedom
## AIC: 9324.4
## 
## Number of Fisher Scoring iterations: 5

All coefficients are meaningful.

AUC2  <- fAUC(m2, d_fact, type="response")
pred2 <- prediction(predict(m2, type = "response"), d_fact$isGood)
perf2 <- performance(pred2, "tpr", "fpr")
#predScores <- modelr::add_predictions(df, m)$pred  # not correct?
predScores <- predict(m2, type = "response")

The AUC of model \(m_2\) is: 0.775. Compare this to the AUC of \(m_1\): 0.817.

5.1.1 Optimal Cut-Off

# The optimal cutoff if the false positives cost 3 times more than the false negatives:
cutoff_vals <- opt_cut_off(perf2, pred2, cost.fp = 3)
cutoff_vals
##                  [,1]
## sensitivity 0.4454352
## specificity 0.9016330
## cutoff      0.8443056
cutoff2 <- cutoff_vals[3,1]
bin_pred2 <- if_else(pred2@predictions[[1]] > cutoff2, 1, 0)

We accept around 35% of the customers. However, we remind ourselves that the data was re-balanced (rows of good customers were probably deleted at random), so it is impossible to say what percentage of customers we would actually reject.

5.2 The Fairness of Model2

First we calculate false negatives and false positives for further use:

FN2 <- if_else(bin_pred2 == 0 & d_fact$isGood == 1, 1, 0)
FP2 <- if_else(bin_pred2 == 1 & d_fact$isGood == 0, 1, 0)

5.2.1 GENDER

In the data men and women were roughly equally represented and in the model we did not use gender.

The following table illustrates Equal Opportunity: compare False Negative Rates:

summarytools::ctable(d_fact$GENDER, FN2, headings=FALSE)
GENDER   FN2 = 0         FN2 = 1         Total
M        2612 (61.7%)    1618 (38.3%)    4230 (100.0%)
F        3098 (57.3%)    2309 (42.7%)    5407 (100.0%)
Total    5710 (59.3%)    3927 (40.7%)    9637 (100.0%)

Males now have roughly 10% fewer false negatives; that means we will unjustly reject about 10% more women than men. Note that in model \(m_1\) this gap was only around 6%.

Predictive Equality: compare False Positive Rates

summarytools::ctable(d_fact$GENDER, FP2, headings=FALSE)
GENDER   FP2 = 0         FP2 = 1       Total
M        4089 (96.7%)    141 (3.3%)    4230 (100.0%)
F        5296 (97.9%)    111 (2.1%)    5407 (100.0%)
Total    9385 (97.4%)    252 (2.6%)    9637 (100.0%)

The false positive rate for men is now also significantly higher than for women.

5.2.2 AGE

Unlike in model \(m_1\), the variable AGE has not been used in this model. Still, it is worth checking demographic parity for younger and older people.

summarytools::ctable(d_fact$AGE, bin_pred2, headings=FALSE)
AGE      bin_pred2 = 0    bin_pred2 = 1    Total
<30      286 (76.5%)      88 (23.5%)       374 (100.0%)
30–40    1549 (67.2%)     757 (32.8%)      2306 (100.0%)
40–55    3578 (62.7%)     2126 (37.3%)     5704 (100.0%)
>55      834 (66.6%)      419 (33.4%)      1253 (100.0%)
Total    6247 (64.8%)     3390 (35.2%)     9637 (100.0%)

Although we left out the variable AGE, there are still differences in the probability of being accepted across age groups. This is now due to the correlation between age group and the features that were used in the model.

Despite the fact that younger people have had less chance to build up a 5-year negative track record, they still get lower acceptance rates.

The following table illustrates Equal Opportunity: compare False Negative Rates:

summarytools::ctable(d_fact$AGE, FN2, headings=FALSE)
AGE      FN2 = 0         FN2 = 1         Total
<30      271 (72.5%)     103 (27.5%)     374 (100.0%)
30–40    1445 (62.7%)    861 (37.3%)     2306 (100.0%)
40–55    3260 (57.2%)    2444 (42.8%)    5704 (100.0%)
>55      734 (58.6%)     519 (41.4%)     1253 (100.0%)
Total    5710 (59.3%)    3927 (40.7%)    9637 (100.0%)

Younger people are discriminated against compared to older people: the difference in the false negative rates is up from 5% to 13%.

Predictive Equality: compare False Positive Rates

summarytools::ctable(d_fact$AGE, FP2, headings=FALSE)
AGE      FP2 = 0         FP2 = 1       Total
<30      365 (97.6%)     9 (2.4%)      374 (100.0%)
30–40    2221 (96.3%)    85 (3.7%)     2306 (100.0%)
40–55    5580 (97.8%)    124 (2.2%)    5704 (100.0%)
>55      1219 (97.3%)    34 (2.7%)     1253 (100.0%)
Total    9385 (97.4%)    252 (2.6%)    9637 (100.0%)

When comparing the false positives, the oldest group has an 8% higher probability of getting an advantage. That difference is much lower than in model \(m_1\).

5.2.3 MSTATUS

summarytools::ctable(d_fact$MSTATUS, FN2, headings=FALSE)
MSTATUS   FN2 = 0         FN2 = 1         Total
Yes       3301 (56.7%)    2519 (43.3%)    5820 (100.0%)
No        2409 (63.1%)    1408 (36.9%)    3817 (100.0%)
Total     5710 (59.3%)    3927 (40.7%)    9637 (100.0%)
summarytools::ctable(d_fact$MSTATUS, FP2, headings=FALSE)
MSTATUS   FP2 = 0         FP2 = 1       Total
Yes       5703 (98.0%)    117 (2.0%)    5820 (100.0%)
No        3682 (96.5%)    135 (3.5%)    3817 (100.0%)
Total     9385 (97.4%)    252 (2.6%)    9637 (100.0%)

While there are differences across marital status, they are not big and do not indicate significant bias.

5.3 Conclusions for model2

Leaving out variables does not solve all problems.

6 Addressing Unfairness

We choose to address the unfair treatment of young people.

6.1 Weighting Data

Young people are more likely to be misunderstood by the model because we have fewer young people in the data. A simple and powerful solution could be to weight the data. Note that this technique can only address one bias at a time.

Therefore we will duplicate the rows of the youngest drivers 5-fold (N in the code below).

N <- 5   # number of extra copies of the young drivers' rows
x <- d_fact %>% filter(AGE == "<30")
df <- d_fact
for(k in 1:N) {df <- add_row(df, x)}
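Note that adding N = 5 extra copies gives each young driver's row a total weight of 6. An alternative that avoids inflating the dataset is glm()'s weights argument; a sketch on toy data (the report itself uses row duplication):

```r
# Sketch: case weights in glm() instead of physical row duplication.
set.seed(7)
toy   <- data.frame(x = rnorm(200))
toy$y <- rbinom(200, 1, plogis(toy$x))
w <- ifelse(seq_len(200) <= 50, 6, 1)   # up-weight the first 50 rows 6-fold
m_w <- glm(y ~ x, data = toy, family = 'binomial', weights = w)
coef(m_w)
```

For integer weights on a 0/1 response this is equivalent to duplicating each row weight-many times, so the fitted coefficients match the duplication approach.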

frm <- formula(isGood ~ TRAVTIME + CAR_USE + TIF + CAR_TYPE + OLDCLAIM + REVOKED + MVR_PTS + URBANICITY +  YOJ)

m2_weighted <- glm(frm, df, family = 'binomial')
summary(m2_weighted)
## 
## Call:
## glm(formula = frm, family = "binomial", data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.9415  -0.9333   0.4678   0.8112   2.3475  
## 
## Coefficients:
##                                 Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                      0.48141    0.10860   4.433 9.30e-06 ***
## TRAVTIME25--40                  -0.22477    0.05635  -3.989 6.64e-05 ***
## TRAVTIME>40                     -0.47744    0.05768  -8.277  < 2e-16 ***
## CAR_USEPrivate                   0.94279    0.05323  17.712  < 2e-16 ***
## TIF2--5                          0.20331    0.06292   3.231  0.00123 ** 
## TIF6--8                          0.34347    0.06079   5.650 1.61e-08 ***
## TIF>8                            0.46414    0.06382   7.272 3.53e-13 ***
## CAR_TYPEPickup                  -0.54513    0.07404  -7.363 1.80e-13 ***
## CAR_TYPEPTruck_Van              -0.15890    0.08244  -1.928  0.05391 .  
## CAR_TYPESports_Car              -1.10351    0.07803 -14.142  < 2e-16 ***
## CAR_TYPEz_SUV                   -0.78235    0.06288 -12.441  < 2e-16 ***
## OLDCLAIM>0                      -0.53347    0.05106 -10.448  < 2e-16 ***
## REVOKEDYes                      -0.71190    0.06336 -11.236  < 2e-16 ***
## MVR_PTS1                        -0.28167    0.07200  -3.912 9.15e-05 ***
## MVR_PTS2                        -0.27150    0.07594  -3.575  0.00035 ***
## MVR_PTS3--4                     -0.48838    0.06540  -7.468 8.14e-14 ***
## MVR_PTS>4                       -0.80067    0.07204 -11.114  < 2e-16 ***
## URBANICITYz_Highly_Rural/ Rural  2.11314    0.08884  23.785  < 2e-16 ***
## YOJ>1                            0.69502    0.07444   9.337  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 14191  on 11506  degrees of freedom
## Residual deviance: 11628  on 11488  degrees of freedom
## AIC: 11666
## 
## Number of Fisher Scoring iterations: 5
AUC2_w <- fAUC(m2_weighted, d_fact, type="response")
pred2_w <- prediction(predict(m2_weighted, type = "response"), df$isGood)
perf2_w <- performance(pred2_w, "tpr", "fpr")
#predScores <- modelr::add_predictions(df, m)$pred  # not correct?
predScores <- predict(m2_weighted, type = "response")

The AUC of model \(m2_{weighted}\) is: 0.774. Compare this to the AUC of \(m_2\): 0.775; this is hardly lower (remember also the AUC of \(m_1\): 0.817).

6.1.0.1 Optimal Cut-Off of the Weighted Model

# The optimal cutoff if the false positives cost 3 times more than the false negatives:
cutoff_vals <- opt_cut_off(perf2_w, pred2_w, cost.fp = 3)
cutoff_vals
##                  [,1]
## sensitivity 0.4766144
## specificity 0.8972254
## cutoff      0.8067735
cutoff2_w <- cutoff_vals[3,1]
bin_pred2_w <- if_else(pred2_w@predictions[[1]] > cutoff2_w, 1, 0)

6.1.1 The Fairness of the model with weighted data

First we calculate false negatives and false positives for further use:

FN2_w <- if_else(bin_pred2_w == 0 & df$isGood == 1, 1, 0)
FP2_w <- if_else(bin_pred2_w == 1 & df$isGood == 0, 1, 0)

6.1.1.1 GENDER

In the data men and women were roughly equally represented and in the model we did not use gender.

The following table illustrates Equal Opportunity: compare False Negative Rates:

x <- summarytools::ctable(df$GENDER, FN2_w, headings=FALSE)
x
GENDER   FN2_w = 0       FN2_w = 1       Total
M        3326 (66.8%)    1654 (33.2%)    4980 (100.0%)
F        4003 (61.3%)    2524 (38.7%)    6527 (100.0%)
Total    7329 (63.7%)    4178 (36.3%)    11507 (100.0%)

Now females have about a 16.43% higher probability of getting a false negative than males. That is a significant difference.

Predictive Equality: compare False Positive Rates

x <- summarytools::ctable(df$GENDER, FP2_w, headings=FALSE)
x
GENDER   FP2_w = 0        FP2_w = 1     Total
M        4778 (95.9%)     202 (4.1%)    4980 (100.0%)
F        6366 (97.5%)     161 (2.5%)    6527 (100.0%)
Total    11144 (96.8%)    363 (3.2%)    11507 (100.0%)

The false positive rate for men is now also significantly higher than for women (4.1% versus 2.5%): women are 39.19% less likely than men to receive this advantage.
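The relative differences quoted above can be reproduced directly from the counts in the two tables (a small illustrative calculation, not part of the modelling pipeline):

```r
# Relative differences in error rates between women and men,
# computed from the counts in the two tables above.
fnr_m <- 1654 / 4980                  # false negative rate for men
fnr_f <- 2524 / 6527                  # false negative rate for women
round(100 * (fnr_f / fnr_m - 1), 2)   # +16.43: women get more false negatives

fpr_m <- 202 / 4980                   # false positive rate for men
fpr_f <- 161 / 6527                   # false positive rate for women
round(100 * (fpr_f / fpr_m - 1), 2)   # -39.19: women get fewer false positives
```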

6.1.1.2 AGE

The variable AGE was, unlike GENDER, used to re-weight the data. So, it would be normal to see a different demographic parity for younger and older people.

summarytools::ctable(df$AGE, bin_pred2_w, headings=FALSE)
bin_pred2_w 0 1 Total
AGE
<30 1650 (73.5%) 594 (26.5%) 2244 (100.0%)
30–40 1497 (64.9%) 809 (35.1%) 2306 (100.0%)
40–55 3397 (59.6%) 2307 (40.4%) 5704 (100.0%)
>55 803 (64.1%) 450 (35.9%) 1253 (100.0%)
Total 7347 (63.8%) 4160 (36.2%) 11507 (100.0%)

While we left the variable AGE out of the model, we notice that there are still differences in the probability of being accepted between age groups. This is due to the correlation between the age group and the features that were used in the model.

Despite the fact that younger people have had less chance to build up a five-year track record, they still get lower acceptance rates.

The following table illustrates Equal Opportunity: compare False Negative Rates:

summarytools::ctable(df$AGE, FN2_w, headings=FALSE)
FN2_w 0 1 Total
AGE
<30 1668 (74.3%) 576 (25.7%) 2244 (100.0%)
30–40 1487 (64.5%) 819 (35.5%) 2306 (100.0%)
40–55 3414 (59.9%) 2290 (40.1%) 5704 (100.0%)
>55 760 (60.7%) 493 (39.3%) 1253 (100.0%)
Total 7329 (63.7%) 4178 (36.3%) 11507 (100.0%)

With the weighted data, younger people are no longer discriminated against: their false negative rate (25.7%) is now the lowest of all age groups.

Predictive Equality: compare False Positive Rates

summarytools::ctable(df$AGE, FP2_w, headings=FALSE)
FP2_w 0 1 Total
AGE
<30 2166 (96.5%) 78 (3.5%) 2244 (100.0%)
30–40 2211 (95.9%) 95 (4.1%) 2306 (100.0%)
40–55 5553 (97.4%) 151 (2.6%) 5704 (100.0%)
>55 1214 (96.9%) 39 (3.1%) 1253 (100.0%)
Total 11144 (96.8%) 363 (3.2%) 11507 (100.0%)

The false positives for young people are now in line with the other groups.

6.1.1.3 MSTATUS

summarytools::ctable(df$MSTATUS, FN2_w, headings=FALSE)
FN2_w 0 1 Total
MSTATUS
Yes 4101 (61.0%) 2624 (39.0%) 6725 (100.0%)
No 3228 (67.5%) 1554 (32.5%) 4782 (100.0%)
Total 7329 (63.7%) 4178 (36.3%) 11507 (100.0%)
summarytools::ctable(df$MSTATUS, FP2_w, headings=FALSE)
FP2_w 0 1 Total
MSTATUS
Yes 6570 (97.7%) 155 (2.3%) 6725 (100.0%)
No 4574 (95.7%) 208 (4.3%) 4782 (100.0%)
Total 11144 (96.8%) 363 (3.2%) 11507 (100.0%)

The weighting also introduced a significant bias against married people: they now have a higher false negative rate (39.0% versus 32.5%) and a lower false positive rate (2.3% versus 4.3%).

6.2 Additive Counterfactual Model (ACF)

6.2.1 What are ACF models?

ACF models have the following characteristics:

  • can address multiple variables at once
  • can be made for all types of algorithms
  • no need for composite features
  • can handle categorical (factorial) as well as continuous protected features
  • … but might impact the performance of the model significantly

An ACF model is built as follows:

  1. Develop a model for each of the non-protected features \(X_i\) based on the protected features \(S_j\) only – \(\forall i\): \(\widehat{X_i} = f_i (S_1, S_2, \ldots, S_n)\)
  2. Compute the residuals \(\epsilon_i\) for all the independent features \(X_i\): \(\forall i: \epsilon_i = X_i - \widehat{X_i}\)
  3. Develop a model with \(Y\) as the response variable and \(\epsilon_i\) as the predictors: \(\widehat{Y} = f_y (\epsilon_1, \epsilon_2, \ldots, \epsilon_n)\)
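On simulated toy data, the three steps can be sketched for a single continuous feature (all variable names here are illustrative, not from our dataset):

```r
# Toy illustration of the ACF recipe with one protected feature s and
# one correlated non-protected feature x (simulated data).
set.seed(42)
n <- 500
s <- rnorm(n)                        # protected feature S
x <- 0.8 * s + rnorm(n)              # non-protected feature X, correlated with S
y <- rbinom(n, 1, plogis(x - s))     # binary response Y

x_hat  <- predict(lm(x ~ s))         # step 1: explain X by S only
eps_x  <- x - x_hat                  # step 2: residual, the part of X free of S
m_acf0 <- glm(y ~ eps_x, family = "binomial")  # step 3: model Y on the residuals

cor(eps_x, s)                        # practically zero: the predictor no longer carries S
```

The residual is by construction orthogonal to the protected feature, which is exactly why step 3 yields a model that cannot (directly) use the information in \(S\).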

6.2.2 Building the ACF Model

The chosen protected features \(S_j\) are:

  • Variables related to gender and family:
    • HOMEKIDS
    • PARENT1
    • MSTATUS
    • GENDER
  • Education and profession:
    • EDUCATION
    • OCCUPATION
  • Wealth related variables:
    • BLUEBOOK
    • HOME_VAL

as well as any variables that are too strongly correlated with those (a check we skip here).

That leaves us with the following \(X_i\):

  • Car use related
    • TRAVTIME
    • CAR_USE
    • URBANICITY
  • Car type related
    • CAR_TYPE
    • CAR_AGE
  • Behaviour on our books
    • TIF
    • OLDCLAIM
  • Third party driving behaviour indicators:
    • REVOKED
    • MVR_PTS
  • Other behavioural indicators
    • YOJ

6.2.3 Fitting the Model

# we need the binary (one-hot encoded) dataset to compute the residuals
d_bin  <- readRDS('./data/d_bin.R')
d_bin$isGood <- 1 - d_bin$CLAIM_FLAG
d_bin <- select(d_bin, -c(CLAIM_FLAG))

S <- c("HOMEKIDS.1","HOMEKIDS.>=2", 
       "PARENT1.Yes", 
       "MSTATUS.No", 
       "GENDER.M", 
       "EDUCATION.<High_School", "EDUCATION.Bachelors", "EDUCATION.Masters", "EDUCATION.PhD", 
       "OCCUPATION.Home_Maker", "OCCUPATION.Lawyer", "OCCUPATION.Professional", "OCCUPATION.senior", "OCCUPATION.z_Blue_Collar",
       "AGE.<30", "AGE.30--40", "AGE.>55", 
       "HOME_VAL.100--200K", "HOME_VAL.200--300K", "HOME_VAL.>300K",   "BLUEBOOK.10--15K", "BLUEBOOK.15--20K", "BLUEBOOK.>20K" )

X <- c("TRAVTIME.<25", "TRAVTIME.>40", 
      "CAR_USE.Commercial", 
      "TIF.2--5", "TIF.6--8", "TIF.>8", 
      "CAR_TYPE.Minivan", "CAR_TYPE.Pickup", "CAR_TYPE.PTruck_Van", "CAR_TYPE.Sports_Car",  
      #"OLDCLAIM.>0",  # <-- this is too much correlated to CLM_FREQ (and gets no stars)
      "CLM_FREQ.1", "CLM_FREQ.2", "CLM_FREQ.>2",
      "REVOKED.Yes", 
      "MVR_PTS.1", "MVR_PTS.2", "MVR_PTS.3--4", "MVR_PTS.>4",
      "URBANICITY.z_Highly_Rural/ Rural",  "YOJ.<=1") 

#not_used_in_this_model <- c("RED_CAR.yes","KIDSDRV.1", "KIDSDRV.>=2","GENDER.AGE.F.<30", "GENDER.AGE.F.>55", "GENDER.AGE.F.30--40", "GENDER.AGE.M.<30", "GENDER.AGE.M.>55", "GENDER.AGE.M.30--40", "GENDER.AGE.M.40--55", "INCOME.<25K", "INCOME.55--75K", "INCOME.>75K","CAR_AGE.2--7", "CAR_AGE.8--11","CAR_AGE.12--15", "CAR_AGE.>=16","CLAIM_FLAG")

X <- str_clean(X)
S <- str_clean(S)
colnames(d_bin) <- str_clean(colnames(d_bin))

rhs <- paste(S, collapse="` + `")
rhs <- paste0("`", rhs, "`")
d_acf <- data.frame(isGood = d_bin$isGood)  # the new dataset that will be used for the model
m_acf_vars <- list()
for (x in X) {
  # step 1: model the non-protected feature x using only the protected features S
  frm <- formula(paste0("`", x, "` ~ ", rhs))
  m_acf_vars[[x]] <- glm(frm, d_bin, family = "binomial")
  x_predictions <- predict(m_acf_vars[[x]], d_bin, type = "response")
  # step 2: keep the Pearson residual, the part of x that S cannot explain
  pearson_residual <- (d_bin[[x]] - x_predictions) / sqrt(x_predictions * (1 - x_predictions))
  d_acf <- cbind(d_acf, pearson_residual)
  colnames(d_acf)[ncol(d_acf)] <- x
}

# step 3: fit the model on the residuals (stored in d_acf):
frm <- paste("isGood ~", paste(X, collapse = " + "))
m_acf <- glm(frm, d_acf, family = "quasibinomial")

AUC_acf <- fAUC(m_acf, d_acf, type="response")

summary(m_acf)
## 
## Call:
## glm(formula = frm, family = "quasibinomial", data = d_acf)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.4145  -0.9616   0.5112   0.8171   2.0123  
## 
## Coefficients:
##                                    Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                         1.34803    0.03500  38.517  < 2e-16 ***
## TRAVTIME.lt.25                      0.11657    0.03102   3.758 0.000172 ***
## TRAVTIME.gt.40                     -0.12425    0.03047  -4.077 4.59e-05 ***
## CAR_USE.Commercial                 -0.22571    0.02986  -7.560 4.41e-14 ***
## TIF.2.to.5                          0.08461    0.03078   2.749 0.005990 ** 
## TIF.6.to.8                          0.16851    0.03152   5.346 9.18e-08 ***
## TIF.gt.8                            0.22848    0.03218   7.101 1.33e-12 ***
## CAR_TYPE.Minivan                    0.24686    0.03405   7.249 4.53e-13 ***
## CAR_TYPE.Pickup                     0.05850    0.02579   2.268 0.023351 *  
## CAR_TYPE.PTruck_Van                 0.07747    0.04786   1.619 0.105520    
## CAR_TYPE.Sports_Car                -0.04872    0.02922  -1.668 0.095417 .  
## CLM_FREQ.1                         -0.11806    0.02755  -4.285 1.85e-05 ***
## CLM_FREQ.2                         -0.12679    0.02757  -4.599 4.29e-06 ***
## CLM_FREQ.gt.2                      -0.15388    0.02703  -5.693 1.28e-08 ***
## REVOKED.Yes                        -0.20843    0.02469  -8.443  < 2e-16 ***
## MVR_PTS.1                          -0.04941    0.02987  -1.654 0.098084 .  
## MVR_PTS.2                          -0.07424    0.02887  -2.572 0.010140 *  
## MVR_PTS.3.to.4                     -0.10812    0.02935  -3.684 0.000231 ***
## MVR_PTS.gt.4                       -0.15361    0.02896  -5.303 1.16e-07 ***
## URBANICITY.z_Highly_Rural.or.Rural  1.14356    0.05592  20.449  < 2e-16 ***
## YOJ.le.1                           -0.04118    0.02765  -1.489 0.136422    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 1.186)
## 
##     Null deviance: 11181.6  on 9636  degrees of freedom
## Residual deviance:  9531.6  on 9616  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 6

The model performs less well (lower AUC), but is still acceptable.

6.3 Overview of the performance of the models

model   description                            AUC
m1      naive model, no protected features     0.8171151
m2      model without the protected features   0.774611
m2_w    data re-weighted for AGE.<.30          0.7739774
m_acf   additive counterfactual model          0.7630629

6.4 De-Biasing an Existing Model

6.4.1 Reject Option Classifier

Instead of defining a single hard cutoff \(\theta\), define a “grey zone” around \(\theta\). Within that zone, give the favourable outcome to the disadvantaged group and the unfavourable outcome to the advantaged group.

Reject Option Classifier is:

  • applicable to all probabilistic classification models;
  • doesn’t require us to modify data or algorithm – is therefore also applicable to models that are in production – it can work as an overlay to existing models;
  • gives clear control over what happens.

However, it feels like discrimination for the individuals that are selected to be given an advantage or a disadvantage (a variant of the trolley problem). Obviously, such techniques can only be applied if there is a good reason to assume bias, and only in order to reduce that bias.
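As a minimal sketch, a Reject Option Classifier overlay could look as follows (the names `score`, `protected`, `theta` and `margin` are hypothetical and not tied to the pipeline above):

```r
# Sketch of a Reject Option Classifier as an overlay on an existing model.
# score:     predicted probability of the favourable outcome
# protected: TRUE for members of the disadvantaged group
# theta:     the usual cutoff; margin: half-width of the grey zone
reject_option <- function(score, protected, theta, margin) {
  pred <- as.integer(score > theta)       # the standard hard-cutoff decision
  grey <- abs(score - theta) <= margin    # observations inside the grey zone
  pred[grey &  protected] <- 1L           # favourable outcome for the disadvantaged
  pred[grey & !protected] <- 0L           # unfavourable outcome for the advantaged
  pred
}

# Example: the two borderline cases are flipped in opposite directions.
reject_option(score     = c(0.82, 0.78, 0.95, 0.40),
              protected = c(FALSE, TRUE, FALSE, TRUE),
              theta     = 0.8, margin = 0.05)
# -> 0 1 1 0
```

Only observations close to the cutoff are touched; clear accepts and rejects are left to the original model, which is what makes this usable as an overlay on models already in production.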

7 Conclusions

  • The Trolley Problem is unavoidable:
    • we must decide what the protected features are
    • we must decide what we call “fair” (e.g. Equal Opportunity, Predictive Equality, Demographic Parity, etc.)
    • we must realize that even when solving unfairness in a protected feature we can introduce bias in other groups of the same variable.
    • The key issue is to find a coherent framework (e.g. we consider as protected features those that the person cannot “choose”)
    • Also, don’t forget other ethical issues such as accountability, transparency, privacy, data ownership, etc.
  • Leaving out protected features (and the features correlated with them) is not enough to solve bias in data
  • Re-weighting of data
    • can solve one particular bias at a time (though we can use composite features),
    • can introduce other biases for other protected features or even in the same protected feature in other groups
    • only works for classification-based algorithms
    • the accuracy of the model does not suffer too much
  • Additive Counter-factual Models
    • can handle more than one variable at a time
    • the predictive power of the model is significantly impacted

Summary of the conclusions: you decide what is fair for your domain, your model, your sector, and your use case. You are biased, so make it a team effort.

8 References

8.1 Acknowledgement

In this document we used the workflow and code from: De Brouwer, Philippe J. S. 2020. The Big R-Book: From Data Science to Learning Machines and Big Data. New York: John Wiley & Sons, Ltd.

From that book we mainly used:

  • Part V (p. 373–562): Modelling

Bibliography

Agarwal, Sray, and Shashin Mishra. 2021. Responsible AI: Implementing Ethical and Unbiased Algorithms. Springer.
Barocas, Solon, Moritz Hardt, and Arvind Narayanan. 2018. Fairness and Machine Learning: Limitations and Opportunities. fairmlbook.org.
Coeckelbergh, Mark. 2020. AI Ethics. MIT Press.
De Brouwer, Philippe J. S. 2020. The Big R-Book: From Data Science to Learning Machines and Big Data. New York: John Wiley & Sons, Ltd.
Dubber, Markus Dirk, Frank Pasquale, and Sunit Das. 2020. The Oxford Handbook of Ethics of AI. Oxford Handbooks.