One of the hardest problems in fairness is that there is no consensus on the definition of fairness, or on what it means to be fair. Depending on the context or culture, the definition of fairness can differ.
Researchers and designers at Google’s PAIR (People and AI Research) initiative created the What-If visualization tool as a pragmatic resource for developers of machine learning systems. The tool provides a set of metrics that can be used to evaluate the fairness of a model. The metrics are grouped into five categories:
Group unaware: “group unaware” fairness is an approach that advocates for fairness by disregarding demographic characteristics like gender and making decisions solely based on individual qualifications.
Group threshold: “group threshold” is a fairness mechanism that recognizes that not all groups are the same, and historical disparities or biases may warrant different decision thresholds for different groups to promote equitable outcomes. It’s a technique used to fine-tune the behavior of AI models to ensure that they do not disproportionately disadvantage certain demographic groups while still maintaining some level of predictive accuracy.
Demographic parity (or group fairness, statistical parity): is an approach to ensure that the composition of the selected or approved individuals or outcomes reflects the demographic composition of the overall population.
Equal opportunity: aims to promote fairness by ensuring that individuals from different demographic groups are treated equally when they have the same qualifications or attributes relevant to a decision, and their chances of success are not influenced by factors like race, gender, or age.
Equal accuracy: ensuring that the predictive accuracy of a model is similar across different demographic groups.
As can be seen, these proposed metrics are already complex and hard to understand. For example, in my opinion, “group unaware” and “equal opportunity” are quite similar to each other: both aim to ensure that the model does not discriminate based on “protected characteristics” like gender, age, or race. Overall, these metrics fall into two broad categories, group fairness and individual fairness, which are also the two main categories of fairness in machine learning.
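To make these definitions more concrete, here is a minimal sketch in Python (my own illustration, not the What-If Tool’s code) that compares a “group unaware” rule (one shared threshold) against per-group thresholds, and reports per-group selection rates (demographic parity), true positive rates (equal opportunity), and accuracies (equal accuracy). The data, scores, and threshold values are all made up for illustration.

```python
# Sketch: comparing a "group unaware" decision rule with per-group thresholds
# and reporting a few of the fairness metrics discussed above. Toy data only.
import numpy as np

def selection_rate(y_pred, group, g):
    """P(Y_hat = 1 | group = g) -- the quantity behind demographic parity."""
    return y_pred[group == g].mean()

def true_positive_rate(y_true, y_pred, group, g):
    """P(Y_hat = 1 | Y = 1, group = g) -- the quantity behind equal opportunity."""
    mask = (group == g) & (y_true == 1)
    return y_pred[mask].mean()

def accuracy(y_true, y_pred, group, g):
    """P(Y_hat = Y | group = g) -- the quantity behind equal accuracy."""
    mask = group == g
    return (y_pred[mask] == y_true[mask]).mean()

# Toy scores, labels, and a binary protected attribute.
rng = np.random.default_rng(0)
scores = rng.uniform(size=1000)
y_true = (scores + rng.normal(scale=0.2, size=1000) > 0.5).astype(int)
group = rng.integers(0, 2, size=1000)

# "Group unaware": one shared threshold, the protected attribute is never consulted.
y_pred_unaware = (scores > 0.5).astype(int)

# "Group threshold": different (here arbitrary) thresholds per group, which could be
# tuned to close the gaps reported below.
thresholds = {0: 0.45, 1: 0.55}
y_pred_threshold = np.array([int(s > thresholds[int(g)]) for s, g in zip(scores, group)])

for name, y_pred in [("group unaware", y_pred_unaware), ("group threshold", y_pred_threshold)]:
    print(name,
          "selection rates:", [round(selection_rate(y_pred, group, g), 3) for g in (0, 1)],
          "TPRs:", [round(true_positive_rate(y_true, y_pred, group, g), 3) for g in (0, 1)],
          "accuracies:", [round(accuracy(y_true, y_pred, group, g), 3) for g in (0, 1)])
```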
Learning Fair Representations
One of the milestone works in fairness is the paper “Learning Fair Representations” by Zemel et al. (2013). The authors proposed to learn a latent representation that encodes the data well but obfuscates information about protected attributes. The intuition is that if the learned representation contains no information about the protected attribute, then any classifier built on top of it cannot use that attribute to make predictions.
The authors formulated this using the notion of statistical parity, which requires that the probability that a random element of \(X^+\) maps to a particular prototype be equal to the probability that a random element of \(X^-\) maps to the same prototype:
\[P(Z = k \mid x^+ \in X^+) = P(Z = k \mid x^- \in X^-) \; \forall k\]
where \(X^+\) and \(X^-\) are the sets of protected and unprotected examples, respectively, and \(Z\) is the latent representation, a multinomial random variable over \(K\) prototypes.
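As a rough illustration of how this constraint can be operationalized, the sketch below (my simplification, not the authors’ code) computes soft prototype assignments via a softmax over negative squared distances, as in the paper, and measures the statistical parity gap between the two groups. The data and prototype locations are random placeholders rather than learned values.

```python
# Sketch of the statistical parity term from Zemel et al. (2013) on toy data.
# Each example gets a soft assignment over K prototypes; the parity loss is the
# gap between the average assignments of the protected and unprotected groups.
import numpy as np

def soft_assignments(X, prototypes):
    """M[n, k] = P(Z = k | x_n): softmax over negative squared distances."""
    d = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (N, K)
    logits = -d
    logits -= logits.max(axis=1, keepdims=True)                  # numerical stability
    M = np.exp(logits)
    return M / M.sum(axis=1, keepdims=True)

def statistical_parity_loss(M, protected):
    """sum_k | E[M_k | x in X+] - E[M_k | x in X-] |."""
    M_plus = M[protected == 1].mean(axis=0)
    M_minus = M[protected == 0].mean(axis=0)
    return np.abs(M_plus - M_minus).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # toy features
protected = rng.integers(0, 2, 200)    # toy protected attribute
prototypes = rng.normal(size=(10, 5))  # K = 10 prototypes (learned in the real method)

M = soft_assignments(X, prototypes)
print("statistical parity loss:", statistical_parity_loss(M, protected))
```

In the actual method this term is only one part of the objective, alongside reconstruction and prediction losses, and the prototypes are optimized rather than sampled.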
Fairness in Deep Learning?
Fairness in Generative Models
Fairness in machine learning is mostly considered in the context of decision-making models such as classifiers. However, fairness is also an important issue in generative models, and it is not yet well studied there. Recently, the central problem of fairness in generative models has been how to ensure diversity in the generated outputs. For example, a response to a question about famous musicians should not only include names or images of people of the same gender identity or skin tone.
The following attributes are often considered when discussing fairness in generative models:
Gender identity
Cultural and demographic background
Physical appearance attributes
Politics-related attributes
When evaluating the fairness of generative models, the authors of suggest considering the following metrics:
Diversity of the output: Given a set of prompts, measure the diversity along identity-attribute dimensions represented in the generated outputs. For example, for prompts asking about “famous musicians”, the diversity of gender/culture/nationality in the outputs is measured. However, for prompts asking about “famous male musicians”, only the diversity of culture/nationality is considered, because the gender has already been specified in the prompt.
Ability to maintain fairness: Given a set of prompts containing counterfactuals of a sensitive attribute, the ability to provide the same quality of service. For example, if a user reveals personal demographic information to the system (e.g., that he is an Asian man) and then asks about “famous musicians”, the system should not only provide names of Asian musicians. A fair system should provide answers of the same quality as when the user does not reveal his demographic information, or when the user is a white man.
In summary, we can think about two scenarios when evaluating the fairness of generative models: same input - diverse output and diverse input - same output.
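Both scenarios can be turned into simple quantitative checks. The sketch below is a toy illustration under strong assumptions: it presumes an external annotator has already labeled each generated output with an identity attribute, and that a per-response quality score is available. The diversity score is the normalized entropy of the attribute labels (the “same input, diverse output” scenario), and the counterfactual check is the gap in mean quality between two prompt variants (the “diverse input, same output” scenario).

```python
# Sketch: two toy checks for fairness in generative models. Attribute labels
# and quality scores are assumed to come from some external annotator/judge.
from collections import Counter
import math

def diversity_score(attribute_labels):
    """Same input, diverse output: normalized entropy of an identity attribute
    across outputs generated for one prompt (1.0 = perfectly uniform)."""
    counts = Counter(attribute_labels)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(counts)) if len(counts) > 1 else 0.0

def counterfactual_gap(quality_a, quality_b):
    """Diverse input, same output: gap in mean response quality between two
    counterfactual versions of a prompt (e.g. demographics revealed vs. not).
    Smaller is fairer."""
    mean = lambda xs: sum(xs) / len(xs)
    return abs(mean(quality_a) - mean(quality_b))

# Toy example: gender labels of musicians returned for "famous musicians".
print(diversity_score(["male", "male", "female", "male", "non-binary"]))

# Toy example: quality scores with and without user demographics in the prompt.
print(counterfactual_gap([0.82, 0.79, 0.85], [0.80, 0.81, 0.84]))
```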