Trustworthy AI
When can we say that an AI/ML product is trustworthy? The paper's authors propose to analyze it from six different points of view: safety and robustness, explainability, non-discrimination and fairness, privacy, accountability and auditability, and environmental friendliness. If you are interested, you will find in the paper many examples, references to other articles and tools, and a rich bibliography.
Haochen Liu, Yiqi Wang, Wenqi Fan, Xiaorui Liu, Yaxin Li, Shaili Jain, Yunhao Liu, Anil Jain, and Jiliang Tang. 2022. Trustworthy AI: A Computational Perspective. ACM Trans. Intell. Syst. Technol. 14, 1, Article 4 (November 2022), 59 pages. https://doi.org/10.1145/3546872
We already use many AI applications in our day-to-day life: to unlock a phone, to ask a voice assistant to play a song, to find a product on an e-commerce web site, and so on. But can we really trust these systems?
From the “safety and robustness” point of view, you would like the ML system to work reliably even when some input perturbations occur; this is especially true for applications that can put humans in danger, like autonomous driving. But this should not stop at natural input perturbations: we should also take into account that somebody may want to fool the system to gain an advantage. Think of a recommendation system: if attackers can fool it into suggesting whatever they want, they can sell this capability and profit from it. Consider also that ML systems are often trained with data crawled from the internet, so it is not such a remote possibility that somebody starts seeding poisoned data for profit. These are black-box attacks; the more access the opponent has to the internals of the ML system, the more they can do: for instance, craft exactly the kind of noise to add to the input data so the system returns the prediction they want. There are some techniques to protect the system, such as building another model that validates training inputs, telling whether they are legitimate or adversarial… but it does not seem an easy task. The authors cite these tools that can be useful: CleverHans, AdverTorch, DeepRobust, and RobustBench.
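To make the white-box case concrete, here is a minimal sketch of the classic Fast Gradient Sign Method (FGSM) in PyTorch. The `model`, input tensor, and `epsilon` value are placeholders I chose for illustration; this is only meant to show the idea of crafting noise from the model's own gradients, not how the tools cited above implement their attacks.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_label, epsilon=0.03):
    """Craft an adversarial example with the Fast Gradient Sign Method.

    The attacker needs white-box access: the gradient of the loss with
    respect to the input is used to build the perturbation.
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Nudge each pixel by +/- epsilon in the direction that increases the loss.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```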
The picture below, from Cornell University, shows a famous input example that can fool an AI system. You can have a look at https://spectrum.ieee.org/slight-street-sign-modifications-can-fool-machine-learning-algorithms if you are interested.

In real life, I bought a car with a lane-assist device: the car should warn you if you accidentally cross a line without using the turn signal, the steering wheel becomes stiff and the car stays in the lane. This is perfect on the highway, but when I drive in the countryside it detects the lanes but not other obstacles, like branches fallen on the road, or even bikes! Not trustworthy enough, and it is very unpleasant when you have to turn and the car doesn’t want to!
From the “non-discrimination and fairness” point of view, with AI you get what you train on: if you train a loan-approval system with data that favor white Caucasian people, you will get an unfair system that discriminates against other ethnic groups. The same applies if you train a chatbot on improper language: you will get a lot of complaints because of sexual or racist output (https://en.wikipedia.org/wiki/Tay_(bot)). An AI system should not be biased toward a certain group of individuals.
Bias can come from many sources: inexperienced annotators preparing the training data set, an improper cleaning procedure, a data-enrichment algorithm. Sometimes, even if the ethnic group is not explicit in the training data, the algorithm will be able to deduce it. Ideally we would like all ethnic groups to be treated in the same way, but this sometimes conflicts with model performance, simply because the bias exists in the real world. The authors refer to more than 40 works about biased AI in many domains: image analysis, language processing, audio recognition, etc., as illustrated by the small check below.
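To give a flavor of what “treated in the same way” can mean in practice, here is a minimal sketch of one common fairness metric, demographic parity (the gap in positive-decision rates between groups). The decisions and group labels are invented for the example and are not taken from the paper.

```python
import numpy as np

def demographic_parity_difference(predictions, groups):
    """Gap in positive-prediction rate between groups.

    A value close to 0 means the groups receive positive decisions
    (e.g. loan approvals) at roughly the same rate.
    """
    predictions = np.asarray(predictions)
    groups = np.asarray(groups)
    rates = [predictions[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Hypothetical loan decisions (1 = approved) for two groups A and B.
decisions = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
group = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
print(demographic_parity_difference(decisions, group))  # 0.8 - 0.2 = 0.6
```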
To be trustworthy a system should also be “explainable”, especially if its domain is health care or autonomous driving. It should be possible to get an idea of how the system works in general, and this is actually possible with some algorithms: decision trees are easy to understand, and humans can also adjust what they do. This is not always possible; in that case it may still be possible to get an explanation for a specific input sample. For instance, the system may be an image classifier: you provide a picture and you get an indication of which pixel areas are the most relevant for the classification, a heat map on the image. A famous example of this is LIME (Local Interpretable Model-Agnostic Explanations, see https://homes.cs.washington.edu/~marcotcr/blog/lime/ ).
The authors provide a list of tools to explore for explainability: AIX360 (AI Explainability 360), InterpretML, DeepExplain and DIC.
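To show what a per-sample explanation looks like in practice, here is a minimal sketch using the LIME package on tabular data (the image heat-map case follows the same pattern). The data set and model are stand-ins I picked for the example; only the general workflow follows the library's documented API.

```python
# pip install lime scikit-learn
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)

# Explain one prediction: which features pushed it toward which class?
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())
```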
A “privacy” issue also comes from the training data: is it possible, from the model output, to deduce the data used during training, and in this way get back to information that should not be disclosed? There are three approaches to this problem: confidential computing, federated learning, and differential privacy. Confidential computing tries to use devices that cannot be read or operated from outside the training environment, or relies on homomorphic encryption. Federated learning trains an ML model by exchanging only parameters between parties, not the training data: in this way the original data never leave the boundary of the single entity training the model. Notice that it is also possible to introduce white noise in the parameter exchanges, to be safer. Differential privacy aims to provide statistical guarantees for reducing the disclosure about individuals by introducing some level of uncertainty through randomization.
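As a tiny illustration of the differential privacy idea, here is a sketch of the standard Laplace mechanism applied to a counting query. The salaries, the query, and the epsilon value are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def private_count(values, predicate, epsilon=0.5):
    """Answer a counting query with the Laplace mechanism.

    A count has sensitivity 1 (adding or removing one individual changes
    it by at most 1), so noise drawn from Laplace(0, 1/epsilon) gives
    epsilon-differential privacy for this single query.
    """
    true_count = sum(1 for v in values if predicate(v))
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical salaries; how many people earn more than 50k, privately?
salaries = [32_000, 48_000, 51_000, 75_000, 90_000, 41_000]
print(private_count(salaries, lambda s: s > 50_000, epsilon=0.5))
```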
I did not find the part on accountability and auditability very interesting, so I preferred to have a look at the “environmental well-being” section. Training some AI models produces the same carbon emissions as a human living for 7 years! Some other models produce more emissions than a car over its lifetime! (Emma Strubell, Ananya Ganesh, and Andrew McCallum. 2019. Energy and policy considerations for deep learning in NLP. arXiv preprint arXiv:1906.02243.)
How can we reduce this huge consumption? Among the possible techniques there is model compression: reducing the model size will sacrifice some performance, though not much if the model is large and overfits. Pruning nodes layer by layer in a neural network also seems a promising approach (see the sketch below). Specialized hardware is another way to tackle the problem. Crafting energy-conscious AutoML models is also a good idea: AutoML tools will become more and more widespread, and their users will benefit from environmental friendliness without even knowing it.
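As an example of the pruning idea, here is a minimal sketch using PyTorch's built-in pruning utilities to zero out the smallest weights layer by layer. The tiny network and the 30% amount are arbitrary choices for illustration, not values from the paper.

```python
import torch
from torch import nn
from torch.nn.utils import prune

# A tiny stand-in network; real models would be much larger.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Prune 30% of the weights with the smallest L1 magnitude in each linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # make the pruning permanent

# Fraction of parameters that are now exactly zero (the sparsity gained).
total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"sparsity: {zeros / total:.2%}")
```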