An Interpretable Neural Network for AI Incubation in Manufacturing
This week I decided to read an article about neural network interpretability. Deep neural networks are built from many layers of densely connected units, each with many parameters. Once trained they work well, but there are too many parameters to understand what they mean, so we have to trust them as black boxes.
As humans we would prefer something like decision trees that we can review to understand how a decision was taken. The article I read is about using neural networks to tune decision rules. You start the training with some expert-defined rules, and after the training you get the same rules with better-tuned parameters, so the expert can review them and check that they make sense.
The full reference to the article is: Xiaoyu Chen, Yingyan Zeng, Sungku Kang, and Ran Jin. 2022. INN: An Interpretable Neural Network for AI Incubation in Manufacturing. ACM Trans. Intell. Syst. Technol. 13, 5, Article 85 (June 2022), 23 pages. https://doi.org/10.1145/3519313
The idea is that human experts study a problem and are already able to provide some rules to classify the outcome of an experiment. The thresholds that they set are probably suboptimal and can be improved by a data-driven algorithm. The paper refers to the case of crystal growth manufacturing: the experts have already identified some important parameters, such as heater power or pull speed, and have provided an initial model. The model is expressed in terms of rules like "if x>5 and y<3 or i>6 and j>9 then the crystal is defective" (the paper does not provide real rule examples).
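To make the shape of such a rule concrete, here is a minimal Python sketch of the example above with the thresholds pulled out as named parameters. The variable names and the values 5, 3, 6 and 9 are only the illustrative ones from the sentence above, not real process rules, and I am assuming that "and" binds tighter than "or".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Expert-chosen thresholds (illustrative values, not from the paper)."""
    x: float = 5.0
    y: float = 3.0
    i: float = 6.0
    j: float = 9.0


def is_defective(x, y, i, j, t=Thresholds()):
    """Evaluate the example rule: (x > 5 and y < 3) or (i > 6 and j > 9)."""
    return (x > t.x and y < t.y) or (i > t.i and j > t.j)


# The structure of the rule stays fixed; only the threshold values change.
expert_rule = Thresholds()
tuned_rule = Thresholds(x=5.4, y=2.8)  # hypothetical values after tuning
```

The point of this layout is that training only has to replace the numbers in `Thresholds`, so the expert still reads the same rule afterwards.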
What is proposed is to apply a process that translates the rules into a neural network. The peculiarity is that the layers are not fully connected: in some layers, connections exist only where the input variables are related by the model rules. In their case the proposal is to use 4 layers, and some guidelines are provided to help apply the method to different cases.
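One way to picture "not fully connected" is a 0/1 mask derived from the rules, which zeroes out every weight between a variable and a rule that does not mention it. The sketch below is my own simplification in plain Python, not the four-layer architecture from the paper; the variable names and the two rule groups are taken from the illustrative rule above.

```python
# Input variables and the groups of variables that appear together in a rule.
variables = ["x", "y", "i", "j"]
rules = [["x", "y"], ["i", "j"]]


def build_mask(variables, rules):
    """Return one row per rule, with 1 where the variable is used by that rule."""
    return [[1 if v in rule else 0 for v in variables] for rule in rules]


def masked_forward(inputs, weights, mask):
    """Weighted sum per rule unit, ignoring connections the mask disables."""
    return [
        sum(w * m * a for w, m, a in zip(w_row, m_row, inputs))
        for w_row, m_row in zip(weights, mask)
    ]


mask = build_mask(variables, rules)   # [[1, 1, 0, 0], [0, 0, 1, 1]]
weights = [[1, 1, 1, 1], [1, 1, 1, 1]]
outputs = masked_forward([1, 2, 3, 4], weights, mask)  # [3, 7]
```

Because the masked weights never contribute, each unit in this layer can still be read as "the part of the network that belongs to rule k".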
Once the network is configured, a classical training algorithm can be used to tune it and obtain better thresholds in the rules. Note that they propose a special activation function, called Ca-Sigmoid, to incorporate the thresholds into the neural network; it is used only in the input layer.
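I am not reproducing the definition of Ca-Sigmoid here. As a rough illustration of how a threshold can live inside an activation and be tuned by training, the sketch below uses a plain logistic sigmoid shifted by a learnable threshold, fitted with gradient descent on the log loss. The steepness, learning rate, starting threshold and the tiny synthetic data set are all my own assumptions.

```python
import math


def soft_threshold(x, t, k=2.0):
    """Smooth version of the test 'x > t': a sigmoid centred on the threshold t."""
    return 1.0 / (1.0 + math.exp(-k * (x - t)))


def tune_threshold(xs, ys, t, k=2.0, lr=0.5, steps=500):
    """Gradient descent on the mean log loss with respect to the threshold only."""
    for _ in range(steps):
        # d(loss)/dt for one sample is -k * (p - y), with p the sigmoid output.
        grad = sum(-k * (soft_threshold(x, t, k) - y) for x, y in zip(xs, ys))
        t -= lr * grad / len(xs)
    return t


# Synthetic data: samples above 6.5 are labelled defective (1).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1 if x > 6.5 else 0 for x in xs]

expert_guess = 5.0                       # hypothetical initial expert threshold
tuned = tune_threshold(xs, ys, expert_guess)
print(f"expert threshold {expert_guess}, tuned threshold {tuned:.2f}")
```

On this toy data the tuned value moves from the expert's guess towards the gap between the two classes, and it can be read back as an ordinary "x > t" rule.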
Of course, the limitation is that the learning process will not learn new rules; it will only optimize the existing ones.