🏀

Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks

Tags

Model Uncertainty

Created

2021/01/31 11:05

Publication

ICML'20

Rate

4.5

Source

https://arxiv.org/abs/2002.10118

Summary

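Many models suffer from overconfidence: they assign very high classification confidence even to data they know little about (or to incorrect predictions). To address this, the paper assumes a Gaussian distribution over the model parameters and, when the output activation is a sigmoid or softmax, approximates the model's predictive distribution with the Gaussian-probit approximation. It then shows that the confidence of this predictive distribution on out-of-distribution data has an upper bound determined by the distribution's parameters (mean and variance). Moreover, it shows that the same result holds when the approximation is applied only to the parameters of the last layer rather than to all parameters of the DNN. In other words, model uncertainty can be obtained with little computation and without any additional training.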
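
To make the bound concrete, here is a minimal sketch of the binary (sigmoid) case with a Gaussian over the last-layer weights only. It uses the standard probit approximation, p(y=1|x) ≈ σ(μ / sqrt(1 + (π/8)·v)), where μ and v are the mean and variance of the logit. The function names, the toy weights, and the idea of mimicking "far from the data" by scaling the feature vector are illustrative assumptions, not code from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit_moments(phi, w_mean, w_cov):
    """Mean and variance of the logit w^T phi when w ~ N(w_mean, w_cov)."""
    n = len(phi)
    mu = sum(w_mean[i] * phi[i] for i in range(n))
    var = sum(phi[i] * w_cov[i][j] * phi[j] for i in range(n) for j in range(n))
    return mu, var


def map_confidence(phi, w_mean):
    """Point-estimate confidence: ignores weight uncertainty entirely."""
    mu = sum(m * p for m, p in zip(w_mean, phi))
    return sigmoid(abs(mu))


def probit_confidence(phi, w_mean, w_cov):
    """Confidence under the probit approximation of the predictive."""
    mu, var = logit_moments(phi, w_mean, w_cov)
    return sigmoid(abs(mu) / math.sqrt(1.0 + math.pi / 8.0 * var))


def asymptotic_bound(phi, w_mean, w_cov):
    """Limit of probit_confidence as phi is scaled by delta -> infinity."""
    mu, var = logit_moments(phi, w_mean, w_cov)
    return sigmoid(abs(mu) / math.sqrt(math.pi / 8.0 * var))


if __name__ == "__main__":
    # Toy last-layer posterior (hypothetical values).
    w_mean = [1.0, -0.5]
    w_cov = [[0.5, 0.0], [0.0, 0.5]]
    phi = [2.0, 1.0]  # last-layer features of some input

    for delta in (1, 10, 100, 1000):
        scaled = [delta * p for p in phi]
        print(delta,
              round(map_confidence(scaled, w_mean), 4),
              round(probit_confidence(scaled, w_mean, w_cov), 4))
    print("bound:", round(asymptotic_bound(phi, w_mean, w_cov), 4))
```

As the features are scaled up, the point-estimate confidence approaches 1, while the probit-approximated confidence stays below a limit that depends only on the logit mean and variance, which is the behaviour the summary describes. Because only the last-layer weights are treated as Gaussian, the rest of the network is used as a fixed feature extractor and nothing is retrained.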