A method and a computing system are provided for determining mixed-precision quantization parameters with which to quantize a neural network. The method comprises determining a vector of quantization parameters on the basis of the size of the weight vectors of the neural network and, for each of multiple training vectors of a training dataset, evaluating a second loss function on the basis of the training vector and the vector of quantization parameters, and modifying the weight vectors and the vector of quantization parameters to minimize the output of the second loss function. Each quantization parameter in the vector constrains the size of the quantized weight vector of one layer of the quantized neural network, namely the layer that corresponds to the weight vector of the respective layer of the original neural network.
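The abstract does not specify the quantizer, the form of the loss function, or how the quantization parameters are initialized, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that each quantization parameter is a fractional per-layer bit-width, that the loss is a squared error plus a penalty proportional to the quantized model size, that gradients pass through the rounding step with a straight-through estimator, and that the network is a toy two-layer linear model. All function names, the bit budget, and the hyperparameters are hypothetical and are not taken from the source.

```python
import math
import random

def init_bits(layer_sizes, budget_bits, lo=2.0, hi=8.0):
    """Hypothetical heuristic: spread a total bit budget evenly over all weights."""
    b = budget_bits / sum(layer_sizes)
    return [min(hi, max(lo, b)) for _ in layer_sizes]

def model_size_bits(layer_sizes, bits):
    """Size of the quantized network: weights per layer times bits per weight."""
    return sum(n * b for n, b in zip(layer_sizes, bits))

def fake_quant(w, b):
    """Round w to a grid of step s = 2**(1-b) clipped to [-1, 1-s].

    Returns (q, dq/dw, dq/db), using a straight-through estimator for rounding.
    """
    s = 2.0 ** (1.0 - b)
    k = round(w / s)
    q = k * s
    ds_db = -math.log(2.0) * s
    if q < -1.0:                      # clipped at the lower bound, constant in w and b
        return -1.0, 0.0, 0.0
    if q > 1.0 - s:                   # clipped at the upper bound 1 - s
        return 1.0 - s, 0.0, -ds_db
    return q, 1.0, (k - w / s) * ds_db

def train_step(params, bits, x, t, lam=1e-3, lr=0.05, lr_bits=0.05):
    """One SGD step on (squared error + lam * model size) for one training vector."""
    w1, w2 = params                   # flat weight lists, updated in place
    d, h_dim = len(x), len(w2)
    q1 = [fake_quant(w, bits[0]) for w in w1]
    q2 = [fake_quant(w, bits[1]) for w in w2]
    h = [sum(q1[i * d + j][0] * x[j] for j in range(d)) for i in range(h_dim)]
    y = sum(q2[i][0] * h[i] for i in range(h_dim))
    sizes = [len(w1), len(w2)]
    loss = (y - t) ** 2 + lam * model_size_bits(sizes, bits)
    g_y = 2.0 * (y - t)
    # Gradients of the loss with respect to the quantized weights of each layer.
    g_q2 = [g_y * h[i] for i in range(h_dim)]
    g_q1 = [g_y * q2[i][0] * x[j] for i in range(h_dim) for j in range(d)]
    for layer, (w, q, g) in enumerate(((w1, q1, g_q1), (w2, q2, g_q2))):
        # Bit-width gradient: size penalty plus the task-loss term through the quantizer.
        g_b = lam * sizes[layer] + sum(gi * qi[2] for gi, qi in zip(g, q))
        for k in range(len(w)):
            w[k] -= lr * g[k] * q[k][1]
        bits[layer] = min(8.0, max(2.0, bits[layer] - lr_bits * g_b))
    return loss

random.seed(0)
d, h_dim = 3, 2
w1 = [random.uniform(-0.5, 0.5) for _ in range(h_dim * d)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(h_dim)]
bits = init_bits([len(w1), len(w2)], budget_bits=48)  # 48 bits over 8 weights
data = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(200)]
for x in data:
    t = 0.5 * x[0] - 0.25 * x[1]      # toy regression target (assumption)
    loss = train_step((w1, w2), bits, x, t)
final_bits = [math.ceil(b) for b in bits]  # round up to whole bits per layer
print(final_bits, loss)
```

In this sketch the penalty weight `lam` sets the trade-off: a larger value pushes the per-layer bit-widths down faster, while the task-loss term pushes them back up for layers where coarse rounding hurts accuracy.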