Kolmogorov-Arnold representation

Reference to published paper Kolmogorov-Arnold

Needless to say, the Kolmogorov-Arnold representation was introduced by the two mathematicians Kolmogorov and Arnold. We can add that it happened in 1957 and that the multivariate function $M$ must be continuous:
$$ M(x_1, x_2, x_3, \ldots, x_n) = \sum_{i=1}^{2n+1} U_i\left(\sum_{j=1}^{n} f_{i,j}(x_{j})\right). $$
One strange thing about this representation is that until 2021 nobody noticed that it is actually a tree of discrete Urysohn operators,
whose properties are explained in the previous article. The top operator is called the outer operator and the bottom blocks are called inner operators. In light of what has already been explained about discrete Urysohn operators, identifying this model is not as hard as it may look at first glance.

We limit all functions to be piecewise linear with assigned abscissa nodes and unknown ordinates. The ordinates are randomly initialized, which allows estimation of the output $\hat{M}$ and of the intermediate values $\hat{y}_1, \hat{y}_2, \ldots$ for any given input vector $X^j$. Since each $\hat{y}_i$ is an argument of a linear block of the outer operator, we can introduce small increments $\Delta y_1, \Delta y_2, \ldots$ that reduce the error between the estimated and actual outputs, $|M - \hat{M}|$, and then, using the new intermediate values $y_1 + \Delta y_1, y_2 + \Delta y_2, \ldots$, update all operators.
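The piecewise-linear blocks and one increment-based update step can be sketched as follows. This is a minimal illustration under my own assumptions, not the author's code: the increments are taken as gradient-style corrections $\Delta y_i = e \cdot U_i'(\hat{y}_i)$ with a small rate `mu`, and all names (`PiecewiseLinear`, `nudge`, node counts) are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

class PiecewiseLinear:
    """Piecewise-linear function: fixed abscissa nodes, trainable ordinates."""
    def __init__(self, lo, hi, k):
        self.x = np.linspace(lo, hi, k)
        self.y = rng.uniform(-0.1, 0.1, k)        # random initialization
        self.h = self.x[1] - self.x[0]

    def _seg(self, v):
        v = min(max(v, self.x[0]), self.x[-1])    # clamp to the node range
        i = min(int((v - self.x[0]) / self.h), len(self.x) - 2)
        return i, (v - self.x[i]) / self.h        # segment index, weight in [0, 1]

    def __call__(self, v):
        i, w = self._seg(v)
        return (1 - w) * self.y[i] + w * self.y[i + 1]

    def slope(self, v):
        i, _ = self._seg(v)
        return (self.y[i + 1] - self.y[i]) / self.h

    def nudge(self, v, delta, mu):
        """Move the two bracketing ordinates by a fraction of delta."""
        i, w = self._seg(v)
        self.y[i] += mu * (1 - w) * delta
        self.y[i + 1] += mu * w * delta

n, branches = 2, 5                                 # n inputs, 2n + 1 branches
inner = [[PiecewiseLinear(0, 1, 8) for _ in range(n)] for _ in range(branches)]
outer = [PiecewiseLinear(-n, n, 12) for _ in range(branches)]

def model(x):
    ys = [sum(f(x[j]) for j, f in enumerate(row)) for row in inner]
    return sum(U(y) for U, y in zip(outer, ys)), ys

# one record: input vector X and actual output M (toy values)
X, M = np.array([0.3, 0.8]), 0.7
M_hat, ys = model(X)
e = M - M_hat                                      # residual before the update

# increments for the intermediate values, then update all operators
dy = [e * U.slope(y) for U, y in zip(outer, ys)]
for b in range(branches):
    outer[b].nudge(ys[b], e / branches, mu=0.05)     # update the outer operator
    for j in range(n):
        inner[b][j].nudge(X[j], dy[b] / n, mu=0.05)  # update inner operators

print(abs(e), abs(M - model(X)[0]))                # the residual shrinks
```

Because the blocks are linear between nodes, each update touches only the two ordinates that bracket the argument, which is what makes the record-by-record identification cheap.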

Training is conducted in the same way as for neural networks, by making record-by-record improvements. Each improvement step is performed for one record, and the entire process needs multiple epochs.
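Putting the pieces together, such a record-by-record loop over several epochs might look like the self-contained sketch below. The toy target, learning rate, node counts, and all names are my assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

class PL:
    """Piecewise-linear block: fixed nodes, trainable ordinates."""
    def __init__(self, lo, hi, k):
        self.x, self.y = np.linspace(lo, hi, k), rng.uniform(-0.1, 0.1, k)
        self.h = self.x[1] - self.x[0]
    def _seg(self, v):
        v = min(max(v, self.x[0]), self.x[-1])
        i = min(int((v - self.x[0]) / self.h), len(self.x) - 2)
        return i, (v - self.x[i]) / self.h
    def __call__(self, v):
        i, w = self._seg(v)
        return (1 - w) * self.y[i] + w * self.y[i + 1]
    def slope(self, v):
        i, _ = self._seg(v)
        return (self.y[i + 1] - self.y[i]) / self.h
    def nudge(self, v, d, mu=0.05):
        i, w = self._seg(v)
        self.y[i] += mu * (1 - w) * d
        self.y[i + 1] += mu * w * d

n, B = 2, 5                                   # n inputs, 2n + 1 branches
inner = [[PL(0, 1, 8) for _ in range(n)] for _ in range(B)]
outer = [PL(-n, n, 12) for _ in range(B)]

def predict(x):
    ys = [sum(f(x[j]) for j, f in enumerate(row)) for row in inner]
    return sum(U(y) for U, y in zip(outer, ys)), ys

def train_record(x, m):
    """One improvement step for one record."""
    m_hat, ys = predict(x)
    e = m - m_hat
    dy = [e * U.slope(y) for U, y in zip(outer, ys)]  # increments for y_i
    for b in range(B):
        outer[b].nudge(ys[b], e / B)
        for j in range(n):
            inner[b][j].nudge(x[j], dy[b] / n)
    return abs(e)

# toy dataset: M(x1, x2) = sin(pi * x1) * x2 on the unit square
X = rng.uniform(0, 1, (400, n))
M = np.sin(np.pi * X[:, 0]) * X[:, 1]

err0 = np.mean([abs(m - predict(x)[0]) for x, m in zip(X, M)])
for epoch in range(30):                       # multiple epochs over the records
    err = np.mean([abs(train_record(x, m)) for x, m in zip(X, M)])
print(f"mean |error|: {err0:.3f} -> {err:.3f}")
```

The loop mirrors stochastic training of a neural network: one small correction per record, repeated over the dataset until the mean residual stops improving.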