
Multimodal posterior distribution
When models such as Bayesian Neural Networks (BNN) and Divisive Data Resorting (DDR) return distribution parameters
or output samples for inputs not used in training, we wish to know how accurate they are. Experimental
data or recorded observations of physical systems usually have all-distinct inputs, so they cannot be used even
to assess the accuracy of the returned expectations, let alone the distributions. For comparison we need
so-called true or actual distributions,
and the only way to obtain them is to use synthetic data. This generated data must be challenging, to expose the weaknesses
and strengths of the models. Challenging means that the outputs should have non-normal, non-unimodal distributions
that vary significantly across different inputs.
The simulating formula answering these needs was derived by the authors of this site:
where $C_j$ are uniformly distributed random values on $[0,1]$, $X_j$ are the observed values, $X^*_j$
are the values used in computing the outputs $y$, and the parameter $\delta$ is the error level (we use $\delta = 0.8$).
Below are two histograms for different inputs, as examples:
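The formula itself is not reproduced here. As a rough illustration of the idea only (a sketch with an invented perturbation rule and an invented nonlinear map, not the authors' formula), the following generates, for a fixed input, a sample of outputs whose distribution depends on the input and is generally neither normal nor unimodal:

```python
import numpy as np

rng = np.random.default_rng(0)
DELTA = 0.8  # error level delta from the text

def sample_outputs(x, n_samples=1024):
    # Perturb observed inputs X_j by uniform random values C_j on [0, 1],
    # scaled by the error level, to get X*_j (an invented perturbation rule).
    x = np.asarray(x, dtype=float)
    c = rng.uniform(0.0, 1.0, size=(n_samples, x.size))
    x_star = x * (1.0 + DELTA * (c - 0.5))
    # An invented nonlinear map of X* to y, chosen only so that the output
    # distribution varies with x and is typically multimodal.
    return np.sin(3.0 * x_star.sum(axis=1)) + 0.3 * x_star[:, 0] ** 2

samples = sample_outputs([0.4, 0.7, 0.1])  # one array of 1024 output values
```

Any construction with these properties serves the same purpose: a known sampling mechanism against which predicted distributions can be checked exactly.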
Bayesian Neural Network test
For the BNN test we used a published code sample, referred to here as the
Keras benchmark.
The original Wine Quality data used in the published
example was replaced by
10,000 records generated by the formula above. The slightly modified Python code, with the new data and the assessed results, can be found
in the author's repository.
After training of the model is completed, the test program generates 100 new inputs and passes them to the model, which returns
100 output samples in the form of arrays of 1024 possible output values. They are compared to same-size arrays generated by the formula.
We can then compute and compare expectations, variances, and even histograms.
The metric for expectations was the Pearson correlation coefficient computed over the 100 predicted and actual values; the same metric was
used for the variances. The histograms were compared by the Kullback-Leibler (KL) divergence and, in order to have a single
value as an accuracy measure, we report the average KL divergence over the best 90% of the 100 tested samples.
The histograms are computed as four bar values. Although four bars is a rather rough estimate of a
continuous distribution, it suffices to tell a unimodal distribution from a multimodal one, or a symmetric one from a skewed one.
For visualization we provide two pairs of four-bar histograms, for KL divergences equal to 0.11 and 1.24.
Two histograms with KL divergence = 0.11
Two histograms with KL divergence = 1.24
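The metrics above can be sketched as follows (the function names and the smoothing constant are ours, and the authors' exact binning may differ; this is only one straightforward way to compute them):

```python
import numpy as np

def kl_divergence_4bin(pred, actual, bins=4):
    """KL divergence between two samples, compared via coarse histograms
    (four bins, as in the text) built on shared bin edges."""
    lo = min(pred.min(), actual.min())
    hi = max(pred.max(), actual.max())
    edges = np.linspace(lo, hi, bins + 1)
    eps = 1e-9  # smoothing so empty bins do not give log(0)
    p = np.histogram(actual, bins=edges)[0] + eps
    q = np.histogram(pred, bins=edges)[0] + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def summarize(pred_samples, true_samples, keep=0.9):
    """pred_samples, true_samples: (n_inputs, n_draws) arrays.
    Returns the Pearson correlations of expectations and of variances,
    and the mean KL divergence over the best `keep` fraction of inputs."""
    r_e = np.corrcoef(pred_samples.mean(1), true_samples.mean(1))[0, 1]
    r_v = np.corrcoef(pred_samples.var(1), true_samples.var(1))[0, 1]
    kls = np.sort([kl_divergence_4bin(p, t)
                   for p, t in zip(pred_samples, true_samples)])
    return r_e, r_v, kls[: int(keep * len(kls))].mean()
```

Sorting the per-input KL values and averaging only the best 90% discards the worst tail, as described in the text.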
Test results for 8 executions of the Python benchmark example:

Run            1     2     3     4     5     6     7     8
Expectation    0.99  0.99  0.99  0.99  0.99  0.99  0.99  0.99
Variance       0.93  0.92  0.86  0.91  0.95  0.92  0.94  0.87
KL divergence  1.33  1.34  1.35  1.16  1.51  1.34  1.26  1.31
All distribution samples for the 100 test inputs returned by this Keras benchmark
model are near normal; one of them is shown below as an example. They do not even remotely resemble the actual distributions.
Divisive Data Resorting test
The deterministic component of the model was the Kolmogorov-Arnold representation
(details of the training of this model can be found in the
published paper):
$$ M(x_1, x_2, x_3, ... , x_n) = \sum_{q=0}^{2n} \Phi_q\left(\sum_{p=1}^{n} \phi_{q,p}(x_{p})\right).$$
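A minimal sketch of the structure of this representation (not the training procedure from the paper; the piecewise-linear parameterization of the functions and the squashing of the inner sums are our assumptions for illustration):

```python
import numpy as np

class KolmogorovArnoldModel:
    """2n+1 outer functions Phi_q, each applied to a sum of n inner
    functions phi_{q,p}(x_p). Both families are modeled here as
    piecewise-linear interpolants over knot values on a fixed grid."""

    def __init__(self, n_inputs, n_knots=16, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.n = n_inputs
        self.grid = np.linspace(0.0, 1.0, n_knots)
        q = 2 * n_inputs + 1
        # knot values play the role of trainable parameters
        self.inner = rng.normal(size=(q, n_inputs, n_knots))
        self.outer = rng.normal(size=(q, n_knots))

    def __call__(self, x):
        x = np.asarray(x, dtype=float)
        total = 0.0
        for q in range(2 * self.n + 1):
            # inner sum over p of phi_{q,p}(x_p)
            s = sum(np.interp(x[p], self.grid, self.inner[q, p])
                    for p in range(self.n))
            # squash the inner sum back into [0, 1] before the outer lookup
            # (an assumption of this sketch, not part of the formula)
            t = 1.0 / (1.0 + np.exp(-s))
            total += np.interp(t, self.grid, self.outer[q])
        return total

m = KolmogorovArnoldModel(n_inputs=3)
value = m([0.2, 0.5, 0.9])  # a single scalar M(x_1, x_2, x_3)
```

Training would adjust the knot values of the inner and outer functions to minimize the residuals defined below.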
The expectation values are obtained by a single deterministic model $M_E$ by minimization of the residual errors $e_i$
$$e_i = [y_i - M_E(X^i)]^2.$$
Variances were computed, also by a single deterministic model $M_V$, for the new output values $v_i$
$$v_i = [y_i - M_E(X^i)]^2,$$
by minimization of the residuals
$$e_i = [v_i - M_V(X^i)]^2.$$
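The two-model scheme above can be sketched as follows. The `linear_fit` helper is ours and stands in for the actual Kolmogorov-Arnold training; any regression routine returning a predictor can be plugged in:

```python
import numpy as np

def fit_two_models(X, y, fit):
    # M_E predicts expectations by minimizing [y_i - M_E(X^i)]^2
    M_E = fit(X, y)
    # squared residuals become the new outputs v_i = [y_i - M_E(X^i)]^2
    v = (y - M_E(X)) ** 2
    # M_V predicts variances by minimizing [v_i - M_V(X^i)]^2
    M_V = fit(X, v)
    return M_E, M_V

def linear_fit(X, t, lam=1e-6):
    """Stand-in regressor: ridge-regularized linear least squares."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ t)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w

# toy data with input-dependent noise, for demonstration only
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))
y = X.sum(1) + rng.normal(scale=0.1 + X[:, 0], size=500)
M_E, M_V = fit_two_models(X, y, linear_fit)
```

The point of the scheme is that two purely deterministic regressions recover both the expectation and the variance surfaces, giving a baseline against which the BNN is compared.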
The KL divergence was estimated by the DDR model with a sliding window (details in the archive article).
The source code can be found in the
author's repository. The test results are shown in the table below.
Test results for 8 executions of the DDR code:

Run            1     2     3     4     5     6     7     8
Expectation    0.98  0.98  0.99  0.99  0.99  0.99  0.99  0.99
Variance       0.97  0.97  0.97  0.96  0.97  0.96  0.97  0.98
KL divergence  0.09  0.10  0.11  0.10  0.12  0.08  0.09  0.10
Conclusion
The samples returned by the BNN accurately reproduce only the expectations. The variances are even less accurate than those obtained by the
two deterministic models. The values in the returned samples do not reproduce the actual distributions even approximately.
We have to repeat at this point that the data is specifically designed to make obtaining accurate results nearly impossible. Most datasets
are not that challenging: the outputs usually have unimodal, bell-shaped, near-symmetric distributions, and if we pass such
data to a BNN, the results for both expectations and variances will be near 99% accurate. But the same is true for the two deterministic
models.