First failed benchmark

Some BNN benchmark examples are applied to so-called trivial data. By trivial we mean the elementary and most common case, in which the observed outputs (targets) have a bell-shaped distribution density. Most publicly available data sets have this sort of distribution, and building stochastic models for such data is elementary, not challenging, and can be accomplished in many different ways, including making ensembles by simple random initialization. The claimed advantage of a BNN is the recognition of non-trivial distributions of the outputs. In this experiment I simply took a popular published BNN example and changed the data from a trivial bell-shaped distribution to a different one.

Benchmark code and benchmark data

Here we consider a benchmark example from a site specifically designed for and dedicated to probabilistic models. Example: Bayesian Neural Network.

The training data is a stochastic function of a single argument, $y = F(x)$. The training set has 100 pairs, shown in the image below by 'x':

When the library code is executed, it returns the expectation model, shown as a solid blue line, and a light blue area covering the 90% confidence interval. It also returns a reference to an object that generates Markov Chain Monte Carlo (MCMC) samples of the output for any provided input $x$. I generated multiple sample sets of 2000 values each, passed them to a histogram-building method, and obtained distributions that matched the distribution of the outputs perfectly; one of them is shown below:

At this point I have to add that the outputs were truly generated from a normal distribution, as can be seen in the code, and the result perfectly matches the actual distribution. Here we say 'good job' to the developers of the software and try our own data.
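The histogram check described above can be sketched in a few lines. This is only an illustration under stated assumptions: `samples` is a stand-in drawn with NumPy for the 2000 values returned by the sampler at one input, and `mu` and `sigma` are hypothetical values, not the ones used in the published example.

```python
import numpy as np

# Stand-in for the 2000 values the sampler returns at one input x (assumption)
rng = np.random.default_rng(0)
mu, sigma = 0.5, 0.1  # hypothetical mean and noise level
samples = rng.normal(mu, sigma, size=2000)

# Histogram normalized so that its total area equals 1
density, edges = np.histogram(samples, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Normal density with the same mean and standard deviation, evaluated at bin centers
reference = np.exp(-0.5 * ((centers - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Mean absolute gap between the histogram and the reference density
gap = np.mean(np.abs(density - reference))
print(f"mean absolute gap: {gap:.3f}")
```

A small gap relative to the peak of `reference` is what a "perfect match" looks like numerically; with real sampler output, `samples` would be replaced by the generated values for the chosen input.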

Moving from the ideal world into the real one

Now we try this NumPyro example on data that the developers of the code have not seen. This is just a real-life situation, not an attack. I only split the data randomly into two blocks, changing the distribution from unimodal to multimodal:

The expectation and confidence interval look fine, but the histogram is disappointing. It shows a unimodal distribution for every point I tried. It appears that mathematical software for which the authors introduced the special term 'probabilistic programming' did not recognize a basic property of the data that can be seen with the naked eye.
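To make the modification and the check concrete, here is a minimal sketch. It assumes that the random split shifts one of the two blocks by a constant `offset`; the exact change I made is in the commented code in the repository and may differ in detail. The helpers `make_bimodal` and `count_modes` are hypothetical names introduced for this illustration, and the mode counter is deliberately crude: it counts contiguous runs of histogram bins that rise above a fraction of the tallest bin.

```python
import numpy as np

def make_bimodal(y, offset, rng):
    """Randomly assign each record to one of two blocks and shift one block."""
    in_shifted_block = rng.random(y.shape[0]) < 0.5
    return np.where(in_shifted_block, y + offset, y), in_shifted_block

def count_modes(samples, bins=20, rel_threshold=0.2):
    """Count contiguous runs of bins taller than a fraction of the tallest bin."""
    counts, _ = np.histogram(samples, bins=bins)
    above = counts > rel_threshold * counts.max()
    # A run starts where a bin is above the threshold and its left neighbour is not
    starts = above & ~np.concatenate(([False], above[:-1]))
    return int(starts.sum())

rng = np.random.default_rng(0)

# Hypothetical targets observed near one input x: normal noise around a fixed mean
y = rng.normal(0.0, 0.1, size=2000)
y_bimodal, in_shifted_block = make_bimodal(y, offset=2.0, rng=rng)

print("modes before the split:", count_modes(y))
print("modes after the split: ", count_modes(y_bimodal))
```

Applying `count_modes` to the samples produced by the trained model at a given input is one way to turn the "naked eye" comparison into a number: the modified targets have two well-separated modes, so a model that captures them should not report a single one.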

In case the referenced site moves or removes this example, it can be found in my GitHub repository. The changes that I made to the code are properly commented, so they can be disabled and the code can be executed for both the original and the modified data.