DDR vs. Bagging
In this experiment we compared a deep ensemble built by bagging with one built by DDR.
In the original article the deep ensemble is built from neural networks, whereas our DDR tests are built for the Kolmogorov-Arnold model.
Since the latter showed the same or better accuracy in all test cases, we use the Kolmogorov-Arnold model for both tests.
We built 8 models by bagging and 8 models by DDR, then we generated 100 inputs, passed them to both ensembles
and compared the accuracy.
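The bagging half of this setup can be sketched as follows. This is only an illustration under stated assumptions: the one-variable straight-line fit (`fit_line`) is a hypothetical stand-in for the Kolmogorov-Arnold model, and the names `bagging_ensemble` and `predict_all` are invented for this example. Each of the 8 models is trained on a bootstrap resample (drawn with replacement) of the same data set, and one input then yields an array of 8 estimated outputs.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in for the real model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def bagging_ensemble(xs, ys, n_models=8, rng=None):
    """Train n_models models, each on a bootstrap resample of (xs, ys)."""
    rng = rng or random.Random()
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n record indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_all(models, x):
    """Return one estimated output per ensemble member for a single input x."""
    return [a * x + b for a, b in models]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data for illustration only, not the data set used in the article.
    xs = [rng.random() for _ in range(500)]
    ys = [2.0 * x + rng.uniform(-0.4, 0.4) for x in xs]
    models = bagging_ensemble(xs, ys, n_models=8, rng=rng)
    print(predict_all(models, 0.5))
```

The spread of the 8 outputs here reflects only how much the fitted model changes between resamples, which is a different quantity from the input-dependent noise in the data.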
The data were generated by Mike's formula
with error level $\delta = 0.8$, where $[X_j, y]$ are observed and $C_j$ are random values uniformly distributed on $[0,1]$.
The observed values were also randomly generated, uniformly distributed on $[0,1]$.
The data set size was $10\,000$ records. When a single deterministic model is built for the entire data set, the Pearson correlation
coefficient between its estimated outputs and the provided outputs is approximately $0.75$, which is typical of real-life cases.
In the accuracy test we generated 100 random inputs, passed them to the models and obtained 3 samples per input:
- Array of 8 estimated outputs for the bagging model
- Array of 8 estimated outputs for the DDR model
- Array of 1024 possible Monte Carlo generated outputs
The Monte Carlo sample is considered the TRUE output, to which the other two are compared. The first metric was the Pearson correlation coefficient,
applied to expectations and standard deviations. The other metric was the Kolmogorov-Smirnov goodness-of-fit test, which shows
whether two samples are drawn from the same population. The results are shown in the program printout below:
The accuracy for the expectation is very high, $0.99$ for both bagging and DDR. The standard deviation, however, is not accurately estimated
by bagging: its correlation is $0.29$, whereas DDR gives $0.96$. The Kolmogorov-Smirnov test also passed for DDR in $84\%$ of cases, but for bagging
in only $53\%$ of cases.
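The metrics used above can be sketched in a few lines. This is a minimal illustration, not the program that produced the reported numbers: the function names are invented for this example, the synthetic Gaussian data in the demo are an assumption, and so is the pass criterion (the asymptotic two-sample critical value at the 5% level, with coefficient 1.358), since the article does not state which significance level or implementation was used. That asymptotic value is only a rough approximation for a sample as small as 8.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two ECDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the ECDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def evaluate(ensemble_samples, mc_samples, c_alpha=1.358):
    """Return (correlation of means, correlation of std devs, KS pass fraction).

    c_alpha=1.358 is the asymptotic 5% coefficient (an assumption, see lead-in).
    """
    ens_mean = [statistics.fmean(s) for s in ensemble_samples]
    mc_mean = [statistics.fmean(s) for s in mc_samples]
    ens_std = [statistics.stdev(s) for s in ensemble_samples]
    mc_std = [statistics.stdev(s) for s in mc_samples]
    passed = 0
    for e, m in zip(ensemble_samples, mc_samples):
        n, k = len(e), len(m)
        critical = c_alpha * math.sqrt((n + k) / (n * k))
        if ks_statistic(e, m) <= critical:
            passed += 1
    return (pearson(ens_mean, mc_mean),
            pearson(ens_std, mc_std),
            passed / len(mc_samples))


if __name__ == "__main__":
    rng = random.Random(0)
    ensemble, monte_carlo = [], []
    for _ in range(100):                      # 100 random inputs
        x = rng.random()
        mu, sigma = x, 0.1 + 0.5 * x          # synthetic input-dependent distribution
        monte_carlo.append([rng.gauss(mu, sigma) for _ in range(1024)])
        ensemble.append([rng.gauss(mu, sigma) for _ in range(8)])
    print(evaluate(ensemble, monte_carlo))
```

In the demo the 8-value sample is drawn from the same distribution as the Monte Carlo sample, so it plays the role of an idealized ensemble; replacing it with real model outputs gives the comparison described in the text.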
The general conclusion is that bagging does not return accurate samples that allow judgements about
input-dependent distributions, and
can be used only for estimating expectations.