Probayes

ProBT-Engine


ProBT® aims to provide a programming tool that facilitates the creation and reuse of Bayesian models. This reusability enables advanced features such as submodel reuse, learning, and distributed inference.

ProBT-Engine is a set of high-performance algorithm modules developed in C++. The figure below shows the interaction between the different modules of ProBT-Engine.

Learning Algorithms

ProBT-Engine includes two learning modules: parameter learning and structure learning.

The Structure Learning module finds the dependencies between the variables of the model. That is, it derives the Bayesian network, or decomposition, from the provided data. Structure learning is especially useful for building models when little prior knowledge about the dependencies between variables is available. In this case, model construction is data-driven: the dependencies between variables are discovered from the data.
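As an illustration of data-driven structure discovery, the sketch below compares two candidate structures for a pair of binary variables, "A and B independent" versus "A -> B", and keeps the one with the higher BIC score. This is a generic textbook approach written in plain Python, not the ProBT-Engine API or its actual algorithm; the function names, the choice of BIC as the score, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter

def log_likelihood_independent(data):
    """Maximum log-likelihood of (a, b) pairs under the structure P(A) P(B)."""
    n = len(data)
    count_a = Counter(a for a, _ in data)
    count_b = Counter(b for _, b in data)
    return sum(math.log(count_a[a] / n) + math.log(count_b[b] / n)
               for a, b in data)

def log_likelihood_dependent(data):
    """Maximum log-likelihood of (a, b) pairs under the structure P(A) P(B | A)."""
    n = len(data)
    count_a = Counter(a for a, _ in data)
    count_ab = Counter(data)
    return sum(math.log(count_a[a] / n) + math.log(count_ab[(a, b)] / count_a[a])
               for a, b in data)

def bic(log_lik, n_params, n_samples):
    """BIC score: fit minus a penalty that grows with the number of free parameters."""
    return log_lik - 0.5 * n_params * math.log(n_samples)

def best_structure(data):
    """Return the name of the better-scoring structure for binary A and B."""
    n = len(data)
    scores = {
        "A, B independent": bic(log_likelihood_independent(data), 2, n),
        "A -> B": bic(log_likelihood_dependent(data), 3, n),
    }
    return max(scores, key=scores.get)

# Hypothetical data sets: one where B mostly copies A, one with no dependency.
correlated = [(0, 0)] * 40 + [(1, 1)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10
uncorrelated = [(0, 0)] * 25 + [(0, 1)] * 25 + [(1, 0)] * 25 + [(1, 1)] * 25
```

With these toy data sets, the dependent structure is only preferred when the extra parameter buys enough likelihood to offset the penalty. Real structure learning searches over many more candidate graphs than the two compared here.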

For a given structure, the parameters of all (or some) distributions of the model can be learned using the Parameter Learning module. This module can be used with either complete or incomplete data.

Learning from complete data is adaptive, which allows incremental and online fitting of the data. Depending on the optimality criterion used, one obtains either a Maximum Likelihood (ML) estimate or a Maximum A Posteriori (MAP) estimate (also known as Bayesian estimation). A set of classes for learning the most common statistical distributions is implemented for both the ML and MAP criteria.
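To make the difference between the two criteria concrete, here is a minimal sketch of an incremental learner for a single binary variable. It is not a ProBT class: the name BernoulliLearner, its methods, and the Beta(2, 2) default prior are assumptions chosen for illustration. Observations are added one at a time, and both estimates can be read at any point.

```python
class BernoulliLearner:
    """Online estimator of P(X = 1) with a Beta(alpha, beta) prior for MAP."""

    def __init__(self, alpha=2.0, beta=2.0):
        self.alpha = alpha
        self.beta = beta
        self.ones = 0
        self.zeros = 0

    def add(self, x):
        """Incorporate one observation; only two counters are updated."""
        if x:
            self.ones += 1
        else:
            self.zeros += 1

    def ml(self):
        """Maximum Likelihood estimate: the observed frequency of ones."""
        n = self.ones + self.zeros
        if n == 0:
            raise ValueError("ML estimate is undefined without data")
        return self.ones / n

    def map(self):
        """MAP estimate: mode of the Beta(alpha + ones, beta + zeros) posterior.

        Assumes alpha > 1 and beta > 1 so that the mode is well defined.
        """
        n = self.ones + self.zeros
        return (self.ones + self.alpha - 1) / (n + self.alpha + self.beta - 2)

learner = BernoulliLearner()
for observation in [1, 1, 0, 1]:
    learner.add(observation)
```

After these four observations the ML estimate is 3/4, while the MAP estimate is pulled toward the prior and equals 4/6. As more data arrive, the two estimates converge.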

For incomplete data, ProBT-Engine implements a generic Expectation-Maximization (EM) algorithm. Learning from incomplete data is a central issue: some variables of the model may be unobserved, and/or some values may be missing for another set of variables. The ProBT EM algorithm uses the complete-data learning classes above and ProBT’s inference algorithms below to provide a very flexible tool for parameter learning from incomplete data.
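The interplay between inference and complete-data learning inside EM can be sketched on a classic toy problem: several trials of coin tosses, each made with one of two coins of unknown bias, where the identity of the coin is never observed. The code below is a plain Python illustration, not the ProBT EM implementation; the function name, the equal prior on the two coins, the starting values, and the data are assumptions.

```python
def em_two_coins(head_counts, tosses, theta=(0.6, 0.5), iterations=50):
    """Estimate the biases of two coins when the coin used per trial is hidden.

    head_counts: number of heads observed in each trial
    tosses:      number of tosses per trial
    theta:       initial guesses for the two biases
    """
    theta_a, theta_b = theta
    for _ in range(iterations):
        heads_a = tails_a = heads_b = tails_b = 0.0
        for heads in head_counts:
            tails = tosses - heads
            # E-step (inference): posterior probability that coin A produced
            # this trial, assuming both coins are equally likely a priori.
            like_a = theta_a ** heads * (1.0 - theta_a) ** tails
            like_b = theta_b ** heads * (1.0 - theta_b) ** tails
            resp_a = like_a / (like_a + like_b)
            # Accumulate expected (fractional) counts for each coin.
            heads_a += resp_a * heads
            tails_a += resp_a * tails
            heads_b += (1.0 - resp_a) * heads
            tails_b += (1.0 - resp_a) * tails
        # M-step (complete-data ML learning on the expected counts).
        theta_a = heads_a / (heads_a + tails_a)
        theta_b = heads_b / (heads_b + tails_b)
    return theta_a, theta_b

# Hypothetical data: heads observed in five trials of ten tosses each.
estimate_a, estimate_b = em_two_coins([5, 9, 8, 4, 7], tosses=10)
```

The E-step plays the role of the inference algorithms (computing a posterior over the hidden variable), and the M-step is an ordinary complete-data ML update applied to expected counts.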

Inference Algorithms

ProBT-Engine supports both exact and approximate inference.

The Exact Inference module consists of a general-purpose algorithm for computing optimal marginalization sequences. The module can optimize for either the memory size or the time of the computation.
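Why the marginalization sequence matters can be shown with a small cost model. The sketch below simulates summing out variables from a set of factors and reports the total size of the intermediate tables (a proxy for time) and the size of the largest one (a proxy for memory). It then picks the best order by brute force. This is an illustrative simplification, not the algorithm used by ProBT-Engine; the function names, the cost model, and the example network are assumptions.

```python
from itertools import permutations
from math import prod

def elimination_cost(factors, cardinality, order):
    """Return (time, memory) for summing out variables in the given order.

    factors:     iterable of variable-name collections, one per factor
    cardinality: dict mapping each variable to its number of states
    time:        total number of entries over all intermediate tables
    memory:      number of entries in the largest intermediate table
    """
    factors = [frozenset(f) for f in factors]
    time = memory = 0
    for var in order:
        touched = [f for f in factors if var in f]
        scope = frozenset().union(*touched)
        size = prod(cardinality[v] for v in scope)
        time += size
        memory = max(memory, size)
        # Replace the touched factors by the new factor over the remaining scope.
        factors = [f for f in factors if var not in f] + [scope - {var}]
    return time, memory

def best_order(factors, cardinality, to_eliminate, criterion="time"):
    """Exhaustively search for the cheapest order (small problems only)."""
    index = 0 if criterion == "time" else 1
    return min(permutations(to_eliminate),
               key=lambda order: elimination_cost(factors, cardinality, order)[index])

# Hypothetical star-shaped model: P(H) P(A|H) P(B|H) P(C|H), query P(C).
factors = [{"H"}, {"A", "H"}, {"B", "H"}, {"C", "H"}]
cardinality = {"H": 10, "A": 2, "B": 2, "C": 2}
hub_first = elimination_cost(factors, cardinality, ("H", "A", "B"))
hub_last = elimination_cost(factors, cardinality, ("A", "B", "H"))
```

In this example, summing out the hub variable H first creates a table over all four variables, whereas summing it out last keeps every intermediate table small. Exhaustive search is only feasible for a handful of variables; it is used here purely to keep the sketch short.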

The Approximate Inference module is intended for problems where exact inference is intractable. For complicated real-world applications, exact calculation can seldom satisfy the time and memory constraints. This is especially true for probabilistic problems that involve a large number of variables and/or dependencies, and/or variables that take values in a huge set of states (infinite for continuous variables). In such cases, approximation methods must be used. This is known as the “approximate Bayesian inference” problem.

The approximate inference module provides four schemes (levels) of approximation:

- Estimation of integrals (sums).
- Sampling of a posteriori distributions using MCMC methods to generate a sample set of N points.
- Maximization of a posteriori distributions to obtain the Maximum A Posteriori (MAP) solution.
- Low-memory-cost, efficient numerical representation of distributions, obtained by selectively visiting high-probability regions of the target space and interpolating over the non-visited regions.
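The first three schemes can be illustrated together with a short random-walk Metropolis sampler, one of the simplest MCMC methods. The sketch draws N points from an unnormalized posterior, then uses them to estimate an integral (the posterior mean) and to approximate the MAP solution by the best sample visited. It is a generic example in plain Python, not ProBT-Engine code; the function names, the Gaussian target, the step size, and the burn-in length are assumptions.

```python
import math
import random

def metropolis(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional unnormalized density."""
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        candidate = x + rng.gauss(0.0, step)
        log_p_candidate = log_target(candidate)
        # Accept with probability min(1, p(candidate) / p(x)).
        if rng.random() < math.exp(min(0.0, log_p_candidate - log_p)):
            x, log_p = candidate, log_p_candidate
        samples.append(x)
    return samples

def log_posterior(x):
    """Hypothetical target: unnormalized log-density of a Gaussian, mean 2, sd 1."""
    return -0.5 * (x - 2.0) ** 2

samples = metropolis(log_posterior, start=0.0, n_samples=20000)
kept = samples[2000:]                            # discard an assumed burn-in period
mean_estimate = sum(kept) / len(kept)            # Monte Carlo estimate of E[x]
map_estimate = max(kept, key=log_posterior)      # best sample as a crude MAP
```

Only the unnormalized density is needed, which is why this family of methods applies when the normalizing integral itself is intractable. The accuracy of both estimates depends on the number of samples and on how well the chain mixes.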

The current version of ProBT-Engine offers a set of approximation methods for each level, so that the time and memory constraints of an application can be taken into account in a simple way.