# Beta Distribution as Fundamental Bayesian Network Node

The beta distribution is closely related to the gamma distribution. Let X be drawn from Gamma(α, 1) and Y, independently, from Gamma(β, 1), where the first parameter is the shape and the second is the scale. Then Z = X/(X+Y) has distribution Beta(α, β). With this transformation, one beta sample costs two gamma samples, so it should only take about twice as much time as your gamma distribution test.
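The transformation can be sketched in a few lines of Python. This is a minimal illustration, not the code used in these notes: the function name `sample_beta_via_gammas` is hypothetical, and the gamma draws come from the standard library's `random.gammavariate(shape, scale)` with the scale fixed at 1.

```python
import random

def sample_beta_via_gammas(alpha, beta, rng):
    """Draw one Beta(alpha, beta) variate as X / (X + Y)."""
    # X and Y are independent gamma variates with shapes alpha and beta
    # and a common scale of 1 (hypothetical helper, for illustration only).
    x = rng.gammavariate(alpha, 1.0)
    y = rng.gammavariate(beta, 1.0)
    return x / (x + y)

rng = random.Random(0)
draws = [sample_beta_via_gammas(2.0, 5.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

# The sample mean should sit close to the theoretical mean alpha / (alpha + beta).
print(sample_mean, 2.0 / 7.0)
```

Each beta draw uses exactly two gamma draws, which is where the "twice as much time" estimate comes from.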

Higher α and β values take longer to process (O(α + β)), but they also mean lower variance. Thus, once α and β get really high, it can perhaps be assumed that the distribution is better off being represented using Boolean logic.
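One sampler with exactly this O(α + β) cost, assumed here for illustration rather than confirmed as the method behind these notes, builds each gamma variate for an integer shape k as the sum of k independent Exp(1) draws. The helper names below are hypothetical.

```python
import math
import random

def gamma_from_exponentials(shape, rng):
    """Gamma(shape, 1) for a positive integer shape, as a sum of Exp(1) draws."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return sum(-math.log(1.0 - rng.random()) for _ in range(shape))

def beta_from_exponentials(alpha, beta, rng):
    """Beta(alpha, beta) for integer parameters, using alpha + beta exponentials."""
    x = gamma_from_exponentials(alpha, rng)
    y = gamma_from_exponentials(beta, rng)
    return x / (x + y)

rng = random.Random(1)
samples = [beta_from_exponentials(3, 2, rng) for _ in range(5000)]
mean_estimate = sum(samples) / len(samples)

# Compare with the theoretical mean 3 / (3 + 2).
print(mean_estimate, 3 / 5)
```

The work per beta draw is α + β exponential draws, so the cost grows linearly while the variance of the result shrinks.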

The problem space involves 2 parameters for each node plus a Boolean value for each possible arc. The number of arc combinations given n nodes is (n^2 + n)/2 - n + 1 = (n^2 + n - 2n + 2)/2 = (n^2 - n)/2 + 1. If the maximum value for the beta parameters α and β is m, the size of the problem space is (m^2)[(n^2 - n)/2 + 1] = (m^2)(n^2 - n)/2 + (m^2).
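As a quick arithmetic check, the closing expression can be evaluated directly. This sketch only computes the formula as written above; the function names are hypothetical.

```python
def possible_arcs(n):
    """Number of unordered node pairs among n nodes: (n^2 - n) / 2."""
    return (n * n - n) // 2

def problem_space_size(n, m):
    """Evaluate m^2 * ((n^2 - n)/2 + 1) from the text."""
    return m * m * (possible_arcs(n) + 1)

# Example: 4 nodes with beta parameters capped at 10.
size_example = problem_space_size(4, 10)
print(possible_arcs(4), size_example)
```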

Problem space complexity given n nodes and max value m for beta parameters: O(m^2 n^2).

Beta(30000, 30000) has a variance of ~1/240000 and a standard deviation of ~0.00204. If that distribution is rescaled to range from 0 to 100, the standard deviation becomes ~0.204, so 50 ± 1 spans about ±4.9 standard deviations. There is therefore roughly a one in a million chance that a value falls outside the range 50 ± 1 (using the rules for normally distributed data).
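These figures can be checked numerically. The sketch below uses the closed-form beta variance αβ / ((α+β)^2 (α+β+1)) and, as in the text, a normal approximation for the tail probability; the variable names are illustrative.

```python
import math

def beta_variance(alpha, beta):
    """Variance of Beta(alpha, beta)."""
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

variance = beta_variance(30000, 30000)
std_dev = math.sqrt(variance)

# Rescale from [0, 1] to [0, 100]: the standard deviation scales by 100.
scaled_std = 100 * std_dev

# Number of standard deviations covered by a half-width of 1 around 50.
z = 1.0 / scaled_std

# Two-sided normal tail probability P(|Z| > z) = erfc(z / sqrt(2)).
tail_prob = math.erfc(z / math.sqrt(2))

print(1 / variance, std_dev, z, tail_prob)
```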

If logical inference is essentially searching a tree, would it be beneficial to recompile the tree regularly to ensure the path between any two points on the tree is minimized?

It is possible to entirely avoid function symbols and constant symbols, rewriting them via predicate symbols in an appropriate way. For example, instead of using a constant symbol 0 one may use a predicate 0(x) (interpreted as x = 0), and replace every predicate such as P(0,y) with ∀x (0(x) → P(x,y)). A function such as f(x_1,x_2,…,x_n) will similarly be replaced by a predicate F(x_1,x_2,…,x_n,y) interpreted as y = f(x_1,x_2,…,x_n). This change requires adding additional axioms to the theory at hand, so that interpretations of the predicate symbols used have the correct semantics.


Assuming an uninformed prior (Beta(1, 1)):
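As a minimal sketch of what this prior implies, assuming Bernoulli observations (an assumption, since the observation model is not stated here): after s successes and f failures, Beta(1, 1) updates to Beta(1 + s, 1 + f), with posterior mean (s + 1)/(s + f + 2). The function names are hypothetical.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update of a Beta(alpha, beta) prior with Bernoulli counts."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of Beta(alpha, beta)."""
    return alpha / (alpha + beta)

# Start from the uniform prior Beta(1, 1) and observe 7 successes, 3 failures.
posterior = update_beta(1, 1, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)

print(posterior, posterior_mean)
```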


Assuming an uninformed prior (Dir(2/K, 2/K, …, 2/K)):

for large K and only one hyperparameter being updated:

for large K, only one hyperparameter being updated, and that one hyperparameter is large:

What if I defined Bayesian networks using only Bernoulli distributions? What expressive power would I lose? What would be the computational cost? Nodes would have to be deterministic. Subjective logic would be essential. There would be binomial distributions and beta priors. When a beta prior reaches near certainty of 1, the propositional clause could be promoted. A set of propositional clauses of the same "pattern" could be promoted to a predicate clause, where subjective logic is applied to the clause and to whether observations show the clause to be predominantly true.
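The promotion idea can be sketched as a simple threshold test on the beta mean. Everything here is an assumption made for illustration: the function name `should_promote`, the use of the mean as the certainty measure, and the cutoff of 0.999 do not come from these notes.

```python
def should_promote(alpha, beta, threshold=0.999):
    """Return True when the mean of Beta(alpha, beta) reaches the threshold."""
    # The threshold is a hypothetical cutoff for "near certainty of 1".
    return alpha / (alpha + beta) >= threshold

uniform_prior = should_promote(1, 1)          # mean 0.5
moderate_evidence = should_promote(99, 1)     # mean 0.99
strong_evidence = should_promote(30000, 1)    # mean 30000/30001

print(uniform_prior, moderate_evidence, strong_evidence)
```

A fuller version would probably also look at the variance, so that a clause is promoted only when the estimate is both high and tightly concentrated.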

Is linear regression over a beta distribution done using logistic regression?

## OpenCL Code

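BetaNet.cl

    /*
     * BayesNetUtilityKernels.cl
     *
     *  Created on: Feb 4, 2013
     *      Author: scannon
     */

    #ifndef BayesNetUtilityKernels_CL
    #define BayesNetUtilityKernels_CL

    // RANLUXCL must be visible to this file: either uncomment the include
    // or have the host prepend ranluxcl.cl to the program source.
    //#include "ranluxcl.cl"

    // Seed one RANLUXCL generator state per work-item.
    __kernel void Kernel_Ranluxcl_Init(private uint ins,
                                       __global ranluxcl_state_t *ranluxcltab)
    {
        ranluxcl_initialization(ins, ranluxcltab);
    }

    // Draw four Exp(1) samples per work-item by inverse-transform sampling.
    // NOTE: the original kernel body is cut off after the state declaration;
    // everything from the download call onward is an assumed completion.
    __kernel void ExpSample_kernel(__global ranluxcl_state_t *ranluxcltab,
                                   __global float4 *result)
    {
        int gid = get_global_id(0);
        ranluxcl_state_t ranluxclstate;

        // Copy this work-item's generator state into private memory.
        ranluxcl_download_seed(&ranluxclstate, ranluxcltab);

        // ranluxcl32 returns four uniforms on the open interval (0,1).
        float4 u = ranluxcl32(&ranluxclstate);
        result[gid] = -log(u);

        // Store the advanced state so later launches continue the stream.
        ranluxcl_upload_seed(&ranluxclstate, ranluxcltab);
    }

    #endif	// BayesNetUtilityKernels_CL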