Beta Distribution as Fundamental Bayesian Network Node

(add, sub, mul, div, cnd

http://editthis.info/logic/The_Laws_of_Classical_Logic

http://en.wikipedia.org/wiki/List_of_axioms

http://en.wikipedia.org/wiki/First-order_logic

The beta distribution is related to the gamma distribution. Let X be a random number drawn from Gamma(α, 1) and Y, independently, from Gamma(β, 1) (shape α or β, scale 1). Then Z = X/(X+Y) has distribution Beta(α, β). With this transformation, one beta sample should cost only about twice as much as one gamma sample.
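A minimal sketch of the gamma-ratio construction above, assuming Python's standard `random.gammavariate(shape, scale)` as the gamma sampler. The name `sample_beta` is illustrative and does not come from these notes.

```python
import random

def sample_beta(alpha, beta, rng=random):
    """Draw one Beta(alpha, beta) variate as X / (X + Y).

    X ~ Gamma(shape=alpha, scale=1) and Y ~ Gamma(shape=beta, scale=1),
    drawn independently.
    """
    x = rng.gammavariate(alpha, 1.0)
    y = rng.gammavariate(beta, 1.0)
    return x / (x + y)

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_beta(2.0, 5.0, rng) for _ in range(100_000)]
    # The sample mean should land close to alpha / (alpha + beta) = 2/7.
    print(sum(draws) / len(draws))
```

Each call makes exactly two gamma draws, which is where the "twice as much time" estimate comes from.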

http://www.wikicoursenote.com/wiki/Acceptance-Rejection_Sampling#Special_Technique_for_sampling_from_Gamma_Distribution

http://en.wikipedia.org/wiki/Beta_distribution#Generating_beta-distributed_random_variates

http://www.aiaccess.net/English/Glossaries/GlosMod/e_gm_beta_distri.htm#Animation_BetaDistri

Higher α and β values take longer to process (O(α + β)) and also give lower variance. Thus, once α and β get really high, the distribution is perhaps better off being represented using Boolean logic.
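One way an O(α + β) cost can arise, assuming integer α and β: a Gamma(k, 1) variate with integer shape k is the sum of k independent Exp(1) variates, so one beta sample needs α + β exponential draws. The sketch below illustrates that assumption only. The function names are hypothetical, and this is not necessarily how the kernel in these notes is meant to be used.

```python
import math
import random

def sample_exponential(rng):
    """Exp(1) by inversion: -log(U) with U uniform on (0, 1]."""
    return -math.log(1.0 - rng.random())

def sample_gamma_int(k, rng):
    """Gamma(shape=k, scale=1) for a positive integer k: sum of k Exp(1) draws."""
    return sum(sample_exponential(rng) for _ in range(k))

def sample_beta_int(alpha, beta, rng):
    """Beta(alpha, beta) for positive integers, using alpha + beta exponential draws."""
    x = sample_gamma_int(alpha, rng)
    y = sample_gamma_int(beta, rng)
    return x / (x + y)

if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_beta_int(3, 4, rng))
```

The work per sample grows linearly with α + β, while the variance of the result shrinks, which matches the trade-off described above.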

Problem space complexity involves 2 parameters for each node plus a Boolean value for each possible arc. The number of arc combinations given n nodes is (n^2 + n)/2 - n + 1 = (n^2 + n - 2n + 2)/2 = (n^2 - n)/2 + 1. If the maximum value for the beta parameters α and β is m, the size of the problem space is (m^2)[(n^2 - n)/2 + 1] = (m^2)(n^2 - n)/2 + m^2.

Problem space complexity given n nodes and max value m for beta parameters: O(m^2 n^2)
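The closed form above can be evaluated directly. This is a small sketch that takes the formula (m^2)[(n^2 - n)/2 + 1] as written in these notes; the function name is hypothetical.

```python
def problem_space_size(n, m):
    """Size estimate from the notes: m^2 * ((n^2 - n) / 2 + 1)."""
    arc_term = (n * n - n) // 2 + 1  # n^2 - n is always even, so // is exact
    return m * m * arc_term

if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, problem_space_size(n, 10))
```

Doubling n roughly quadruples the result, consistent with the O(m^2 n^2) bound.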

Beta(30000, 30000) has a variance of ~1/240000 and a standard deviation of ~0.00204. If that distribution is rescaled to the range 0 to 100, the standard deviation is ~0.204, so 50 +/- 1 spans about 4.9 standard deviations on each side, and there is roughly a one in a million chance that a value falls outside that range (using the rules for normally distributed data).
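The figures above can be reproduced with a short calculation. This sketch assumes the normal approximation to the beta distribution, as the note does; the helper names are illustrative.

```python
import math

def beta_variance(alpha, beta):
    """Variance of Beta(alpha, beta): ab / ((a + b)^2 (a + b + 1))."""
    s = alpha + beta
    return alpha * beta / (s * s * (s + 1))

def two_sided_normal_tail(z):
    """P(|Z| > z) for a standard normal Z."""
    return math.erfc(z / math.sqrt(2.0))

sd = math.sqrt(beta_variance(30000, 30000))  # about 0.00204 on the 0..1 scale
z = 1.0 / (100.0 * sd)                       # half-width 1 on the 0..100 scale, in sd units
p_outside = two_sided_normal_tail(z)         # on the order of 1e-6
print(sd, z, p_outside)
```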

If logical inference is essentially searching a tree, would it be beneficial to recompile the tree regularly to ensure the path between any two points on the tree is minimized?

It is possible to entirely avoid function symbols and constant symbols, rewriting them via predicate symbols in an appropriate way. For example, instead of using a constant symbol $0$ one may use a predicate $0(x)$ (interpreted as $x = 0$), and replace every predicate such as $P(0,y)$ with $\forall x\,(0(x) \rightarrow P(x,y))$. A function such as $f(x_1, x_2, \ldots, x_n)$ will similarly be replaced by a predicate $F(x_1, x_2, \ldots, x_n, y)$ interpreted as $y = f(x_1, x_2, \ldots, x_n)$. This change requires adding additional axioms to the theory at hand, so that interpretations of the predicate symbols used have the correct semantics.
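As an illustration of the "additional axioms" mentioned above (standard first-order logic, not taken from these notes): for F to behave like a function, every argument tuple must have exactly one value, and the predicate standing in for the constant 0 must hold for exactly one element.

```latex
\forall x_1 \ldots \forall x_n \; \exists! y \; F(x_1, \ldots, x_n, y)
\qquad
\exists! x \; 0(x)
```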

http://en.wikipedia.org/wiki/First-order_logic#Restricted_languages

http://math.stackexchange.com/questions/125818/first-order-logic-why-do-we-need-function-symbols

http://en.wikipedia.org/wiki/Subjective_logic

Mapping between Beta(x, y) and a subjective-logic opinion (belief b, disbelief d, uncertainty u, base rate a):

x = 2b/u + 2a

y = 2d/u + 2(1 - a) = 2d/u - 2a + 2

1 = b + d + u

Solving for u:

2a = 2d/u - y + 2

x = 2b/u + 2d/u - y + 2

b = 1 - d - u

x = 2(1 - d - u)/u + 2d/u - y + 2

x = 2/u - 2d/u - 2 + 2d/u - y + 2

x = 2/u - y

ux = 2 - uy

u(x + y) = 2

:. u = 2/(x + y)

x = 2b/u + 2a

x = b(x + y) + 2a

:. b = (x - 2a)/(x + y) = x/(x + y) - ua

Assuming an uninformed prior (Beta(1, 1), i.e. a = 1/2): b = (x - 1)/(x + y)

y = d(x + y) + 2(1 - a)

d(x + y) = y - 2(1 - a)

:. d = (y - 2(1 - a))/(x + y) = y/(x + y) - u(1 - a)

Assuming an uninformed prior (Beta(1, 1), i.e. a = 1/2): d = (y - 1)/(x + y)
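The results of the derivation above, written out as a pair of conversion functions. This is a sketch under the same definitions (x and y are the Beta parameters, a is the base rate); the function names are hypothetical.

```python
def beta_to_opinion(x, y, a=0.5):
    """Map Beta(x, y) to (b, d, u).

    Uses u = 2/(x + y), b = (x - 2a)/(x + y), d = (y - 2(1 - a))/(x + y).
    b and d are non-negative only when x >= 2a and y >= 2(1 - a).
    """
    total = x + y
    u = 2.0 / total
    b = (x - 2.0 * a) / total
    d = (y - 2.0 * (1.0 - a)) / total
    return b, d, u

def opinion_to_beta(b, d, u, a=0.5):
    """Inverse map: x = 2b/u + 2a, y = 2d/u + 2(1 - a). Requires u > 0."""
    x = 2.0 * b / u + 2.0 * a
    y = 2.0 * d / u + 2.0 * (1.0 - a)
    return x, y

if __name__ == "__main__":
    b, d, u = beta_to_opinion(8.0, 2.0)
    print(b, d, u, b + d + u)
    print(opinion_to_beta(b, d, u))
```

With a = 1/2 the first function reduces to b = (x - 1)/(x + y) and d = (y - 1)/(x + y), and b + d + u always sums to 1.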

Assuming an uninformed prior (Dir(2/K, 2/K, …, 2/K)):

$b_i = \frac{\alpha_i}{2 + \sum_j \alpha_j}$

$u = \frac{2}{2 + \sum_j \alpha_j}$

$\sum_j \alpha_j = \frac{2}{u} - 2 = \frac{\alpha_i}{b_i} - 2$

$\mathrm{Var}[X_i] = \frac{\alpha_i \left(\sum_j \alpha_j - \alpha_i\right)}{\left(\sum_j \alpha_j\right)^2 \left(\sum_j \alpha_j + 1\right)}$

For large K and only one hyperparameter being updated (so the Dirichlet parameters sum to about $\alpha_i + 2$, and the remaining mass $\sum_j \alpha_j - \alpha_i$ is about 2):

$\mathrm{Var}[X_i] \approx \frac{2\alpha_i}{(\alpha_i + 2)^2 (\alpha_i + 3)}$

For large K, only one hyperparameter being updated, and that one hyperparameter large:

$\mathrm{Var}[X_i] \approx \frac{2\alpha_i}{\alpha_i^3} = \frac{2}{\alpha_i^2}$
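A small sketch for checking these Dirichlet quantities numerically. It assumes one particular reading of the notes: the values updated by observation are evidence counts r_i, and the Dirichlet parameters are r_i + 2/K. The function names and that split are assumptions, not something the notes state.

```python
def multinomial_opinion(evidence):
    """Beliefs b_i = r_i / (2 + sum r) and uncertainty u = 2 / (2 + sum r)."""
    total = 2.0 + sum(evidence)
    return [r / total for r in evidence], 2.0 / total

def dirichlet_variance(params, i):
    """Var[X_i] = a_i (a_0 - a_i) / (a_0^2 (a_0 + 1)), with a_0 = sum(params)."""
    a0 = sum(params)
    ai = params[i]
    return ai * (a0 - ai) / (a0 * a0 * (a0 + 1.0))

if __name__ == "__main__":
    K, r = 1000, 500.0
    evidence = [r] + [0.0] * (K - 1)            # only one category is updated
    params = [e + 2.0 / K for e in evidence]    # prior Dir(2/K, ..., 2/K) plus evidence
    beliefs, u = multinomial_opinion(evidence)
    print(beliefs[0], u, dirichlet_variance(params, 0))
```

Printing the exact variance for a few values of K and r gives a direct check on any large-K simplification.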

What if I defined Bayesian networks using only Bernoulli distributions? What expressive power would I lose? What would be the computational cost? Nodes would have to be deterministic. Essentially this would be subjective logic: there would be binomial distributions with beta priors. When a beta prior reaches near certainty (probability close to 1), the corresponding propositional clause could be promoted. A set of propositional clauses of the same “pattern” could then be promoted to a predicate clause, where subjective logic is applied to the clause and to whether observations show the clause to be predominantly true.
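A minimal sketch of the idea above: a Bernoulli proposition with a beta prior that is updated by observations and flagged for promotion once it is close to certain. The class name, the thresholds, and the exact promotion rule are hypothetical choices for illustration; the notes do not specify them.

```python
class BetaNode:
    """Bernoulli proposition with a Beta(alpha, beta) prior over its truth probability."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha
        self.beta = beta

    def observe(self, outcome):
        """Conjugate update: a true observation adds 1 to alpha, a false one to beta."""
        if outcome:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        return self.alpha / (self.alpha + self.beta)

    def uncertainty(self):
        """Subjective-logic uncertainty u = 2 / (alpha + beta)."""
        return 2.0 / (self.alpha + self.beta)

    def promotable(self, min_mean=0.999, max_uncertainty=0.001):
        """Hypothetical rule: promote once the node is near-certain and well-evidenced."""
        return self.mean() >= min_mean and self.uncertainty() <= max_uncertainty

if __name__ == "__main__":
    node = BetaNode()
    for _ in range(3000):
        node.observe(True)
    print(node.mean(), node.uncertainty(), node.promotable())
```

Requiring both a high mean and a low uncertainty keeps a node with only a handful of agreeing observations from being promoted too early.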

Could linear regression over a Beta distribution be done using logistic regression?

http://en.wikipedia.org/wiki/Generalized_linear_model

http://en.wikipedia.org/wiki/Logistic_regression

http://en.wikipedia.org/wiki/Peano_axioms

http://www.proofwiki.org/wiki/Peano%27s_Axioms_Uniquely_Define_Natural_Numbers

OpenCL Code

BetaNet.cl
/*
 * BayesNetUtilityKernels.cl
 *
 *  Created on: Feb 4, 2013
 *      Author: scannon
 */
 
#ifndef BayesNetUtilityKernels_CL
#define BayesNetUtilityKernels_CL
 
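// NOTE: the kernels below need the RANLUXCL definitions (ranluxcl_state_t,
// ranluxcl32, ...). Either uncomment this include or prepend ranluxcl.cl
// to the program source on the host side.
//#include "ranluxcl.cl"
 
 
// Initialize the RANLUXCL generator states from the seed value `ins`.
__kernel void Kernel_Ranluxcl_Init(uint ins,
								   __global ranluxcl_state_t *ranluxcltab)
{
	ranluxcl_initialization(ins, ranluxcltab);
}
 
// Write one float4 of exponential samples per work-item, using -log(U).
__kernel void ExpSample_kernel(__global ranluxcl_state_t *ranluxcltab,
							   __global float4 *result)
{
	size_t gid = get_global_id(0);
	ranluxcl_state_t ranluxclstate;
 
	// Download state
	ranluxcl_download_seed(&ranluxclstate, ranluxcltab);
 
	// Calculate samples from the exponential distribution
	result[gid] = -native_log(ranluxcl32(&ranluxclstate));
 
	// Upload state again
	ranluxcl_upload_seed(&ranluxclstate, ranluxcltab);
}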
 
#endif	// BayesNetUtilityKernels_CL

goplayer/betanode.txt · Last modified: 2023/02/24 23:05 (external edit)