Introduction to Bayesian inference and generative models

Author: Hillary Johnson

Introduction to Bayesian inference and generative models Dr. Richard E. Turner ([email protected])

Computational and Biological Learning Lab, Department of Engineering, University of Cambridge
Laboratory for Computational Vision, Center for Neural Science, New York University

Question

• Collected inter-spike interval (ISI) measurements, x, from a neuron
• x follows an exponential distribution with a characteristic time-scale λ, shifted to take account of the absolute refractory period of the neuron, 5ms long
• ISIs over 50ms were not recorded (short trials used for data-collection)
• N ISIs are observed at {x1, ..., xN}. What is λ?

[Figure: observed ISIs against x /ms (0 to 50); the repeated slide adds a vertical axis, count (0 to 4)]

Ideas

Idea 1
• bin up into a histogram
  – where do we place the bins?
• fit to density
  – what error measure do we minimise?

Idea 2
• construct an estimator, e.g. the sample mean $\mu = \frac{1}{N}\sum_{n=1}^{N} x_n$
  – which estimator should we choose? mean, variance, higher moments?
• relate to parameters via expectation of estimator, e.g. $\mu \approx \langle x \rangle = f(\lambda)$
  – small sample effects can be problematic, e.g. if $\mu > \tfrac{1}{2}(50 + 5)$ ms
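Idea 2 can be sketched in a few lines. The code below is an illustration, not the lecturer's method: it assumes the expected ISI of an exponential shifted to 5ms and truncated at 50ms, f(λ) = 5 + λ − 45/(exp(45/λ) − 1), and inverts it by bisection. The names `mean_isi` and `lam_from_mean` and the data in `xs` are hypothetical.

```python
import math

A, B = 5.0, 50.0  # refractory period and recording cut-off, in ms

def mean_isi(lam):
    """Expected ISI f(lam) of an exponential shifted to A and truncated at B."""
    T = B - A
    return A + lam - T / math.expm1(T / lam)

def lam_from_mean(mu, lo=0.1, hi=1e6, iters=200):
    """Solve f(lam) = mu by bisection; None if no lam in [lo, hi] matches."""
    if not (mean_isi(lo) < mu < mean_isi(hi)):
        return None
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_isi(mid) < mu:   # f increases with lam
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xs = [12.0, 7.5, 31.0, 9.2, 18.4]   # hypothetical ISIs, in ms
mu = sum(xs) / len(xs)
print(mu, lam_from_mean(mu))
print(lam_from_mean(30.0))          # None: f(lam) stays below (A + B) / 2
```

The last line shows the small-sample problem: f(λ) never reaches the midpoint ½(50 + 5) = 27.5ms, so a sample mean above it matches no finite λ.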

A less ad hoc method...probabilities as degrees of belief

Cox’s axioms
• degrees of belief can be represented by real numbers
• take into account all evidence
• consistency: if things can be reasoned in more than one way, each must lead to the same answer
• equivalent states of knowledge → equivalent plausibility assignments

Conclusion: degrees of belief must follow the rules of probability.

Product rule: $p(\lambda, x) = p(\lambda \mid x)\,p(x) = p(x \mid \lambda)\,p(\lambda)$ ← Bayes’ Rule
Sum rule: $p(\lambda) = \sum_x p(\lambda, x)$ ← marginalisation
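A small numerical check of the two rules may help. The joint table below is made up for illustration (two coarse values of λ, two coarse values of x); nothing in it comes from the slides.

```python
# Hypothetical joint table p(lam, x) over two values of each variable.
joint = {
    ("short", "early"): 0.30, ("short", "late"): 0.10,
    ("long", "early"): 0.20, ("long", "late"): 0.40,
}

def marginal(table, axis):
    """Sum rule: add up the joint over the other variable."""
    out = {}
    for key, p in table.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

p_lam = marginal(joint, 0)   # p(lam)
p_x = marginal(joint, 1)     # p(x)

def posterior(lam, x):
    """Bayes' rule: p(lam | x) = p(x | lam) p(lam) / p(x)."""
    p_x_given_lam = joint[(lam, x)] / p_lam[lam]
    return p_x_given_lam * p_lam[lam] / p_x[x]

print(p_lam, p_x)
print(posterior("short", "early"))
```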

Mathematical solution

[Equation labels: 5ms and 50ms; area must sum to one; what the data tell us (likelihood of parameters); what we knew beforehand (prior)]
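The formulas themselves do not survive on this slide. One form consistent with the question and with the surviving labels is sketched below. It is an assumption rather than the slide's own notation: the symbol $Z(\lambda)$ for the normalising constant is introduced here, and the observations are taken to be independent.

```latex
p(x \mid \lambda) =
\begin{cases}
\dfrac{1}{Z(\lambda)}\, e^{-x/\lambda}, & 5 \le x \le 50 \ \text{(ms)} \\[4pt]
0, & \text{otherwise}
\end{cases}
\qquad
Z(\lambda) = \int_{5}^{50} e^{-x/\lambda}\, dx
           = \lambda \left( e^{-5/\lambda} - e^{-50/\lambda} \right)

p(\lambda \mid x_1, \dots, x_N) =
\frac{p(\lambda) \prod_{n=1}^{N} p(x_n \mid \lambda)}
     {\int p(\lambda') \prod_{n=1}^{N} p(x_n \mid \lambda')\, d\lambda'}
```

Here $Z(\lambda)$ is the "area must sum to one" term, the product over $p(x_n \mid \lambda)$ is the likelihood, and $p(\lambda)$ is the prior.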

[Figure: density p(x|λ) against x /ms (5 to 50) for λ = 10, 20 and 50; vertical axis 0 to 0.1]
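The plotted curves can be reproduced with a short function. This is a minimal sketch assuming the density is an exponential shifted to 5ms, cut off at 50ms and renormalised; `density` and `area` are illustrative names.

```python
import math

A, B = 5.0, 50.0  # refractory period and recording cut-off, in ms

def density(x, lam):
    """p(x | lam): exponential with time-scale lam, shifted to A, cut at B."""
    if x < A or x > B:
        return 0.0
    norm = lam * -math.expm1(-(B - A) / lam)   # lam * (1 - exp(-(B - A)/lam))
    return math.exp(-(x - A) / lam) / norm

def area(lam, n=20000):
    """Trapezoid-rule check that the density integrates to one."""
    h = (B - A) / n
    ys = [density(A + (B - A) * i / n, lam) for i in range(n + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

for lam in (10.0, 20.0, 50.0):
    print(lam, density(A, lam), area(lam))
```

For λ = 10 the value at x = 5ms is just over 0.1, which agrees with the top of the plotted axis.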

Likelihood of the parameters

[Figure: p(x|λ) as a function of λ /ms (0 to 100), vertical axis 0 to 0.04; curves for x = 15, x = 20 and x = 30 are added one at a time]
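Reading the same density as a function of λ for a fixed observation gives the likelihood. The sketch below evaluates it on a grid; the grid spacing and the helper name `likelihood` are choices made here, and the density is the assumed shifted, truncated exponential.

```python
import math

A, B = 5.0, 50.0  # ms

def density(x, lam):
    """p(x | lam) for the shifted, truncated exponential (assumed form)."""
    if x < A or x > B:
        return 0.0
    norm = lam * -math.expm1(-(B - A) / lam)
    return math.exp(-(x - A) / lam) / norm

def likelihood(xs, lam):
    """Likelihood of lam for independent observations xs."""
    out = 1.0
    for x in xs:
        out *= density(x, lam)
    return out

lams = [0.5 * k for k in range(1, 201)]   # grid: 0.5 ... 100 ms

for x in (15.0, 20.0, 30.0):
    curve = [likelihood([x], lam) for lam in lams]
    best = lams[curve.index(max(curve))]
    print(x, best, max(curve))
```

For x = 30 the curve keeps rising to the edge of the grid: under this density, a single observation above the midpoint 27.5ms favours ever larger λ.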

Posterior distribution: p(λ|x1), p(λ|x1, x2), ..., p(λ|x1, x2, x3, x4, x5)

[Figure: posterior against λ /ms (0 to 100), vertical scale ×10⁻³ (0 to 2.5), redrawn on successive slides as each of the five observations is added]
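The build-up of the posterior as observations arrive can be sketched on a grid. Two things here are assumptions: the flat prior over the grid, and the data (15, 20 and 30 match the likelihood slides; 8 and 12 are invented).

```python
import math

A, B = 5.0, 50.0  # ms

def density(x, lam):
    """p(x | lam) for the shifted, truncated exponential (assumed form)."""
    if x < A or x > B:
        return 0.0
    norm = lam * -math.expm1(-(B - A) / lam)
    return math.exp(-(x - A) / lam) / norm

lams = [0.5 * k for k in range(1, 201)]     # grid over lam: 0.5 ... 100 ms
prior = [1.0 / len(lams)] * len(lams)       # flat prior: an assumption

def update(belief, x):
    """One Bayes step: multiply by the likelihood of x, then renormalise."""
    unnorm = [b * density(x, lam) for b, lam in zip(belief, lams)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

xs = [15.0, 20.0, 30.0, 8.0, 12.0]          # partly hypothetical, in ms
belief = prior
for n, x in enumerate(xs, start=1):
    belief = update(belief, x)
    mode = lams[belief.index(max(belief))]
    print(f"after {n} observation(s): mode at lam = {mode} ms")
```

Each posterior serves as the prior for the next observation, so the order in which the data are processed does not change the final result.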

Summarising the posterior distribution

• maximum a posteriori (MAP) estimate, with error-bars
• Gaussian approximation
• samples from posterior

[Figure: posterior against λ /ms (0 to 100), vertical scale ×10⁻³ (0 to 2.5), shown three times with each summary overlaid in turn]
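The three summaries can be computed from a grid posterior. This is a sketch under the same assumptions as before (assumed density, flat prior, partly invented data). The Gaussian approximation here matches the curvature of the log posterior at the MAP, which is one common choice and not necessarily the one used on the slide.

```python
import math
import random

A, B = 5.0, 50.0  # ms

def log_density(x, lam):
    """log p(x | lam) for the shifted, truncated exponential, A <= x <= B."""
    return -(x - A) / lam - math.log(lam * -math.expm1(-(B - A) / lam))

def log_post(lam, xs):
    """Log posterior up to a constant, under a flat prior (an assumption)."""
    return sum(log_density(x, lam) for x in xs)

xs = [15.0, 20.0, 30.0, 8.0, 12.0]            # partly hypothetical, in ms
lams = [0.1 * k for k in range(10, 1001)]     # grid: 1 ... 100 ms

# Normalised grid posterior (subtract the max before exponentiating).
logs = [log_post(lam, xs) for lam in lams]
top = max(logs)
weights = [math.exp(v - top) for v in logs]
total = sum(weights)
post = [w / total for w in weights]

# 1. MAP estimate: the grid point with the highest posterior.
lam_map = lams[post.index(max(post))]

# 2. Gaussian approximation: width from the curvature at the MAP.
h = 1e-3
curv = (log_post(lam_map + h, xs) - 2 * log_post(lam_map, xs)
        + log_post(lam_map - h, xs)) / h ** 2
sigma = math.sqrt(-1.0 / curv)                # usable as an error-bar

# 3. Samples drawn from the grid posterior.
rng = random.Random(0)
samples = rng.choices(lams, weights=post, k=1000)

print(lam_map, sigma, sum(samples) / len(samples))
```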

Question

• Record inter-spike interval measurements, x
• As before: absolute refractory period of 5ms & ISIs above 50ms not recorded
• We know that if the neuron is...
  – quiescent: x follows an exponential distribution with time-scale λ0 = 25ms
  – bursting: x follows an exponential distribution with time-scale λ1 = 5ms
• You observe a single ISI, x = 15ms. Is the neuron in a bursting state?

Intuition: should be close to 50:50

Mathematical solution

Introduce latent variable b: b=0 not bursting, b=1 bursting

Generative model

Recognition model: inference
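The two directions can be sketched as a pair of functions: the generative model draws a state and then an ISI, and the recognition model inverts it with Bayes' rule. The equal prior over states is an assumption, as are the names `generate` and `recognise` and the form of the density.

```python
import math
import random

A, B = 5.0, 50.0                  # ms
LAMS = {0: 25.0, 1: 5.0}          # b=0 quiescent, b=1 bursting
PRIOR = {0: 0.5, 1: 0.5}          # equal prior: an assumption

def density(x, lam):
    """p(x | lam) for the shifted, truncated exponential (assumed form)."""
    if x < A or x > B:
        return 0.0
    norm = lam * -math.expm1(-(B - A) / lam)
    return math.exp(-(x - A) / lam) / norm

def generate(rng):
    """Generative model: draw the state b, then an ISI x given b."""
    b = 1 if rng.random() < PRIOR[1] else 0
    lam = LAMS[b]
    u = rng.random()
    # Inverse CDF of the shifted, truncated exponential.
    x = A - lam * math.log1p(u * math.expm1(-(B - A) / lam))
    return b, x

def recognise(x):
    """Recognition model: p(b | x) by Bayes' rule."""
    joint = {b: PRIOR[b] * density(x, LAMS[b]) for b in (0, 1)}
    total = sum(joint.values())
    return {b: joint[b] / total for b in joint}

print(generate(random.Random(0)))
print(recognise(15.0))
```

Under these assumptions the posterior for the quiescent state at x = 15ms comes out near 0.54, in line with the 50:50 intuition.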

Graphical solution

[Figure, top panel: p(x|b,λb) against x /ms (5 to 50) for b=0 and b=1, vertical axis 0 to 0.2; crossing point: 13.9. Bottom panel: p(b=0|x) against x /ms (5 to 50), vertical axis 0 to 1, with the value 0.54 marked]
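The two numbers on the plots can be checked numerically. The sketch assumes the shifted, truncated exponential for both states and an equal prior over b; `log_ratio`, `crossing` and `p_quiescent` are illustrative names.

```python
import math

A, B = 5.0, 50.0            # ms
LAM0, LAM1 = 25.0, 5.0      # quiescent (b=0) and bursting (b=1) time-scales

def log_density(x, lam):
    """log p(x | lam) for the shifted, truncated exponential, A <= x <= B."""
    return -(x - A) / lam - math.log(lam * -math.expm1(-(B - A) / lam))

def log_ratio(x):
    """log p(x | b=0) - log p(x | b=1); zero where the two curves cross."""
    return log_density(x, LAM0) - log_density(x, LAM1)

def crossing(lo=A, hi=B, iters=60):
    """Bisection on the log ratio, which increases with x."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if log_ratio(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def p_quiescent(x, prior0=0.5):
    """p(b=0 | x) as a logistic function of the log odds."""
    log_odds = log_ratio(x) + math.log(prior0 / (1.0 - prior0))
    return 1.0 / (1.0 + math.exp(-log_odds))

print(round(crossing(), 1))          # 13.9
print(round(p_quiescent(15.0), 2))   # 0.54
```

Since x = 15ms lies just to the right of the crossing point, the quiescent state is only slightly favoured.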

Generative models in neuroscience

• data analysis (spike sorting, fMRI, etc.)
• ideal observer models in psychophysics
• neural encoding models
• neural decoding models
• Bayesian Brain - the brain is making inferences about the world using probabilistic calculus