Homework 3

Tags: Ma-Le Probability
Categories: Machine Learning


2. Gaussian Classification

Let $f_{X\mid Y=C_i}(x) \sim \mathcal{N}(\mu_i, \sigma^2)$ for a two-class, one-dimensional ($d = 1$) classification problem with classes $C_1$ and $C_2$, $P(Y = C_1) = P(Y = C_2) = 1/2$, and $\mu_2 > \mu_1$.

Q1

Find the Bayes optimal decision boundary and the corresponding Bayes decision rule by finding the point(s) at which the posterior probabilities are equal. Use the 0-1 loss function.

Answer: Under the 0-1 loss function, the Bayes optimal decision rule is: given an observation $x$, choose the class $C_i$ with the largest posterior probability $P(Y=C_i \mid X=x)$. The decision boundary is where the two posteriors are equal:

$$P(Y=C_1 \mid X=x) = P(Y=C_2 \mid X=x)$$

Equivalently, by Bayes' rule (the evidence $p(x)$ cancels from both sides):

$$p(x \mid Y=C_1)\,P(Y=C_1) = p(x \mid Y=C_2)\,P(Y=C_2)$$

Since the two classes have equal priors, this reduces to:

$$p(x \mid Y=C_1) = p(x \mid Y=C_2)$$

Substituting the Gaussian PDFs (the common normalization constant $\frac{1}{\sqrt{2\pi\sigma^2}}$ cancels since both variances equal $\sigma^2$):

$$
\begin{aligned}
&\exp\left(-\frac{(x-\mu_1)^2}{2\sigma^2}\right)=\exp\left(-\frac{(x-\mu_2)^2}{2\sigma^2}\right) \\
&\Rightarrow \ln\left(\exp\left(-\frac{(x-\mu_1)^2}{2\sigma^2}\right)\right) = \ln\left(\exp\left(-\frac{(x-\mu_2)^2}{2\sigma^2}\right)\right) \\
&\Rightarrow -\frac{(x-\mu_1)^2}{2\sigma^2} = -\frac{(x-\mu_2)^2}{2\sigma^2} \\
&\Rightarrow (x-\mu_1)^2 = (x-\mu_2)^2 \\
&\Rightarrow x^2 - 2x\mu_1 + \mu_1^2 = x^2 - 2x\mu_2 + \mu_2^2 \\
&\Rightarrow -2x\mu_1 + \mu_1^2 = -2x\mu_2 + \mu_2^2 \\
&\Rightarrow 2x\mu_2 - 2x\mu_1 = \mu_2^2 - \mu_1^2 \\
&\Rightarrow 2x(\mu_2 - \mu_1) = (\mu_2 - \mu_1)(\mu_2 + \mu_1) \\
&\Rightarrow 2x = \mu_1 + \mu_2 \quad(\text{since }\mu_2>\mu_1\text{, we may divide by }\mu_2-\mu_1) \\
&\Rightarrow x = \frac{\mu_1 + \mu_2}{2}
\end{aligned}
$$

The decision boundary is the midpoint of the two means. The corresponding Bayes decision rule is: predict $C_2$ when $x > \frac{\mu_1+\mu_2}{2}$ and $C_1$ when $x < \frac{\mu_1+\mu_2}{2}$ (ties can be broken arbitrarily since the boundary has probability zero).
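As a quick numerical sanity check (with hypothetical parameter values for $\mu_1$, $\mu_2$, $\sigma$), the sketch below confirms that the posterior of $C_1$ is exactly $1/2$ at the midpoint, so the two posteriors cross there:

```python
import math

def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def posterior_c1(x, mu1, mu2, sigma):
    """P(Y = C1 | X = x) under equal priors (the priors cancel in the ratio)."""
    p1 = gauss_pdf(x, mu1, sigma)
    p2 = gauss_pdf(x, mu2, sigma)
    return p1 / (p1 + p2)

# Hypothetical example parameters, not from the problem statement.
mu1, mu2, sigma = -1.0, 3.0, 1.5
b = (mu1 + mu2) / 2
print(posterior_c1(b, mu1, mu2, sigma))  # -> 0.5: posteriors are equal at the midpoint
```

Far to the left of the boundary the posterior of $C_1$ approaches 1, matching the decision rule above.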

Q2

Suppose the decision boundary for your classifier is $x = b$. The Bayes error is the probability of misclassification, namely

$$P_e = P\big((C_1\ \text{misclassified as}\ C_2)\cup (C_2\ \text{misclassified as}\ C_1)\big)$$

Show that the Bayes error associated with this decision rule, in terms of $b$, is

$$e(b)=\frac{1}{2\sqrt{2\pi}\,\sigma}\left( \int_{-\infty}^{b}\exp\left(-\frac{(x-\mu_2)^2}{2\sigma^2}\right)dx +\int_{b}^{\infty}\exp\left(-\frac{(x-\mu_1)^2}{2\sigma^2}\right)dx \right)$$

Proof: Conditioning on the true class (the two error events are disjoint), the misclassification probability decomposes as:

$$P_e(b)=P\big(\text{decide }C_2\mid\text{actual }C_1\big)\,P(\text{actual }C_1) +P\big(\text{decide }C_1\mid\text{actual }C_2\big)\,P(\text{actual }C_2)$$

Substituting the priors $P(Y=C_1)=P(Y=C_2)=1/2$ into this formula:

$$P_e(b) = \frac{1}{2} P(X > b \mid Y=C_1) + \frac{1}{2} P(X < b \mid Y=C_2) \tag{1}$$

Since the decision boundary is $x=b$ and $\mu_2 > \mu_1$, the natural classification rule is:

$$\text{decide}\; \begin{cases} C_2, & x > b\\[4pt] C_1, & x < b \end{cases}$$

Under this rule, $P\big(\text{decide }C_2\mid\text{actual }C_1\big) = P(X > b \mid Y=C_1)$, which is given by the integral:

$$\int_b^\infty p(x \mid Y=C_1)\,dx = \int_b^\infty \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu_1)^2}{2\sigma^2}\right) dx$$

Similarly, $P\big(\text{decide }C_1\mid\text{actual }C_2\big) = P(X < b \mid Y=C_2)$, which is given by the integral:

$$\int_{-\infty}^b p(x \mid Y=C_2)\,dx = \int_{-\infty}^b \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu_2)^2}{2\sigma^2}\right) dx$$

Substituting both integrals into $(1)$ and factoring out the common constant $\frac{1}{\sqrt{2\pi}\,\sigma}$ gives:

$$P_e(b) = \frac{1}{2\sqrt{2\pi}\,\sigma} \left( \int_b^\infty \exp\left(-\frac{(x-\mu_1)^2}{2\sigma^2}\right) dx + \int_{-\infty}^b \exp\left(-\frac{(x-\mu_2)^2}{2\sigma^2}\right) dx \right)$$
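The two integrals are Gaussian tail probabilities, so $e(b)$ can be written with the standard normal CDF $\Phi$ as $e(b) = \frac{1}{2}\big[\Phi\big(\frac{b-\mu_2}{\sigma}\big) + 1 - \Phi\big(\frac{b-\mu_1}{\sigma}\big)\big]$. A minimal sketch (with hypothetical example parameters) that evaluates $e(b)$ on a grid and confirms the minimum sits at the midpoint from Q1:

```python
import math

def phi(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def bayes_error(b, mu1, mu2, sigma):
    """e(b) = (1/2) P(X < b | C2) + (1/2) P(X > b | C1)."""
    return 0.5 * (phi((b - mu2) / sigma) + 1 - phi((b - mu1) / sigma))

# Hypothetical example parameters, not from the problem statement.
mu1, mu2, sigma = 0.0, 2.0, 1.0

# Grid search for the b that minimizes e(b).
bs = [mu1 + 0.01 * k for k in range(int((mu2 - mu1) / 0.01) + 1)]
b_star = min(bs, key=lambda b: bayes_error(b, mu1, mu2, sigma))
print(b_star)  # minimizer lands at (mu1 + mu2) / 2, the Bayes boundary
```

This ties Q1 and Q2 together: the boundary $b = \frac{\mu_1+\mu_2}{2}$ derived from equal posteriors is exactly the one minimizing $e(b)$.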

