How to calculate the expectation and variance of a complex probability distribution

Assume that the continuous random variables $X_1$ and $X_2$ are independent of each other, that their variances exist, and that their probability densities are $f_{1}(x)$ and $f_{2}(x)$, respectively. The probability density of the random variable $Y_{1}$ is $f_{Y_{1}}(y)=\frac{1}{2}\left(f_{1}(y)+f_{2}(y)\right)$, and the random variable $Y_{2}=\frac{1}{2}\left(X_{1}+X_{2}\right)$. Which of the following statements is correct (the answer is D)?

$$\begin{array}{ll}
(A)\ EY_{1}>EY_{2},\ DY_{1}>DY_{2} & (B)\ EY_{1}=EY_{2},\ DY_{1}=DY_{2} \\
(C)\ EY_{1}=EY_{2},\ DY_{1}<DY_{2} & (D)\ EY_{1}=EY_{2},\ DY_{1}>DY_{2}
\end{array}$$
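
For reference, option (D) can be checked by hand before any software verification; a short derivation, using only the independence and the existence of the variances stated above:

$$
EY_{1}=\int_{-\infty}^{\infty}y\,\frac{f_{1}(y)+f_{2}(y)}{2}\,dy=\frac{EX_{1}+EX_{2}}{2}=E\left(\frac{X_{1}+X_{2}}{2}\right)=EY_{2},
$$

$$
DY_{1}=\frac{EX_{1}^{2}+EX_{2}^{2}}{2}-\left(EY_{1}\right)^{2}=\frac{DX_{1}+DX_{2}}{2}+\frac{(EX_{1}-EX_{2})^{2}}{4},\qquad
DY_{2}=\frac{DX_{1}+DX_{2}}{4},
$$

so $DY_{1}-DY_{2}=\frac{DX_{1}+DX_{2}}{4}+\frac{(EX_{1}-EX_{2})^{2}}{4}>0$ for nondegenerate $X_1,X_2$, which is exactly (D).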

When I use the normal distribution to verify option (D), the following code just keeps running and never returns:

(* Y1: the mixture, whose density is the average of the two normal PDFs *)
Y1 = ProbabilityDistribution[
   1/2 (PDF[NormalDistribution[μ1, σ1], x] +
      PDF[NormalDistribution[μ2, σ2], x]), {x, -Infinity, Infinity}];

Expectation[y, y \[Distributed] Y1]
Variance[Y1]

(* Y2: the average of the two independent random variables *)
Y2 = TransformedDistribution[
   1/2 (x1 + x2), {x1 \[Distributed] NormalDistribution[μ1, σ1],
    x2 \[Distributed] NormalDistribution[μ2, σ2]}];

Expectation[y, y \[Distributed] Y2]
Variance[Y2]

(Note that a fresh symbol such as y has to serve as the integration variable in Expectation; the distribution object Y1 cannot play that role itself.)

How can I improve the code to get the desired result?
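
Independent of the Mathematica issue, the claim itself can be sanity-checked numerically. Below is a minimal Monte Carlo sketch in Python/NumPy (the parameter values are arbitrary picks for illustration, not from the question):

import numpy as np

rng = np.random.default_rng(0)
mu1, s1, mu2, s2 = 1.0, 2.0, 3.0, 0.5   # arbitrary test parameters
n = 1_000_000

x1 = rng.normal(mu1, s1, n)
x2 = rng.normal(mu2, s2, n)

# Y1 has density (f1 + f2)/2: draw from X1 or X2 with probability 1/2 each
pick = rng.random(n) < 0.5
y1 = np.where(pick, x1, x2)

# Y2 is the average of the two variables
y2 = 0.5 * (x1 + x2)

print(y1.mean(), y2.mean())   # both should be near (mu1 + mu2)/2 = 2.0
print(y1.var(), y2.var())     # near 3.125 and 1.0625: DY1 > DY2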

How did the variance get calculated?

The Elm Tree golf course in Cortland, NY is a par 70 layout with 3 par
fives, 5 par threes, and 10 par fours. Find the mean and variance of par on this
course.

Mean was calculated as follows: Mean = 70/18 = 3.8888
Variance was found to be: second moment = (75 + 160 + 45)/18 = 280/18 = 15.555
variance = (5040 − 4900)/324 = 140/324 = 0.432

I am not sure what happened from the second moment part up to the calculation of variance. I never calculated variance like this. Can someone elaborate how second moment was found and how it led to the calculation of variance?
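
The intended reading is: pick one of the 18 holes uniformly at random and let $X$ be its par, so $P(X=3)=\frac{5}{18}$, $P(X=4)=\frac{10}{18}$, $P(X=5)=\frac{3}{18}$. The second moment is then the probability-weighted average of the squared pars, and the variance follows from the usual shortcut formula:

$$
E[X^{2}]=\frac{3^{2}\cdot 5+4^{2}\cdot 10+5^{2}\cdot 3}{18}=\frac{45+160+75}{18}=\frac{280}{18},
$$

$$
\mathrm{Var}(X)=E[X^{2}]-\left(E[X]\right)^{2}=\frac{280}{18}-\left(\frac{70}{18}\right)^{2}=\frac{280\cdot 18-70^{2}}{18^{2}}=\frac{5040-4900}{324}=\frac{140}{324}\approx 0.432.
$$

So the 5040 is $280\cdot 18$ and the 4900 is $70^{2}$, both written over the common denominator $324=18^{2}$.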

sql server – compare the variance of sales from one day to another on the same table

I have a table named “sales” in SQL Server that records the sales per day for each vendor; this is the current data:

+-------+-----------+-----------+--------------+
|   id  |   vendor  |   sales   |    date      |
+-------+-----------+-----------+--------------+
|   1   |   John    |     10    |   07-20      |
|   2   |   John    |     5     |   07-20      |
|   3   |   Jeff    |     15    |   07-21      |
|   4   |   Jeff    |     20    |   07-21      |
|   5   |   John    |     5     |   07-21      |
|   6   |   Jeff    |     30    |   07-20      |
+-------+-----------+-----------+--------------+

and I would like to transform it into the table below: I need to group by vendor, sum the sales for each day into its own column, and compare the two days.

+-----------+--------------+--------------+----------+
|  vendor   | sales 07/20  | sales 07/21  | Variance |
+-----------+--------------+--------------+----------+
|   John    |      15      |       5      |   -10    |
|   Jeff    |      30      |      35      |    5     |
+-----------+--------------+--------------+----------+
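
For what it's worth, the transformation being asked for — sum per vendor per day, pivot the days into columns, then take the difference — can be sketched outside SQL Server in Python/pandas (the table and column names are taken from the question; the day-over-day difference is labeled Variance as in the desired output):

import pandas as pd

# Toy copy of the "sales" table from the question
sales = pd.DataFrame({
    "vendor": ["John", "John", "Jeff", "Jeff", "John", "Jeff"],
    "sales":  [10, 5, 15, 20, 5, 30],
    "date":   ["07-20", "07-20", "07-21", "07-21", "07-21", "07-20"],
})

# Sum the sales per vendor per day and pivot the days into columns
pivot = sales.pivot_table(index="vendor", columns="date",
                          values="sales", aggfunc="sum")

# "Variance" here is the day-over-day difference, as in the desired table
pivot["Variance"] = pivot["07-21"] - pivot["07-20"]
print(pivot.reset_index())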

statistics – Bayes Estimator for Bernoulli Variance

I have the following question, which I have also posted here; however, nobody has answered it, so I thought I would post it here as well.

Let $X_1,\dots,X_n$ be independent, identically distributed random variables with
$$
P(X_i=1)=\theta = 1-P(X_i=0),
$$

where $\theta$ is an unknown parameter, $0<\theta<1$, and $n\geq 2$. It is desired to estimate the quantity $\phi = \theta(1-\theta) = n\,\mathrm{Var}\left((X_1+\dots+X_n)/n\right)$.

Suppose that a Bayesian approach is adopted and that the prior distribution for $\theta$, $\pi(\theta)$, is taken to be the uniform distribution on $(0,1)$. Compute the Bayes point estimate of $\phi$ when the loss function is $L(\phi,a)=(\phi-a)^2$.

Now, my solution so far:

It can easily be proven that $a$ needs to be the mean of the posterior. Also, when $\theta$ spans $(0,1)$, $\phi$ spans $(0,\frac{1}{4}]$. Hence, we have that
$$
a = \int_0^{1/4}\phi\cdot f(\phi\mid x_1,\dots,x_n)\,d\phi.
$$

Now, we have that
$$
f(\phi\mid x_1,\dots,x_n)\propto f(x_1,\dots,x_n\mid\phi)\cdot \pi(\phi).
$$

Given that $\theta$ follows $U(0,1)$, we get that $\phi$ has distribution function

$$
P(\Phi\leq t) = \frac{1-\sqrt{1-4t}}{2}.
$$

Hence we can derive $\pi(\phi)$. However, I am not sure how to derive $f(x_i\mid\phi)$.

Help in proceeding forward, and in pointing out any mistakes I have made so far, would be very much appreciated.
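
For what it's worth, one way to proceed (a sketch, not necessarily the intended route) is to sidestep $f(x_i\mid\phi)$ altogether: under squared-error loss the Bayes estimate is the posterior mean of $\phi=\theta(1-\theta)$ itself, which can be computed from the posterior of $\theta$. With a uniform prior and $s=\sum_i x_i$, the posterior of $\theta$ is $\mathrm{Beta}(s+1,\,n-s+1)$, so

$$
\hat{\phi}=E[\theta(1-\theta)\mid x_1,\dots,x_n]
=\frac{\int_0^1 \theta(1-\theta)\,\theta^{s}(1-\theta)^{n-s}\,d\theta}{\int_0^1 \theta^{s}(1-\theta)^{n-s}\,d\theta}
=\frac{B(s+2,\,n-s+2)}{B(s+1,\,n-s+1)}
=\frac{(s+1)(n-s+1)}{(n+2)(n+3)}.
$$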

pr.probability – Maximizing variance of bounded random variable through convex optimization

I am interested in maximizing the variance of a random variable $X$ supported on $[0,1]$. Formally,
\begin{align}
\max_{P_X:\, X \in [0,1]} {\rm Var}(X),
\end{align}

where $P_X$ is the distribution of $X$.
This question is well studied on this site; see, for example, this question. Its solution is given by $P_X(0)=P_X(1)=1/2$.

My question is about a specific approach to solving this. Concretely, I am interested in applying the convex optimization method to this problem.

Here is the outline of the proof/method:

  1. Note that the space of distributions on $[0,1]$ is compact and convex.
  2. $P_X \mapsto {\rm Var}(X)$ is concave.
  3. Combining 1) and 2), we conclude that this is a convex optimization problem.
  4. Find the KKT conditions by using the directional derivative. These are given by the following: $P_X$ is an optimizer if and only if
    \begin{align}
    &(x-E_{P_X}(X))^2 \le {\rm Var}_{P_X}(X), \quad x\in [0,1],\\
    &(x-E_{P_X}(X))^2 = {\rm Var}_{P_X}(X), \quad x \in \text{support of } P_X.
    \end{align}

My question: How to solve the above equations and produce the optimal $P_X$?

Of course, at this point we could just plug in our candidate distribution and check it. However, that would be cheating: I would like to solve for the optimal distribution starting from the above equations, with no extra help. For example, at this point we don't even know that the distribution is discrete.
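
For completeness, here is how the two conditions can be solved directly (a sketch; write $m=E_{P_X}(X)$ and $v={\rm Var}_{P_X}(X)$). The equality condition says $(x-m)^2=v$ for every $x$ in the support, so

$$
{\rm supp}(P_X)\subseteq\{m-\sqrt{v},\; m+\sqrt{v}\},
$$

i.e., without assuming discreteness, the support contains at most two points. A single atom would force $v=0$, and then the inequality $(x-m)^2\le 0$ would fail at every $x\neq m$ in $[0,1]$; so both points carry mass, say $p$ at $m-\sqrt{v}$ and $1-p$ at $m+\sqrt{v}$. Matching the mean, $m=p(m-\sqrt{v})+(1-p)(m+\sqrt{v})$, forces $p=1/2$. Finally, the inequality $(x-m)^2\le v$ for all $x\in[0,1]$ requires $[0,1]\subseteq[m-\sqrt{v},\,m+\sqrt{v}]$, while both atoms must lie inside $[0,1]$; together these give $m-\sqrt{v}=0$ and $m+\sqrt{v}=1$, hence $m=1/2$, $\sqrt{v}=1/2$, and $P_X(0)=P_X(1)=1/2$.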

coding theory – Variance Gamma parameter estimation in R studio

I am using the vgFit function in RStudio to estimate Variance Gamma parameters for stock prices. I converted the stock prices to a vector, and I get the error below. My code was:

param <- c(0, 1, 0, 1)
fit <- vgFit(c(x), param = param)

The error is:

Error in optim(paramStart, llsklp, NULL, method = startMethodSL, hessian = FALSE,  : 
  function cannot be evaluated at initial parameters

statistics – Determine all $\overrightarrow{a}$ for which the estimator is an unbiased estimator for the variance

Consider a random variable $X$ and stochastically independent repetitions $X_1,\dots,X_n$ of $X$.
For each vector $\overrightarrow{a}=(a_1,\dots,a_n) \in \mathbb{R}^{n}$ with $a_i > 0$, we denote by $T_{\overrightarrow{a}}$ the estimator
$$T_{\overrightarrow{a}} = a_1 \cdot X_1 + a_2 \cdot X_2 + \dots + a_n \cdot X_n$$
for the expected value.

Solution (as provided in the textbook):

$$
E(T_{\overrightarrow{a}})
= E(a_1 \cdot X_1 + a_2 \cdot X_2 + \dots + a_n \cdot X_n)
= a_1 \cdot E(X_1) + a_2 \cdot E(X_2) + \dots + a_n \cdot E(X_n) \\
= a_1 \cdot E(X) + a_2 \cdot E(X) + \dots + a_n \cdot E(X)
= (a_1 + a_2 + \dots + a_n) \cdot E(X) \\
\Rightarrow \\
E(T_{\overrightarrow{a}}) = E(X) \Leftrightarrow a_1 + a_2 + \dots + a_n = 1
$$

The class of the unbiased estimators is therefore the set of all $T_{\overrightarrow{a}}$ with $a_1 + a_2 + \dots + a_n = 1$.


I know how to determine all $\overrightarrow{a}$ for which $T_{\overrightarrow{a}}$ is an unbiased estimator for the expected value $E(X)$ of $X$, but how do I do the same for the variance $\mathrm{Var}(X)$ of $X$? How can the following task be solved in the same way as the one from the textbook?


The task for which I need help:

Consider a random variable $X$ whose mean value $\mu$ is known, and stochastically independent repetitions $X_1,\dots,X_n$ of $X$.
For each vector $\overrightarrow{a}=(a_1,\dots,a_n) \in \mathbb{R}^{n}$ with $a_i > 0$, we denote by $T_{\overrightarrow{a}}$ the estimator
$$T_{\overrightarrow{a}} = a_1 \cdot (X_1-\mu)^2 + a_2 \cdot (X_2-\mu)^2 + \dots + a_n \cdot (X_n-\mu)^2$$
for the variance of $X$.

a) Determine all $\overrightarrow{a}$ for which $T_{\overrightarrow{a}}$ is an unbiased estimator for the variance $\mathrm{Var}(X)$ of $X$.

b) Determine the most efficient among the unbiased estimators $T_{\overrightarrow{a}}$.

My thoughts

a)
First try:
$$
E(T_{\overrightarrow{a}})
= a_1 \cdot E(X_1-\mu)^2 + a_2 \cdot E(X_2-\mu)^2 + \dots + a_n \cdot E(X_n-\mu)^2 \\
= a_1 \cdot E(X_1-E(X_1))^2 + a_2 \cdot E(X_2-E(X_2))^2 + \dots + a_n \cdot E(X_n-E(X_n))^2 \\
= a_1 \cdot E(X-E(X))^2 + a_2 \cdot E(X-E(X))^2 + \dots + a_n \cdot E(X-E(X))^2 \\
= (a_1 + a_2 + \dots + a_n) \cdot E(X-E(X))^2 \\
= (a_1 + a_2 + \dots + a_n) \cdot \mathrm{Var}(X) \\
\Rightarrow
\mathrm{Var}(X) = \frac{E(T_{\overrightarrow{a}})}{a_1 + a_2 + \dots + a_n}
$$

Second try:
$$
E(T_{\overrightarrow{a}})
= E\left(a_1 \cdot (X_1-\mu)^2 + a_2 \cdot (X_2-\mu)^2 + \dots + a_n \cdot (X_n-\mu)^2\right) \\
= a_1 \cdot E(X_1-\mu)^2 + a_2 \cdot E(X_2-\mu)^2 + \dots + a_n \cdot E(X_n-\mu)^2 \\
= a_1 \cdot (E(X_1^2)-\mu^2) + a_2 \cdot (E(X_2^2)-\mu^2) + \dots + a_n \cdot (E(X_n^2)-\mu^2) \\
= a_1 \cdot (E(X^2)-\mu^2) + a_2 \cdot (E(X^2)-\mu^2) + \dots + a_n \cdot (E(X^2)-\mu^2) \\
= (a_1 + a_2 + \dots + a_n) \cdot (E(X^2)-\mu^2) \\
= (a_1 + a_2 + \dots + a_n) \cdot \mathrm{Var}(X) \\
\Rightarrow
\mathrm{Var}(X) = \frac{E(T_{\overrightarrow{a}})}{a_1 + a_2 + \dots + a_n}
$$

Third try:
$$
\mathrm{Var}(T_{\overrightarrow{a}})
= a_1^2 \cdot \mathrm{Var}\left((X_1-\mu)^2\right) + a_2^2 \cdot \mathrm{Var}\left((X_2-\mu)^2\right) + \dots + a_n^2 \cdot \mathrm{Var}\left((X_n-\mu)^2\right) \\
= a_1^2 \cdot \mathrm{Var}\left((X-\mu)^2\right) + a_2^2 \cdot \mathrm{Var}\left((X-\mu)^2\right) + \dots + a_n^2 \cdot \mathrm{Var}\left((X-\mu)^2\right) \\
= \left(\sum_{i=1}^n a_i^2\right) \mathrm{Var}\left((X-\mu)^2\right)
= c\sum_{i=1}^n a_i^2
$$

for some constant $c = \mathrm{Var}\left((X-\mu)^2\right) \neq 0$.

It was suggested to me that I should minimize $\mathrm{Var}(T_{\overrightarrow{a}})$ subject to $E(T_{\overrightarrow{a}})=\sigma^2$, but how do I do that?
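
A sketch of that minimization, assuming the conclusion of the first two tries: unbiasedness means $E(T_{\overrightarrow{a}}) = (a_1+\dots+a_n)\,\mathrm{Var}(X) = \mathrm{Var}(X)$, i.e. the constraint $\sum_{i=1}^n a_i = 1$, under which the third try gives $\mathrm{Var}(T_{\overrightarrow{a}}) = c\sum_{i=1}^n a_i^2$. So it suffices to minimize $\sum_i a_i^2$ subject to $\sum_i a_i = 1$. By the Cauchy–Schwarz inequality,

$$
1=\left(\sum_{i=1}^n a_i\right)^2 \le n\sum_{i=1}^n a_i^2,
\qquad\text{so}\qquad \sum_{i=1}^n a_i^2 \ge \frac{1}{n},
$$

with equality exactly when $a_1=\dots=a_n=\frac{1}{n}$. Hence part a) gives the condition $\sum_i a_i = 1$, and the most efficient unbiased estimator in part b) would be $T=\frac{1}{n}\sum_{i=1}^n (X_i-\mu)^2$.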

Thanks 🙂