combinatorics – Probability of consecutive coin flips

There are 15 coins in a bag. 5 of the 15 are fair coins and the rest are biased (80% heads, 20% tails). When a coin is chosen at random from the bag and flipped twice, what is the probability that both flips come up heads?

I tried to solve this in two different ways and I get two different answers.

Method 1:
$$P_{\text{fair}}(HH) = \frac{1}{2^2}$$
Considering that 80% is $\frac{4}{5}$ and 20% is $\frac{1}{5}$,
$$P_{\text{biased}}(HH) = \frac{4^2}{5^2}$$
$$P(HH) = \frac{1}{3}P_{\text{fair}}(HH) + \frac{2}{3}P_{\text{biased}}(HH)$$
$$P(HH) = \frac{51}{100}$$

Method 2:
Constructing from all possible outcomes, consider 1 fair coin for every 2 biased coins.
$$P(HH) = \frac{\text{number of outcomes with two H}}{\text{total outcomes}}$$
$$\text{total outcomes} = (\text{outcomes for fair coin}) + 2(\text{outcomes for biased coin})$$
$$\text{outcomes for fair coin} = \{H, T\} \text{ tossed twice}$$
$$\text{outcomes for biased coin} = \{H, H, H, H, T\} \text{ tossed twice}$$
$$\text{total outcomes} = 2^2 + 2(5^2) = 54$$
$$\text{number of outcomes with two H} = \text{HH for fair coin} + 2(\text{HH for biased coin}) = 1 + 2(4^2) = 33$$
$$P(HH) = \frac{33}{54}$$

Have I made a mistake in one of the methods, or maybe in both? It's not that I don't understand conditional probability (or do I?). For instance, I can find the probability of drawing two red cards from a deck of playing cards using both of those methods.
$$P(RR) = P(\text{Red}_1)\, P(\text{Red}_2 \mid \text{Red}_1) = \frac{26}{52} \cdot \frac{25}{51}$$

Also, using just combinatorics,
$$P(RR) = \frac{{}^{26}P_2}{{}^{52}P_2} = \frac{26 \times 25}{52 \times 51}$$

So the methods themselves can't be incorrect. Coming back to my original question: where did I go wrong?
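
As a sanity check, here is a minimal Monte Carlo sketch in Python (the trial count is an arbitrary choice); it reproduces Method 1's answer:

```python
import random

# 5 of 15 coins are fair; the rest land heads with probability 0.8.
trials = 1_000_000
hits = 0
for _ in range(trials):
    p_heads = 0.5 if random.random() < 5 / 15 else 0.8
    # Flip the drawn coin twice; count trials where both flips come up heads.
    if random.random() < p_heads and random.random() < p_heads:
        hits += 1

print(hits / trials)  # ~0.51, matching Method 1's 51/100
```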

probability – Expectation of the minimum of two continuous random variables – using the joint pdf

Define $Z = \min(X, Y)$ and the joint pdf of $X$ and $Y$ as $f_{XY}(x,y)$.

I saw an approach that said

$$
E(Z) = \int\!\!\int \min(x,y)\, f_{XY}(x,y)\, dy\, dx
$$

Is this readily obvious, or do you need to convert the following:
$$
E(Z) = \int z\, f_Z(z)\, dz
$$

to the above?
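
As a numerical sanity check, here is a sketch for the special case of independent $X, Y \sim \text{Uniform}(0,1)$ (an assumed example, not from the question), where $f_{XY} \equiv 1$ on the unit square and $f_Z(z) = 2(1-z)$; both formulas give $1/3$:

```python
from scipy.integrate import dblquad, quad

# E[Z] via the joint pdf: integrate min(x, y) * f_XY(x, y) over the unit square.
e_joint, _ = dblquad(lambda y, x: min(x, y), 0, 1, 0, 1)

# E[Z] via the pdf of Z = min(X, Y): here f_Z(z) = 2(1 - z) on [0, 1].
e_direct, _ = quad(lambda z: z * 2 * (1 - z), 0, 1)

print(e_joint, e_direct)  # both ~1/3
```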

probability – With what frequency should points in a 2D grid be chosen in order to have roughly $n$ points left after a time $t$

Say I have a 2D array of $x$ by $y$ points, each starting at some default value; for generality we'll just say $0$.

I randomly select a pair of coordinates at a frequency of $f$ selections per second and, if the selected point is $0$, flip it to $1$. If the point is already $1$, I do not change it.

How then, given known $x$ and $y$ (and thus a known total of $xy$ points), can I calculate a frequency $f$ that will leave approximately (as this is random) $n$ points still at $0$ after a set time $t$, where $t$ is also in seconds?

For some context: I am attempting a simplistic approximation of nuclear half-life, but I'm not sure how to make use of the half-life equation for this, nor do I think it is strictly correct to apply it, given that my implementation isn't exactly true to life, picking single points at a time.
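
For reference, a minimal Python simulation of the process as described (the grid size, $f$, and $t$ are assumed example values). Each selection hits a given cell with probability $1/(xy)$, so a cell is still $0$ after $ft$ selections with probability $(1 - 1/(xy))^{ft} \approx e^{-ft/(xy)}$, which the last line compares against:

```python
import random

x, y, f, t = 100, 100, 50.0, 60.0   # grid dimensions, selections/second, seconds
total = x * y
picks = int(f * t)                  # total number of selections in time t

grid = [0] * total
for _ in range(picks):
    grid[random.randrange(total)] = 1  # cells already at 1 are simply set again

remaining = total - sum(grid)
print(remaining, total * (1 - 1 / total) ** picks)  # simulated vs. expected zeros
```

Setting $xy\,e^{-ft/(xy)} = n$ and solving gives $f = \frac{xy}{t}\ln\frac{xy}{n}$ as a candidate frequency, assuming selections occur independently at a constant rate.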

probability theory – CDF of $S_{N_{t}}$ where $S_{N_{t}}$ is the time of the last arrival in $[0, t]$

(The problem statement was given as an image in the original post.)

I am confused by this problem. My professor gave this as the solution:

$S_{N_t}$ is the time of the last arrival in $(0, t)$. For $0 < x \leq t$, $P(S_{N_t} \leq x) = \sum_{k=0}^{\infty} P(S_{N_t} \leq x \mid N_t = k)\, P(N_t = k) = \sum_{k=0}^{\infty} P(S_{N_t} \leq x \mid N_t = k) \cdot \frac{e^{-\lambda t} (\lambda t)^k}{k!}$.

Let $M = \max(S_1, S_2, \dots, S_k)$, where the $S_i$ are i.i.d. with $S_i \sim \text{Uniform}(0,t)$ for $i = 1, 2, \dots, k$.

So, $P(S_{N_t} \leq x) = \sum_{k=0}^{\infty} P(M \leq x)\, \frac{e^{-\lambda t} (\lambda t)^k}{k!} = \sum_{k=0}^{\infty} \left(\frac{x}{t}\right)^k \frac{e^{-\lambda t} (\lambda t)^k}{k!} = e^{-\lambda t} \sum_{k=0}^{\infty} \frac{(\lambda x)^k}{k!} = e^{-\lambda t} e^{\lambda x} = e^{\lambda(x-t)}$.

If $N_t = 0$, then $S_{N_t} = S_0 = 0$. This occurs with probability $P(N_t = 0) = e^{-\lambda t}$.

Therefore, the CDF of $S_{N_t}$ is:
$$P(S_{N_t} \leq x) =
\begin{cases}
0 & x < 0 \\
e^{\lambda (x-t)} & 0 \leq x < t \\
1 & x \geq t
\end{cases}$$
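
A Monte Carlo check of this CDF (a Python sketch; $\lambda$, $t$, $x$, and the trial count are assumed values). Given $N_t = k$, the arrival times are distributed as $k$ i.i.d. Uniform$(0,t)$ points, so the last arrival is their maximum, with $S_{N_t} = 0$ when there are no arrivals:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t, x = 1.5, 2.0, 1.2
trials = 200_000

hits = 0
for _ in range(trials):
    k = rng.poisson(lam * t)                                # arrivals in [0, t]
    last = rng.uniform(0, t, size=k).max() if k > 0 else 0.0
    hits += last <= x

print(hits / trials, np.exp(lam * (x - t)))  # both ~ e^{lam (x - t)}
```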

probability – “First principles” proof of the limit of the expected excess of the uniform renewal function

The closed form of the expected number of samples for $\sum_r X_r \geqslant t$, with $X_r \sim \text{U}(0,1)$, is given by:

$$m(t) = \sum_{k=0}^{\lfloor t \rfloor} \frac{(k-t)^k}{k!} e^{t-k}$$

From this we can deduce the expected amount by which this sum exceeds $t$ (by Wald's identity, $E\bigl[\sum_r X_r\bigr] = m(t)\, E[X_1] = m(t)/2$), namely:

$$\varepsilon(t) = \frac{m(t)}{2} - t$$

From knowing that $m(t) \to 2t + \dfrac{2}{3}$, we can easily see that $\varepsilon(t) \to \dfrac{1}{3}$.

Is there a simple ("low tech") way of proving that $\varepsilon(t) \to \dfrac{1}{3}$ without first proving that $m(t) \to 2t + \dfrac{2}{3}$?
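
For intuition, the limit is easy to observe empirically (a Python sketch; $t$ and the trial count are assumed values): draw Uniform$(0,1)$ samples until the running sum first exceeds $t$ and record the overshoot:

```python
import random

t = 10.0
trials = 200_000

excess = 0.0
for _ in range(trials):
    s = 0.0
    while s < t:            # sample until the sum first exceeds t
        s += random.random()
    excess += s - t

print(excess / trials)      # ~1/3 for t well above 1
```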

Probability of dice roll between values

Context: in calculating the optimal policy for MDPs, an algorithm called Value Iteration is used. I am using this algorithm to calculate the optimal policy for a small game and to test my knowledge in the field.

In the game, $d$ standard dice (1-6) are rolled simultaneously, and you can either pick all dice showing the largest value or all dice showing the smallest value. To avoid computing all $6^d$ possible dice rolls, I limit it to $x$ dice showing the smallest value and $y$ dice showing the highest value, where $x \leq d$ and $y \leq d - x$.

Now my question is: with $d$ dice, what is the probability that $x$ dice show a minimum value $v_x$, $y$ dice show a maximum value $v_y$, and the $z = d - (x+y)$ remaining dice fall strictly between $v_x$ and $v_y$?

I have the feeling that the $z$ in-between dice can be modeled with a binomial distribution $\text{Binom}\left(z;\, d,\, \frac{v_y - v_x - 1}{6}\right)$, but I am not sure how to reconcile this with the probabilities for $x$ and $y$.
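
One way to test such guesses is brute force. Below is a Monte Carlo sketch in Python (with assumed example values for $d$, $v_x$, $v_y$, $x$, $y$) alongside a candidate multinomial expression, which treats each die as independently landing on $v_x$, on $v_y$, or strictly in between:

```python
import math
import random

d, vx, vy, x, y = 5, 2, 5, 1, 2
z = d - (x + y)

# Monte Carlo: exactly x dice show vx, exactly y show vy, the other z in between.
trials = 500_000
hits = 0
for _ in range(trials):
    roll = [random.randint(1, 6) for _ in range(d)]
    if (roll.count(vx) == x and roll.count(vy) == y
            and sum(vx < r < vy for r in roll) == z):
        hits += 1

# Candidate formula: multinomial over the three categories.
p_mid = (vy - vx - 1) / 6
coeff = math.factorial(d) // (math.factorial(x) * math.factorial(y) * math.factorial(z))
exact = coeff * (1 / 6) ** x * (1 / 6) ** y * p_mid ** z
print(hits / trials, exact)
```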

probability theory – Why is there a sum in the definition of a simple function?

Simple functions assume finitely many values in their image, and can be written as

$$f(\omega) = \sum_{i=1}^n a_i \mathbb{I}_{A_i}(\omega), \quad \forall \omega \in \Omega$$

where $a_i \geq 0$ for all $i \in \{1, 2, 3, \dots, n\}$, and $A_i \in \mathcal{F}$ for all $i$.

So this is how I process it in "human": for each outcome $\omega$ in the sample space, one must check whether or not it belongs to a measurable set $A_i$ in the sigma-algebra $\mathcal{F}$. If the Boolean operation (the indicator function $\mathbb{I}_{A_i}$) is $1$, the result of the function will be some value $a_i$, which will be exactly the same for all outcomes in $A_i$. This could be plotted symbolically as a step function, with each step corresponding to one of the $A_i$'s.

So far, clear as day.

Now, when you introduce the $\sum$ at the beginning of the definition, it looks like you are integrating: in other words, the function $f(\omega)$ with the sum in front doesn't seem to "spit out" the corresponding step for that particular $\omega$, but rather all the steps for all omegas, all at once. And that "all at once" seems like a contradiction: after all, in a truly simple function such as $f(x) = 2x + 2$, you don't get a line because you sum the results of the function across the real line, but because you collect, as a set, the results for each and every value of the real line entered into the function as an independent variable.
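
A tiny numerical illustration of how the sum acts pointwise (the disjoint sets $A_i$ and values $a_i$ are made up): for each fixed $\omega$, every indicator but at most one vanishes, so the sum collapses to a single step:

```python
# Omega = {1, ..., 6}, A_1 = {1, 2}, A_2 = {3, 4, 5}, a_1 = 2, a_2 = 7.
A = [{1, 2}, {3, 4, 5}]
a = [2.0, 7.0]

def f(omega):
    # Sum over i of a_i * I_{A_i}(omega); only the term with omega in A_i survives.
    return sum(a_i * (omega in A_i) for a_i, A_i in zip(a, A))

print([f(w) for w in range(1, 7)])  # [2.0, 2.0, 7.0, 7.0, 7.0, 0.0]
```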

bessel functions – Compute the conditional probability distribution of a noncentral $\chi$ variable given the range of an Erlang-distributed non-centrality parameter

I need to compute a conditional probability distribution as described below for my research.

In $(\mathbb{R}^2, \|\cdot\|_2)$, I have a random vector $\underline{z}$ with uniformly distributed angle and $Z = \|\underline{z}\|$ following an Erlang distribution with $k = 2$ and scale parameter $\mu$, i.e. with density function $f_Z(z) = \frac{z}{\mu^2} e^{-\frac{z}{\mu}}$. I have another normal random vector $\underline{y}$, independent of $\underline{z}$. I'm interested in the resultant vector $\underline{x} = \underline{y} + \underline{z}$ and want to compute the conditional distribution of $X = \|\underline{x}\|$ given $a \leq \|\underline{z}\| \leq b$, $0 \leq a < b \leq \infty$; to be specific, the complementary cumulative distribution function $\overline{F}_{X|Z}(x \mid (a,b)) = P(X > x \mid a \leq Z \leq b)$. Solutions for the special cases $Z \leq c$ or $Z \geq c$ for any $c > 0$ would be sufficient for my research if they are easier to solve.

Following is my attempt. Given a fixed $Z = z$, since $\underline{y}$ is normal, $X$ follows the noncentral $\chi$ distribution with $k = 2$ and non-centrality parameter $\lambda = z$, i.e. $f_{X|Z}(x \mid z) = x e^{-\frac{x^2+z^2}{2}} I_0(xz)$, where $I_0(x) = \frac{1}{\pi} \int_0^\pi e^{x \cos\alpha}\, d\alpha$ is a modified Bessel function of the first kind. Then the density function of the conditional distribution is
$$f_{X|Z}(x \mid (a,b)) = \frac{\int_a^b f_Z(z)\, f_{X|Z}(x \mid z)\, dz}{\int_a^b f_Z(z)\, dz}$$

The denominator is $\int_a^b f_Z(z)\, dz = \gamma(2, \frac{b}{\mu}) - \gamma(2, \frac{a}{\mu})$, where $\gamma$ is the lower incomplete gamma function.

Changing the order of integration, the numerator is
$$\begin{align}
\int_a^b f_Z(z)\, f_{X|Z}(x \mid z)\, dz & = \frac{1}{\pi} \int_a^b \frac{z}{\mu^2} e^{-\frac{z}{\mu}}\, x e^{-\frac{x^2+z^2}{2}} \int_0^\pi e^{xz \cos\alpha}\, d\alpha\, dz \\
& = \frac{x}{\pi\mu^2} e^{-\frac{x^2}{2}} \int_0^\pi e^{\frac{1}{2}\left(\frac{1}{\mu} - x\cos\alpha\right)^2} \int_a^b z e^{-\frac{1}{2}\left(z + \frac{1}{\mu} - x\cos\alpha\right)^2} dz\, d\alpha \\
& = \frac{x}{\pi\mu^2} e^{-\frac{x^2}{2}} \int_0^\pi e^{\frac{\beta^2}{2}} \left( e^{-\bar{a}^2} - e^{-\bar{b}^2} + \beta\sqrt{\frac{\pi}{2}} \left( \operatorname{erf}\bar{a} - \operatorname{erf}\bar{b} \right) \right) d\alpha
\end{align}$$

where $\beta = \frac{1}{\mu} - x\cos\alpha$, $\bar{a} = \frac{a+\beta}{\sqrt{2}}$, $\bar{b} = \frac{b+\beta}{\sqrt{2}}$, and $\operatorname{erf}$ is the error function.

Then I got stuck at the second integral. I am looking for an analytical expression for $f_{X|Z}(x \mid (a,b))$. I tried numerical integration and compared it to a simulation in MATLAB; the results are as expected.
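
For reference, a Python analogue of that numerical check (a sketch; $\mu$, $a$, $b$ are assumed example values; scipy's regularized `gammainc` equals $\gamma(2,\cdot)$ here since $\Gamma(2) = 1$):

```python
import numpy as np
from scipy import integrate
from scipy.special import i0e, gammainc

mu, a, b = 1.0, 0.5, 3.0

def f_Z(z):
    # Erlang density with k = 2 and scale mu.
    return z / mu**2 * np.exp(-z / mu)

def f_X_given_Z(x, z):
    # Noncentral chi density (k = 2): x e^{-(x^2+z^2)/2} I_0(xz), rewritten with
    # the exponentially scaled Bessel function i0e for numerical stability.
    return x * np.exp(-((x - z) ** 2) / 2) * i0e(x * z)

norm = gammainc(2, b / mu) - gammainc(2, a / mu)   # P(a <= Z <= b)

def f_cond(x):
    num, _ = integrate.quad(lambda z: f_Z(z) * f_X_given_Z(x, z), a, b)
    return num / norm

total, _ = integrate.quad(f_cond, 0, np.inf)
print(total)  # the conditional density should integrate to ~1
```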

Finally, what I want is an analytical expression for $\overline{F}_{X|Z}(t \mid (0,c)) = P(X > t \mid Z \leq c) = \int_t^\infty f_{X|Z}(x \mid (0,c))\, dx$ and $\overline{F}_{X|Z}(t \mid (c,\infty)) = P(X > t \mid Z \geq c) = \int_t^\infty f_{X|Z}(x \mid (c,\infty))\, dx$.

Is it possible?

probability or statistics – Linear regression: confidence interval for the average of 10 slopes, propagation of error and error of the mean

As each slope has a different standard error, you should calculate a weighted mean, with the weights taken as the inverse of the variance (= the square of the standard error).

Let's assume we have $n$ slope values $sl_i$ with variances $\mathrm{var}_i$ $(= (\text{standard error}_i)^2)$. Then the mean and variance of the slopes are:

$$\text{mean} = \frac{\sum_{i=1}^n sl_i/\mathrm{var}_i}{\sum_{i=1}^n 1/\mathrm{var}_i}$$

$$\text{variance} = \frac{1}{\sum_{i=1}^n 1/\mathrm{var}_i}$$

$$\text{std error} = \sqrt{\text{variance}}$$
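
In code (a Python sketch; the slope and standard-error values are placeholders):

```python
import numpy as np

sl = np.array([1.02, 0.98, 1.10, 0.95])   # fitted slopes
se = np.array([0.05, 0.04, 0.08, 0.06])   # their standard errors

w = 1 / se**2                             # weights = 1 / variance
mean = np.sum(w * sl) / np.sum(w)
std_error = np.sqrt(1 / np.sum(w))
print(mean, std_error)
```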