pr.probability – Is there a name for a random variable that is the absolute value of the difference between two iid discrete uniform variables?

I’m working on a project and I needed to calculate the distribution of the difference between two iid discrete uniform variables (sorry for the long title).
That is, let $I, J$ be two iid discrete uniform variables with support $\{1, \ldots, k\}$. I calculated the distribution of $Z = |I - J|$:
$P(Z = 0) = \frac{1}{k}$, $P(Z = y) = \frac{2(k-y)}{k^2} \;\forall y \in \{1, \ldots, k-1\}$.
I was wondering if such a distribution has a name, so that I can get more info and maybe some results about it.
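As a sanity check, here is a short simulation comparing the empirical frequencies with the pmf above; $k = 6$ and the sample size are arbitrary choices.

```python
# Empirical check of the pmf of Z = |I - J| for I, J iid uniform on {1, ..., k}.
import numpy as np

k = 6
rng = np.random.default_rng(0)
I = rng.integers(1, k + 1, size=1_000_000)
J = rng.integers(1, k + 1, size=1_000_000)
Z = np.abs(I - J)

for y in range(k):
    theory = 1 / k if y == 0 else 2 * (k - y) / k**2
    print(y, round(float(np.mean(Z == y)), 4), round(theory, 4))
```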

Thanks a lot.

pr.probability – How to model a system where interactions are probabilistic

I am trying to model the following system.

There are $n$ actors. Actors send and receive messages. Each actor has some probability that, when it receives a message, it sends another message to all of the other $n - 1$ actors and to itself. Each actor starts with a message. The system evolves.

My observations follow. The system either converges on some number of messages and no new messages are sent, or the number of new messages grows more quickly than the number of messages that die out, and the system diverges. There isn't an equilibrium. In a system with homogeneous actors the system won't diverge if $p < \frac{1}{n}$. If $p = 1$ then the number of messages at time step $i$ is $t_i = a^i$. Epidemiological models are related.

I would like to be able to tell whether a system diverges or converges, or how likely each outcome is, given the list of probabilities. I didn't get anywhere with my amateur analysis. Are there any resources I could use to derive properties of the system?
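For what it's worth, the homogeneous case looks like a Galton–Watson branching process with mean offspring $np$ per message, which would match the $p < \frac{1}{n}$ observation. Below is a rough simulation sketch under that reading; the parameters, step cap, and message cap are arbitrary choices.

```python
# Branching-process reading of the homogeneous system: each live message
# triggers, with probability p, a broadcast that creates n new messages
# (one to each of the n - 1 other actors plus the sender itself).
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, p, steps=100, cap=10**7):
    messages = n                               # each actor starts with one message
    for t in range(steps):
        triggered = rng.binomial(messages, p)  # messages that cause a broadcast
        messages = triggered * n               # each broadcast spawns n messages
        if messages == 0:
            return f"died out at step {t}"
        if messages > cap:
            return f"diverging at step {t}"
    return f"still alive with {messages} messages"

print(simulate(n=10, p=0.05))  # p < 1/n: expect die-out
print(simulate(n=10, p=0.20))  # p > 1/n: expect divergence
```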

pr.probability – Conditional Probability of A given B, C, D

Suppose you're given events A, B, C and D. B, C and D are independent events that A relies on (like how the outcome of a sports game relies on several independent factors).

Is there a way to determine P(A| B, C, D) if you are given P(A), P(A|B), P(A|C) and P(A|D)?

Sorry if this is a dumb question but I have almost no background in stats and I need to know the answer.

pr.probability – Expectation of the determinant of the inverse of non-central Wishart matrix

The joint distribution of the eigenvalues $\lambda_i$, $i=1,2,\ldots,n$ of $A$ is known,
$$P(\lambda_1,\lambda_2,\ldots,\lambda_n)=c_{k,n}\prod_{i<j}|\lambda_i-\lambda_j|\prod_m e^{-\lambda_m/2}\lambda_m^{(k-n-1)/2},$$
with $c_{k,n}$ a normalization constant.

The desired expectation value is given by
$$U_{n,k}=\mathbb{E}\left(\det\left(I+(I+A)^{-1}\right)\right)=c_{k,n}\int_0^\infty d\lambda_1\int_0^\infty d\lambda_2\cdots\int_0^\infty d\lambda_n\,\prod_{i<j}|\lambda_i-\lambda_j|\prod_m e^{-\lambda_m/2}\lambda_m^{(k-n-1)/2}\left(1+(1+\lambda_m)^{-1}\right).$$
For small $n$ the integrals can be done by quadrature, but they quickly become unwieldy.

For example, for $n=1$, $k>0$ I find
$$U_{1,k}=2^{-\frac{k}{2}}\left(2^{k/2}+\sqrt{e}\,\Gamma\left(1-\tfrac{1}{2}k,\tfrac{1}{2}\right)\right),$$
with $\Gamma$ the incomplete Gamma function.
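For $n=1$ the normalized eigenvalue density is just a $\chi^2_k$ density, so the formula is easy to cross-check numerically. A minimal sketch, assuming mpmath's `gammainc(z, a)` as the upper incomplete gamma $\Gamma(z,a)$:

```python
# Cross-check U_{1,k} = E[1 + 1/(1 + lam)], lam ~ chi^2_k, against the closed form.
import mpmath as mp

def U_quad(k):
    k = mp.mpf(k)
    pdf = lambda lam: lam**(k/2 - 1) * mp.exp(-lam/2) / (2**(k/2) * mp.gamma(k/2))
    return mp.quad(lambda lam: (1 + 1/(1 + lam)) * pdf(lam), [0, mp.inf])

def U_closed(k):
    k = mp.mpf(k)
    return 2**(-k/2) * (2**(k/2) + mp.sqrt(mp.e) * mp.gammainc(1 - k/2, mp.mpf(1)/2))

for k in (1, 2, 3, 5):
    print(k, mp.nstr(U_quad(k), 10), mp.nstr(U_closed(k), 10))
```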

If you are satisfied with the expectation value of the logarithm of the determinant, then you can use the Marchenko-Pastur distribution to obtain an accurate result for large $n$.

pr.probability – Central limit theorem for chi-squared random field on $mathbb R^p$

Let $X : x \mapsto X(x)$ be a centered stationary Gaussian process on $\Omega := \mathbb{R}^p$, such that $X(x) \overset{d}{=} X(x')$ for all $x, x' \in \Omega$. Set $\sigma^2 := \operatorname{Var}(X(0)) = \mathbb{E}(X(0)^2)$. Let the random fields $X_1,\ldots,X_N$ be iid copies of $X$, and define a random process $Z_N$ on $\Omega$ by
$$
Z_N(x) := \frac{1}{\sqrt{2N\sigma^2}}\left(\sum_{i=1}^N X_i(x)^2 - N\sigma^2\right), \quad \forall x \in \Omega.
$$

Question. Is there a central limit theorem (perhaps under further conditions on the base field $X$) for the limiting distribution of the random field $Z_N$ as $N \to \infty$?
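Not an answer, but the finite-dimensional behaviour is easy to probe numerically. A simulation sketch for $p = 1$ on a finite grid; the squared-exponential covariance (for which $\sigma^2 = 1$), the grid, $N$, and the number of replications are all arbitrary choices.

```python
# Simulate replications of Z_N restricted to a grid and inspect its marginals.
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 5.0, 20)
C = np.exp(-0.5 * (grid[:, None] - grid[None, :]) ** 2)  # stationary covariance
L = np.linalg.cholesky(C + 1e-10 * np.eye(len(grid)))
sigma2 = 1.0
N, reps = 500, 400

samples = np.empty((reps, len(grid)))
for r in range(reps):
    X = L @ rng.standard_normal((len(grid), N))  # N iid copies of X on the grid
    samples[r] = (np.sum(X**2, axis=1) - N * sigma2) / np.sqrt(2 * N * sigma2)

# Marginally, Z_N(x) should look standard normal for large N; the rows of
# `samples` are independent draws of the field restricted to the grid.
print(samples[:, 0].mean(), samples[:, 0].std())
print(np.corrcoef(samples[:, 0], samples[:, 5])[0, 1])  # cross-point dependence
```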

pr.probability – Characterizing ‘very homogeneous’ finitely valued stochastic processes

Fix a positive integer $n$. Let $X = \{X_i\}_{i \in \mathbb{N}}$ be a discrete time stochastic process such that each $X_i$ is a $\{0,\dots,n-1\}$-valued random variable. Suppose that the joint probability distributions of any finite sequence of $X_i$'s only depend on the order of their indices, or to be more precise suppose that $X$ satisfies the following:

  • For any $k \in \mathbb{N}$, any two increasing sequences of indices $i_0 < i_1 < \cdots < i_{k-1}$ and $j_0 < j_1 < \cdots < j_{k-1}$, and any function $f : \{0,\dots,k-1\} \to \{0,\dots,n-1\}$, $$\mathbb{P}(X_{i_0} = f(0) \wedge X_{i_1} = f(1) \wedge \cdots \wedge X_{i_{k-1}} = f(k-1)) = \mathbb{P}(X_{j_0} = f(0) \wedge X_{j_1} = f(1) \wedge \cdots \wedge X_{j_{k-1}} = f(k-1)).$$

Call such a stochastic process ‘strongly homogeneous.’ I’m trying to understand what the set of strongly homogeneous stochastic processes looks like. This is my approach so far:

The set of $\{0,\dots,n-1\}$-valued discrete time stochastic processes can be understood as the set of Borel probability measures on the (compact) space $A = \{0,\dots,n-1\}^{\mathbb{N}}$. This is a subset of the Banach space of (regular Borel) signed measures on $A$. Let $S$ be the set of such measures corresponding to a strongly homogeneous stochastic process. It's not hard to check that $S$ is a convex, weak* closed set, and therefore that the Krein–Milman theorem applies to it. This gives us that every element of $S$ is in the weak* closure of the convex hull of the set of extreme points of $S$ (where a point is extreme if it is not a convex combination of any other elements of $S$). This leads to my precise question.

Question: What are the extreme points of $S$?

Note that the extreme points of the set of all probability measures on $A$ are precisely the Dirac measures on $A$, but the analogous statement is not sufficient here. For instance, if $n=2$, then the only strongly homogeneous Dirac measures are those concentrated on the constant $0$ sequence or the constant $1$ sequence, but convex combinations of these do not have the measure corresponding to a sequence of i.i.d. fair coin flips in their weak* closure: any such combination assigns probability $0$ to the event $X_0 \neq X_1$, while the fair-coin measure assigns it probability $1/2$.

My suspicion is that the measures corresponding to i.i.d. sequences are the extreme points, but I haven’t proved either that they are all extreme points or that all extreme points are of that form. (Proving that they are all extreme points should be easy, however.)

pr.probability – Sock Draw Probability Competitive Programming Question

First let me paste the question.

This problem is based on an (almost) true story. A child, who shall remain nameless, has a large pile of clean socks. This pile contains m pairs of socks with pictures and patterns and n pure white socks. Each pair of socks consists of two identical socks and every pair is unique — no two pairs look the same. All pure white socks are identical. Each day, the child randomly selects two socks from the pile, puts them on, and heads for school.
But today is a picture day and the child needs to wear two identical socks. So the child randomly selects two socks and if both socks are identical, the child puts them on and heads out the door. If the two socks are not identical, the child throws the socks into the laundry basket (they are now dirty — don’t ask why) and continues the same process of randomly selecting two socks from the pile of remaining clean socks. As this process continues, the parents are starting to get worried: Will this child ever make it to school today? Please help them to compute the probability that the child will not find a pair of identical socks using this process.


First of all, the constraints are m ≤ 500 pairs of socks and n ≤ 200 white socks. I've really tried to solve this question every way I can. I enumerated all $7!$ permutations of the sample input and counted the arrangements in which the child ends up with a matching pair; that brute-force count is the only way I've managed to solve it. Can you help me with how I can solve this question? I'm using Python, by the way, and this is a competitive programming question.

In the sample test case, there are 3 white socks and 2 patterned pairs (7 socks in total), and the expected answer is 0.457142857142857.
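One possible approach (a sketch, not the official solution): the process only depends on the state (intact patterned pairs, orphaned patterned socks, white socks), since an orphaned sock can never be matched once its twin is in the laundry. A memoized recursion over that state reproduces the sample answer; for the full constraints you may want to convert it to an iterative DP and bound the reachable states.

```python
# fail(a, s, w): probability the child NEVER draws two identical socks,
# starting from a intact patterned pairs, s orphaned patterned socks,
# and w white socks.
from functools import lru_cache
from math import comb
import sys

sys.setrecursionlimit(10_000)

@lru_cache(maxsize=None)
def fail(a, s, w):
    t = 2 * a + s + w
    if t < 2:
        return 1.0  # fewer than two socks left: no identical pair possible
    total = comb(t, 2)
    # Non-matching draws; discarding one sock of an intact pair leaves an orphan.
    moves = [
        (comb(2 * a, 2) - a, (a - 2, s + 2, w)),      # two socks, different pairs
        (2 * a * s,          (a - 1, s,     w)),      # pair sock + orphan
        (2 * a * w,          (a - 1, s + 1, w - 1)),  # pair sock + white
        (comb(s, 2),         (a,     s - 2, w)),      # two orphans
        (s * w,              (a,     s - 1, w - 1)),  # orphan + white
    ]
    p = sum(ways * fail(*nxt) for ways, nxt in moves if ways > 0)
    return p / total  # matching draws (same pair, or two whites) end the process

# Sample test: 2 patterned pairs, 3 white socks.
print(fail(2, 0, 3))  # 0.45714285714285713
```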

pr.probability – The reason why a test is undersized?

I have a statistic $T_n$ for testing $H_0$ versus $H_1$, and I have proved that
$$n T_n \rightarrow_d \chi_K^2$$
under $H_0$. An asymptotic $\chi^2$ test can then be used: an asymptotic level-$\alpha$ test of the null hypothesis is obtained by rejecting $H_0$ whenever
\begin{equation*}
T_n > n^{-1} \chi_{K, 1 - \alpha}^2.
\end{equation*}

When I run simulations, however, I encounter a big problem. Even when $n$ is large (15,000, for instance), the empirical rejection rate under $H_0$ is much smaller than $\alpha$! Does this mean that the theoretical properties of $T_n$ I have proved may be wrong? Or is this a common phenomenon in chi-square testing, related to some more involved questions?
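It can help to first validate the simulation machinery on a toy statistic whose $\chi^2$ limit is known to hold. A minimal sketch of the size check, assuming a Wald-type statistic with a $\chi^2_1$ limit (my own toy example, not your $T_n$):

```python
# Estimate the empirical size of an asymptotic chi-square test under H0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, K, alpha, reps = 15_000, 1, 0.05, 2_000
crit = stats.chi2.ppf(1 - alpha, df=K)

rejections = 0
for _ in range(reps):
    x = rng.normal(size=n)                # data generated under H0
    nT = n * x.mean()**2 / x.var(ddof=1)  # n*T_n with a chi^2_1 limit under H0
    rejections += nT > crit
print(rejections / reps)                  # should be close to alpha
```

If a check like this comes out near $\alpha$ but your own statistic does not, common culprits are slow convergence to the limit, a limiting distribution that is stochastically smaller than $\chi_K^2$ (e.g., too many degrees of freedom assumed), or an error in the critical value.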

pr.probability – How to Calculate 5 to 11-Card Poker Straights When Dealt 11 Cards from 8-Deck Shoe

I'm looking for how to calculate the odds that 11 cards randomly dealt from a shuffled 8-deck shoe (eight standard 52-card poker decks, no jokers) contain a 5-card straight, and then the odds of a 6-card straight among those same 11 cards, and likewise for 7-, 8-, 9-, 10-, and 11-card straights.

I understand how to calculate the odds of a straight formed from 5 cards dealt from a single 52-card deck. There are plenty of resources for calculating poker hand probabilities dealt from a single-deck shoe (i.e., a single deck of 52 cards), but I cannot for the life of me find any resources online regarding poker probability math from a multi-deck shoe, whether that’s 2 decks, 4 decks, 8 decks as in my case, etc.

This problem also has the compounding factor that it's not just five-card stud: 11 cards are dealt.

This seems like a relatively trivial problem, but I'm afraid I'm going to mess up my calculations without realizing my error. I'd love any assistance, thanks!
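While not a closed form, a Monte Carlo estimate is a useful sanity check against hand calculations. A sketch, assuming a straight only needs $r$ consecutive ranks present (suits are irrelevant) and that the ace plays both low and high; adjust to your house rules.

```python
# Estimate P(an r-card straight occurs among 11 cards from an 8-deck shoe).
import random

RANKS = 13
SHOE = [r for r in range(RANKS) for _ in range(4 * 8)]  # 8 decks, ranks only

def has_straight(present, r):
    ext = present + [present[0]]  # rank 13 duplicates the ace (ace-high)
    return any(all(ext[s:s + r]) for s in range(15 - r))

def estimate(r, trials=200_000):
    hits = 0
    for _ in range(trials):
        present = [False] * RANKS
        for c in random.sample(SHOE, 11):
            present[c] = True
        hits += has_straight(present, r)
    return hits / trials

for r in range(5, 12):
    print(r, estimate(r))
```

An exact answer could then be assembled by inclusion–exclusion over which ranks appear among the 11 cards, since every rank has 32 copies in the shoe.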

pr.probability – When are all average trajectories of $w_{k+1}=Aw_k+b$ bounded?

Below is an open problem in my field, and I'm wondering if someone has insights I'm missing. (cross-posted on math.se)

Suppose observation $x$ is drawn from some distribution $\mathcal{D}$, $w_0 \in \mathbb{R}^d$, and my update has the following form:

$$w_{k+1}=(I-xx')w_k+b$$

When are average trajectories $u_k=E(w_k)$ bounded? Expectation is taken over all sequences of IID observations $x_1,\ldots,x_k$. Motivation for this recurrence is here.

I have worked out the case of $b=0$ and I’m wondering if global asymptotic stability for $b=0$ also implies boundedness for some other value of $b$. An interesting special case is when $x$ comes from Gaussian centered at 0.

Cases of increasing difficulty are

  1. $b=0$
  2. $b=c$ is some fixed vector $c$ from $\mathbb{R}^d$
  3. $b=Bx$ for some matrix $B$
  4. $b=d$ where $d$ is drawn from some distribution $\mathcal{D}_2$ independent of $x$
  5. $b=c+Bx+d$
  6. $b,x$ are drawn jointly from some distribution $\mathcal{D}_3$

For the case of $b=0$, the condition below seems to be necessary and sufficient for all trajectories $u_k$ to converge to $0$. The following must hold for all symmetric matrices $A$:

$$E((x'Ax)^2)<2E(x'A^2x)$$

For the case of $x$ coming from a zero-centered Gaussian with covariance $\Sigma$, this becomes (using identities 20.18 and 20.25c of Seber's Matrix Handbook)

$$(\text{Tr}(A\Sigma))^2+2\,\text{Tr}\big((A\Sigma)^2\big)<2\,\text{Tr}(A^2\Sigma)$$
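Not an insight, but individual cases are cheap to probe numerically. A Monte Carlo sketch for case 2 ($b=c$ fixed), assuming $x \sim N(0, \sigma^2 I)$; the dimension, $\sigma$, and all sizes are arbitrary choices (with $\Sigma = \sigma^2 I$ and $A = I$, the condition above suggests stability requires roughly $\sigma^2 < 2/(d+2)$).

```python
# Estimate u_k = E(w_k) for w_{k+1} = (I - x x')w_k + c by averaging many
# independent trajectories, and print ||u_k|| to eyeball boundedness.
import numpy as np

rng = np.random.default_rng(0)
d, trials, steps = 3, 20_000, 200
sigma = 0.5               # x ~ N(0, sigma^2 I); chosen inside the stable regime
c = np.ones(d)            # case 2: b = c, a fixed vector

w = np.zeros((trials, d)) # one row per independent trajectory
for k in range(steps):
    x = sigma * rng.standard_normal((trials, d))
    # (x x') w = x * (x . w), applied row-wise
    w = w - x * np.einsum('ij,ij->i', x, w)[:, None] + c
    if (k + 1) % 50 == 0:
        print(k + 1, np.linalg.norm(w.mean(axis=0)))  # estimate of ||u_k||
```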