commutative algebra – Is $\widehat{IM}=\widehat I \widehat M$ for any finitely generated module $M$, where $\widehat{(-)}$ denotes completion w.r.t. the maximal ideal?

Let $(R,\mathfrak m)$ be a Noetherian local ring, and let $(\widehat R, \widehat{\mathfrak m})$ be its $\mathfrak m$-adic completion. Let $M$ be a finitely generated $R$-module and let $I$ be an ideal of $R$.

Then, is it true that $\widehat{IM}=\widehat I \widehat M$?

I know this is true if $I=\mathfrak m$, as explained in the question “If $M$ be a finitely generated module over a Noetherian ring $A$, then $\widehat{aM}=\hat{a} \hat{M}$”, but I’m not sure what happens for general $I$. Please help.

matrices – Matrix derivative w.r.t. a general inverse form: $(A^TA)^{-1/2}D(A^TA)^{-1/2}$

I want to find the derivative of the matrix $(A^TA)^{-1/2}D(A^TA)^{-1/2}$ w.r.t. $A_{ij}$, where $D$ is a diagonal matrix. Alternatively, it would also be fine to have

$$\frac{\partial}{\partial A_{ij}} a^T(A^TA)^{-1/2}D(A^TA)^{-1/2}b$$

Is there any reference for such a problem? I have the Matrix Cookbook, which gives results when $D=I$. But what does this general form evaluate to?

To give more information, the empirical distribution of the diagonal entries of $D$ converges to some known distribution.

matrices – Derivative of a Matrix w.r.t. its Matrix Square, $\frac{\partial \text{vec}X}{\partial\text{vec}(XX')}$

Let $X$ be a nonsingular square matrix.

What is
$$\frac{\partial \text{vec}X}{\partial\text{vec}(XX')},$$

where the vec operator stacks all columns of a matrix in a single column vector?

It is easy to derive that
$$\frac{\partial\text{vec}(XX')}{\partial \text{vec}X} = (I + K)(X \otimes I),$$

where $K$ is the commutation matrix that is defined by
$$\text{vec}(X) = K\text{vec}(X').$$

Now $(I + K)(X \otimes I)$ is a singular matrix, so that the intuitive solution
$$\frac{\partial \text{vec}X}{\partial\text{vec}(XX')} = \left( \frac{\partial\text{vec}(XX')}{\partial \text{vec}X} \right)^{-1}$$

does not work.
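
The singularity can be checked numerically on a small case. The following is a minimal sketch for 2×2 matrices, with hypothetical helper names that are not part of the question. It assembles $(I + K)(X \otimes I)$ from its two factors and compares it with a central finite-difference Jacobian of vec(XX'). Because XX' is symmetric, the Jacobian rows belonging to entries (i,j) and (j,i) coincide, which is one way to see that the matrix cannot be inverted.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

constexpr int N = 2;      // matrix size, kept small for illustration
constexpr int M = N * N;  // length of vec(X)
using Vec = std::array<double, M>;
using Mat = std::array<std::array<double, M>, M>;

// Position of entry (i, j) in vec(X): columns are stacked, i.e. column-major.
constexpr int idx(int i, int j) { return i + N * j; }

// vec(X X') computed from vec(X).
Vec vec_xxt(const Vec& x) {
    Vec out{};
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < N; ++k)
                out[idx(i, j)] += x[idx(i, k)] * x[idx(j, k)];
    return out;
}

// Analytic Jacobian (I + K) kron(X, I), assembled from its two factors.
Mat analytic_jacobian(const Vec& x) {
    Mat kron{}, i_plus_k{}, jac{};
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            for (int q = 0; q < N; ++q)
                kron[idx(i, j)][idx(i, q)] = x[idx(j, q)];  // kron(X, I) entry X_{jq}
            i_plus_k[idx(i, j)][idx(i, j)] += 1.0;          // identity part
            i_plus_k[idx(i, j)][idx(j, i)] += 1.0;          // commutation matrix K
        }
    for (int r = 0; r < M; ++r)
        for (int c = 0; c < M; ++c)
            for (int k = 0; k < M; ++k)
                jac[r][c] += i_plus_k[r][k] * kron[k][c];
    return jac;
}

// Central finite-difference Jacobian of vec(X X') with respect to vec(X).
Mat numeric_jacobian(const Vec& x, double h = 1e-5) {
    assert(h > 0.0);
    Mat jac{};
    for (int c = 0; c < M; ++c) {
        Vec plus = x, minus = x;
        plus[c] += h;
        minus[c] -= h;
        const Vec fp = vec_xxt(plus), fm = vec_xxt(minus);
        for (int r = 0; r < M; ++r) jac[r][c] = (fp[r] - fm[r]) / (2.0 * h);
    }
    return jac;
}

// Largest entrywise absolute difference between two matrices.
double max_abs_diff(const Mat& a, const Mat& b) {
    double worst = 0.0;
    for (int r = 0; r < M; ++r)
        for (int c = 0; c < M; ++c)
            worst = std::max(worst, std::fabs(a[r][c] - b[r][c]));
    return worst;
}
```

Calling `analytic_jacobian` and `numeric_jacobian` on the same `Vec` and comparing them with `max_abs_diff` checks the formula; comparing rows `idx(0, 1)` and `idx(1, 0)` of the result exhibits the rank deficiency. This only confirms the singularity and does not settle whether the Moore-Penrose inverse is the right object.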

Is the solution simply the Moore-Penrose inverse of $(I + K)(X \otimes I)$, or is it more complicated?

dg.differential geometry – Comparison inequality between Sobolev-seminorm w.r.t spherical uniform distribution and gaussian distribution

Let $d$ be a large positive integer. Let $f:\mathbb R^d \to \mathbb R$ be a continuously differentiable function. Let $X$ be uniformly distributed on the unit sphere $S_{d-1} := \{x \in \mathbb R^d \mid \|x\| = 1\}$ and let $Z$ be a random vector in $\mathbb R^d$ with iid coordinates from $N(0,1/d)$. Let $\nabla_{S_{d-1}}f:S_{d-1} \to \mathbb R^d$ be the spherical gradient of $f$, defined by $\nabla_{S_{d-1}} f(x) :=(I_d-xx^\top)\nabla f(x)$, where $\nabla f(x)$ is the usual (Euclidean) gradient of $f$ at $x$.

Question. Is there any comparison inequality between $\mathbb E\|\nabla_{S_{d-1}} f(X)\|^2$ and $\mathbb E\|\nabla f(Z)\|^2$, meaning the existence of absolute constants $c,C>0$ independent of $f$ and $d$, such that $c\, \mathbb E\|\nabla_{S_{d-1}} f(X)\|^2 \le \mathbb E\|\nabla f(Z)\|^2 \le C\, \mathbb E\|\nabla_{S_{d-1}} f(X)\|^2$?

calculus and analysis – Computing Partial Derivatives of a Function wrt Another Function

I have two functions $P_1,Q_1$ which both depend on the variables $r_{T_m}, r_{i_f}$, which are themselves functions of $T_m, mi_f$. I am interested in computing the Jacobian

$$\frac{\partial (P_1,Q_1)}{\partial(r_{T_m}^2,r_{i_f}^2)}\Bigg|_{\substack{r_{T_m}=f(T_m) \\ r_{i_f}=g(mi_f)}}.$$

To do that, I first define the function

powerD[f_, x_^(k_.)] := powerD[f, {x^k, 1}];
powerD[f_, {x_^(k_.), 0}] := f;
powerD[f_, vars__] := Fold[powerD, f, {vars}];
powerD[f_, {x_^(k_.), n_Integer?Positive}] := Det[Append[Table[(j!/i!) Binomial[k i, j] x^(k i - j), {i, n - 1}, {j, n}], Table[D[f, {x, j}], {j, n}]]]/(k x^(k - 1))^Binomial[n + 1, 2];

(because I want to compute the derivative w.r.t. a power of $r_{T_m}, r_{i_f}$), then I want to substitute $r_{T_m}=f(T_m)$ and $r_{i_f}=g(mi_f)$. However, something in the code fails at this point. I think it is because instead of first computing the partial derivatives $\frac{\partial P_1}{\partial r^2_{T_m}}$ etc. and then substituting $r_{T_m}=f(T_m)$, it tries to compute $\frac{\partial P_1}{\partial f^2(T_m)}$.
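
As a sanity check on the intended order of operations (a worked identity under the usual chain rule, not a diagnosis of the code), the first derivative with respect to a square is

$$\frac{\partial P_1}{\partial (r_{T_m}^2)} = \frac{1}{2\,r_{T_m}}\,\frac{\partial P_1}{\partial r_{T_m}}, \qquad \frac{\partial P_1}{\partial (r_{T_m}^2)}\Bigg|_{r_{T_m}=f(T_m)} = \left[\frac{1}{2\,r_{T_m}}\,\frac{\partial P_1}{\partial r_{T_m}}\right]_{r_{T_m}=f(T_m)}.$$

This agrees with the $n=1$, $k=2$ case of the definition above, which reduces to the ordinary derivative divided by $k\,x^{k-1}$. The derivative is taken while $r_{T_m}$ is still a plain symbol, and $f(T_m)$ is substituted only afterwards.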

I attach here the code ($P_1,Q_1$ are actually functions of other parameters too, but I substitute these other parameters with the needed numerical values before computing the Jacobian).

The function $P_1(r_{T_m},r_{i_f})$ is given by

P1[230 1.73, 0.2, 0.0072, 100 3.14, rTm, rif] = -1.12469*10^-8 (1.71055*10^13 - 112.657 rif^2 + 112.657 rTm^2 - 9.60432*10^-10 (-5.66361*10^20 - 3.52759*10^9 rif^2 + 3.52759*10^9 rTm^2 + Sqrt[(5.66361*10^20 + 3.52759*10^9 rif^2 - 3.52759*10^9 rTm^2)^2 - 3.26026*10^16 (3.2715*10^26 - 3.83021*10^15 rif^2 + 12691.7 rif^4 - 4.07533*10^15 rTm^2 - 25383.3 rif^2 rTm^2 + 12691.7 rTm^4)]))

and the function $Q_1(r_{T_m},r_{i_f})$ is given by

Q1[230 1.73, 0.2, 0.0072, 100 3.14, rTm, rif] = 6.13447*10^-17 (-5.66361*10^20 - 3.52759*10^9 rif^2 + 3.52759*10^9 rTm^2 + Sqrt[(5.66361*10^20 + 3.52759*10^9 rif^2 - 3.52759*10^9 rTm^2)^2 - 3.26026*10^16 (3.2715*10^26 - 3.83021*10^15 rif^2 + 12691.7 rif^4 - 4.07533*10^15 rTm^2 - 25383.3 rif^2 rTm^2 + 12691.7 rTm^4)])

I compute the elements of the (2×2) Jacobian matrix as

J11[rTm_, rif_] := FullSimplify[powerD[P1[230 1.73, 0.2, 0.0072, 100 3.14, rTm, rif], rTm^2]]
J12[rTm_, rif_] := FullSimplify[powerD[P1[230 1.73, 0.2, 0.0072, 100 3.14, rTm, rif], rif^2]]
J21[rTm_, rif_] := FullSimplify[powerD[Q1[230 1.73, 0.2, 0.0072, 100 3.14, rTm, rif], rTm^2]]
J22[rTm_, rif_] := FullSimplify[powerD[Q1[230 1.73, 0.2, 0.0072, 100 3.14, rTm, rif], rif^2]]

and then I build the full Jacobian matrix as

J[rTm_, rif_] := {{J11[rTm, rif], J12[rTm, rif]}, {J21[rTm, rif], J22[rTm, rif]}}

Up to here everything is okay, but then when I substitute
$$r_{T_m}=f(T_m)=\sqrt{\frac{(230\cdot1.73)^4 + 4\cdot(230\cdot1.73)^2\cdot0.2\cdot100\cdot3.14\cdot T_m}{4\cdot(0.2)^2}}$$

$$r_{i_f}=g(mi_f)=\sqrt{\frac{230\cdot1.73\cdot mi_f\cdot100\cdot3.14}{(0.2)^2 + (100\cdot3.14\cdot 0.0072)^2}}$$

with the command

J[Sqrt[((230 1.73)^4 + 4 (230 1.73)^2 0.2 100 3.14 Tm)/(4 (0.2)^2)], Sqrt[((230 1.73) mif 100 3.14)/((0.2)^2 + (100 3.14 0.0072)^2)]]

I get the errors
“6.25 (2.50666*10^10 + 3.97711*10^7 Tm) is not a valid variable”,
“24254.6 mif is not a valid variable.”

I started using the software only today, so I am sure this is an easy mistake to fix; maybe I should assign the values differently to the various Jacobian elements $J_{i,j}$.

complexity theory – Consequence of NP-complete, and DP-complete w.r.t. randomized reductions

If a problem is NP-complete with respect to randomized (polynomial time) reductions, but not with respect to deterministic reductions, then we have P $\neq$ BPP (See Question 2 here and its answer).

Suppose a decision problem is proved to be NP-complete; and it is also proved to be DP-complete with respect to randomized reductions. Does this have any real consequence?

Given a graph $G$, it is coNP-complete to test whether $G$ has exactly one 3-colouring up to swapping of colours (because the “another solution” problem associated with 3-colouring is NP-complete (1)). The same problem is DP-complete with respect to randomized reductions (2).

(1) Dailey, David P., Uniqueness of colorability and colorability of planar 4-regular graphs are NP-complete, Discrete Math. 30, 289-293 (1980). ZBL0448.05030.

(2) Barbanchon, Régis, On unique graph 3-colorability and parsimonious reductions in the plane, Theor. Comput. Sci. 319, No. 1-3, 455-482 (2004). ZBL1043.05043.

c++ – Gradient of regularizer w.r.t. weights in L2 regularization

In the empirical risk minimization formula that I am using for my neural net, the formula for updating the weights looks as follows:

Delta = (-Gradient of weight) - (lambda * gradient of regularizer)
Weight = Weight + alpha * Delta

In the source that I’m referencing, the gradient of the L2 regularizer is simply:

2 * weight

So I plug it into my code:

double regularizegrad(double weight)
{
    return 2 * weight;
}

for (size_t I = 0; I < Inlayer.size(); I++)
{
    double PD{};
    for (size_t iw = 0; iw < Inlayer[I].weights.size(); iw++)
    {
        PD = (Inlayer[I].weightderivs[iw] * -1) - (Lambda * regularizegrad(Inlayer[I].weights[iw], "L1"));
        Inlayer[I].weights[iw] = Inlayer[I].weights[iw] + (Alpha * PD);
    }
}

Once I run this in my net, I instantly get NAN(ind), no matter what I put as my lambda and alpha values.

But if I use the gradient of L1 regularization (which is sign(Weight), according to my source):

double regularizegrad(double weight)
{
    if (weight > 0) return 1;
    else if (weight < 0) return -1;
    else if (weight == 0) return 0;
    else return 2; //error
}

I have no problem whatsoever.
So my question is, is that the right way to calculate the gradient of regularizer for L2? I am now, as a result, doubting if I even did the L1 right, but the net definitely works if I use L1.
But yeah, simply unsure if the code for L2 gradient is correct.
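
For reference, the update rule quoted above can be written out for a single weight. This is a minimal sketch with hypothetical names (`l2_penalty_grad`, `updated_weight`), assuming the penalty is the squared weight so that its gradient is 2 * weight; it is not taken from the source being referenced.

```cpp
#include <cassert>
#include <cmath>

// Gradient of the L2 penalty w^2 with respect to w.
double l2_penalty_grad(double weight) { return 2.0 * weight; }

// One step of the quoted rule:
//   Delta  = (-loss_grad) - lambda * penalty_grad
//   Weight = Weight + alpha * Delta
double updated_weight(double weight, double loss_grad, double alpha, double lambda) {
    // Fail early if a NaN or infinity is already present in the inputs.
    assert(std::isfinite(weight) && std::isfinite(loss_grad));
    assert(alpha >= 0.0 && lambda >= 0.0);
    const double delta = -loss_grad - lambda * l2_penalty_grad(weight);
    return weight + alpha * delta;
}
```

With a zero loss gradient, each step multiplies the weight by (1 - 2 * alpha * lambda), so the penalty term on its own only makes the weights grow when alpha * lambda exceeds 1. If NaN appears for small alpha and lambda as well, the cause may lie elsewhere, for example in the loss gradient.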

reference request – Equidistributed sequence wrt exponential/Gaussian measure

For an arbitrary measure space $(X,\mu)$, a sequence $(x_n)$ in $X$ is said to be equidistributed with respect to $\mu$ if the measures $\frac 1 n \sum_{1\le k\le n} \delta_{x_k}$ converge weakly to $\mu$ as $n \to \infty$.

The special case of a compact interval equipped with the Lebesgue measure is very well studied. One way of building equidistributed sequences is to construct an ergodic transformation preserving the measure. This post sketches such a construction for the Lebesgue measure on the real line.
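
To make the definition concrete in the interval case, the sketch below (hypothetical helper names, not taken from the linked post) generates the Kronecker sequence $x_k = \{k\alpha\}$ and measures $\sup_t |F_n(t) - t|$, the distance between the empirical CDF of the first $n$ terms and the uniform CDF on $[0,1]$. For a limit with continuous CDF, such as Lebesgue measure on $[0,1]$, weak convergence of the empirical measures is equivalent to this distance tending to 0. It illustrates the uniform case only and does not address the exponential or Gaussian question.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// First n terms of the Kronecker sequence x_k = frac(k * alpha), k = 1..n.
std::vector<double> kronecker_sequence(std::size_t n, double alpha) {
    std::vector<double> xs;
    xs.reserve(n);
    for (std::size_t k = 1; k <= n; ++k) {
        const double v = static_cast<double>(k) * alpha;
        xs.push_back(v - std::floor(v));  // fractional part
    }
    return xs;
}

// sup_t |F_n(t) - t| for points in [0, 1], where F_n is their empirical CDF.
double uniform_cdf_distance(std::vector<double> xs) {
    assert(!xs.empty());
    std::sort(xs.begin(), xs.end());
    const double n = static_cast<double>(xs.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        // The supremum is attained just before or at one of the sorted points.
        worst = std::max(worst, std::fabs((i + 1) / n - xs[i]));
        worst = std::max(worst, std::fabs(xs[i] - i / n));
    }
    return worst;
}
```

Calling `uniform_cdf_distance(kronecker_sequence(n, alpha))` with an irrational `alpha`, such as the golden-ratio conjugate, for increasing `n` shows the distance shrinking.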

Is there any known explicit construction of an equidistributed sequence for the measure associated to an exponential or Gaussian random variable, preferably that doesn’t involve applying the cumulative distribution function/its inverse?

router – DD WRT drops internet access

I have a TP-Link wireless router (model WR841N v3) with DD WRT on it (don’t ask why it was here like this).

There are 3-5 devices connected to it wirelessly, and sometimes one or more of them drop the connection. When this happens I can still access the DD WRT GUI at its current local IP, but I can’t connect to the internet. If I change the Local IP in the ‘Basic settings’ of DD WRT and apply the new settings, I can connect to the internet again, but the DD WRT GUI becomes inaccessible (timeout). In the former case (no internet, accessible DD WRT GUI), ipconfig /all says that the Wireless LAN adapter’s default gateway is the same as the Local IP set up in DD WRT. In the latter case it shows something different than the Local IP. I have very limited (practically no) networking knowledge.

There were no changes to DD WRT’s default settings (as far as I know); only the WiFi was configured (default settings, security setup, etc.). Do you have any guess why this is happening?

bipartite matching – Tight sets w.r.t. Hall’s condition

Consider a bipartite graph G=(U+V,E) and suppose that G has a perfect matching. Then, by P. Hall’s condition, for every subset A of U, the neighborhood N(A) of A has size at least |A|.
I am interested in subsets A for which N(A) has the same cardinality as A. Do they have a name?

It seems that any perfect matching must contain a sub-matching between A and N(A).
Is there an algorithm to identify the largest such set?
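
For small graphs, the sets with |N(A)| = |A| can simply be enumerated. The sketch below is a brute-force illustration under assumed names (`neighbourhood`, `tight_sets`) and an assumed encoding, in which each vertex of U stores its neighbours as a bitmask over V. It is exponential in |U| and is not offered as an answer to the algorithmic question.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// adj[u] is a bitmask of the V-side neighbours of vertex u in U.
// Returns N(A) as a bitmask, where the subset A of U is itself a bitmask.
std::uint32_t neighbourhood(const std::vector<std::uint32_t>& adj, std::uint32_t subset) {
    std::uint32_t result = 0;
    for (std::size_t u = 0; u < adj.size(); ++u)
        if (subset & (1u << u)) result |= adj[u];
    return result;
}

// Number of set bits, i.e. the cardinality of the encoded set.
int popcount32(std::uint32_t x) {
    return static_cast<int>(std::bitset<32>(x).count());
}

// Every non-empty subset A of U (as a bitmask, in increasing order) with |N(A)| == |A|.
std::vector<std::uint32_t> tight_sets(const std::vector<std::uint32_t>& adj) {
    assert(adj.size() < 32);  // the bitmask encoding only covers small graphs
    std::vector<std::uint32_t> result;
    const std::uint32_t limit = 1u << adj.size();
    for (std::uint32_t a = 1; a < limit; ++a)
        if (popcount32(neighbourhood(adj, a)) == popcount32(a)) result.push_back(a);
    return result;
}
```

For instance, take U = {u0, u1, u2} with u0 adjacent to v0, u1 to v0 and v1, and u2 to v0, v1 and v2. Then `tight_sets({1, 3, 7})` returns the masks 1, 3 and 7, that is {u0}, {u0, u1} and U itself. Since G has a perfect matching, U always satisfies |N(U)| = |U|, so the interesting case is the largest proper subset.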