Edit distance proof when last chars of 2 strings are the same

from Wagner–Fischer_algorithm:

I understand substitution cost needs to be adjusted, either 0 or 1:

 for j from 1 to n:
      for i from 1 to m:
          if s(i) = t(j):
            substitutionCost := 0
            substitutionCost := 1

          d(i, j) := minimum(d(i-1, j) + 1,                   // deletion
                             d(i, j-1) + 1,                   // insertion
                             d(i-1, j-1) + substitutionCost)  // substitution

But in later proof part, it claims when s(i) == t(j), d(i, j) = d(i-1, j-1):

If s(i) = t(j), and we can transform s(1..i-1) to t(1..j-1) in k operations, then we can do the same to s(1..i) and just leave the last character alone, giving k operations.

I don’t understand, shouldn’t it be d(i, j) = min(d(i-1, j-1), d(i-1, j) + 1, d(i, j-1)+1)?

It’s also mentioned in edit distance wiki

enter image description here

How do you prove if the last chars are matched, d(i, j) is d(i-1, j-1)?

following the lemma here,

I can think one way to prove this is to assert minimal is not from d(i-1, j-1) when s(i) = t(j). Then draw contradiction.

if minimal is not from d(i-1, j-1), the minimum must come from

  1. deleting last char from s, s(i).
  2. or inserting last char from t, t(j).

If we go with 1, we need to match s(0..i-1) with t(0..j).

A: if s(i-1) != t(j),

  1. insert t(j) to end of s, we are rewinding to the original state.
  2. modify s(i-1) to t(j), it’s equvilent to deleting s(i-1) (+1) when matching s(i-1) with t(j-1), whose minimum is d(i-1, j-1).
  3. delete s(i-1) -> only viable way

B: if s(i-1) = t(j),

  1. no insert t(j), otherwise the previous deletion is redundant.
  2. no modification: equivalent to 2nd point in discussion A.
  3. deletion -> viable

Combine A and B, only deletion is acceptable and it’s impossible to match if only deletion is performed on s.

Do you guys think this proof is correct?