algorithm – How can I morph between images in real time?


“it has traditionally been a brutal algorithm to force your computer to perform.”

This may have been true when the Elastic Reality software you link to was created in the mid-90s. Back then, CPUs were measured in MHz. Today, even your phone is likely 10-20x faster, with two to six times as many cores, not even counting the graphics acceleration hardware. Computers today are more than up to the challenge.

This also isn’t something you need a specialized engine feature to support. Modern graphics pipelines are built to be extremely flexible, so you can create this kind of effect for yourself with just a little shader work.

Here’s a simple example I cooked up, just to test how far we could get with a fairly naive approach:

Photo of Douglas Gregory morphing into his roommates' cat

(Using a couple of images I had handy – photo of me taken at GDC by James Everett, photo of my roommates’ cat by me 😉)

As usual, we’ll place pairs of point markers to designate corresponding feature points in each image. More full-featured solutions will work with line segments or Bézier curves, but we can get a pretty decent look with just points for starters.

From these points, I generated a displacement map that encodes the travel each pixel needs to undergo: the red and green channels store the horizontal and vertical displacement, respectively, for the pixel that starts at this location in the source image, while the blue and alpha channels store the same for the pixel that arrives at this location in the destination image.

Displacement map
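
To pin down that packing: each displacement component is rescaled by the maximum travel so it fits the -1 to 1 range, then remapped to the 0-1 colour range. Here's a two-line sketch of the per-channel encode and the matching decode the shader performs (the variable names are just for illustration):

dx_encoded = (dx / max_travel) * 0.5 + 0.5          # [-1, 1] displacement -> [0, 1] colour value
dx_decoded = (dx_encoded - 0.5) * 2.0 * max_travel  # what the shader undoes at display time
                                                    # (the shader below centres on 128.5/256 rather than 0.5,
                                                    #  so a stored byte of 128 decodes to roughly zero travel)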

Now the work the shader has to do at display time is dead simple. All it needs is the source and destination images, the morph displacement map above, and a progress parameter to set how far through the morph it should display for the current frame. Then each frame, for each rendered pixel, it will:

  1. Sample this pixel’s location in the displacement map, and decode it from the 0-1 colour space to the -1 to 1 displacement range.

  2. Backtrack along the displacement vectors to find which pixel in the source image should have arrived here. Sample the source image at that backtracked point.

  3. Scan ahead along the displacement vectors to find which pixel in the destination image should come from here. Sample the destination image at that look-ahead point.

  4. Cross-fade between these two displaced texture lookups.

A sharp reader will note this is a pretty crude approximation. The displacement vectors we look up are only strictly correct at the very beginning and very end of the morph. In-between, we’re just counting on the morph map being smooth enough that the vectors we read halfway between our source and destination are “pretty close” to the direction this pixel needs to be moving at this moment in the blend.

If you tried to do a very aggressive shape change this way, the artifacts would likely become very noticeable. But for blending a face to another face at a similar angle, it does pretty OK.

The exact shader code (or graph) will depend on the engine or tool stack you’re using, but here’s an example of this strategy in Unity ShaderLab:

Shader "Unlit/Morpher"
{
    Properties
    {
        _MainTex ("Source", 2D) = "white" {}
        _MainTex2("Destination", 2D) = "white" {}
        _Morph("Morph Map", 2D) = "white" {}
        _MaxTravel("Max Travel", Vector) = (1, 1, 1, 1)
        _Progress ("Morph Progress", Range(0, 1)) = 0
    }
    SubShader
    {
        Tags { "RenderType"="Opaque" }
        LOD 100

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag

            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
                float2 uv : TEXCOORD0;
            };

            struct v2f
            {
                float2 uv : TEXCOORD0;
                float4 vertex : SV_POSITION;
            };

            sampler2D _MainTex;
            sampler2D _MainTex2;
            sampler2D _Morph;

            float4 _MainTex_ST;
            float4 _MaxTravel;

            float _Progress;

            v2f vert (appdata v)
            {
                v2f o;
                o.vertex = UnityObjectToClipPos(v.vertex);
                o.uv = TRANSFORM_TEX(v.uv, _MainTex);
                return o;
            }

            // Simple sigmoid / smoothstep function.
            float sigmoid(float t) {
                return 3 * t*t - 2 * t*t*t;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                // Enable to animate automatically over time, rather than a manual parameter input.
                //_Progress = 0.5f * (1.0f + sin(_Time.x * 10.0f));                     

                // Decode the displacement vector into the correct range.
                float4 displacement = (tex2D(_Morph, i.uv) - 128.5f/256.0f) * 2.0f * _MaxTravel.xyxy;

                // Biased progress curve that starts and ends slower,
                // and transitions faster in the middle.
                float sig = sigmoid(_Progress);

                // At the start of our morph, the source vectors are more accurate.
                // At the end, the destination vectors are more accurate.
                // So we'll transition from using one set to the other.
                float2 travel = lerp(displacement.xy, displacement.zw, sig);

                // Estimate which source pixel should have arrived to here.
                float2 sourcePos = i.uv - travel * _Progress;
                
                // Then repeat the estimation in reverse to find our matching destination pixel.
                float2 destPos = i.uv + travel * (1.0f - _Progress);
                
                // Sample the source and destination textures.
                fixed4 col = tex2D(_MainTex, sourcePos);
                fixed4 col2 = tex2D(_MainTex2, destPos);

                // Cross-fade between the two samples.
                return lerp(col, col2, sig);
            }
            ENDCG
        }
    }
}

You can see there’s not a lot to this shader. We could probably have hundreds of these rendering simultaneously, even on a modest GPU – the limiting factor is likely to be our texture memory bandwidth for fetching the source/destination photo pixels, not the morphing computations.

That’s because we did all the hard work in advance when we built the displacement map.

The strategy for that is (there's a code sketch after this list):

  1. Loop over all the source and destination points, and build a list of displacement vectors that take each source to its destination.

    In my version, I also keep track of the maximum displacement in this pass, so I can scale everything relative to this and make maximum use of the limited precision in my colour displacement map.

  2. Loop over each pixel in your displacement map.

    • For each pixel, loop over all your feature point pairs.

    • Compute the distance of this pixel from the source point of the pair.

    • Turn this distance into a weight that’s extremely high at the point itself, and low elsewhere. (I use the inverse squared distance, plus a little epsilon to avoid division by zero)

    • Scale the pair’s displacement vector by this weight, and add it to an accumulated sum. Also accumulate the weight itself.

    • After you’ve processed all the point pairs, divide the accumulated weighted displacement vectors by the accumulated weight. Now you have a weighted average.

    • Convert the vector into the range 0-1 (or 0-255), and store its x & y values in the red and green channels of your texture.

    • Repeat for the blue and alpha channels, except now it’s the distance from the destination point in each pair that’s responsible for computing the weight.

  3. Save/upload your modified texture.
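
Here's a minimal sketch of that build step, written in Python with NumPy and Pillow rather than the tool I actually used for the images above – the function name, the scalar max-travel rescaling, and the output filename are purely illustrative, and it assumes your feature points are given in 0-1 UV coordinates:

import numpy as np
from PIL import Image

def build_morph_map(pairs, size=256, epsilon=1e-4):
    # pairs: list of ((sx, sy), (dx, dy)) feature-point pairs in 0-1 UV space.
    src = np.array([p[0] for p in pairs], dtype=np.float32)  # source points, shape (N, 2)
    dst = np.array([p[1] for p in pairs], dtype=np.float32)  # destination points, shape (N, 2)
    disp = dst - src                 # step 1: vectors taking each source point to its destination
    max_travel = np.abs(disp).max()  # track the maximum displacement so we can rescale into [-1, 1]

    # UV coordinate of every texel centre in the map, shape (size, size, 2).
    xs, ys = np.meshgrid(np.arange(size), np.arange(size))
    coords = (np.stack([xs, ys], axis=-1) + 0.5) / size

    def weighted_field(anchors):
        # Step 2: inverse-squared-distance weights from each texel to every anchor point,
        # then the weighted average of the pair displacements.
        d2 = ((coords[:, :, None, :] - anchors[None, None, :, :]) ** 2).sum(axis=-1)
        w = 1.0 / (d2 + epsilon)
        return (w[..., None] * disp[None, None, :, :]).sum(axis=2) / w.sum(axis=2, keepdims=True)

    # Red/green channels: travel for the pixel that starts here in the source image.
    # Blue/alpha channels: travel for the pixel that arrives here in the destination image.
    rg = weighted_field(src) / max_travel
    ba = weighted_field(dst) / max_travel

    # Step 3: remap [-1, 1] to [0, 1], quantise to bytes, and save the texture
    # (a uint8 array of shape (size, size, 4) is written out as RGBA; depending on
    # your engine's UV convention you may need to flip it vertically with np.flipud).
    rgba = np.concatenate([rg, ba], axis=-1) * 0.5 + 0.5
    Image.fromarray(np.round(rgba * 255).astype(np.uint8)).save("morph_map.png")
    return max_travel  # feed this into the shader's _MaxTravel property

The returned maximum travel is what lets the shader's decode step recover real UV offsets from the limited 8-bit channels; in Unity you'd assign it to the material's _MaxTravel vector (same value for x and y in this scalar-rescaling variant) alongside the saved texture.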

You can see we’ve got some 3-deep nested loops here, so there’s a not insignificant amount of work going on. But even with this naive algorithm running single-threaded, I’m able to process the 48 point pairs to a 256×256 displacement map in less than a second – good enough for design-time texture creation. The effect still holds up even with a tiny 16×16 map, which takes just 3 ms to prepare.

If you needed to, you could divide up parts of the displacement map to different background threads to compute in parallel, or even push the whole thing to the GPU.

More sophisticated approaches to this problem might subdivide the mesh we're using to display the photos and displace its vertices, similar to a 3D morph target, or use a more elaborate shader that loops over a list of pairwise displacement primitives to compute a more accurate displacement vector at render time. But this should at least show the spirit of how you can start to produce this kind of effect in the game tech of your choice.