unity – How do I activate a game action when a UI button is held?

First, change your event trigger. Right now you have both “Set Down State” and “Set Up State” called from the “Pointer Down” event.

You want to remove “Set Up State” from this event. Add a second event that listens for “Pointer Up” and attach your “Set Up State” to that one instead.
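
If you would rather wire this up from a script than in the Inspector, the same split can be sketched with Unity's pointer handler interfaces. This is a minimal sketch, not the asker's setup: the `HoldButton` class name and `buttonName` field are hypothetical, and it assumes the Standard Assets CrossPlatformInput package running in its mobile (virtual) input mode, since the standalone mode does not accept `SetButtonDown`/`SetButtonUp` calls.

```csharp
using UnityEngine;
using UnityEngine.EventSystems;
using UnityStandardAssets.CrossPlatformInput;

// Hypothetical helper: attach it to the UI button's GameObject.
public class HoldButton : MonoBehaviour, IPointerDownHandler, IPointerUpHandler
{
    // Name of the virtual button to drive ("E" matches the question).
    public string buttonName = "E";

    // Called once when the pointer is pressed on this UI element.
    public void OnPointerDown(PointerEventData eventData)
    {
        CrossPlatformInputManager.SetButtonDown(buttonName);
    }

    // Called once when the pointer is released.
    public void OnPointerUp(PointerEventData eventData)
    {
        CrossPlatformInputManager.SetButtonUp(buttonName);
    }
}
```

Either way, the point is the same: the "down" state is set on press, and the "up" state is set only on release.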

Then change your code to check GetButton(), which returns true on every frame the button is held, instead of GetButtonDown(), which returns true only on the first frame of a press and false on the subsequent hold frames.

Something like this:

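void Update()
{
    // GetButton stays true for every frame the virtual button is held.
    if (CrossPlatformInputManager.GetButton("E"))
    {
        // Only try to grab if we aren't already holding a box.
        if (box == null) {
            TryGrab();
        }
    }
    // Button not held while we still have a box: let go of it.
    else if (box != null) {
        Drop();
    }
}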

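void TryGrab() {
    // Ignore any collider the ray starts inside, such as the player's own.
    Physics2D.queriesStartInColliders = false;
    // Cast in the direction the character faces (the sign of localScale.x).
    RaycastHit2D hit = Physics2D.Raycast(transform.position, Vector2.right * transform.localScale.x, distance, boxMask);

    // CompareTag avoids the string allocation that .tag == incurs.
    if (hit.collider != null && hit.collider.gameObject.CompareTag("Box"))
    {
        var joint = hit.collider.GetComponent<FixedJoint2D>();
        // Skip boxes that have no joint to attach with.
        if (joint == null) return;

        box = hit.collider.gameObject;
        var body = hit.rigidbody;

        // Attach the box to the player and make it light while carried.
        joint.connectedBody = GetComponent<Rigidbody2D>();
        joint.enabled = true;
        body.gravityScale = 1;
        body.mass = 1;
    }
}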

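void Drop() {
    // Detach the box from the player.
    box.GetComponent<FixedJoint2D>().enabled = false;

    // Make the box heavy once it is no longer carried.
    var body = box.GetComponent<Rigidbody2D>();
    body.gravityScale = 6;
    body.mass = 6;

    // Clearing the reference lets Update() grab again on the next hold.
    box = null;
}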
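
To see why GetButton() rather than GetButtonDown() is the right check for a hold, here is a toy model of the two queries in plain C# that runs outside Unity. It is only an illustration of the per-frame semantics described above, not Unity's or the Standard Assets' implementation; `VirtualButtonModel` and its members are hypothetical names.

```csharp
using System;

// Toy model of a virtual button; NOT how Unity implements input.
public class VirtualButtonModel
{
    private bool held;          // state during the current frame
    private bool heldLastFrame; // state at the end of the previous frame

    public void Press()   { held = true; }
    public void Release() { held = false; }

    // True on every frame while the button is held.
    public bool GetButton() { return held; }

    // True only on the first frame of a press.
    public bool GetButtonDown() { return held && !heldLastFrame; }

    // Call once at the end of each simulated frame.
    public void EndFrame() { heldLastFrame = held; }
}

public static class Program
{
    public static void Main()
    {
        var button = new VirtualButtonModel();

        // Frame 0: idle. Frames 1-3: held. Frame 4: released.
        for (int frame = 0; frame < 5; frame++)
        {
            if (frame == 1) button.Press();
            if (frame == 4) button.Release();

            Console.WriteLine(
                $"frame {frame}: GetButton={button.GetButton()} GetButtonDown={button.GetButtonDown()}");

            button.EndFrame();
        }
    }
}
```

In this model GetButtonDown() is true only on frame 1, while GetButton() is true on frames 1 through 3, so only the latter lets Update() tell "still holding" apart from "released".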

As an aside, I’d also recommend naming your virtual button “Grab” not “E” – you might one day want to map it to a different control than “E”, or give the player options to remap the control, so hard-coding a specific letter can cause trouble in the future.