PHP Script taking up way more memory than expected

We have a script that is meant to clean up some user data. Essentially, we load all user IDs and loop through them. For each user we 1) load the user object, 2) alter some data, and 3) save the user.

We originally did all of this in one script file but we were running into memory issues about 15% of the way through processing all the users and weren’t certain why. So we’ve tried the following:

  1. Calling unset() at the end of each loop iteration on every variable created inside the for-loop
  2. Setting every variable created inside the for-loop to NULL at the end of each iteration
  3. Extracting the load, alter, and save logic for the user object into a service method that the script calls

The hope for attempted fixes 1 and 2 was that the memory would be deallocated and PHP’s garbage collector would free it, but that did not happen.

The hope for attempted fix 3 was that the memory used by the user object and everything related to it would be cleaned up once the service method returned since, I believe, it should run in a different process.
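For what it's worth, here is a minimal sketch of one way unset() can fail to free anything: if some other structure, such as a static cache, still references the object, removing the local variable only drops one of the references. `CachingLoader` is a hypothetical stand-in written for this illustration, not Drupal's actual storage code, and whether this is what happens in our script is only an assumption.

```php
<?php
// Hypothetical stand-in for a loader that keeps a static cache.
// This is NOT Drupal code; it only illustrates PHP reference counting.
final class CachingLoader {
  private static $cache = [];

  public static function load(int $id): stdClass {
    if (!isset(self::$cache[$id])) {
      $object = new stdClass();
      $object->id = $id;
      // Roughly 100 KB of payload so the growth is easy to see.
      $object->payload = str_repeat('x', 100000);
      self::$cache[$id] = $object;
    }
    return self::$cache[$id];
  }

  public static function resetCache(): void {
    self::$cache = [];
  }
}

$baseline = memory_get_usage();

for ($id = 1; $id <= 50; $id++) {
  $object = CachingLoader::load($id);
  // Only removes the local reference; the static cache still holds the object.
  unset($object);
}
$after_loop = memory_get_usage();

// Dropping the cache's references is what lets PHP free the objects.
CachingLoader::resetCache();
$after_reset = memory_get_usage();

printf("growth after loop: %d bytes\n", $after_loop - $baseline);
printf("growth after reset: %d bytes\n", $after_reset - $baseline);
```

In this sketch the growth only goes away once the cache itself lets go of the objects, which would also explain why setting loop variables to NULL makes no difference.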

I’m very confused as I don’t know which variable(s) are taking up so much memory. If anything, the list of user IDs should be the largest variable, but it doesn’t even take up that much. In our D7 project, I had many scripts that looped over every user, did some work, and saved the user without any issues, but in this D8 version I’m running into memory issues.
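To narrow down which step makes memory grow, something like the following helper could be wrapped around individual calls. `measure_memory_growth()` is a hypothetical name made up for this sketch, not a PHP or Drupal function, and the two demo steps are dummies that only show the difference between retained and released memory.

```php
<?php
/**
 * Returns how many bytes memory_get_usage() grew while $step ran.
 * Hypothetical helper for illustration; not part of PHP or Drupal.
 */
function measure_memory_growth(callable $step): int {
  gc_collect_cycles();
  $before = memory_get_usage();
  $step();
  // Collect cycles so only memory that is still referenced is counted.
  gc_collect_cycles();
  return memory_get_usage() - $before;
}

// A step that keeps what it creates: the growth survives the call.
$kept = [];
$leaky_growth = measure_memory_growth(function () use (&$kept) {
  $kept[] = str_repeat('x', 200000);
});

// A step that only uses a local variable: the memory is released on return.
$clean_growth = measure_memory_growth(function () {
  $tmp = str_repeat('x', 200000);
  return strlen($tmp);
});

printf("leaky: %d bytes, clean: %d bytes\n", $leaky_growth, $clean_growth);
```

In the real script the step would presumably be a closure that calls the service method for a single user ID, so the growth per user could be logged next to the progress output.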

Here is the code we have now, and I simply cannot find anything in it that would suggest something is continually taking up more memory…

Script

<?php
use Drupal\user\Entity\User;

$user_service = \Drupal::service('my_user.my_user_service');

$uids = \Drupal::entityQuery('user')
                ->condition('status', '1')
                ->execute();
$total_users = count($uids);
$user_index = 0;

foreach($uids as $uid) {
  $user_index += 1; 
  
  $user_service->cleanUp($uid);

  $percent_complete = ($user_index / $total_users) * 100;
  $percent_complete_formatted = number_format($percent_complete, 2, '.', ' ');
  $memory =  memory_get_usage()/1000000;
  print_r("\r\33[KUSER: {$uid} | COMPLETION: {$percent_complete_formatted}% | MEMORY:  {$memory} MB");
  $start_memory = memory_get_usage();
}

Service Method

  function cleanUp($uid) {
    $today = new \DateTime();
    $entityManager = \Drupal::service('entity_field.manager');
    $fields = $entityManager->getFieldStorageDefinitions('user', 'user');
    $countries = options_allowed_values($fields['field_country']);
    $lower_case_countries = array_map('strtolower', $countries);
    $formatted_date = date('Y-m-d\TH:i:s', time());

    $user = User::load($uid);
    $username = $user->getUsername();
    $roles = $user->getRoles();
    if (in_array('administrator', $roles) || in_array('advisor', $roles)) {
      return;
    }

    // Convert Birthdate to age and update ages (Can be done during migrations)
    // Subtract year of birth from Registration date
    // Ex: Registration Date: June 1st, 2020
    // Year of birth: 2005
    // Age: 15 (June 1st, 2020 – June 1st 2005)
    $age = $user->get('field_age')->value ?? '';
    $reg_timestamp = $user->getCreatedTime();
    $birthdate = strtotime($user->get('field_birthdate')->value) ?? '';
    if(!$age && $birthdate) {
      $reg_month = date('m', $reg_timestamp);
      $reg_day = date('d', $reg_timestamp);
      $birth_year = date('Y', $birthdate);
      $new_birthday = $birth_year.'-'.$reg_month.'-'.$reg_day;
      $birth_date_time = new \DateTime($new_birthday);
      $diff = $today->diff($birth_date_time);
      $user->set('field_age', $diff->y);
    } else if ($age) {
      $reg_date = date('Y-m-d', $reg_timestamp);
      $diff = $today->diff(new \DateTime($reg_date));
      $new_age = $age + $diff->y;
      $user->set('field_age', $new_age);
    }

    // Add Country of Residence to people who don't have them
    // Only if their Nationality is Emirati or they have an emirate of Residence and they don't have a country
    // get country options
    // compare country to options
    // if it is not in options make it other
    $country = $user->get('field_new_residence')->value;
    $nationality = strtolower(trim($user->get('field_nationality')->value));
    $emirate = $user->get('field_country_residence')->value ?? '';
    if (($nationality === 'uae/emirati citizen' || $emirate) && !$country) {
      $user->set('field_country', 'United Arab Emirates');
    } else if (!in_array(trim(strtolower($country)), $lower_case_countries)) {
      $user->set('field_country', 'Other');
    } else {
      $user->set('field_country', trim(ucwords($country)));
    }

    // Remove hashtags from existing registration codes entered by users
    $reg_code = $user->get('field_workshop_code')->value;
    if($reg_code) {
      $user->set('field_workshop_code', strtolower(str_replace ('#', '', $reg_code)));
    }

    // Levels tasks
    // task 1_1
    // complete me3 and match 3 careers
    $task_1_1 = $user->get('field_task_1_1')->value;
    $matched_degs = $user->get('field_matched_degrees')->getValue();
    if(!$task_1_1 && count($matched_degs) >= 3) {
      $user->set('field_task_1_1', 1);
      $user->set('field_task_1_1_date', $formatted_date);
    }

    // task 1_2
    // enroll in one course
    $task_1_2 = $user->get('field_task_1_2')->value;
    $enrolled_courses = $user->get('field_current_courses')->getValue();
    if(!$task_1_2 && count($enrolled_courses)) {
      $user->set('field_task_1_2', 1);
      $user->set('field_task_1_2_date', $formatted_date);
    }

    // task 2_1
    // complete one course
    $task_2_1 = $user->get('field_task_2_1')->value;
    $completed_courses = $user->get('field_completed_courses')->getValue();
    if(!$task_2_1 && count($completed_courses)) {
      $user->set('field_task_2_1', 1);
      $user->set('field_task_2_1_date', $formatted_date);
    }

    // task 2_2
    // request certificate
    $task_2_2 = $user->get('field_task_2_2')->value;
    $certificates = $user->get('field_course_certificates')->getValue();
    foreach($certificates as $cert_string) {
      $decoded_data = json_decode($cert_string['value']);
      if(is_null($decoded_data)) {
        continue;
      } else {
        $cert_data = reset($decoded_data);
      }
      if ($cert_data->grading->certificate->download_url) {
        $user->set('field_task_2_2', 1);
        $user->set('field_task_2_2_date', $formatted_date);
        break;
      }
    }

    // task 3_2
    // complete 3 online courses
    $task_3_2 = $user->get('field_task_3_2')->value;
    if(!$task_3_2 && count($completed_courses) >= 3) {
      $user->set('field_task_3_2', 1);
      $user->set('field_task_3_2_date', $formatted_date);
    }

    // Save User
    $user->save();
  }

What am I missing here that is causing the script to build up so much memory as it processes items?
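One thing that may be worth trying, on the assumption that loaded entities stay in the entity storage's static cache between iterations, is to process the IDs in chunks and reset that cache after each chunk. `process_in_chunks()` is a hypothetical helper written for this sketch. The demo uses dummy callbacks so it runs without Drupal, and the Drupal wiring in the trailing comment (calling `resetCache()` on the user storage) has not been tested against this script.

```php
<?php
/**
 * Runs $process for every ID, chunk by chunk, and calls $reset_cache
 * with the IDs of each finished chunk. Returns the number of IDs processed.
 * Hypothetical helper for illustration; not part of Drupal.
 */
function process_in_chunks(array $ids, int $chunk_size, callable $process, callable $reset_cache): int {
  $processed = 0;
  foreach (array_chunk($ids, $chunk_size) as $chunk) {
    foreach ($chunk as $id) {
      $process($id);
      $processed++;
    }
    // Let the loader drop whatever it cached for this chunk before moving on.
    $reset_cache($chunk);
  }
  return $processed;
}

// Stand-alone demo with dummy callbacks that only record what they were given.
$seen = [];
$resets = [];
$count = process_in_chunks(
  range(1, 7),
  3,
  function ($id) use (&$seen) { $seen[] = $id; },
  function (array $chunk) use (&$resets) { $resets[] = $chunk; }
);

// Possible wiring in the Drupal script (assumption, not verified here):
// $storage = \Drupal::entityTypeManager()->getStorage('user');
// process_in_chunks(
//   array_values($uids),
//   50,
//   function ($uid) use ($user_service) { $user_service->cleanUp($uid); },
//   function (array $chunk) use ($storage) { $storage->resetCache($chunk); }
// );
```

The chunk size of 50 is arbitrary. The point of the design is only that nothing loaded for one chunk needs to stay referenced while the next chunk is processed.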