
Blog Post

Memory Management with Migrations in Drupal 8

by Chris Runo
July 20, 2017

When dealing with a site migration that has hundreds of thousands of nodes with larger than usual field values, you might notice some performance issues.

In one recent instance, I had to write a migration for nodes that had multiple fields containing huge JSON strings, and parse those strings.  The migration itself was solid, but I kept running into memory usage warnings that would stop the migration in its tracks.

Sometime during the migration, I would see these messages:

  • Memory usage is 2.57 GB (85% of limit 3.02 GB), reclaiming memory.
    [warning]
  • Memory usage is now 2.57 GB (85% of limit 3.02 GB), not enough reclaimed, starting new batch
    [warning]
  • Processed 1007 items (1007 created, 0 updated, 0 failed, 0 ignored) - done with 'nodes_articles'

The migration would then stop importing items as if it had finished, while there were still several hundred thousand nodes left to import.  Running the import again would produce the same result.

I found a few issues on drupal.org that show others have been having similar issues:
https://www.drupal.org/node/2701335
https://www.drupal.org/node/2701121

The Drupal site was up to date and the patches provided in those issues weren't working.  The ideal solution would be to solve the problem so that the migrations would start back up after memory was freed, but because there wasn't enough time to dig into the cause of the issue, I opted for another solution.

Oftentimes it can be useful to create a bash script to run your migrations for you.  That way you don't have to chain drush migrate-import commands together.  So writing a bash script like this:

#!/usr/bin/env bash

echo "Importing users";
drush mi users;
echo "Importing terms";
drush mi terms;
echo "Importing articles";
drush mi nodes_articles;
echo "Importing others";
drush mi other_nodes;

...can help save keystrokes.
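The same script can also be written as a loop over the migration IDs, so that adding a migration means adding one word. This is a minimal sketch rather than the script used on the project: `run_migrations` is a hypothetical helper name, and it assumes that `drush mi` returns a non-zero exit status when an import fails.

```shell
#!/usr/bin/env bash

# Hypothetical helper: import each migration ID given as an argument, in order.
# Assumes "drush mi" exits non-zero on failure; stops at the first such failure.
run_migrations() {
	local id
	for id in "$@"; do
		echo "Importing ${id}"
		drush mi "${id}" || return 1
	done
}
```

Calling `run_migrations users terms nodes_articles other_nodes` would then run the four imports from the script above in the same order.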

When I ran into the memory issues with these larger migration items I thought it might be easier to apply a solution to the bash script since there was nothing inherently wrong with the migrations themselves.

I came up with this bash function:

migration_loop(){
	# Get the output of the drush status.
	drush_output=$(drush ms | grep "$1");

	# Split output string into an array.
	output=( $drush_output );

	# Output the status items.
	for index in "${!output[@]}"; do
		if [ "$index" == "0" ]; then
			echo "Migration: ${output[index]}";
		fi
		if [ "$index" == "1" ]; then
			echo "Status: ${output[index]}";
		fi
		if [ "$index" == "2" ]; then
			echo "Total: ${output[index]}";
		fi
		if [ "$index" == "3" ]; then
			echo "Imported: ${output[index]}";
		fi
		if [ "$index" == "4" ]; then
			echo "Remaining: ${output[index]}";
		fi
	done

	# Check if all items were imported.
	if [ "${output[4]}" == "0" ]; then
		echo "No items left to import.";
	else
		echo "There are ${output[4]} remaining ${output[0]} items to be imported.";
		echo "Running command: drush mi $1";
		echo "...";
		# Run the migration until it stops.
		drush mi "$1";
		# Run the check on this migration again.
		migration_loop "$1";
	fi
}

The loop is pretty simple.  It reads the output of drush migrate-status for a given migration, using grep as a filter.  It then prints out some information about the migration and checks the remaining count.

Based on the drush output of how many items remain to be imported, it will either run the migration again...

Migration: nodes_articles
Status: Idle
Total: 62294
Imported: 50672
Remaining: 11622

There are 11622 remaining nodes_articles items to be imported.
Running command: drush mi nodes_articles

or end the loop...

Migration: terms
Status: Idle
Total: 8536
Imported: 8536
Remaining: 0

No items left to import.
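The parsing step can be tried without a Drupal site. The sketch below splits a single status row on whitespace (here with `read -a` rather than an unquoted expansion) and reports whether anything is left to import. `parse_status_line` is a hypothetical name, and the column order (ID, status, total, imported, remaining) is an assumption about how `drush ms` prints a row in this setup.

```shell
#!/usr/bin/env bash

# Hypothetical helper: split one "drush ms" row into fields and report
# how many items remain. Assumed column order: ID, status, total,
# imported, remaining.
parse_status_line() {
	local fields
	# Word-split the row on whitespace into an array.
	read -r -a fields <<< "$1"
	echo "Migration: ${fields[0]}"
	echo "Remaining: ${fields[4]}"
	# Succeed (exit status 0) only when nothing is left to import.
	[ "${fields[4]}" == "0" ]
}
```

Given the row `" nodes_articles  Idle  62294  50672  11622"`, the function prints the migration ID and the remaining count of 11622, and returns a non-zero status because items remain.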

Here is a full example of the script:

#!/usr/bin/env bash

migration_loop(){
	# Better readability with separation.
	echo "========================";
	# Get the output of the drush status.
	drush_output=$(drush ms | grep "$1");

	# Split output string into an array.
	output=( $drush_output );

	# Output the status items.
	for index in "${!output[@]}"; do
		if [ "$index" == "0" ]; then
			echo "Migration: ${output[index]}";
		fi
		if [ "$index" == "1" ]; then
			echo "Status: ${output[index]}";
		fi
		if [ "$index" == "2" ]; then
			echo "Total: ${output[index]}";
		fi
		if [ "$index" == "3" ]; then
			echo "Imported: ${output[index]}";
		fi
		if [ "$index" == "4" ]; then
			echo "Remaining: ${output[index]}";
		fi
	done

	# Check if all items were imported.
	if [ "${output[4]}" == "0" ]; then
		echo "No items left to import.";
	else
		echo "There are ${output[4]} remaining ${output[0]} items to be imported.";
		echo "Running command: drush mi $1";
		echo "...";
		# Run the migration until it stops.
		drush mi "$1";
		# Run the check on this migration again.
		migration_loop "$1";
	fi
}

migration_loop users;
migration_loop terms;
migration_loop nodes_articles;
migration_loop other_nodes;

With this, you can circumvent any memory issues you may encounter with large migrations if time is limited.
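One caveat: the function calls itself until the remaining count reads exactly 0, so a migration whose count never reaches 0, or a grep that matches no row at all, would keep it running indefinitely. A hedged variant with an attempt limit might look like the sketch below. `migration_loop_capped` and its `max_attempts` argument are hypothetical names, and the code assumes the fifth whitespace-separated field of the status row is the remaining count.

```shell
#!/usr/bin/env bash

# Hypothetical variant of migration_loop: retry "drush mi" until the status
# row reports 0 remaining items, or until max_attempts runs have been made.
# Assumes the fifth whitespace-separated field of the row is the remaining count.
migration_loop_capped() {
	local id="$1" max_attempts="${2:-50}" attempt=0 fields
	while :; do
		# Word-split the first matching status row into an array.
		read -r -a fields <<< "$(drush ms | grep "$id")"
		if [ "${fields[4]}" == "0" ]; then
			echo "No items left to import."
			return 0
		fi
		if [ "$attempt" -ge "$max_attempts" ]; then
			echo "Giving up on $id after $attempt attempts." >&2
			return 1
		fi
		attempt=$((attempt + 1))
		echo "Attempt $attempt: drush mi $id (${fields[4]} remaining)"
		drush mi "$id"
	done
}
```

Using a `while` loop instead of recursion also keeps the call depth flat when a migration needs many restarts.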

