Upgrading websites off of legacy platforms and onto Drupal is a pretty common request. One of the most helpful contributor developed modules, Migrate, is now in Drupal 8 core, which really solidifies Drupal as a great choice when having to make the decision of which CMS to go with when upgrading.
A major problem that can come up early in planning migrations for older sites is the source data format. There are some fairly well-known contrib modules that help with this already; Migrate JSON , XML, and CSV. XML and CSV source plugins are particularly useful with older sites. For instance, legacy Oracle content management systems used XML for menu and page structure.
CSV source data is a common occurrence for older sites. It can be useful to have a CSV export when it's not easy to gain quick access to the source data, or in some cases, when the source data comes from data types that aren't able to be read by Drupal's migrate module. One of those unreadable data sources, which comes up more often than expected, is Microsoft Access. Access provides two tools: databases and reporting.
Until recently, the easiest solution to migrate Access data into Drupal using the migrate module was to export data and reports into CSV files and use migrate_source_csv to read and import the data into Drupal. While this is possible to get working it often requires more time to finish than it probably should.
A number of issues can come up when dealing with data in CSV files. If the CSV file is created on a Windows-based machine, line endings and file encoding affect how each row and value is being read by the importer. If values are not encapsulated correctly, it can cause the importer to read single values as separate columns and cause all kinds of issues for the imported data. Some of which can not always be caught right away. These problems are particularly difficult to figure out when you're dealing with large sets of data across multiple CSV files.
One of the nice features of Microsoft Access is that it provides functionality to export databases and reports to Excel files. Excel (xls, xlsx, ods, etc.) files are usually larger than CSVs and inherently more complicated in structure, but what they provide is data that can be read the same way across multiple spreadsheet applications in different operating systems. If only there was a way we could read data from spreadsheet files rather than having to use multiple CSVs.
In comes the Migrate Spreadsheet module. As of writing this post it hasn't even been on drupal.org for a full month, but it has already proved its worth. I had the unfortunate problem of being stuck with an xlsx file as source data for a migration. I tried converting it to CSV but encountered the issues that can potentially come with csv data listed above.
This module leverages the PhpSpreadsheet plugin by PHPOffice, which makes it possible to read data from ods, xlsx, xls, xml, html, sylk, and csv files. We are only concerned with xlsx in this case. Both migrate_spreadsheet and PhpSpreadsheet can be installed via Composer by following the installation instructions, so this is ready to go for modern Drupal 8 projects.
Setting up the migrate_plus configuration file is pretty straightforward and similar to how CSV migrations are set up. The major differences are that you need to specify the worksheet you're migrating from in the xlsx file. This is actually very useful so that you can use one source data file for multiple migrations. You can then set the origin, which is the starting point for data in the worksheet, by cell ID (ex: A2). Following that, you can set the row number which contains your header information (if you have one).
The list of columns is slightly different than in a csv migration since you only need to list the columns as they appear in the excel sheet. Unlike the csv migration configuration, you don't need to set key or index for each column name. The only other difference being that when setting the key identifier for the source data, you need to tell the migration what type of data it is (ex: integer). Below is an example of an excel sheet source migration configuration file.
The migration class itself doesn't differ much from other migration classes. Simply include a use statement for Drupal\migrate_spreadsheet\Plugin\migrate\source\Spreadsheet; make sure your class extends the Spreadsheet class, and that's it!