Update on my website and blog migration to Drupal with Druxtjs
Thursday 2 March 2023 - 860 words, 5 min read
In June of last year I started working on migrating my website to Drupal with Druxtjs. In this post I share a (technical) update on this work.
In my blog Merging my site and blog into Drupal and Vue.js with Druxtjs I outlined my plan for migrating my website and blog to my favorite CMS (Drupal) and frontend framework (Vue.js). I ended that blog with the following bullet points:
Upcoming challenges:
- Migrate current blogposts (markdown files) into Drupal
- Migrate related media (images and audio files) into Drupal
- Deployment setup with Gitlab CD/CI
I will soon share more updates about this project and how I tackled these challenges!
I’ve tackled the first two points:
- Migrated all blogposts into nodes in Drupal
- Migrated all related media into file entities in Drupal
This was quite some work, and I wrote a custom Drupal migration to accomplish it. Let me share how I did this, so this blog almost turns into a tutorial.
Write a custom Drupal migration
For building the migration, I’ve used these contrib modules:
- Migrate source directory
For migrating markdown files to content entities.
- Migrate Plus
For some basic examples and additional educational documentation.
- Migrate tools
For using the drush migrate commands:
  migrate:fields-source (mfs) List the fields available for mapping in a source.
  migrate:import (mim) Perform one or more migration processes.
  migrate:messages (mmsg) View any messages associated with a migration.
  migrate:reset-status (mrs) Reset an active migration's status to idle.
  migrate:rollback (mr) Rollback one or more migrations.
  migrate:status (ms) List all migrations with current status.
  migrate:stop (mst) Stop an active migration operation.
For more information about writing your own custom migration, see these docs.
The file structure for the migration, at path web/modules/custom/migrate_blog/:
- migrations
-- inline_media_to_files.yml
-- markdown_to_blognodes.yml
migrate_blog.info.yml
migrate_blog.module
All my Markdown files (blog content) were located outside the project root. I moved them into my project root, in the directory blog. The path you use in the migration must be absolute!
This directory contains many subdirectories, each with a single index.md file. Every subdirectory represents one blog item. Additional files, such as images used in the blog item, are also located in that subdirectory.
Check whether your migration shows up in the list of migrations with the command drush ms:
As you can see, the script has detected 207 directories to import.
Let’s try to import just one item to start with by running drush mim markdown_to_blognodes --limit=1 and see what happens. If you need more information about this command, use drush mim --help.
As expected, this does not work out of the box.
When the status of your migration is stuck on busy, use drush mrs markdown_to_blognodes to set the status back to idle. You can only execute the migration when it has this status.
To roll back a migration, use drush mr markdown_to_blognodes to undo all imported items.
Modify and set data values before they are imported
You must implement hook_migrate_prepare_row in your .module file to modify data before it is processed. Otherwise the process fields in your migration will return NULL values.
Let me illustrate this with the following snippet:
// Get all content from the markdown file
$content = file_get_contents($row->getSourceProperty('source_file_realpath'));
// Split content into at most three parts, so a '---' inside the body text is not split any further
$content = explode('---', $content, 3);
// Array item with key 0 is empty
// ---
/*
* Format of the array item with key 1:
* layout: blog\n
* title: Search met Whoogle\n
* date: 2022-05-03T19:46:22.810Z\n
* description: intro text
* categories: whoogle google selfhosted degoogle bigtech\n
* comments: false\n
*/
// ---
// Array with key 2 contains the full text (markdown) of the blog item
// Get the title from the 'title: ...' line in $content[1]
if (preg_match('/^title:\s*(.*)$/m', $content[1], $matches)) {
  // Remove surrounding whitespace and quotes around the title
  $title = trim($matches[1], " \t\r\"");
  // Set the title property so it can be used in the migration
  $row->setSourceProperty('title', $title);
}
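The same idea can be extended to the other front matter fields (date, description, categories). Below is a minimal, standalone sketch in plain PHP, so it can be tried outside Drupal. The function name parse_front_matter is hypothetical and not part of Drupal or the Migrate API, and it assumes simple one-line "key: value" fields like the ones shown above.

```php
<?php

/**
 * Splits a markdown file into front matter fields and body.
 *
 * Hypothetical helper for illustration; not part of Drupal or the Migrate API.
 */
function parse_front_matter(string $markdown): array {
  // Limit to three parts so a '---' in the body is left alone.
  $parts = explode('---', $markdown, 3);
  if (count($parts) < 3) {
    // No front matter found: treat everything as body.
    return ['fields' => [], 'body' => $markdown];
  }

  $fields = [];
  foreach (preg_split('/\R/', trim($parts[1])) as $line) {
    // Match simple "key: value" lines and skip anything else.
    if (preg_match('/^(\w+):\s*(.*)$/', $line, $m)) {
      // Strip surrounding whitespace and double quotes from the value.
      $fields[$m[1]] = trim($m[2], " \t\"");
    }
  }

  return ['fields' => $fields, 'body' => ltrim($parts[2])];
}

// Made-up example input in the same shape as the blog files.
$example = "---\nlayout: blog\ntitle: \"Search met Whoogle\"\n"
  . "date: 2022-05-03T19:46:22.810Z\ncomments: false\n---\n"
  . "Body text\n\n---\n\nMore text";

$result = parse_front_matter($example);
echo $result['fields']['title'], PHP_EOL;
```

Inside hook_migrate_prepare_row, each entry of the returned fields array could then be passed to $row->setSourceProperty(), in the same way as the title.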
Migrate embedded media items to file entities
After I managed to migrate my markdown files into blog nodes, the embedded media items, such as images and audio files, were still missing from the content. As mentioned, all these files are in the same directory as the markdown file of each blog item.
I created a second migration script for migrating all media files (.jpg, .jpeg, .mp3, .mp4, .gif, .svg, .png) into file entities.
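To make the extension filter concrete, here is a small standalone sketch in plain PHP that lists the media files in one blog item directory. It is not the migration itself: the function name find_media_files and the throwaway test directory are assumptions for illustration only.

```php
<?php

/**
 * Lists media files below a directory, filtered by extension.
 *
 * Hypothetical helper for illustration; not the migration itself.
 */
function find_media_files(string $dir, array $extensions): array {
  $found = [];
  $iterator = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
  );
  foreach ($iterator as $file) {
    // Compare case-insensitively so "photo.JPG" is also picked up.
    if ($file->isFile() && in_array(strtolower($file->getExtension()), $extensions, true)) {
      $found[] = $file->getPathname();
    }
  }
  sort($found);
  return $found;
}

// Build a throwaway directory that mimics one blog item.
$dir = sys_get_temp_dir() . '/blog_item_' . uniqid();
mkdir($dir);
foreach (['index.md', 'photo.JPG', 'clip.mp3'] as $name) {
  touch($dir . '/' . $name);
}

$media = find_media_files($dir, ['jpg', 'jpeg', 'mp3', 'mp4', 'gif', 'svg', 'png']);
print_r(array_map('basename', $media));

// Clean up the throwaway directory.
array_map('unlink', glob($dir . '/*'));
rmdir($dir);
```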
After this step, I edited my content to replace the file paths. I did this with the following regex in a preg_replace call:
// replace file paths
$sub_dir = preg_replace('/\/var\/www\/blog\//i', '', $source_file_path);
$prefix_file_path = '/sites/default/files/blog/' . $sub_dir;
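// ![title-of-image](image.png)
// is replaced with
// ![title-of-image](/sites/default/files/blog/<sub_dir>/image.png)
// The groups are non-greedy so that two images on the same line are matched separately
$body = preg_replace('/!\[(.*?)\]\((.*?)\)/', '![$1](' . $prefix_file_path . '/${2})', $body);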
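One thing to watch with this kind of replacement is that images which already point to an absolute URL get prefixed as well. The sketch below is one possible way to handle that, using preg_replace_callback. The function name prefix_image_paths, the skip rule, and the my-post directory are assumptions for illustration, not the code from my migration.

```php
<?php

/**
 * Prefixes relative image paths in markdown with a public files path.
 *
 * Hypothetical helper; the name and the skip rule are assumptions.
 */
function prefix_image_paths(string $body, string $prefix): string {
  $result = preg_replace_callback(
    '/!\[(.*?)\]\((.*?)\)/',
    function (array $m) use ($prefix): string {
      // Leave absolute URLs and root-relative paths untouched.
      if (preg_match('#^(https?:)?/#i', $m[2])) {
        return $m[0];
      }
      return '![' . $m[1] . '](' . rtrim($prefix, '/') . '/' . $m[2] . ')';
    },
    $body
  );

  // preg_replace_callback() returns NULL on error; fall back to the input.
  return $result ?? $body;
}

// Made-up example: one relative image and one absolute URL on the same line.
$body = '![one](a.png) and ![two](https://example.com/b.png)';
$rewritten = prefix_image_paths($body, '/sites/default/files/blog/my-post');
echo $rewritten, PHP_EOL;
```

Using a callback instead of a replacement string also avoids surprises when the prefix itself contains characters such as $ that preg_replace would interpret as a backreference.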
After a couple of hours I ended up with quite a large script for migrating all the files to Drupal and replacing all the paths in the markdown files. This is because the content was once published on three different blogs (two WordPress sites and one Drupal site).
You can find all the code of hook_migrate_prepare_row in this snippet.
Almost finished
I still need to fine-tune some things in the migration, but I’m also building the frontend for my blog. Here is a little preview: