Regex, Netlify, Gatsby and swapping links


Caveat: this is an “I don’t really know what I’m doing but it’s working, so…” post.

This blog is on Netlify and uses Gatsby to build it from a WordPress install (I’m typing this on the WP install; it’ll get built by Netlify and published there). Because I’m a lazy sod, I’m just leaving the images on the WordPress install for now. And sometimes I want them to be linky so it’s possible to see the full-size images.

I had a bit of JS that took the content and switched any links pointing at the WP blog to point at the Netlify one, so that internal links on posts went to the correct place. Like this link to the post on improving fucking Gravatar. I made that link to the post on the WP install as I was typing, and the JS looks for it and turns it into a link to the post on Netlify when it’s built there.

That was working well but I realised that I needed to keep the links to the images because those are hosted here / there (depending on whether you’re me typing this or you reading it). So I’ve added a bit to look and see if it’s an image link or not.

I was also getting some insecure content warnings on the images, so I added something to look for “http:” on image sources and replace it with “https:”.
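
Taken on its own, the “http:” to “https:” swap can probably be done with a single replace and a capture group. This is only a minimal sketch, not the code this blog runs: `upgradeImageSrc` and the example URLs are made up for illustration, and it assumes the images are plain `<img>` tags with a quoted `src`.

```javascript
// Hypothetical helper (not from the post): upgrade insecure image sources.
// The capture group keeps everything up to and including the opening quote,
// so only the "http:" scheme right after it gets swapped.
const upgradeImageSrc = (html) =>
  html.replace(/(<img\b[^>]*?\bsrc\s*=\s*['"])http:/gi, '$1https:')

// Made-up sample: a link on the WP install wrapping an insecure image.
const insecureSample =
  '<a href="http://wp.example.com/post/"><img src="http://wp.example.com/pic.jpg"></a>'

console.log(upgradeImageSrc(insecureSample))
// → <a href="http://wp.example.com/post/"><img src="https://wp.example.com/pic.jpg"></a>
```

Only the `src` changes; the `href` on the surrounding link is left alone, and sources that are already “https:” don’t match at all.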

This isn’t abstracted for your situation or particularly robust but it works well enough for me right now. I’m not great (!!!) at regex and found https://regex101.com/ super helpful. If you are decent at regex, you will probably be “wow, what exactly is going on there” but I’m learning bit by bit. 😊

This isn’t a great explanation but here’s the function:

JAVASCRIPT
// Update links from local WP install.
let createLocalLinks = (html, wordPressUrl, prefix = '') => {
  // The regex for switching the links but not the image links.
  const regex = /href\s*=\s*(['"])(https?:\/\/.+?)(img)?(src=['"]https?:\/\/.+?)?(\/a>)/gi
  // Is this an image with an insecure link to the src? 
  const isImgHttps = /src=(['"])(http(s?):)([\/|.|\w|\s|-])*\.(?:jpg|gif|png)/gi
  let link
  while ((link = regex.exec(html)) !== null) {
    // If the first regex matches something, check that it's
    // a link to the WP install and then see if it *doesn't* link
    // an image.
    if (link[2].includes(wordPressUrl) && link[4] === undefined) {
      html = html.replace(wordPressUrl, `/${prefix}`)
    }
  }
  let src
  while ((src = isImgHttps.exec(html)) !== null) {
    if (src[2].includes('http:')) {
      let quoted = `src=${src[1]}http:`
      let quotedRE = new RegExp(quoted, 'g')
      html = html.replace(quotedRE, 'src=' + src[1] + 'https:')
    }
  }
  return html
}
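
One thing to watch in the function above: `html.replace(wordPressUrl, …)` runs against the whole string, so it swaps the first WP URL it finds, which isn’t necessarily the one in the link that just matched, and the string also changes while the `g` regex is still stepping through it. A possible variant, sketched here under assumptions, rewrites each link inside a `replace` callback instead. `rewriteInternalLinks` and the example URLs are hypothetical, and it assumes `wordPressUrl` ends with a slash and that links aren’t nested.

```javascript
// Hypothetical variant of createLocalLinks; names and URLs are made up.
const rewriteInternalLinks = (html, wordPressUrl, prefix = '') => {
  // Match one whole <a ...>...</a> at a time: anything before href in the
  // opening tag, the quote, the href value, the rest of the opening tag,
  // then the link body up to the first closing </a>.
  const anchor = /<a\b([^>]*?)href\s*=\s*(['"])(.*?)\2([^>]*)>([\s\S]*?)<\/a>/gi

  return html.replace(anchor, (match, before, quote, href, after, body) => {
    const wrapsImage = /<img\b/i.test(body)
    if (wrapsImage || !href.startsWith(wordPressUrl)) {
      // Image links stay pointed at the WP install; other sites are ignored.
      return match
    }
    // Swap the WP origin for a site-relative path, keeping the rest of the URL.
    const localHref = `/${prefix}` + href.slice(wordPressUrl.length)
    return `<a${before}href=${quote}${localHref}${quote}${after}>${body}</a>`
  })
}

// Made-up content: one internal text link and one link wrapping an image.
const page =
  '<p><a href="https://wp.example.com/my-post/">a post</a></p>' +
  '<p><a href="https://wp.example.com/files/pic.jpg">' +
  '<img src="https://wp.example.com/files/pic-300.jpg"></a></p>'

console.log(rewriteInternalLinks(page, 'https://wp.example.com/'))
// The text link becomes href="/my-post/"; the image link is left as it was.
```

Because the callback returns the replacement for each match, there’s no loop and no `lastIndex` to worry about. It is still regex-on-HTML, though, so unusual markup could trip it up.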