Conversion of German umlauts in tag_path

I want to create a blog in German with Middleman, and I’ve run into a problem with how the tag paths are generated from the tag list when the tags contain the very common German umlauts (Ä, ä, Ö, ö, Ü, ü). Let me explain the issue with the following test case:

Define the tag path in config.rb, inside the activate :blog block, as follows:

blog.taglink = "/tags/{tag}.html"

Create two articles with the following YAML frontmatter:

---
tags: Löcher
title: 'Article tagged "Löcher"'
---

---
tags: Straße
title: 'Article tagged "Straße"'
---

Put this into layout.erb:

<h3>Tags:</h3>
<% blog.tags.each do |tag, articles| %>
  <p>- <%= tag %> (<%= articles.size %>), Tag Page: <%= link_to tag_path(tag), tag_path(tag) %></p>
<% end %>

This produces the following output:

Tags:
- Straße (1), Tag Page: /tags/strasse.html
- Löcher (1), Tag Page: /tags/locher.html

As you can see, the German character ‘ß’ in ‘Straße’ has been correctly converted to ‘ss’ in the tag path, in line with German convention. However, the umlaut ‘ö’ in ‘Löcher’ has been converted to ‘o’. This goes against German convention, which requires the following conversions:

'Ä','ä' => 'Ae','ae'
'Ö','ö' => 'Oe','oe'
'Ü','ü' => 'Ue','ue'

The correct tag paths should therefore be:

Tags:
- Straße (1), Tag Page: /tags/strasse.html
- Löcher (1), Tag Page: /tags/loecher.html

This is important because ‘Locher’ (a hole punch) and ‘Löcher’ (holes) are two different words with completely different meanings, and they shouldn’t be merged into one tag page. It also looks very odd to Germans if ‘ö’ is converted to ‘o’ instead of ‘oe’.
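
To make the expected mapping concrete, here is a minimal plain-Ruby sketch of the conversion described above. It is not Middleman's code and does not hook into Middleman; the names GERMAN_MAP and german_slug are made up for illustration, and the cleanup rules after the transliteration (lowercasing, hyphens for other characters) are an assumption about what a tag slug should look like.

```ruby
# Hypothetical helper, not part of Middleman: applies the German
# transliteration convention before reducing the tag to a URL-safe slug.
GERMAN_MAP = {
  'Ä' => 'Ae', 'ä' => 'ae',
  'Ö' => 'Oe', 'ö' => 'oe',
  'Ü' => 'Ue', 'ü' => 'ue',
  'ß' => 'ss'
}.freeze

def german_slug(tag)
  tag.gsub(/[ÄäÖöÜüß]/, GERMAN_MAP) # replace each special character via the map
     .downcase
     .gsub(/[^a-z0-9]+/, '-')       # assumption: any other run of characters becomes a hyphen
     .gsub(/\A-+|-+\z/, '')         # trim leading and trailing hyphens
end

%w[Straße Löcher Locher].each do |tag|
  puts "#{tag} => /tags/#{german_slug(tag)}.html"
end
```

With this mapping, ‘Löcher’ and ‘Locher’ end up on different paths, which is the behaviour I am looking for.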

Now let’s look at what is being displayed on the tag page. Put this into tag.html.erb:

<h3>Articles that have the tag '<%= tagname %>':</h3>
<% page_articles.each do |article| %>
  <p>- <%= link_to article.title, article %></p>
<% end %>

Then the article is correctly listed on the tag page /tags/locher.html:

Articles that have the tag 'Löcher':
- Article tagged "Löcher"

Now add a third article with the tag ‘Locher’ in the YAML frontmatter:

---
tags: Locher
title: 'Article tagged "Locher"'
---

The code in layout.erb now produces the following list:

Tags:
- Straße (1), Tag Page: /tags/strasse.html
- Löcher (1), Tag Page: /tags/locher.html
- Locher (1), Tag Page: /tags/locher.html

Notice that both ‘Löcher’ and ‘Locher’ are mapped to the same tag page, /tags/locher.html.
But the article list on /tags/locher.html now looks like this:

Articles that have the tag 'Locher':
- Article tagged "Locher"

The article tagged ‘Löcher’ has disappeared. Apparently there is a collision somewhere, in which the tag name without the umlaut overrides the tag name with the umlaut. I assume the same problem will appear with categories, but I haven’t tested that yet.
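
The following plain-Ruby sketch is only a simplified model of the collision I suspect, not Middleman's actual implementation. It assumes that tag pages are stored by path, so that a later tag with the same path replaces an earlier one; the names naive_slug and build_tag_pages are made up for illustration.

```ruby
# Simplified model of the suspected collision, NOT Middleman's real code.
# Assumption: one page per path, so a later tag with the same path wins.

# Strips umlauts the way the observed output suggests (ö => o, ß => ss).
def naive_slug(tag)
  tag.tr('ÄäÖöÜü', 'AaOoUu').gsub('ß', 'ss').downcase
end

def build_tag_pages(tags)
  tags.each_with_object({}) do |tag, pages|
    pages["/tags/#{naive_slug(tag)}.html"] = tag # overwrites any earlier tag on this path
  end
end

pages = build_tag_pages(%w[Straße Löcher Locher])
pages.each { |path, tag| puts "#{path} lists articles tagged '#{tag}'" }
```

In this model three tags produce only two pages, and /tags/locher.html belongs to ‘Locher’ alone, which matches what I see on the generated site.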

Unfortunately, I’m a complete Ruby newbie; I installed it for the first time only yesterday. So I would really appreciate your help:

  • What would be the smartest way to force a mapping of the tag ‘Löcher’ to the tag path /tags/loecher.html, so that this page actually lists all the articles tagged ‘Löcher’, and not those tagged ‘Locher’?
  • Could this be achieved by adding some code in config.rb? What would that look like?
  • Or is this something I should report as a bug to the Middleman developers?

Thank you very much for your help!