Validation of YAML files

I’m interested in using something like kwalify to specify schemas for the YAML files located in /data as part of a Middleman project. This would let me check that the YAML files are valid and match a specified schema. Would this be possible using a custom Middleman extension? Is this the best solution? What is the most appropriate hook to use?

Many thanks…

Why not create a presenter instead? You could still handle syntax errors if you needed to, and you would have more flexibility with your data.

Let’s say your data was products in a store.

class Product
  attr_accessor :name, :description

  # Defaults to zero if no price is given
  def price
    @price || 0
  end

  # Only accepts numbers (Fixnum was removed in Ruby 3.2, so check Numeric)
  def price=(n)
    raise "#{n.inspect} is not a number" unless n.is_a?(Numeric)
    @price = n
  end

  # Returns a list of all products (you may want to memoize this).
  # Assumes Middleman's `data` helper is reachable from this class.
  def self.all
    data.products.map do |product_data|
      from_data(product_data)
    end
  end

  # Construct a new product object from YAML data
  def self.from_data(data)
    new.tap do |product|
      product.name = data[:name]
      product.description = data[:description]
      # Leave the price unset when the YAML omits it, so the default applies
      product.price = data[:price] unless data[:price].nil?
    end
  end
end

There are a number of advantages: it is easier to test, easier to set defaults, and you get to choose how each attribute is parsed.

<h1>Products</h1>

<% Product.all.each do |product| %>
   <h2><%= product.name %></h2>
   <h3>$<%= product.price %></h3>
   <p><%= product.description %></p>
<% end %>

If you really wanted to just check the YAML, there are two approaches you could take. You could have a Middleman extension that checks when the data file changes, or you could have a separate tool that can check on demand, like a lint-checker. The latter is probably easier. Have it run just before building.
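As a rough sketch of the on-demand checker idea, something like the following could work using only Ruby's standard YAML library. The schema constant, the `lint_products` name, and the product fields are made up for illustration; a real setup might hand the same job to kwalify instead.

```ruby
require "yaml"

# Hypothetical schema: each product needs these keys with these types.
PRODUCT_SCHEMA = {
  "name"        => String,
  "description" => String,
  "price"       => Numeric
}.freeze

# Returns a list of human-readable error messages (empty when valid).
def lint_products(yaml_text)
  records = YAML.safe_load(yaml_text)
  return ["top level must be a list of products"] unless records.is_a?(Array)

  records.each_with_index.flat_map do |record, index|
    next ["item #{index}: expected a mapping"] unless record.is_a?(Hash)

    # Collect one message per missing or wrongly typed key.
    PRODUCT_SCHEMA.filter_map do |key, type|
      value = record[key]
      if value.nil?
        "item #{index}: missing '#{key}'"
      elsif !value.is_a?(type)
        "item #{index}: '#{key}' should be #{type}, got #{value.class}"
      end
    end
  end
rescue Psych::SyntaxError => e
  ["syntax error: #{e.message}"]
end

sample = <<~YAML
  - name: Widget
    description: A small widget
    price: 5
  - name: Gadget
    price: cheap
YAML

puts lint_products(sample)
```

You would read the real file with File.read and exit non-zero when the list is not empty, so the build stops on bad data.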

Thanks - this is incredibly useful.

Having a presentation class seems very useful. Ideally, I would like to parse the YAML files once, at the start, and make the parsed data globally available (in the way data is). Do you know if it would be possible to have something like:

products = Product.all

within config.rb, and then access the “products” data structure from any embedded ruby later on?

There are a few ways you could achieve that effect. I just memoize the method:

def self.all
  @@all ||= data.products.map do |product_data|
    from_data(product_data)
  end
end

This way, the YAML is only evaluated once and then the objects are cached in memory. Subsequent calls to Product.all return the objects straight out of memory.

One downside to this is that you will have to restart Middleman when your YAML data changes. You can get around this by farming out your caching to a separate class, and then writing a Middleman extension that clears the cache any time the files change.
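The separate cache class could look roughly like this. ProductCache and clear! are hypothetical names, and the loader block stands in for whatever builds the product list. The Middleman side (calling clear! when a data file changes) is left out, since the right hook depends on your Middleman version.

```ruby
# Hypothetical cache holder, kept apart from Product so it can be reset.
class ProductCache
  # The block is the loader that builds the list, e.g. from YAML data.
  def initialize(&loader)
    @loader = loader
    @items = nil
  end

  # Runs the loader on first use, then returns the cached list.
  def all
    @items ||= @loader.call
  end

  # Drops the cached list so the next call to `all` reloads it.
  def clear!
    @items = nil
  end
end

loads = 0
cache = ProductCache.new do
  loads += 1
  ["widget", "gadget"]
end

cache.all
cache.all     # served from memory, loader not called again
cache.clear!  # what the extension would call on a file change
cache.all     # loader runs a second time
puts loads
```

Product.all would then delegate to a shared instance of the cache instead of holding its own @@all.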

OK great thanks. Memoizing seems like the way to go.