ICE09 . playing with java, scala, groovy and spring .

Playing with Spring

Archive for October, 2011

Groovy vs. Ruby – the CSV shootout (7L7W day 3)

Posted by ice09 on October 18, 2011

Nope, there is no rant here. Even more, there is no CSV shootout. However, a little CSV is here, it’s even Meta-CSV somehow.

Ok, let’s clarify this: I just started reading the book “Seven Languages in Seven Weeks” and I already like it. It’s exactly written for someone on my language-knowledge level, ie. someone who tried and used several languages and is pretty good at getting the differences between OO and functional languages, but has difficulties to explain what differentiates Clojure from Scala from Haskell from Io besides the different kinds of typing (static vs. dynamic). Someone who has heard of but never knew what prototypes are about and what specialities each language defines (the movie comparison also helps a lot. But be aware that Prolog developers have to relax a little and accept to be compared to Rain Man).

So, the first day is about Ruby. I know Groovy pretty well, so I thought that would be easy. It was. But, I realize that the subtile differences matter a lot. In short, I feel that Ruby is much more consistent (you will see by the sample). However, I like Groovy more – after all, for a Java developer it’s the much better fit. But also, I like the even more magic – which I’d hate in big production systems but really love for scripting.

So, the sample is a CSV reader – the Ruby part is much longer than it has to be, but it should show the Metaprogramming mechanisms. And I like them a lot, even more than the Groovy ones (though Categories are really cool for scoping the Metaprogramming).

class CsvRow
  attr_accessor :values, :keys
  
  def initialize( keys, values )
    @keys = keys
    @values = values 
  end
  
  def method_missing(id, *args)
    if (id==:to_ary) 
      then return @values 
      else return @values[@keys.index(id.to_s)]
    end
  end
end

module ActsAsCsv
  def self.included(base)
    base.extend ClassMethods
  end
  
  module ClassMethods
    def acts_as_csv
      include InstanceMethods
      include Enumerable
    end
  end
  
  module InstanceMethods   
    attr_accessor :headers, :csv_contents
    
    def each &block
       @csv_contents.each{|csvRow| block.call(csvRow)}
    end
    
    def read
      @csv_contents = []
      filename = self.class.to_s.downcase + '.txt'
      file = File.new(filename)
      @headers = file.gets.chomp.split(';').collect{|s| s.delete("\"")}
      
      file.each do |row|
        values = row.chomp.split(';').collect{|s| s.delete("\"")}
        @csv_contents << CsvRow.new(@headers, values)
      end
    end
    
    def initialize
      read 
    end
  end
end

class RubyCsv  # no inheritance! You can mix it in
  include ActsAsCsv
  acts_as_csv
end

m = RubyCsv.new
m.each { |it| puts it.Kontonummer }

I will not explain how this works here, there are bazillions of resources of people really knowing Ruby. The most important fact is how the Metaprogramming works – and it does it exactly as expected. Even more, it does it right. The best fact to me is that upon including the module (which unfolds itself into the base class very nicely), the include method is called automatically. There is no dependency of the RubyCsv on it’s Mixin (and there shouldn’t be!).

So, this is pretty cool. How do I achieve this in Groovy? This is diffficult, there are no modules in Groovy. Of course, Metaprogramming is easy in Groovy, but I want it to mimick the Ruby script.

The most I can get is by using the Groovy 1.6 @Mixin annotation like this:

class CsvRow {
    def keys = []
    def values = []
    
    def propertyMissing(String name) { 
        values[keys.indexOf(name)]
    }
}

class ActAsCsv {
    public headers = []
    public csvRows = []
    
    def read(def instance) {
        new File("rubycsv.txt").eachLine {
            if (instance.headers.isEmpty()) {
                instance.headers = it.trim().split(';').collect{it.replaceAll("\"", "")}
            } else {
                def values = it.trim().split(';').collect{it.replaceAll("\"", "")}
                instance.csvRows << new CsvRow(keys:instance.headers, values:values)
            }
        }
    }

    def iterator() {
        csvRows.iterator()
    }
}

@Mixin(ActAsCsv)
class GroovyCsv {
    GroovyCsv() {
        read(this)
    }
    //@Delegate ActAsCsv acter = new ActAsCsv()
}


def act = new GroovyCsv()
println act.collect { it.Buchungstext.reverse() }
act.each { println it.Kontonummer }

So, this is nice as well, but compared to Ruby it’s more clumsy. The worst thing is the trick with calling the Mixin in the constructor with the this reference. It does not work otherwise, since the fields cannot be set on the Mixin-GroovyCsv-instance itself. This is weird and costs time to debug. It is not a real dependency upon the Mixin itself, but it’s just superfluous.
Also, the nice template method style of including modules in Ruby is not the same. It fells better in Ruby (look up the Enumerable inclusion, it’s really nice). On the other hand, just having to declare iterator() the right way to get rid of all Metaprogamming to implement the Collection methods correctly is cool as well. Also, I found it cleaner to use the @Delegate possibility, which works exactly as expected (no this weirdness here).

It’s as always, there is no best way, I like both versions a lot. I still feel that Ruby feels cleaner, but I can achieve everything in Groovy with the same “expressiveness”. But I can use all Java and all it’s environment and tools as well – so jep, to me it’s the better choice. But I see where a lot of the good stuff in Groovy comes from.

Advertisements

Posted in Groovy, Ruby | Leave a Comment »

DSLs made easy – with Groovy’s chain method calls (GEP 3)

Posted by ice09 on October 7, 2011

As you probably know, the creation of DSLs is quite easy these days. Take Groovy or Scala and the creation of internal DSLs is a day’s work. Take Xtext 2.0 and the creation of an external DSL is a two days’s work.

However, let’s assume you do not have a day, but two hours left – still possible? Yes, indeed. With Groovy 1.8 and its new chained method calls (contributed by GEP 3).

So, chained method calls have always be possible in Groovy, but the new “conventions” of GEP 3 make the creation of a DSL really straightforward, since with a little effort you can design a DSL which has no brackets and dots at all!

.use case.

A sample use case for a simple DSL is the evaluation of account data. I wanted to have

  • the possibility to parse downloaded CSV files
  • enrich the items with certain categories
  • store them in a local (file based) database
  • be able to evaluate the items for certain time spans and categories

Having thought about the GUI which I would have to create for this use case, I felt overwhelmed immediately, given that my design capabilities are … debatable. But also, I would like to be able to use the data in several ways, transform it, check and validate it and so on. For this I do not want to create my own command set, but I want to use Groovy commands for this. Namely, I want to create an internal DSL.

.sample.

The following commands I can think of. However, this is not complete

load savingsfile     scans a predefined directory for the newest CSV file with a certain pattern and reads this file into memory 
                     (as a list of string arrays)
filter database      filter the data in memory against a predefined database file and remove all items from memory which are in the 
                     database already (depends on load step)
categorize data      use a certain rule set (external configuration) to automatically categorize the items according to different properties
                     (like sender/receiver of money)
show data            show the actual data in memory
save database        store the actual data in database file (filter step must be executed first)

Moreover, the DSL should allow for evaluating the data in memory, with the following syntax

evaluation of date september, 2011 category "Clubbing" with display           evaluate the database and display evaluation for 
                                                                              september, 2011 in category "Clubbing"
evaluation of date always category "Clubbing" without display                 same as above, but for no specified time span
evaluation of date always category all with display                           same as above, but for no specified time span and no specified category
evaluation of date always category all without display                        same as above, but without display of data (yes, it might not make sense -
                                                                              but it's possible :))

Now, the thing is that I would like to be able to execute all these commands several times, for different data and – since the data is available as a list of string arrays – I know I can do whatever I want with it within Groovy.
So, if I do not have a GUI, what about a shell? Of course, Groovy has one. What about TAB completion on my shiny new DSL – sure, why not? (ok, honestly, due to type inferencing difficulties, this does not work completely, but it’s a help definitely – just try it!)

So, let’s go. Just start the Groovy shell (GROOVY_HOME/bin/groovysh) and copy & paste this code. Yes, that’s just the skeleton to show the important parts – but there is a working version here, which works if you are customer of the German Sparkasse (shouldn’t they pay me for this?). If you really want to use this “real”, working version, you will need a rule file, which must be named rules.properties and has key/value pairs of regular expressions for the sender field like PAYPAL=Paypal or ^GA=Geldautomat.

class Savings {
    def read = false
    def filtered = false
    def datenbank
    Savings(datenbank) {
        this.datenbank = datenbank
    }
    def load(sfile) {
        read = true
        println "file loaded."
        this
    }
    def display(dummy) {
        if (!read) println "no data available."
        else println "display called."
    }
    def filter(sfile) {
        filtered = true
        println "data is filtered."
        this
    }
    def categorize(cat) {
        println "categorize called with category ${cat}."
    }
    def save(sfile) {
        if (!filtered) println "data is not filtered, aborting storing."
        else println "data is stored in file $datenbank"; read = false
    }
}

class Evaluation {
    def month
    def year
    def display = false
    def category
    def date(Integer month, Integer year) {
        this.month = month
        this.year = year
        this
    }
    def date(all) {
        this.month = null
        this.year = null
        this
    }
    def without(display) {
        this.display = false
        evaluate()
    }
    def with(ausgabe) {
        this.display = true
        evaluate()
    }
    def category(category) {
        if (category.toString() != "all")
            this.category = category
        this
    }
    def evaluate() {
        println "evaluate ${(month != null) ? "month ${month+1} year $year" : "all"} in category $category ${display ? "with":"without"} display"
    }
}

enum words { savingsfile, category, of, data, database, all, always, with, without, display }
(january, february, march, april, may, june, july, august, september, october, november, december) = (0..11)
datenbank = "db.csv"
import static words.*

savings = new Savings(datenbank)

read = { file -> savings.load(file) }
showit = { speicher -> savings.display()}
filter = { file -> savings.filter(file)}
categorize = { speicher -> savings.categorize(speicher)}
store = { file -> savings.save(file)}
evaluation = { of -> new Evaluation() }

//sample commands

read savingsfile
filter database
categorize data
showit data
store database

evaluation of date september, 2011 category "Clubbing" with display
evaluation of date always category "Clubbing" without display
evaluation of date always category all with display
evaluation of date always category all without display

The last lines (from //sample commands) are actual commands of the DSL. Try these with TAB completion in the shell and see what’s possible and what’s not. I will not explain the functionality in detail, because there are several resources which can be used for clarification.

.conclusion.

I wanted to show that two hours can be enough to create your own DSL, together with the Groovy shell this is a really powerful way to let people with limited technical abilities use the power of the complete Java/Groovy platform in interactive mode! Imagine the possibilities – your business analysts will love you.

Obviously, I had to fit the DSL to the conventions of GEP 3 (and moreover, I had to rename “load”, “show” and “save”, since these are reserved keywords in the Groovy shell…), but still I think it’s worth it. But if you take care of DSL design and do not want the language to be restricted by the command chain conventions, look for a cleaner approach. For quick & dirty API usage easing, this works as a charm.

If this blog post was useful to you, you can tip me. Maybe you do not want to tip, but please inform yourself about Bitcoin and applications of Bitcoin like Youttipit.

Posted in DSL, Groovy | Leave a Comment »