Amsterdamned

November 6 2009

i’m back.

for the last two weeks, i’ve been staying in the lovely city of Amsterdam, working for a customer of my dutch colleagues. challenging, amusing, funny and resource-consuming, here’s a brief recap of my last 15 days.

first of all, thanks from the bottom of my heart to Maurizio “daje forte” Mao Pillitu, for hosting me in his nice and comfortable home, just outside the city. he’s been very kind and friendly, and i hope i have in some way paid him back with my italian-style cuisine.

so, i’ve been working for Hippo, a young and energetic open-source company born around their CMS product. their office is in a nice building downtown, just a 15-minute walk from Dam square (yep, i loved walking through the city lanes after a full day of work). the guys at Hippo are friendly and passionate, devoted to open-source; they also organize forge-fridays, a sort of coding dojo focused on releasing working plugins (for Hippo CMS, of course) by the end of the afternoon.

Hippo CMS is gaining a lot of popularity among public institutions in the Netherlands, something my dutch colleagues have been working hard on too. but even if Hippo 7 is getting popular, there are still a lot of projects done with the older product version, Hippo 6. and that’s where my story begins.

i’ve been working for the municipality of Schijndel, a little dutch town, helping its IT management improve and automate the publishing of meeting agendas and reports. yeah, you heard it right: they record and publish (with a little delay, of course) audio and text content for every council meeting. to an italian citizen, all that transparency and devotion sounds strange, but it is really laudable.

the first challenge i faced was, of course, translating all the documentation from dutch to english, from analysis PDFs to past emails with the customer. i didn’t have everything clear at first, but thanks to double-checking with dutch colleagues i finally got it. (anyway, it’s funny how almost every translation from dutch puts the verb at the very end of the sentence. it really reminded me of my latin classes, back at college).

then i finally entered the dark tunnel: technology viscosity and an indecent web of dependencies, also known as Maven 1. gosh, i really had to work hard to get a successful build on top of Java 1.4, Axis2 and Cocoon 2.1, which turned out to mean classpath monkey-patching, using ant tasks, jelly scripts and maven postGoals. damn!

add the lack of support from the webservice’s developers and consultants, and the soup is ready to be served! in fact, i only had a working test environment (i mean, representative of the customer’s one, with valid data) about 3 days before the project’s scheduled end. that’s awesome, isn’t it? how the hell did i manage to get the work done?

applying what i later called the “abstract and adapt” strategy: understand the domain, abstract from implementation details, then adapt code when things get clearer. well, that’s the hexagonal architecture (but, you know, we like coining sexy names). so, i spent the whole first week coding the application logic decoupled from real system behaviour, which in fact was unknown. Agenda and its Repository, Content and Storage, Indexer and Importer, these are all roles i’ve been writing, test-driven, from day one. that’s not easy, and of course it’s risky; but it was the best i could do.
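to make “abstract and adapt” a bit more concrete, here is a minimal sketch of how such roles could be coded before the real webservice is known. only the role names (Agenda, Repository, Storage, Importer) come from the project; every method, field and test double below is an assumption made up for illustration, not the actual project code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// hypothetical domain object: the post only names the role "Agenda"
class Agenda {
    final String title;

    Agenda(String title) {
        this.title = title;
    }
}

// port: what the domain needs from the outside world, in domain terms
interface AgendaRepository {
    List<Agenda> findAll();
}

// port for the publishing side (the role called "Storage" above)
interface Storage {
    void store(Agenda agenda);
}

// domain logic, written and tested against the ports only
class Importer {
    private final AgendaRepository repository;
    private final Storage storage;

    Importer(AgendaRepository repository, Storage storage) {
        this.repository = repository;
        this.storage = storage;
    }

    // copies every agenda from the repository to the storage, returns how many
    int importAll() {
        int imported = 0;
        for (Agenda agenda : repository.findAll()) {
            storage.store(agenda);
            imported++;
        }
        return imported;
    }
}

// stand-in adapter, used until the real webservice adapter can be written
class InMemoryAgendaRepository implements AgendaRepository {
    private final List<Agenda> agendas;

    InMemoryAgendaRepository(Agenda... agendas) {
        this.agendas = Arrays.asList(agendas);
    }

    public List<Agenda> findAll() {
        return agendas;
    }
}

// stand-in storage that just remembers what it was asked to store
class RecordingStorage implements Storage {
    final List<String> storedTitles = new ArrayList<String>();

    public void store(Agenda agenda) {
        storedTitles.add(agenda.title);
    }
}
```

when the webservice behaviour finally gets clear, only a new AgendaRepository implementation has to be written (or rewritten); Importer and its tests should stay as they are.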

reading the webservice specifications and WSDL, i could also guess how that slim layer should behave, but i really got it wrong at first! then, i had an a-ha moment during the first weekend, and changed the webservice adapter in order to reflect my new thoughts, without the need to modify domain logic so much (in fact, i also improved my domain knowledge). i changed unit tests, and added a sort of spikes: tests with no assertions, just logging the actual parsed responses, so that i could “see” with my own eyes the current webservice behaviour, at each test run.

and i was right! i clearly remember how shocking it was to read some parsed data in the console log, when they were finally set up on the test environment! you know, i was going for lunch, i ran all tests one more time before locking down the workstation, and i saw that: “parsed 6 agenda”, followed by a so-nice full toString(). that was awesome, really: my tests told me the setup was done before i received a confirmation email from the consultants, 30 minutes later!

then, i had my journey to Schijndel, to discuss deployment and testing on the customer’s network. the trip took 2 hours, with a 30-minute stop in ’s-Hertogenbosch, which i spent walking downtown, among nice gothic buildings and golden dragons.

it’s shocking how efficient the dutch national transport website is, with its door-to-door journey planner, really. well, it’s a shame it’s not updated with temporarily moved bus stops, which would have saved me one hour in the late evening!

anyway, that’s it, a recap of techy stuff mixed with journey reports. thanks to the whole dutch office for the opportunity and the drinks, looking forward to working together again!

It seams open

May 4 2009

i can clearly remember when i first discussed DIP and OCP with others: it was two years ago, during my apprenticeship as an XPer. to me, it was nothing new, i had already studied all the principles that now come under the SOLID acronym. but, probably, i hadn’t digested them enough: something that only came later with experience.

Dependency Inversion has been for sure my favourite since then, for its multiple implications depending on what “high level” and “low level” mean; in my view, there are at least two meanings: abstraction (high level policies vs. low level details) and layering (close-to-user layer such as GUI vs. infrastructure layer). even more, i loved its love-hate relationship with Dependency Injection (maybe more on this in a separate post).

Open-Closed is harder to understand, at first. how could you “change without modifying”? abstraction is the key! let modules depend on abstraction, then provide new implementations when behaviour has to change, without the need to modify existing code. in other words, always depend on abstraction: which, in the end, is DIP itself. OCP and DIP are such “yin and yang” in software design: they help achieve each other.

then, when i first discussed OCP with team-mates, i pushed for an analogy with Feathers’ Seam Model. it’s discussed in the book “Working Effectively with Legacy Code”: use seams to let legacy (which means untested) code be tested. to be fair, my analogy was not welcomed too much! i had to force my thesis a bit, and in the end not everybody was convinced.

two years later, it happened again! indeed, a few months ago we had a study group on OCP. i was in charge of preparing material to study on, and i chose a few corollary articles: first chapter from GoF Design Patterns book, which focuses on “design targeting interfaces”, and WELC chapter 4 “the Seam Model”. this time i was more convincing, and the analogy between OCP and the Seam Model became clear during our study session! and now, i want to tell you too.

after reading the bunch of articles i prepared, i asked a colleague to state in a few words what OCP was about. he said “change behaviour without modifying code”. great! then, i asked him again to state what the Seam Model was about, and he said “let code behave in a different way without modifying it”. well.. nothing left to say!

abstraction is the key, that’s true. but what about code which doesn’t follow OCP/DIP? it doesn’t depend on abstraction. we can modify it, by refactoring, but we need an automatic test suite, in order to guarantee no behavioural change. and that’s exactly what Feathers’ model is about: change code a little bit by putting in seams, to test it in isolation.

on the other side: what seams can you use to test OCP-compliant code? of course, already existing abstractions: you just have to change enabling points (in a test, usually in fixture setup).

in the report for our study session, on our internal wiki, we wrote:

We then discussed what OCP and Feathers’ Seam Model have in common:

  • they seem the same idea, applied to reach different goals
    • OCP: put abstractions to isolate from future source code changes
    • Seam: put abstractions to test an application without changing its source code
  • to recap
    • closure/abstraction = seam
    • “main() routine”/factories = enabling point

to be precise, Feathers’ model is about three different techniques, useful for testing legacy code written in any language, not just object-oriented ones. he talks about preprocessing seams, linking seams and object seams. so, the analogy with OCP is just between abstractions and object seams, even if sometimes linking techniques are also used to achieve abstraction (such as reflection or some configuration-based IoC tool).
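a minimal sketch of the “abstraction = seam, constructor/factory = enabling point” idea, limited to object seams. all the names here (Clock, SystemClock, FixedClock, Greeter) are made up for this example; they come neither from the book nor from our codebase.

```java
import java.util.Calendar;

// the abstraction is the seam: Greeter's behaviour can change here without editing Greeter
interface Clock {
    int currentHour();
}

// production implementation, reads the real time of day
class SystemClock implements Clock {
    public int currentHour() {
        return Calendar.getInstance().get(Calendar.HOUR_OF_DAY);
    }
}

// test implementation, always answers the same hour
class FixedClock implements Clock {
    private final int hour;

    FixedClock(int hour) {
        this.hour = hour;
    }

    public int currentHour() {
        return hour;
    }
}

class Greeter {
    private final Clock clock;

    // the constructor is the enabling point: whoever builds a Greeter picks the Clock
    Greeter(Clock clock) {
        this.clock = clock;
    }

    String greet() {
        return clock.currentHour() < 12 ? "good morning" : "good afternoon";
    }
}
```

in production the main() routine (or a factory) would build `new Greeter(new SystemClock())`; in a test, the fixture setup builds `new Greeter(new FixedClock(9))`. the same abstraction serves both OCP (new behaviour, no change to Greeter) and testing in isolation.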

so, when i watched Misko’s clean-code-talks videos, i was surprised to hear him use the term “seam” while talking about DI and SOLID principles: he confirmed my analogy makes sense!

this week i performed my first public kata, at the local XPUG, here in Milan. the meeting’s topic was “internal iterators and blocks”, very close to my heart. i just had the idea a few days ago, and the eXtreme guys approved my session as soon as i proposed it. (this year seems promising, a lot of practical sessions planned… stay tuned!)

the main idea for a (code) kata is to perform a short practical exercise you feel confident with, almost without thinking; fingers should go on smoothly, like a dance. i chose to start from Matteo’s “birthday greetings” session on hexagonal architecture, as coded by Milo and me during an XPUG meeting. i did the kata at home a few times and collected some guidelines on paper. then i printed out “brief instructions”, which i distributed to the audience before starting. the plan was a kata of three short exercises.

first, i refactored existing code, splitting loops, extracting the InternalIterator / Predicate pair and then moving construction to facades like On and Being. the second exercise was adding a new functionality, “send a kiss to female employees”; i drove the implementation with unit tests, and focused on using just the extracted iterator and predicate code. it finally looked like this:


public class BirthdayService { ...

  public void sendGreetings(OurDate ourDate) throws Exception {
    List employees = employeeFacade.loadEmployees();

    for (Employee employee : On.items(employees).collect(Being.bornOn(ourDate))) {
      sendGreetingTo(employee, happyBirthday(employee));
    }

    for (Employee employee : On.items(employees).collect(Being.female())) {
      sendGreetingTo(employee, "A kiss for you");
    }
  }
}

then, the mini collections library – On.items().collect(predicate) – was first modified to be generic (with Java’s Generics) and then improved with other functionalities such as:

  • findFirst(predicate)
  • reject(predicate)
  • contains(predicate)
  • apply(transformation)

final code consisted of 3 interfaces for blocks, one class for internal iterator, a Not predicate implementation and, finally, the On facade.

the leitmotif for the session was “you know, i don’t really like algorithms, so i don’t want to write them twice!”. that funny sentence was just trying to show how much more readable and understandable the code was without noisy algorithm details (like useless foreach, if/else, results.add(), if not null, etc). even if it was late evening, audience was really following me (and i have to admit, i was going a bit too fast!). they also asked me some good questions.

Gabriele pointed out that maybe findFirst() should provide a default value parameter, instead of returning a null reference. with Roberto we also discussed how difficult it could be to implement returning a generic NullObject instance. Matteo went further on the idea of literal programming, suggesting something like:


employees.collect(bornToday).do(sendGreeting)

this would require iterator returning another iterator instead of a collection, or encapsulating every collection into a domain object.
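just to try these suggestions out, here is a rough sketch of the first option (an iterator returning another iterator), together with Gabriele’s default value for findFirst(). ChainableIterator and Block are hypothetical names, not part of the kata code; and since `do` is a reserved word in Java, the last step is called each() here.

```java
import java.util.ArrayList;
import java.util.List;

interface Predicate<TYPE> {
    boolean evaluate(TYPE item);
}

// hypothetical block with a side effect and no result, e.g. "send greeting"
interface Block<TYPE> {
    void run(TYPE item);
}

// hypothetical variant of InternalIterator: collect() answers another iterator
class ChainableIterator<TYPE> {
    private final List<TYPE> items;

    ChainableIterator(List<TYPE> items) {
        this.items = items;
    }

    // keeps matching items, wrapped again so that calls can be chained
    ChainableIterator<TYPE> collect(Predicate<TYPE> predicate) {
        List<TYPE> result = new ArrayList<TYPE>();
        for (TYPE eachItem : items) {
            if (predicate.evaluate(eachItem)) {
                result.add(eachItem);
            }
        }
        return new ChainableIterator<TYPE>(result);
    }

    // runs the block on every item; stands in for the "do" of the suggestion
    void each(Block<TYPE> block) {
        for (TYPE eachItem : items) {
            block.run(eachItem);
        }
    }

    // answers the first matching item, or the given default instead of null
    TYPE findFirst(Predicate<TYPE> predicate, TYPE defaultValue) {
        for (TYPE eachItem : items) {
            if (predicate.evaluate(eachItem)) {
                return eachItem;
            }
        }
        return defaultValue;
    }
}
```

with this in place, client code could read something like `employees.collect(bornToday).each(sendGreeting)`, where employees is a ChainableIterator, bornToday a Predicate and sendGreeting a Block.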

something that really caught my attention was Gabriele suggesting moving predicates from the Being facade into domain classes, with the pleasant side effect of removing useless getters on domain objects. that’s really a good idea! i’ll consider this for sure next time (the kata was inspired by code from a project i no longer work on).
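here is how i imagine Gabriele’s suggestion could look. it is only a sketch: the real Employee class is not shown in this post, so the fields and the bornOn(month, day) signature below are assumptions.

```java
interface Predicate<TYPE> {
    boolean evaluate(TYPE item);
}

// hypothetical domain class: fields are guessed, the real one is not shown here
class Employee {
    private final String name;
    private final int birthMonth;
    private final int birthDay;

    Employee(String name, int birthMonth, int birthDay) {
        this.name = name;
        this.birthMonth = birthMonth;
        this.birthDay = birthDay;
    }

    // the predicate lives inside the domain class, so it can read private
    // fields directly and no getBirthMonth()/getBirthDay() getters are needed
    static Predicate<Employee> bornOn(final int month, final int day) {
        return new Predicate<Employee>() {
            public boolean evaluate(Employee employee) {
                return employee.birthMonth == month && employee.birthDay == day;
            }
        };
    }

    public String toString() {
        return name;
    }
}
```

client code would then say `Employee.bornOn(5, 4)` where it previously said `Being.bornOn(ourDate)`.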

then, Matteo finally asked me how many “collection traverse” algorithms i would have encapsulated into an iterator. i could not answer, i thought it depended on business scenarios. but i just hadn’t understood the question well: he was referring to the foreach loops, and that’s why he suggested that i would probably never need to write another foreach loop, they’re just MAP and REDUCE (transform and collect). nice hint, i should study some more functional programming; or even better, some more Smalltalk-Rubysh idioms! (yes, i’ve done a Google search for that. first rule to learn more: admit your ignorance)
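one possible reading of Matteo’s hint, as far as i understand it: apply() is already a MAP, and a single REDUCE could be the only other loop, with collect() (and friends) expressed as reductions. the Reduction interface and the Fold, Sum and CollectEven classes below are hypothetical names, sketched just to try the idea out.

```java
import java.util.List;

// hypothetical block: combines the result accumulated so far with the next item
interface Reduction<ITEM, RESULT> {
    RESULT combine(RESULT soFar, ITEM item);
}

class Fold {
    // the only loop: threads an accumulated result through all the items
    static <ITEM, RESULT> RESULT reduce(List<ITEM> items, RESULT initial,
                                        Reduction<ITEM, RESULT> reduction) {
        RESULT result = initial;
        for (ITEM eachItem : items) {
            result = reduction.combine(result, eachItem);
        }
        return result;
    }
}

// reduces a list of integers to their sum
class Sum implements Reduction<Integer, Integer> {
    public Integer combine(Integer soFar, Integer item) {
        return soFar + item;
    }
}

// collect(new Even()) rewritten as a reduction: adds even items to the given list
class CollectEven implements Reduction<Integer, List<Integer>> {
    public List<Integer> combine(List<Integer> soFar, Integer item) {
        if (item % 2 == 0) {
            soFar.add(item);
        }
        return soFar;
    }
}
```

for example, `Fold.reduce(Arrays.asList(1, 2, 3), 0, new Sum())` sums the items, while passing a fresh ArrayList and a CollectEven keeps only the even ones.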

here comes the best part. before going home, i pushed for this metaphor. hexagonal architecture (and DIP) promotes encapsulating low-level infrastructure behind an interface (a facade), with which high-level client code interacts by passing around domain objects; you just have to write an adapter once. for example, an EmployeeFacade interface and a DatabaseEmployeeAdapter – written just once – which takes Conditions to filter data searches; new business scenarios can be achieved by providing different Conditions, without the need to modify infrastructure code.

in the same way, i consider collection manipulation an infrastructure issue: what really matters is the behaviour to be run on collection items. that’s why i really like writing algorithms just once, and reusing them heavily by providing specific behaviour for a given user story: concrete predicates, transformations and other blocks are the only domain code i have to write; iterators are hidden behind a (creation) facade, like On.

when i first thought about it, this sounded like an epiphany, but i can see it’s not so easy to understand, or to agree with. just let me know what you think about it.

ok, and finally, here is the code! tests first:


@Test
public void shouldCollectItemsMatchingAPredicate() throws Exception {
  assertEquals(Arrays.asList(2, 6), On.items(Arrays.asList(2, 3, 6)).collect(new Even()));
}

@Test
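public void shouldFindAnItemMatchingAPredicate() throws Exception {
  // Integer.valueOf avoids the ambiguity javac reports between assertEquals(long, long) and assertEquals(Object, Object)
  assertEquals(Integer.valueOf(2), On.items(Arrays.asList(1, 2, 4, 5)).findFirst(new Even()));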
}

@Test
public void shouldReturnNullIfNotMatchingItemFound() throws Exception {
  assertNull(On.items(Arrays.asList(1, 3, 5)).findFirst(new Even()));
}

@Test
public void shouldRejectItemsMatchingAPredicate() throws Exception {
  assertEquals(Arrays.asList(1, 5), On.items(Arrays.asList(1, 2, 5, 4)).reject(new Even()));
}

@Test
public void shouldDetectIfAMatchingItemIsFound() throws Exception {
  assertTrue(On.items(2, 5).contains(new Even()));
}

@Test
public void shouldApplyTransformationOnEachItem() throws Exception {
  assertEquals(Arrays.asList(2, 4, 6), On.items(1, 2, 3).apply(new Doubler()));
}

public class Even implements Predicate<Integer>{
  public boolean evaluate(Integer item) {
    return item % 2 == 0;
  }
}

public class Doubler implements SimpleTransformation<Integer> {
  public Integer applyOn(Integer item) {
    return item * 2;
  }
}

then application code:

public interface Predicate<TYPE> {
  boolean evaluate(TYPE item);
}

public interface Transformation<FROM, TO> {
  public TO applyOn(FROM item);
}

public interface SimpleTransformation<TYPE> extends Transformation<TYPE, TYPE>{
}

public class InternalIterator<TYPE> {
  private final List<TYPE> items;

  public InternalIterator(List<TYPE> items) {
    this.items = items;
  }

  public List<TYPE> collect(Predicate<TYPE> predicate) {
    List<TYPE> result = new ArrayList<TYPE>();
    for (TYPE eachItem : items) {
      if (predicate.evaluate(eachItem)) {
        result.add(eachItem);
      }
    }
    return result;
  }

  public TYPE findFirst(Predicate<TYPE> predicate) {
    List<TYPE> result = collect(predicate);
    return result.isEmpty() ? null : result.get(0);
  }

  public List<TYPE> reject(Predicate<TYPE> predicate) {
    return collect(new Not<TYPE>(predicate));
  }

  public boolean contains(Predicate<TYPE> predicate) {
    return ! collect(predicate).isEmpty();
  }

  public <TO> List<TO> apply(Transformation<TYPE, TO> transformation) {
    List<TO> result = new ArrayList<TO>();
    for (TYPE eachItem : items) {
      result.add(transformation.applyOn(eachItem));
    }
    return result;
  }
}

public class Not<TYPE> implements Predicate<TYPE> {
  private final Predicate<TYPE> predicate;

  public Not(Predicate<TYPE> predicate) {
    this.predicate = predicate;
  }

  public boolean evaluate(TYPE item) {
    return ! predicate.evaluate(item);
  }
}

public class On {

  public static <TYPE> InternalIterator<TYPE> items(List<TYPE> items) {
    return new InternalIterator<TYPE>(items);
  }

  public static <TYPE> InternalIterator<TYPE> items(TYPE... items) {
    return On.items(Arrays.asList(items));
  }
}

mom said “test everything that could possibly break”. now, how to define “everything”? well, the story goes like this…

last week, Antonio and i were pairing, trying to find out what was causing a very strange behaviour in our codebase. a batch process, loading data from a file and populating an “ATM transactions” table, should then put duplicate records on a “duplicate ATM transactions” table. so far, so good, we have worked on stories like this a zillion times.

when our customer’s proxy finally ran the batch on a PREPRODUCTION environment, something went wrong. she then collected a log file, from which we started investigating where the problem could be. what a big surprise reading “unique constraint XYZ violated” for the “duplicates” table! how could that be? we put no constraints at all on that table! even more, the XYZ constraint was declared on the “transactions” table, not duplicates!

at first sight, there was nothing wrong in the codebase; we then tried to reproduce the bug on our DEVELOPMENT environment too, with no luck. so, we decided to log more debug info, such as “executing INSERT INTO transactions” and “executing INSERT INTO duplicates”. test all, commit, deploy and go!

after another run, this time on the TEST environment, the bug was there again, but at least we got more log to read. can you see? yeah, debug info was “executing INSERT INTO duplicates”, followed by the silly “constraint violated” exception! gosh! how could that be?

support from the customer’s internal technical staff didn’t help much, we just found out that there was “something” different between the DEVELOPMENT and TEST/PRE environments. very strange. anyway, we compared the generated DDL for tables on both environments, but, again, nothing helpful.

then light came, thanks to our new team-mate Roberto, our little Oracle guru!

at our daily standup, we let the team know we were out of ideas, so Roberto volunteered for a pairing session with me (the team is working on a few projects, and we rotate pairs on a weekly basis). after reviewing the (useless) DDL scripts together, he started showing me some magic with Oracle “reflection” stuff: queries on metadata, such as tables, constraints, and so on.

i was talking with a team mate, who had just asked me something; when i turned back to the desk, i found Roberto with a smiling face saying “Jacopo, i know what’s wrong, but you’ve got to wait for our 5 minutes break!”. arghh! i really couldn’t wait!

you know, the problem was in the only part we didn’t test: the invocations of the customer’s internal stored procedures, for creating database object aliases!

we use incremental SQL scripts to recreate the database structure from scratch: we just skip the stored procedures for grants and aliases, because they cannot be run in the DEVELOPMENT environment. we delay that feedback to a manual run (a.k.a. a demo!) in the TEST and PREPRODUCTION environments. and that’s exactly what we had: feedback on our broken SQL scripts. one of the aliases was wrong, the duplicates table was pointing to the transactions table! it was probably caused by a copy and paste from another script (and that’s the saddest part).

so, how to define “everything”? simple, everything!

Architecture toolbox

December 30 2008

yesterday i was reading a nice paper by Nat Pryce on testing asynchronous systems, presented at the last XP Day in london. the topic is really interesting to me, but what caught my attention was the diagrams describing systems X, Y and Z: what a nice and compact way of showing complex architectures and testing strategies! system Z resembles our last customer’s project, a mix of http, jms and soap. moreover, a service bus was chosen to interconnect applications, as in system Y.

i’m really interested in software architectures and distributed systems. having a background in system administration and integration, and an engineering curriculum too, i always start by considering existing standards and solutions when facing problems. but that’s half the story. being an XPer, i value simplicity. so, to avoid reinventing the wheel while looking for simple strategies, which components are in our architecture toolbox?

first, when we need to store, retrieve and share data in a distributed system, we can choose:

  1. a RDBMS, using a relational model
  2. a filesystem, for flat (files) or hierarchical (directories) data
  3. a queue or topic, using a flat (and asynchronous) model
  4. a directory service, for hierarchical data
  5. e-mails, using a flat (and asynchronous) model

sharing data at resource level means using an existing standard or vendor-specific protocol and transport. here are a few:

  1. SQL and ODBC, or some platform-specific technology such as JDBC
  2. network filesystems such as NFS or SMB accessed through a socket. otherwise, WebDAV resources on top of HTTP, or the aged FTP. for local filesystems, you can rely on the OS and URL abstraction (using the file:// scheme)
  3. a messaging system, such as JMS for Java or Microsoft MSMQ, which provide both a protocol for data access and a transport technology
  4. LDAP, using open APIs (such as OpenLDAP) or vendor’s transport technologies (e.g. Microsoft Active-Directory or ADAM)
  5. IMAP/POP3/SMTP standard protocols, accessed through a socket or through platform-specific APIs

when directly sharing resources is not feasible, we can put a lightweight proxy API in front of the resource we want to isolate. think of security issues (DMZ or firewalling rules), clustering (non functional requirements and system *-ility), and low-resource devices compatibility (e.g. for mobile devices). proxy API can be developed writing a little grammar for a protocol and then using a code-generation tool for parsing network traffic (on top of HTTP or custom sockets). otherwise, choose a standard protocol and transport:

  • SOAP and WS-* standards, or XML-RPC, both textual protocols typically carried on HTTP
  • CORBA or a platform-specific standard like RMI for Java, both binary protocols
  • REST services on HTTP, providing CRUD operations in terms of HTTP methods
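about the last item: a tiny sketch of the usual mapping between CRUD operations and HTTP methods. this is just the common convention (some APIs choose differently, for example PUT for create), and the Crud enum and RestMapping class are names made up for this example.

```java
import java.util.EnumMap;
import java.util.Map;

// hypothetical names, only to illustrate "CRUD in terms of HTTP methods"
enum Crud { CREATE, READ, UPDATE, DELETE }

class RestMapping {
    // answers the conventional HTTP method for each CRUD operation
    static Map<Crud, String> conventionalMethods() {
        Map<Crud, String> methods = new EnumMap<Crud, String>(Crud.class);
        methods.put(Crud.CREATE, "POST");   // create a new resource under a collection
        methods.put(Crud.READ, "GET");      // retrieve a representation, no side effects
        methods.put(Crud.UPDATE, "PUT");    // replace the resource at a known URL
        methods.put(Crud.DELETE, "DELETE"); // remove the resource
        return methods;
    }
}
```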

for more complex scenarios, we go beyond a distributed system sharing data: we need to interconnect independent applications. Enterprise Integration Patterns book shows four integration strategies: file transfer, shared database, remote procedure invocation and messaging. while for file, database and remote APIs we can reuse standards for sharing data, for messaging we need an ESB solution, built on top of existing technologies such as XML and JMS.

finally, there are a few additional services we usually need. for authentication and authorization, instead of a custom or integrated vendor solution (login forms or SSO technologies) what about using HTTP or HTTPS? otherwise we can wrap the existing transport inside SSL. LDAP can help too. then, we probably need to monitor systems and applications. again, before implementing a custom solution, consider using existing standards such as (for Java) JMX or (the old one) SNMP.

and you, what’s in your toolbox?