CQ, Sling, Felix and my headaches
December 27 2009
these last six months have been incredibly full for me, i’ve learnt so many technologies and technical stuff: RubyOnRails web application development (and a bit of S3 cloud deploying), Hippo CMS 6 and Cocoon pipelines, and now Day CQ stack, which means JCR and Jackrabbit, Sling RESTful web framework, and OSGI bundles with Felix. oh my!
yep, i’m currently working for a big TLC italian company, developing their internal portal based on CQ5. i was completely new to content-repositories and web content management, but i got it quickly: it’s a different paradigm, data are modeled around resources, not around relations (as with relational databases).
btw, what i want to show is my journey with CQ stuff, and how our development approach has grown during the last weeks (and where it’s going). beware: there’s a lot of technical stuff (maven, Day CRX, Apache Sling, Apache Felix); i won’t explain everything in detail, so i’m referring to documentation and other blog posts.
so, first of all, start reading CQ tutorial on “How to Set Up the Development Environment with Eclipse”: please, spend almost one hour following all steps, even boring ones, like grabbing jars from CRX repository and putting them manually into local maven repository. in the end, you’ll have two projects (ui and core), one page with template (manually created and edited), executing a component as JSP script (imported through VLT), which uses “domain” logic provided by a plain old Java class (from core project). that’s a lot of stuff!
then, let’s enter the magical world of CQDE, a customized (old version of) Eclipse, which provides access to remote content (via webdav) from within an IDE, so that you can edit, compile and debug code as if it were stored locally (but it isn’t). at first, it seems a lot better than VLT-ing from commandline; but soon you’ll miss something: versioning, and sharing code with others. even if it’s not clear in the tutorial, ignoring VLT specific files lets Subversion also version content stored in src/main/content/jcr_root. that’s not always fun, like manually merging conflicts on XML files, but it’s really a lot better than blindly editing code with CQDE, with no way back! also, sometimes i’ve found editing pages as XML files much easier than using WCM editor (CQ authoring tool).
ok, relax, take a deep breath, and think about what you’ve done so far. do you like it? are you comfortable with this? well, i wasn’t; i missed my IDE-based development, checking-in and out code, running automatic tests all the time. the good news is we can do better than this, the bad news is we’ll still miss something (so far, red/green bars for UI). to recap, we can choose from:
- remote coding and debugging, with CQDE: no “native” versioning, VLT can be used as a “bridge” to Subversion
- local coding, with any IDE (eg Eclipse): still can’t compile JSP files, VLT used to deploy UI code
next step is (well, i’m a bit afraid, but time has come)… deploy an OSGI bundle with maven, with both UI code and initial content to put on repository.
step one: compiling JSP files locally. ingredients: JARs as local maven dependencies and sling maven jspc plugin.
i could not find any public Day maven repository (and it makes sense, from a business point of view), but as the tutorial shows, everything we need is already available from CRX. so, it takes long, but referring to the /libs/xyz/install convention and doing searches via CRX explorer you can come up with something like this:
#!/bin/bash
function grabDependency(){
  JAR_URL=$1
  REPOSITORY_DIR=~/.m2/repository/$2
  JAR_FILE=$3
  wget --user=admin --password=admin $JAR_URL
  mkdir -p $REPOSITORY_DIR
  mv $JAR_FILE $REPOSITORY_DIR
}
cd /tmp; rm -rf deps; mkdir deps; cd deps
grabDependency \
  http://localhost:4502/crx/repository/crx.default/libs/commons/install/day-commons-jstl-1.1.2.jar \
  com/day/commons/day-commons-jstl/1.1.2 \
  day-commons-jstl-1.1.2.jar
# ... grab other jar files
then, let’s add JSPC plugin to the maven build chain, and CQ and Sling dependencies (see attached file with sample code). this is a simple example; you’ll probably need to override plugin’s sling jar dependencies with versions used by application code!
<plugin>
  <groupId>org.apache.sling</groupId>
  <artifactId>maven-jspc-plugin</artifactId>
  <configuration>
    <compilerSourceVM>1.5</compilerSourceVM>
    <compilerTargetVM>1.5</compilerTargetVM>
  </configuration>
  <executions>
    <execution>
      <id>compile-jsp</id>
      <goals>
        <goal>jspc</goal>
      </goals>
    </execution>
  </executions>
</plugin>
moving JSP code into src/main/scripts (under apps/myApp subfolder) should be enough to have maven build (mvn clean compile). just remember to grab global.jsp from CRX and put it under src/main/scripts/libs/wcm folder. Eclipse also will compile (regenerate project files with mvn eclipse:eclipse), but it needs another copy of global.jsp into /libs/wcm (i know, it’s silly; i’ll check this next time).
step two: packaging an OSGI bundle with UI code and content nodes. ingredients: Felix maven bundle plugin.
the key concept for me was understanding what to put into the bundle. i was used to having JSP files on CRX under the /apps node, editing node properties such as jcr:primaryType (cq:Component, cq:Template and the like) and jcr:content. deploying the application as an OSGI bundle is slightly different: code is available as bundle resources (from the bundle itself), while only property nodes are copied from bundle to CRX repository, as initial content. this separation was not clear to me in the beginning, but it now makes sense (even if less duplication would be nice, for example in content structure).
so, we should create a bundle with:
- included resources: all required resources (maven resources and src/main/scripts folder) to be later referred
- bundle resources: .class and JSP files
- initial content: node properties, as JSON files (i decided to put them into src/main/resources, under CQ-INF/initial-content subfolder)
more details are available on the Sling website and on this blog post.
so, let’s add Felix bundle plugin to maven (remember to declare project bundle packaging with <packaging>bundle</packaging>):
<plugin>
  <groupId>org.apache.felix</groupId>
  <artifactId>maven-bundle-plugin</artifactId>
  <version>1.4.3</version>
  <extensions>true</extensions>
  <configuration>
    <instructions>
      <Export-Package>
        com.day.cq5.myapp.*;version=${pom.version},
        org.apache.jsp.apps.*;version=${pom.version}
      </Export-Package>
      <Import-Package>*</Import-Package>
      <Private-Package></Private-Package>
      <!-- included resources folders (to be later referred): maven resources and JSP files -->
      <Include-Resource>
        {maven-resources},
        src/main/scripts
      </Include-Resource>
      <!-- resources available from within bundle (not available as CRX nodes): compiled .class files and JSP files. -->
      <Sling-Bundle-Resources>
        /apps/myApp,
        /var/classes!/org/apache/jsp/apps/myApp
      </Sling-Bundle-Resources>
      <!-- content initially copied into CRX nodes: properties as JSON descriptors -->
      <Sling-Initial-Content>
        CQ-INF/initial-content/apps/myApp/; overwrite:=true; path:=/apps/myApp,
        CQ-INF/initial-content/content/sample/; overwrite:=true; path:=/content/sample
      </Sling-Initial-Content>
    </instructions>
  </configuration>
</plugin>
this should be enough to create a package with mvn clean package. we’re almost done...
step three: installing the bundle. ingredients: maven sling plugin.
with CQ there are two ways to install a bundle: putting it under the /apps/myApp/install folder or using the Felix console. i chose the latter, which turns out to be a plain POST request to the console URL. anyway, we can hook the maven build chain with the Sling plugin, this way:
<plugin>
  <groupId>org.apache.sling</groupId>
  <artifactId>maven-sling-plugin</artifactId>
  <version>2.0.4-incubator</version>
  <executions>
    <execution>
      <id>install-bundle</id>
      <goals>
        <goal>install</goal>
      </goals>
      <configuration>
        <slingUrl>http://localhost:4502/system/console/install</slingUrl>
        <user>admin</user>
        <password>admin</password>
      </configuration>
    </execution>
  </executions>
</plugin>
just type mvn install and we’re done.
that’s it. a lot of setup, especially if, like me, you’re new to maven and OSGI. anyway, i’ve written this mainly for later reference and to share thoughts with colleagues. i’ve shown three approaches to develop with CQ, tested in my daily work over the last month. in my view, deploying OSGI bundles is the best one, so far; it’s a trade-off between ease of use while debugging (yep, no UI automatic tests yet) and development lifecycle (versioning, building, packaging). i hope to gather much more info next year, and probably something will be easier! next step will be setting up automatic tests for JSP files, using Koskela’s JspTest tool.
sample code is here: please, follow README and have fun.
well, happy new year to everyone!
Amsterdamned
November 6 2009
i’m back.
for the last two weeks, i’ve been staying in the lovely city of Amsterdam, working for a customer of my dutch colleagues. challenging, amusing, funny and resource-consuming, here’s a brief recap of my last 15 days.
first of all, thanks from the bottom of my heart to Maurizio “daje forte” Mao Pillitu, for hosting me in his nice and comfortable home, just outside the city. he’s been very kind and friendly, i hope i have in some way paid him back with my italian-style cuisine.
so, i’ve been working for Hippo, a young and energetic open-source company born around their CMS product: it’s a nice building downtown, just a 15-minute walk from Dam square (yep, i loved walking through the city lanes after a full day of working). guys at Hippo are friendly and passionate, devoted to open-source; they also organize forge-fridays, sort of coding dojos with the focus on releasing working plugins (for Hippo CMS, of course) at the end of the afternoon.
Hippo CMS is gaining a lot of popularity among public institutions in the Netherlands, something my dutch colleagues have also been working hard on. but even if Hippo 7 is getting popular, there are still a lot of projects done with the older product version, Hippo 6. and that’s where my story begins.
i’ve been working for the municipality of Schijndel, a little dutch town, helping its IT management improve and automate meeting’s agenda and reports publishing. yeah, you heard it right: they record and publish (with a little delay, of course) audio and text content for every council’s meeting. being an italian citizen, all that transparency and devotion sounds strange, but is really laudable.
the first challenge i faced was, of course, translating all documentation from dutch to english, from analysis PDF to past emails with customer. i didn’t have everything clear at first, but thanks to double-checking with dutch colleagues i finally got it. (anyway, it’s funny that almost every translation from dutch puts the verb in the very last part of the sentence. it really reminded me of my latin classes, while at college).
then i finally entered the dark tunnel: technology viscosity and indecent web of dependencies, also known as Maven 1. gosh, i really had to work hard to have a successful build on top of Java 1.4, Axis2 and Cocoon 2.1, which turned out to be classpath monkey-patching, using ant tasks, jelly scripts and maven postGoals. damn!
add lack of support from webservice’s developers and consultants, and the soup is ready to be served! in fact, i just had a working test environment (i mean, representative of customer’s one, with valid data) almost 3 days before the project’s scheduled end. that’s awesome, isn’t it? how the hell did i manage to get the work done?
applying what i later called the “abstract and adapt” strategy: understand the domain, abstract from implementation details, then adapt code when things get clearer. well, that’s the hexagonal architecture (but, you know, we like coining sexy names). so, i spent the whole first week coding the application logic decoupled from real system behaviour, which in fact was unknown. Agenda and its Repository, Content and Storage, Indexer and Importer, these are all roles i’ve been writing, test-driven, from day one. that’s not easy, and of course it’s risky; but it was the best i could do.
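to make “abstract and adapt” a bit more concrete, here is a minimal sketch of the idea. all names (AgendaSource, FakeAgendaSource, AgendaImporter) are hypothetical and are not the classes from the real project: the point is only that domain logic depends on a small interface, so the webservice adapter can be rewritten later without touching it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AbstractAndAdaptSketch {

    // port: what the domain needs, with no hint of SOAP or WSDL details (hypothetical name)
    interface AgendaSource {
        List<String> fetchAgendaTitles();
    }

    // domain logic, written against the port only
    static class AgendaImporter {
        private final AgendaSource source;
        private final List<String> imported = new ArrayList<String>();

        AgendaImporter(AgendaSource source) {
            this.source = source;
        }

        // imports every title once and returns how many distinct titles are stored
        int importAll() {
            for (String title : source.fetchAgendaTitles()) {
                if (!imported.contains(title)) {
                    imported.add(title);
                }
            }
            return imported.size();
        }
    }

    // stand-in adapter, used while the real system behaviour is still unknown
    static class FakeAgendaSource implements AgendaSource {
        public List<String> fetchAgendaTitles() {
            return Arrays.asList("council meeting", "budget review", "council meeting");
        }
    }

    public static void main(String[] args) {
        AgendaImporter importer = new AgendaImporter(new FakeAgendaSource());
        System.out.println("imported " + importer.importAll() + " agenda");
    }
}
```

when the real webservice finally shows its behaviour, only a new AgendaSource implementation has to be written; AgendaImporter and its tests stay as they are.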
reading webservice specifications and WSDL, i could also guess how that slim layer should behave, but i really got it wrong at first! then, i had an a-ha moment during the first weekend, and changed the webservice adapter in order to reflect my new thoughts, without the need to modify domain logic so much (in fact, i also improved my domain knowledge). i changed unit tests, and added sort of spikes: tests with no assertions, just logging actual parsed responses, so that i could “see” with my eyes current webservice behaviour, at each test run.
and i was right! i clearly remember how shocking it was to read some parsed data in the console log, when they finally were set up on test environment! you know, i was going for lunch, i ran all tests one more time, before locking down workstation, and i saw that: “parsed 6 agenda”, followed by a so-nice full toString(). that was awesome, really: my tests told me setup was done before receiving a confirmation email by consultants, 30 minutes later!
then, i had my journey to Schijndel, to discuss deployment and testing on customer’s network. trip took 2 hours, i also had a 30 minutes stop in ’s-Hertogenbosch which i spent walking downtown, among nice gothic buildings and golden dragons.
it’s shocking how efficient dutch national transports website is, with its door-to-door journey planner, really. well, it’s a shame it’s not updated with temporarily moved bus stops, which could have saved me one hour in the late evening!
anyway, that’s it, a recap of techy stuff mixed with journey reports. thanks to the whole dutch office for the opportunity and drinks, looking forward to next works together!
It seams open
May 4 2009
i can clearly remember when i first discussed DIP and OCP with others: it was two years ago, during my apprenticeship as an XPer. to me, it was nothing new, i already had studied all the principles that now come under the SOLID acronym. but, probably, i hadn’t digested them enough: something that just came later with experience.
Dependency Inversion was for sure my favourite since then, for its multiple implications depending on what “high level” and “low level” mean; in my view, there are at least two meanings: abstraction (high level policies vs. low level details) and layering (close-to-user layer such as GUI vs. infrastructure layer). much more, i loved its love-hate with Dependency Injection (maybe more on this in a separate post).
Open-Closed is harder to understand, at first. how could you “change without modifying”? abstraction is the key! let modules depend on abstraction, then provide new implementations when behaviour has to change, without the need to modify existing code. in other words, always depend on abstraction: which, in the end, is DIP itself. OCP and DIP are such “yin and yang” in software design: they help achieve each other.
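a minimal sketch of “change without modifying”, with a made-up example (DiscountPolicy and Checkout are hypothetical names, chosen just for illustration): Checkout depends on an abstraction, and new behaviour arrives as a new implementation.

```java
public class OpenClosedSketch {

    // the abstraction client code depends on (hypothetical example)
    interface DiscountPolicy {
        int discountFor(int amount);
    }

    static class NoDiscount implements DiscountPolicy {
        public int discountFor(int amount) {
            return 0;
        }
    }

    // new behaviour added later, as a new class: Checkout is not modified
    static class TenPercentDiscount implements DiscountPolicy {
        public int discountFor(int amount) {
            return amount / 10;
        }
    }

    // closed for modification: it only knows the interface
    static class Checkout {
        private final DiscountPolicy policy;

        Checkout(DiscountPolicy policy) {
            this.policy = policy;
        }

        int total(int amount) {
            return amount - policy.discountFor(amount);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Checkout(new NoDiscount()).total(200));
        System.out.println(new Checkout(new TenPercentDiscount()).total(200));
    }
}
```

the high-level Checkout and the low-level policies both depend on the DiscountPolicy interface, which is where OCP and DIP meet.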
then, when i first discussed OCP with team-mates, i pushed for an analogy with Feathers’ Seam Model. it’s discussed in the “Working Effectively with Legacy Code” book: use seams to let legacy (which means untested) code be tested. to be fair, my analogy was not welcomed too much! i had to force my thesis a bit, and in the end not everybody was convinced.
two years later, it happened again! indeed, a few months ago we had a study group on OCP. i was in charge of preparing material to study on, and i chose a few corollary articles: first chapter from GoF Design Patterns book, which focuses on “design targeting interfaces”, and WELC chapter 4 “the Seam Model”. this time i was more convincing, and the analogy between OCP and the Seam Model became clear during our study session! and now, i want to tell you too.
after reading the bunch of articles i prepared, i asked a colleague to state in a few words what OCP was about. he said “change behaviour without modifying code”. great! then, i asked him again to state what the Seam Model was about, and he said “let code behave in a different way without modifying it”. well.. nothing left to say!
abstraction is the key, that’s true. but what about code which doesn’t follow OCP/DIP? it doesn’t depend on abstraction. we can modify it, refactoring, but we need an automatic test suite, in order to guarantee no behavioural change. and that’s exactly what Feathers’ model is about: change code a little bit by putting in seams, to test it in isolation.
on the other side: what seams can you use to test OCP-compliant code? of course, already existing abstractions: you just have to change enabling points (in a test, usually in fixture setup).
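here is a minimal sketch of an object seam and its enabling point. class names (Mailer, Notifier, RecordingMailer) are hypothetical; the technique is the “extract and override factory method” move described in WELC, shown in a simplified form.

```java
import java.util.ArrayList;
import java.util.List;

public class ObjectSeamSketch {

    static class Mailer {
        void send(String message) {
            throw new IllegalStateException("would talk to a real SMTP server");
        }
    }

    // legacy-style class: imagine it used to call "new Mailer()" inline;
    // extracting createMailer() is the small change that puts a seam in place
    static class Notifier {
        void notifyUser(String user) {
            createMailer().send("hello " + user);
        }

        // the seam: behaviour can vary here without editing notifyUser()
        protected Mailer createMailer() {
            return new Mailer();
        }
    }

    // test double that records messages instead of sending them
    static class RecordingMailer extends Mailer {
        final List<String> sent = new ArrayList<String>();

        @Override
        void send(String message) {
            sent.add(message);
        }
    }

    // the enabling point: the fixture decides which Mailer the Notifier gets
    static List<String> runWithRecordingMailer(String user) {
        final RecordingMailer recorder = new RecordingMailer();
        Notifier notifier = new Notifier() {
            @Override
            protected Mailer createMailer() {
                return recorder;
            }
        };
        notifier.notifyUser(user);
        return recorder.sent;
    }

    public static void main(String[] args) {
        System.out.println(runWithRecordingMailer("anna"));
    }
}
```

in OCP-compliant code the seam already exists as an injected abstraction, so the fixture would simply pass a different implementation to the constructor.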
in the report for our study session, on our internal wiki, we wrote:
We then discussed what OCP and Feathers’ Seam Model have in common:
- they seem the same idea, applied to reach different goals
- OCP: put abstractions to isolate from future source code changes
- Seam: put abstractions to test applications without changing its source code
- to recap
- closure/abstraction = seam
- “main() routine”/factories = enabling point
to be precise, Feathers’ model is about three different techniques, useful for testing legacy code written in any language, not just object-oriented ones. he talks about preprocessing seam, linking seam and object seam. so, the analogy with OCP is just between abstractions and object seams, even if sometimes linking techniques are also used to achieve abstraction (such as reflection or some configuration-based IoC tool).
so, when i watched Misko’s clean-code-talks videos, i was surprised to hear him use the term “seam” while talking about DI and SOLID principles: he confirmed my analogy makes sense!
Collections algorithms as infrastructure
March 14 2009
this week i’ve performed my first public kata, at the local XPUG, here in Milan. meeting’s topic was “internal iterators and blocks”, very close to my heart. i just had the idea a few days ago, and the eXtreme guys approved my session as soon as i proposed it. (this year seems promising, a lot of practical sessions planned… stay tuned!)
the main idea for a (code) kata is to perform a short practical exercise you feel confident with, almost without thinking; fingers should go on smoothly, like a dance. i chose to start from Matteo’s “birthday greetings” session on hexagonal architecture, as coded by Milo and me during an XPUG meeting. i did the kata at home a few times and collected on paper some guidelines. then i printed out “brief instructions”, which i distributed to the audience before starting. the plan was a kata of three short exercises.
first, i refactored existing code, splitting loops, extracting the InternalIterator / Predicate pair and then moving construction to facades like On and Being. second exercise was adding a new functionality, “send a kiss to female employees”; i drove implementation with unit tests, and focused on using just extracted iterator and predicate code. it finally looked like this:
public class BirthdayService { ...
public void sendGreetings(OurDate ourDate) throws Exception {
List employees = employeeFacade.loadEmployees();
for (Employee employee : On.items(employees).collect(Being.bornOn(ourDate))) {
sendGreetingTo(employee, happyBirthday(employee));
}
for (Employee employee : On.items(employees).collect(Being.female())) {
sendGreetingTo(employee, "A kiss for you");
}
}
}
then, the mini collections library – On.items().collect(predicate) – was first modified to be generic (with Java’s Generics) and then improved with other functionalities such as:
- findFirst(predicate)
- reject(predicate)
- contains(predicate)
- apply(transformation)
final code consisted of 3 interfaces for blocks, one class for internal iterator, a Not predicate implementation and, finally, the On facade.
the leitmotif for the session was “you know, i don’t really like algorithms, so i don’t want to write them twice!”. that funny sentence was just trying to show how much more readable and understandable the code was without noisy algorithm details (like useless foreach, if/else, results.add(), if not null, etc). even if it was late evening, audience was really following me (and i have to admit, i was going a bit too fast!). they also asked me some good questions.
Gabriele pointed out that maybe findFirst() should provide a default value parameter, instead of returning a null reference. with Roberto we also discussed how difficult it could be to implement returning a generic NullObject instance. Matteo went further on the idea of literal programming, suggesting something like:
employees.collect(bornToday).do(sendGreeting)
this would require iterator returning another iterator instead of a collection, or encapsulating every collection into a domain object.
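a minimal sketch of both suggestions, not the kata code: Items and Block are hypothetical names, collect() returns another iterator-like object so calls can be chained, and findFirst() takes a default value. since “do” is a reserved word in Java, the chained call is named each() here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChainedIteratorSketch {

    interface Predicate<T> {
        boolean evaluate(T item);
    }

    // hypothetical name for a "do something with each item" block
    interface Block<T> {
        void run(T item);
    }

    static class Items<T> {
        private final List<T> items;

        Items(List<T> items) {
            this.items = items;
        }

        // returns another Items instead of a List, so that calls can be chained
        Items<T> collect(Predicate<T> predicate) {
            List<T> result = new ArrayList<T>();
            for (T item : items) {
                if (predicate.evaluate(item)) {
                    result.add(item);
                }
            }
            return new Items<T>(result);
        }

        void each(Block<T> block) {
            for (T item : items) {
                block.run(item);
            }
        }

        // a default value is returned instead of a null reference
        T findFirst(Predicate<T> predicate, T defaultValue) {
            for (T item : items) {
                if (predicate.evaluate(item)) {
                    return item;
                }
            }
            return defaultValue;
        }
    }

    static class Even implements Predicate<Integer> {
        public boolean evaluate(Integer item) {
            return item % 2 == 0;
        }
    }

    // chains collect() and each(), recording every visited item
    static List<Integer> visitEvens(List<Integer> numbers) {
        final List<Integer> visited = new ArrayList<Integer>();
        new Items<Integer>(numbers).collect(new Even()).each(new Block<Integer>() {
            public void run(Integer item) {
                visited.add(item);
            }
        });
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(visitEvens(Arrays.asList(2, 3, 6)));
        System.out.println(new Items<Integer>(Arrays.asList(1, 3, 5)).findFirst(new Even(), -1));
    }
}
```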
something that really caught my attention was Gabriele suggesting moving predicates from Being facade into domain classes, with the pleasant side effect of removing useless getters on domain objects. that’s really a good idea! i’ll consider this for sure next time (kata was inspired by code from a project i no longer work on).
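a minimal sketch of what Gabriele’s suggestion could look like, assuming a simplified Employee with just a name and a flag (this is not the kata’s real domain class, and isFemale() is a hypothetical name): the predicate is created inside the domain class, so it can read the private field and no getter is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DomainPredicateSketch {

    interface Predicate<T> {
        boolean evaluate(T item);
    }

    // simplified, hypothetical domain class: note there is no isFemale getter
    static class Employee {
        private final String name;
        private final boolean female;

        Employee(String name, boolean female) {
            this.name = name;
            this.female = female;
        }

        // the predicate lives next to the data it inspects
        static Predicate<Employee> isFemale() {
            return new Predicate<Employee>() {
                public boolean evaluate(Employee employee) {
                    return employee.female;
                }
            };
        }

        @Override
        public String toString() {
            return name;
        }
    }

    static <T> List<T> collect(List<T> items, Predicate<T> predicate) {
        List<T> result = new ArrayList<T>();
        for (T item : items) {
            if (predicate.evaluate(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Employee> staff = Arrays.asList(new Employee("anna", true), new Employee("marco", false));
        System.out.println(collect(staff, Employee.isFemale()));
    }
}
```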
then, Matteo finally asked me how many “collection traverse” algorithms i would have encapsulated into an iterator. i could not answer, i thought it depends on business scenarios. but i just didn’t understand the question well: he was referring to the foreach loops, and that’s why he suggested that i probably would not have to write another foreach loop, they’re just MAP and REDUCE (transform and collect). nice hint, i should study some more functional programming; or even better, some more Smalltalk-Rubyish idioms! (yes, i’ve done a Google search for that. first rule to learn more: admit your ignorance)
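to check my understanding of Matteo’s hint, here is a minimal sketch (Reduction is a hypothetical interface, not part of the kata code): one loop for map, one loop for reduce, and collect written on top of reduce with no foreach of its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapReduceSketch {

    interface Transformation<FROM, TO> {
        TO applyOn(FROM item);
    }

    // hypothetical block that folds one item into an accumulated result
    interface Reduction<TYPE, RESULT> {
        RESULT combine(RESULT accumulated, TYPE item);
    }

    static <FROM, TO> List<TO> map(List<FROM> items, Transformation<FROM, TO> transformation) {
        List<TO> result = new ArrayList<TO>();
        for (FROM item : items) {
            result.add(transformation.applyOn(item));
        }
        return result;
    }

    static <TYPE, RESULT> RESULT reduce(List<TYPE> items, RESULT initial, Reduction<TYPE, RESULT> reduction) {
        RESULT result = initial;
        for (TYPE item : items) {
            result = reduction.combine(result, item);
        }
        return result;
    }

    // "collect even numbers" expressed as a reduce: no extra foreach loop needed
    static List<Integer> evens(List<Integer> numbers) {
        List<Integer> initial = new ArrayList<Integer>();
        return reduce(numbers, initial, new Reduction<Integer, List<Integer>>() {
            public List<Integer> combine(List<Integer> accumulated, Integer item) {
                if (item % 2 == 0) {
                    accumulated.add(item);
                }
                return accumulated;
            }
        });
    }

    // map then reduce: double every number, then sum the results
    static int sumOfDoubles(List<Integer> numbers) {
        List<Integer> doubled = map(numbers, new Transformation<Integer, Integer>() {
            public Integer applyOn(Integer item) {
                return item * 2;
            }
        });
        return reduce(doubled, 0, new Reduction<Integer, Integer>() {
            public Integer combine(Integer accumulated, Integer item) {
                return accumulated + item;
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(evens(Arrays.asList(2, 3, 6)));
        System.out.println(sumOfDoubles(Arrays.asList(1, 2, 3)));
    }
}
```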
here comes the best part. before going home, i pushed for this metaphor. hexagonal architecture (and DIP) promotes encapsulating low-level infrastructure behind an interface (a facade), which high-level client code interacts with by passing around domain objects; you just have to write an adapter once. for example, an EmployeeFacade interface and a DatabaseEmployeeAdapter – written just once – which takes Conditions to filter data searches; new business scenarios can be achieved by providing different Conditions, without the need to modify infrastructure code.
in the same way, i consider collection manipulation an infrastructure issue: what really matters is behaviour to be run on collection items. that’s why i really like writing algorithms just once, and reusing them heavily by providing specific behaviour for a given user story: concrete predicates, transformations and other blocks are the only domain code i have to write; iterators are hidden behind a (creation) facade, like On.
when i first thought about it, this sounded like an epiphany, but i can see it’s not so easy to understand, or to agree with. just let me know what you think about it.
ok, and finally, here is the code! tests first:
@Test
public void shouldCollectItemsMatchingAPredicate() throws Exception {
assertEquals(Arrays.asList(2, 6), On.items(Arrays.asList(2, 3, 6)).collect(new Even()));
}
@Test
public void shouldFindAnItemMatchingAPredicate() throws Exception {
assertEquals(2, On.items(Arrays.asList(1, 2, 4, 5)).findFirst(new Even()));
}
@Test
public void shouldReturnNullIfNotMatchingItemFound() throws Exception {
assertNull(On.items(Arrays.asList(1, 3, 5)).findFirst(new Even()));
}
@Test
public void shouldRejectItemsNotMatchingAPredicate() throws Exception {
assertEquals(Arrays.asList(1, 5), On.items(Arrays.asList(1, 2, 5, 4)).reject(new Even()));
}
@Test
public void shouldDetectIfAMatchingItemIsFound() throws Exception {
assertTrue(On.items(2, 5).contains(new Even()));
}
@Test
public void shouldApplyTransformationOnEachItem() throws Exception {
assertEquals(Arrays.asList(2, 4, 6), On.items(1, 2, 3).apply(new Doubler()));
}
public class Even implements Predicate<Integer>{
public boolean evaluate(Integer item) {
return item % 2 == 0;
}
}
public class Doubler implements SimpleTransformation<Integer> {
public Integer applyOn(Integer item) {
return item * 2;
}
}
then application code:
public interface Predicate<TYPE> {
boolean evaluate(TYPE item);
}
public interface Transformation<FROM, TO> {
public TO applyOn(FROM item);
}
public interface SimpleTransformation<TYPE> extends Transformation<TYPE, TYPE>{
}
public class InternalIterator<TYPE> {
private final List<TYPE> items;
public InternalIterator(List<TYPE> items) {
this.items = items;
}
public List<TYPE> collect(Predicate<TYPE> predicate) {
List<TYPE> result = new ArrayList<TYPE>();
for (TYPE eachItem : items) {
if (predicate.evaluate(eachItem)) {
result.add(eachItem);
}
}
return result;
}
public TYPE findFirst(Predicate<TYPE> predicate) {
List<TYPE> result = collect(predicate);
return result.isEmpty() ? null : result.get(0);
}
public List<TYPE> reject(Predicate<TYPE> predicate) {
return collect(new Not<TYPE>(predicate));
}
public boolean contains(Predicate<TYPE> predicate) {
return ! collect(predicate).isEmpty();
}
public <TO> List<TO> apply(Transformation<TYPE, TO> transformation) {
List<TO> result = new ArrayList<TO>();
for (TYPE eachItem : items) {
result.add(transformation.applyOn(eachItem));
}
return result;
}
}
public class Not<TYPE> implements Predicate<TYPE> {
private final Predicate<TYPE> predicate;
public Not(Predicate<TYPE> predicate) {
this.predicate = predicate;
}
public boolean evaluate(TYPE item) {
return ! predicate.evaluate(item);
}
}
public class On {
public static <TYPE> InternalIterator<TYPE> items(List<TYPE> items) {
return new InternalIterator<TYPE>(items);
}
public static <TYPE> InternalIterator<TYPE> items(TYPE... items) {
return On.items(Arrays.asList(items));
}
}
Chronicle of a (damn stupid!) bug
February 23 2009
mom said “test everything that could possibly break”. now, how to define “everything”? well, the story goes like this…
last week, me and Antonio were pairing trying to find out what was causing a very strange behaviour in our codebase. a batch process, loading data from a file and populating an “ATM transactions” table, should then put duplicate records on a “duplicate ATM transactions” table. so far, so good, we worked on stories like this a zillion times.
when our customer’s proxy finally ran the batch on a PREPRODUCTION environment, something went wrong. she then collected a log file, from which we started investigating where the problem could be. what a big surprise reading “unique constraint XYZ violated” for “duplicates” table! could it be? we put no constraints at all on that table! even more, the XYZ constraint was declared on “transactions” table, not duplicates!
at first sight, there was nothing wrong in the codebase; we then tried to reproduce the bug on our DEVELOPMENT environment too, with no luck. so, we decided to log more debug infos, such as “executing INSERT INTO transactions” and “executing INSERT INTO duplicates”. test all, commit, deploy and go!
after another run, this time on TEST environment, bug was there again, but at least we got more log to read. can you see? yeah, debug info was “executing INSERT INTO duplicates”, followed by the silly “constraint violated” exception! gosh! how could that be?
support from customer’s internal technical staff didn’t help much, we just found out that there was “something” different between DEVELOPMENT and TEST/PRE environments. very strange. anyway, we compared generated DDL for tables on both environments, but, again, nothing helpful.
then light came, thanks to our new team-mate Roberto, our little Oracle guru!
on our daily standup, we let the team know we had run out of ideas, so Roberto proposed himself for a pairing session with me (the team is working on a few projects, on which we rotate pairs on a weekly basis). after reviewing together (useless) DDL scripts, he then started showing me some magic with Oracle “reflection” stuff: queries on metadata, such as tables, constraints, and so on.
i was talking with a team mate, who just asked me something, when i turned back to the desk, and found Roberto with a smiley face saying “Jacopo, i know what’s wrong, but you’ve got to wait for our 5 minutes break!”. arghh! couldn’t really wait!
you know, the problem was in the only part we didn’t test: customer’s internal stored procedure invocations, for creating database object aliases!
we use incremental SQL scripts to recreate from scratch the database structure: we just skip stored procedures for grants and aliases, because they cannot be run in DEVELOPMENT environment. we delay that feedback to a manual run (a.k.a. a demo!) in TEST and PREPRODUCTION environments. and that’s exactly what we had: feedback on our broken SQL scripts. one of the aliases was wrong, duplicates table was pointing to transactions table! it was probably caused by a copy and paste from another script (and that’s the saddest part).
so, how to define “everything”? simple, everything!
Architecture toolbox
December 30 2008
yesterday i was reading a nice paper by Nat Pryce on testing asynchronous systems, presented at last XP Day in london. the topic is really interesting to me, but what caught my attention was the diagrams describing system X, Y and Z: what a nice and compact way of showing complex architectures and testing strategies! system Z resembles our last customer’s project, a mix of http, jms and soap. moreover, a service bus was chosen to interconnect applications, such as in system Y.
i’m really interested in software architectures and distributed systems. having a background in system administration and integration, and an engineering curriculum too, i always start considering existing standards and solutions when facing problems. but that’s half the story. being an XPer, i value simplicity. so, avoiding reinventing the wheel, while looking for simple strategies, which components are in our architecture toolbox?
first, when we need to store, retrieve and share data in a distributed system, we can choose:
- a RDBMS, using a relational model
- a filesystem, for flat (files) or hierarchical (directories) data
- a queue or topic, using a flat (and asynchronous) model
- a directory service, for hierarchical data
- e-mails, using a flat (and asynchronous) model
sharing data at resource level means using existing standard or vendor specific protocol and transport. here are a few:
- SQL and ODBC, or some platform-specific technology such as JDBC
- network filesystems such as NFS or SMB accessed through a socket. otherwise, WebDAV resources on top of HTTP, or the aged FTP. for local filesystems, you can rely on the OS and URL abstraction (using file:// scheme)
- a messaging system, such as JMS for Java or Microsoft MSMQ, which provide both a protocol for data access and a transport technology
- LDAP, using open APIs (such as OpenLDAP) or vendor’s transport technologies (e.g. Microsoft Active-Directory or ADAM)
- IMAP/POP3/SMTP standard protocols, accessed through a socket or through platform-specific APIs
when directly sharing resources is not feasible, we can put a lightweight proxy API in front of the resource we want to isolate. think of security issues (DMZ or firewalling rules), clustering (non functional requirements and system *-ility), and low-resource devices compatibility (e.g. for mobile devices). proxy API can be developed writing a little grammar for a protocol and then using a code-generation tool for parsing network traffic (on top of HTTP or custom sockets). otherwise, choose a standard protocol and transport:
- SOAP and WS-* standards on HTTP, or XML/RPC on TCP, both textual protocols
- CORBA or platform-specific standard like RMI for Java, both binary protocols
- REST services on HTTP, providing CRUD operations in term of HTTP methods
for more complex scenarios, we go beyond a distributed system sharing data: we need to interconnect independent applications. Enterprise Integration Patterns book shows four integration strategies: file transfer, shared database, remote procedure invocation and messaging. while for file, database and remote APIs we can reuse standards for sharing data, for messaging we need an ESB solution, built on top of existing technologies such as XML and JMS.
finally, there are a few additional services we usually need. for authentication and authorization, instead of a custom or integrated vendor solution (login forms or SSO technologies) what about using HTTP or HTTPS? otherwise we can wrap existing transport inside SSL. also LDAP can help. then, we probably need to monitor systems and applications. again, before implementing a custom solution, consider using existing standards such as (for Java) JMX or (the old one) SNMP.
and you, what’s in your toolbox?
