Friday, December 21, 2007

The Fluxology Office and GigaSpaces

Yesterday Paul successfully applied for the GigaSpaces startup program.
We will work with the exciting GigaSpaces technology, exploring both technical and business opportunities. We think GigaSpaces is an ace especially in the finance and telecommunications markets, where the capacity to handle a huge number of transactions with high availability, and to scale horizontally for performance, is crucial.

"GigaSpaces provides infrastructure software solutions that deliver unparalleled dynamic scalability for high-volume transactional applications, without the overhead and complexity inherent in traditional multi-tier development & deployment environments."

"GigaSpaces’ innovative “space-based architecture” combines advanced services such as distributed data caching, distributed messaging bus, parallel processing, and grid-enablement with open standards and cross-language support to enable a “write once, scale anywhere” approach to the development and deployment of distributed applications."


The next step on our side will be to create a set of joint solutions with Java CAPS and GigaSpaces; the possible fields of application are countless. I will blog here with news about this experiment.

New partnership with The Fluxology Office

This month I have started a partnership with a small but very smart consulting firm called The Fluxology Office. It was founded by my friend Paul Peters, an experienced Dutch IT consultant and entrepreneur who worked with me at SeeBeyond for a while.
We are working around Europe to help customers get the best out of Sun Java CAPS and other new Sun technologies, offering both consulting and training. We are also expanding our technology offer by selecting new products we want to use in our projects.

Friday, November 16, 2007

Java CAPS review from The SOA Lab

Note: the previous review was written in 2007, so it is no longer up to date and I decided to remove the link because it could be misleading. The Java CAPS product line has evolved a lot since version 5: the new version 6 introduces a whole set of new possibilities, as it merges the previous JCAPS 5.1 Repository components with the new, open-source JBI architecture and components based on the OpenESB, NetBeans and GlassFish communities.

A recent review of the latest JBI-based products (Glassfish ESB) can be read here.

If you want to discuss Sun's new SOA / EAI product line further, you can send me an email.

Monday, November 5, 2007

SOA without Technology

In this beautiful article "A Low-Tech Approach to Understanding SOA" Dan North describes a Service-Oriented Architecture in terms of a 1950s corporation, where no computers were available and people worked with just pens, paper and folders.

The idea is to focus attention on the business description level, without mentioning any technology. I think this is a very effective metaphor, which helps shape the What, Who and Why of a business services architecture without bothering with implementation details.

Forgetting for a while about the technological infrastructure can really improve the quality of the description and lead to a cleaner picture of what is really necessary. SOA is first about People and Process, then Platform (read this "SOA Methodology" presentation for an example).

There is nothing new under the sun, except that some weird information about SOA has delivered wrong concepts to end customers, and it's time for some clarity.

HermesJMS with JCAPS

Here you can find how to set up HermesJMS with JCAPS.

Saturday, August 25, 2007

java.lang.OutOfMemoryError: unable to create new native thread

Symptoms


Some days ago a flow running in JCAPS 5.1 produced this exception:
java.lang.OutOfMemoryError: unable to create new native thread

The first attempt, especially if you are used to ICAN 5.0, would be to add memory to the JVM's heap with the -Xmx flag. But forget for a while about the misleading "OutOfMemoryError" and focus on the rest of the message: it clearly says that the JVM asked the O.S. to create a native thread, and that was not possible. It does not mean you don't have enough heap. In fact, the mentioned flow was already running in a Logicalhost with 1024 MB of memory and there was no sign that this was not enough.
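
To see the failure mode in isolation, here is a minimal sketch (plain Java, nothing JCAPS-specific) that keeps starting idle threads until the JVM can no longer allocate a native stack; on most platforms it dies with exactly this error while the heap is still almost empty:

// Minimal sketch (plain Java, not JCAPS-specific): keeps starting sleeping
// threads until the JVM cannot allocate another native stack.
public class ThreadExhaustion
{
    public static void main( String[] args )
    {
        int count = 0;
        try {
            while ( true ) {
                Thread t = new Thread( new Runnable() {
                    public void run()
                    {
                        try {
                            Thread.sleep( Long.MAX_VALUE );
                        } catch ( InterruptedException e ) {
                            // ignore, the thread just has to stay alive
                        }
                    }
                } );
                t.start();
                count++;
            }
        } catch ( OutOfMemoryError e ) {
            // Typically "unable to create new native thread": the heap is
            // fine, the O.S. simply refused to allocate one more stack.
            System.out.println( "Threads created before failure: " + count );
        }
    }
}

Running the same class with a smaller -Xss lets the counter grow considerably before the error shows up, which is exactly the knob discussed below.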

Diagnosis


Depending on your operating system and JVM version, you can have a pretty different per-thread stack size, which affects both the maximum number of native threads you can start and the overall consumed memory. See the Java HotSpot VM Options:

Thread Stack Size (in Kbytes). (0 means use default stack size)
- Sparc: 512
- Solaris x86: 320 (was 256 prior in 5.0 and earlier)
- Sparc 64 bit: 1024
- Linux amd64: 1024 (was 0 in 5.0 and earlier)
- all others: 0

When your application is trying to start too many threads you might need to decrease the default stack assigned to each thread using the -Xss parameter, so that each single thread gets less stack but you can create more of them. For some operating systems this is not enough: you should also decrease the O.S. stack size using the "ulimit -s" command.


Currently, some stack sizes are:

               ThreadSS   VMThreadSS   CompilerThreadSS     default_stack_size
SPARC 32       512K       0            C2:2048K C1:0        not used
SPARC 64       1024K      0            C2:2048K C1:0        not used
Solaris i486   256K       0            C2:2048K C1:0        not used
Linux i486     0          0            0                    512K
Linux ia64     0          0            0                    1024K
Win32 i486     0          0            0 (ASSERTs:1024K)    0
Win32 ia64     0          0            0 (ASSERTs:1024K)    0

Notes:
1) 0 for VMThreadSS and CompilerThreadSS implies use of the ThreadSS value
2) 0 for ThreadSS implies use of default_stack_size

Generally speaking you should start testing your flows with a small JVM heap; the default JCAPS 512 MB is normally enough. Then you should increase this value step by step, only if your processes allocate big data structures requiring more heap. A good step size could be 256 MB. It is never a brilliant idea to set your domain's heap to, say, 1.5 GB by default only because this seems to let you sleep well. In addition to the mentioned thread problem, a bigger heap will lead to more complex and longer garbage collection cycles, penalizing performance in the medium term. It is also a good idea to set the same value for -Xms and -Xmx to help the GC.
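
As a quick sanity check (again plain Java, nothing JCAPS-specific), you can verify which maximum heap a JVM actually picked up after editing the -Xms/-Xmx settings:

// Prints the maximum heap the JVM is willing to use; it should roughly
// match the -Xmx value passed on the command line.
public class HeapCheck
{
    public static void main( String[] args )
    {
        long maxMb = Runtime.getRuntime().maxMemory() / ( 1024 * 1024 );
        System.out.println( "Max heap (MB): " + maxMb );
    }
}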

Tuesday, August 14, 2007

Java CAPS: Processing Large XML Payloads Using a SAX Parser

Introduction


In Java CAPS the standard way to deal with XML files is to parse them through Object Type Definitions (OTD). An OTD represents an XML file as a Java object; it provides marshal and unmarshal methods, plus setters and getters for each element of the XML document.
The OTD is a smart way to create a DOM tree in memory, starting from the XML source document. However, if the XML file is large, loading it entirely into memory through a DOM representation is generally not a great idea. In this case it is common to use a SAX parser, which allows you to process the XML file as a stream instead of loading the entire document into memory. SAX parsing is of course easily implementable in Java CAPS, as this article will briefly show.

Implementation


The implementation is straightforward: it is just plain Java code. In this example the eGate flow is triggered by an event in the form of a JMS message containing the filename. As the XML file we'd like to process is presumably large (otherwise why bother with SAX...) it probably resides on some filesystem, so in this case a BatchLocalFile (part of the optional Batch eWay) can be used to read it. You are not doing such a stupid thing as sending multi-megabyte payloads through your JMS server, are you? As a general rule of thumb, it is wise to keep your JMS payloads below 1 MB, to avoid overloading your JMS server. As already explained in other posts, I think moving bigger payloads through JMS is a clear indicator of some flaw in your process design and, sooner or later, it will lead to trouble.

Connectivity Map


Below is the simple CM for this example:

The queIn channel receives triggering events for the svcSaxParser service, which makes use of a BatchLocalFile external application to read the file from disk. The JCD, as described below, is really trivial and logs some elements using the standard logger.

Java Collaboration Definition


The SAX parsing service is implemented through a JCD called jcdSaxParser. It receives the input JMS message containing the filename, opens an InputStream from disk and assigns it to the SAX parser. A SAX DefaultHandler inner class, called (with some lack of fantasy...) MyHandler, is defined and used to intercept SAX events:

package SamplesprjSAXJCD;

import java.io.InputStream;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;

public class jcdSaxParser
{
public com.stc.codegen.logger.Logger logger;
public com.stc.codegen.alerter.Alerter alerter;
public com.stc.codegen.util.CollaborationContext collabContext;
public com.stc.codegen.util.TypeConverter typeConverter;

public void receive( com.stc.connectors.jms.Message input, com.stc.eways.batchext.BatchLocal BatchLocalFile_1 )
throws Throwable
{
try {
BatchLocalFile_1.getConfiguration().setTargetDirectoryName( "D:\\Projects" );
BatchLocalFile_1.getConfiguration().setTargetFileName( input.getTextMessage() );
InputStream istream = BatchLocalFile_1.getClient().getInputStreamAdapter().requestInputStream();
// Create a handler to handle SAX events
DefaultHandler handler = new MyHandler( logger );
// Parse the stream
parseXmlStream( istream, handler, false );
BatchLocalFile_1.getClient().getInputStreamAdapter().releaseInputStream( true );
} catch ( Exception ex ) {
logger.error( "@@@ ", ex );
}
}

// Parses an XML stream using a SAX parser.
public static void parseXmlStream( InputStream istream, DefaultHandler handler, boolean validating )
throws Exception
{
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setValidating( validating );
factory.newSAXParser().parse( istream, handler );
}

// DefaultHandler contains no-op implementations for all SAX events.
// This class should override methods to capture the events of interest.
static class MyHandler extends DefaultHandler
{
private final com.stc.codegen.logger.Logger _logger;
private final StringBuffer _buff = new StringBuffer( 1024 );

public MyHandler( com.stc.codegen.logger.Logger logger )
{
_logger = logger;
}

public void startElement( String uri, String localName, String qName, Attributes attributes )
throws SAXException
{
_buff.append( "startElement: uri=" ).append( uri ).append( ", localName=" ).append( localName ).append( ", qName=" ).append( qName ).append( "\n" );
}

public void characters( char[] cbuf, int start, int len )
throws SAXException
{
_buff.append( "Characters: " ).append( new String( cbuf, start, len ) );
}

public void endElement( String uri, String localName, String qName )
throws SAXException
{
if (_buff.length() > 0) {
_logger.info( "@@@ " + _buff.toString() );
_buff.delete( 0, _buff.length() );
}
}
}
}

After creating a proper Deployment Profile you can run this flow by sending a JMS message containing the filename to the queue queIn (you can use the eManager for that). Then you just need to add some more useful functionality to the MyHandler class.
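
If you prefer to script the trigger instead of using the eManager, a minimal sketch with the plain JMS API follows; the JNDI names and the filename are hypothetical and depend on how your JMS server is exposed:

import javax.jms.Queue;
import javax.jms.QueueConnection;
import javax.jms.QueueConnectionFactory;
import javax.jms.QueueSender;
import javax.jms.QueueSession;
import javax.jms.Session;
import javax.naming.InitialContext;

// Minimal sketch (plain JMS, not JCAPS-specific): sends the filename to the
// queIn queue to trigger the flow. A jndi.properties pointing at your JMS
// server is assumed; the lookup names below are hypothetical.
public class SendTrigger
{
    public static void main( String[] args ) throws Exception
    {
        InitialContext ctx = new InitialContext();
        QueueConnectionFactory factory = (QueueConnectionFactory) ctx.lookup( "QueueConnectionFactory" );
        Queue queue = (Queue) ctx.lookup( "queIn" );
        QueueConnection conn = factory.createQueueConnection();
        QueueSession session = conn.createQueueSession( false, Session.AUTO_ACKNOWLEDGE );
        QueueSender sender = session.createSender( queue );
        sender.send( session.createTextMessage( "large-payload.xml" ) );
        conn.close();
    }
}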

The source stream was obtained from the InputStreamAdapter of the BatchLocalFile:
InputStream istream = BatchLocalFile_1.getClient().getInputStreamAdapter().requestInputStream();
Then the parsing is done by passing both the InputStream and the handler to the SAXParser's parse method:
factory.newSAXParser().parse( istream, handler );

Conclusions


If you are struggling with XML files as big as 100 MB and, using OTDs, you get plenty of OutOfMemory errors, you could try to implement a SAX parsing process as described in this article. Before implementing this technique, ask yourself why the hell you are producing such big XML files, and then try to fix your data model or your process, because to me you are using XML the wrong way.
A typical case where dealing with large XML files could be unavoidable is HL7 v3.0 XML messages: the specs define huge XML Schemas for that standard, and it could even be impossible to generate an OTD with the eDesigner.

Friday, August 10, 2007

The CAP Theorem

In this InfoQ video presentation Amazon's CTO Dr Werner Vogels discusses availability and consistency for distributed systems. The central item is the "CAP theorem", which Dr Vogels describes starting from this question:

What goals might you want from a shared-data system?

- Strong Consistency: all clients see the same view, even in presence of updates
- High Availability: all clients can find some replica of the data, even in the presence of failures
- Partition-tolerance: the system properties hold even when the system is partitioned

The theorem states that you can have at most two of the three CAP properties at the same time. The first property, Consistency, has to do with ACID systems, usually implemented through the two-phase commit protocol (XA transactions).

In his presentation Dr Vogels explains why big shops like Amazon and Google, as they handle an incredibly huge number of transactions and data, always need some kind of system partitioning. Amazon must also provide high availability: for example, a customer must always have access to the shopping cart, because it obviously means the customer is committing to buy something. As for Amazon the second and third CAP properties (Availability and Partition-tolerance) are fixed, they need to sacrifice Consistency. This means they prefer to compensate for or reconcile inconsistencies instead of sacrificing high availability, because their primary need is to scale well to allow for a smooth user experience.

This IMHO leads to some easy conclusions: most legacy application servers and relational database systems are built with consistency as their primary target, while big shops really need high availability. That's why firms like Google or Amazon have developed their own application infrastructure. That's why, as Dr Vogels' presentation explains well, a two-phase commit protocol is never an appropriate choice when big scalability is needed. On this subject you can also read this article from Gregor Hohpe: Your Coffee Shop Does Not Use Two-Phase Commit

To scale up, what you really need are asynchronous, stateless services, together with a good reconciliation and compensation mechanism in case of errors. Second, your data model has a dramatic impact on performance: that's why Amazon has implemented a simple put/get API instead of running complex database queries, and why Google's performance owes much to the MapReduce algorithm. Simplicity rules.
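
To make the point concrete, the contract such systems optimize for looks much closer to this hypothetical sketch than to SQL (this is of course not Amazon's actual API):

// Hypothetical minimal key-value contract: no joins, no ad-hoc queries,
// every operation is cheap, predictable and easy to partition.
public interface SimpleStore
{
    byte[] get( String key );
    void put( String key, byte[] value );
}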

Friday, July 27, 2007

EAI AntiPatterns: Using a JMS Server Like a Database

Do you know a very simple rule of thumb to verify the health of your integration flows? Well, everything could be considered reasonably fine if the queues in your ESB are on average empty or quite close to empty. I have heard many times in different projects complaints like "this JMS server doesn't work well, I have about 10,000 messages in a queue and everything looks so slow...". 10,000 messages parked in a queue? You have a problem here, and it is not that the JMS server is bad at dealing with them: simply, your flows are unbalanced and your overall design is broken! Basically your consumers are much slower than your producers, so messages quickly accumulate in the system. But a JMS server is not a database; it is definitely not for long-term storage. You can't easily query messages in your queues as you could do in a database; you can't easily delete or edit them. Additionally, a good JMS server like the SeeBeyond IQ Manager, which is the default JMS implementation in JCAPS, activates message server throttling by default. It means that if persistent messages in the server go beyond a certain threshold, the single producer or even the entire JMS server is stopped until a proper consumer lag is reached. For the SeeBeyond JMS these default values are 1,000 messages per single queue, after which message producers are frozen, and 100,000 messages for the whole server, after which all the producers connected to that particular JMS server are stopped until a certain amount of messages is properly consumed. This is a safety net to avoid producers flooding the JMS server.

Probably the guy above complaining about the 10,000 messages knew about this throttling feature and decided to simply increase the default threshold. This is definitely a bad idea: he needs to fix the balance of his flows and really understand what is happening in his system instead of looking for easy workarounds. The default limit of 1,000 messages per queue is there for a good reason, and it is that a JMS server is not a storage device! It is a quick asynchronous delivery mechanism instead, where messages must stay in a queue for the minimum possible time. The message persistence is there for a completely different reason: it is a way to avoid message loss in case of a temporary hardware or software failure of the messaging system, that's it (oh well, you need a highly available filesystem for that, otherwise the filesystem becomes your new single point of failure...). If too many messages usually stay in the system for a long time, you'll notice a proliferation of .dbs files under your stcms folder. Briefly, these files are where the IQ Manager stores persistent messages; when there are too many of them the whole system becomes inefficient because of file segmentation and reallocation (yes, there is a kind of garbage collection in action, and you want to avoid too much of that).

Then you can say: "but in my flows consumers are slower than producers by nature". This can easily be true because, for example, consumers are performing slower I/O operations with external systems (quite the norm in EAI). So, even if with JCAPS you can easily deploy a set of distributed consumers, this would solve the problem only if the processing is CPU-bound, not if it is I/O-bound, as scaling horizontally does not help very much when external systems are inherently slow. So what? Well, your flows look to me not to be near real-time, requiring asynchronous delivery, but nearly batch instead. You need to take control of this and design a proper solution instead of complaining about the technology you are using (you might say that some other JMS servers can be configured to persist messages into a regular relational database. The bad news is that this does not solve your design issue at all: you are using a messaging solution the wrong way, regardless of the underlying persistence mechanism of your particular JMS vendor).

The solution? You should apply "store and forward". Store your incoming messages into a regular database table, then forward them into the destination queues using a more controlled process, moving batches of messages with a scheduled procedure, keeping your consumers busy but without flooding the queues. You then might need to implement a reconciliation service. This depends on the context, but you probably would like to know how many messages entered and exited your system, and you would probably like the possibility to re-submit or delete messages. A database table is a good fit for this: you can decide to store some additional information in the table's fields and then run queries on it. Your reconciliation service will expose counters so that you know how many messages traveled through your EAI flow, for each single processor (a JCD in the JCAPS jargon). You can then decide to delete messages from the database when they are picked up by the first consumer, or you might prefer to do so only at the very end, when your whole process has completed successfully. This second solution could allow you to remove the persistent flag from all the intermediate queues in the processing pipeline, to further speed up the JMS server, but this is a design consideration quite dependent on the application context. For example, in some scenarios it could be simpler and cheaper to repeat the whole process from the beginning in case of failure, instead of maintaining intermediate state, but in other scenarios it could be too expensive to repeat, so it becomes mandatory to store intermediate processing results within the message itself, which must then be stored in a persistent queue.
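
Just to fix the idea, here is a minimal sketch of the forwarding half, written with plain JDBC and JMS rather than the JCAPS eWay wrappers; the STAGED_MSG table, its columns and the status values are all hypothetical:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

// Scheduled "forward" step of store and forward: moves at most batchSize
// staged messages from a database table to the destination queue.
public class StoreAndForward
{
    public static void forwardBatch( Connection db, Session jms, MessageProducer producer, int batchSize )
        throws Exception
    {
        PreparedStatement select = db.prepareStatement(
            "SELECT id, payload FROM STAGED_MSG WHERE status = 'NEW' ORDER BY id" );
        select.setMaxRows( batchSize );
        PreparedStatement update = db.prepareStatement(
            "UPDATE STAGED_MSG SET status = 'SENT' WHERE id = ?" );
        ResultSet rs = select.executeQuery();
        while ( rs.next() ) {
            TextMessage msg = jms.createTextMessage( rs.getString( "payload" ) );
            producer.send( msg );
            // Mark the row instead of deleting it, so a reconciliation
            // service can still count, re-submit or purge messages later.
            update.setLong( 1, rs.getLong( "id" ) );
            update.executeUpdate();
        }
        rs.close();
        select.close();
        update.close();
        db.commit(); // assumes auto-commit is off; XA is deliberately avoided
    }
}

A scheduler (or a simple timer service) calls forwardBatch at a fixed rate, so the queue depth stays bounded no matter how fast messages arrive.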

Saturday, January 6, 2007

Books: Enterprise SOA Adoption Strategies

Steve Jones, currently a CTO at Capgemini for SOA strategies, wrote this very interesting little book, which fills a gap in the IT literature. The term SOA, which stands for Service Oriented Architecture, is a bit of an overused buzzword these days. SOA has much more to do with the business side of an enterprise than with the technological one, while most books focus their attention too much on technical aspects. The target of a SOA strategy, as Steve's book explains very well, is to "deliver IT to the business". This should sound very obvious, shouldn't it? In my experience, many technology departments have instead been transformed into playgrounds for software programmers and mediocre IT managers. A SOA strategy must reinstate IT in its necessary form within the enterprise, which is to be business-driven.

SOA must be business-driven and not technology-driven

Steve's book explains very well an analysis approach to model a SOA starting from the main business functions, down to a decomposition into technical services. The key to a Service Oriented Architecture? As he wrote: "It's the Service, Stupid!". So the second important concept you will understand reading his book is another very often missed point:

SOA must be service-driven and not process-driven



Enterprise SOA Adoption Strategies
by Steve Jones
Read it on InfoQ