Monday, January 10, 2005

What does it take to implement services? (as in SOA)

I have been thinking about what exactly has changed in the way we develop applications between 2000 and today, especially in relation to all this noise about new Web services standards. WS-Crap and WS-GoodForNone are being released almost every day.

Well, in certain ways this is necessary for our own understanding (as IT providers) to evolve and mature. When the dust finally settles we will have the minimal set of standards that we actually need to standardize on and understand.

Specifically, I have suddenly heard Java component developers using the word 'services' more frequently, and less appropriately, for their component contracts. A client who uses your component can certainly treat the component as a service provider, but that is definitely not what SOA is about.

In fact, SOA is about services that share some common properties, which we will see shortly, and that do not specify the language or component model actually providing the service. You can even use simple and effective scripting tools like PHP and Python to implement SOA. I need to emphasize that you do not even need object-oriented languages.

What, then, qualifies to be called a service? In my understanding, three constituents are vital.

1. A Service Contract that clearly specifies what data is expected to invoke the service and what you can expect in the response. This is the XML Document Schema in Web services.

2. A Network Endpoint, or Port and Binding in WS terminology, that specifies the protocol and network address to send an XML document (an instance of the contract) to. Remember, it does not have to be HTTP - in fact you can still use SNA LU 6.2 :-)

3. A Service Processor or software that actually handles the service request and provides the response.

It is important to recognize the intent and role of these three rather than their names, as they always appear well disguised in internet literature.
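To make the three roles concrete, here is a minimal Java sketch. Every name in it (ServiceContract, Endpoint, ServiceProcessor, Service) is my own invention for illustration, and the contract is reduced to a simple yes/no check on the request document. A real Web service would validate against an XML Schema instead, and the endpoint would be described by the WS port and binding.

```java
/** 1. Contract (hypothetical): decides whether a request document is acceptable. */
interface ServiceContract {
    boolean accepts(String requestDocument);
}

/** 2. Endpoint (hypothetical): the protocol and network address to send documents to. */
class Endpoint {
    final String protocol;
    final String address;

    Endpoint(String protocol, String address) {
        this.protocol = protocol;
        this.address = address;
    }
}

/** 3. Processor (hypothetical): the software that turns a request into a response. */
interface ServiceProcessor {
    String process(String requestDocument);
}

/** A service is the combination of the three constituents. */
class Service {
    final ServiceContract contract;
    final Endpoint endpoint;
    final ServiceProcessor processor;

    Service(ServiceContract contract, Endpoint endpoint, ServiceProcessor processor) {
        this.contract = contract;
        this.endpoint = endpoint;
        this.processor = processor;
    }

    /** Rejects documents that violate the contract, otherwise hands them to the processor. */
    String invoke(String requestDocument) {
        if (!contract.accepts(requestDocument)) {
            throw new IllegalArgumentException("request violates the service contract");
        }
        return processor.process(requestDocument);
    }
}
```

Nothing in the sketch depends on what language the processor is written in or on the protocol named in the endpoint, which is the point of separating the roles.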

Now for the major thought within the implementation space (Java or .NET): what really changes between one service and another within an SOA application is mainly the Service Processor, which in the Java world is mostly either an EJB or an MDB. Yet during application development, are we spending too much time and effort writing code that parses the XML document and transforms it into the object/component interfaces within the processor? My premise is that if we can avoid writing this code repeatedly for each service, then we can build services faster and with better reliability.
Why reliability? Simple: less code, more reliable.

Before I post my ideas on how to achieve this, I'd like to be sure that I'm seeing the right problem.
Hope to hear from you.

Bye


Monday, November 22, 2004

The engineering of software

I am a 'software engineer' not in my designation but as a term that describes my work. At least that is what I strive to be.

Those two words 'software engineer' actually imply a person who can instill an engineering discipline into creating software. Developing software is itself a demanding task, and it is common to just leave the engineering part of it as a name tag.

Why is this so, and is it important that we at least see the issues clearly? G. K. Chesterton once remarked
'It is not that they cannot see a solution - It is that they cannot see the problem.'

One of the problems we are still trying to understand and solve is creating a mathematical representation of good accuracy that models the complex structure and behavior of software instructions. These include the type systems, concurrency models, knowledge models and reasoning.

Can you write software without all this? Well, you can. The difference is that you can write software, but you can never accurately derive and prove program characteristics until the program is written.

It does not matter how long I have been writing software without engineering. It is important at some point to realize that estimation, understanding and verification of software are impossible without further investigation into analytics or simulation. Empirical observations and knowledge of historical projects are only a small first step. Further, safer and better use of software will result when we are able to model program characteristics in a formal representation and then verify whether actual programs conform to those characteristics.

The area we need to investigate is a language for program representation that has high fidelity to programming languages but is more formal and concise. It should not be bound to the usual structures found within implementations of programming languages, such as type rules.

This representation can then be rendered either in human-readable algorithmic forms or as visual diagrams (UML). The final breakthrough is creating a computational task that can 'understand' a given concrete program and create a representation. This representation will be complete not only in static structure and interaction (UML achieves this) but also in semantics and flow.

This idea is quite old but is still valid and worth pursuing.
Can you see? - that is the question.







Sunday, October 24, 2004

Gap analysis for a Software Product Team

‘Deceived’ - is the state of sincerely believing something which is false and baseless.

The more I engage in teams, the more I have to fight against deception: a deception that we are good and high tech, when managers in software rarely know or understand what to measure, and how, in order to gauge the effectiveness of the team.

Software is not only meant to be executed, it is meant to be read, to be understood by others, to be maintained, to be deployed - Think about that.

I will attempt to list some basics without which some teams alarmingly keep on churning software - software that runs (and nothing else) but is worth garbage measured against what it should be.

1. A Clear Java coding guideline.
How long will a software team survive without a standard set of comment tags for classes and methods?
At the minimum these should have
@requires
@modifies
@effects
for all methods, which forces developers to think in terms of pre and post conditions for method behavior and the effects produced.
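As a minimal sketch of what such comments might look like: the Account class below is hypothetical and exists only to carry the tags. Note that @requires, @modifies and @effects are not standard Javadoc tags, so the javadoc tool has to be told about them (for example through its -tag option) if you want them rendered in the generated documentation.

```java
/** Hypothetical account class, used only to illustrate the comment tags. */
class Account {
    private long balanceInCents;

    /**
     * Adds money to this account.
     *
     * @requires amountInCents > 0
     * @modifies this.balanceInCents
     * @effects increases the balance by amountInCents
     */
    void deposit(long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("precondition violated: " + amountInCents);
        }
        balanceInCents += amountInCents;
    }

    /**
     * Takes money out of this account.
     *
     * @requires amountInCents > 0 and amountInCents <= balance()
     * @modifies this.balanceInCents
     * @effects reduces the balance by amountInCents
     */
    void withdraw(long amountInCents) {
        if (amountInCents <= 0 || amountInCents > balanceInCents) {
            throw new IllegalArgumentException("precondition violated: " + amountInCents);
        }
        balanceInCents -= amountInCents;
    }

    /**
     * @requires nothing
     * @modifies nothing
     * @effects returns the current balance in cents
     */
    long balance() {
        return balanceInCents;
    }
}
```

Checking the precondition in code, as done here, is my own choice; the tags are useful even when the method simply trusts its caller.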

It drastically improves the morale of newcomers (with or without experience) when they look at the code base, since they can now understand the code.

The practice of throwing an already ugly code base at new developers without mentoring is ineffective. Moreover, they end up as faithful followers of the seniors in writing code without comments and tests. It only produces more 'blind men'.

2. A Simple to use but enforced Automated Testing practice at developer level

I mean 'automated tests' the way automated testing should be.

A group of developers has to be handed (obviously by management) the task of creating a customized test suite based on JUnit that can send and receive application requests and responses.

Feature tests based on this suite then have to be checked in to the source repository with every CR/check-in. Let managers override this, and accept responsibility for it, if they feel that a check-in cannot be independently tested.

3. A dedicated build machine separately in all locations and an automated build cycle.

Again, please resist the 'this is that' syndrome of trying to equate whatever you are doing now with what is actually being proposed. Builds on developer machines are fine only for that developer and with limited scope.

Build machines look at the entire product. They should download the entire source from the source repository and do a full compile, build and package. Then a complete fresh deployment is done, including all EJBs, and a basic set of automated tests is run to determine the state of the code. Errors are reported by email or whatever is required. XDoclet is one effective solution to build, package and deploy EJBs on any target app server, including WebLogic, JBoss and WebSphere.

Any build solution should maintain build config information in one and only one place (DRY = Don't Repeat Yourself).

4. Finally, get all employees on a company-wide Instant Messenger that helps locate people by knowledge and expertise, so that anyone knows whom to reach with questions related to 'specific features'.

So there are solutions in all these areas but none will succeed without management drive and planning.

This analysis is free but might just be the same you get from a paid consultant. The content speaks for itself.

Wish you all the best.


Monday, October 11, 2004

Compilers with hooks

I have been thinking about type safety and the excellent job that compilers do. Without letting you assume things, let me state it clearly: one job compilers do well is helping us write 'correct' software by verifying type usage. This means that if you define a type and later use that type, the compiler verifies whether all usage conforms to the definition.

What is so great about this - everyone knows this? I think that tells me something about my job as a mature programmer. I should write code that checks code - help other programmers - by simply shouting every time a rule or practice has been violated.

I am thinking along the lines of being able to write Java code that can then be plugged into the compilation process in order to verify applications within context. Each product team or company can then apply its own rules and practices, which simply shout on violation. The programmers are then supposed to just go ahead and correct the code to get quietness.

I am interested in dynamic typing, and one of the issues in dynamic typing is that the functional type of an instance is not its static type but is separate and distinct. So each object carries its own functional type information. All this is pretty well known. I would like to envision a mechanism for verifying usage of objects even when their type is attached at instance level (runtime).

To simplify the problem: in static typing, all usage of objects of class C has to be verified against the definition in C. In dynamic typing, objects have a static type of D (the dynamic type definition), and their functional type has to be attached to each of these objects in order to verify their usage. If this can be done, we catch a lot of the problems associated with using dynamic types, and dynamic typing is useful in certain application problems. Application configuration and XML parsing are two examples.
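One possible reading of this, sketched in Java under my own assumptions: the class names TypeDef and DynamicObject are hypothetical, and the verification here happens at runtime on every access rather than inside the compiler. DynamicObject plays the role of the static type D, and the TypeDef attached to each instance is its functional type.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical functional type: a name plus the fields it allows and their classes. */
class TypeDef {
    final String name;
    private final Map<String, Class<?>> fields = new HashMap<>();

    TypeDef(String name) {
        this.name = name;
    }

    /** Declares a field; returns this so declarations can be chained. */
    TypeDef field(String fieldName, Class<?> fieldType) {
        fields.put(fieldName, fieldType);
        return this;
    }

    /** Shouts (throws) when the field is unknown or the value has the wrong class. */
    void verify(String fieldName, Object value) {
        Class<?> expected = fields.get(fieldName);
        if (expected == null) {
            throw new IllegalArgumentException(name + " has no field '" + fieldName + "'");
        }
        if (value != null && !expected.isInstance(value)) {
            throw new IllegalArgumentException(
                name + "." + fieldName + " expects " + expected.getSimpleName());
        }
    }
}

/** The static type D: every dynamic object is one of these, with a functional type attached. */
class DynamicObject {
    private final TypeDef functionalType;
    private final Map<String, Object> values = new HashMap<>();

    DynamicObject(TypeDef functionalType) {
        this.functionalType = functionalType;
    }

    TypeDef functionalType() {
        return functionalType;
    }

    /** Stores a value only after the functional type has verified the usage. */
    void set(String fieldName, Object value) {
        functionalType.verify(fieldName, value);
        values.put(fieldName, value);
    }

    /** Reads a value; asking for a field the functional type does not define also shouts. */
    Object get(String fieldName) {
        functionalType.verify(fieldName, null);
        return values.get(fieldName);
    }
}
```

This only moves the shouting from compile time to the first bad access; plugging the same TypeDef rules into the compilation process is the harder part and is not attempted here.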




Sunday, August 08, 2004

Basics of Risk Management for Java Developers

In order to deliver value under the name Risk Management, we should be able to analyze, measure and report the factors that affect business rewards (or profit) and the extent of the losses expected from those factors.

I'm no expert in this area, but I have to understand it in order to deliver value to the customers of our software products. So here I go, trying to make sense of it.

Let me give you a simple example of risk in profits. You own a bank. (Yes, the current trend of paying geeks more for doing less just continues.) The bank's management projected nice numbers for its profits because they have just given out 30% of the bank's capital in loans at 12% interest, to be recovered in 5 years. The earnings on paper look very impressive. What the business managers or even the CEO might just not tell you is
1. What are the factors that can cause those paper profits to completely disappear?
2. How will those factors affect the expected profit and capital?

In the current example the factors could be a downturn in the oil industry when most of the loan customers have their business in that segment, a change in foreign currency rates where the local currency suddenly fares very badly against the dollar, or even some unforeseen event like a terrorist attack that almost wipes out the profits of your customers. (Remember the airline industry after Sept 11, 2001.)

So the first important step in risk management is identifying the risk factors that affect your expected rewards.

The next step is arriving at a set of values for various combinations of those risk factors against time-steps. This leads to a related set of factors, with numbers for their effects, that are linked to the occurrence of other factors. For example, a hike in oil prices (one factor) causes the occurrence of another factor, a change in FX rates against USD.
This related set of risk factors with values over a time line is what I call a scenario. How do you arrive at those values? This is where you need accurate historical data and some analytic code.
The analytic code analyses events of the past and makes reasonable predictions of events in the future. Each event is the effect of a change in one (or more) of the risk factors identified earlier.
Even without any analytic code, there are clever business minds who can look at historical data for a past time period and give a good business scenario projection for a certain time period in the future.

So we have risk factors, scenarios and finally one more: instruments, trades and portfolios.
The difference and relation between instruments, trades and a portfolio is simple to capture for developers.
Consider an instrument to be a class definition. Trades are instances of instruments, so an instrument is the type of a trade. A collection of trades is a portfolio, whatever the grouping condition for putting those trades together. The Type Object pattern, in design pattern language, is very useful to capture this relationship.
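A minimal Java sketch of that relationship, assuming deliberately tiny classes: the fields (name, quantity, price) and the totalValue() method are my own placeholders, not a claim about what a real trade or portfolio carries.

```java
import java.util.ArrayList;
import java.util.List;

/** The type object: describes a kind of tradable instrument. */
class Instrument {
    final String name;

    Instrument(String name) {
        this.name = name;
    }
}

/** An "instance" of an instrument: each trade points at its type object. */
class Trade {
    final Instrument instrument;
    final double quantity;
    final double price;

    Trade(Instrument instrument, double quantity, double price) {
        this.instrument = instrument;
        this.quantity = quantity;
        this.price = price;
    }

    double value() {
        return quantity * price;
    }
}

/** A collection of trades, whatever the grouping condition was. */
class Portfolio {
    private final List<Trade> trades = new ArrayList<>();

    void add(Trade trade) {
        trades.add(trade);
    }

    int size() {
        return trades.size();
    }

    /** Sum of the values of all trades in this portfolio. */
    double totalValue() {
        double total = 0;
        for (Trade t : trades) {
            total += t.value();
        }
        return total;
    }
}
```

The design choice in the Type Object pattern is that Instrument is an ordinary object rather than a Java class, so new instruments can be added as data without compiling new code.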

Now let us dig deeper. If you look at how your company assesses and reports risk, it is mostly on the basis of the numbers in the portfolio, and every time a trade changes, your risk position needs to be adjusted. This is most often a time-intensive and/or computationally intensive operation. The measure of risk is always dependent on the portfolio. We do not touch that.
But the extent of the dependency can be reduced by shifting the intensive analytics to be performed against a bag of instruments - the type definition of a portfolio.

This is my logical conclusion after one hour of risk study. I need to pursue it further because as I stated earlier I see value for our customers.

If a trade is an instance of an instrument, then a portfolio is an instance of an instrument group. Now I assume that there is some commonality in risk profiles across two portfolios with the same instrument group. That commonality needs to be detailed and pinpointed to some risk multipliers. If you have this risk profile for a collection of instruments, then you can apply it to any number of portfolios much faster than before.

What remains is to abstract Instruments and SetOfInstruments well. Risk factors and profiling should then be applied to SetOfInstruments and the abstraction tested.

I need to get back to work now.

Bye







Thursday, August 05, 2004

Static and Dynamic Data Typing in Java

The concept of types is crucial as you progress as a developer. I was actually writing Java code about 7 years ago without fully grasping this concept.

It involves understanding that a class is a type by itself (the implementation type) and can also comply with other types. You then begin not just to appreciate but to ensure a clear separation between specification and implementation, between interface and class. Even in simple method signatures you begin to be interested in, and careful about, method specification, pre-conditions, effects and dependencies.

I want to focus on something else today that requires an understanding of types in OO:
the concept and use of static and dynamic data types in Java.

Application programs have to remember data. The simple way is to store them in variables.
int i =2; or Customer cust = new Customer("262621");

Every piece of data (including objects) has a type, and the type defines what you can do with that element. So type defines behavior. Types have hierarchies (remember, this need not be a single hierarchy but can be a forest of hierarchies). From this we understand the concept of generic (higher in the hierarchy) and specific (lower in the hierarchy) types.

A static type is a simple Java class that defines fields and methods. I am really interested in how Java types encapsulate data. In static data typing each field is statically declared with a type and a name. The variable (the container for data) has a specific type attached to it. Then you expose this field on the public interface of the class through suitable accessors.

Dynamic typing does not define static members. In dynamic typing the container for data has a generic type that can accommodate any kind of data. The problem this creates is that whenever you store (try to remember) data in such a container, you lose the information about what the actual type was. The type of the data has to be remembered along with the data element, as the type of the container is not the type of the data element (as it is in static typing). More aptly, the container type has no relation to the type hierarchy of the data elements that are stored in it. At compile time you do not know what these names or types are. It immediately implies that in a dynamic data type you do not have type safety. Now let me explain what type safety is and, more importantly, what the benefit of having it is.

Consider a static type as follows

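import java.util.Date;

class Customer
{
    private final String customerId;
    private Date birthDate;
    private String locationId;

    public Customer(String customerId) { this.customerId = customerId; }

    public String getCustomerId() { return customerId; }

    public Date getBirthDate() { return birthDate; }

    public String getLocationId() { return locationId; }

    // remaining fields and methods left out
}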

Now, the application dictates that a Customer is allowed a 5% discount but Agents do not qualify for this discount. If Agent is another static type in the application, then the method calculateDiscount(Customer argcust) is declared with a parameter of type Customer. So only Customer objects can be passed as arguments.

Now suppose the application is dealing primarily with dynamic types, and assume that the Customer and Agent classes do not exist; instead we use a class called DataObject to encapsulate data. That same method will now be defined as calculateDiscount(DataObject argcust).
Both customer objects and agent objects appear to the application developer as DataObjects (the generic container for data). It is now up to the developer to ensure that he does not fill a DataObject with the state of an Agent and pass it into the calculateDiscount() method. If he does, the compiler will not complain, and the result is a runtime application bug.
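The situation can be sketched in a few lines of Java. DataObject here is a bare map wrapper of my own making, and I have added an amount parameter to calculateDiscount so that the method has something to compute; both are assumptions for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Generic container for data: the same class holds customer state or agent state. */
class DataObject {
    private final Map<String, Object> values = new HashMap<>();

    void set(String name, Object value) {
        values.put(name, value);
    }

    Object get(String name) {
        return values.get(name);
    }
}

class DiscountCalculator {
    /**
     * Intended for customers only (5% discount), but the parameter type is
     * DataObject, so the compiler accepts agent data just as happily.
     */
    static double calculateDiscount(DataObject argcust, double amount) {
        return amount * 0.05;
    }
}
```

Calling calculateDiscount with a DataObject that holds only an agentId compiles and returns a discount, which is exactly the bug described above: nothing in the signature distinguishes the two kinds of data.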

The important question that may arise in your mind at this point is why go through all the pain of using dynamic types when you can achieve the same with static typing. True, dynamic typing introduces the problem of a lack of type safety. But it does provide certain benefits, and moreover there are situations where you do not need static types at all. Those situations are characterized by the use of static Java beans with only state and very little behavior. Classes in the domain model clearly do not fall into this category, since domain models are primarily designed around behavior. But there are many layers within an application that are best suited to dynamic types. The use of static types in these situations results in unwanted coupling and also in more development effort for creation and maintenance.

Some of the layers are
a) Edge data parsing - either from the network or from files. Parsing XML and fixed-width records does not require static types.

b) Application Config Information - is fundamentally dynamic in definition.

c) Relational Data Managers - that read data from an RDBMS.

Further, it is far more powerful to use codegen or bytecode manipulation to generate static types from Dynamic type objects based on some type definition (XML Config). This layer then becomes a data mediator service between the world of static and dynamic types.

Wednesday, August 04, 2004

Handling time variant data

Consider a rate or price feed - all bankers and energy traders have to deal with this regularly.

Basically there is a set of values (the price or the rate) for a specific commodity / index / currency from a specific data source.
Now you can have values for a year, which may be further qualified by values for every month, which are again qualified by values for each day in the month, which are again qualified by values for every hour of that day, and so on (it depends on the business domain, the commodity or even the customer). We (developers) should not assume the precision of the time spread.

The problem is

1. How do I abstract this price or rate information in such a way that clients can access aggregate (or summarized) values without being concerned whether detail (more precise) values exist or not?
For example, a price feed provides hourly prices for this month, but only monthly prices for months beyond this month, only quarterly prices for the months beyond 6 months, and yearly prices for anything further.
Assume today is April 01 2004. How do I access the price from the abstraction for
a) April 01 2004 14:00:35
b) April 02 2004 13:00:00
c) May 05 2004
d) October 10 2004
e) April 01 2005

It means that if a specific (detail) price exists, use that; otherwise use the next summarized price.

I am thinking about a tree where each node has aggregate properties based on the properties of its child nodes. Each node then becomes a collection whose individual properties are aggregate values from the child nodes.

A DAY Node has a price property - which is the average (one type of aggregation) of the price of all its HOUR Child Nodes.
Another DAY Node does not have children at all (any specifics) - in that case the price is a fixed (allotted) price and not an aggregate price.
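Here is a minimal Java sketch of such a tree, under a few assumptions of my own: the class name PriceNode is hypothetical, children are keyed by plain integers (month number, day of month, hour), and averaging is the only aggregation. A lookup walks down the path as far as nodes exist and returns the price of the deepest node it reaches, so a missing detail falls back to the next summarized price.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** One node of the time tree: a YEAR, MONTH, DAY or HOUR, depending on its depth. */
class PriceNode {
    private final Map<Integer, PriceNode> children = new LinkedHashMap<>();
    private final double allottedPrice; // only used when the node has no children

    /** A node with a fixed (allotted) price and no detail below it. */
    PriceNode(double allottedPrice) {
        this.allottedPrice = allottedPrice;
    }

    /** A node whose price will be aggregated from the children added later. */
    PriceNode() {
        this(Double.NaN);
    }

    /** Adds a child under the given key; returns this so calls can be chained. */
    PriceNode child(int key, PriceNode node) {
        children.put(key, node);
        return this;
    }

    /** Average of the children's prices, or the allotted price for a leaf. */
    double price() {
        if (children.isEmpty()) {
            return allottedPrice;
        }
        double sum = 0;
        for (PriceNode c : children.values()) {
            sum += c.price();
        }
        return sum / children.size();
    }

    /** Follows the path (e.g. month, day, hour) and stops at the deepest node that exists. */
    double priceAt(int... path) {
        PriceNode node = this;
        for (int key : path) {
            PriceNode next = node.children.get(key);
            if (next == null) {
                break; // no more detail: fall back to the summarized price
            }
            node = next;
        }
        return node.price();
    }
}
```

A client asking for an hour on a day that only has an allotted price gets that day's price, and a client asking for a day in a month that only has a monthly price gets the monthly price, without knowing which level answered. One thing the sketch glosses over: when a node has some children but not the requested one, it answers with the average of the children it does have, which may or may not be the right business rule.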