Blog Archives

Will the Cloud drive SOA adoption?

I participated in a podcast today on EBizQ with Peter Schooff, managing editor.  You can listen to the podcast or simply read the transcript here.

To summarize, I believe that companies with a sound architecture that is loosely coupled from both their data layer and their infrastructure are in a great position to take advantage of cloud computing. Companies that are tightly coupled or “hard-wired” to their databases and servers will struggle. In fact, I would argue that tightly coupled architectures are not good candidates for cloud computing. Instead, companies that are “hard-wired” are better off building only brand new applications in the cloud and should not even attempt to move legacy applications off-premise.

I also predict that 10 years from now the cloud will be commonplace for companies of all sizes. Over the next 2-3 years we will finally start seeing real enterprise success stories, especially in the government sector, where the government is driving numerous large-scale cloud initiatives. Check out the podcast as I answer five important questions about SOA and the Cloud.

Potential Impacts of Amazon’s Virtual Private Cloud

I usually don’t get involved in vendor discussions. I like to think of myself as vendor agnostic (or some say anti-vendor). But I think today’s announcement of Amazon offering a virtual private cloud (VPC) solution is rather significant for a number of reasons.

1. Clarity of the Private Cloud definition

Chris Hoff’s post sums it up the best:

In one fell swoop, AWS has:

  • Legitimized Private Cloud as a reasonable, needed, and prudent step toward Cloud adoption for enterprises,
  • Substantiated the value proposition of Private Cloud as a way of removing a barrier to Cloud entry for enterprises, and
  • Validated the ultimate vision toward hybrid Clouds and Inter-Cloud

Of course this won’t end the mass confusion over cloud definitions and the arguing over semantics, but it clearly differentiates between on-premise and off-premise. In my opinion, part of the definition of cloud computing is the outsourcing of infrastructure, so I don’t consider on-premise attempts at virtualization to be private clouds. Amazon’s Werner Vogels does a good job of outlining the benefits of cloud vs. on-premise:

I often get asked to define “The Cloud,” especially because of the many permutations that different vendors use in trying to make their existing businesses look like a cloud offering. I define the cloud by its benefits, as those are very clear. What are called private clouds have little of these benefits and as such, I don’t think of them as true clouds.

The cloud:

  • Eliminates Cost. The cloud changes capital expense to variable expense and lowers operating costs. The utility-based pricing model of the cloud combined with its on-demand access to resources eliminates the needs for capital investments in IT Infrastructure. And because resources can be released when no longer needed, effective utilization rises dramatically and our customers see a significant reduction in operational costs.
  • Is Elastic. The ready access to vast cloud resources eliminates the need for complex procurement cycles, improving the time-to-market for its users. Many organizations have deployment cycles that are counted in weeks or months, while cloud resources such as Amazon EC2 only take minutes to deploy. The scalability of the cloud no longer forces designers and architects to think in resource-constrained ways and they can now pursue opportunities without having to worry how to grow their infrastructure if their product becomes successful.
  • Removes Undifferentiated “Heavy Lifting.” The cloud lets its users focus on delivering differentiating business value instead of wasting valuable resources on the undifferentiated heavy lifting that makes up most of IT infrastructure. Over time Amazon has invested over $2B in developing technologies that could deliver security, reliability and performance at tremendous scale and at low cost. Our teams have created a culture of operational excellence that powers some of the world’s largest distributed systems. All of this expertise is instantly available to customers through the AWS services.

Source: Amazon Web Services Blog

2. Simplifies hybrid cloud architectures

This is where I get excited. I have talked at great length about a hybrid cloud solution that my company is building. Since Amazon only had a public cloud offering prior to today, I had the added complexity of integrating two cloud vendors to make up our hybrid solution. This meant I needed to manage two separate vendors with two distinct SLAs, and it introduced some potential latency issues in communicating between the two vendors’ platforms. Assuming that our testing of the beta goes well, Amazon’s VPC simplifies our architecture and makes it more manageable. I also feel that the risk of my private cloud vendor going out of business or getting bought by a competitor has been greatly reduced. Another advantage is that my engineers already have experience with Amazon EC2, which greatly reduces the learning curve on the private cloud. I am a believer in the KISS (keep it simple, stupid) theory.

3. Threatens the livelihood of private cloud vendors

Dave Rosenberg of CNet tweeted “AMZN’s new private cloud service just put a half-dozen startups out of business.” He may be right. Before this announcement, private cloud providers were not directly competing with Amazon. Now they are. There is one caveat though. For Amazon’s VPC to replace private cloud vendors, Amazon must allow access for auditing purposes. In my case, the whole purpose of the private cloud is to put critical data on computing resources that are not shared so that audits can be performed. So Amazon’s VPC solution is not complete if Amazon still refuses to allow audits in this environment.

When I was evaluating private cloud vendors, I was alarmed by how expensive they were.  On the surface, Amazon’s VPC looks significantly cheaper than many of the vendors I researched, although I confess that I have not done a deep analysis on this yet.

Summary

It will be interesting to see how this announcement impacts the competition amongst the cloud vendors.  I assure you that many vendors will try to change the public perception of Werner’s definition of private cloud to one that closely resembles their approach.  It will also be interesting to see the results of the first few beta tests of the VPC solution.  My team will be kicking the tires on it.  If VPC turns out to be everything Amazon promises it to be and they do allow for auditing of VPC images, this will be an absolute game changer for the industry and cloud computing.  My 2 cents!

How to be PCI Compliant in the Cloud

There has been a lot of talk lately about PCI Compliance in the cloud. Amazon even admitted that PCI Level 1 could not be achieved on the AWS platform. Of course the pundits took that comment and immediately wrote off the cloud as a possible solution for systems that process payments. There is a big difference between saying you can’t run your entire application in the public cloud and be PCI compliant and saying you can’t use cloud computing and be PCI compliant.

I have spent an extensive amount of time researching this topic because I am trying to architect a secure and compliant platform that is 100% cloud based. Just because many people sitting on the sidelines say it can’t be done does not make it true. In fact, I already have the blessing of two external companies, a security firm and a PCI auditing firm, that have reviewed our architecture. I agree that achieving PCI Level 1 in a public cloud is unattainable, but I will show how you can easily be compliant by simply processing payments external to the public cloud.

The following diagram is a slightly modified version from my Secure Hybrid Cloud Architectures post.

[Diagram: secure hybrid cloud architecture. From Cloud Computing]

In this diagram you can see the use of both public and virtual private clouds.  The virtual private cloud is used to store critical data on virtual machines that only my company has access to.  This allows us to grant access to auditors when necessary, something that is impossible on shared virtual images in the public cloud.

The next step is to make sure that no credit card data is ever seen in the clear on the public cloud.  I show three different ways to approach this strategy (there are probably many other ways to do this as well).

Option 1 – SaaS Solution

The first method is to leverage a PCI Level 1 compliant software as a service (SaaS) solution like Shift4. The cloud solution has a web service for handling payment transactions. Payment messages go to Shift4, which creates proprietary key values that are passed to the cloud web service. The cloud application never sees the credit card in the clear or even the hash value. Shift4 basically creates a dumb key that the cloud solution uses internally. The public cloud is no longer held to the regulations of Level 1 because the card data never touches the cloud and nobody needs to know the seed (the value used for decryption).
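The "dumb key" idea can be illustrated with a minimal sketch. This is not Shift4's API; the class and method names (TokenVault, CloudOrderService, tokenize, charge, place_order) are hypothetical and exist only to show which side of the boundary ever holds the card number.

```python
import secrets


class TokenVault:
    """Hypothetical stand-in for the external PCI Level 1 provider."""

    def __init__(self):
        # The token -> card number mapping lives only inside the provider.
        self._cards = {}

    def tokenize(self, card_number: str) -> str:
        # The token is random, so it cannot be reversed into the card number.
        token = "tok_" + secrets.token_hex(16)
        self._cards[token] = card_number
        return token

    def charge(self, token: str, amount_cents: int) -> bool:
        # Only the provider can resolve a token back to a real card.
        return token in self._cards and amount_cents > 0


class CloudOrderService:
    """Hypothetical public cloud service that only ever stores tokens."""

    def __init__(self, vault: TokenVault):
        self._vault = vault
        self.orders = []

    def place_order(self, token: str, amount_cents: int) -> bool:
        approved = self._vault.charge(token, amount_cents)
        # The order record holds the token, never the card number.
        self.orders.append(
            {"token": token, "amount_cents": amount_cents, "approved": approved}
        )
        return approved
```

In this sketch the card number is handed to tokenize() outside the public cloud, and the cloud service works with the returned token from then on. A real provider would of course sit behind a network call rather than an in-process object.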

Option 2 – Amazon Payments web services

The second method is similar but uses Amazon’s payments service.  Once again, the payment processing is offloaded to an external service that is PCI compliant and frees the public cloud solution from any Level 1 regulations.

Option 3 – On-Premise

The third method uses a company’s existing physical data center to process payments and/or mask the credit card ID.
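As a rough illustration of the masking step, here is a minimal sketch that would run in the on-premise data center before anything is sent to the cloud. The function name and the choice to keep the last four digits are assumptions for the example, not a statement of what PCI requires.

```python
def mask_card_number(card_number: str) -> str:
    """Replace all but the last four digits with asterisks (illustrative only)."""
    # Ignore spaces and dashes so "4111 1111 1111 1111" is handled too.
    digits = [c for c in card_number if c.isdigit()]
    if len(digits) < 4:
        raise ValueError("card number too short to mask")
    return "*" * (len(digits) - 4) + "".join(digits[-4:])
```

Only the masked value would then travel to the public cloud; the full number stays on-premise.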

All three options allow a company to leverage the cloud for all of the other processing, which in many cases can be extremely cost effective due to low-cost computing resources, cost-effective scaling, a reduced carbon footprint, and more.

Summary

The big takeaway here is that by offloading the payment processing to a provider or location that can pass a Level 1 audit, a company can still take advantage of the public cloud. It is not an all-or-nothing proposition. At the end of the day it comes down to knowing your requirements, understanding the pros and cons of the cloud, and applying sound architectural methods.

Blaming Cloud Computing for our own shortcomings

Every day I see articles telling us of all the perils of cloud computing. I try to refrain from responding, but eventually I get fed up and respond. That usually cures me for 7 to 10 days until I have to write another post. Today’s rant comes after reading this post called The perils of becoming a cloud software developer by Neil McAllister. I like this article and think there is an important message in it about the perils of relying on someone else’s platform. However, associating Twitter failures with cloud computing is where I get wound up. So before we go blaming the cloud for all of our problems, let’s focus on our own shortcomings. Here are two shortcomings related to the article mentioned above.

1. We don’t properly architect for the cloud.

Twitter was never built for cloud computing.  The original design was for an on-premise architecture.  When Twitter was having massive outages (as opposed to today’s regular intermittent outages), their solution was to fire their top architect and bring in industry experts to port the application to the cloud.  This took some time but eventually reduced the frequency of Fail Whales by leveraging Amazon’s cloud offering and designing for dynamic scaling capabilities.  So is Twitter a cloud application?  Yes.  Was Twitter architected for the cloud? No!  Twitter was on life support and rushed into the cloud.  The architecture was speedily “enhanced” to fit in the cloud.  Bottom line, don’t associate Twitter failures with cloud computing.  If anything, the cloud has made Twitter better!

2. We don’t properly assess the risks and operational aspects of our products and services

Many folks in IT are brilliant when it comes to writing code but often fail to fully understand the business aspects of supporting a product and dealing with customers. Neil’s article points out the issues of running a business on top of Twitter or Facebook’s platform. A smart company should go through a vetting process with its vendors and fully understand the risks, issues, and limitations. What are the SLAs? What is the provider responsible for if the SLAs are not met? What are the vendor’s consumer privacy policies? How do they protect against breaches? The list goes on. The company’s business model must account for the potential risks, issues, and limitations. These may be addressed through contractual agreements with vendors (unlikely in the case of Twitter and Facebook since they are free), through risk mitigation strategies, or through terms and conditions that consumers must agree to upfront. No company should be surprised if it builds on top of Twitter and finds out that the service it provides is unreliable for its consumers. No company should be surprised that its customers’ data on Facebook is at risk. Again, these are not cloud computing issues; these are issues with the vendors.
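To make the SLA question concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the composite figure assumes the services fail independently and that your service is down whenever any dependency is down, which is a simplification.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of downtime an availability target permits over a period."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def composite_availability(availability_percents: list) -> float:
    """Availability of a service that needs every dependency to be up.

    Assumes independent failures, so the fractions simply multiply.
    """
    fraction = 1.0
    for percent in availability_percents:
        fraction *= percent / 100
    return fraction * 100
```

A 99.9% target over a 30-day month allows about 43 minutes of downtime. Under the assumptions above, a service that promises 99.9% of its own but sits on a platform offering 99.5% ends up at roughly 99.4%, which is worth knowing before you write your own terms and conditions.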

Summary

Building applications or leveraging assets in the cloud has its pros and cons. It is critical that companies identify what the pros and cons are for their unique business opportunities and address them with sound architectural and business solutions. There is nothing new about this. Whether we are deploying on mainframes, client server, on-premise, off-premise, or whatever, the same holds true. So let’s stop blaming the cloud for our shortcomings. Instead, let’s understand the cloud better. Use cloud computing when and where it makes sense, not because it’s cool and new. Make sure the architecture addresses the risks and issues that the cloud presents to your business case. After all, if your business fails in the cloud, who gets fired? You or the cloud?

Cloud Computing: Top 10 Worst Practices

It is no surprise that many IT organizations are struggling to implement enterprise solutions in the cloud. After all, we have seen IT struggle for decades with many “new” initiatives like SOA, ERP implementations, Enterprise Architecture, etc. The following list captures what I call cloud computing worst practices.  Many of these items on the list are the same mistakes that IT has been making for decades while others are specific to cloud computing.

Top 10 Worst Practices in the Cloud

1. No business justification – This is the age old problem where IT has a hammer (cloud computing) and sees every business problem as a nail.  The reasons for using cloud computing should be directly related to business goals.  After all, the reason IT exists is to support the business.  I have seen many IT shops take on cloud computing for technology’s sake.  Start with the problem, not the solution.

2. Unrealistic expectations – Like SOA and other hyped technologies, cloud computing is not a silver bullet. In fact, if you don’t architect it correctly you will likely expose your company to more risks, outages, and costs than your currently functioning on-premise solutions. Don’t underestimate the impact of organizational change. Many people, especially system administrators, security personnel, and people who like the status quo, may fight it tooth and nail. This is just another page out of the SOA playbook. People can kill any good technical solution.

3. Jumping in too soon – It is easy to get started in the cloud. Simply sign up with a credit card and you can start living large in the cloud. That’s great for R&D, offloading ad hoc processing, or just experimenting, but if you are planning on building enterprise-ready production solutions in the cloud you had better do your homework first. I have seen IT shops put corporate data in the cloud without understanding the ramifications. My recommendation is to spend some time at the Cloud Security Alliance website and download the CSA Guide. Read it front to back and understand what needs to be addressed. Then decide if cloud computing is a fit for your organization’s culture.

4. No focus on architecture – Classic IT here. When will we learn? To deploy enterprise solutions in the cloud, off-premise solutions must be architected differently than on-premise solutions. Let’s learn our lesson from all of the SOA failures and focus on architecture. You don’t buy security, compliance, failover, performance, resiliency... you build it!

5. Moving legacy to the cloud – This might be the biggest mistake of them all. Most legacy applications were never intended to be exposed outside the corporate firewall. I have seen companies spend stupid money trying to retrofit existing systems that work fine and make them “cloud enabled”, all in the name of saving money. The end result is often a more expensive system with more security issues, less stability, and more complexity. Unless your legacy system is based on a service oriented architecture, the cloud is better suited to new applications, not legacy.

6. Depending on the cloud vendor for security – Some companies see the cloud as an opportunity to outsource security. Nothing could be further from the truth. Running systems outside your firewall requires more security from your application development teams than ever before. No longer can companies deliver applications with little to no application security and hide behind a corporate firewall. Now companies must actually deliver secure software and many don’t have the talent and know-how to meet the challenge.

7. Not addressing the risks – If you downloaded and read the CSA Guide, you will have noticed that there are a lot of risks when deploying in the cloud. That doesn’t mean that the cloud is bad, but if you don’t address the risks, your solution will be beyond bad. You must address issues like data ownership, consumer privacy, PCI compliance, country-specific regulations, and much more. Ignore these risks and you could expose your company to security breaches, outages, and failed audits. As the old saying goes, “You can pay me now or pay me later”.

8. Underestimating the effort – If you listen to the vendor case studies you will hear about the weekend success stories like the Washington Post where a ton of work was accomplished overnight for less than $200. Yes, these are real case studies, but these are not real enterprise solutions. They are one-offs where a ton of work is offloaded to the cloud for pay-as-you-go processing. That is much different than architecting a fully operational production system in the cloud. You will need more than a credit card for one of those!

9. Selecting the wrong vendor – If you pick your cloud vendor because of an existing relationship with your favorite vendor you should be shot on sight. First of all, the mega vendors like IBM, Microsoft, and HP are late to the game and are not thought leaders in the cloud. In fact, you would be hard pressed to find a significant real life enterprise solution deployed on any of the mega vendors’ platforms since they are all relatively new. Instead, understand your requirements. Before selecting a vendor, figure out what type of cloud makes sense for the business problem and your architectural standards. Are you looking for Infrastructure as a Service (IaaS) and want to be free to code in any language? Are you willing to be locked in to a Platform as a Service (PaaS) vendor and write in the language mandated by the platform? Or do you simply need to assess Software as a Service (SaaS) solutions? Or maybe you need a combination of these. Do you need virtual datacenters outside of the US? Do you need a public cloud, private cloud, or a hybrid cloud solution? There are a lot of questions to answer before you settle on a vendor. Picking Azure because your developers know C# is like playing Russian Roulette. Close your eyes before pulling the trigger!

10. Lack of talent – And finally, like every other failed attempt at implementing new technology, many shops just don’t have the talent needed, don’t train their staff sufficiently, or hire consultants who apply the previous nine worst practices.  Cloud Computing can allow companies to compete like never before.  Unfortunately, there are a lot of people who can screw it up.

Real Time Transactions coming to a Cloud near you

I am one of those people who love to do things that “can’t be done”. At the end of 2008, I was researching real time transaction processing and cloud computing. I could not find many examples where people had pulled this off. To make matters more challenging, I was looking for case studies where these transactions were also being done with consumer and financial data. When I polled my Twitter network for case studies of this I mostly heard laughs and jokes for replies. One response was “Let me know when you find one”. Sounded like a challenge to me!

As I have mentioned in the past, we leveraged the Extended Enterprise Architecture Framework (E2AF) cheat sheet I put together to identify our business requirements, followed by our data requirements, followed by our systems requirements, and then our infrastructure requirements (in that order). This process provided us with the necessary requirements to guide us through the vendor selection process for our hybrid cloud architecture which addresses both our performance and security/compliance requirements.

Some of our key drivers for leveraging the cloud are:

So up until this point, I felt very good about our decision to leverage the cloud because it addressed all of our business and technical requirements... on paper! Now it was up to my talented group of engineers to validate what we believed could be accomplished in the cloud. We needed to prove that our software in the cloud could process large numbers of concurrent real time transactions and return results in sub-second response times. First of all, I can’t think of a feasible way to even try prototyping this on-premise unless you just happen to have a ton of very expensive hardware of various configurations sitting around at your disposal. In the cloud we were able to fire up virtual machines ranging from 1.7GB, 1 node, 32-bit instances all the way up to 7GB, 20 node, 64-bit instances (see EC2 instance types here). My guys went through numerous iterations of designs with some trial and error and have come up with a solution that can process an incredible number of concurrent transactions on a single 20 node instance with minimal load on our database, which sits on its own virtual server. We can continue firing up more 20 node (or smaller) instances as the volumes increase and still have plenty of resources left on the database server to handle the load.
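For readers who want to try a similar experiment, here is a minimal sketch of a concurrency test harness. It is not the harness my team used; process_transaction is a hypothetical stand-in that just sleeps, and you would replace it with a call to your own service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_transaction(txn_id: int) -> int:
    """Hypothetical stand-in for a real transaction call."""
    time.sleep(0.01)  # simulate a small amount of work
    return txn_id


def timed_call(txn_id: int) -> float:
    """Return the wall-clock seconds one transaction took."""
    start = time.perf_counter()
    process_transaction(txn_id)
    return time.perf_counter() - start


def run_load_test(total: int, concurrency: int) -> dict:
    """Fire `total` transactions with up to `concurrency` in flight at once."""
    if total < 1 or concurrency < 1:
        raise ValueError("total and concurrency must be at least 1")
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(total)))
    # Approximate 95th percentile taken from the sorted latencies.
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "count": len(latencies),
        "p95_seconds": latencies[p95_index],
        "max_seconds": latencies[-1],
    }
```

Calling run_load_test(1000, 50) and checking that p95_seconds stays under one second gives a first, rough answer to the sub-second question. Repeating the run on different instance sizes is where the pay-as-you-go model earns its keep.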

We are well on our way to deploying a real time transaction processing system in the cloud without buying a single piece of hardware, without having a physical data center, and with the ability to scale on demand and pay as we go. Our infrastructure costs to date are less than our travel expenses. It is hard to fathom how cost effective cloud computing can be when the right architecture is matched with the right business problem. We believe we have just scratched the surface on the throughput that we can get in the cloud. We already have enough throughput to meet our immediate needs but believe we can blow these numbers away with additional tuning.

For those of you who have been reading this blog over the last few years, you know that I like to write about the trials and tribulations of my enterprise initiatives. I have documented my experiences with BPM and SOA over the last two years and have been focusing heavily on the cloud this year. Up until this point I have written mostly about design theories and strategies. Now we are putting these theories and strategies in place. The blog posts you will see in the upcoming months will be the proof in the pudding. Expect posts on passing PCI audits in the cloud, logging strategies in the cloud, disaster recovery in the cloud, and plenty of myth busting. As always, feedback is welcomed!

Death by Mega-Vendors

If you have been reading any of my Tweets or blog posts lately you will know that I am getting entirely turned off by the conversations occurring around cloud computing and SOA. So many of the “discussions” focus on the theoretical aspects of Cloud Computing, the semantics of SOA, and the PR of vendors. It is getting to the point where I rarely read blogs and tweets anymore and have almost completely stopped blogging because the conversations are becoming irrelevant. Both SOA and Cloud Computing can create huge opportunities for businesses, but at the end of the day it boils down to architecture and people. I have beaten that dead horse into submission.

But it seems that the Mega-Vendors have really ramped up the PR lately. So much attention has been given to HP’s “cloud” solution, Oracle’s SOA solution, and various other large vendors and their 2009-2010 mission statements. First let me pick on the cloud computing “solutions” from the mega-vendors (again). As I have said in the past, the innovation in this space is coming from the pure players (3Tera, GoGrid, Mosso, etc.) and the companies that have been in the space for years (Amazon, Google, Salesforce, etc.). The mega-vendors are recycling existing products and calling them cloud computing offerings just like they did when SOA was in its prime hype stage.

So let’s pick on Oracle. There is so much hoopla about Oracle’s SOA Suite 11g. Underneath the covers, this is a conglomeration of many purchases. In fact, go here and see that Oracle has purchased over 50 companies in the last few years. In a past life, I was a BEA customer. BEA had also purchased many companies to put together its SOA stack. We drank some of the Kool-Aid about the smooth integration between the layers. In reality, the BPM tool (formerly Fuego), the Portal (formerly Plumtree), and the Data Services (formerly Flashline) were all purchased from pure players and rushed into a new product line called AquaLogic. There were tons of issues trying to get these products to integrate. With each release, BEA improved its stack but was still a few releases away from creating a fluent and robustly integrated stack. Then Oracle bought BEA and the cycle continued. Now you have a hodgepodge of BEA acquisition tools being merged with Oracle acquisition tools. The 11g release is their “crown jewel”. I would put money on it that they are still several releases away from getting all these various products with various underlying architectures to play nice together (if history repeats itself). So if you expect 11g to solve all your problems, think again. You will need a ton of professional services at $250/hr and above just to get the stack running correctly, before you even think about solving business problems with the stack. My past experience with mega-vendor solutions is that a ton of money is spent on non-value-added labor just to get these mammoth tools running in your environment. The fix that vendors most often recommend for integration issues is to upgrade to the latest version of a given product within the stack. That tends to fix some problems but create others, which in turn are fixed by upgrading to another version of another product within the stack. This is a never-ending, expensive cycle which makes you look foolish when you have to explain it in front of your customers. It is things like this that give SOA a bad name and give the project the appearance of another solution for the sake of technology.

So don’t feed on the hype. If your company needs all of the features and all of the layers that are in these stacks, you are probably doomed regardless of what tools you buy. I don’t mean to single out Oracle but it is the one that I have hands-on experience with. I am sure some of you have similar horror stories with other mega-vendors like Microsoft and IBM. The bottom line is that innovation is happening with the pure players. The mega-vendors can’t match their agility and creativity, so they buy them and add them to their product line, where innovation stalls and shifts to internal integration efforts.

So when you see all the vendor PR and hype out there, take the rose colored glasses off. Figure out what features are really important for delivering your business requirements. You may not even need most of these tools (see Ross Mason’s To ESB or Not to ESB). Focus on business drivers and sound architecture principles. The last thing on your list needs to be vendor evaluations. If you are looking for vendor solutions to save the day, you are bound to fail. Your business drivers and architecture should lead you to the right vendor solution, not the other way around. Don’t forget about open source solutions either. You may only need a fraction of the features that these mega vendors offer.

So there is my rant for today. Now back to the vendor hyped Twitter stream.

Twitter: Is the Cloud Computing conversation nothing more than vendor PR?

I am a big fan of social media, especially Twitter. I have used Twitter to engage in conversations with fellow enterprise architects, CTOs, CIOs, vendors, and industry analysts to name a few. Most of what I troll Twitter for are conversations about EA, SOA, and Cloud Computing. Lately, the Twitter stream has been adding little to no value on these topics. Most of the chatter that comes across my Twitter client is a rehashing of what is being said at conferences. It seems that there is at least one, if not more, cloud computing conference each week. Even the conferences that are not about cloud computing still manage to talk about the cloud. After all, if you are a vendor and you are not talking cloud, you are outdated (or so it seems).


Source: http://www.frontpagepr.com/public_relations/

So my Twitter stream is full of “new” information about the cloud. Vendors sponsor conferences, supply most of the keynotes and other sessions, and all of the attendees blog and/or tweet about it. Many of the people up on the grand stage can’t even define the terms correctly yet all of the trade magazines reiterate what they say to all of us who are not there. This leads to mass confusion because unqualified speakers are talking in riddles and peddling their snake oil to the hungry masses looking for information.

If you have been following the chatter the last few weeks, you might think that the top cloud computing vendors are HP, IBM, and Oracle. The reality is that these mega vendors have been on a huge PR mission and have dominated the conferences, thus dominating the conversation on Twitter and on blogs. Those of us who are actually working hands-on with cloud computing each day and solving real business problems know that these three vendors are not even close to the top of the cloud computing list in the areas of innovation, real customer installations, and maturity. Companies like Amazon, 3Tera, Google, Salesforce, GoGrid, and many others are light years ahead. Companies like IBM, Oracle, EMC, and HP are retrofitting existing products and selling them as cloud solutions.

But try to have a real conversation on Twitter about cloud computing and you will be sorely disappointed. There are some great folks out there in the area of security and compliance. I actually see some value in those conversations once you sift through the hype and myths. But when it comes to architecture and deployments in the cloud, there is complete silence out there. I feel like I am stranded on an island looking for the ship to arrive. Where are the practitioners? Are there any? The cloud conversation is just pure conference regurgitation. There seem to be more cloud computing conferences than there are cloud computing case studies. When I do see an EA type talking about the cloud the message is usually, “Cloud computing has been around for years. This is nothing new”. When I hear that I respond with, “Phones have been around for years, but you can’t tell me that the iPhone and the Blackberry have not radically changed the way we do business”. This is true for the cloud. Yes, we have been outsourcing data centers for years, but the technological advancements over the years have driven costs way down and have made managing and deploying cloud resources much simpler than in the past. Today’s cloud is as much of a game changer as the Internet was when it burst onto the scene in corporate offices in the early ’90s.

So as a CTO/architect who still rolls up his sleeves and designs things, I am seeing a huge void in finding valuable information about cloud computing. I don’t see any enterprise-level case studies. I hear vendors talking the talk, but they have no real results to support all of their hot air. I see the analysts and bloggers repeating what vendors say, which is mostly product-specific. There is a lot of talk about what to do, but very few are talking about how to do it and sharing the hard-fought lessons learned along the way. Maybe I am expecting too much from Twitter. Maybe the people with all the answers I seek are too busy building things to be tweeting all day. Maybe Twitter and blogs are the wrong place to look for answers. Maybe I just need to figure everything out myself (which has been the case so far). Maybe Twitter is just the perfect PR machine for vendors and a bad place for practitioners to collaborate.

Maybe I just need a vacation!

Architecting the Cloud: Testing performance in your cloud application

When I talk about architecting in the cloud, I am referring to building composite applications or services from scratch with the cloud as the target deployment platform. So as you read this post, think about an enterprise application or collection of services built for the cloud. In previous posts I have referenced a hybrid cloud model like the one below:

From Cloud Computing

In this model, there are many requirements in the architecture that are specific to security, compliance, reliability, and scalability and are independent of the business functionality that will be deployed in the cloud. If you look at the image above, you will see many different endpoints where data moves from one cloud to the next, to SaaS solutions, and between virtual data centers. I call this the Cloud Infrastructure.

These requirements are also critical to the flow of data throughout the cloud. Encryption, transformation, replication, backup/recovery, and many other tasks are key deliverables within any good cloud architecture. Then come the services that transport business logic in and out of the cloud while inheriting the cloud infrastructure and data services that should be built for reuse. The following image shows a simple view of the separate performance layers of the cloud architecture and the order in which they should be tested.

From Cloud Computing

The first thing to do is test the performance of your cloud vendor(s). Looking at the hybrid cloud image above, I would test the flow of data from the different endpoints. At this point there is no need to worry about encryption, transformation, business logic, etc. When testing the Cloud Infrastructure you should be testing the performance of the platform that the cloud vendor is providing. Do not add variables that are specific to the business problems you are trying to solve. For a hybrid solution, test the private and public clouds separately. Your tests should run for several hours with loads of varying size. You need to ensure that the cloud can sustain heavy loads, handle concurrency, and consistently deliver solid performance for all transactions. Testing at this level will also help identify configuration and optimization opportunities for each cloud vendor. Once both the public and private cloud infrastructures are tested, test the intercloud connectivity between them. Make sure this connection is not a bottleneck.
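To make this a little more concrete, here is a minimal sketch of an infrastructure-only load test in Python. Everything in it is an assumption for illustration: `send_payload` is a hypothetical stand-in that only simulates a transfer, and the payload sizes, concurrency, and request counts are made-up numbers. In a real test you would replace the stub with a call to your own endpoint and let it run for hours rather than milliseconds.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def send_payload(size_bytes: int) -> None:
    """Hypothetical stand-in for one raw transfer to a cloud endpoint.

    It only sleeps for a time proportional to the payload size.
    Swap in a real call (no encryption, no business logic) when testing.
    """
    time.sleep(size_bytes / 1_000_000_000)


def timed_call(size_bytes: int) -> float:
    """Return the wall-clock seconds taken by one send_payload call."""
    start = time.perf_counter()
    send_payload(size_bytes)
    return time.perf_counter() - start


def run_load(payload_sizes, concurrency, requests_per_size, call=timed_call):
    """Issue requests_per_size calls per payload size using a thread pool
    of `concurrency` workers, and summarize the latencies in seconds."""
    results = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for size in payload_sizes:
            latencies = sorted(pool.map(call, [size] * requests_per_size))
            # Nearest-rank 95th percentile on the sorted latencies.
            p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
            results[size] = {
                "count": len(latencies),
                "median": statistics.median(latencies),
                "p95": latencies[p95_index],
                "max": latencies[-1],
            }
    return results


if __name__ == "__main__":
    summary = run_load([1_000, 100_000], concurrency=8, requests_per_size=40)
    for size, stats in summary.items():
        print(size, stats)
```

Running the same harness against the private cloud, the public cloud, and then the link between them gives you three sets of numbers you can compare before any business logic exists.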

Once you are satisfied with the performance of your hybrid cloud, it is time to analyze the impact of encryption, transformation, data replication, and the various ways that data is being manipulated to address security, compliance, reliability, and scalability requirements. It is critical to understand the impact of these requirements on the overall performance of the system. If you skip this step, finding performance issues later can be like finding a needle in a haystack. People can waste a lot of time searching for performance issues in the business logic when the problem may be in the data layer. Manipulating data can be resource intensive and a potential bottleneck in the overall architecture. Spend some time testing this layer before overlaying it with business logic.
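One way to get a feel for this layer is to time each data manipulation step on its own against a do-nothing baseline. The sketch below is only an illustration under a few assumptions: the Python standard library has no encryption module, so `zlib` compression and a SHA-256 digest stand in for whatever transformation and encryption steps your architecture really uses, and the function names and payload size are hypothetical.

```python
import hashlib
import os
import time
import zlib


def passthrough(data: bytes) -> bytes:
    """Baseline step: hand the data back untouched."""
    return data


def compress(data: bytes) -> bytes:
    """Stand-in for a transformation step (assumption: zlib compression)."""
    return zlib.compress(data)


def checksum(data: bytes) -> bytes:
    """Stand-in for a security step (assumption: SHA-256 digest)."""
    return hashlib.sha256(data).digest()


def time_step(step, payload: bytes, repeats: int = 20) -> float:
    """Average wall-clock seconds per call of step(payload)."""
    start = time.perf_counter()
    for _ in range(repeats):
        step(payload)
    return (time.perf_counter() - start) / repeats


def measure_overhead(steps, payload: bytes, repeats: int = 20):
    """Return {name: seconds per call above the passthrough baseline}."""
    baseline = time_step(passthrough, payload, repeats)
    return {
        name: time_step(step, payload, repeats) - baseline
        for name, step in steps.items()
    }


if __name__ == "__main__":
    payload = os.urandom(256 * 1024)  # 256 KiB of random bytes
    report = measure_overhead({"compress": compress, "checksum": checksum}, payload)
    for name, seconds in report.items():
        print(f"{name}: {seconds * 1000:.3f} ms per call over baseline")
```

If one step dominates the numbers here, you have found your bottleneck before a single line of business logic is involved.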

So now you feel good about the Cloud Infrastructure and you have iterated through the design of all of the data manipulation requirements. You now have a solid foundation for your business services. You can focus all of your energy on the performance of your services. Test each service individually first. Then test the flow of data through the various combinations of service calls that the system is expected to perform. Put these services through rigorous testing and measure their performance for load, sustainability, concurrency, etc.

After the business logic has met the performance requirements, it is time to test the system as a whole. Up to this point you have tested the system at different layers within the architecture and at different components within the layers. Now it is time to test the system holistically and through the eyes of the end user. It would be very expensive to find performance issues from the lower levels of the architecture at this point. That is why I recommend the layered approach to performance testing. What I also like about this approach is that you can start testing very early in the lifecycle. For example, you can test the performance of the Cloud Infrastructure way before the development team delivers the business logic. This approach is iterative and agile and aims at removing performance risks earlier in the lifecycle, thus reducing the risk of project delays.
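The ordering itself can be captured in a few lines. The following is a rough sketch, not a real test framework: the layer names come from this post, but `run_layers` and the check functions are hypothetical placeholders for whatever performance suites you actually run at each layer. The point is simply that a higher layer is never tested until every layer beneath it has passed.

```python
from typing import Callable, List, Optional, Tuple

# Layers in the order this post recommends testing them.
LAYER_ORDER = [
    "cloud infrastructure",
    "data manipulation",
    "business services",
    "end-to-end",
]


def run_layers(
    checks: List[Tuple[str, Callable[[], bool]]]
) -> Tuple[List[str], Optional[str]]:
    """Run each layer's check in order and stop at the first failure.

    Returns (names of layers that passed, name of the failing layer or None).
    Checks after a failing layer are not executed.
    """
    passed: List[str] = []
    for name, check in checks:
        if not check():
            return passed, name
        passed.append(name)
    return passed, None


if __name__ == "__main__":
    # Hypothetical checks; each would wrap a real performance test suite.
    outcomes = {
        "cloud infrastructure": True,
        "data manipulation": True,
        "business services": False,
        "end-to-end": True,
    }
    checks = [(name, lambda n=name: outcomes[n]) for name in LAYER_ORDER]
    passed, failed = run_layers(checks)
    print("passed:", passed)
    print("first failing layer:", failed)
```

Because the infrastructure check needs nothing from the development team, it can run on day one, which is exactly why this approach pulls performance risk forward.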

Struggling for an ROI? Why not practice EA?

I have read many articles giving advice on how to sell a technology to the business.  It seems that the ROI is a hard thing to derive and explain in business terms these days.  You see it with SOA, Cloud Computing, Social Computing, and even with security (yes, having to prove the value of security)!  Being a practitioner and addict of Enterprise Architecture, I find this method of thinking to be amusing and even backwards.  It sounds to me like people have a technical solution and are now looking for a problem to solve with it.  It needs to work the other way around!

I have also read many articles about business and IT alignment, or the lack of it.  Well, coming to the business with technical solutions and asking for help to justify them with business drivers is not alignment.  Alignment is being a participant alongside the business, solving business problems.  This is what Enterprise Architecture (EA) is all about.  EA is about understanding the business and then aligning the proper technologies to help the business achieve its goals.  EA should not be a bunch of non-business-speaking geeks setting standards and creating pretty pictures on the plotter (aka Ivory Tower).  The following picture describes how EA sees alignment.  Notice that the technical strategy is a business enabler.


Source: Extended Enterprise Architecture Validation Full version.pdf

EA then defines a clear process for helping the enterprise take these strategies and turn them into something actionable and beneficial to the business.


Source: Extended Enterprise Architecture Validation Full version.pdf

The images above come from the work of Jaap Schekkerman, who created the E2AF (Extended Enterprise Architecture Framework), of which I am a big fan and user. As you can see from this process flow, it all starts with the business’s mission, goals, and objectives. From these “business drivers” the enterprise architects already have a view of the things that are important to the business. Then the architects start their analysis by asking the Why, Who, What, How, With What, and When questions for the perspectives of the Business, Information, Systems, and Technology Infrastructure (see image below).


Source: Extended Enterprise Architecture Validation Full version.pdf

A while back, I took the E2AF framework and made a simple downloadable cheat sheet that puts all of the questions in the above matrix into a document. Feel free to use and distribute it.

Whether an enterprise adopts an EA framework or not, there is no excuse for not asking a series of questions like the ones in the cheat sheet for each enterprise initiative.  If enterprises would do this, they would find that the combination of questions across the different perspectives would lead them to technology solutions (e.g. SOA, Cloud Computing, etc.) that support the business’s mission, values, and goals.  At this point the ROI should be much easier to derive, because the solutions were driven by the problem statement(s), not the other way around.

In Summary

When you see an IT team constantly struggling to “sell the business” or compute ROIs, you can bet that they are not aligned with the business.  To become truly aligned with the business, a representative of IT or EA must sit at the table with the business execs and be a participant in the strategic planning of the BUSINESS.  This person should not be seen as the IT guy/gal; they should be viewed as a business executive.  When CIOs don’t value EA or can’t sell the value of EA, they are the IT guy sitting at the table.  When they do believe in EA and have built an EA that the business sees value in, then you have alignment.  Without this alignment, IT will constantly struggle to sell technical solutions to the business and come up with appealing ROIs.