# New Windows Azure Feature Announced at PDC 2010

This morning, I received a newsletter from the Windows Azure Platform Team. Apart from its formatting being horrible as usual, with all German umlauts displayed as weird unreadable symbols, it announced a lot of new and exciting features for Windows Azure. Also be sure to check out this video, which provides a quick overview of the new features.

## Extra Small Instances

This is one of the features I was really waiting for. Before this announcement, the smallest instance size was Small, with a price tag of $0.12 per hour. Now there is an Extra Small instance size at $0.05 per hour.

I went ahead and replicated the table from this Microsoft web site to have the most current overview on my blog. This should come in handy, at least for me because I will keep forgetting this stuff.

| Compute Instance Size | CPU | Memory | Instance Storage | I/O Performance | Cost per hour |
|---|---|---|---|---|---|
| Extra Small | 1.0 GHz | 768 MB | 20 GB | Low | $0.05 |
| Small | 1.6 GHz | 1.75 GB | 225 GB | Moderate | $0.12 |
| Medium | 2 x 1.6 GHz | 3.5 GB | 490 GB | High | $0.24 |
| Large | 4 x 1.6 GHz | 7 GB | 1,000 GB | High | $0.48 |
| Extra Large | 8 x 1.6 GHz | 14 GB | 2,040 GB | High | $0.96 |

As explained in this video, Extra Small instances share resources (CPU, memory) with other VMs on the same node. Furthermore, the network bandwidth is capped at around 5 Mbps. This is not the case with larger instance sizes, where CPU and memory are not shared and your service can leverage unused bandwidth.

I believe this is a very good idea. Amazon’s smallest Windows instance (Micro Instance) is available at a cut-price $0.03 per hour, but we can’t compare these offerings 1:1 because Azure offers more functionality. We don’t have to pay extra for load balancing services and we don’t have to worry about OS updates etc. Besides, Azure’s Extra Small instance has 155 MB more RAM, which could make a lot of difference.

This is why I like this new instance size so much: it just got easier to get a small service running without downtime. By that I mean that I can now pay for two Extra Small instances for a small service, which gives me the ability to do rolling upgrades without downtime. This includes automatic OS updates as well as my own updates to my service when delivering new features. This is still cheaper than having one Small instance without all these benefits. I would expect this to activate the SLAs as well, but I’m not sure about that. In any case, it won’t apply while this feature is still in beta.
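To make the "two Extra Small instances are still cheaper than one Small instance" claim concrete, here is a minimal sketch of the arithmetic. It uses the hourly prices quoted above; the 30-day month and the function name `monthly_cost` are assumptions made for illustration.

```python
# Hourly prices quoted in the instance table above (USD).
EXTRA_SMALL_PER_HOUR = 0.05
SMALL_PER_HOUR = 0.12

HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month, running around the clock


def monthly_cost(price_per_hour: float, instances: int) -> float:
    """Cost of running `instances` VMs non-stop for one month, rounded to cents."""
    return round(price_per_hour * instances * HOURS_PER_MONTH, 2)


two_extra_small = monthly_cost(EXTRA_SMALL_PER_HOUR, 2)
one_small = monthly_cost(SMALL_PER_HOUR, 1)

print(f"2 x Extra Small: ${two_extra_small:.2f} per month")
print(f"1 x Small:       ${one_small:.2f} per month")
```

Bandwidth, storage and transaction charges are left out here, since they would be the same in both setups.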

# Things I Believe Windows Azure is Missing

Lately I was in the comfortable situation of having some time to dig deeper into some features of Windows Azure. While I am really excited about what Azure means to me as a .NET developer who wants to leverage PaaS and not IaaS, I also feel that some very important features are still missing.

Some missing features require workarounds, others are simply showstoppers. For example, this week I hit a wall with Table Storage. Ouch. I was really happy about the approach I used with it and had even already applied some workarounds to my solution when, suddenly and sadly, I had to realize that Table Storage just doesn’t support case-insensitive search.

Without going into much detail, let me put all this into a list. While it may seem that this post is essentially a rant about Windows Azure, that is not true. The rant is really only about documentation, see below. Everything else is just a list of features I want to have and that I believe the Azure platform needs ASAP if MS doesn’t want to get left behind. Some other big players in Cloud Computing are doing a splendid job of throwing new features at developers almost monthly and letting them know as soon as possible.

So all this boils down to: Deliver crucial new features. Do it fast. Let everyone know about it.

## Table Storage

There is no support for secondary indexes. Amazon SimpleDB has it. Missing this kind of feature means we have to store information redundantly, just to be able to query for it efficiently. This results in additional development effort as well as increased storage costs. Since storage is cheap, this is OK. And since Table Storage can scale out enormously when used properly, I am even willing to invest some more time to get this right. So this is not necessarily a showstopper.

There is no support for case-insensitive search. This means that I would have to store a lowercase version of whatever I need to search for, together with the version containing the correct spelling.

There is also no way for doing full text searches on the storage. There is a workaround for StartsWith() but none for EndsWith() or Contains(). And I’m not talking about not being able to do that efficiently. I’m talking about not being able to do that. Because the LINQ provider just doesn’t support these kinds of queries. And no, doing that on the client side is not always an option.
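As a rough illustration of the two workarounds mentioned above, the following sketch stores a lowercased search key next to the original spelling and expresses a prefix search as a key range, which only needs `>=` and `<` comparisons. This is the commonly used range trick, not necessarily the exact workaround I applied, and it runs against an in-memory sorted list rather than Table Storage. All names in it (`normalize`, `prefix_upper_bound`, `starts_with`, the sample data) are made up for illustration.

```python
from bisect import bisect_left


def normalize(text: str) -> str:
    """Lowercased search key stored next to the original spelling."""
    return text.casefold()


def prefix_upper_bound(prefix: str) -> str:
    """Smallest string greater than every string that starts with `prefix`.

    Incrementing the last character turns StartsWith(prefix) into the range
    prefix <= key < upper_bound. Assumes a non-empty prefix.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


# In-memory stand-in for a table: (search key, original spelling), sorted by key.
names = ["Stuttgart", "STUDIO", "azure", "Storage", "Amazon"]
rows = sorted((normalize(n), n) for n in names)
keys = [key for key, _ in rows]


def starts_with(prefix: str) -> list[str]:
    """Case-insensitive prefix search expressed as a key range scan."""
    low = normalize(prefix)
    high = prefix_upper_bound(low)
    start = bisect_left(keys, low)
    end = bisect_left(keys, high)
    return [original for _, original in rows[start:end]]


print(starts_with("stu"))
print(starts_with("A"))
```

Note that nothing comparable works for EndsWith() or Contains(): a sorted key only helps when the part you know sits at the front of the string.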

So MS, please, don’t tell me that Table Storage is missing all those features by design and that it was always intended to be only a very simple way of storing data. Just don’t. Give it a few more features and make it a powerful NoSQL storage. I’m not asking for it to support all the relational stuff like cascading deletes and the like. I understand that it is non-relational. But don’t stop halfway. Continue, quickly.

## Hosting

Yes, I know this one is probably just for devs trying to grok the way Azure works or maybe small startups, but still … hosting a small service on Azure is just too expensive. As already mentioned, this is not a problem when you put your virtual machine’s power to good use, but the pay-as-you-grow model has a flaw here: the initial price might be too high if your needs are too low. Ah yes, Amazon has that, too.

## Sending Emails

Did I mention that Amazon has a feature called SNS? This is not sufficient for sending emails to your customers, but it is perfectly acceptable for anything else, including email messaging in case of failure etc. There are solutions for that, but these are merely workarounds.

## Caching

While I am happy that there are projects like CloudCache that are based on Memcached, I really hope that support for Velocity a.k.a. AppFabric Cache is coming soon. And by soon I do not mean PDC 2011. If there is no announcement (with specific date) at PDC 2010, I will go with Memcached.

## Fixed IPs

IPs seem to remain fixed over the lifetime of a deployment. While this may be true, there is no official statement from Microsoft saying that this behavior is supported. To some, that might seem fair enough since you don’t have to bother about that anyway, right? Well, say you’d have to communicate with some other service that requires your IP to be whitelisted. Not your domain but your IP. Then assume that it will take some time to get another whitelist entry in case something happens to your first IP. You could try to get a ‘backup IP’ whitelisted parallel to your real IP, but this would require another deployment to run on a different production slot, so this is also a clumsy workaround.

And Amazon has the concept of – ah, never mind.

## Documentation

Honestly. This is just too hard. It’s not surprising the Azure community keeps on providing link lists and compilations of Azure resources. Information is scattered, insufficient and sometimes outdated.

And I’m getting tired of saying this, but compared to Microsoft’s approach, Amazon’s is really centralized. You find a long list of documentation for every feature AWS offers on the starting page for that feature and from there on, you can follow links to complementary information.

There is a monthly AWS newsletter saying “we are happy to announce”. Well, I am happy if there is ANY way for me to figure out what Microsoft is up to in the next few months with regard to Azure. Why is communication so bad on this topic?

Last time I checked, the official voting forum for Windows Azure Feature Requests listed 8 completed requests and 1 accepted request (secondary indexes for Table Storage). Even then, there has been no further information on secondary indexes for about 11 months. No, let me check that again, this can’t be true … aargg, it is true. There are a lot of other very important requests but none of them made it to ‘accepted’. Does this mean the Azure team is not currently working on any of these feature requests? I hope not! Really, I hope not.

I would like to see that Microsoft uses this platform (or any other means, just DO it) to communicate upcoming changes to developers and decision makers. That way, everybody would be able to plan ahead and be less frustrated. Developers like me would have a central way of being kept in the loop and appreciate new features instead of being angry for not being told.

Only they don’t. So they won’t. So they are not. And we haven’t. So I am not (at least to a certain degree).

And I definitely believe it is not a good idea for MS to keep roadmaps and timelines as secret as possible, wait a whole year until the next PDC and only then make information about all that publicly available. That being said, I really hope there will be some interesting announcements at PDC 2010. I need certain new features to make it easier for me to get customers to adopt Azure. Otherwise they will rather go with Amazon.

# Azure Launch Day in Stuttgart Wrap Up

After having listened to Tim Fischer et al, the summary of this conference manifested in my mind like this:

Microsoft still hasn’t decided how to pronounce Azure. Which kind of makes sense, since GB and US pronunciations seem to differ quite a bit.

Some other topics were of interest, too 😉 I’d like to give a brief summary here because the conference served as a trigger for me to revisit the latest evolution in Microsoft’s cloud computing.

### Different VM Sizes

Azure now lets you choose how much power your hosting instances are sporting. There are four different sizes available:

| Name | Price / hour | CPU | RAM | Instance Storage |
|---|---|---|---|---|
| Small | $0.12 | 1 x 1.6 GHz | 1.75 GB | 250 GB |
| Medium | $0.24 | 2 x 1.6 GHz | 3.5 GB | 500 GB |
| Large | $0.48 | 4 x 1.6 GHz | 7 GB | 1,000 GB |
| X Large | $0.96 | 8 x 1.6 GHz | 14 GB | 2,000 GB |

Pricing and features are almost equal to those of Amazon Web Services Standard Instances. For details, see their different types and pricing.

### Upcoming Features

There will be blobs that can be mounted as NTFS drives called XDrives. As Ray Ozzie said:

Perhaps most significantly is in a new storage type that we call XDrive. Azure XDrives are Azure storage blobs that are mountable as regular NTFS volumes, a drive mapped, cached, durable volume, accessible by mapping an Azure page blob into an NTFS VHD.

He also announced that the Azure portfolio will have a feature that is really more an IaaS feature than a PaaS feature:

As we move forward and think about ways we can simplify being able to take the investments that you’ve made in the Windows Server environment and move them into Windows Azure, one of the things that we’re doing is allowing you to create your own image. We will do this next year. This is another feature that’ll come in 2010. We’ll allow you to create your own image, which has all of the software configured exactly the way you want it.

As far as I know, this feature is called “Virtual Machine Role” but no one knows. Maybe even Microsoft doesn’t know. And if they do know, they won’t pronounce it. Hell no.

I also heard that in 2010 Worker Roles can be addressed directly from the web without having to route traffic through Web Roles. Didn’t quite understand why there are 2 different roles, then.

### Blobs

Blobs are really getting useful. I already knew they had the ability to be public or private, but these two new features were news to me:

• By specifying HTTP Cache-Control policy for each blob, Azure Blob Storage can be used as Content Delivery Network
• Snapshots of blobs can be taken to create read-only backups

### Pricing Options

As we were told, there will be different pricing options. One of these options is useful for systems that already have a certain level of consumption and want better pricing for that than the very flexible but relatively costly “Pay as you grow” strategy offers. That option is more like Amazon AWS Reserved Instances.

And BizSpark Members will get Azure Hosting and SQL Azure for free for 8 months (don’t know details yet).

### Summary

Compared to Amazon, I like the idea of PaaS (Azure being the first choice for a .NET developer like me). When I want to give one of my ideas a try and build a web application, I surely don’t want to care about all this tedious infrastructure stuff like firewall, backups, load balancing, security updates etc.

It’s interesting to see that Microsoft is announcing a move more towards IaaS that early. This seems to be driven by early customer feedback. There must be a need for more flexible environments and they don’t want to lose those people to Amazon.

I really dig some of the new features. Good job so far, keep it coming. Looking forward to the EU datacenters.

Windows Azure Storage at PDC 2009

PDC 2009 (German Blog)

# Microsoft Announces Pricing for Windows Azure

When Microsoft revealed their pricing structure for Windows Azure, I was wondering what took them so long to figure it out. But see for yourself.

Below you can find a table comparing pricing for Amazon AWS and Windows Azure Services for the US Region.

| Feature | Azure | AWS |
|---|---|---|
| Compute Hour [$/hour] | $0.12 | $0.125 |
| Storage [$/GB/Month] | $0.15 | $0.15 |
| Storage Transactions [$/10k Transactions] | $0.01 | $0.01 |
| Bandwidth [$/GB Data Transfer] | $0.10 in, $0.15 out | $0.10 in, $0.17 out |

Looking at the table above, Microsoft’s pricing structure is hardly surprising. They are measuring usage in exactly the same way Amazon does and their offerings are available at almost the same prices, too. I believe they had no real choice here. There was no way they could have introduced a pricing model more complex than Amazon’s. And clearly, they should not make it more expensive.

But even though the numbers are quite similar, the offerings may be hard to compare. For example, what I have listed as Amazon’s price for a compute hour is the smallest available Windows EC2 instance. I did not find any information describing what power Azure’s compute hour offers. And then there is Amazon’s concept of Reserved Instances which makes it cheaper for users to get long-term capacities.

# Amazon AWS announces load balancing

On May 17 2009, Amazon announced a new set of services. Included are two features I always wanted to see in their portfolio, namely auto scaling of an entire web application as well as load balancing. Unsurprisingly, Amazon calls these features Auto Scaling and Elastic Load Balancing.

Load balancing was something that, from a developer’s point of view, I didn’t want to be bothered with. Sure, I wanted it to work, but I didn’t want to get it to work. In the past, developing an application on top of an Infrastructure as a Service (IaaS) provider like Amazon often involved setting up your own load balancer, for example by creating a new instance from an Amazon Machine Image (AMI) with some suitable load balancing software. This was when the offerings of a Platform as a Service (PaaS) provider like Microsoft’s Azure platform became quite compelling (For differences between IaaS and PaaS, have a look here). Hosting an application on Azure already includes / will include load balancing, auto scaling and auto curing. That’s what PaaS is about: It is a more specialized stack than IaaS taking away some of your freedom of choice (Azure currently supports .NET and PHP apps while IaaS lets you use virtually everything) but in turn offers some features that IaaS normally doesn’t provide out of the box.

This is the reason why this offering affects Microsoft: it increases pressure on the development of the Azure platform. Amazon just took away one of the reasons to choose a tightly integrated PaaS provider over an IaaS provider. Of course, that feature has to come with a price, otherwise Amazon would lose money for all the running instances that currently do the work of a load balancer. I think the price is reasonable; it is currently “$0.025 per hour for each Elastic Load Balancer, plus $0.008 per GB of data transferred through an Elastic Load Balancer”.

Let’s compare both approaches. Without Elastic Load Balancing, one had to pay for (at least) one instance running the load balancer (assuming this was an extra instance) but no traffic costs, since traffic between EC2 instances within the same Availability Zone is free of charge. We could have used a reserved instance (Linux), so we would have to pay the 1-year fee plus the hourly charges for one year, leaving us with (in US dollars)

$$325 + (365 \times 24 \times 0.03) = \mathbf{587.8}$$

With Elastic Load Balancing, we have to pay

$$365 \times 24 \times 0.025 = \mathbf{219}$$

plus traffic charges, which makes it hard to put any number here. Still, we can tell how many GB we could serve through Elastic Load Balancing before it gets more expensive than our first solution:

$$(587.8 - 219) / 0.008 = \mathbf{46{,}100}$$

That is 46,100 GB. That’s right, that’s about 45 terabytes. I say that’s fair enough to get anyone started 🙂
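For anyone who wants to replay the calculation with different prices, here is a small sketch of the same break-even arithmetic. The prices are the ones quoted in the text; the constant and variable names are my own and purely illustrative.

```python
HOURS_PER_YEAR = 365 * 24

# Prices quoted in the text (USD).
RESERVED_ONE_TIME_FEE = 325.0  # 1-year fee for a reserved Linux instance
RESERVED_PER_HOUR = 0.03       # hourly charge for that reserved instance
ELB_PER_HOUR = 0.025           # Elastic Load Balancer, per hour
ELB_PER_GB = 0.008             # Elastic Load Balancer, per GB transferred

# Option 1: run your own load balancer on a reserved instance for one year.
own_balancer_per_year = RESERVED_ONE_TIME_FEE + HOURS_PER_YEAR * RESERVED_PER_HOUR

# Option 2: one Elastic Load Balancer for one year, before traffic charges.
elb_base_per_year = HOURS_PER_YEAR * ELB_PER_HOUR

# Traffic volume at which option 2 becomes as expensive as option 1.
break_even_gb = (own_balancer_per_year - elb_base_per_year) / ELB_PER_GB

print(f"Own load balancer:   ${own_balancer_per_year:,.2f} per year")
print(f"ELB without traffic: ${elb_base_per_year:,.2f} per year")
print(f"Break-even traffic:  {break_even_gb:,.0f} GB per year")
```

The comparison ignores everything that is hard to price, such as the time spent setting up and patching your own load balancer.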

We have to keep three things in mind here:

1. Normally, we wouldn’t use only one load balancer for the first solution because this would be a single point of failure. But if we trust Amazon’s Elastic Load Balancer to be inherently fault tolerant, we still could use one Elastic Load Balancer instead of two, making this solution even more worth the money.
2. You can keep the amount of GB flowing through the Load Balancer small. You do not have to serve all your pictures, videos etc through your web server, instead you should use another server or S3 or CloudFront or whatever. This separation for media files is no special do-fancy-stuff-to-reduce-my-amazon-fee workaround; it’s a common approach.
3. In a small company / startup, I would definitely go for Elastic Load Balancing because it has this nice aaS aspect: It hides complexity from me, thus saving me time and nerves. It also integrates nicely with other services from Amazon AWS such as auto-scaling and monitoring.

Combined with Auto Scaling, Amazon makes it possible for an application to react to increasing and decreasing traffic, distribute load across several servers and even heal itself by replacing unresponsive instances with freshly booted ones which register themselves with the load balancer.

This will probably also give a headache to the guys over at RightScale and Scalr, who filled the gap by providing monitoring, auto-scaling, load balancing and self-healing on top of Amazon AWS. Of course, their services are more comprehensive than Amazon’s beta features but they are also more expensive. Over at RightScale it says “The RightScale Platform is available in editions starting at $500 a month with a one-time integration and access fee of $2,500.” For small companies, Amazon just saved the day.

Wrap-up: I believe the new offerings from Amazon (Auto Scaling, Monitoring and Elastic Load Balancing) are very attractive to developers with limited knowledge in infrastructure and network topics as well as to start-ups or other small businesses which do not want to pay for full-blown equivalents like RightScale. Amazon expands a core concept of cloud computing (grow dynamically, pay dynamically) to new features.

# Azure Storage Manager

Recently I’ve been playing around with Windows Azure and wanted to get the log files for my hosted app.

I tried to get the logs using PowerShell and that worked in one case, on another box I got errors with PowerShell and couldn’t quite tell why. Anyway, I found it tedious to set up. What I wanted was a point + click solution to have all my logs on the hard drive. Another time I realized I created a lot of tables with Azure Table Storage and wanted to clean them up. I was missing a simple tool that would help me with these tasks. So I sat down and fired up Visual Studio.

I gave this app the humble name “Azure Storage Manager” as it can deal with tables as well, at a later point in time possibly even with queues.
Currently it can do this to tables:

• List
• Delete

ahem, that’s about it. Now for blobs:

• List + show properties (well, some)
• Delete
• Copy to hard drive
• All of the above also applies to whole blob containers. This is important because it allows you to get a container full of blobs with one click. Multi selection of blobs and containers is also supported.

It can store and use different account settings, which might come in handy if you happen to have different storage projects on Windows Azure.

This app is completely standalone in that it does NOT require PowerShell or Azure Development SDK installed. It worked for me under XP, Vista and Windows 7, which is hardly surprising as this is what the .NET runtime is for. Just wanted to make the point 🙂

The following walkthrough shows how the application works.

When you start the application for the first time, it will complain that there are no settings stored. You will be presented with this screen where you can enter the information that has been given to you when you created your Azure Storage Project. You need to fill in your account name and shared key. The actual endpoints are being inferred from that information. After you press Save, your account settings will be persisted as XML files. To do so, you should specify where to put the files using Set Folder.

In case you already have a folder with settings files in it, simply set the folder and it will show all the settings in the list on the left. Double-clicking on an item in the list will apply these settings to the application, just like pressing Use Setting does.

After that initial setup, you can switch to the Blob tab and you will find a listing of your blob containers and blobs. You will have to set a path in order to be able to copy blobs or containers. In this screenshot a container is selected, so any delete or copy operation will apply to the whole container including all contained blobs. Copying on container level adds the container name to the path you specified; in this example it would create a folder C:\AzureBlobs\production to put all the listed blobs in.

Here, we selected multiple blobs. As you can see, copy and delete now work on blob level. In this case, the container name is excluded from the save path.
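The path rule described above can be sketched in a few lines. This is not the application's actual code (that is .NET and not published yet), just an illustration of the rule; the function `target_path` and the blob name `app.log` are hypothetical.

```python
from pathlib import PureWindowsPath


def target_path(base_folder: str, container: str, blob_name: str,
                container_level: bool) -> PureWindowsPath:
    """Where a blob ends up on disk.

    Container-level copies insert the container name into the path;
    blob-level copies save directly into the chosen folder.
    """
    base = PureWindowsPath(base_folder)
    if container_level:
        return base / container / blob_name
    return base / blob_name


# Copying the whole "production" container:
print(target_path(r"C:\AzureBlobs", "production", "app.log", container_level=True))

# Copying a single selected blob:
print(target_path(r"C:\AzureBlobs", "production", "app.log", container_level=False))
```

`PureWindowsPath` is used so the sketch produces Windows-style paths on any operating system.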

I might post the code for this app in a more technical follow-up, however I have to warn you that according to Phil Haack’s Sample Code Taxonomy, this is still prototype code. It works for me and does its job if you treat it well, but the amount of error handling that is not included in this code is tremendous 😉

What do you think about it?

# Cloud Computing demystified

When putting my head into The Cloud for the first time, I found it hard to understand what Cloud Computing means as a term and what it means to me as a developer, specifically:

• What are the core concepts of Cloud Computing?
• What are the different types of Cloud Computing?

A short while ago, I happened to dig into this topic because I was giving a talk on Cloud Computing. I’ll post a quick roundup in case anybody is interested.

I like to compare the term Cloud Computing to AJAX. Just like with AJAX, the technology behind Cloud Computing is not completely new or even revolutionary. And just like with AJAX, it actually makes a fair bit of sense to give a bunch of known technologies a new name once you herd them together for a specific purpose, because now whenever someone uses this new term everybody else will know what the heck he is talking about, including

• What technology is being used
• What kind of problems are likely being addressed
• That he is indeed very cool

Or so it should be. I have to say, since Cloud Computing became such a hype and everybody is doing Cloud right now, it doesn’t help in understanding what actually is going on.

Enough talk, you might say, and you’d be right. Let’s get started.

# Core Concepts

The most important idea behind Cloud Computing is <Anything> as a Service. Again, the idea of aaS is not brand new. As Rob Conery once pointed out (can’t quite remember where it was), by taking a taxi you’re already leveraging the idea of aaS. Rob called this Car as a Service, and we will quickly look at what aaS means and how we can apply it to cars.

• No acquisition costs, no overhead costs: You did not have to buy a car to get you to the train station and you do not have to pay a fixed amount of money per month to get the right to call a taxi whenever you need it.
• Variable costs: You only pay for what you consume. In case of a taxi, this is being calculated by distance and time.
• Hidden complexity and maintenance: Some of the major headaches in using a system are being abstracted so you don’t have to care about them. In this case, you don’t care how difficult it is to start that old engine or how often, why and when they have to take it to a workshop to keep it from falling apart as long as you still catch your train.
• Economies of scale: Okay, maybe our example doesn’t work too well for that one, but imagine that they’re buying lots of taxis and therefore get an amazingly low price, and then you as a customer will benefit from that. Right?
• Scalability: In some scenarios, most of the time you need very few resources and suddenly there is a peak and you need lots of them. For example, one day you need to get your whole family (really the whole family: children, parents, relatives) to the train station. You certainly don’t want to buy 4 cars and have them standing around the whole time just to be prepared for this rare occasion? Just get 4 cabs instead of one and you’ll be fine. That’s what I call scalability.

That CaaS comparison worked pretty well for me to show what aaS is about. Then I started to wonder what it means when applied to computers instead of cars.

# Different Types

As mentioned above, we can apply this Service principle to virtually anything: just stick one letter in front of aaS and you’re done. Let everyone else figure out what you meant by it. There are a lot of services in the area of Cloud Computing, which is one reason why it looks like such a big mess when you first look at it. I found it helpful to distinguish between 3 different types of Cloud Computing, although there are more and one size doesn’t fit all.

## Infrastructure as a Service (IaaS)

In this simplified 3-tier view of Cloud Computing, IaaS is at the bottom of our stack and the most basic layer. It basically provides abstraction over hardware (in fact, IaaS was formerly called Hardware as a Service) by applying the principles of aaS to bare metal. I like to see Amazon Web Services (AWS) as a good example for IaaS. Their services include:

• Elastic Compute Cloud (EC2): provides you with virtual machines (they are using Xen virtualisation) that are running in Amazon’s datacenter once you have fired them up (the VMs, not Amazon). Their EC2 pricing model says you pay them on an hourly basis for that. That’s it! Here is our aaS principle again: No registration fees, no monthly fees, no overhead costs.
• Simple Storage Service (S3): This is a persistent storage service (see, we could already call that Storage as a Service). It goes like this: Once you’ve uploaded your file, they will replicate it so you don’t have to think about backup. You will pay them based on your usage which is, in this case, measured by 1) storage size, 2) time and 3) data transfer (in and out)

They have a lot of other services and are constantly adding new ones, however these two are what I believe to be the most commonly used ones. All of their services are directly accessible as web services (useful for programmatic control) as well as through their management console.

In IaaS, you are admin for every virtual machine you create. You can install any software you need (web server, database and so on), and you can use the programming language of your choice to implement the application you want to host. This gives you great flexibility, but also great responsibility because, at this point, it’s only the hardware you do not have to care about. You still have to patch your OS and all the software you installed, for example. In theory, this gives you infinite scalability in a matter of minutes, since all resources are virtualized and at any given point in time you could step up and say “give me two more of these”.

Now let’s move on in the stack.

## Platform as a Service (PaaS)

PaaS provides more abstraction than IaaS because in addition to hardware, it also provides an abstraction layer for the OS as well as commonly used components like web server, database, load balancer and the like. One example of PaaS is Microsoft’s Azure platform. Although we’ve been told that Azure is an OS for the cloud, we never actually see this OS because we’re using PaaS.

We’re not admins, we don’t get access to the OS which means we can’t choose what to install and we don’t have to. The responsibility for patching and maintenance for all necessary components is now up to the PaaS provider. For every programming language (Azure currently supports .NET and PHP) a PaaS provider will need another set of tools to run your app. You as developer take your application, load it up to the PaaS provider and say ‘run it’.

See the difference to IaaS? PaaS is a more specialised stack which takes away work as well as decisions. If this doesn’t limit you too much (do they support your programming language?), PaaS will get you started more quickly.

Time for the next step.

## Software as a Service (SaaS)

With SaaS, we’re leaving the Cloud Computing domain for developers and entering the one for end users. What can Cloud Computing do for end users? All the principles of taxis should apply here too.

With SaaS, you don’t buy the software. You only pay for it when you use it. You don’t have to install it on your machine, it’s running somewhere else and someone else takes care of it. You just use it. Sometimes it’s even free until you need more advanced functionality. Picnik, for example, lets you edit your pictures online, right in your browser. It’s free and gives you enough options for the occasional picture manipulation session, removing the need to install a full-blown image manipulation solution on your machine.

## To summarize …

There are 3 manifestations of Cloud Computing:

• Infrastructure as a Service (IaaS)
• Platform as a Service (PaaS)
• Software as a Service (SaaS)

With CC, you get different levels of SEP (Somebody Else’s Problem). Depending on which layer of Cloud Computing you’re using, someone else will take care of necessary hardware, operating system or even a complete application.

The goal is to reduce upfront investments, complexity and time to market and to allow for great scalability and economies of scale using virtualized resources.

Usage in Cloud Computing typically is measured as

• Time (CPU time or real time)
• Data transfer (inbound and outbound)
• Storage (measured in MB or GB)
• Number of transactions (like GET and PUT)
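To show how these four dimensions add up to a bill, here is a minimal sketch. The rates are the Azure prices quoted in the pricing comparison earlier on this blog; the usage figures are invented, and the names `Usage`, `Rates` and `monthly_bill` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    compute_hours: float      # time
    storage_gb_months: float  # storage
    transactions: int         # number of storage transactions (GET, PUT, ...)
    gb_in: float              # inbound data transfer
    gb_out: float             # outbound data transfer


@dataclass
class Rates:
    per_compute_hour: float
    per_gb_month: float
    per_10k_transactions: float
    per_gb_in: float
    per_gb_out: float


def monthly_bill(usage: Usage, rates: Rates) -> float:
    """Sum the charge for each metered dimension, rounded to cents."""
    total = (
        usage.compute_hours * rates.per_compute_hour
        + usage.storage_gb_months * rates.per_gb_month
        + usage.transactions / 10_000 * rates.per_10k_transactions
        + usage.gb_in * rates.per_gb_in
        + usage.gb_out * rates.per_gb_out
    )
    return round(total, 2)


# Azure prices from the pricing comparison table (USD).
azure_rates = Rates(0.12, 0.15, 0.01, 0.10, 0.15)

# Invented usage: one instance for 30 days, 50 GB stored, 2 million transactions.
sample_usage = Usage(compute_hours=720, storage_gb_months=50,
                     transactions=2_000_000, gb_in=10, gb_out=100)

bill = monthly_bill(sample_usage, azure_rates)
print(f"Estimated bill: ${bill:.2f}")
```

With numbers like these, compute time dominates the bill, which is why the smallest available instance size matters so much for small services.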

I hope this helps somebody see Cloud Computing in a less foggy way.