AES-GCM
Implementation & Analysis Papers
For some very good and very boring reasons I have been digging into AES-GCM (Galois/Counter Mode) implementation. I found a number of papers that analyze GCM and describe various aspects of secure, fast implementation.
Analysis
Implementation
- Faster and Timing-Attack Resistant AES-GCM, Käsper & Schwabe, 2009
- Software Optimizations for Cryptographic Primitives on General Purpose x86_64 platforms, Gueron, 2012
- Optimized Galois-Counter-Mode Implementation on Intel® Architecture Processors, Gopal et al., 2010
- Fast Cryptographic Computation on IA Processors Via Function Stitching, Gopal et al., 2010
—May 07, 2012
Origins of the Internet
Fundamental Publications
I’ve had a long interest in the origins of the Internet, both the people and the seminal publications. A few years ago I put together a short presentation on some of that history and how it might relate to the evolution of the cloud.
I thought folks might be interested in some of the original publications directly or indirectly referenced in that presentation, so I’ve collected them here. Please contact me if you have copies of PDFs you think are missing from my list (especially Leonard Kleinrock’s work).
- A Mathematical Theory of Communication, Shannon, 1948
- Reliable Digital Communications Systems Using Unreliable Network Repeater Nodes, Baran, 1960
- On Distributed Communications Networks, Baran, 1962
- On Distributed Communications, Baran, 1964
- The Aloha System – Another Alternative for Computer Communication, Abramson, 1970
- The Ethernet Memo, Metcalfe, 1973
—Jan 25, 2012
Some Rules for Engineering and Operations
The best solution to a problem is not to have it.
An insufficiently ugly temporary hack is permanent.
There is no such thing as standby infrastructure: there is stuff you always use and stuff that won’t work when you need it.
The first fallacy of automation is making machines perform each step of a manual human process.
These are not features: Security, Availability, Performance.
—Jan 24, 2012
Service Level Disagreements, Part 2
Yesterday, I explained the dangers of the common misunderstanding of service level agreements as insurance policies. While I mentioned a strategy of using multiple vendors rather than relying on the SLA offered by a single vendor, some more specific details will be useful in understanding and internalizing this approach.
Over the past ten years I have participated in or led negotiations for internet and CDN bandwidth at Internap, Amazon, and Microsoft. At first I invested significant time and effort in defining SLAs, methodology, metrics, and penalties, as is common practice. Two things eventually became apparent:
- Defining meaningful SLAs for public internet services, as opposed to private telco links, is not generally possible.
- SLA failure penalties are insufficient compensation for business impact.
From this experience and these realizations I changed my approach significantly. The two facets of the new strategy were, and are:
- Only enter into contracts with as small a traffic commitment as feasible and with no penalties for termination, regardless of cause.
- Engage multiple vendors for all bandwidth services.
Availability, which is always the responsibility of the customer, is now actually under the customer’s control, rather than being delegated to a vendor via an SLA. Should a vendor fail to deliver the desired service level, even for a short period of time, traffic can be shifted to other vendors until quality improves. Should a vendor prove too unreliable to use at all, their services can be terminated and other vendors brought in to replace them.
To make best use of this strategy it is important to have proper software support in place. For example, a single CDN vendor should be used for all content on each page served, with the vendor varied dynamically across requests; mixing multiple CDN vendors on a single page can actually reduce availability. Similar traffic engineering can be done for requests to your own web servers using DNS-based global load balancing, though with coarser granularity. Similar principles will apply to “the cloud” as the interfaces and functionality in the space are commoditized.
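The per-page vendor selection described above might be sketched as follows. This is a minimal illustration, not a production traffic manager: the `CdnVendor` type, the vendor names, the `example` hostnames, and the weights are all hypothetical, and the `healthy` flag is assumed to be maintained by some external monitoring process.

```python
import random
from dataclasses import dataclass


@dataclass
class CdnVendor:
    name: str            # hypothetical vendor label
    base_url: str        # hypothetical hostname for this vendor
    weight: float        # relative share of page views to send here
    healthy: bool = True # assumed to be set by external monitoring


def pick_vendor(vendors, rng=random):
    """Choose ONE vendor for an entire page view, weighted among healthy vendors."""
    candidates = [v for v in vendors if v.healthy and v.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy CDN vendor available")
    weights = [v.weight for v in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def asset_urls(vendor, paths):
    """Build every asset URL on the page from the same vendor."""
    return [f"{vendor.base_url}/{p.lstrip('/')}" for p in paths]


if __name__ == "__main__":
    vendors = [
        CdnVendor("vendor-a", "https://cdn-a.example", weight=3.0),
        CdnVendor("vendor-b", "https://cdn-b.example", weight=1.0),
    ]
    chosen = pick_vendor(vendors)
    for url in asset_urls(chosen, ["/css/site.css", "/js/app.js"]):
        print(url)
```

Because the choice is made once per page view, each page depends on a single vendor being up, and shifting traffic away from a degraded vendor is a matter of lowering its weight or clearing its `healthy` flag.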
As Heinlein said, TANSTAAFL, and high-availability distributed systems are not exceptions. You are responsible for your availability. Understand clearly the business value to you of a vendor SLA and be prepared to change your strategy, and put in the technical and contract work required, if it will not meet your business needs.
—Jul 16, 2009
Service Level Disagreements
Vijay posted a (better late than never) rebuttal to a post from November last year by Joe Weinman of AT&T. I agree with all the points Vijay makes, and want to focus in on a particular area of Joe’s article:
(4) SLAs with financial penalties — Not only won’t enterprises accept “Well, after all, it’s still in beta” as an excuse for service outages, they demand meaningful SLAs (service level agreements) with clear metrics for evaluating achievement of those SLAs, backed up by monitoring and management systems, and financial penalties such as credits or refunds if service levels aren’t met. A “free” or low-cost service with questionable delivery quality is about as attractive to a CIO as an offer of free neurosurgery from someone who just skimmed a blog on how to do it in three easy steps.
Ah, the mighty service level agreement! The tooth and claw by which the wily customer brings the vendor to heel. Get the SLA right and you, the customer, can sit back and relax, safe in the knowledge that should there be an outage, you are covered. Your business is protected from harm by the warm, experienced embrace of a big, stable telco. Pinch me, I must be dreaming.
Vijay refers to SLAs as “an actuarial game”. The situation is rather worse than that. The trouble is that many intelligent people mistake an SLA for an insurance policy. It most definitely is not.
An insurance policy is purchased for a price, often based on actuarial tables, that reflects the risk of the policy being paid out and the size of the payout. The value of the policy is that it is a hedge: in the event of a claim, the holder is compensated for (approximately) the full value lost. The insurance industry is predicated on most policy holders paying in far more over the life of their policies than they are paid out, and on there not being catastrophic events that cause simultaneous claims by a large number of policy holders.
A service level agreement does not work this way. An SLA is not a hedge against the business impact of an outage: it is a refund policy. The maximum value of an SLA ‘claim’ is your monthly bill. The cost to your business of an SLA failure is likely to be far higher, but you will not be compensated for that loss. A six-hour service outage might cost your small business 10,000 dollars. Receiving a 500 dollar service credit is cold comfort.
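The gap between the refund and the loss can be made concrete with a small calculation. This is only a sketch using the figures from the example above; the function names are made up for illustration, and the 500 dollar monthly bill with a full refund is an assumption chosen to match the 500 dollar credit.

```python
def sla_credit(monthly_bill, credit_fraction):
    """Service credit for an SLA failure, capped at the monthly bill."""
    return min(monthly_bill, monthly_bill * credit_fraction)


def uncovered_loss(business_loss, credit):
    """Portion of the business loss that the SLA credit does not cover."""
    return max(0.0, business_loss - credit)


if __name__ == "__main__":
    # Assumed: a 500 dollar monthly bill, refunded in full.
    credit = sla_credit(monthly_bill=500.0, credit_fraction=1.0)
    loss = 10_000.0  # cost of the six-hour outage in the example
    print(f"credit: {credit:.0f}, uncovered: {uncovered_loss(loss, credit):.0f}")
```

However generous the credit terms, the cap at the monthly bill means the uncovered amount grows with the business impact rather than with the price of the service.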
SLA failures become more common as you move up the stack from the rigid, extremely well-characterized, layer 1 telco sweet spot. Outages that impact large sections of your customer base simultaneously are inevitable in large-scale, shared software infrastructure. If SLAs were insurance policies, vendors would quickly be out of business.
Given this, the question remains: how do you achieve confidence in the availability of the services on which your business relies? The answer is to use multiple vendors for the same services. This is already common practice in other areas: internet connection multihoming, multiple CDN vendors, multiple ad networks, etc. The cloud does not change this. If you want high availability, you’re going to have to work for it.
—Jul 15, 2009