Why it’s so hard to externalize stuff

I’ve been chatting with somebody about helping them externalize software they built for an internal application, and it reminded me of a gnarly problem I faced nearly 20 years ago when I was leading the effort that is now called Fulfillment By Amazon (FBA).

The problem had to do with semantic coupling, which is a particularly insidious form of coupling. It’s insidious because it was probably good when it was put in, isn’t explicitly in any API, and is very hard to take out.

The goal of FBA was to allow 3ʳᵈ-party merchants to use the Amazon fulfillment network. At the time (2005), you could buy products on the Amazon website where the seller was not Amazon, but if you did, the seller had to ship the product to you from their facilities.

We wanted to allow 3ʳᵈ parties to buy inventory and ship it to Amazon fulfillment centers. Then when a customer ordered that product, Amazon would box the item up (maybe with other Amazon-fulfilled orders) and get it to the customer.

As we started looking at what we’d need to do to offer this service, we quickly found a big problem: the ASIN. The ASIN (Amazon Standard Identification Number) is the unique identifier for every product on the Amazon website. So what was the problem?

The ASIN uniquely identifies what a shopper thinks of as a particular “thing.” For example, What If? in paperback has one ASIN (1848549563), while the hardcover version has a different one (0544272994).

At the time we were starting FBA in 2005, those ASINs were also used for fulfillment. So if I ordered three different items, a list of those three ASINs would be sent to the fulfillment network, where pieces of inventory matching each ASIN would be boxed up and shipped.

This worked great when all the inventory in our fulfillment centers was owned by Amazon, but if we have a paperback What If? owned by Amazon in our FC and another owned by Bob’s Books, when the customer buys from Bob, the ASIN isn’t enough information to know which one to ship.

Now maybe you can make an argument that the two physical books are “the same” and it doesn’t matter which one you ship (though google “FBA commingled inventory” for more on that). But a primary goal of FBA was to attract used book sellers.

Obviously, in some sense, all used copies of What If? are the “same” - one of the strengths of Amazon is presenting a single web page for What If? that shows all the offers. But after I buy one with a particular wear description, Amazon needs to ship that one.
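To make the ambiguity concrete, here is a minimal Python sketch. It is only an illustration, not how Amazon's FC software worked: the `Unit` class, its `owner` and `condition` fields, and both pick functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One physical item on a shelf (hypothetical model for illustration)."""
    asin: str
    owner: str       # assumption: who owns this piece of inventory
    condition: str   # assumption: free-text wear description


# Two copies of the paperback What If? sitting in the same FC.
shelf = [
    Unit("1848549563", "Amazon", "new"),
    Unit("1848549563", "Bob's Books", "used, worn cover"),
]


def pick_by_asin(shelf, asin):
    """ASIN-only pick: returns every unit that matches the ASIN."""
    return [unit for unit in shelf if unit.asin == asin]


def pick_exact(shelf, asin, owner, condition):
    """Pick that also carries the seller and condition the customer chose."""
    return [
        unit for unit in shelf
        if (unit.asin, unit.owner, unit.condition) == (asin, owner, condition)
    ]


# The ASIN alone matches both copies, so the FC cannot tell which to ship.
candidates = pick_by_asin(shelf, "1848549563")

# With owner and condition in the request, exactly one copy matches.
exact = pick_exact(shelf, "1848549563", "Bob's Books", "used, worn cover")

print(len(candidates), len(exact), exact[0].owner)
```

The ASIN-only request is perfectly adequate as long as every unit on the shelf has the same owner and condition, which is why the missing information went unnoticed for so long.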

We’ve gotten into the heart of the problem now. When Amazon launched in 1995, the ASIN¹ represented a thing for sale on the web site, implicitly new and sold by Amazon². The FC software took that identifier and used it with all the implications left unstated.

Once 2005 rolled around, the implications of seller and condition had been realized and made explicit on the website, but the FC software hadn’t needed to do the same. The website software and the FC software were coupled by the semantics of the ASIN, but those semantics had started to drift.

In Code Complete, Steve McConnell writes about the dangers of semantic coupling, and describes it as one module using knowledge of another’s inner workings. That’s true, but in my experience that kind of coupling is easy to avoid. Our example is one that is much harder to avoid.

Our example is one where we are modeling in code something in the real world. The ASIN represented a “book for sale” on the website, and a “book to be fulfilled” in the FC software. These two concepts seemed to be the same when Amazon launched, and practically they were.

The challenge with modeling the real world in code is figuring out which parts of the world to model and which parts to elide. In general, you want the model to be as simple as possible but no simpler. The problem is that the definition of “no simpler” changes over time.

Over time, as the system evolved, it became clear that these were two separate concepts and needed to be modeled as such. The FC software needed an ID for a “fungible physical thing for order fulfillment” as opposed to a “thing that customers think of as a unique product.”

These two IDs don’t really need to have anything in common with each other. If you were building a company that only did fulfillment, you’d have an ID for that concept, and it would have attributes such as weight, dimensions, shipping restrictions, etc.

But if you had those two concepts as separate, you’d need some way to tie them together. An inventory system would have to keep track of the correspondence between, for example, the ID for What If? that the website uses and the one the FC uses for that physical book.
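A rough sketch of that separation, again in Python and again purely hypothetical: `Offer`, `FulfillmentItem`, `Inventory`, the `F-0001`-style IDs, and the weight values are all made up for illustration and say nothing about Amazon's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """What the website sells: a product as the customer sees it."""
    asin: str
    seller: str
    condition: str


@dataclass(frozen=True)
class FulfillmentItem:
    """What the FC handles: a fungible physical thing."""
    fulfillment_id: str   # hypothetical ID format, unrelated to the ASIN
    weight_grams: int     # placeholder value, not a real measurement


class Inventory:
    """Keeps the correspondence between website IDs and FC IDs."""

    def __init__(self):
        self._by_offer = {}

    def register(self, offer, item):
        self._by_offer[offer] = item

    def resolve(self, offer):
        """Translate a website offer into the ID the FC understands."""
        return self._by_offer[offer]


inventory = Inventory()
amazon_offer = Offer("1848549563", "Amazon", "new")
bobs_offer = Offer("1848549563", "Bob's Books", "used, worn cover")
inventory.register(amazon_offer, FulfillmentItem("F-0001", 300))
inventory.register(bobs_offer, FulfillmentItem("F-0002", 300))

# Same ASIN on the website, two distinct things to pick in the FC.
print(inventory.resolve(amazon_offer).fulfillment_id)
print(inventory.resolve(bobs_offer).fulfillment_id)
```

The point of the sketch is that the FC side never sees the ASIN at all; only the inventory mapping knows how the two identifiers relate.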

Also, much of the information that the FC systems need, the website needs as well. So either you’d have to duplicate that information and keep it updated, or the inventory system would have to keep track of it all.

Obviously Amazon has that system now, because the two IDs have been broken apart. But in 1995, it was much faster to tie all that information together with this one ID. So ASINs modeled both the thing for sale and the physical item.

It would be easy in 2005 to look back and say, “They should have thought harder about this coupling and modeled it better.” I’ve heard some version of that refrain dozens if not hundreds of times in my career.

But I dunno. Maybe. It’s easy in retrospect to see the axes of flexibility that are important and the places where you need independence. But at the time, you might be able to come up with a hundred places to add flexibility, of which only twenty would add value.

While I do think it’s very important to think clearly about the concepts you are modeling (and writing things in prose helps with this a lot), I also think it’s important to move fast and ship software. Hopefully the value that senior engineers bring is discerning that balance.

One way I’ve gotten engineers to think about where they have coupling between internal systems is to think about what would have to change if you replaced one part of that system with something sold by a 3ʳᵈ party.

In a sense, that was what we were doing with FBA, but in reverse. We needed to convert the Amazon fulfillment network from internal software (where semantic coupling is possible and sometimes good) to an externalized service (where it’s not possible).

Whenever I’ve been involved in an effort to externalize software that was written for internal use, uncovering and fixing these places with semantic coupling has been the hardest part of the job.

It’s why in AWS we have only rarely tried to directly externalize internal software. DynamoDB was based on what we learned building the system described in the Dynamo paper, but it was written from scratch to be an external service.

CloudWatch was based on what we learned building our internal metrics system, PMET, but it was written from scratch to be an external service.

We couldn’t do that with FBA, because the scope of all the fulfillment software was just too big. So we powered through, but it was a lot of work figuring out how to separate the concepts and retrofitting the software in place.

FBA has obviously been a huge success (though Jeff said it took too long to launch), so I wouldn’t say never try to externalize an internal service. But be very aware that the hard parts are likely things that you aren’t thinking of.

As I wrote at the top of the thread, this was inspired by some consulting work I’ve been doing. If you are interested in learning more, see my services page.

¹ Well, it’s more complicated than that. Amazon launched using ISBNs, which are standard IDs assigned by the book publishing industry. Unfortunately, ISBNs aren’t unique - they get reused, meaning two different books can have the same ISBN.
² In the beginning, Amazon did sell used out-of-print books, so really an ISBN could either represent a new in-print book or a used out-of-print book.

Andrew Certain