wrong tool

You are finite. Zathras is finite. This is wrong tool.


25 architecturalist papers: playing chess while everyone else plays checkers

February 25, 2020 by kostadis roussos

One of the tough challenges of being a strategic software architect is that it is not a well-understood job. Many people ask me what I do, and after several fumbling minutes, I point them to this blog.

The most recent analogy that became somewhat useful is that strategic software architecture is about playing chess while the rest of the world plays checkers. What I mean is that the world looks at short time horizons, planning the next step, while the role requires planning two or more steps ahead, all while simultaneously making moves in the current phase.

Strategic software architecture is not about the stuff that is shipping now. It’s about the thing that will ship in the future. And the job is not to deliver the current thing. The job is to make sure that the team can deliver the current stuff without you.

As that strategic software architect, if you are working on an immediate deliverable, it means you failed your team a long, long time ago. If they need your help, it means you didn't provide them the resources, the people, or the strategy that would ensure their success. Fail enough times, and there is a new strategic software architect.

In most software companies, the planning cycle doesn’t extend out longer than 18 months. The world changes too fast for anything more than that. And so nobody is thinking past 18 months.

There is one group that is thinking past 18 months: these strategic thinkers. They are a critical, insufficiently recognized group across a large number of business functions, but this is about engineering, so I am focusing on that. Because the role doesn't formally exist, isn't recognized, and offers no rewards for long-term success, it raises the question: does it exist at all?

There are many people like that at a company. They are the ones who seem to generate magic just when the company needs it. They continue to deliver value, and nobody knows why or how. As engineers, they do it with well-timed code reviews, whispers to the right people, working on the right project, checking in something that nobody expected. They call meetings to discuss things in private, and thereby create a social network that is impenetrable, built around the respect they have earned and the reputation they have acquired.

And over time, the company strategy becomes the strategic thinkers' strategy, even if the company thinks otherwise. For many reasons, beginning with hiring that is shaped by their biases. What is easy to build and what is hard to build is what they and their social network think is easy and hard. What can be created is controlled by their tastes. What is easy and hard to do is shaped through the myriad of small technical decisions that make change very hard. And their software architecture ossifies their decisions through org charts that can endure long after the code choices that formed them have ceased to be relevant.

Currently, this entire area is left to chance. We are lucky when we hire people who can do the job. And I have seen it in my career: groups that, seemingly out of nowhere, keep doing the right thing. Things keep getting better, and I can't figure out why. And finally, somewhere, someone turns up who has a plan. That plan exists in their head, or on a piece of paper, or on a Confluence page that nobody reads but that everybody is working from.

I believe that a company that figures out how to practice strategic software architecture as a discipline and incorporates it into the 18-month planning cycle has a decisive advantage. I also believe that if they can couple this approach with a rabid focus on immediate delivery, they can't lose.


Filed Under: Architecturalist Papers

24 architecturalist papers: how to not engage the A-team.

February 18, 2020 by kostadis roussos

There are many reasons to change jobs. Some are better than others. But the best reason is that your peers or your boss don't engage you for your best work.

Let me start with a story. In 2009, I was a technical director at NetApp. At many other companies, this role is analogous to a senior technical leader with a pay grade that is equivalent to that of a director-level manager.

At the time, NetApp was engaged in a multiyear effort to converge the operating system of Spinnaker, a company it had acquired, with its own platform, ONTAP. After the first unsuccessful attempt with the product known as ONTAP GX, the technical and business leadership of the company rallied around a strategy that eventually became what is now known as ONTAP 8.0.

This effort required vast amounts of synchronization. At one point, I was the architect for the data protection portion of the business. The overall architect for the effort sent me an email with a detailed task breakdown of what the data protection team needed to accomplish over the next two years. As I looked at the list, I had some concerns with some of the details and with the overall direction, and so I flippantly responded with, "This is so detailed that I'm not sure what value I'm going to add. Why don't you just send it directly to the management team and the product managers?" And the author of the email took my response and forwarded it onward with his comment: "You're right."

After that, it took very little for me to want to leave NetApp.

Over the years I have wondered why this particular exchange was so critical in my leaving NetApp. NetApp was at the time treating me well. I would’ve made more money at NetApp than I did at Zynga. I had a five-week vacation at NetApp. I was well respected by the then CTO and chief science officer. My boss at the time and I had some differences of opinion, but those differences of opinion were resolvable. And the problem space remained interesting.

So why did I leave?

I left because, at the end of the day, that other architect did not want to engage with my best self. He was not interested in working with me to come up with the right answer. He just wanted me to do exactly as I was told and to take accountability for his decisions.

In short, what I heard him say is, “I don’t need you to think. I need you to do exactly as you’re told and to make sure that the things I need done are done.”

Over the years, I have seen this pattern play out again and again, sometimes with me on the receiving end of that message and sometimes as its author. What I have concluded is that if you're engaging someone to solve a critical business problem and you talk to them in a way that demonstrates you see no value in their abilities, then you're asking them to leave.

But there is another, more insidious problem that can also occur. If you're trying to engage another team whose systems you happen to know a lot about, it is tempting to bypass the current architect and just tell them to do this and that. The problem is that you're not engaging the team to solve your problems. And this kind of communication discourages precisely the people you need: the people who want scope to think and to imagine possible solutions. They decide not to work on your problems.

If the problem is small in scope, this is not necessarily a bad thing. If the problem is significant in scope, it is a calamitous decision. You've traded a deeper understanding of a hard, complex problem for the delivery of a small set of critical capabilities. You have reduced a team that could add value by understanding the problem and thinking about it long and hard to a team that will do exactly as it is told and no more.

“But that wasn’t the goal,” I have said on more than one occasion.

No, it wasn't. But that is precisely what I have achieved in the past. Because the people who could think realized that there was no room for them to think, only to do. So instead of looking at a problem and trying to solve it, they saw a list of activities that could be done by someone who couldn't think as deeply as they could. At the end of the day, knowledge workers have a tremendous amount of freedom to choose what problems they work on. They also have a tremendous amount of freedom to decide how long and how hard they want to work on an issue. The single most calamitous decision you can make as an architect is to engage with another architect by treating them as less than an equal.

Over the years, I have made this mistake. To those I treated poorly: my loss is massive, but it is nothing compared to how poorly I treated you. All I can say is that life is about getting better and learning new things. And maybe this goes a little distance as an apology.



Filed Under: Architecturalist Papers

23 architecturalist papers: latency kills

January 17, 2020 by kostadis roussos

While at NetApp, I watched the incredible effort that became known as ONTAP 8.0, which grew out of the Spinnaker acquisition.

From that experience, I learned a few seminal things that continue to resonate. The short version is that latency kills.

Let me start by saying that the hard problem in storage is how to deliver both low latency and durability. Enterprise storage vendors earn their 70% gross margins because of the complexity of solving two goals that appear to conflict: durability requires a copy, and making a copy slows things down.

The solution was, and is, to use algorithms, in-memory data structures, and CPU cycles to deliver both latency and durability.
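As a toy illustration of that trick (my own numbers and function names, not from the post): a naive durable write makes its second copy on slow media before acknowledging, while the in-memory approach commits to a fast log synchronously and destages to slow media off the latency path, so the client pays log latency while two copies still exist.

```python
# Toy sketch, not the actual ONTAP design: hypothetical latencies only.

MEDIA_WRITE_US = 5_000   # hypothetical slow-media commit time
LOG_WRITE_US = 20        # hypothetical NVRAM/log commit time

def naive_durable_write_us() -> int:
    """Ack only after the data and its copy are both on slow media."""
    return MEDIA_WRITE_US + MEDIA_WRITE_US

def logged_durable_write_us() -> int:
    """Ack once the fast log holds the second copy; destaging to slow
    media happens asynchronously, off the latency path."""
    return LOG_WRITE_US

print(naive_durable_write_us(), logged_durable_write_us())  # 10000 20
```

The copy still gets made either way; the algorithmic work is in moving it off the acknowledgment path.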

When Spinnaker was acquired, there was a belief within the storage industry that single-socket performance had reached a tipping point, and that performance could only be improved if we threw more sockets at the problem.

And, in retrospect, they were right. Except that we collectively missed another trend. Although single-thread performance was no longer going to double at the same rate, the performance of media was about to go through a discontinuity and radically improve.

But at the time, this wasn’t obvious.

And so many folks concluded that you could only improve performance through scale-out architectures.

The problem with scale-out architectures is that although latency within a single node can be as good as a single-node system's, remote latency is always worse than local latency.

And application developers prefer, for simplicity, to write code that assumes uniform latency of the infrastructure.

And so applications tend to be engineered for the worst-case latency.

And single-node systems were able to compete with clustered systems. As media got faster, and as single-node performance improved, application performance on non-scale-out architectures was always better.

In short, the scale-out architectures delivered higher throughput, but worse latency.
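The trade-off above can be sketched with a toy model (my own made-up numbers, not measurements): because applications assume uniform latency, they are engineered for the worst case any request can see, which in a cluster is the remote hop.

```python
# Toy model with hypothetical numbers: planned latency is the worst
# case the application can observe; throughput scales with node count.

LOCAL_US = 100    # hypothetical local (single-node) access latency
REMOTE_US = 400   # hypothetical cross-node access latency

def planned_latency_us(nodes: int) -> int:
    """The latency the application must plan for: its worst case."""
    return LOCAL_US if nodes == 1 else REMOTE_US

def throughput_iops(nodes: int, per_node: int = 10_000) -> int:
    """Throughput scales with nodes; planned latency does not improve."""
    return nodes * per_node

print(planned_latency_us(1), planned_latency_us(8))  # 100 400
print(throughput_iops(1), throughput_iops(8))        # 10000 80000
```

Adding nodes grows throughput linearly while making the latency the application designs for strictly worse, which is the whole argument in miniature.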

And it turns out that throughput workloads are not, generally, valuable.

And so scale-out for performance has its niche, but it was not able to disrupt non-scale-out architectures.

Over time, clustered storage systems added value other than performance, but the whole experience taught me that customers will always pay for better latency. And that if there is enough money to be made in a problem space, it will be solved in a way that avoids requiring applications to change.


Filed Under: Architecturalist Papers, Storage

22 architecturalist papers: multi-tenancy and quotas

January 2, 2020 by kostadis roussos

Over the last few years, I have gotten into a series of protracted debates about multi-tenancy.

What I have begun to understand is that it is essential to define the objectives of multi-tenancy before one starts to talk about it.

And even before we get to that, we need to define what multi-tenant means.

Consider a piece of hardware, say a server with four sockets. An individual owns the server. Another individual owns the building in which the server resides.

In effect, when two actors, Mary and Tom, have access to a system, that system is said to be multi-tenant if Mary and Tom do not trust each other.

But how much do they trust each other? The degree of trust determines how much the system must protect Mary from Tom and vice versa. For example, suppose Mary fully trusts Tom. Then Mary doesn't care that Tom has physical access to the hardware, and she takes no actions to protect her data or the applications running on that server. In effect, Mary and Tom are the same tenant; they merely have different roles.

But suppose Mary trusts Tom, yet wants to ensure that Tom doesn't accidentally damage her system. Now identities and roles play a factor. What Mary would like is a role that Tom can use that allows him to do the things he needs to do to Mary's server and no more.

And so this is where things get complicated. There are two basic approaches. The first is to bake into the system the set of controls that Tom has access to, using some role-based access system integrated with some identity system that determines what Tom can do. The problem with such an approach is that if Tom needs to do something that is not in the system, he has no way to do it and has to ask Mary. Now, if Mary is okay with that, all good. However, Mary may not want to do the task herself and may wish to allow Tom to do it. But if the system gives her no way to express that, then she is forced to give him access to more controls than he should have.

The second approach is to use layering. You create a net-new interface that interacts with Mary's system through some APIs, and that net-new interface is what Tom uses. Thus, when Mary wants to enable Tom to do something new, Tom can extend his tool to do it. The problem with this approach is that Tom now has access to a whole bunch of operations he shouldn't have. The only things preventing Tom from using those operations are his adherence to procedure and the fact that, at the end of the day, Tom isn't malicious. He's a good guy.

My observation is that approach one doesn't work. The reason it doesn't work is that the set of operations Tom needs to perform is ever-evolving. Worse, the collection of activities that Mary wishes Tom to do is ever-expanding. And as a result, they end up using the second approach.

Okay, so what?

The problem is that too many people attempt to build the first model. For example, suppose I have an interface for interacting with the system that allows me to create, delete, or modify objects. Then somebody decides that the hierarchy of those objects should reflect some authorization scheme. Then Tom and Mary can't do their jobs, because the hierarchy, or the complexity of configuring it and setting up authorization, is not expressible in the system. In effect, the hierarchy and system that let you create, edit, and manipulate objects for one task are not the same hierarchy you would use for another.

And so, ultimately, what you do is create a tool that has the specific set of operations Tom needs. Mary and Tom configure the tool so that it only does what it needs to do.
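A minimal sketch of that layering approach, with entirely hypothetical names: Mary's system exposes a broad API, and Tom works through a separate tool configured with only the operations Mary allows. Note that nothing technically stops Tom from calling the broad API directly; the tool encodes procedure and convenience, not hard enforcement, which is exactly the trade-off described above.

```python
# Hypothetical sketch of "approach two": a narrow tool layered over a
# broad API. Names and operations are illustrative, not a real system.

class ServerAPI:
    """Mary's system: a broad surface of operations."""
    def restart_service(self, name: str) -> str:
        return f"restarted {name}"
    def rotate_logs(self) -> str:
        return "logs rotated"
    def wipe_disks(self) -> str:          # Tom should never need this
        return "disks wiped"

class OperatorTool:
    """Tom's layer: exposes only what Mary has opted in to."""
    def __init__(self, api: ServerAPI, allowed: set):
        self._api = api
        self._allowed = set(allowed)
    def run(self, op: str, *args):
        if op not in self._allowed:
            raise PermissionError(f"{op} is not part of this tool")
        return getattr(self._api, op)(*args)

tool = OperatorTool(ServerAPI(), allowed={"restart_service", "rotate_logs"})
print(tool.run("restart_service", "nginx"))
# tool.run("wipe_disks") would raise PermissionError.
```

When Mary wants Tom to do something new, she extends `allowed`; no redesign of the underlying system is needed, which is why this approach tends to win in practice.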

But, the advocates of the first system point out that the second approach is less secure. And they are right. Or I’ll take them at their word.

They ask: what if Coke and Pepsi want to run their software on the same physical servers? I have always found that an absurd question. Even if we could assume the system was entirely secure, there is human error. I assumed Coke and Pepsi would always buy their own servers. What is interesting is that the market seems to be doing exactly that, even in the public cloud: the Nitro hardware that Amazon has produced mainly provides physical instances on a shared server. And this was before we discovered the architectural holes in our systems that allow data to leak between programs belonging to different tenants running on the same physical server.

And so, my assumption has always been that if you care about security, air gaps are about the only thing you should trust.

What does this mean?

Consider the server. With no software, it can do anything you can imagine. The minute you start running software, the set of things it can do becomes increasingly limited. It turns out there is a whole slew of user interfaces that are a lot more useful than starting from bare hardware. Over time, a set of interfaces for using a system and controlling access to that system has emerged, and we have figured out over many years how to make them address both Tom's and Mary's needs. A great example is the use of root and less-privileged users on most operating systems.

A user interface that is handy to both Mary and Tom is incredibly powerful. And therefore, whenever a new way of interacting with servers emerges, there is a temptation to try to figure out where the boundary between the two tenants should be. The reality is that the existing interfaces developed after years of hard work and operational experience. In a new system, you are most likely to draw the boundary in the wrong place. And thus, in my mind, how you access a system should be independent of how you control access to it.

Okay, I’m dangerously close to talking about security, and when it comes to security, I know that I know nothing.

The problem is that even if you don't care about security, another critical use case for multi-tenancy is reducing cost for the infrastructure provider; call them Indigo. What Indigo wants is to assign quotas to Mary, Jane, and Tom, and to ensure that Mary, Jane, and Tom never exceed their quotas.

Amazon's solution was to create an effectively infinite supply of servers and bill Mary, Jane, and Tom for their usage.

The limitation of such a system is that you can only buy the set of servers that Amazon has decided to make available. The other limitation is that it assumes an infinite infrastructure.

If, however, Indigo does not have access to an infinite infrastructure or it’s inappropriate for their use case, what to do?

In my opinion, they should choose approach number two. What does this mean? There is a set of objects that Mary, Jane, and Tom use to do their jobs, and Indigo has a set of quotas that it has assigned to them. Mary, Jane, and Tom need to refer to the quotas and use them transparently, so there is a temptation to encode the quotas in the objects. But instead of enforcing quotas on every access to the objects, enforcement should be done lazily, unless some threshold is exceeded.

And if you look at what Amazon does, they do the same thing.

If you want to use one more server, they will give it to you. If you’re going to employ 10,000 more servers, that involves a phone call. They have their quotas that are lazily enforced and, at some point in time, block access.

In effect, Amazon has decoupled secure isolation from quota enforcement.
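Lazy enforcement of this kind can be sketched in a few lines (my own names and limits, purely illustrative): allocations are recorded cheaply, a soft limit merely flags the tenant for out-of-band review, the "phone call" above, and only the hard limit actually blocks.

```python
# Hypothetical sketch of lazily enforced quotas, decoupled from the
# objects themselves. Limits and names are illustrative.

class LazyQuota:
    def __init__(self, soft_limit: int, hard_limit: int):
        self.used = 0
        self.soft_limit = soft_limit   # crossing it triggers review
        self.hard_limit = hard_limit   # crossing it actually blocks

    def allocate(self, amount: int) -> bool:
        """Record usage; return True if the tenant should be reviewed."""
        if self.used + amount > self.hard_limit:
            raise RuntimeError("hard quota exceeded: blocked")
        self.used += amount
        return self.used > self.soft_limit   # soft overage is tolerated

q = LazyQuota(soft_limit=100, hard_limit=150)
print(q.allocate(90))   # False: under quota, no friction on the fast path
print(q.allocate(30))   # True: over the soft limit, flagged but not blocked
```

The fast path never consults the object hierarchy, which is the point: quota accounting stays independent of how objects are accessed and secured.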

And so, when we talk about multi-tenancy, what we need to do is ask, are we trying to solve for secure isolation, or are we trying to solve for quota enforcement? The requirements for security depend on the customer, the trust, the legal requirements, etc. How you do quotas is independent of all of those security restrictions and should be treated as such.


Filed Under: Architecturalist Papers

21 architecturalist papers: always be right, right now.

January 1, 2020 by kostadis roussos

One of the particular challenges of being a software architect is that the average manager and vice-president of engineering assume you are incapable of adding value, right now.

Engineering managers have releases they need to deliver now. They have engineers who have problems now. The managers have budget squabbles now. The teams have debates with product management now.

An architect has visions of the future, and those visions are often years away from delivery, and worse, even more years away from solving any of the immediate problems engineering management has.

And so the VP of engineering begins to see the architect as a distraction. The architect distracts in two ways. The first is that they are always complaining that the team is not investing in the future. The second is that the things they want are unimplementable. And worse, the architect is typically unwilling or unable to go figure anything out. PowerPoint gets produced, Confluence pages get edited, wiki pages get updated, and in some cases, small prototypes get written, but no useful code gets changed.

And thus, the VP tries to manage that distraction. There are two fundamental approaches. The first is to keep them away from anything that matters and fire them at some opportune moment. The second is to focus them on a small problem. A small problem keeps the architect from distracting other people and makes it easy to measure their delivery of value. In other words, force the PowerPoint jockey to write code, and either they figure out how to write code, or they get fired.

And you know what, the VP of engineering is right.

The thing about being a software architect is that you always have to be right. What I mean is that you are guiding a team, and the direction has to be correct. If it’s not correct, then the team is heading towards a catastrophe. You can change course over time, but it always has to be the right course.

In addition to always being right, you also have to be sufficiently high-level that everyone is doing the right thing. Pulling this off is another tricky thing to master. If the correct answer is not inclusive of the entire organization, then somebody is doing something wrong. Furthermore, if it's too low-level, there is no white space for engineers to innovate.

Another customer for the long term architecture is the CEO/GM and Product Management team. They have to see enough value that they can talk about it to their customers.

In short, you need to be right, and high-level enough that you can’t be wrong, but if that’s where you end, you fail.

The software architect must also add value right now. What does that mean? It means that if the head of product management chooses to fund X features, all X are stepping stones to your architectural vision. If the engineers have to design something, it should be evident from the high-level architecture what they should do. If the managers have to trade off long-term versus short-term execution, they should make that decision in the context of the global vision.

But it's more than that. Managers operate in terms of things that need to get done with some set of resources. And so it's vital that the architecture be broken down into discrete tasks.

And so my mantra:

  • I must always be right.
  • I must be sufficiently high-level to be always right.
  • I must be right, right now.


Filed Under: Architecturalist Papers

19 architecturalist papers: why doing the right thing matters, a tale of Facebook and charities.

October 1, 2019 by kostadis roussos

When I was at Zynga, Mark Pincus and the executive team had a brilliant idea for raising money for charity: selling virtual goods.

The idea was pretty simple: there was a virtual good, the virtual good was relevant to the game, and if you bought it with real money, we gave a portion of the money to some charity.

This technique generated a lot of money for charities. And, to be fair, it was great for Zynga as well. Even though we did not keep the money, getting people to spend on a free game was hard, but once you got them to pay, it was straightforward to get them to pay more.

But we had to stop.

Why?

Facebook Credits.

See, Facebook and Zynga signed a deal to have Zynga use Facebook Credits instead of real dollars. It feels a lot like Libra, but I am bitter. And because we used Facebook Credits, we needed Facebook to do some back-office paperwork.

So I got the foundation to agree to do anything and everything that Facebook needed.

And Facebook said, no.

I said that I would write a blog post raking them over the coals for refusing to do good even when it came with incremental revenue.

And they said, “Do it, we do not care.”

So I worked with our MarComm team to put something together. But then we had layoffs, our business was imploding, and they asked me not to post it. They had so many other fires to put out that this felt over the top.

And I agreed.

And I was wrong to agree.

Because, since then, no one has done this. Not one single freemium game has done this. Nada. Not one.

At Zynga, we pioneered a lot of the pay-to-play game mechanics. But Facebook’s payment team of the time pioneered the idea that charity was not a business priority.

It's my fault for not having a spine six years ago. I wonder if, had I written that blog post, things would be different. How many people would be alive if I had just done what was right?

When Facebook started its "charitable" giving on the timeline, I puked. I got so angry that I donated $1,000 to Mother Jones because they were the only publication willing to call out Facebook. Heck, I offered to give another $500 as a matching donation. No one from Mother Jones asked; I just did it. I went on Twitter and said that if people sent me a note with proof of a donation, I would donate $500 to Mother Jones; I was that angry. And while we are here, give to Mother Jones; they are an excellent liberal paper that fights the power.

I screwed up.

So why am I writing now? Because a friend of mine saw a freemium game that did something for charity, and it made me happy. It meant that some games were trying to do the right thing again.

The Elder Scrolls Online (@TESOnline): "Thousands of Dragons have been slain since Elsweyr released – but now you can continue defeating them for a good cause! Raise money for real-world charities that support pets in need with each Dragon kill in #ESO. beth.games/2oVobFW #SlayDragonsSaveCats"

And I also wanted to remind everyone that there are consequences to not doing the right thing. I get angry when I see folks ask how we incentivize tech companies to do the right thing. We should be asking what kind of moral bankruptcy says the right thing to do isn't simply something you do. But I didn't do the right thing. And the industry is different as a result. And worse, a lot of people are not better off because I didn't bother to write that blog post.

As software architects, we make choices, and we are accountable for those choices.


Filed Under: Architecturalist Papers, Facebook, Zynga

18 architecturalist papers: As was foretold.

June 13, 2019 by kostadis roussos

Recently, a coworker of mine approached me and said, “All you have to do is figure out what the final right answer is.”

And I stared at him, and I was surprised at how ridiculous the comment sounded.

I turned to him and I said, “anyone can do that!”

In fact, it was at that point that I realized how little understood the nature of system architecture is. Figuring out the final right answer is the least interesting part. Figuring out how to build the next part, while preserving the optionality to build the correct solution later, is the real job.

The job is to see what is being done today, understand where you want to go, and course correct efforts that are going in the wrong direction.

System architecture lives between the now and the perfect future, an area of complete grey. And the challenge is that in that grey area, there are no correct answers.

The Minbari Grey Council is a perfect metaphor. Their job was to stand between the now and the future they knew was coming, and to make the hard choices. The space between the now and the future was unclear and uncertain. And they chose, when confronted with the future, not to make a choice.

The job of the system architect is to know when the right thing to do is to break the Grey Council, and when it is not.

The challenge for the system architect is that when you see a project going off the rails, you need to understand how much you need to get involved. Is this a project that, if it succeeds, will take the company in the wrong direction? Or is this an effort that will open opportunities that don't currently exist? The height of hubris is to assume you know the answer. But to do nothing is to say yes to everything.

Ultimately, system architecture is a reflection of the taste of the architect. And not all taste is correct for all problems.

The challenge is to understand when your taste is getting in the way of a new opportunity and when your taste is telling you that something is going wrong.


Filed Under: Architecturalist Papers

17 architecturalist papers: go fast and build things

May 17, 2019 by kostadis roussos

Over the last ten years, I have struggled with a Facebook motto: Go Fast and Break Things.

It reminds me too much of the Great Gatsby quote:

They were careless people, Tom and Daisy—they smashed up things and . . . then retreated back into their money . . . and let other people clean up the mess they had made.

The process "go fast and break things" does not describe a method for creating value; it is a recipe for an adrenaline high, an excuse to do whatever you want.

Let me ground this into a real-world example. Suppose I have a system, and I want to radically re-imagine what the system can do.

There are two paths.

The first is to create a brand-new system that is entirely incompatible and breaks everything. What do I mean by everything? Any system is part of an ecosystem of tools, operations, and people that interact with the system. When you break the system, you are breaking that web of relationships and interactions. The net outcome is a radical change to that web.

So why do it? Well, because the cost of that change is borne by the people who use the system, not by the people who built the system. The more powerful the market position, the easier it is for the entrenched system to break things.

For example, Facebook used to break APIs all of the time. And that was okay for Facebook, because its captive audience had no choice but to change to use the new APIs. The consumer owned the full cost of the disruption.

The second approach is to evolve the system in a way that doesn’t break anything. In this model, instead of forcing the world to adapt to your system, you figure out how to integrate into their world.
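To make the contrast concrete, here is a minimal sketch of the second path (in Python, with hypothetical function names, not any API from the posts): the old entry point survives as a thin shim over the evolved one, so existing callers keep working while new callers get the richer interface.

```python
import warnings

def fetch_user(user_id):
    """Original API: returns a (name, email) tuple. Existing callers depend on this shape."""
    warnings.warn(
        "fetch_user is deprecated; use fetch_user_v2",
        DeprecationWarning,
        stacklevel=2,
    )
    record = fetch_user_v2(user_id)
    # Preserve the old return shape exactly, so nothing in the ecosystem breaks.
    return (record["name"], record["email"])

def fetch_user_v2(user_id):
    """Evolved API: a richer record with new capabilities, added without removing the old path."""
    return {
        "id": user_id,
        "name": "Ada",
        "email": "ada@example.com",
        "roles": ["admin"],
    }
```

The cost of the shim is borne by the people who built the system, not by the people who use it, which is exactly the inversion the second approach asks for.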

Intel is an excellent poster child for both of those. The first time was when it delivered the Pentium. At the time of the 486, there were a bunch of RISC processors – MIPS, Alpha, SPARC – that those of us in the field thought had a legitimate chance of dethroning Intel, because a CISC core was intrinsically slower than the RISC systems. But getting off of Intel meant breaking things. Instead, Intel did something that surprised casual followers of the industry: it embedded a RISC-like core into its processor, preserving the CISC instructions. By choosing not to break things, Intel won the CPU core wars.

Ironically, a little later, Intel pursued a foolish – in retrospect – strategy with the Itanium. The thing about the Itanium was that at the time there was no 64-bit Intel processor. Intel planned to move away from the x86 instruction set to a new kind of instruction set. The switch was highly disruptive. And AMD delivered what the market wanted, a 64-bit x86 processor, and captured a huge market opportunity.

In both cases, the winner deliberately chose to break as little as possible and add value in a way that did not disrupt the consumers of the technology.

To me, that is the best kind of engineering.


Filed Under: Architecturalist Papers

16 architecturalist papers: you work for the future GM

March 23, 2019 by kostadis roussos

One of the most challenging parts of the job of strategic software architect is that your job is to think about the future, and the GM’s job is to think about the present. And what’s worse, your planning horizon is typically beyond the planning horizon of the current GM.

Why is that a problem? Because we hate our future selves.

There is a lot of behavioral research suggesting that we hate our future selves: we do things that optimize for current happiness over future happiness. This explains so many of our choices.

See, for example, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5611653/ or a more accessible and possibly more useful version here: https://www.anderson.ucla.edu/faculty-and-research/anderson-review/future-self-health

And this leads to my favorite story about the conflict between GMs and their strategic software architects.

In 2006 I sat in a room with Guy Churchward while at NetApp.
He was the new GM, and it was our first 1:1. And I told him: Look, Guy, you're going to be gone in 18 months. And I'm going to be here for three to five years. My job is to make sure you don't screw this technology up. If you do something I think is stupid, I assure you it will not happen, because I will make sure the dumbest, worst engineers are working on it. If you want to do something smart, I will make sure it succeeds. So you need to get me on board. And even if you get the smart guys to work on it, I will undermine their success, because it's my job to make sure there is a technology that the next guy sitting across from me has to go to market with.

It was a breathtakingly arrogant comment. But it was a true comment. Guy looked at me, wondering if Technical Directors at NetApp could be fired.

Unfortunately for NetApp, he left his job way too soon. And I left shortly thereafter.

He left not because he failed, but because the tenure of a GM is shorter than the tenure of the strategic software architect, and that is by design. We want the GM to be more short-term focused, and we want the strategic software architect to take the longer view.

The challenge is that we are working for the next GM. And the current GM is not interested in helping the future GM, who is probably going to be somebody different.

So the architect – in some sense – is the person who gets in the way of the current GM's plans in order to help the future GM, someone the current GM hates (even if it's him in two years).

So then what?

As architects, we are constantly fighting our current boss.

Here is how this manifests itself concretely: if I am a GM and I have a product, to hit my numbers I only need junior engineers. But if none of those junior engineers turn into senior engineers in one to two years, then the product will have problems in four. And if none of them turn into architects in three, the product is dead in seven.
We can argue about the dates and ranges but the story holds true. If you don’t have senior technologists thinking about the future, then you miss the future.
So now what?
As strategic software architects, our job is to make the current GM think that the future GM is working for him.
Here’s how I always try to do that.
1. Make sure that the Strategic Software Architecture is something that the current GM will profit from. A GM has to make sales to companies who want to believe the company has a future, and he has to attract technologists to build the current stuff; a compelling technology strategy is very useful for both.
2. Make sure that the Strategic Software Architecture adds value all of the time to the current GM. This means that future pieces deliver value now.
3. Hire the people you need to build the future, and have them build the present. This is the weird one. You bring in someone to build your future and have them work on the immediate problem. While you are doing that, you wonder if you are making a mistake. I think of it as a twofer: the new gal learns something new AND gains credibility AND will build the future thing better.

4. Be flexible in planning. Every new GM will have new priorities, so be willing to change what you recommend to be built.


Filed Under: Architecturalist Papers

15 architecturalist papers: The Turing Machine Fallacy

February 17, 2019 by kostadis roussos

One of the most enduring mistakes software architects make is what I call the Turing Machine Fallacy. The argument goes like this.

1. My system solves this fantastic class of problems

2. This system will address all problems.

3. Therefore, there is no room for any other system.

The most recent example of this fallacy was the cloud. Four years ago, when I joined VMware, everyone assumed that the public cloud was the future. The assumption was that a small set of public cloud providers would provide the infrastructure that everyone would consume. That computing infrastructure was fundamentally undifferentiated and, therefore, not something that would be worth investing in.

I didn’t, don’t, and never will agree.

I believed, when VMware's stock price was in the mid-$50s, that this was a ridiculous proposition.

In short, my point is the following. If you believe that a single computing infrastructure will meet all computing needs, you also believe that all software will run on a single computing platform and that that computing platform is X.

We have a name for X; it's called a Turing Machine, and last I checked, the guys selling paper tape were not making a lot of money.

I believe that the minute the industry has coalesced around yet another fake Turing Machine, I need to start looking for the thing that will replace it.

But why do I believe this?

Because fundamentally, software is an approximation of the real world. Software is a model of the real world. It is not the real world. And the real world changes. And when the world changes, the software approximations become increasingly ill-fitting until they no longer fit. And changing software is very, very hard.

Changing software is so hard that new software fits into the gaps between the current software and the real world.

The strategic system software architect's job is to see where the approximations are ill-fitting and to cause investment in solutions that will fit into those gaps.

And those investments are software that is – by definition – different from the current winning system architecture. And those investments will drive hardware investments to support that software. And the hardware architecture, which is, in turn, an approximation of reality, will change. And as the hardware architecture changes to adapt to the change in software architecture, the winning system stops being the final answer to software system architecture.

We realize that the system architecture is not a universal Turing Machine but a computer.

Hardware evolves to support software, which is continuously evolving to support a changing world. Performance, form factor, power consumption, and legal requirements are in constant flux, and therefore the needs change as well.

And so, to end on a proof point: in 2015, everyone assumed that Amazon would own The Cloud. And yet, here we are, and what is clear is that there will be a plethora of clouds – private, public, authoritarian, IoT, etc. – each with its own optimizations for its specific requirements.

The universal Turing Machine is an excellent mathematical abstraction, but there is no such thing in the real world where I live and work.


Filed Under: Architecturalist Papers
