wrong tool

You are finite. Zathras is finite. This is wrong tool.

45 architecturalist papers: a great software architect is a shepherd

May 24, 2021 by kostadis roussos

Photo by Biegun Wschodni on Unsplash

A friend and colleague sent me a note today saying that his job feels more like a shepherd’s than anything else. He now has to look at what people are doing and keep telling them “not good enough,” in the hope that it gets good enough at some point.
This contrasted with the model in his head of the software architect as gatekeeper.

I have worked with him for many years and have seen his growth as a person and a software architect, and he articulated what I have been noodling on for many years.

The traditional mental model of a software architect is what several open-source projects implement. Every change is ultimately approved by one or more contributors who judge every change for clarity and quality.

That model makes the architect the gatekeeper of every coding decision.

That model might work well, but it does not work for me.

My model is more that of a shepherd. I define in my head what is unacceptable. Unacceptable is anything I don’t understand, and anything that will break the customer’s deployment of the current product.

A breaking change need not break the current deployment of the existing product. It is always possible to transition to a new version of a system. A truly breaking change is – in my mind – a different product.

What about things I don’t understand? Understanding doesn’t mean agreement. And most importantly, it doesn’t mean code. It means the person sharing the technical data has given me enough details that I am confident I can understand what is being proposed and what it might manifest as, and I am okay with that.

To borrow from Gandalf, until I have understanding and believe this won’t break customers, “You shall not pass.”

Then the next set of questions is about the confidence that what you are building is being built well. My basic strategy is to hire great engineers who build great software. I assume the great engineers are going to produce great code. The real tricky bit is to make sure that the engineers believe that what they are building is the right thing for the right reasons. For engineers, as for all of us, it’s often easier to say “Yes” to the wrong answer than to communicate “No.” Not every battle is worth fighting over.

My view is that as a software architect, I am required to say, “No.” And that my job is to discover if the engineers think what they are building is correct. And if it is not, be the fulcrum and lever that they need to get the space and time to figure out the right answer.

The other point is that once we know the correct answer, we can then agree to do the wrong thing because of time-to-market reasons. That is an explicit tradeoff and not a consequence of never even considering the correct answer. And the ability to choose to do the wrong thing allows flexibility and enables the right long-term thing to be done. Without that ability, I have seen standoffs between engineering management, product management, and architects. The architects don’t trust the managers and the PM to do the right thing, so they only propose the right long-term solution, pushing out timelines, etc.

Again it’s about shepherding people through a process, not telling them what to do, and not reviewing everything they produce.

Why am I okay with this?

Look, I have 10 fingers, and I do more check-ins than any person in the company. Because I have a team of hundreds that does hundreds of check-ins per day, and they can operate independently, looking at the correct data. By leveraging their collective intelligence, we can go faster and innovate faster than any other leadership model.

Is there a risk that things will go badly? Yes. But I prefer to deal with the failures and then understand how to mitigate them going forward. To innovate is to fail. To do new things is to fail. To encourage people is to fail. I would rather fail that way than any other way. Because when you innovate, do new things, and encourage individuals and teams, greatness happens.

Filed Under: Architecturalist Papers

44 architecturalist papers: the value of a college degree

April 21, 2021 by kostadis roussos

Over the last twenty years, I have been the least impressed with the value of formal education in the field of computer science for most practitioners who do most of the work.

As someone who struggled to do well in exams and avoided classes with exams, I never understood what they measured. I know that I never applied to Google because they had this ridiculous requirement to see my GPA. I graduated with honors and summa cum laude. It wasn’t my grades; it was the principle.

Maybe it was being bullied. As a weird, obnoxious Greek, my experience at Brown was toxic. I didn’t fit in. And all I wanted was to put as much physical distance with that part of my life as I could. And maybe the experience taught me that maybe this degree wasn’t as valuable as I thought. And maybe I thought, if everyone at top-colleges were like some of the specific people I had to deal with, then I would prefer to never work with them again. And the easiest thing to do was never to go work where they worked.

Over the years, I had no issues with hiring or promoting people on my team with no CS degree. At Zynga, we did real paper facebook of every senior leader with no CS degree, which was illuminating. I do remember that the guy who was the architect for cafe-world had no degree in CS.

Schools teach you the wrong skills, abstract concepts that are of meandering value, and most importantly, the wrong interpersonal values. Production software is a team sport. School is an individual sport. Production software is about maintaining software over time, not hitting a deadline. Production software is about customers and business requirements, not some contrived technical problem that some poor Professor invented to grade your attention span in a class. Production software is about the wisdom of how broken something can be.

In fact, I will observe my high-school history class, and the value it placed in critically understanding the nature of information and sources has proven to be far more valuable than any CS class I ever took. I would trade any CS class for that one.

But I *never* dismissed a top college’s most important value to a student. In fact, while simultaneously shitting on the value of a CS degree, I would tell folks, “a CS degree at a top-school has extraordinary value for your first set of jobs that set you up for your next set.”

If you want to get hired out of school, graduating from a top school is orders of magnitude more valuable than anything else. And then the next set of jobs is a function of the relationships that you build while you work. I met an Israeli sales guy, and he told me the same thing. He was infuriated that he never got a job at a top tech company because he felt he didn’t have the right degrees from the right school.

I graduated from Brown University with a degree in CS, where I barely understood algorithmic analysis (I still don’t get how to do induction). I didn’t know what a database was. I didn’t know what a compiler was.

But I got a job at SGI in the kernel group because Forest Baskett’s daughter was a student at Brown. Forest Baskett was the then CTO of SGI, and he interviewed students at Brown as part of a recruiting project.

That had more to do with it than anything.

At SGI and later NetApp, I got a part-time Master’s degree at great personal expense and filled in gaps in my understanding of the field. But I did that to get a military deferral. I was wealthy enough to do that.

Having said all of that, I think learning and a life-long love of learning are crucial to your personal happiness and success. And a college education, if you can afford it, can help in that way, maybe. And I do believe learning to analyze how people talk and think and learn critically is valuable. And if you need to learn some boutique knowledge and a college setting works for you, by all means, take a class.

But the real reason a top-college degree is valuable? Because recruiters go there. That’s all.

Filed Under: Architecturalist Papers, Zynga

43 architecturalist papers: p-zero or die.

April 15, 2021 by kostadis roussos

[so I tweaked an earlier post to be more inclusive and more relevant]

VMware has – possibly – the coolest skunkworks system at any company. Skunkworks projects are shared at a three-day internal conference known as RADIO.

During the year, employees across the company work on projects and produce papers based on those projects whose only purpose is to share them at RADIO and possibly get funded later.

Andrew Lambeth is a Fellow and all-around amazing person who gave an excellent talk titled P-zero or die. The point of the talk was how to take any big idea that you had and get it funded.

What does P-zero mean? Well, the P means priority. Everything in a company has a priority to decide what the team works on first, and the number zero is the highest priority level.  Why zero? Because, well, it turns out, first you start with the most important things having priority 1. And then the teams realize that there are too many P1’s. So instead of moving everything down to P2, a new priority level is created, priority zero. And then, of course, there are too many priority zero’s, and someone says, “let’s create p -1”. Then everyone realizes how many scripts and tools will break and that this makes them look ridiculous. So the org finally does the hard work to figure out how to reprioritize. But why didn’t this happen when p-zero was created? Well, because computer scientists start counting at 0, not 1. So 0 is the lowest number, not 1.

The fundamental principle of the talk is that if you don’t make your big idea a p0, it will get deprioritized for other stuff. And the reason it got deprioritized is that it’s big, and its value proposition was unclear, and people didn’t understand what you were trying to accomplish.

And so here’s the checklist of things you need to do to make your big idea someone else’s p0.

    1. Describe it effectively in 5 minutes.
    2. Success is easy to measure.
    3. Listeners must understand, not agree.
    4. Have a document to share, not go over.
    5. Describe it on any media.
    6. Pitch it at every opportunity, relentlessly.

And the most crucial thing is this:

7. If you get no traction, then move on to the next big idea.

Sometimes a big idea’s time has not come, and you just need to let it go.

I liked the talk so much that I decided to make a t-shirt.

P-Zero or Die

Filed Under: Architecturalist Papers

42 architecturalist papers: filling a glass with water is easy, building a glass is hard.

April 10, 2021 by kostadis roussos

So my dad is one of those legendary figures in modern science. His H-index is 70+. His research transformed what we think of as hospital medicine, from a series of narrowly scoped disciplines to a systems theory of the human body where the center was critical care of key systems. And along the way, he transformed medicine in at least two countries, Canada and Greece, with his tireless advocacy and work to improve the medical systems either directly or through the people and institutions he built.

Whenever I need humility, I talk to him.

I mean, I have done a lot of big things in my life, but no, I didn’t change the fundamental way we think about how we die (body first, brain last).

In oh-so-many meetings, I stand in front of a large number of engineers and product managers and engineering managers and say, “see this impossible hill; we will climb it.”

And there is a natural inclination, “how?” And “how fast?” And “we have so much insanely hard work to do!”

For most of my professional life, I viewed that as an affront. I assumed that the person asking the question was dismissing the solution because it was too hard. As if there was some easier path that I had deliberately chosen to ignore. I thought they were saying I had come up with the wrong answer. And I would get angry, and pissed off, and frustrated.

Recently I took this EQi test. And a key element of that test was how much you used reality to make judgments. And, well, I scored poorly.

And that got me thinking, why? Because I tend to look at reality not as immutable but as mutable.

As someone who has seen how the future changed because of what individuals like my dad did, and more modestly I did, my opinion of the relentless forces of nature and history is that I will always be willing to use my living hand to challenge any invisible or dead hand.

So once again, I was in a meeting where someone said, “this is the moral equivalent of scaling the North Face in a snowstorm.”

And I was like – “Man, you and I are so different, in a world that is 70% water, I look at an empty glass and think how easy it is to fill it.”

And, of course, that went down poorly. Because that person felt I was dismissing their observation. And, to be honest, I was.

And so I told this to my dad. My dad goes, “blah. Filling a glass with water is easy. Go to a lake or the sea. Finding a glass is easy. Go to any store. But make the right kind of glass that fits in the right place. That’s what’s hard.”

And he then looked at me with that look, “Good you can do hard things, I am proud of you.” My dad was never one for recognition, etc. He was always about the next hill to climb. And then we spent the rest of the time talking about how filling that glass is so damned hard.

And it got me thinking a lot about how strategic software architecture is about finding that right glass. And that is hard. And when I find that glass, I am excited, and all I can see is how we could achieve miracles if we just filled it.

But the work has barely begun for everyone else.

And filling that glass with water is hard.

And not being dismissive of the challenge, and recognizing that the work has barely begun, is critical. And when people tell me, “dude, I am not climbing the North Face in a snowstorm unless you break this down a little bit more,” I shouldn’t get angry; I should be delighted. Any sane person would be like – “good luck.”

Next time, I’ll not stand waiting for the applause and get annoyed that everyone isn’t admiring my achievement of identifying the right glass.

So let me adjust my thinking.

When I find a glass, I know we can fill it with water. The effort to fill it with water in a timely fashion will make the finding of the glass feel like a trivial subtask.

And to ask people to be happy for me for finding the glass is kind of like the tenor asking for applause for clearing his throat. I’m asking them to trust me that I know where the safe path up the North Face is. Maybe, instead of asking them to applaud, I should start preparing for the climb and be grateful that they might follow me.

Filed Under: Architecturalist Papers

41 architecturalist papers: an Easter lesson, sexism and academic research values

April 4, 2021 by kostadis roussos

Personal observations about Easter.

In 1994, I was taking a class in Computer Science. And the final project extended over Orthodox Easter.

I was in a quandary. The combination of workload and poor project planning had led to a difficult decision about what to do. I needed more hours in the day to get it done. If it were just my grade, I would have been happy to take a hit. But it wasn’t; I was part of a group project. And I felt I was going to let my team down.

I went to the professor, and his reaction was illuminating – “Tough.”

At first, I was infuriated because I thought it was a statement about my Easter. Later on, I learned it was a statement about my prioritizing my personal life over my work.

It was an essential and evil lesson. My personal life had to take a secondary place to my work life. If I cared about my career, frivolities like friendships and family could not intervene.

And so I missed Easter celebrations.

And I did that for the next 30+ years.

But not only Easter, all sorts of celebrations, because work mattered over everything.

Over the years, I saw this fetishization of deadlines: the idea that somehow hitting the date at all costs was the right thing to do.

And I have learned something from that experience. Some jobs require that kind of sacrifice, and they have a high burnout ratio. I had one of those jobs. Within two years, I was a burned-out husk, getting drunk by 6 pm to get through the day. Until I had my nerves dulled from booze, I couldn’t relate to my family.

As time has moved forward, I have concluded that jobs that impose that sacrifice have to be a matter of life and death because your life is a casualty of those jobs.

Any other employer who makes that kind of request all of the time is evil.

And anytime an employer makes that kind of demand, it reflects poor project planning.

As a strategic software architect, when I see a team in that kind of state, I view that as my failure. It’s my job to ensure that I never have to make that ask.

But this persistent evil behavior endures. And it got me wondering where it comes from. I am not naive enough to think that Academic Research is the only place it comes from, but I have an ax, and I wish to grind it.

I realized where it comes from. It comes from the academic community. Academics have firm deadlines. The paper arrives by the date, or it might as well never arrive. The paper that gets published first gets all of the credit. And so high-end schools teach that the work deadline is all that matters.

And that academic community was primarily male, and its members happily dumped every personal obligation to their families on their wives. My mother and I lived through that period with my dad, who decided to abandon his pregnant wife and son in a tiny apartment on Nuns’ Island so he could do research in Paris. Fortunately, his friend and colleague explained the foolishness of his ways, and he adjusted.

I saw some of that through the SIGGRAPH era.

After all, the 1990s SIGGRAPH deadline meant that if you were publishing there, your Christmas was about writing a paper. While your spouse took care of the family and social obligations, you had time to work without interruption.

And I still remember my outrage during the Fukushima disaster. The SIGGRAPH committee sent out a note saying that the deadline for papers was extended for folks submitting from Japanese IP addresses. It was an absurd comment and yet utterly consistent with that professor who said, “tough.” I could imagine the academics who wrote that note thinking – “notice how compassionate we are! We extended an immutable deadline!”

So here’s my Easter comment to everyone. Enjoy Easter. Enjoy your life. It’s a short one. Life is a marathon, and you can’t sprint all of the time. Pick the sprints, pick the times to relax, and find the teams that will support you. And leave the teams that won’t. And if you want to sacrifice your life for a greater cause, that’s your calling. Just make sure it’s really a matter of life and death.

And if you are blessed to have a family, find the time to be with them. They matter. I don’t regret many things in my life, but I regret every single Easter I didn’t spend with my mom.

Filed Under: Architecturalist Papers

40 architecturalist papers: complex problems require complex solutions

March 28, 2021 by kostadis roussos

One of the most problematic practices in engineering organizations that I have joined is the rejection of complex solutions to complex problems because they are difficult.

Any problem at a large scale of customers is complex. The software has to deal with a comprehensive and unknowable set of use-cases and deployments. The software has to cater to a large number of other systems and users that interact with it.

Any change to the system can have unknowable impacts.

Suppose we have a piece of software, let’s say Acme Management Center (AM Center), and we need to make a significant architectural shift.

For example, if you need to change a system from:

  1. being single-threaded to being multi-threaded;
  2. being a single point of failure to tolerating multiple failures;
  3. using a single database to using multiple federated databases;
  4. being a closed system to being an open one;

the size of the effort depends on the complexity of the software being changed.

If AM Center has a small number of users or is small, the cost of the change is tiny.

If the system, let’s call it BigVirtuCo Center (BVC Center), has many users and has been around for 20 years, the effort to make the identical architectural change is enormous.

We intuitively understand the complexity of the two efforts as being different.

But then let’s turn it around a little bit. Why was the change to BVC Center’s architecture so much harder than the change to AM Center’s? It’s because BVC Center supports 20 years’ worth of unique features that AM Center does not have.

Now BVC Center is worth 10 billion dollars because of how many people use it. AM Center is worth 1 million dollars because of the number of people who use it.

AM Center is a delicate product, but it simply doesn’t have the feature set that BVC Center has and thus can’t solve the problem that BVC Center solves.

A typical fallacy that the BVC Center competitors engage in is to say, “well, we can make something that has 80% of the value of BVC Center and kill it in 1/10 the time by focusing on 20% of the features”.

They are saying that there is a more straightforward solution to the complex problem that BVC Center solved.

And the competitors of BVC Center typically fail.

Oh, but what about the innovator’s dilemma?

Simpler products can indeed displace BVC Center. But they don’t replace BVC Center when they are simple; what happens is that they find a big complex problem that BVC Center ignored and become entrenched in that space.

BVC Center doesn’t get displaced by a simpler product; it gets displaced by a complex product solving a complex problem of higher value than the one BVC Center solved. Most of the time, the thing replacing BVC Center is more complicated than BVC Center ever was.

The problem with complex problems is that they take a lot of time and resources to solve.

As a software architect, it’s essential to understand what the correct solution to the problem is. And to not shy away from the complexity.

At the same time, we need to ship software. And this brings us to the tricky problem:  not the discovery of the correct solution but the delineation of an approximation that adds value and moves you in the direction of solving the problem.

For example, making a program multi-threaded that had a single thread is a massive undertaking. Incrementally adding threads and parallelizing pieces of the system allows you to take advantage of more processing power.
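The incremental approach can be sketched in a few lines. This is a minimal illustration, not a recipe: `checksum`, `process_serial`, and `process_parallel` are hypothetical names, and the per-record work is a stand-in. The idea is that one stage of a serial program is moved onto a thread pool while its contract (same inputs, same ordered outputs) stays fixed, so the rest of the system does not have to change at the same time. Note that in CPython, threads mostly help with I/O-bound work; the sketch shows the shape of the change rather than a guaranteed speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def checksum(record: str) -> int:
    """Stand-in for one independent unit of work (hypothetical)."""
    return sum(ord(ch) for ch in record) % 997


def process_serial(records: list[str]) -> list[int]:
    """The original single-threaded path: one record at a time."""
    return [checksum(r) for r in records]


def process_parallel(records: list[str], workers: int = 4) -> list[int]:
    """Same contract as process_serial, but this one stage uses a thread pool.

    executor.map returns results in input order, so callers see the same
    output as the serial version.
    """
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(checksum, records))


if __name__ == "__main__":
    records = [f"record-{i}" for i in range(100)]
    # The parallel slice must agree with the serial path before it replaces it.
    assert process_parallel(records) == process_serial(records)
```

Keeping the serial version around and comparing the two is one way to ship the slice safely while the larger multi-threading effort continues.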

Similarly, a customer may have a complex problem that will take years to solve. The customer is willing to live with a partial solution if that leads to a complete solution.

To revisit the beginning of this thread, frequently, folks say – “this is too complex.” And perhaps it is. Or perhaps what they are saying is that the cost of the solution is too high.

As an engineer, it’s essential to understand what the correct answer is and, at the same time, what slice of the right solution we need to do today, and never to confuse that slice with the correct answer.

Filed Under: Architecturalist Papers

39 architecturalist papers: software doesn’t have empathy

March 24, 2021 by kostadis roussos

A recurring belief in the ongoing debates over SaaS and SaaS transformation is that software engineers must do dev-ops to acquire empathy.

It sounds so good in principle. The problem is that the engineers don’t understand the problem because they don’t care!  So let’s make them do the job so that they can understand the problem and care.

No one ever asks: what if the software engineer sees the whole experience as a tax on their life that adds very little value? What if their brain doesn’t work in a way where doing is the best form of learning? Why wouldn’t a well-crafted document suffice? What if the engineer is the kind of person that doesn’t function well in a high-stakes operations crisis but can deliver complex systems?

And then there is the emotional blackmail of the use of the word empathy. It’s not that I don’t understand the problem because you haven’t explained it to me. It’s because I am callous.

My other favorite is that you must feel the pain so that you do what is necessary.

In short, the argument sounds like this to me:

  • No pain, no gain.
  • To grow, you must suffer.

It’s like a certain superhero’s origin story.

You don’t use your superpowers (in this case, writing code) because you don’t understand the good you could do (by doing more operational software).

I preferred the original – “With great power comes great responsibility.”

Enough.

When software lacks features, it’s because the engineers weren’t given prioritized requirements; it’s not because they lacked empathy.

I feel like the last 20+ years of interaction design have passed the entire world by.

And having owned operations at scale and done product development at scale, I feel like I am uniquely qualified to say “No” when people tell me it’s about empathy.

And as someone who has struggled with empathy because of how my brain works, the statement – “you lack empathy, and that is why you can’t build the right software” – when I have – is plain infuriating.

More than 20 years ago, the science of human-computer interaction emerged. And the idea was that software could be better if, instead of forcing human beings to change, the software met us halfway.

A book called About Face articulated that well. I still remember the aha.

About Face argued that software engineers did a terrible job of building software for their customers because they didn’t understand them.

And that there was a systematic approach to understanding the customers that could allow software engineers to build the right software.

That methodology became the art of software design.

And it has produced a much better user experience for software than ever existed.

Operations of software in a SaaS environment has a new persona, the operator. And the operator is not the software engineer.

And having the software engineer build UXs for the operator without the same kind of design discipline that we have for other products produces user-experiences from the 1990s.

In fact, the whole empathy argument is in many ways a rejection of more than 20 years of insight into the process of software design.

What the process did was to translate the user goals and aspirations into something that software could solve. And when you combine brilliant designers and brilliant software engineers, great things happen.

So the next time you are told you need empathy, inform the person screaming about empathy that moral arguments didn’t produce great software design. Great designers produced great software designs. And have the design team look at the problem with you.

If doing the job is a great way for you to learn, then do the job. If reading a document is a great way for you to learn, do that. If listening to a talk is a great way for you to learn, do that.

But learning what to build doesn’t require empathy; it requires a definition of what needs to be built.

Filed Under: Architecturalist Papers

38 architecturalist papers: A minimalist SaaS definition for enabling Service ownership

March 20, 2021 by kostadis roussos

In the tech industry, there are as many ways to buy products as there are companies.

Today there are two big variants. The first variant is what we call “Packaged Software,” and the other variant is what we call “SaaS.”

And the corporate transition from selling one variant to the other can feel like the ocean hitting land. And the transition is made worse because organizations insist that the problem is culture and mindset and not how the software is built. No amount of mindset can overcome software architecture. And because software architecture is hard, and because the two teams are screaming at each other, the real problems never get addressed.

But, as I stared at the problem recently, I realized that we architects must provide meaningful guidance. Saying “but the architecture” is as useful as “but culture.”

So let me try.

Packaged Software is when you go to a store, buy the bits, and install them.

The other variant is SaaS.

Now the word SaaS translates to “Software as a Service,” but like any FLA (Four letter acronym), it’s so devoid of specificity as to mean nothing.

Many years ago, I claimed that you were a SaaS service not because you ran in the cloud or on-prem, but because you satisfied one criterion.

1. The provider of the software decides when the software gets upgraded.

This definition was interesting, but big enough customers can demand special upgrade cycles even for SaaS.

So I started adding all sorts of other criteria, “easy to use, consumable, freemium, cloud delivery, DevOps,” and every single time, it floundered because I could trivially point to how packaged software offered the same features.

But SaaS was different. And then it hit me; SaaS is different because of another important criterion.

2. If someone other than the provider wants to extend the software, what’s important is how to connect, not where the software is.

The significance of this is that the software consumer assumes the software is always running, and that the consumer and the integrator don’t have to care about how it’s deployed, how many resources the SaaS system requires, etc.

Integration doesn’t involve step 0: install the software.

Elements 1 and 2 start to separate SaaS and non-SaaS.

But there is in my mind this third criterion:

3. The producer of the SaaS software chooses where it can run, not the consumer.

Now, this is a weird one. Almost every piece of software written has some minimal hardware and/or software requirements. So, at some level, why is this different from non-SaaS? The answer is that in non-SaaS, the actual location of where the software runs is opaque to the producer.

So now we have – in my mind – a minimal test for a team that wants to be SaaS.

  1. Does the team owning the software decide when it gets upgraded and consequently what features the software has?
  2. Does the team/customers that use the software just care about how to connect to the software?
  3. Does the team that owns the software get to decide where it physically runs?

These are necessary conditions, although not sufficient.

But they are necessary for service ownership to become a thing.

Service Ownership?

In any SaaS company, service ownership means you (or someone on your team) have a pager.

When a company first goes to SaaS, they typically install a lot of software from a bunch of teams into some cloud and offer that as a cloud offering.

The team that installed the software then finds itself triaging bugs in all of the software installed.

The team that did the installation then screams at the other teams: “You need a culture change. You don’t care about customers!”

The teams that produced the software look at the cloud team and say: “How dare you! We respond to every bug!”

The cloud team says: “We are a service.”

The other team says: “We produce software.”

And the screaming continues.

The truth is the following. The cloud team built a service out of packaged software. They control the version; they control how you connect to the service; they control where the software runs.

From the other team’s perspective, the cloud team is just another unhappy customer, one that runs the wrong version of the software, has done all sorts of weird things that don’t quite match other customer usage patterns, etc.

And like every other customer, they just want people. And the product team has mastered the fine art of saying no.

But why does it feel to the product team that they are just another customer? Because the product team doesn’t have any control over when a bug gets fixed and deployed, the cloud team does. Because the product team doesn’t control how you connect to the software, the cloud team does. And finally, because the product team doesn’t even have any say in where the software runs, the cloud team does.

And the SaaS standoff occurs.

The SaaS standoff ends when the cloud team demands a specific service, not support, not cultural shifts, but a service. And that service has to be well-architected and engineered to provide both teams with what they want.

And the minimal requirements of that service are the three things I enumerated.

  1. Does the team owning the software decide when it gets upgraded and consequently what features the software has?
  2. Does the team/customers that use the software just care about how to connect to the software?
  3. Does the team that owns the software get to decide where it physically runs?
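The three questions can be read as a checklist. Here is a minimal sketch in Python, assuming a hypothetical `ServiceControls` record that names which team answers each question; none of these names come from a real system. Applied to the standoff described above, the cloud team passes and the product team does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceControls:
    """Hypothetical record of which team controls each of the three things."""
    decides_upgrades: str     # team that decides when upgrades happen
    defines_connection: str   # team that defines how consumers connect
    decides_placement: str    # team that decides where the software runs


def passes_minimal_saas_test(team: str, controls: ServiceControls) -> bool:
    """True only if `team` controls all three; necessary, not sufficient."""
    return (
        controls.decides_upgrades == team
        and controls.defines_connection == team
        and controls.decides_placement == team
    )


# The standoff: the cloud team controls the version, the connection,
# and the location of software that the product team wrote.
standoff = ServiceControls("cloud", "cloud", "cloud")
print(passes_minimal_saas_test("cloud", standoff))    # True
print(passes_minimal_saas_test("product", standoff))  # False
```

Passing the check does not make a team a service owner; it only says that the preconditions for service ownership are in place.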

 

 


Filed Under: Architecturalist Papers

37 architecturalist papers: paradigm shifts are transitory

March 15, 2021 by kostadis roussos

About 15 years ago, I met an Inktomi architect who said that the future of all computing was the cell phone. It was before the iPhone.

And I stared at him and his dorky phone, and I said to myself, “nope.”

For 15 years, that was THE worst bet of all time.

But COVID changed all of that.

Mobility ground to a halt. And more importantly, the emergence of a whole slew of technologies called into question the value of physical mobility.

And that got me thinking about the future of mobile computing.

Mobile computing intends to allow the person to go anywhere with their computing device.

That was the paradigm.

But what if the paradigm shifts a little bit, so that mobile computing is about letting the person be present anywhere without physically being there?

I like to follow mobile apps as proof points of a thesis.

For example, before COVID, DoorDash’s desktop features lagged behind those of its mobile app. That has since changed.

The simplistic answer is that people are at home more and in front of a computer more.

But that’s not satisfying.

What if, instead, more people had more computers? And had created places to work at home that allowed them to use their computers more?

In that case, more people would stop carrying their phones around and using them to do work stuff. Instead, they would go to their workspace and do their work stuff there.

So?

In that case, technologies that improve our ability to be present are more important than technologies that allow us to carry our computers with us.

And mobile computing and battery life, the dominant trend and a perceived inevitable future, were just technologies that accommodated the previous paradigm well.

My point in this discussion wasn’t to focus on COVID. My point was to focus on how paradigm shifts can occur.

And that assumptions about the inevitability of anything are usually wrong.

Tying this back to strategic architecture, this falls into my general thesis that the role of a technologist who owns a portfolio of technologies is to recognize that things fall out of fashion. That when they do, it is tempting to call them dead. It is also tempting to ignore things that are in fashion and call them a fad. Instead, we need to keep our eye out for trends that change and then think through all of the implications.

And that if you are leading a broad enough portfolio of products, the strategy of how to navigate that shift is very important.


Filed Under: Architecturalist Papers

36 architecturalist papers: multi-tenancy and isolation primitives

March 10, 2021 by kostadis roussos

Over the years, I have struggled with multi-tenancy. The definition of it feels fluid and imprecise.

And so, in a desperate attempt, I wrote some incoherent thoughts down. I’ll take another stab at being precise.

Multi-tenancy has two elements. The first is that there is a set of users who have access to a set of resources. The users and their rights to those resources are a technical problem. What is the hierarchy of users? What is the hierarchy of permissions? What rights do they have on the resources?

There is a separate problem about the unit of isolation of resources. If two tenants have access to resources, the immediate follow-up question is how much isolation does the resource provide between tenants?

For example, a physical server is a unit of isolation. A virtual machine is a unit of isolation. And so on.

Let’s dig into the notion of isolation a bit. Isolation is used to provide security, performance, and availability guarantees. A great example is a black site. A black site is completely isolated from the rest of the world for reasons of security. A dedicated server running a single application is used to isolate that application from any other application that may affect it.

At one extreme level of isolation, each computer system is isolated from another.

There is no multi-tenancy at all.

This model is costly.

And so consumers ask – can I trade off some security, performance, and availability for cost optimization?

In short, can I group a set of applications on a set of resources where the isolation isn’t as strong to get some sharing?

Time-sharing systems were, in many ways, the first such system when they replaced batch systems.

And that’s where things got confusing.

Because the only real isolation is hardware isolation, the minute the hardware is shared, two applications share something.

And although hardware vendors and software vendors have gone to heroic lengths to ensure isolation, the reality is if you are sharing something, you are not isolated.

Now let’s introduce a second person, the hardware resource provider. The hardware resource provider wants to offer resources of varying degrees of isolation. And wants to bill accordingly. The resource provider’s challenge is that the consumer may not be willing to pay for resources if they are too expensive. In other words, the consumer of the resources is willing to trade off some isolation for cost savings.

And now we introduce a third player, the manager of the resources that the consumer of resources uses.

The resource provider gives up some resources, the manager divides them up amongst consumers, and the consumers consume them.

The resource provider provides some resources that are isolated. The manager associates those resources into something that consumers can use, and the consumers consume them.
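The provider, manager, and consumer roles can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `ResourceManager` class and made-up unit and tenant names; it only shows the division of labor, not any real scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical manager that divides a provider's isolated units amongst consumers."""
    free_units: list[str]                                   # isolated units from the provider
    assigned: dict[str, list[str]] = field(default_factory=dict)

    def allocate(self, consumer: str, count: int) -> list[str]:
        """Hand `count` whole units to `consumer`; units are never shared."""
        if count > len(self.free_units):
            raise ValueError("provider has not supplied enough isolated units")
        granted = self.free_units[:count]
        self.free_units = self.free_units[count:]
        self.assigned.setdefault(consumer, []).extend(granted)
        return granted


# The provider offers three isolated servers; the manager divides them up.
manager = ResourceManager(free_units=["server-1", "server-2", "server-3"])
manager.allocate("tenant-a", 2)
manager.allocate("tenant-b", 1)
print(manager.assigned)
```

Note that the manager can only hand out whole units, which is exactly the assumption that the isolated resource "is at the right granularity."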

This simple model assumes that a resource that is isolated can be controlled and offered to a consumer and is at the right granularity.

But what if it isn’t?

And thus, we have the fourth player in our dance, a person who takes isolated resources from resource providers that they then offer to a manager of resources that offers them to consumers. Let’s refer to these resources as “virtual physical resources” (VRP).

And that’s fine, except in one very annoying use-case.

What if two different sets of virtual resources are sharing the same underlying physical resources?

If the VRPs assume that is not the case, then the entire system falls apart.

Why? Because the guarantees of isolation assume that the underlying hardware has some degree of isolation (remember, the VRP is built on isolated resources).

Huh?

Imagine VRP-1 assumes that a socket is isolated and that it controls the resource. VRP-1 can make assumptions about security and performance. If VRP-2 is also using that resource, VRP-1 is being lied to. And that’s fine until VRP-1 and VRP-2 have conflicting objectives.

At that point, the thing that actually controls the socket has to arbitrate between VRP-1 and VRP-2.

And we’re still good. But what if VRP-1 assumed it had control of the socket and gave out virtual sockets, and VRP-2 assumed that it had control of the socket and was giving out a full-time slice of the socket?

The problem is that the isolation unit was a socket, and there was no way to arbitrate between the two VRPs.
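The socket example amounts to two VRPs each claiming exclusive control of the same physical resource. Here is a minimal sketch of detecting that situation, assuming a hypothetical list of (VRP, physical resource) claims; the function and resource names are illustrative only.

```python
from collections import defaultdict


def find_conflicts(claims: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return physical resources that more than one VRP believes it controls.

    `claims` is a list of (vrp_name, physical_resource) pairs, where each
    claim means "this VRP assumes exclusive control of this resource".
    """
    owners: dict[str, list[str]] = defaultdict(list)
    for vrp, resource in claims:
        owners[resource].append(vrp)
    return {res: vrps for res, vrps in owners.items() if len(vrps) > 1}


claims = [
    ("VRP-1", "socket-0"),  # VRP-1 hands out virtual sockets on socket-0
    ("VRP-2", "socket-0"),  # VRP-2 hands out time slices of the same socket
    ("VRP-2", "socket-1"),
]
print(find_conflicts(claims))  # {'socket-0': ['VRP-1', 'VRP-2']}
```

Detecting the conflict is the easy part. Resolving it needs something below both VRPs that understands what each one wants from the socket.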

So?

The real challenge in multi-tenant systems is the proliferation of VRPs. Some VRPs cannot be layered on top of each other because they rely on isolation primitives at a lower level that lose critical information about how the VRP wants the resource to be used.

An obvious solution is to create a new VRP, called VRP-12, that is the union of VRP-1 and VRP-2 and can effectively share a system between the two users. I worked on such a system at SGI. It was a batch scheduler called Miser.

The problem with that approach is that as the VRPs proliferate, there is a cost in pushing them down a layer.

What’s the cost?

There are three costs. The first is the system resources for the control plane itself. The second is the complexity of using the control plane. And the third is that as the control plane becomes more complex and supports more resource isolation, the control plane needs isolation.

At some point, it becomes simpler to have different VRPs operating independently rather than on a single shared system.

But wait, there’s more.

Because a VRP only requires a layer that isolates hardware, and every VRP isolates hardware, VRPs proliferate and get layered on top of each other. There is no final VRP layer.

So?

What it boils down to, in my mind, is that multi-tenancy really means three things:

  1. A set of physical resources that need to be shared
  2. Some VRP system being used to share the physical resources (1) that uses mechanisms to isolate the hardware and provides isolated resources coupled with a mechanism to assign those resources to an individual group that can use them.
  3. Some mechanism for grouping users into hierarchies.

And what makes multi-tenancy even more complicated is that in any system, there are multiple VRPs.

And so the challenge of multi-tenancy is really about how you enable this ever-expanding hierarchy of VRPs such that they don’t require their own isolated hardware, and each consumer, resource provider, and resource manager can effectively do their job. And understand what guarantees they are getting and providing.
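The three elements can be put side by side in a small sketch. Everything here is an assumption made for illustration: the group names, the unit names, and in particular the rule that a group may use units assigned to its ancestors. The text does not specify an inheritance rule.

```python
from typing import Optional

# Element 3: groups form a hierarchy (child -> parent); None marks the root.
parent: dict[str, Optional[str]] = {
    "acme": None,
    "acme/eng": "acme",
    "acme/eng/ml": "acme/eng",
}

# Element 2: a VRP assigns isolated units, carved out of the physical
# resources (element 1), to groups.
assigned: dict[str, str] = {"vm-1": "acme/eng", "vm-2": "acme/eng/ml"}


def can_use(group: str, unit: str) -> bool:
    """Assumed rule: a group may use a unit assigned to itself or an ancestor."""
    owner = assigned.get(unit)
    current: Optional[str] = group
    while current is not None:
        if current == owner:
            return True
        current = parent[current]
    return False


print(can_use("acme/eng/ml", "vm-1"))  # True: inherited from acme/eng
print(can_use("acme/eng", "vm-2"))     # False: vm-2 belongs to a child group
```

With several VRPs in play, each one would carry its own version of `assigned`, which is where the guarantees start to get hard to reason about.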

To make this a little bit more concrete, let’s consider k8s.

Some developer wants to deploy a k8s application and request a certain amount of CPU cycles.

To do that, there must be physical CPUs.

Those physical resources are partitioned by some hypervisor that hands out virtual CPUs to an OS.

The OS, in turn, hands out threads that run on the vCPUs.

The k8s system in turn groups those threads (packaged as containers) into things called Pods.

The k8s system is deployed in such a way that each tenant has its own namespace.

Each layer (the hypervisor, the OS, and k8s) provides virtual resource pools. Each layer has a distinct manager.

A multi-tenant system is the combination of those virtual resource pools and the physical hardware resources.
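The k8s stack described above can be written down as a chain of layers, each consuming what the layer below provides. This is a minimal sketch, assuming a hypothetical `Layer` record; it models the description in the text, not any Kubernetes API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical description of one virtual resource pool."""
    manager: str    # who manages this pool
    consumes: str   # resource taken from the layer below
    provides: str   # virtual resource handed to the layer above


STACK = [
    Layer("hypervisor", consumes="physical CPUs", provides="vCPUs"),
    Layer("OS", consumes="vCPUs", provides="threads"),
    Layer("k8s", consumes="threads", provides="Pods in a tenant namespace"),
]


def is_well_layered(stack: list[Layer]) -> bool:
    """Each layer must consume exactly what the layer below provides."""
    return all(
        upper.consumes == lower.provides
        for lower, upper in zip(stack, stack[1:])
    )


for layer in STACK:
    print(f"{layer.manager}: {layer.consumes} -> {layer.provides}")
print(is_well_layered(STACK))  # True
```

The check is deliberately naive. It confirms the layers line up, but says nothing about whether two pools at the same level share the hardware underneath them.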

 


Filed Under: Architecturalist Papers
