Ep 03: Keys to Cloud Success: Center of Excellence Strategies

About This Episode

The cloud revolution is transforming IT and business, but many organizations struggle with the transition. In this episode, Greg Ahlheim and Lukas Karlsson take a deep dive into cloud adoption, from early challenges to recent innovations like AI, highlighting key lessons learned along the way.

Know the Guests

Lukas Karlsson

Cloud Architect and Software Engineer

Lukas Karlsson is a seasoned Cloud Architect and Software Engineer, recognized as a Google Developer Expert for his profound knowledge and contributions. As a passionate Developer Advocate, he frequently takes the stage as a public speaker, sharing his insights and expertise. Beyond his technical endeavors, Lukas stands out as the founder of a pioneering tech initiative and a dedicated community organizer.

Know Your Host

Greg Ahlheim

Sr. Vice President of Product Development

Greg Ahlheim is the Sr. Vice President of Product Development. With over 20 years of experience, Greg joined TierPoint in 2019 and today manages the team responsible for conceiving, designing, and building the industry-leading colocation, cloud, and managed service solutions that help the company’s thousands of clients on their IT transformation journeys. Before TierPoint, Greg held leadership positions at World Wide Technology and CenturyLink (now Lumen). He started his career as an engineer with Bridge Information Systems. Greg holds a Bachelor of Science in Information Technology from Lindenwood University.

Transcript

(0:00) Introduction to Lukas Karlsson

Greg Ahlheim: Hi, my name is Greg Ahlheim. I'm with Lukas Karlsson. This is the Cloud Currents podcast. Lukas, if you would, tell us a little bit about your background. You've had a long career in technology and gravitated towards cloud, I guess, right?

Lukas Karlsson: Yeah, it has been a long journey. I got started early on working at an internet provider in the early 2000s, or rather the early to mid-nineties. I've spent some time working in the semiconductor manufacturing industry and in financial services. I spent a couple of years at Akamai Technologies, and then I landed eventually at the Broad Institute, where I spent just shy of 20 years.

And at the Broad, even just in that one job, I experienced this massive change from the traditional proprietary Unix operating systems like Solaris, HP-UX, and Tru64 to Linux, which was kind of a big deal. And then there was a bunch of hardware shifts, both in the sequencing technology, which is where my company was very focused, as well as in the server technology, moving not only towards Linux but towards little 1U servers and blades and eventually to virtual machines.

And then we experienced the entire shift from pure virtual machines, starting to move into containers and then eventually taking advantage of all the serverless technologies and Kubernetes and all of the new things that didn't exist at all when I started doing this stuff. So yeah, it's been a lot of variety over that time that I've had to deal with.

Greg Ahlheim: Yeah. And so today we're going to talk a little bit, or a lot, about cloud governance, and I'm imagining that in the categories of security, cost containment, cost control, cost management, and then just best practices for architectural design and deployment.

You've seen sort of an evolution of that, right? Probably the principles stay the same, but can you talk a little bit more about how you've seen that evolution happen? How the problems of yesterday came to be the problems of today, just your experience with some of that.

Lukas Karlsson: That's a great question. I feel like we could probably do an entire podcast just on that. I would say the first thing that comes into play when you start making the shift from the traditional infrastructure that I spent a long time working with into the cloud is that you move from this very structured, time-bound model of acquiring assets that are depreciated over time.

People spend months or even years planning what they're going to buy. Then they buy it, and then they spend potentially a year deploying it. And then it runs for like three years, and then maybe you buy a support contract. But all of the mechanics of that process are very well understood, and most traditional businesses are entirely wired around that concept.

And so you have an IT person that has a certain amount of budget, and they go buy the equipment using that budget. Periodically they have years where they have to spend a lot more because a lot of things are coming off of support, but basically it's all very controlled. And after you've done all that purchasing in the old model, that's what you have.

So your users, whoever they may be, and whatever workloads they're running, they can abuse it as much as they want, because if they're not using it, then you're losing money on it effectively. You paid for it and it's depreciating, so you need to use all of it as much as possible. And then when you switch to the cloud model, literally everything turns on its head.

You establish a line of credit, basically, for anyone who's using the cloud. And then they just spend and spend and spend with seemingly no controls in place. And there's a huge disconnect between the people that are making the actual changes, the real-world changes that cause the spend to be impacted, and the people who are tracking the money. And so there's just any number of challenges in that space.

Um, and that's without even getting into security. That's just considering like, how does the money get allocated? How do you budget for something if you don't know how much it's going to cost in advance? Like there's so many benefits of being able to use the cloud in this metered way where you only pay for what you use, but then it becomes a completely unpredictable thing and companies have a real hard time, you know, predicting exactly what it's going to cost.

And then when you start dealing with security aspects, not only do you have the users who might spend more than they should just because they don't really know the constraints, but you also have the abusers. Someone will hack into your account, because now it's out on the internet instead of locked up inside your private data center, and then they start Bitcoin mining or whatever, using your financial resources to do their own compute.

And so it's really a harrowing thing for companies to adjust to, which is why I love to deal with companies who are net-new greenfield companies, where this is the only thing they know. Because convincing all the players within the organization to adapt to a new model of how things work is a challenging thing.

And so companies understandably struggle with that.

(6:30) Getting Buy-in for a Cloud Center of Excellence

Greg Ahlheim: Yeah. I mean, you almost have to think about having a top-down strategy to be most successful, because if the senior leadership is saying, hey, this is the direction we're going, and you can get everybody on board, that works a lot better than trying to push up from the bottom. I think the bottom-up aspect of that is where you might run into the shadow IT people turning up things and forgetting about them, causing some security risk and cost that they didn't plan for or manage well.

You know, I agree. I think it's interesting. You were talking about how before you had to put a lot of focus on the physical assets and the maintenance and the cost, and trading traditional compute for cloud compute.

There was a sense of, oh, we don't have to worry about all those things, everything's going to be easier in the cloud. But then you're just trading those for a different set of things that you have to focus on, and that's, like you said, the cost, the security, general best practices. When you don't have a startup that has a clean slate to go through that, I would imagine that transition being very difficult because you're kind of somewhere between two worlds.

You're trying to manage all of that physical stuff, and then you've got this new thing that you're doing where you don't have the physical thing, but you have this whole new set of things that you have to deal with. You and I, maybe in passing, have talked about the center of excellence, having some guiding organization that's going to make sure that the adoption of cloud is done in an organized, structured way, in a way that's best for the business. How do you see companies adapting to that?

Tell me what you've seen, what's working, what's not working, and how that matures.

Lukas Karlsson: I think you nailed it when you said that you need buy-in from the top level. These new shifts in how things work happen all the time, some more significant than others. I mean, we could do a whole thing where we just talk about generative AI and the way that's impacted the industry.

And so depending on the organization, the initial impetus to start using cloud can probably come from anywhere. It's not necessarily that the CEO is going to be like, we need to become a cloud company. It's going to be that some research group inside has a thing they can only do using some of the newer tech, and then they start adopting it in maybe a shadow IT way.

But to use it substantially throughout the organization in a way that changes how you work, you need everyone bought into it. It means you need the procurement process to change, because I don't order a server anymore. I'm basically ordering permission to spend money. The way I think about it is that at a lot of companies you can apply to get a company card, and at some companies you have to go through the company training on how you use the card, what it's okay to use it for, and how you get reimbursed. Before you do any of that, you don't have a card. A similar process needs to exist for cloud.

You can't just give people this blank check to go spend money without having thought through what that's going to look like. What are you going to do if X happens, and if Y happens? And what is the right thing that should happen when they spend beyond what the original plan was?

And so none of that can really be achieved in a silo. You need the people in procurement, the people in accounts payable, the actual users of the system, and the people who own the accounts and the budgets all playing together in some kind of way. Creating a cloud center of excellence is a great way to answer the question: where is that place where they're all going to convene, share knowledge, have a clear understanding of all the roles and responsibilities, and build a set of best practices together that are going to improve the process?

Where's that going to happen? Most places don't already have an obvious answer. Does IT own that process? Does finance own that process? Is it going to be a CEO special project that everyone gets assigned to? How does that even work? And so if you can establish this cloud center of excellence where all these parties are participating in this entity together, I think that shows a lot of promise, and that's how a lot of companies have been able to get over that hump.

Without something like that, you have a lot of, whether it's shadow IT or shadow finance, you know. Once you've given me my Amazon account or my Google Cloud account, if you haven't put a significant amount of controls in place in advance, you've just given me a blank check.

And so assuming you totally, 100 percent trust me to always act appropriately with that, that's fine. But pretty soon that gets unwieldy, and now you need to establish some kind of training that people get before they go through the process, and some kind of way to monitor that the process is still working.

You just can't do that in a silo, and none of those individual groups have the influence across the organization to get everyone else to line up. That's why the buy-in from the top level is important.

Greg Ahlheim: Yeah, you covered so much good, interesting ground there that, like you said, we could spend some money.

I guess one thing that I would ask you is, in the time that you've been doing all of this, maybe more so in the last five to 10 years, are you seeing that maturation across the industry? Is it most of the way there, is it early in its evolution, or do you think people get it now?

Lukas Karlsson: It depends what community you're referring to. If you're just talking about in general, across the board, I think it's probably fair to say that there are entire industries that haven't started actively moving to the cloud yet. And most companies that I've worked with that are established companies, that didn't start in the last five years, haven't fully moved to the cloud yet.

They still have that inevitable long tail of stuff that is running in the data center, or maybe even in the network closet if they mostly got rid of the data center but still have some servers. And so there are very few companies, especially companies that have been around for 10, 20, or more years, that I would say are a hundred percent cloud native.

And like I said, there are entire areas where they haven't started that journey. So I think there's still a ton of opportunity for providers, service providers, software providers, to continue to build out tools in that space to accommodate the ever-growing community.

I think what's generally deceiving is if you look just at cloud spend and you say, okay, there wasn't cloud at some point, and then we had cloud and there was a minimal amount of spend, and now you've seen this enormous growth of people spending money on cloud. I think what that mostly represents is most of the new stuff that has come into existence recently. It doesn't really represent everyone having shifted off of the old stuff.

To use an analogy from my own experience, at the Broad, when we were dealing with the data explosion from genomic research, it was true that we could say we had data that we had been generating for more than 20 years, because the Human Genome Project itself went on for a while. That was, depending on who you ask, completed sometime in the early 2000s. And then we just continued to generate massive amounts of data.

But people would think, oh, you're currently generating tons of data, and five years ago and ten years ago you were generating tons of data, so you must have so much more than you can imagine. And it was like, well, there's a lot there, but in fact the data we generated in the last 12 months is greater in size than all the data that we had generated in all the time before that, because the tools are improving.

And so you can get richer output. And then because you have more of those tools, now you can expand the amount of whatever it is you're doing. So now you can look at our cloud footprint and say, look how huge this cloud footprint is compared to your average organization.

But it wasn't like we moved anything. At some point in the not-so-distant past, we were like, let's start putting the new stuff in the cloud, and that way we won't continue creating a problem. We'll still have the problem that we had, but it won't grow. You do that for a few years and suddenly your cloud footprint is huge compared to your on-prem footprint.

It doesn't mean the on-prem footprint is small, and it doesn't mean you successfully got away from all that old legacy stuff. It just means now you have both. And so I think that's where a lot of companies are at: they inevitably have both, and they're trying to manage between the two.

We've seen a few notable folks recently who've reported in some viral blog posts that they moved completely out of the cloud and into their on-prem data center, and they saved all this money. Those are anecdotes which are highly discussed because they don't line up with most people's current view on things, but they should be acknowledged.

There are many different scenarios out there, and so people are all over the map on this in general. And, again, there's the generative AI discussion. The whole multi-cloud thing has been a question and a thing that's ongoing for years now, where for various reasons you're going to have stuff in more than one cloud.

Maybe it's a tactical decision: if one of these cloud providers completely blows up off the face of the earth, we don't want to lose everything, so we're going to have our stuff in multiple places. Or it's a tool set issue: one of them offers this fancy thing that the other one doesn't offer, so we do that thing with that cloud, but we do the rest of it with this other cloud.

That could even be something as simple as, you're all in on an Oracle database, and so you run your databases in Oracle Cloud, but then you do your compute over in some other cloud. Another common one in my personal experience is that you collaborate with somebody else and they happen to use that cloud, and so that forces you to adopt some footprint there as well.

And I would guess a final one that's been common in my experience is that you provide a product to customers and your customers use different clouds, so you have to build your product on all those clouds. So things have not at all gotten simpler.

I would say the original storyline around cloud was that it was not only going to be easier and more capable, but it was going to be cheaper. I think we've all sort of resigned ourselves to the fact that it's not necessarily cheaper. And to your point earlier, or what I think you're alluding to, there are definitely things you get with it that you didn't have before, which are beneficial to many organizations, which is why they make the investment.

But if the whole reason you went to cloud was just that you wanted the dollars to be smaller, that's not necessarily how it's going to happen. You can find people that say, oh, we shifted some of these resources: instead of swapping hard drives, now they're doing this other stuff instead.

And you can also gain a great amount of opportunity. Again, to reference my research experience, say you have a static footprint, like your on-prem data center, and some philanthropist comes in and wants to do all this research on this new thing that you weren't doing before.

Well, in the old model, if you were managing your infrastructure right, you were at capacity all the time. The point is you want to use all that hardware that you bought, so you're trying to be at capacity. That's your goal. If you're doing things really well, you're close to that.

And then this new huge thing just showed up. You either can't do it because you're at capacity, or you have to take away capacity from all these people who've become reliant on it so you can do this other thing. You're making the decision that the new thing is more important than whatever we had been doing a second ago.

But the promise of the cloud, especially in the research situation where you don't have a consistent spend, where it's based on something you discovered or some funding you got, is that the opportunity is always there. The new thing comes in that you want to do and you're like, great, let's just pull in more compute from our cloud providers.

And so for many companies, it's worth potentially experiencing an operational cost increase to be able to get the opportunity out of it. I think that's where a lot of companies are now. It's like, I would rather spend money on this kind of engineer than this other kind of engineer, because they can do more than just the one thing.

And so if I can get out of the data center, then I don't need as many of those certain kind of people, and I can get more of this other kind of person, which again is very strategic, and potentially tactical, for companies.

Greg Ahlheim: Yeah, I wonder about that. As a company or a business or a government entity adopts a public cloud, if they've had things in their data center, I love what you were saying: their growth is exponential, and it's not as if they just grew there because they moved things there. They grew there oftentimes because of a lot more opportunity for services that are built into the platform, or because some use case drove a lot of data into those environments.

Lukas Karlsson: Yeah. I think the thing that happens so much, which people don't think of, is you have this, whatever it is in your business that you're doing, and you're like, wow, we're putting so many resources into accomplishing this task or this outcome. It seems like we could just automate it and then make it go away. And so you automate it, which is great. Doing it becomes insanely easier, to the point where it's almost not hard to do at all. It's seamless at that point.

And then suddenly you realize everyone wanted to be able to do that all along, and the reason they weren't doing it was because it was tedious before, and slow and unreliable. So you thought you were going to just make this thing disappear from your view, and then suddenly it's become a big service that you have to support.

And that's how it works. When a service doesn't exist, obviously no one's using it because you're not even offering it. When you're offering it poorly, some people are like, yeah, I guess I want that, but I don't necessarily want to jump through all these hoops. And then when you offer it really well, everyone's like, yeah, I always want that now.

And so the same with cloud. You had your data center, that was just how it was, and people were like, I wish I could do all this amazing stuff, but I can't because we don't have the resources. So they don't, and suddenly you take away that constraint and it's like, wow, all the stuff we could have been doing all along.

And so, yeah, your bill goes up because you just made it really easy to consume the things people would have been consuming if it had been easy. Some people don't predict that part of the equation when they go automate everything. The easier you make something for someone to do, the more likely they're going to do it.

(24:31) What Gets Overlooked in a Cloud Strategy

Greg Ahlheim: Yeah. I love that. I've definitely seen that happen. We talked about how the footprint of your cloud environment might grow, and if you didn't set up best practices early on, it can get away from you pretty quickly. We touched on the center of excellence and other methods that companies might use, but what would you say are maybe the top two or three things that get overlooked? You gave a bunch of examples, but if you had to narrow it down to the most important or prevalent one or two things that get overlooked?

Lukas Karlsson: What would you say was our journey? I would say, well, first it depends who my audience is. If I'm talking to a practitioner, someone who's going to go set up an Amazon account, put a credit card in, and start using the cloud, the types of things I would say to them are, before you do anything, you want to wire up a bunch of observability.

I started my small business a little over a year ago, and one of the early things I did was set up accounts with all these different companies, so I would have a place to put stuff when I needed it, even though I didn't have any stuff I needed yet. I set up a couple of cloud accounts with the different providers.

And I put in a budget. I say, well, I'm not using this thing right now, so my budget is a dollar, ten dollars, some little tiny number, because I'm literally not using it. So I don't expect anything to happen there. That means if I spend a couple of dollars, I see it and I'll be like, what happened there?

I thought I wasn't doing anything. Oh, I remember, I registered that domain name, that's what that's about. But if you build that kind of visibility right from the beginning, before you have even started using it, it's that same set of behaviors that's going to protect you in the future, because now you're spending two grand a month.

And so you've had to change that budget that you started out at a dollar a dozen times since then as you've been tiering up. And so $2,000 a month, whatever, that's your new baseline that you're at now. And because you have all that stuff in place, when you hit one month where it's $2,500, you'll get notified early enough to be able to go investigate and start learning everything you need to learn to manage this.
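
The budget habit described here, starting tiny and raising the limit as the baseline grows, comes down to a threshold check on month-to-date spend. Below is a minimal Python sketch of that idea, not any cloud provider's budgeting API: the `Budget` class, the `crossed_thresholds` and `forecast_month_end` helpers, and the 50/90/100 percent thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """A monthly budget with alert thresholds as fractions of the limit (hypothetical)."""
    monthly_limit: float
    thresholds: tuple = (0.5, 0.9, 1.0)


def crossed_thresholds(budget, previous_spend, current_spend):
    """Return the thresholds crossed between two month-to-date spend readings."""
    crossed = []
    for fraction in sorted(budget.thresholds):
        trigger = budget.monthly_limit * fraction
        # A threshold fires only once: when spend moves from below it to at or above it.
        if previous_spend < trigger <= current_spend:
            crossed.append(fraction)
    return crossed


def forecast_month_end(spend_to_date, day_of_month, days_in_month):
    """Naive linear forecast: assume the average daily spend so far continues."""
    if day_of_month <= 0:
        raise ValueError("day_of_month must be positive")
    return spend_to_date / day_of_month * days_in_month


if __name__ == "__main__":
    budget = Budget(monthly_limit=2000.0)
    # Month-to-date spend jumped from $950 to $1,850 since the last check.
    print(crossed_thresholds(budget, previous_spend=950.0, current_spend=1850.0))
    # Halfway through a 30-day month with $1,250 already spent.
    print(forecast_month_end(spend_to_date=1250.0, day_of_month=15, days_in_month=30))
```

With a $2,000 limit, a jump from $950 to $1,850 crosses the 50 and 90 percent marks, which is the kind of early notice that leaves time to investigate before the month closes.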

It really depends on your situation. In my situation, I was dealing with what I would say was on the order of hundreds of different cloud accounts, and each one had a different funding source that had different constraints on it. That became really challenging really quickly. I worked very closely with this person, Stuart, in the procurement team, and we were both like, this is going to be huge. At some point this is going to be millions of dollars that this place is spending, and there's going to be hundreds of customers, and we're going to need to get ahead of it.

So we started actively building tools and processes and forms and whatever we were going to need, even when we were, I would say, around 10 customers, because by then it was already out of hand. And it only takes one of those customers to have a bad experience, where you learn the hard way.

Whether that means they took advantage of the situation and did more work than they had funding for and were just going to apologize later, or it means they got hacked, or it means the right hand wasn't talking to the left hand and they did twice as much stuff without realizing that they didn't need to, any of those things could happen.

And then everyone in the chain of command associated with that takes that on as a new thing to lose sleep over all the time. If you're the team that manages the grant and the relationship with the grant funding agency, you can't just go ask them for more money.

You have to find the money within the company to pay for that thing. And if you're the engineer, hopefully it's not going to come out of your salary. But I had an experience where someone made an oopsie on their first journey into the cloud, on day one, and the amount that it cost was on the order of a new car for their family.

And it was like, okay, you're personally not going to end up paying for that. But that's the order of magnitude that we're talking about here from an oopsie. And so I think it helps if you can get these practices in place. I work with Google Cloud a lot, and so one of the things I tell anyone who's using Google Cloud that they should do is not just put a budget in place.

But set all their billing data to be exported to Google's BigQuery database. I don't have anything I need to find in there right now because I'm not doing anything, but a year from now, when my spend doubles in the middle of the month and I'm trying to figure out what happened, I really don't want that to be the day that I turn on the data export. That means I have turned it on for the next event, but I don't have anything to investigate this event.

So at a minimum, I would say, set budgets and get all your observability in place, so when the activity starts to happen, you can keep a close eye on it. And then, as you step back to higher tiers within the organization, I would say you need everyone on the same page.

You need the person who set that budget to understand what it means if they go over, and the team who's doing the work to know what the budget is and how close they are, so they can adjust their behavior accordingly. All these people need to be kind of signed in together, looking at the same information.

And so if that's not a Cloud Center of Excellence, you need to find some other way to get all that coordination happening amongst those different players.
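
The point about turning on billing export before it is needed can be illustrated with a small sketch of the investigation it enables: comparing spend per service between a baseline window and a recent one to see what drove a spike. This is plain Python over in-memory rows. The field names (`day`, `service`, `cost`) and the helper functions are hypothetical and do not reflect the actual BigQuery export schema, where the same comparison would be a grouped query over the exported table.

```python
from collections import defaultdict


def spend_by_service(rows, start_day, end_day):
    """Sum cost per service for rows whose day falls in [start_day, end_day]."""
    totals = defaultdict(float)
    for row in rows:
        if start_day <= row["day"] <= end_day:
            totals[row["service"]] += row["cost"]
    return dict(totals)


def biggest_increases(rows, baseline, recent, top_n=3):
    """Rank services by how much their spend grew from the baseline window to the recent one."""
    before = spend_by_service(rows, *baseline)
    after = spend_by_service(rows, *recent)
    services = set(before) | set(after)
    deltas = {s: after.get(s, 0.0) - before.get(s, 0.0) for s in services}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    # Made-up billing rows: day of month, service name, cost in dollars.
    rows = [
        {"day": 1, "service": "Compute", "cost": 40.0},
        {"day": 2, "service": "Storage", "cost": 10.0},
        {"day": 3, "service": "Compute", "cost": 40.0},
        {"day": 16, "service": "Compute", "cost": 45.0},
        {"day": 17, "service": "Storage", "cost": 10.0},
        {"day": 18, "service": "Compute", "cost": 160.0},
    ]
    # Compare the first half of the month against the second half.
    for service, delta in biggest_increases(rows, baseline=(1, 15), recent=(16, 30)):
        print(f"{service}: {delta:+.2f}")
```

None of this works without historical rows to compare against, which is why the export has to be running before the spike rather than switched on the day of the investigation.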

Greg Ahlheim: Man, that's so much to think about. I did giggle to myself a little bit when you were talking about the budget, because I can remember when I was learning AWS on my own credit card, the very first thing that I learned was to put a budget in place, because you don't want to make one mistake and then get a bill that totally changes your life trajectory.

Lukas Karlsson: Right. Oh my, I mean, I've thought, you know, you've seen these stories on the internet. If someone gets this huge bill and I'm like. Okay, that's a company, but. The risk is no lower for me. Like I could easily commit my service account key to a public repo by mistake, and then I could get hacked and I could have a 30,000 bill.

And so like, there's no way I could write a check for that, you know, that would be devastating. And so, yeah, I like, I'm maybe even more cautious about it with my own accounts, but. Even I even, uh, get the benefit because I'm a partner with multiple cloud providers. I get some credits that I can use to purchase cloud services, you know, automatically.

And like, I monitor those accounts just as closely because I had an experience once where I got lazy. I was like. Oh, I got a nice, uh, healthy credit in there that should last me a year. I don't even need to think about it. And, you know, for six months, I'm not, my card is not being charged at all because I'm benefiting from this sizable credit.

And then one day I have a charge of like hundreds of dollars that I don't even recognize. And I'm like, what happened here? And it was like, Oh, I changed something, didn't, wasn't paying attention, burned up the whole free credit that I had, and now it's actually charging my credit card. And so it's like scrambling to try and resolve this.

And again, this is just me as an individual person. Can you imagine the risk that's involved when you have potentially dozens or hundreds of practitioners who are each able to make a costly mistake very easily? You expose yourself to so much potential risk there. And that's why folks need to get ahead of it early.

(33:19) Managing Cloud Complexity

Greg Ahlheim: So, you know, when I think about that risk, it's not just cloud cost. It's security. It's, you know, all the things. And then you do that across multiple clouds, if you have a multi-cloud strategy, right? I guess that's what gives rise to all the tools that point to, you know, AWS, Azure, Google, and they're kind of aggregating all of that.

I guess they're sort of tying in via API to the native capabilities. I'm not sure how all that works, but they're giving you a single pane of glass to say, you can look here for cost reduction opportunities. You can look here for security best practice violations. And now we're even starting to see tools that monitor drift against your compliance state.

That's if you have, you know, regulatory compliance or some other compliance that you adhere to. So now, if you have a multi-cloud strategy, you have to have someone managing a tool that aggregates all of that. So that's like another layer of complexity in there, right?

Lukas Karlsson: I would say there's two main reasons that that sort of has to happen. One of them, as you said, is that people are multi-cloud. The different clouds have a varying set of out-of-the-box offerings in terms of how observable they are, how much you can control them, like whether your budget actually does anything or if it just sends you a friendly little email. That kind of stuff varies across the clouds.

And so, um, whatever tools you get built into the cloud are not going to solve that problem for the other cloud. And if you're talking about just looking at the, the financials and you're using multiple clouds, like you don't want to have to log in to one console and log into the other console, then add those two numbers and then be like, that's my cloud spend.

You want to be able to log into one thing and see it. Um, and so that's why a lot of folks have to use like a, some other tool. Because otherwise you're using multiple instead of one. Um, and so it's already a problem. The other reason is even if the cloud provider provided a perfect way to manage all of it, like you literally don't want your one cloud provider to be able to see all of your spend from your other cloud provider, because it's like a point of negotiations and such.

And so, you know, when I'm talking to Amazon and we're negotiating our contract, I don't want them to see every price that I have and every discount that I have with all the other providers. So that sort of dictates where you're going to store that aggregated data. That's why you end up engaging with a third party who you're willing to let look at all three cloud spends, or at least have access to them, because at least they're not going to give it back to the cloud providers.
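
The "log into one thing and see it" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SpendLine` record, its fields, and the sample figures are hypothetical, and a real aggregation tool would first normalize each provider's own billing format into something like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpendLine:
    """One normalized cost record, whichever provider it came from (hypothetical shape)."""
    provider: str  # e.g. "aws", "gcp", "azure"
    team: str
    cost_usd: float


# Made-up sample data standing in for three providers' bills.
LINES = [
    SpendLine("aws", "platform", 1200.0),
    SpendLine("aws", "data", 300.0),
    SpendLine("gcp", "data", 800.0),
    SpendLine("azure", "platform", 150.0),
]


def total_by(lines, field):
    """Group spend by any field of SpendLine, such as "provider" or "team"."""
    totals = {}
    for line in lines:
        key = getattr(line, field)
        totals[key] = totals.get(key, 0.0) + line.cost_usd
    return totals


def total_spend(lines):
    """The single number you would otherwise get by adding up separate consoles."""
    return sum(line.cost_usd for line in lines)
```

Once the records share one shape, the same data answers both "what is my cloud spend?" (`total_spend(LINES)`) and "which team is driving it?" (`total_by(LINES, "team")`), regardless of which provider each line came from.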

So that's why that whole space exists in general. And then, beyond things like looking at your bill and the security and observability, there are also the tools that the infrastructure people are using. And so, similar to what I was just mentioning, you don't want to have to learn CloudFormation for the, you know, 40 percent of your stuff that is in Amazon and then learn Deployment Manager for your set of stuff that's over here.

And then I don't even know what Oracle and Azure have in that space. But again, you're managing infrastructure and you have a team that does that. Unfortunately, they're dealing with multiple clouds, and you don't want that team to have to learn, you know, three different tools.

So that's where something like Terraform has become enormously popular. And there's also Pulumi and others in that space. Um, because then it's like, you can become very well adept at how to build infrastructure with this tool. And then whether the backend happens to be Amazon or Google or Azure. You know, that is less of a concern.

The tool is where you're building the expertise, and it works the same across all of them. I would say that if you talked to most people, you know, maybe five years ago, if they were working on Amazon, they were probably using CloudFormation, but then the little cluster they're running in VMs on their computer, and the little Kubernetes thing over here, and the other thing, those were all being managed with different tools.

Now, I think it's more likely that they're using something like Terraform, because it allows them to apply that same tool set to multiple situations. So that's something that's happening across the board. And then of course Kubernetes itself, I think, has helped to make this easier, because if the infrastructure that you're running in is Kubernetes, then the fact that you can run Kubernetes in any of those clouds means you can kind of do whatever works for your situation.

And then Google is obviously trying to recreate the situation that they created with Kubernetes by building this Anthos technology, which is that additional layer that goes across all of them. Now that you have multiple clusters and they're in multiple different clouds, it gives you a set of tools to try and manage all that as a single entity.

And so, yeah, you're getting bills from three cloud providers, but maybe your infrastructure team just sees one seamless, you know, plane of infrastructure that happens to spread across those clouds. And so there's a lot of promise there, but I would say our jobs have not gotten less complicated through these multiple shifts of technology.

It's just that maybe you've sort of moved a little bit up the chain into doing higher-level work. Even just looking at the time I had at the Broad, probably the very first thing that I did when I started was, there were some boxes that had servers in them that had been purchased and shipped and delivered.

But now they needed to be useful. And so my job was to figure out how to take them out of the boxes and put those things in the rack and then make it so they could be used for something tangible down the road. And certainly I spend 0 percent of my time doing that task now. But to accomplish the same outcome, I write some Terraform configs and I commit something to GitHub, and then some automation that I previously spent time building goes in and automatically, magically makes that server appear in the rack and plugs in all the cables and sets up all the, you know, automation and everything.

Greg Ahlheim: Yeah. That infrastructure as code is, like, game changing, really. I mean, you know, because you've built a repository of things and it's there. I guess, the way you're describing it, you just go back and use that code next time you need to do a thing, and it's sort of autonomous across, you know, multiple platforms.

Lukas Karlsson: And it's like, why would you do it any other way? Throughout all these huge shifts, the thing that's been most exciting to me in general has been each new thing, whether it's, you know, rack-mounted blade servers where you buy a bunch of them and then just start turning them on when you need them, or then VMware and containers and serverless.

What all those things did was drastically reduce the amount of time it takes me to get from having an idea to implementing something. Looking back at that beginning when there was that server on the floor, there was a piece of software someone wanted to use, and that was their goal. That meant someone spent a couple of weeks filtering through all the hardware options and choosing a specific thing, and then they got procurement to do a couple of weeks of sign-offs, and they ordered it, and then it arrived a couple of weeks later.

And sat on the floor for a couple of weeks while someone, you know, got around to, to putting it in the rack. And then once they did that, then it was like, now I have to install this software, which they actually want to use. And then I have to get the users to be able to log in. And eventually then they're using this thing and you're like, great.

I did something of value. So now, it's not that the job just became easy and you don't have to do anything, but those tedious steps at the beginning, the ordering and the sign-offs and the shipping and the boxes and the cables, those are all gone. So now I can just go back to my desk after that meeting.

And spin up a computer and install the software, but, and then you still have all the hard work of making that do what it's supposed to do. That's still part of the job, but like the potentially weeks or months of lead time to get to that point is almost eliminated. And so that's where I feel like we step up the game a little bit, like.

If I'm a software engineer, I used to like, sure, I'd write the software. That would be like a little part of it. And all that other stuff I mentioned would all also need to happen before someone gets to use the software. And, um, so now it's just like zero minutes between when you type some code and when someone can start using that code.

And that's like the magic of all this cloud tech and then. Of course, if you make everything that easy, then they're all going to use it infinitely more than they used to. And so that's why the cost hasn't gone down with all of that efficiency.

(43:42) Hybrid Cloud & Addressing Skills Gaps

Greg Ahlheim: Again, you're trading one level of complexity and work for another, really.

And when you were talking about the long tail of hybrid cloud, you know, some companies are never going to completely detach from premises-based infrastructure, or what we would consider a private cloud, I suppose, and they're learning to operate in that realm, right? To say that this is how we're going to operate for the foreseeable future.

And I know there are a lot of MSPs that can help with that kind of thing. There are a lot of consulting firms that can help with it. And what I've seen, and I wonder if you've seen this too, is that when they decide to have that sort of hybrid model, whether it's transitory or it's how they mean to operate for the foreseeable future, they now have to source new talent, right, while maintaining their existing talent.

That means they have to grow their team, or I guess in some cases they could potentially, you know, cross-train, but that's a lot to know, right? And so, you know, how do you think about companies trying to fill that skills gap if hybrid is how they want to operate or plan to operate?

Lukas Karlsson: That's a great question. When you're dealing with a new space like this, I kind of laugh when I see a job description saying that it wants you to have, like, 10 years of cloud experience, because I'm like, okay, if you want to get technical about it, I have 10 years of cloud experience, but that's hardly something that you can expect from typical people, because there were only a few of us who were there at that moment. I think people have to be a lot more realistic about this and say, we want people who are comfortable in this cloud space, who know how to work in this new way.

Like I said, instead of physically going to the data center, you're committing some code and tagging it and doing a pull request, and then magic is happening. All kinds of things are automated now. You want people who can work like that, but you can't necessarily do the same thing you might have done with the servers and say, I want someone who has 15 years dealing with servers, or whatever that was. That was something you could kind of expect.

So I've seen huge growth in the perceived value around certifications there. In particular, some of the cloud providers offer certs that are very closely tied to getting high salaries in the marketplace. The Google Cloud Professional Cloud Architect certification is an example of that.

There are companies like you were describing that are in the middle of making this shift, and they need practitioners who can do this stuff, either the ones who have the potential and can figure it out, or the ones who have actually been doing it for some amount of time. And so, I think this is unlike some of the other tech certs, where the value either was never there or maybe deteriorated over time, because everyone just checks this box. Everyone has this box checked, and so it's not a distinguishing thing anymore.

Right now, when the space is riddled with people who haven't even had a chance to start touching this tech yet, you need some kind of bar. And so looking at the professional certs from the different cloud providers is helpful. If you're working with a certain cloud and you can find some people who have a couple of years of experience and they've passed one or more of these professional certs, those certs are not easy. They require the person to actually study, and the industry has been treating them like they have real value, which, like I said, doesn't always happen with certifications. Across the board, you know, if you're like me and you work at a partner business where part of your business is to service that company's cloud, then you're obligated to have certs in order to even maintain that relationship.

If, um, if you're in any of these companies that's in the midst of, or preparing to do a big shift into the cloud, they all have that on the menu of like, okay, well. All these people have been going to the data center for the last 20 years. Suddenly we're going to tell them to do, you know, Terraform instead.

Um, that's going to be a big change for all of them. And so we need to have everyone go through these certs. Like, you know, you can get your cloud provider to come on site and teach 40 people at a time or something, but like, that's a metric, a measurable metric that has real value in helping you succeed in this process of being like.

Okay. You weren't doing cloud over here. In the future, you're doing cloud. In order to get from here to there, you need practitioners. So how many do you have today? You have zero. Okay, great. That's a number. And now that's something we can measure, and we can say, oh, now we have 20 people who've all done this basic level of things.

So when those people are all talking about some basic stuff, they all understand what each other are talking about. That's huge because you can build on that and start getting to the really advanced stuff as well. And so I think. Companies that are doing this, they, if they're not already taking the suggestion, they need to start just being like, what level are these different team members at?

Because Most likely, whatever level they're at, we're trying to up level them because we're moving in this direction. So we want them to be growing in the same direction. Um, and you know, most likely it's not at the level where you need it to be right now for what you're exactly trying to accomplish. So it means like getting everyone to speak the same language.

It's the same. There was a period of time where I worked in companies where we were, everyone was talking about ITIL, um, which is a framework for how to manage IT infrastructure that comes out of British telecom. And, um, it's like, there's a lot of language involved and a lot of specific processes that are.

Tied to that, you know, the learnings associated with that framework. And so you can't have people being like, Oh, we have an incident and then we got to get the problem manager to do this kind of thing. You start talking in lingo, no one even knows what you're talking about. And they certainly aren't bought into the whole thing.

So that's why a company like that would have everyone or some percentage of people all go through this ITIL training. And now they're in a position to start building on that. And so cloud is the same thing. You need to get everyone speaking the same language. Um, in the beginning, you have someone like, Okay, before when I needed to get you this file, I would put it on the, the file share, the team file share on the Windows server.

And it's like, what is the new thing that I would do in the new model to get you that file? But then suddenly everyone, they know what a bucket is and, and they know how that works. And they know you could just be like, Oh, let's just create a bucket. We'll put the data in there and share it with the collaborator.

But it's like, If, if no one even has that, those, the language to speak about it, and they don't even know the little things that have shifted between what they're used to and what is happening now, then it's just, you're fumbling all along the way. And you're just constantly having to explain everything to everyone.

So. You want to get people up to a certain basic level of understanding and capability. It doesn't mean everyone becomes like a cloud architect and they just like quit the job and go do that full time. You don't want all your team members to have to be burdened with that level of knowledge, but like you do want a bunch of people to just understand the basics and.

You know, for a lot of places, that means they should know how to create a bucket and put files in it and set permissions on it so it doesn't get hacked. Whether it's because you want them to just be able to get their job done, or because you want them to not mistakenly share the keys to the kingdom with the public, there's real value in getting those people to have that basic understanding. And so you want some kind of training program, and some way of tracking who's at what level and where you're trying to get to, so you can keep that practice going forward as you move along this journey.
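
The "set permissions so it doesn't get hacked" basic can be illustrated with a small Python check. This sketch assumes a policy shaped like a Google Cloud IAM policy, with a list of bindings that each pair a role with members, where the special members `allUsers` and `allAuthenticatedUsers` mean the bucket is open to the public. The function name and the sample policy are hypothetical, and the check works on a plain dictionary rather than calling any cloud API.

```python
# Member identifiers that mean "anyone", as used in Google Cloud IAM policies.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def public_bindings(policy):
    """Return (role, member) pairs in the policy that grant access to the public."""
    return [
        (binding["role"], member)
        for binding in policy.get("bindings", [])
        for member in binding.get("members", [])
        if member in PUBLIC_MEMBERS
    ]


# Hypothetical bucket policy: one collaborator, plus an accidental public grant.
SAMPLE_POLICY = {
    "bindings": [
        {
            "role": "roles/storage.objectAdmin",
            "members": ["user:collaborator@example.com"],
        },
        {
            "role": "roles/storage.objectViewer",
            "members": ["allUsers"],
        },
    ]
}
```

Here `public_bindings(SAMPLE_POLICY)` returns the one public grant, which is the kind of mistake a basic level of training is meant to help people spot before it becomes an incident.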

(53:01) Cloud Center of Excellence Governance

Greg Ahlheim: Sounds like that's really part of your center of excellence. For a company that's going to adopt cloud, or that's early in its cloud journey, the sooner they can get organized around best practices for governance, the better, right?

So if they can do it early, it's going to save a lot of time later, right? Because you're, you're starting with a good foundation. So I, I think that's one thing that we really agree on.

Lukas Karlsson: And a lot of potential pain because there's so many ways you can shoot yourself in the foot in this new model. And so, while that's a great way to learn not to shoot yourself in the foot, like, there's other ways you could potentially learn that just by like, getting ahead of it.

And so, I always encourage companies to try and avoid those situations right from the beginning instead of waiting until they happen and then putting all of the process in place to try to prevent them in the future.

Greg Ahlheim: Yeah, it's a lot easier to prevent than it is to square away something that's not going well, for sure.

And then you also talked about, you know, there's probably some tool selection that goes in there, right? Doing things with common tool sets. Something occurred to me while you were talking about multi-cloud tools, like the example you had with infrastructure as code and Terraform: you may not start out thinking you're going to need a multi-cloud strategy, but if you put some of those best practices in place, then if you do find that you need a multi-cloud strategy, you're really set up pretty well from that perspective, right?

So it's better to build for the future if you can, if you have the staff, the funding, all the right things. You know, do that early on and with the future in mind. It's kind of unpredictable and things are changing. You said earlier, you know, so much is changing so rapidly, so that's probably a really good practice as well.

Lukas Karlsson: Yeah, that's a great point. If you know that you're going to be doing things in multiple places, and, you know, we talked about the observability around billing and budgets and tracking, getting that tool set into place early, even while you're just working with one cloud at the moment, is great, because that means that when you introduce the second and the third cloud provider, assuming whatever tool you went with supports all the things you're going to use, it's just like, oh, can someone go into the tool and add in the new thing?

It's not like, OMG, we just hit this scaling issue, you know, we designed everything around one cloud provider, and now that we have the second one, we're scrambling to figure out how we're going to change everything to account for two. And then, oh no, you run into the same scaling issue when you add in the third one.

Um, and so I think that's a perfect idea. Like as I was describing, there's so many reasons why you might say like. I want to manage all the billing sort of stuff in a, in a separate place from where the stuff is happening, boom, you know, and that's like, you can do that early on and then just grow with that.

Um, the one thing that I, you know, that can be challenging and depending on how these companies price the products, like there was one product that I was using and. Um, you know, I wasn't spending much money in the cloud, so the product was very inexpensive. I'm like, it's pennies. It's like, I'm not even thinking about it.

And then when my spend went up exponentially, and now it's like an enormous number, then I go over and look at the thing that was pennies, and now it's not pennies anymore. It's like grown exponentially too. And, um, I don't know, like, depending on the situation, that doesn't always, it's not always the best model.

You know, if you grow and you're spending billions of dollars and suddenly you have a million dollar bill from this software company that hasn't, you know, enhanced the software at all, it suddenly costs you exponentially more for the same thing. So there's been a lot of flux in how that works with the different providers, but for the most part it's some percentage of the spend, with maybe some tiering in there. But yeah.

It's totally the kind of thing you want to have in place before you absolutely need it. Like, the thing I described earlier, it's like if, if the moment when you need to do an investigation is when you start deciding to purchase like a tool to do investigations with, that's like a very inopportune time to do it.
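
The "percentage of the spend, with maybe some tiering" pricing model is easy to work through in Python. The tier boundaries and rates below are invented for illustration and do not describe any real vendor's price list; rates are held in basis points (hundredths of a percent) so the arithmetic stays in whole numbers.

```python
# Hypothetical tiers: (upper bound of the tier in whole dollars, rate in basis points).
# An upper bound of None means the tier has no ceiling.
TIERS = [
    (100_000, 300),    # first $100k of monthly cloud spend at 3%
    (1_000_000, 200),  # the next $900k at 2%
    (None, 100),       # everything above $1M at 1%
]


def tool_fee(monthly_spend, tiers=TIERS):
    """Monthly fee in dollars for a tool priced as a tiered percentage of cloud spend."""
    fee_in_bp_dollars = 0
    lower = 0
    for upper, rate_bp in tiers:
        if monthly_spend <= lower:
            break  # all spend has been priced by earlier tiers
        top = monthly_spend if upper is None else min(monthly_spend, upper)
        fee_in_bp_dollars += (top - lower) * rate_bp
        lower = top
    return fee_in_bp_dollars / 10_000
```

Under these made-up tiers, a small account spending $1,000 a month pays $30, while an account spending $5,000,000 a month pays $61,000. Tiering softens the growth compared with a flat 3 percent, which would be $150,000, but the fee still scales with spend rather than with anything the tool itself does.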

(58:06) Outro - Cloud Predictions & Challenges for the Next 10 Years

Greg Ahlheim: Yeah, there's only one worse time to do it, I think, and that would be after the problem has grown into a very serious problem. Yeah, exactly. So if we think about where things are going, what do you kind of see? If you look five, 10 years out, what would you imagine cloud adoption looks like? Where does it go?

Are there new challenges on the horizon that we don't know about yet? How do you think about that?

Lukas Karlsson: I would say that my perspective on this is, well, not to be too punny, but it's clouded by all of this generative AI stuff. I think there's so much energy and enthusiasm and financial resources being pumped into generative AI across the industry.

I won't be surprised if I go to buy a new refrigerator and it has, like, generative AI ice making or something that, you know, uses a model to decide how to make ice. Those things will not shock me. And of all those shifts I talked about earlier, which all had an enormous impact on the industry, I don't think any of them happened at the sort of pace that the gen AI thing has.

And so I don't even know what to think. I was there the moment the internet just kind of popped into existence. And then I watched over the course of many years as it went from something of obscurity to being something that everyone just lives in. Like, you know, during the pandemic, everyone was just doing Zoom and I was like, oh, it's the thing we were all talking about.

It's here. Everyone is using this thing. But that happened slowly over a long period of time, and so I got to kind of watch it happen. This happened almost in one day. I think ChatGPT, with GPT-3.5, came out on November 30th of last year. And so we're fast approaching the one-year anniversary of when that thing landed and changed all of our industry.

And, um, you know, it was like in January, like Google and Microsoft and everyone else had, um, you know, their own thing, uh, in the space as if they had been spending decades building it and just were popping it out right now. Um, but. It's really hard for me to anticipate what the next few years look like.

Um, because unlike NFTs and blockchain, I think that, um, this kind of fad has legs and there's a lot of real value that will come out of the generative AI stuff. Um, much of which I probably can't even guess what that looks like yet, because it hasn't quite been invented yet. But I think there's a lot of real.

Potential there, but also the risk is like seemingly much higher than with other things. There's so many ways it could go wrong You know, there's like if you just add AI to everything that you're doing And a few of those things you do a little bit wrong and the AI like goes completely awry Like those could be even more spectacular failures than the ones We were doing that were unassisted by AI.

And so everything that we touched on in this conversation is probably going to be impacted by generative AI in some fashion. You know, the tools that I use to look at my bill right now all have some amount of AI in them, whether they're leveraging something that can predict next month's spend that the cloud provider is already building in, or whether they're adding in their own, you know, magic that is analyzing the data and giving you new insights.

Their tool set just exploded on what it's able to do. So they're all, they're building the next generation of those things that have this new kind of AI built into it. Um, where like, you know, two years ago we had a disaster and I. Dug into the database and I'm starting to run SQL queries so I can figure out what the disaster was, you know, next year, I'm going to type in like, what was that disaster into a bot?

And it's just going to show me the exact same chart. And so. I'm confident that that's, like, really close, and so, um, man, you know, think about five years of that, um, and who knows if there's going to be some other big, you know, dynamic shift like this generative AI thing that's going to come in that same time frame.

So I have a hard time predicting. I do think if you had asked me a couple of years ago, like what was going to happen with NFTs, I was extremely skeptical and, and the metaverse, I was very skeptical and now I feel totally justified in my skepticism and I do not feel that way about this, I think. While there will be a lot of terrible, terrible ways to apply generative AI, um, you know, there will be some really, really important advances that it brings that are still somewhat beyond what we can predict at the moment.

So I'm looking forward to it. Like, I'm a little cautious cause. I feel, um, I'm not one of the people who's like losing their mind, freaking out over how AI is gonna, it's the Terminator and you know, but I do firmly believe that if we, this is really the theme that we've been talking about. Like if you just pretended like it wasn't any problem at all and ignored it, then one day you'll wake up and you'll have like a big mess to deal with, just like.

You know, if you don't put all the tools in place to look at your bill, then when you have a problem, It's like now a big problem. I think it's good that everyone is actively discussing like, what is okay? And what is not okay? And where's the line? Like, because without those discussions, we could have a big mess.

But I'm not totally worried that I'll wake up tomorrow and my job has been eliminated or any of that. I think that everything we work with is just going to have some new spin on it that is going to involve this generative AI technology.

Greg Ahlheim: Yeah, I mean, you'll have, you know, needs for active training of models. I'm imagining there's going to be a lot of data to store, right? So that's probably going to drive some changes in storage. And then the power, like there's just not enough power to run all of the AI that everybody wants to run.

Lukas Karlsson: So, I mean, that was part of the discussion around the blockchain. It was like, okay, I see what you're saying about why the blockchain is so great and how it'll change how money works and all that. But if it's going to consume all the world's resources, is it really, you know, a viable solution?

And so this is the same kind of stuff. You need to like, do it in a way that's sustainable and sensible. And so that's, that's, I'm a, you know, big proponent of all the research that's going on. And whatnot. Um, but yeah, it's exciting. It's really like having seen the internet and, and all those other things we talked about containers and virtual machines and serverless.

Like, blow up where the term doesn't exist at all, and then suddenly it's all anyone is doing. Um, I, I could not have imagined something like that, like this, you know, the way 2023 has gone with generative AI. It's been just overwhelming wave of enthusiasm. Um, but yeah, I think there's some real meat there that I'm excited for.

Greg Ahlheim: It's certainly a lot more enthusiasm than there are chips to run things, right? First of all, just thank you for taking so much time. I know you're busy. This was great. I really enjoyed the conversation and just, you know, learning from your experience.

If you were to kind of wrap up, what's one thing that you would tell a leadership team about how they can best adopt cloud and be prepared? Maybe it's one, maybe it's two, but what advice would you give them at the highest level?

Lukas Karlsson: I mean, I think they should take the lead, like instead of waiting for this to bubble up and then suddenly be something on their plate, like they should take the lead.

As you said, like without the buy in from the top, it's really hard to get everyone on the same page. And once it bubbles up from within the organization, it's hard to bring it to the right visibility to get that buy in after the fact. And so, folks who are planning to do this, just, you know, basically take the reins, get everyone in the room, and like, start to build.

Whether you call it a center of excellence or not, like you need to build this, um, set of people who work together and share resources to make this a collaborative success. And, um, so yeah, like build that and they will come.

Greg Ahlheim: Yeah, that's great. Well, listen, thank you, Lukas. I appreciate it, great conversation, and, you know, if you're up for it, maybe we can do this again another time.

Lukas Karlsson: Awesome, Greg. Thanks for having me. It was a pleasure.