Unleashing AI with OpenInfra: The Future of Open Source in Artificial Intelligence
OpenInfra Days at SCaLE 22x
Friday, March 7, 2025
- Architectural trends and best practices for deploying AI workloads with OpenInfra projects
- Collaboration between hardware and software vendors to advance open source AI solutions
- Identifying and addressing open source software gaps for running AI workloads effectively
- Real-world use cases and lessons learned from OpenInfra projects in production AI scenarios
- The role of open source in democratizing AI and fostering innovation across industries
Video and Transcript
[00:00:00] Allison Price: Hi, so my name’s Allison Price. I’m with the OpenInfra Foundation. I think I’ve probably talked to every single person in this room this week. So welcome to 5:00 PM. We’ve made it, most of us; Kevin made it. So today we’re gonna be talking about unleashing AI with OpenInfra, the future of open source in artificial intelligence.
So all the buzzwords, one headline. I am moderating the panel, so I get to pepper these folks with questions, some of whom read the questions, some of whom didn’t. You’ll know, you’ll see. But before we kick things off, I wanna make sure everyone gets a chance to introduce themselves. So I’m gonna pass it to Kevin.
[00:00:37] Kevin Carter: Hey everybody, appreciate y’all being here at five. My name’s Kevin Carter. I’m the product director for OpenStack at Rackspace. I’m excited to be here and talk about buzzwords.
[00:00:46] Todd Robinson: Supposed to read the questions.
[00:00:49] Allison Price: It’s okay. It’s gonna make for a better panel, right?
[00:00:52] Todd Robinson: I’m Todd Robinson with OpenMetal, and OpenMetal is an on-demand private cloud provider. We’re kind of unique in this space in being able to let people just have clouds and practice on them. So we have little tiny clouds, three servers.
They’re hyper-converged, with Ceph underneath providing the storage layer, but HA, properly set up as much as everything can be at that size. One of our things is that we allow people to come on, experience it, and learn it on a configuration that’s been kind of burned in, tried and true.
So hopefully those are helpful. If you do sign up for that, don’t sign up with a fake Gmail address and try to run Bitcoin mining on it. It’s not cool. But definitely, if you’re interested and you’re in that spot where you’re trying to get into the space, we’d love people to learn.
That’s why we built that.
[00:01:41] Armstrong Foundjem: Thank you so much. Armstrong Foundjem. I’m a researcher from the University of Montreal, at the engineering school called Polytechnique. I research mostly the trustworthiness of safety-critical AI systems. We are now working with aerospace companies like Airbus, Bombardier, and Boeing, and some other partners, to build certification for AI systems in those domains.
[00:02:07] Allison Price: Yeah. Awesome. Well, welcome. So, you know, one of the big things that we have all been talking about this week is the power of open source, whether it’s OpenStack, Kubernetes, or the millions of other projects. So when you’re looking at AI, which everyone’s trying to build a strategy around, what are the actual benefits, in your perspectives, of open source solutions versus the proprietary ones from either the hyperscalers or other vendors out there?
[00:02:31] Armstrong Foundjem: So that’s a very interesting question that I’ll try to look at from the open source machine learning world, and see how open source software engineering can bring value. I remember quite well, last year or so we investigated the Hugging Face platform. After doing some studies with the community, similar to the work I did with OpenStack, I could suggest to the Hugging Face community some modifications, some benefits, and best practices, and that work was published.
You can see it on the Hugging Face website; they accepted the paper from Senna last year. Now, one of the things that we learned is that software engineering is applied science. Machine learning can be theoretical, it can be mathematical, and most practitioners, be they computer scientists or mathematicians, can write code, code that can run, but not necessarily functional code that can be deployed.
Now, most people who write code that prints some output think they are software engineers; that’s wrong. We need collaboration. Open source is a community; we need people from documentation, people from different aspects, collaborating together.
That aspect is missing in most platforms run by modern-day startups and the like. Now, open source has these benefits: it has learned from diverse contributors across time. It has transparency, so you can trace things back to the code base. Proprietary vendors don’t give you access to their code base; they just tell you what you have to hear, on their own terms and conditions.
But open source has that transparency and accountability, and you can verify things. So security is also an important aspect that open source can bring to the AI world and the machine learning community, which I have found useful.
[00:04:47] Todd Robinson: Yeah, I think I would just add that I think it’s imperative that there is a really strong open source community around AI.
To have some of these giant companies just get bigger off of this technology, and off of the data that they’ve scraped off the internet or that they’ve pulled from you over time, I think that’s a dangerous thing unless there’s a strong offsetting community that embraces the open source aspect of what can be brought.
And I’ll also say, the creativity that the world at large can bring to these models or these businesses is different than if all you do is just grab OpenAI’s API. Okay, you could do that, but then where is your business model? Is your business model always up here, never down at the fundamental level? And so for me, I would say thank you to Facebook, which I wouldn’t normally ever say. But they’ve been a great open source contributor: they’ve done hardware in the data center, they’ve done JavaScript frameworks, and they’ve also kind of led in releasing some of their AI models. So I don’t know how particularly you would support them, but I wouldn’t go on Facebook.
But anyway, I would just say I think we should be thankful for some of those companies that have led the way and said, look, yeah, we do believe in that, and then for us to get in there, use them, and get familiar with them. And so, I know I talked earlier and there was a slide up about some of the stuff you’ve gotta learn, like distillation. You might have to learn it, and it’s kind of fun, to be honest; it’s a new challenge for us. So I would just say it’s imperative that we do get involved in it, and to have a community that’s already formed, like an OpenStack community or any of these others, I think is great.
[00:06:18] Kevin Carter: I would add, you know, when you’re talking about open source AI, you really can’t take away from the conversation that’s been happening around DeepSeek and everything that’s coming out around that. Whether or not you trust DeepSeek as a model, you have to look at what they’ve done: they’ve made it so that you’re getting these marked improvements in the models themselves, in how you run things and how you work with things.
Up until the point where they came out in the space, they didn’t ask for permission, they potentially distilled their models against other models, scraped all the same data or whatnot, but they came up with a novel way of processing the data that didn’t have such a reliance on proprietary hardware.
So they are coming out with these capabilities that are really empowered by the open source community. And if you take anything from that initiative, it is that AI is becoming more accessible.
Whether you’re running it on an M4 Mac, or the latest and greatest Nvidia GPU, or some AMX from Intel, or something else, the open source community is enabling those functions in new and exciting ways. And I feel like, without the open source community, you would just be paying the proprietary platforms to get the new GPU to run the new model that is being replaced every quarter.
On top of that, there’s the exponential power cost curve that comes with all of that. So I think you can look at open source and say, sure, you can do all of that, and we can use all of those crazy pieces of hardware with these new open models, but also it’s more accessible, right?
Whether you’re running it on an M2 Mac Pro or you’re running it in a world-class data center.
[00:07:58] Allison Price: I think the thing that was interesting from every single one of you, and I wish I’d kept count, but I didn’t, was that you each said “community” at least two or three times. And I think that’s why we’re all here.
We’re all members of the community, and I think that’s what really innately drives progress from an open source perspective. So I’m happy to see that thread, because I think that’s how NFV accelerated and open source became the de facto standard for NFV, and all these telcos run OpenStack and Kubernetes. And I think the same goes for the HPC thread with the community; the Scientific SIG is celebrating 10 years of collaboration this year.
And so one of the things that the foundation has kicked off, and this is me just plugging a few things, is actually an AI working group, because I think that we’re at the very beginning of AI. We haven’t seen anything yet, in my opinion. We’ve seen some awesome stuff and we’ve seen some real innovation, but there is so much more innovation to be had, and I think that the communities will accelerate that and exponentially grow it.
So if you’re not part of the AI working group, it’s something that we’re working on with our foundation members; please find me. I’m temporarily leading it while I learn about AI. And then we are gonna continue to work on it and publish a white paper this year about how open source is propelling AI faster and more effectively than proprietary solutions.
So it’s taking this panel and putting it in a white paper, coming to you soon. But you know, AI is massive, right? So I kind of wanna distill it down with the panelists to some of the actual technical requirements. You all have OpenStack experience, you have businesses that rely on OpenStack. So, you know, scaling compute, accelerated compute, scaling your storage: it’s not insignificant.
It’s a massive challenge. And so I’m curious, how can open source, and specifically OpenStack, best be used to address these challenges? Because those things have historically cost a lot of money, a lot of hours, a lot of team bandwidth. How can we as a community address that and make sure that OpenStack stays relevant in this growing market?
[00:09:57] Kevin Carter: So, I mean, yeah. At Rackspace we have a bunch of AI workloads that are already running. We’re running with H100s, P40s, a couple of different accelerators on different server platforms. All of our GPU-enabled workloads are running on an OpenStack platform,
and what OpenStack allows us to do is attach those accelerators to different VMs, different scopes, different sizes. Whether a customer needs one GPU, two GPUs, or four, I can schedule that intelligently. Now, I think one of the hardest parts is the scale of that, because you’re looking at some of these boxes where you’re getting four H100s in them and they’re using 14 kilowatts of power.
Historically, in the US at least, a lot of our racks are set up for something like 16 kilowatts max, so that’s one box per rack. And that’s difficult. You either have to come up with a new power strategy, or you have to come up with more efficient ways of hosting those workloads, or you have to figure out a way to, again, extract more value out of the hardware that you’re buying, right?
Because again, customers are looking for fractional GPUs or whole GPUs, but with a lot of the models today, fractional GPUs don’t make a ton of sense. They want the whole thing; they need all the horsepower that device can give them. In fact, they need many of them. So you’re trying to figure out better ways to use these accelerators and get more density out of them, even in a case where you potentially have one box per rack. Which is kind of crazy, and it looks totally ridiculous, because coming from the experience of running OpenStack public clouds, we used to run 22 hypervisors in a rack, and now you walk around a data center like that and something doesn’t look right. But nonetheless, it’s a problem, and that power draw is a problem.
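To make the flavor-based scheduling Kevin describes concrete, here is a minimal sketch of how whole-GPU passthrough flavors are typically defined with the OpenStack CLI. The alias name, flavor names, and sizes are placeholders, and it assumes the operator has already defined a matching alias and device spec in nova.conf on the GPU hosts.

```python
# Hypothetical sketch: create flavors that request one, two, or four whole GPUs
# via PCI passthrough. The "h100" alias is assumed to be defined by the operator
# in nova.conf ([pci] alias / device_spec); names and sizes are placeholders.
import subprocess

def run(*args: str) -> None:
    """Run an OpenStack CLI command and echo it for clarity."""
    cmd = ["openstack", *args]
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

for gpus in (1, 2, 4):
    name = f"gpu.h100.x{gpus}"
    run("flavor", "create", "--ram", str(gpus * 65536),
        "--vcpus", str(gpus * 16), "--disk", "200", name)
    # Ask Nova to pass through N whole GPUs matching the operator-defined alias.
    run("flavor", "set", name, "--property", f"pci_passthrough:alias=h100:{gpus}")
```

A VM booted with one of these flavors only lands on a host where Nova's PCI tracker can satisfy the requested GPU count, which is the "schedule that intelligently" part of the story.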
[00:11:46] Todd Robinson: Yeah, I’ll definitely agree with Kevin on that. You really gotta understand the power consumption of some of these things. But also, for us, there’s no perfect answer for sure. We don’t have a specific strategy; we’re forming our strategy now, and this is how we did it. I’ll just share how I feel we should be going forward a bit when it comes to OpenInfra and how we can learn this as we go. So in our case, when we don’t know enough about something, we get a couple of workloads.
We pick some businesses that are friendly to us, we share some information with them, and we get some workloads going to make sure we understand what’s going on. Then we start to say, all right, which of these can we focus on, which ones can we productize? We’ve been doing that for a year and a half or so now, and we’re now down to the spot where we’re picking, okay, what are the racks actually gonna look like?
What’s the gear that’s coming? The H100s are 25,000 bucks apiece, right? So you really don’t wanna mess that up; if you buy a whole bunch of them and get it wrong, that’s not good. But the thing that may help us as an OpenInfra community is trying to get more visible use cases.
What are people doing with it? That then drives where we need to focus our energy so that we can decide. And actually, in some cases, because we provide both bare metal and automated OpenStack, sometimes the customer just says, gimme the bare box, because they’re going to fully load it.
That one application they’re running is gonna consume all the GPU resources, the whole shot. And so Ironic comes in, so OpenStack is still in there, Ironic is still doing it, but the workload isn’t being orchestrated by OpenStack; the hardware is just being provided by it.
So we haven’t seen a lot of that. We know how to do the MIGs and we know how to do the time slicing, but that actually hasn’t come up that much. People are using the CPU because it’s really simple, and like I think I said in my talk, you probably have CPU in your cluster already; almost everybody does.
Actually, I think it was Rackspace who did this: you know that setting, it’s like 12 or 16 to one or whatever, the overcommit ratio, right? But people still go out there and they sell, you know, four vCPUs with eight gigs, and so you burn up your RAM and you end up with leftover CPU.
So use it. In some cases it’s just recognizing you can get into it, start using some of the models, get familiar with the lingo, start getting close to your customers, so they’ll tell you, this is what I’m trying to do with it. And it goes all the way from... we’ve got one coming in that’s trying to get costs under control for natural language processing.
They’re using AWS Transcribe, and it’s just like, whoa, this has gone absurd, right? So now they have to figure out how to fix that, and that’s not an OpenStack thing. But it does teach you, okay, maybe this is the space we’re gonna be in, this is how we’re gonna do it. I don’t know if I answered that very well, but yeah.
[00:14:13] Armstrong Foundjem: Yeah, so like the other panelists said, I think I can build on that by saying that a lot of research institutions right now run on OpenStack, and it is time for OpenStack to get closer to those people and find out the challenges they are facing with some of their workloads.
[00:14:52] Armstrong Foundjem: Now, AI in general is a name that can be overused. We have to know each particular aspect of AI. For example, everyone is talking about generative AI. Not everyone now has the infrastructure capability to do generative AI, but we need to get involved. We could start with other deployments. There are so many algorithms, and big institutions, doing things differently.
We could start from a base, a small workload, use cases, and then we start growing. But above all, we need to learn and understand from the end users. For example, in Canada there are so many: Compute Canada and many others run on OpenStack. They may be multi-cloud, but OpenStack is one of the clouds.
They have been using OpenStack for a very long time and they have seen the value, so we cannot sit back and just say, okay, they’re using it. We have to ask: what are the challenges? Because I remember, around 2021, a researcher was about to spin up some workloads for an algorithm he wanted to develop.
But the communication between this researcher and the technical team running the clusters did not go well, and they had to abandon it and use services from Amazon. It wasn’t something that could not have been done better. That level of communication should have made him come closer and say, okay, I’ve had this challenge.
Come to the community, like this AI community that we are about to start. There are so many other communities that you can partner with. MLCommons is there; they’re running HPC workloads. You have so many universities that are actually doing work in these areas. I remember even last year I was teaching a course on cyber-physical systems, AI cyber systems.
One of the platforms we were using was Amazon, which gave us a limited but free account. At some point we ran into serious difficulties, because when we wanted to train some models with the limited resources, we didn’t even have the flexibility to make any adjustment or modification. Half of the class said, why can’t we use OpenStack?
At that point I was convinced. I said, okay, write a petition so we can take it to the school, but it never went through, because it was already past the beginning of the academic year. Now, for people who are working on OpenStack, I think you should collaborate more with research institutions, because not all the big models are effective.
Start with small, powerful models, like the ones that are coming out right now; we have seen what smaller models can do. I just worked on a project where we’re building a generative code model. We wanted to replicate the work that OpenAI did back around 2021. At some point we ran into difficulties because we could not replicate that work.
We don’t have those resources, so we improvised with smaller models and bypassed that, and we got similar results in so many areas. That is where I think OpenStack can really come in handy.
[00:18:27] Kevin Carter: So, kind of tying it back to what you had talked about, the open source community improving the AI workload space, and building on what you were just saying about the cross-functional groups we can participate in: one of the things that OpenStack does extremely well is expose hardware, accelerators, and virtual functions. Whether you’re working on your telecommunications workloads, you’re doing your 5G or 4G VNFs, or you’re just trying to attach to an FPGA, the capability is already there, because open source and OpenStack have done the work to make that accessible. And why that’s important is exactly what you were just saying: let’s start with smaller models.
That doesn’t mean you have to start with an H100, or an A30, or a Flex GPU, or the latest and greatest MI300 from AMD. You could start with on-chip or onboard accelerators. And because OpenStack has those virtual function capabilities that you can tie into and schedule intelligently against to get higher density, you can do really incredible things with this platform, all because of the work that you have all done.
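As a concrete illustration of the virtual function exposure Kevin mentions, here is a minimal sketch of attaching an SR-IOV VF to a VM with the OpenStack CLI. The network, flavor, and image names are placeholders, and the hosts are assumed to already be configured for SR-IOV (PCI device specs, the SR-IOV NIC agent, and so on).

```python
# Hypothetical sketch: boot a VM with an SR-IOV virtual function attached.
# Network, flavor, image, and resource names below are placeholders.
import subprocess

def openstack(*args: str) -> None:
    subprocess.run(["openstack", *args], check=True)

# Create a port backed by a VF on the SR-IOV-capable network.
openstack("port", "create", "--network", "sriov-net",
          "--vnic-type", "direct", "vf-port-0")

# Boot a server with that port attached; Nova places it on a host
# that actually has a free VF matching the request.
openstack("server", "create", "--flavor", "accel.medium",
          "--image", "ubuntu-24.04", "--port", "vf-port-0", "inference-vm-0")
```

The same basic pattern, a port or flavor property naming the hardware resource you want, is how other accelerators get requested and scheduled against.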
[00:19:36] Allison Price: I think you all set up my next question. And before I ask it, I wanna say, this is totally open; I will run the mic to anyone who wants to ask a question or contribute a perspective, because we’re all learning together. So please raise your hand at any point and we will figure out how to work it in.
But something I think you all touched on is, you know, there’s work to be done. I was actually talking, and I’m gonna call him out, Goutham is here, he’s the chair of the OpenStack TC, and we were talking last night at the 15th birthday happy hour about this AI working group. One of the most strategic outcomes of it that I’m very committed to is that it’s not just us talking about what people are doing and what needs to happen, but connecting back to the upstream community and saying, these are the gaps, these are the opportunities. Almost exactly one year ago, Caracal, 2024.1, came out and introduced vGPU live migration.
It changed the game for a lot of organizations who were building their AI strategies. I recently talked to a company in Vietnam that has it in production, and their customers are really happy. So one of the questions I have for y’all, it wasn’t in the preparation, but I’m curious: what does the upstream OpenStack community need to be focusing on? What are some of the software gaps that y’all have experienced or identified that we need to start thinking about long-term solutions for, if we want our community, our software, and our collaborations to remain relevant and used?
[00:21:01] Todd Robinson: I’m gonna just borrow from philosophy in that case, because I think it goes back to this one, which is: we don’t know yet.
Honestly. So I would say we need a pipeline of listening, right? Like, what are people doing? I’ve been into this stuff for a while; I used to introduce Hugging Face to people and they’d go, that’s not a real thing. No, that’s a real thing, right?
But it takes us a bit to get used to. This is a new community; you gotta learn the new lingo, you gotta figure out what the strengths and weaknesses are. Maybe we’re gonna be a hardware enabler, we enable other large companies to come into this space to compete against Nvidia or something like that, because there’s flexibility and we actually will do that, right?
The big clouds are like, man, I got my model fixed, I’ve got my discount, that’s my thing, and I just move on. Maybe they’re gonna go build their own; it wouldn’t surprise me at some point, or maybe they’re doing it already, if somebody else knows. AWS or somebody like that has the clout to do it, and they’re not gonna pay the Nvidia prices.
But maybe we fit in like that. That one’s a tricky one. So I would say, yeah, I’d love other people’s opinions to learn where we could focus. And maybe there’s something also, like I know with Zuul and such, where companies are donating hardware and access.
Maybe we could do that too; you could ask companies like us to do that. I don’t know if that’s right or helpful, but it probably could be. So yeah. Who wants to go?
[00:22:22] Kevin Carter: I don’t actually have anything I would add to what you said; I think it was really perfect.
We need to engage in this period of listening. I think as a community we also need to figure out, you know, what Brian Lillie, one of our leaders, used to say: be less of a propeller head. And figure out how we bring the technology to these people so that they can use it in an interesting and exciting way. If you have ever tried to work with virtual functions in Nova, it is a special kind of hell when you first get started.
But it’s really easy to do once you figure it out; figuring it out is that quantum leap you have to take. So I would say that we need to figure out a better way, or at least a nicer way, to more easily enable these functions, so that it doesn’t require all the propeller heads to run the platform.
[00:23:14] Armstrong Foundjem: Yeah. So I want to look at it from a perspective where, since you mentioned the upstream: upstream, to me, should take proactive measures. For example, Ollama is an open source tool for running AI models, and the benefit of OpenStack is that it is modular; you don’t have to run all the projects to get it spun up. Go closer to Ollama.
Look at your interface. Can you integrate this with Horizon? Can you integrate this with some of the existing projects to bring value? Because the whole point of machine learning is that if you can collect massive amounts of data and process them, you then try to make the visualization of that data meaningful to the end user.
No matter how popular Hugging Face is today, it still needs help in so many areas that community members don’t want to dedicate themselves to. They are running behind or ahead of the train, and that creates gaps where the upstream can say, okay, we have experience building good APIs; we can query a multimodal AI system and then surface it on OpenStack dashboards. These are opportunities that the upstream can take right now, and you’ll add value and make OpenStack more attractive. And I will still go back to the same premise I made earlier: go closer to research institutions.
They need help from OpenStack and they get so little, and some of them are frustrated. Yeah.
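A minimal sketch of the kind of Ollama integration Armstrong is pointing at: a thin client that a Horizon panel, or any other service, could call. The host, port, model name, and prompt are placeholders, and it assumes a local Ollama instance listening on its default port.

```python
# Hypothetical sketch: query a local Ollama instance over its REST API.
# Model name and prompt are placeholders; Ollama listens on 11434 by default.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(prompt: str, model: str = "llama3") -> str:
    """Send one non-streaming generation request and return the model's text."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask("Summarize today's instance error logs in two sentences."))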
[00:24:57] Allison Price: Well, I think something you hit on, and it’s been a conversation we’ve all been having throughout this week and in our ongoing jobs, is that OpenStack is one open source project.
We’re one community, but there’s a lot out there. So what other groups should we be collaborating with? Who else should we be learning from? Because I think if we just learn from within, it becomes really insular and we’re not actually expanding our knowledge and what our technology should do. So from y’all’s interactions, we’ve mentioned Hugging Face, we’ve mentioned Ollama, we’ve mentioned DeepSeek: who should we be reaching out to? Who should we make sure is actually coming into the OpenStack community, or that we’re going to them, so we’re learning from each other to ensure maybe integration, maybe just positive learnings and alignment? What do y’all think?
Yeah, yeah.
[00:25:46] Armstrong Foundjem: I think one way of learning in this modern era is to bring something out: I have been trying a model, this is what I have done, but I ran into these difficulties. Everybody wants to know that you are bringing some influence or some affirmation to what they are doing.
If you go to Hugging Face, they have one section of their online portal where they keep track of every paper that has been published based on the Hugging Face community. That is one area where I think OpenStack could invest a little bit of time. Researchers like to see their work out there, and once they know that you are recognizing them, others see it too: oh, I saw your work; what is happening in that community?
There are so many ways of bringing in these incentives, these ways of attracting people. But above all, most people are building models, and as I mentioned earlier, they still have user interface issues, and then there is explainability, which my talk today also mentioned. It is not just about building a model.
Once that model is out there, that is when the real engineering work begins. You have an engineered system. This is where we can come in and say, okay: take the best model that exists on this planet. After a period of time, that model will degrade in performance. It’s natural. That is why in machine learning, after a period of time, they have to retrain the model.
If not, its properties degrade and the model starts giving wrong results. Knowing those basics, or I could say knowing all those limitations of machine learning, we already have OpenStack with so many projects. Now start looking into some of those projects and say, okay, we have these projects that do monitoring, that do observability.
That already aligns well with Kubernetes, so scaling would not really be a big issue for the workloads that small and medium-sized enterprises are running. We may not start by running very big workloads and things like that; make this impact by looking for the gaps.
Then bring that value to the people. Everyone will want to see it: oh yeah, you did this; I’ve been thinking about this, but I don’t have time. Now that you have started it, let’s keep on with this discussion. A lot of the kinds of things I’m mentioning are happening out there. Most people have good products, but the user interface or the experience is sometimes crappy.
Even OpenAI; I don’t like the interface. But then what can you do? So some of these innovative things can start from the upstream, and then people start making contributions. Then you see it becomes a service where you provide functionality to a multimodal community, and you stay agnostic. Yeah.
[00:29:02] Todd Robinson: I’ll follow up on that, because this concept, which I think is what you were saying, and this is in my own head, so it may not be at all what you said, but it’s 5:00 PM, it’s okay. What I was thinking about is that part of our problem sometimes is sharing the abilities of OpenStack.
So I’ll share a very specific story. We have a customer, one of our favorite customers, very large for us. And they were struggling; they were already running a bunch of stuff on OpenStack, got a bunch of clusters in different locations. And they came in and they were like, they’re running Databricks on top of AWS or Azure or something like that,
and they’re just like, this is killing me, but I know I can’t move it over here. And I was like, oh, okay, why do you say that? And this guy’s cool, got an Isaac Asimov t-shirt and everything. So I’m like, yes, you can, you should know that. So I’m thinking maybe the connection there is us being more vocal, publishing: these are the workloads that are running on OpenStack, and we can do the AI ones. Of course that’s part of it. But also, one of our engineers is super smart, and we convinced him that, hey, you could build a Databricks, the true open source version of Databricks.
And so he went and did that. He took, I don’t know, six weeks or whatever, and fired up the traditional stack: Debezium talking to Kafka, Kafka dumping into Delta Lake, Delta Lake being read by Spark, and then the big Spark clusters with a bunch of stuff connecting to them, right?
It’s all pure open source, and it’s all a hundred percent on OpenStack. But we had to teach the customer that, no, absolutely we can do that; in fact, these are the fundamentals underneath. And as soon as they understood that, they were like, oh, now I can do this; oh, I don’t need to do this other thing that I used to have to do.
It saves my people a whole bunch of time to actually process the data through there; my batch processes are better, et cetera. So, I don’t know, maybe that’s the idea: we take a little more time to publish what OpenStack can do and prove that it can do it. We all in this space already know it, but you gotta share it.
Yeah, yeah, definitely. Oh yeah, this is interesting, right?
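For readers who want to picture the stack Todd describes, here is a minimal sketch of the Kafka-to-Delta-Lake leg using Spark Structured Streaming. The package versions, Kafka topic, and storage paths are placeholders, and this illustrates the general pattern rather than OpenMetal's actual pipeline.

```python
# Hypothetical sketch: land Debezium change events from Kafka into a Delta table.
# Package versions, topic names, and paths are illustrative placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("cdc-to-delta")
    .config("spark.jars.packages",
            "io.delta:delta-spark_2.12:3.1.0,"
            "org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Read the raw change events Debezium has published to Kafka.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")
    .option("subscribe", "dbserver1.inventory.orders")
    .load()
)

# Append the events to a Delta table on object storage; Spark jobs downstream
# can then read the table in batch or streaming mode.
query = (
    events.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")
    .writeStream.format("delta")
    .option("checkpointLocation", "s3a://lake/checkpoints/orders")
    .start("s3a://lake/raw/orders")
)
query.awaitTermination()
```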
[00:31:06] Madhavi: I’m connecting the dots here. So you asked the question, who should we go to? I’m ex-Intel, so I was living and breathing accelerators, and now I’m on the other side; I’m working on the customer side, coming from that space.
I wish I’d had access to all the people that I’m talking to now when I was there. So you speak about, okay, I’m an open source community, I have all these great things that I want to do, you speak about power consumption, bringing all the power to real people and so on. Hugging Face started with the same thing, right?
They want to make sure that AI models are accessible, that people can go build them for their own applications and such. But they are in the AI domain, and I see OpenStack in the infra domain. The real connection will happen when these two talk. The reason I say that is there is a lot of corporate stuff going on: if I want to go access
the AI models, it’s all Nvidia underneath. They have amazingly created that stickiness. I love Nvidia, I mean, they’ve made some bold, amazing decisions which have created that stickiness to their product.
But I wish it was not just the GPUs alone, right? Because if you look at it, we have so much research showing that even if you take a CPU-based workload, only about 30% of the CPU is being utilized at a given time, and the type of workload keeps changing. So there is a shift in how much CPU is being consumed, and how much power, how much cooling, and all of that.
And then AI is becoming bigger and bigger, and there’s so much marketing being done, like AI equals GPU equals, you know, all that, right? So how do you even make people go and look at other accelerators?
[00:33:16] Todd Robinson: I think it’s just gonna take some time. You mentioned AMX before, right?
That came in the fourth generation, so there are already some accelerators in the cores; it’s attached to the core. You probably know it better than I do. Then the fifth generation made it better, and the sixth generation is now also specifically addressing that, and they separated it into E-cores and P-cores.
So yeah, it feels like there are gonna be multiple solutions, and that’s why I say try it on the CPU.
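A quick way to check whether the kinds of on-core accelerators Todd mentions are actually exposed on a given Linux host: the flag names below are the ones the kernel reports in /proc/cpuinfo, and the list is illustrative rather than exhaustive.

```python
# Sketch: report whether AMX / AVX-512 / VNNI flags are present on this host.
from pathlib import Path

WANTED = ["amx_tile", "amx_bf16", "amx_int8", "avx512f", "avx512_vnni"]

cpuinfo = Path("/proc/cpuinfo").read_text()
flags = set()
for line in cpuinfo.splitlines():
    if line.startswith("flags"):
        # The flag list is identical for every core, so the first entry is enough.
        flags.update(line.split(":", 1)[1].split())
        break

for flag in WANTED:
    print(f"{flag:12s} {'yes' if flag in flags else 'no'}")
```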
[00:33:44] Madhavi: It is, it is, but the real customers don’t have access to it, because they are looking at, am I looking at my CAP workflow or my Orb workflow? So it goes back to your point of, hey, I’m gonna talk to my customer, understand their workload, make sure that I put it on an OpenStack platform so it actually works.
Right. So then how do you partner with somebody? And it goes to this point of let’s start small, right? What kind of smaller AI workload can you take? And then you have access, like Kevin said: hey, you can access a CPU or an accelerator or anything you want. And that’s probably something where you create POC-type platforms for customers, with community help,
for saving wallet and fuel.
[00:34:34] Todd Robinson: Yeah, and I think one of the points I’m trying to make is that OpenStack already does it. It’s already handling CPU efficiently, so you can run these models directly in a VM today; you could just go do that. So that’s an entry point, and I would count on all of the manufacturers continuing to improve that, because Intel wants to get into the business, of course. Their GPUs didn’t quite get the business they wanted, so put it in the CPU; that makes sense on their part. AMD I’m less familiar with, but I’m sure they’re following a similar pattern. Yeah, they have some...
[00:35:08] Kevin Carter: I mean, the AMD GPU capabilities, with the MI300s, the MI210s, et cetera:
they have a lot of the same features and functionality. What doesn’t work great is the software; the hardware is incredible, if you can find the right platform to run the workload against it. I think where Nvidia has a leg up on pretty much everybody is the fact that everybody supports CUDA.
And this is where I think DeepSeek kind of exposed that you don’t really need that if you construct your software in a slightly different way. So this is where, if the open source communities can rally behind that mantra, that idea that it isn’t CUDA or die...
Then yeah, I think if the open source community can embrace the fact that CUDA is not the end-all, be-all, it might be something great, but it is not something that is required, then I think we can get to the point where people are using accelerators.
People are using specialized instruction sets, people are using AVX or FPGAs that are available, and taking advantage of all of the incredible features and functions that a platform like OpenStack natively provides.
[00:36:28] Ken Crandall: Can I riff on that for a second? I was sitting here thinking the whole time, and as soon as you brought that up, it kind of catalyzed in my head. Two things really resonated
with me. One is when you said that the hardware’s great but the software part isn’t, and second is that it doesn’t have to be CUDA. And I think that really points to, personally, as someone who’s been in open source for 20, 30 years now, what I think the opportunity for OpenStack is: how do we build a framework? We’re talking about AI now, but it could be other things, the same way we built a framework for NFV, the same way we built a framework for pluggable networking back in the early days with Neutron, well, Quantum, sorry, and ML2. I think that’s the level of thinking where I really would hope that OpenStack can take leadership: hey, let’s build an AI framework that can be multi-model, multi-technology, but then when someone comes in, there is that user interface, there is that way to plug in a model.
There is that way to run the learning or the retraining, and that way it’s accessible, right? It’s putting it in front of the end users, and for the people who are providing the technology, it can be done through Intel processors, or through Nvidia and CUDA, or through AMD, or through... I was just reading about someone, I think it’s Zeus, who’s trying to come up with a new GPU that is like one fifth the cost of Nvidia.
And granted, I think we’re gonna see more stuff like that, the DeepSeek type of thing, outside of the box, where it’s, hey, if I can deliver 80% of the GPU at 20% of the price, then maybe it’s not as good as CUDA, but people will then build models around it, take advantage of it, and completely upend the economics.
So just as a member of the community, and as someone who’s done this before, I think that framework is really where we can excel.
[00:38:23] Armstrong Foundjem: Just to add to what you were saying, and to the previous question: I’d say the GPU is great, the CPU is even better, but do we need a GPU in all cases?
Now, if we are doing high-level computations with massive amounts of data, okay, at that point we can be talking about GPUs. But if our workload is mostly, let’s say, natural language processing or any aspect of language models, we have tons of pre-trained models already available where a CPU can do a good job.
Fine-tuning: my laptop here is not even the latest. For my presentation I was doing fine-tuning on the OpenStack code base, some aspect of it, not the entire code base. The model that I started with was a 7-billion-parameter model from Hugging Face. At some point it was frying my system, and I said, why should I kill this system for this work? I did some more digging and investigation and found a 66-million-parameter model, which did incredible work and gave me the best performance I could imagine. I don’t have a GPU on my laptop, I have a CPU, and when I saw the result I was so pleased.
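A minimal sketch of that kind of CPU-only fine-tuning with Hugging Face Transformers. The model choice (distilbert-base-uncased, roughly 66 million parameters) and the toy dataset are assumptions for illustration, not the setup Armstrong actually used.

```python
# Hypothetical sketch: fine-tune a ~66M-parameter model on CPU only.
# Model and dataset are placeholders, not the speaker's actual setup.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # roughly 66M parameters
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# A tiny illustrative dataset of log-style snippets with binary labels.
data = Dataset.from_dict({
    "text": ["nova boot failed with no valid host",
             "neutron port created successfully"],
    "label": [1, 0],
})
tokenized = data.map(
    lambda ex: tokenizer(ex["text"], truncation=True,
                         padding="max_length", max_length=64)
)

args = TrainingArguments(
    output_dir="cpu-finetune",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    use_cpu=True,   # force CPU even if a GPU is present (recent transformers)
    logging_steps=1,
)
Trainer(model=model, args=args, train_dataset=tokenized).train()
```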
Most people want to see the bigger, the better, but many research institutions, from Meta to Stanford and MIT, have shown us that smaller models can do great work. And just to provoke things further: the way we were thinking of software engineering 1.0 is moving fast toward 3.0 with agentic AI, which means we’ll need smaller models to run on agents, and those agents will become the new... what name should I give them?
We will be managers of those agents. Right now we are at the level where programming is no longer the main job of a software engineer. Many code generation models exist now that can generate good functional code. The work of a software engineer will be to review that code and run test cases, and then AI agents will spin up and do most of the jobs.
We don’t need those fancy large models to run on agents, just smaller units, and that is where OpenStack really comes in handy. Just to conclude on this topic, look at Amazon. The CEO started by selling books; we all know that. At some point, we know from the storyline, he was almost like, what am I going to do when business is going down?
With all the massive amounts of data collected over time, the internet was at his doorstep, and from all his intelligence and inquiry he spun that data up into something that is so massive today. Don’t waste your many years of experience, innovation, and product building. Look at that vision of Amazon and say, okay, OpenStack has been doing this business.
We have a well-structured community and many have learned from us. Just leverage it now in a new direction and think about agentic solutions. It will bring many breakthroughs, I believe.
No, I was just gonna say OpenVINO. Look that up if you’re interested in doing inference on CPU. Help me out, there are a couple of other ones I should know that were on a slide deck; I gave somebody else’s presentation and didn’t know what was on that slide. But OpenVINO. oneAPI, yes, thank you, exactly.
So OpenVINO and oneAPI. Just look those up if you’re interested in that. And again, it lets you get involved without having to say, oh my God, I gotta get an H100, which is ridiculous, right? And one rack to run your one server. Yeah. Go ahead, sorry, go ahead.
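A minimal sketch of what CPU inference with OpenVINO looks like, assuming you have already exported a model to OpenVINO IR (or ONNX); the model file name and input shape are placeholders.

```python
# Hypothetical sketch: run a single synchronous inference on CPU with OpenVINO.
# The model file and input shape are placeholders; export your own model first.
import numpy as np
import openvino as ov

core = ov.Core()
print("Available devices:", core.available_devices)  # typically includes "CPU"

model = core.read_model("model.xml")          # OpenVINO IR; ONNX also works
compiled = core.compile_model(model, "CPU")   # target the CPU plugin

# Fake input matching the model's expected shape (placeholder).
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

result = compiled([dummy_input])              # single synchronous inference
print(result[compiled.output(0)].shape)
```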
[00:42:44] Allison Price: No, no, no. Well, I think the interesting thing, one of my biggest takeaways, is that there are still a lot of unknowns.
There are things that we need to figure out, that we need to learn. But to Ken’s point, which I wanna draw back to, we also need to be leaders in developing those materials, developing those learnings, getting the right people in to share their stories. And that’s exactly what the OpenInfra AI working group is poised to do.
We are starting with OpenStack, given that there are some proven production scenarios already out there. We’re trying to schedule architectural show-and-tells that we can publish and open up for public Q&A, and say, you know, there are a lot of architectural decisions that go into supporting an AI workload.
Madhavi highlighted some, the panel highlighted some, and I think hearing what’s working, hearing what the challenges are, and then funneling that back to other operators, and more importantly, in my opinion, to the upstream community, ensures we’re all working together. That’s what the OpenInfra community, I think, is really poised to do right now.
And we’re at a really opportune time, because we are still at the beginning, to be leaders in this space and make sure that our projects integrate in the right places and remain relevant. But like I said, we ran over by four minutes already. I do wanna thank my amazing panelists; I thought it was a really great discussion.
Yeah.
And actually, something that feels strange to say: I wanna take a picture of how many people are in a room at 5:49 on a Friday afternoon. So I wanna also thank y’all for being here; I hope it was interesting. I know this is actually about to be the closing remarks for the event. So as someone from the OpenInfra Foundation, I just wanna thank all of y’all for being here this week. I’ve had incredible conversations with a lot of you. This is a good bookend to it, but we gotta keep these conversations going offline as well.
Related Content
Accurately measuring AI model performance requires a focus on tokens per second, specifically output generation rates. Understanding tokenization, model size, quantization, and inference tool selection is essential for comparing hardware and software environments.
This article highlights OpenMetal’s perspective on AI infrastructure, as shared by Todd Robinson at OpenInfra Days 2025. It explores how OpenInfra, particularly OpenStack, enables scalable, cost-efficient AI workloads while avoiding hyperscaler lock-in.
At OpenMetal, you can deploy AI models on your own infrastructure, balancing CPU vs. GPU inference for cost and performance, and maintaining full control over data privacy.