Christopher Bergey, EVP of Arm's Edge AI Business Unit, breaks down the impact of edge AI on modern computing. Learn how Arm is powering the next generation of smartphones, PCs, cars, and wearables through decentralized intelligence.

 
 
 

CHRIS:

The Arm architecture has existed now for almost 30 years. That started from early investments from companies like Apple, and early adopters like Nintendo and Nokia back in the '80s and '90s made it the stalwart that it is; the smartphone revolution and all of that really started around Arm. That's what's driven us to where we are today. We have big CPUs and little CPUs, and we're actually moving workloads back and forth, because certain times you need the performance and certain times you don't. That's really the way these devices work. To your doorbell example, you're looking for maybe some motion, you're looking for something, and then once you trigger that event, okay, now let's fire up some more computing elements.

CRAIG:

In business, they say you can have better, cheaper, or faster, but you only get to pick two. What if you could have all three at the same time? That's exactly what Cohere, Thomson Reuters, and Specialized Bikes have since they upgraded to the next generation of the cloud, Oracle Cloud Infrastructure. OCI is the blazing-fast platform for your infrastructure, database, application development, and AI needs, where you can run any workload in a high-availability, consistently high-performance environment and spend less than you would with other clouds. How is it faster? OCI's block storage gives you more operations per second. Cheaper? OCI costs up to 50% less for compute, 70% less for storage, and 80% less for networking. Better? In test after test, OCI customers report lower latency and higher bandwidth versus other clouds. This is a cloud built for AI and all your biggest workloads. Right now, with zero commitment, try OCI for free. Head to oracle.com/eyeonai. Eye on AI all run together: E-Y-E-O-N-A-I. That's oracle.com/eyeonai.

CHRIS:

It's great to be here, Craig. Thanks for inviting me. My name is Chris Bergey. I'm a senior vice president and general manager of the client line of business at Arm, which means I focus on all of the rich edge devices that Arm is so prominent in: things like smartphones, but also PCs, where we're making quite a bit of inroads, and all parts of your house, whether that's your TVs, your smart speakers, all kinds of rich endpoints that are powered and quickly becoming AI-enabled. I think that's what we're going to talk about today, Craig. A little bit about my background: I've spent almost 30 years in semiconductors at various big companies. I started out of school at AMD, moved to Broadcom for almost a decade, and did some startups in between. It's been a long run, and I can't believe it's gone by this quickly, but semiconductors have never been this cool; it seems like governments and everyone really care about semiconductors now. So I guess I was very fortunate in my career choice.

CRAIG:

Yeah, it's kind of funny, isn't it, how things happen that way. And particularly semiconductors at the edge; that's where everything is at right now. So, Arm, and we're talking about chips for AI inference: Arm's v9 edge AI platform has a couple of new processors designed to enable complex AI models. Can you talk about those? Or is there something even more recent you want to talk about?

CHRIS:

Well, I think that's a good starting point, Craig. We've actually been on quite a journey in the AI evolution, not just at the edge; we're also a big part of the infrastructure build-out in data centers, which really leverages our business model and capabilities, and we're doing quite well there too. But I'd like to speak mostly about the edge today. So we created Lumex, our Lumex platform, which is really aimed at high-end smartphones and high-end multimedia experiences at the edge. We launched that back in September, and already several chipsets have been launched, and now phones are being launched by different leading phone manufacturers based on Lumex. What's cool about it is that it's the addition to the v9 platforms, as you mentioned, that starts rolling out SME, the matrix extensions to the v9 architecture, and starts pulling that into the ecosystem to let people really take advantage of AI at the edge in these devices, and do so with the traditional CPU programming model. Obviously there's a lot of discussion around accelerators. Accelerators are great; they're quite dominant in the data center and are also finding their home at the edge. But it's always a balance: accelerators have great metrics around, say, TOPS per watt, but they're definitely more challenging to program than, let's say, a CPU. The reality is we talk about heterogeneous computing because I think the answer is everything. In the AI world it's about CPU capabilities, GPU capabilities, maybe dedicated accelerator capabilities, and then, quite frankly, memory bandwidth. AI is putting stress on that whole system.
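[To make the SME point concrete: below is a minimal sketch of the kind of kernel SME is built to accelerate, a low-precision matrix multiply, written here in portable C. On an SME-capable Armv9 core this operation would typically be reached through compiler support or a library such as Arm's KleidiAI rather than hand-written code; the sketch only shows the shape of the workload, not Arm's implementation.]

```c
/* Sketch: an int8 matrix multiply with int32 accumulation, the class
 * of operation SME targets. Portable C for illustration only; real
 * SME use goes through libraries, intrinsics, or the compiler. */
#include <stdint.h>
#include <stddef.h>

void matmul_s8_s32(const int8_t *a,   /* M x K, row-major */
                   const int8_t *b,   /* K x N, row-major */
                   int32_t *c,        /* M x N, row-major */
                   size_t m, size_t n, size_t k)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            int32_t acc = 0;
            for (size_t p = 0; p < k; p++)
                acc += (int32_t)a[i * k + p] * (int32_t)b[p * n + j];
            c[i * n + j] = acc;
        }
    }
}
```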

CRAIG:

Okay. Let's back up a little for listeners who aren't deep in the chip space; I'd like you to talk a little about v9 and what that is. But when you talk about heterogeneous computing, and the programming language: NVIDIA, one of the reasons it has such a strong position, is that very early on it developed a very user-friendly programming language called CUDA, and everyone has adopted it for programming AI on GPU processors; it has become the standard. So a lot of new chips are arriving, but you need to learn a new programming language, and that's a barrier. In this heterogeneous setup, particularly at the edge, what I understand is that you still use a traditional processor with a programming language you're familiar with, and then there's some sort of conversion that sends some of those workloads to the edge chip. And it could all be packaged together. Is that right?

CHRIS:

Yeah, I think so. You've got a lot wrapped up in there, so let me start with v9. The Arm architecture has existed now for almost 30 years. That started from early investments from companies like Apple, and early adopters like Nintendo and Nokia back in the '80s and '90s made it the stalwart that it is; the smartphone revolution really started around Arm, and that's what's driven us to where we are today. v9 is an architecture we launched about five or six years ago, the next generation of the architecture, focused on a few things: one was security, another was performance, and the third was AI. We are very much at the forefront of looking at what these next-generation systems are going to require, and that's really what v9 is about. At this point, a large percentage of both iOS and Android handsets ship with v9 CPUs, and you'll see that continue to accelerate over the next couple of years. v9 is also proliferating across the other markets Arm participates in, whether that's data center, automotive, or AI IoT. So that's v9.

Now let's talk about the programming comment you made. You're right, CUDA is an amazing language. I've actually had the pleasure of working very closely with Ian Buck, who was basically the Stanford student that started it all; he has obviously been a very important part of the NVIDIA story and where they're at today, and he's still there. CUDA is great for programming GPUs, and especially for taking advantage of some of the capabilities in the accelerators there. But it is an accelerator language, versus a CPU language. You basically have to start making things like driver calls and move your workload off the CPU. Now, that makes a ton of sense if you're going to get a big uplift. Ian and I actually used to work on quite a bit of high-performance computing for big government labs, and we had these rules of thumb: you'd almost want a 100x uplift when you move a workload over. At a minimum you'd want 10x, but you'd really want more, versus the convenience of keeping it on the CPU, because you have to load and then stream that workload onto an accelerator. So as you think about this heterogeneity, you're really moving workloads around based on what's required. Is it latency? Is it high performance? Is it lower power? All those kinds of things. So what the ecosystem is doing, and what Arm is focused on, is making all of those pieces work: making the CPU as performant as possible on AI workloads and super developer-friendly, which it is, and also moving some of those workloads to GPUs. Many people don't know this, but Arm is actually the highest-volume GPU out there.
We've shipped over 9 billion GPU cores, because of mobile handsets and how much of the mobile industry is based on Arm GPUs. And we also support accelerators through extensions of our architecture; many of those accelerators hang off things like CHI buses that are part of the Arm architecture.
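[The 10x-to-100x rule of thumb Chris mentions can be sketched as simple arithmetic: offloading only wins when the kernel speedup outweighs the fixed cost of driver calls and data movement. Everything below is an illustrative sketch with placeholder numbers, not a measurement of any real system.]

```c
/* Sketch of the offload rule of thumb: a raw accelerator speedup is
 * taxed by launch overhead and host<->device copies, so small jobs
 * often stay on the CPU even when the kernel itself is much faster. */
#include <stdbool.h>
#include <stdio.h>

static bool should_offload(double cpu_ms,         /* CPU-only runtime */
                           double kernel_speedup, /* raw uplift, e.g. 10x */
                           double copy_ms,        /* data movement cost */
                           double launch_ms,      /* driver call overhead */
                           double min_uplift)     /* bar for moving over */
{
    double accel_ms = cpu_ms / kernel_speedup + copy_ms + launch_ms;
    /* Only offload if the *net* uplift clears the bar. */
    return cpu_ms / accel_ms >= min_uplift;
}

int main(void)
{
    /* Small job: a 10x kernel uplift is eaten by 2 ms of fixed cost. */
    printf("small job: offload=%d\n",
           should_offload(2.0, 10.0, 1.5, 0.5, 5.0));
    /* Large job: the same fixed cost is noise; net uplift ~9.9x. */
    printf("large job: offload=%d\n",
           should_offload(2000.0, 10.0, 1.5, 0.5, 5.0));
    return 0;
}
```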

CRAIG:

Yeah, and when you say Arm ships GPUs, where does the NPU sit and fit into all that?

CHRIS:

Yeah, so let's talk about a mobile handset, since that's where we started. Traditionally there were two large computing elements: the CPU and the GPU. The CPU's main function was to run the OS, and then eventually apps; the GPU's function was to display, to play games, to do all those kinds of things. Well, as other workloads became important, camera imaging being an obvious one given the amount of video you take, we started putting in little accelerators that might do some of the compression for the different video codecs and those kinds of things. NPUs evolved from that as a way to efficiently run matrix multiplications, or CNNs and different kinds of models, as an accelerator, and the system can choose to send work there. But what really happens in the real world, and this is the heterogeneous part, is that you usually kick off the job on the CPU. You may run part of it on the GPU, you may send part to the NPU, and then it usually comes back to the CPU to conclude. That tends to be how these workloads actually work in a system. It's all invisible to the user, but the developer has to make some of those decisions: first, what kind of capability is in the handset, and also whether to do the work in the cloud or at the edge, and we should have that conversation today as well. So these workloads move around, and there's a whole set of reasons they move around, but we're making the CPU at the edge as AI-friendly and as AI-performant as possible, with as good a power envelope as possible.
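[A hedged sketch of the flow Chris describes, with invented types and function names: the job is staged on the CPU, the heavy middle section goes to an NPU or GPU if one is present and can handle the model's operators, and the result returns to the CPU. Real systems express this through frameworks and vendor runtimes, not hand-written logic like this.]

```c
/* Sketch: picking a backend for the heavy middle stage of an
 * inference, with the CPU as the universal fallback. */
#include <stdio.h>
#include <stdbool.h>

typedef enum { BACKEND_CPU, BACKEND_GPU, BACKEND_NPU } backend_t;

typedef struct {
    bool has_npu;
    bool npu_supports_ops;  /* can it run every operator in the graph? */
    bool gpu_is_idle;       /* or is it busy rendering the UI? */
} device_caps_t;

static backend_t pick_backend(const device_caps_t *d)
{
    if (d->has_npu && d->npu_supports_ops) return BACKEND_NPU;
    if (d->gpu_is_idle)                    return BACKEND_GPU;
    return BACKEND_CPU;  /* always a valid place to run */
}

int main(void)
{
    device_caps_t phone = { .has_npu = true, .npu_supports_ops = false,
                            .gpu_is_idle = true };
    printf("pre-process:  CPU\n");   /* job is staged on the CPU */
    printf("inference:    %s\n",     /* heavy stage moves if it can */
           (const char *[]){"CPU", "GPU", "NPU"}[pick_backend(&phone)]);
    printf("post-process: CPU\n");   /* and concludes on the CPU */
    return 0;
}
```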

CRAIG:

Yeah. And are these all packaged together, the CPU, the GPU, and the NPU, if you're using an NPU?

CHRIS:

It depends on the system. In today's cell phone chips, yes, they are largely all in a single SoC; all of these computing elements exist in the SoC. As you get into things with larger power envelopes, like PCs, many of the NPUs are still integrated in the SoC, but we're seeing trends around people adding accelerators. Obviously people also use the GPU in a laptop, and that may be integrated or discrete. I would say there's a trend toward integration in many of these markets, and that's because of the memory pressure and the memory-size requirements of AI. If you look at some of the latest put-it-on-your-desk computing platforms, we're especially proud of our partnership with NVIDIA on the GB10, the product they just started shipping last week. I think Jensen likes to say it puts a supercomputer on your desk: I believe it's a petaflop of AI performance. And that's because you've got these Arm CPUs, 20 of them, coupled tightly to an accelerator with a single memory system that offers up to 128 gigabytes of DRAM at very high bandwidth. So you've got this whole system that developers can use, and we're seeing this as a trend. Or look at the latest M5 that Apple announced this week, or I think it was last week: again, you see amazing AI performance leveraging the Arm CPUs, GPUs, and other accelerators, and a tremendous memory system behind them. That's really what we're seeing happen. Those are integrated; as some get a little bigger, they go discrete. But the problem with discrete is that you then have to split the memory system, and AI is so memory-heavy that it becomes a whole balancing act.

CRAIG:

Yeah, and again, I'm mindful of listeners who aren't deep in the chip space: SoC is system-on-a-chip, where you combine different kinds of chips into one; from the consumer's point of view it looks like one chip, all compressed in there under one cover. And the reason this is important is that, increasingly... well, right now I have an app that I built for myself. When I drive around, it talks to me about the history of the places I'm in. And the voice drops out a lot of the time, because it has to read my GPS, send the data to the cloud, do a lookup in whatever model I'm using, get the text back from the model, convert it to speech, and send it back to the phone; maybe the text-to-speech is on the phone, I'm not sure. That whole chain can get interrupted in a million ways. If it's all happening on the device, you don't have those problems with connectivity and things like that. Can you talk about how you see v9, or these SoCs, changing the way we interact with AI?
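[Craig's pipeline is a good candidate for an edge-first design. The sketch below, with entirely hypothetical function names and stubbed behavior, shows the usual pattern: try the large cloud model with a short timeout, and fall back to a smaller on-device model so a dead spot never silences the voice.]

```c
/* Sketch: cloud-first generation with an on-device fallback. All
 * functions are placeholders; nothing here is a real API. */
#include <stdbool.h>
#include <stdio.h>

static bool cloud_generate(const char *place, char *out, size_t n,
                           int timeout_ms)
{
    (void)place; (void)out; (void)n; (void)timeout_ms;
    return false;  /* pretend we hit a dead spot on the highway */
}

static void local_generate(const char *place, char *out, size_t n)
{
    /* Smaller on-device model: lower quality ceiling, no dead spots. */
    snprintf(out, n, "A short local history of %s.", place);
}

static void narrate(const char *place)
{
    char text[256];
    /* Prefer the big cloud model, but time out quickly so
     * connectivity can never stall the voice. */
    if (!cloud_generate(place, text, sizeof text, /*timeout_ms=*/300))
        local_generate(place, text, sizeof text);
    printf("speak: %s\n", text);  /* on-device TTS closes the loop */
}

int main(void)
{
    narrate("the Golden Gate Bridge");
    return 0;
}
```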

CHRIS:

Yeah, absolutely, Craig, and you're at the forefront here. My job's super fun because I get to talk to many of the industry leaders and visionaries, whether that's companies like Google that we work so closely with, all the chipset companies, and then all the CE companies, the consumer electronics companies actually putting these things in your hand. And it's fun because, look, I'm an edge guy. Edge computing is the business I run, so of course I want edge computing to happen. But I do often go to these partners and say, hey, why can't you run it in the cloud? It seems to work just fine; people love ChatGPT, they love Gemini, they're starting to really utilize these services. And what gives me reassurance is that all these companies say, no, no, no, we need to put this in devices. One of the reasons is what you just said, Craig. We're seeing some of the early cool use cases, real-time translation, some agents starting to do things, but we're in the early, early innings here. One of the analogies I like to use is touch. If you take a child, say less than 10 years old, and give them a screen, they just start touching it, because they've never known anything that isn't a touchscreen. Whereas for you and me, the mouse was the big thing; we thought that was cool. I reuse that analogy because that's how AI is going to be. If something doesn't have AI, and you can't interact with it, and it can't start figuring out what you're trying to do, it's going to be like that child saying: this thing doesn't have a touchscreen, I don't know how to use it, I don't want to use it. So first off, you need to believe that's how essential this is going to be. Think of how annoying it is when an app takes a long time to start, or when an app dies; we've worked out those edges and they don't really happen much anymore, but that's how it is when technology is new. It's the same with AI. In talking to one of these partners, they said: hey Chris, okay, yes, maybe the AI in the cloud works great 90% of the time. But you're driving up 101, I'm here in Silicon Valley in San Jose, you're driving up the highway, and there's a dead spot. You're going to get a bunch of latency; it's not going to be conversational; you're going to be waiting. And the reality is that people aren't going to complain to their carrier: hey, you've got a dead spot on 101, when are you going to fix it? They're going to say: hey, Craig's app, I don't like that experience. It's frustrating, it doesn't work, three times a day I'm trying to use it. It's like doing these video calls, right?
It doesn't take long, when somebody has a shaky connection, before you're saying, hey, let's just talk next week, or let's talk when you're in the office. It's just too frustrating. So that's one example. There's privacy, there's performance, there are all kinds of other things that are really going to drive this to the edge. The counter side is that the models have to get smaller; there's a real cost because of the compute and memory pressure it puts on the device. So it's a balance, but it is happening. Training, and still quite a bit of inference, will stay in the cloud. But as much as we can move to the edge, the incentive is pretty much universal: whether you're a hyperscaler, an app developer, or a device manufacturer, you're quite incentivized to make this run very, very well at the edge, in the device.

CRAIG:

But there are challenges, and there are trade-offs, as you said. One is a power issue, and a heat issue as a result of that. How do you manage it? I mentioned this company BrainChip that I had a conversation with, which is using neuromorphic chips that only fire when there's enough activity to wake them up. For example, in a doorbell camera, you don't want the doorbell capturing the video, sending it to the cloud, computing, and sending it back when nothing is happening in the scene, but that's in fact what happens. How do you manage that?

CHRIS:

Yeah, so we have these management techniques that get used in almost every aspect of a computing device. We have very clever engineers who figure out how to fool you into thinking the device is running at full power while we're very aggressively changing things in the background. That was actually one of the Arm innovations over 10 years ago: we came up with this big.LITTLE concept, where we have big CPUs and little CPUs and we're actually moving the workloads back and forth, because certain times you need the performance and certain times you don't. That's really the way these devices work. To your doorbell example, you're looking for maybe some motion, you're looking for something; then once you trigger that event, okay, now let's fire up some more computing elements. Let's go figure out, oh, it's a face. Okay, is that a face we know? We don't? Let's trigger the cloud, or whatever. That's the way these systems work. So I think it's about these more intelligent computing platforms, and getting smarter about how you do these accelerators. That's, for example, the SME2 I mentioned that's part of v9: it's a tightly coupled accelerator, but we do it in a very efficient manner where it's actually shared across multiple CPU cores, so you're also being very area-efficient, which really drives cost. Those are things, but the other thing is there's tons of innovation. I don't want to get too technical, but your listeners may have heard about HBM. HBM memory is stacked DRAM with a very high-bandwidth interface, and it's become a huge enabler for AI in the cloud, because we have to have all this memory bandwidth, so we put these HBM stacks next to the accelerators. We're looking at very similar things at the edge: where is the power, right? Is it in the computing element? Is it in the joules per bit of memory transfer? All of these things are opportunities for innovation, and there's a ton of research and investment going into them. So I don't worry about the power; I actually think the power is fairly manageable for inference. The bigger pressure we have right now is actually memory, memory size: the model size, and how we make it small enough, because at the end of the day that drives cost, power, and other things. And it's happening. The good news is we're in an innovation cycle where there are two things you can do. You can shrink the model and get the same performance, and models seem to be shrinking almost 50% a year, if not faster. Or, on the other side, you can say, hey, I want a three-billion-parameter model, and that model is just getting more and more intelligent every six months, every year. So there are different ways to solve the problem, but that's really the enabler, I think.
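[The memory-size pressure Chris describes is easy to put numbers on: a model's resident weight footprint is roughly parameter count times bytes per weight, which is why quantization is the main lever. The figures below are plain arithmetic, not measurements; real runtimes add KV cache, activations, and overhead on top.]

```c
/* Worked example: weight footprint of a 3-billion-parameter model
 * at different quantization widths (GB here means 1e9 bytes). */
#include <stdio.h>

int main(void)
{
    const double params = 3e9;  /* the 3B model Chris mentions */
    const struct { const char *name; double bytes_per_weight; } fmt[] = {
        { "fp16", 2.0 },
        { "int8", 1.0 },
        { "int4", 0.5 },
    };
    for (int i = 0; i < 3; i++)
        printf("%s: %.1f GB\n", fmt[i].name,
               params * fmt[i].bytes_per_weight / 1e9);
    return 0;
}
```

[That works out to about 6 GB at fp16, 3 GB at int8, and 1.5 GB at int4, roughly the difference between not fitting and fitting comfortably in a phone's memory budget.]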

CRAIG:

Yeah, and again, just thinking about listeners so we don't lose them: SME is Scalable Matrix Extension, a way to optimize the math operations being used. And on the models: are you exploring state space models, which optimize memory usage? I mean, which have a different way of handling memory.

CHRIS:

So I'm not as familiar personally with state space models, though I think I understand the general concept. I would say that we provide the platform that many of these innovations sit on top of. For example, on top of the matrix extension engine you just talked about, we've built what we call the Kleidi framework, which is a set of libraries that developers can leverage, so that things just do the right thing in the hardware, taking advantage of whether you have the latest Armv9 and so on. A lot of how the model works, state spaces, whether they're using a KV cache, how they do the different updates, sits a little higher in the stack; we're more of a plumbing enabler, is the way I'd put it. So we support all of that, and we're making it super easy for developers to explore it and build their innovations on top, but there's nothing unique, at least at this point in time, that we're doing from a state-space point of view in our CPUs.
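[What a library "doing the right thing in the hardware" amounts to is runtime dispatch. Here is a minimal sketch, assuming a Linux/AArch64 target where the kernel reports CPU features through hwcaps; the kernel functions are stubs, and the HWCAP2_SME check is guarded in case the toolchain's headers predate SME. This illustrates the general pattern, not Kleidi's actual internals.]

```c
/* Sketch: detect SME once at startup and route matrix work to the
 * best kernel available, falling back to a generic implementation. */
#include <stdio.h>
#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

typedef void (*matmul_fn)(void);

static void matmul_generic(void) { puts("generic NEON/scalar kernel"); }
static void matmul_sme(void)     { puts("SME-tiled kernel"); }

static matmul_fn select_matmul(void)
{
#if defined(__aarch64__) && defined(__linux__) && defined(HWCAP2_SME)
    if (getauxval(AT_HWCAP2) & HWCAP2_SME)
        return matmul_sme;
#endif
    return matmul_generic;
}

int main(void)
{
    matmul_fn mm = select_matmul();  /* chosen once, reused everywhere */
    mm();
    return 0;
}
```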

CRAIG:

And can you talk about some of the current applications, and where you see this move to the edge going? Obviously self-driving cars are one case where you can't afford to risk connectivity and latency by sending data to the cloud; it has to be computed on the platform, in the car. What are some of the other applications, or current devices, I should say, that you're putting chips into, and where do you see that going? And the form factor of these chips: how small can they be? I wear hearing aids, and when I was talking to a guy about them, he said you'll be able to have these chips in a hearing aid, filtering or isolating sound and that sort of thing. Can you talk about that a little?

CHRIS:

Yeah, that's a great question. The scalability is almost everywhere, right? Your hearing aid example is a good one, and this is one of the reasons Arm builds the huge portfolio that we do, because we are in many of those hearing aids and those kinds of things. And we've announced AI across the range: I mentioned Lumex, but we also announced our edge AI platforms back in February. So we are enabling that. It really comes down to this: if you know the task you're trying to do, you can make things quite small and efficient. In a hearing aid, you're probably not trying to run a large language model, not yet at least. You're doing things like reducing noise, or picking out and amplifying only the interesting part, whatever it is you're trying to hear. Now, adding translation and those kinds of things is probably going to be possible soon, and if I know what language I'm translating to, that makes my model smaller. One of the things I like to think about is how much we use apps today to configure things. A good example is a security camera. In the past, when you installed a security camera, you would have gone to your laptop and connected it; now you probably use your smartphone. That's a use model, and we've gotten used to it. But why can't you just do it with your voice? Why can't the camera have a language model and talk to you? It asks, what is your SSID? Here are the ones I see. Now, that may or may not be a better user experience, but we see the applicability across the board. Again, I use that touch analogy because touch has become so prevalent; AI is going to be way more prevalent than even touch is today in changing the use model, how you interact with these things. A great example I'll give you, Craig: we worked very closely with Meta, who I think is doing a great job with the advances in their glasses, the XR glasses, and the product they announced two or three weeks ago. They now have a wristband that goes with that product, and it uses one of our Ethos NPUs, which is super small and super low-power, so the wristband has huge battery life; I forget how many days or weeks you can wear it. You manipulate things just by moving your fingers, and it senses in your wrist the changes happening below your skin, using AI to figure out, oh, you just moved your second finger, so that does this. Basically, that's the new UI: a wristband with AI in it that has a long, long battery life and a tiny, tiny battery. That's how small we can make AI for very specific use cases.
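[An illustrative sketch of how a device like that wristband can sip power: sample the sensor at a low rate, and wake the NPU for a single inference only when the signal shows real activity. Every threshold and function here is invented for illustration; it is the same trigger-then-escalate pattern as the doorbell example, not Meta's or Arm's actual firmware.]

```c
/* Sketch: duty-cycled sensing with an inference triggered only on
 * activity, so the NPU spends most of its life asleep. */
#include <stdio.h>
#include <stdlib.h>

#define ACTIVITY_THRESHOLD 40  /* hypothetical "something moved" delta */

static int read_wrist_sample(void)    { return rand() % 100; /* stub sensor */ }
static int npu_classify_gesture(void) { return rand() % 4;   /* stub model  */ }

int main(void)
{
    int prev = read_wrist_sample();
    for (int tick = 0; tick < 20; tick++) {  /* stand-in for a timer IRQ */
        int cur = read_wrist_sample();
        if (abs(cur - prev) > ACTIVITY_THRESHOLD) {
            /* Activity detected: power up the NPU for one inference,
             * then let it sleep again until the next trigger. */
            printf("tick %d: gesture class %d\n", tick,
                   npu_classify_gesture());
        }
        prev = cur;
    }
    return 0;
}
```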

CRAIG:

Yeah. And what's the focus with Arm right now? Obviously it's going to be all of these things, but where do you see the next breakthrough? Is it reducing size, reducing power consumption, increasing the model size that can run on these chips? Where do you see it going?

CHRIS:

It's a great question, Craig, and it's a very open design space right now, to be honest with you. There has definitely been a focus on power in the past, and that's where our Ethos product line comes from, but a lot of that was around CNN networks and some of the early stuff. Now you're starting to see the move to transformer networks. And what's next? I don't think transformers are actually the end goal. So we also need to allow flexibility as these models change, because, back to educating your listeners, it takes almost two years at a minimum to design silicon and get it into a shipping product, and you can see how fast AI is moving. So there's this flexibility piece, but I think most of it is performance: it's really memory performance, and it's TOPS and CPU performance, and then trying to shrink that down as best we can for whatever the power envelope is, to get as much computing as you can. That's obviously happened in the data center as well: okay, we're now building gigawatt data centers, but how many tokens can you create? How much performance can you get out of that? Almost everywhere in the space, the question is: tell me what your power envelope is. It's a gigawatt here; it's six or seven watts in a phone; in my wrist example, it's literally a couple hundred milliwatts. How much AI performance can you get? So it really spans a large space.

CRAIG:

Yeah, and you guys are producing IP, right? You don't have a fab; you're not sending your designs out to a fab to make your own chips. You're licensing this to other chipmakers.

CHRIS:

That's correct. Our business model is that we provide IP to a large portion of the semiconductor ecosystem, and then our partners build on top of it and create silicon solutions that can span, like we talked about, anything from a wristband or your hearing aid to your next self-driving car.

CRAIG:

Yeah, yeah. And where is most of this going? Obviously a lot of it is going into the US, but do you have partners licensing your IP elsewhere? What are the other markets you're operating in?

CHRIS:

Well, I would say that traditionally semiconductors have been quite global. The cost of developing a semiconductor is tremendous. Put the foundry costs aside; just literally building a chip, if you're using one of the latest nodes, the mask costs alone, once you're done with your design and getting it manufactured, are in the tens of millions of dollars. The up-front development costs can often be in the hundred-million-dollar-plus range. So you're talking about a tremendous amount of cost to get to the first unit. But what's made semiconductors so great is that you can scale that, because we have these amazing manufacturing capabilities, and so they tend to be very large markets. Generally it's been a global market for us. Yes, obviously there are many great IC or semiconductor companies in the US that do designs, and we also work very closely with European semiconductor companies, and in Taiwan and Korea as well. We also have Arm China, a JV, where semiconductors get built in China based on our IP. So it's a pretty global market.

CRAIG:

Yeah. I would imagine the Chinese relationship has been strained by the sanctions. How does that affect you?

CHRIS:

Well, we always make sure we're following the local laws wherever we're located. Most of the constraints have been around the manufacturing process, not necessarily the IP, though there is some around IP enablement. It's something we're always very careful about, making sure we comply with what's required, and we'll see as things evolve.

CRAIG:

Yeah, yeah. Okay, we're closing in on the hour, so let me ask a couple of closing questions.

You guys are primarily selling IP, so do you work with a large developer community, and what's that relationship like? How much of this is open source, for example? And then, you mentioned wearables as a hot thing: where do you see the edge really exploding the public's interaction with, and experience of, AI?

CHRIS:

Okay, yeah. So let's start with developers. Developers are super important to Arm, and in fact we now have the world's largest developer ecosystem: we believe there are over 22 million software developers developing on Arm. That's just because of our footprint, which spans all the way from iOS to now Windows on Arm, and of course things like Chrome, Android, and Linux. Really, the world's largest software ecosystems work closely with Arm; we support them, and we've really focused on the developer experience and how to improve it. I mentioned Kleidi earlier in this conversation. That's really about trying to make it seamless for developers to use AI, because, as you said about CUDA, AI is not simple from many of the programming angles. I would say we're still not seeing the hardware abstraction we've reached in other software ecosystems; today's models are still fairly tightly coupled to the hardware they run on, whether that's the operators it supports or the way they assume the model is going to propagate, those kinds of things. So we're clearly focused on supporting those developers and seeing what they can build. And I think that goes to your second question, about what's on the forefront. That's what makes my job so fun: I get to interact with so many super innovative folks, both startups and established companies, on what's next, the art of the possible. If I had to use one general concept, it would be intelligence: the idea that something is capable of interacting with you in a way that exceeds your current expectations. Maybe I'll use another product example. I'm a huge Amazon fan, but one of the things I noticed is that before I upgraded to Alexa+, because of the way I was using a chatbot, ChatGPT or Gemini, I had started talking to it in a more conversational manner, because I could. But when I came home and used Alexa, it was still expecting more of a search-style interaction: give me three words and I'll try to contextualize those three words. By the way, I think Amazon has done a great job with Alexa+, and I think they're on the right path here. But that's a change in expectations, and I think that's only going to go through the roof, this idea that the things we interact with should just be smart. One of the things people ask me about is agentic AI and how that's going to feel. I think Microsoft has done an amazing job of really integrating AI into their products, as large a company as they are, with the way they push Copilot and all those kinds of things. But the example I like is a silly one: settings.
When you use Windows settings, we've all done it: you want to add a screen, or you don't like the screen resolution, and you have to figure out how many windows you have to click through to make that change. Well, they've made it more of an agentic experience in Windows 11, where you just say what you want to do: I want to add a screen, I want to reduce this. And it says, okay, here's how you do it, or, I'll do that for you. To me, that's the touchy-feely part: how many times do we click through three apps? I have to go to my flight app, then to my rental car app. So many of these things we're just trained to think, yeah, it works well enough. But imagine things being so smart that when you call an Uber, of course it knows where you're going, and it's already told people when you're going to show up, all those kinds of things. And of course, those are consumer-y examples. There's a huge number of enterprise examples: teams that aren't sharing data, people who can't find the information they're looking for. So I think it's going to be this general idea of making things super smart, really exceeding the expectations set by today's technology limitations.

CRAIG:

Yeah, yeah. And robotics is another one. I'm not a great believer in humanoid robots; I think we're a long way from that. But what you're describing, I mean, the day will come when our children or grandchildren will look back and say: can you believe that any object you used daily was a dumb object back then, instead of being able to talk to it and have it configure itself, or whatever it is? That's pretty exciting. What about robotics? Obviously you guys are involved in that.

CHRIS:

Yeah, it's pretty amazing what's happening there, including the ability to train robots in virtual worlds as well as real worlds. We're a huge believer in that, and we're quite involved with many of those ecosystems. I think it's really about taking advantage of the camera sensor, which has been an amazing enabler. Just look at what our cell phones are able to do in capturing images; the camera sensor has now unleashed this idea of physical AI, because we can use vision systems, and AI is so good with vision systems, whether that's looking at an MRI and augmenting a doctor's reading of it, or sitting in a system trying to understand what's going on around it. So it is a huge area, and I think we're going to see a lot of innovation. To your robotics example, go back to the Jetsons or whatever: people said we were going to have flying cars and all these autonomous things. It's taken a long time, but, to pick another great Arm-powered device, my Tesla has gotten pretty darn good at self-driving, and I really enjoy having it augment me in many, many scenarios. So, like you said, it's going to become an expectation, and you're just going to be disappointed when something doesn't do what it clearly can't do today. But these things are hard; that blend of the physical is going to be a hard thing. Like anything, though, I go back to when I was involved with Bluetooth in the early days: oh, it's going to be amazing, you're going to have all these Bluetooth devices. And it didn't work great in the beginning; the experience wasn't great. But just look at how fundamental Bluetooth is today. When I walk up to my Tesla, it just unlocks, over Bluetooth, augmented by UWB. With these technologies, sometimes the hype cycle gets a little ahead of expectations, and that's our fault as technologists. But at the end of the day the technology delivers, and it's going to be an exciting future for our children and grandchildren.

CRAIG:

Yeah, absolutely. And moving into the physical world, the challenges aren't really the AI; it's the mechanics, the actuators, the wear and tear, the dust and the grease and all that stuff people kind of forget about. That still has a long way to go. Okay, so: is Arm on the verge of announcing anything new, or is it really v9 that you're focused on?

CHRIS:

We've got quite a broad set of products. I have my responsibilities, but we have a whole automotive and robotics group, a whole group that focuses on data center, and the same for edge AI. So I think you'll keep seeing quite a bit from us. There's lots of great stuff going on, and again, we're super excited about the partnerships we get to be part of, and hopefully some of your listeners are inspired to build things on top of the Arm architecture and help show us what's possible.

CRAIG:

Yeah, and on that note, we'll end. Developers, if they're interested, do you have a developer portal?

CHRIS:

We do: developer.arm.com. It's a great resource to find out more and to figure out how to build with and explore many of these new technologies. And obviously many makers are familiar with Raspberry Pi; that's also Arm-powered, and something many kids and educators use to start their journey, and those boards are getting super exciting now with robotics and beyond.

CRAIG:

In business, they say you can have better, cheaper, or faster, but you only get to pick two. What if you could have all three at the same time? That's exactly what Cohere, Thomson Reuters, and Specialized Bikes have since they upgraded to the next generation of the cloud, Oracle Cloud Infrastructure. OCI is the blazing-fast platform for your infrastructure, database, application development, and AI needs, where you can run any workload in a high-availability, consistently high-performance environment and spend less than you would with other clouds. How is it faster? OCI's block storage gives you more operations per second. Cheaper? OCI costs up to 50% less for compute, 70% less for storage, and 80% less for networking. Better? In test after test, OCI customers report lower latency and higher bandwidth versus other clouds. This is a cloud built for AI and all your biggest workloads. Right now, with zero commitment, try OCI for free. Head to oracle.com/eyeonai. Eye on AI all run together: E-Y-E-O-N-A-I. That's oracle.com/eyeonai.


