The Future AI Tech Stack

Thomas Laffont (Co-Founder, Coatue) leads a panel with the brightest minds in the AI space to talk through applications and future-casting use cases for foundational models. They discuss:

  • Will the future of AI consist of taking standard architectures that were built for something else and repurposing them, or will it require a much more transformational innovation layer in the data center?
  • As AI becomes more widespread, will everyone have their own model maintained on their own data, will a few foundational models serve everyone, or will it be some version in between?
  • How the panelists see the world of models today (and in the future) and the overlap and divergence between foundational and proprietary models
  • How Jasper is approaching these topics to best suit the end user and their unique needs and challenges when interfacing with these models

And more.

About this session

Thomas Laffont (Co-Founder, Coatue) leads a panel with the brightest minds in the AI space to talk through applications and future-casting use cases for foundational models.
Aidan Gomez, CEO & Co-Founder, Cohere
Thomas Laffont, Co-Founder, Coatue
Andrew Feldman, CEO, Cerebras
Peter Welinder, VP of Product and Partnerships, OpenAI
Greg Larson, VP of Engineering, Jasper

Most of you have heard the names OpenAI, Stability, and Jasper. You've heard about large language models and computation. But understanding how all of these pieces come together to form the AI that we engage with is another matter. Our next panel spans all parts of the future AI tech stack, and hearing from these folks will help evolve your understanding of how it all comes together. Our moderator for the panel is Thomas Laffont, co-founder of Coatue Management. Thomas has spent his career identifying and investing in the world's most exciting technology. Let's welcome Thomas and our panelists to the stage.

That's a little weird, so I'm going to sit here. All right. Thank you, everybody. I'll start by saying that when I was asked to do this panel, I said, well, you should know that I think 99% of panels at conferences are a waste of time and not interesting. So I said, this one is going to break the mold. We have a high bar. We're going to bring the energy up and hopefully give you a great conversation. Andrew is joining us; he will be here shortly. But for now, why don't we start with quick introductions? Peter, want to lead us off?

Sure. Hi, everyone. My name is Peter. I'm VP of Product at OpenAI. We have an amazing research team building all of these incredible models, and I'm more focused on taking those models and putting them out into the world through applications like our API, Copilot with GitHub, and now, most recently, ChatGPT.

I'm Greg Larson, head of engineering for Jasper. I lead the teams that do the development for the application, for the API that was announced earlier today, the Chrome extension, and so on, working with different partners on AI technologies as well and doing what we can to serve customers with AI solutions.

I'm Aidan, and I'm the co-founder and CEO of Cohere. We build big large language models and make them available through an API.

I'm actually going to introduce Andrew since he's not here, but the good news is I know him well; we've been an investor in his company for almost seven years. Cerebras is building the world's first dedicated silicon architecture for AI. He's innovating at the silicon layer and is, in fact, fashionably late. Welcome, Andrew. I was giving your intro for you.

Happy Valentine's Day, everybody.

Do you want to give a quick intro and fill out what I said? We build big, fast accelerators for AI training and inference. And, uh, did you get past that?

That's about the depth of what I know. So fill in the blanks.

This is leading venture capital right here.

He only gets to say that because I say that to him in the boardroom, and it's how we work, right?

Is it?

I'm not joking.

Thomas and his team saw Nvidia in 2014 and '15. If you look at their portfolio, it's dazzling. It's extraordinary. We began our partnership with them in late 2016. We started the company because we saw an opportunity to build a better mousetrap: a processor optimized for AI work, not a processor with a heritage in graphics, but something dedicated to the type of work that our audience and our colleagues and all of you do. Since then we've grown, and we now have customers around the world training the largest models quickly and easily.

And with that, we'll open it up. OK, one of the things I told the panel is I want to keep this kind of interactive amongst us.
So we'll ask questions of each other, and we'll try to have it as if we were all having lunch or coffee and you were all in the conversation with us. We'll keep it as interactive and fun a format as possible. One of the things, Andrew, and we'll just start with you since we were just chatting about it: if I think about the cloud revolution, we basically took a very similar server, with the same CPU, the same memory, the same architecture, and instead of having that box on premise, we put it in the cloud. But ultimately the architecture was the same. My question to you is, as we think about AI, do you think the same thing will happen, where we're just taking standard architectures that were built for something else and repurposing them? Or do you think that, in fact, it's going to create a much more transformational innovation layer in the data center?

So in the '90s, we got a new workload, and it was called Ethernet and IP. New types of processors were developed and whole new companies emerged; we know them today as the dominant players: Cisco, Juniper, Arista. About a decade later, another new workload emerged, a workload in compute that suddenly cared more about power draw than it did about top performance, and the two companies that had 100% share of the processor market won 0% share of that market. That was the rise of Arm and cell phone parts. We're in a similarly transformative moment, where the work that you guys are doing, the work the people here at OpenAI and Cohere and Anthropic and Stability are doing, and the work all of you are doing, is ready for a new type of compute. I think that's the opportunity out there. Many of you already use multiple types of processors for this work, whether it's GPUs or TPUs or our gear or other startup gear, and I think it creates an opportunity; historically that has been the lever by which great new companies emerge in hardware.

Well, we're excited to see what you guys do, because obviously we need to bring costs down to make this innovation available to everybody. One of the things I was curious about: if we think about the foundational models, is everyone going to have their own model that's maintained on their own data? Are there going to be a few foundational models that exist, and all of us are plugged into them? Or some version in between? Aidan, maybe I'll start with you. Tell us a little bit about how you see the world of models today, how you see it break down between foundational models and proprietary models, and, maybe thinking three or four years out, what can we expect these models to produce? Help us dream the dream on exciting new things that models could achieve.

Yeah, I think it doesn't make much sense for every single party to be building their own large language model, in terms of the amount of compute that's required to do that. Access to these supercomputers is already extremely competitive, and so if everyone had to get their own supercomputer and start from scratch, I think that's, I mean...

You might. That's my dream. Keep going, you're doing great.

Yeah. So until the silicon pipeline can support that, I think we need to resource share. I'm sure it's the same thesis with Peter and OpenAI, but yeah, I think we need to start resource sharing, and that will enable the cost to come down.
That will let dedicated teams like OpenAI and Cohere focus on optimizing that layer of the stack, and that enables everyone else to innovate on top of it and really focus on what I feel is the hardest problem, the product problem, which Jasper has been incredible at.

So we'll bring Greg in on that in a minute. Continue with that thought, and tell us a little bit about how much better these models are getting, and maybe what are some of the cool innovations or things we might see coming out of them in the next few years?

Yeah, I think 18 months ago we would have been talking about the first GPT-3, right? These base models, these big models of the web, where you scrape a trillion tokens off of hundreds of millions or a billion web pages. But we've come a really long way since then, and a lot of it has been the user interface that's changed. So I view it as a stack where each layer builds off of the previous one. The initial piece was GPT-3, the base large language model. On top of that, you do some fine-tuning and you get instruct models, or what we at Cohere call command models. That was a huge lift in terms of usability and performance. With the base models, you were doing prompt engineering; you were spending hours and hours trying to figure out how to get the model to behave the way you wanted it to. These command or instruct models enable a very simple "do this for me": here's the data, and the output comes out.

The next innovation, which we saw two or three months ago from OpenAI, is dialogue. That's the layer on top of command and the base model: instead of just a one-turn "do this for me" interaction, it's now a conversation. It's "write this article for me." It does one pass. You say, OK, that's close, but could you make it a little bit more like this? It does another pass off of that. So it's an exchange, and it's iterative. That's where we are today.

In terms of next steps, it's very clear that retrieval is going to be a big part of this, and you're seeing that with Bing chat or Google's Bard model. That's a huge piece in terms of truthfulness and groundedness: the ability to actually retrieve from an up-to-the-millisecond knowledge base. These models don't really want to lie. They don't want to make up facts; that's not an effective strategy. What they want to do is give the correct answer. That's what their training objective tells them to do. But if they're not given access to the correct answer, if you ask them something that wasn't observed in their training data, they have to resort to either saying "I don't know" or fabricating something. One solution for that is exactly what OpenAI is doing with Bing, which is grounding the model in a database that it can query and retrieve from. So that's the next layer on top of dialogue.

There's one more, which is actual tool use and taking action out in the real world. These models are currently confined to a little box. They can't do much; they can just write back to you. That is their modality of interacting. The next unlock is these models actually being able to use the tools that we've built for ourselves, whether it's APIs, whether it's operating a web browser. Once we actually equip them with the power to go out and accomplish tasks, you can imagine a scenario where you give one an instruction like "go buy me X, Y, Z," and it just goes out, finds some products, asks which one you want; say it has all of your logins to Amazon or whatever, your credit card, your address, and it can carry out that entire instruction chain for you. So I think action is really where we're all sprinting towards.
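To make the retrieval layer a bit more concrete, here is a minimal toy sketch of the pattern being described: look facts up in a store the model can query, then ground the prompt in them. The knowledge base, retrieve(), and generate() below are illustrative stand-ins, not any particular vendor's API.

    # Toy sketch of retrieval-grounded generation. Nothing here is a real
    # product API; retrieve() and generate() are placeholders.
    KNOWLEDGE_BASE = {
        "refund policy": "Refunds are issued within 14 days of purchase.",
        "shipping time": "Standard shipping takes 3 to 5 business days.",
    }

    def retrieve(question: str) -> list[str]:
        """Naive keyword lookup standing in for a real vector search."""
        return [fact for key, fact in KNOWLEDGE_BASE.items() if key in question.lower()]

    def generate(prompt: str) -> str:
        """Stand-in for a call to a hosted large language model."""
        return "<model answer grounded in the prompt above>"

    def answer(question: str) -> str:
        # Put the retrieved facts into the prompt so the model can cite them
        # instead of fabricating an answer.
        context = "\n".join(retrieve(question)) or "No relevant facts found."
        prompt = (
            "Answer using only the context below; say 'I don't know' otherwise.\n"
            "Context:\n" + context + "\n\nQuestion: " + question + "\nAnswer:"
        )
        return generate(prompt)

    print(answer("What is your refund policy?"))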
Greg, why don't we bring you in? If I think about Jasper, the history of the company is really interesting, right? Initially it was going to do something else, and this was a bit of an experiment: what if we added this user interface on top, would there be a market for it? And then, incredible reception in the market. We've seen true product-market fit; marketers all over the world are using this now as part of their workflow. How do you think about how much you want to differentiate at that layer now? How are you thinking about having something that benefits from the entire ecosystem but is also unique to what the customer needs? Give us a bit of a glimpse, given how the company got started, of your journey from being this one thing to all of the different methods and things that you're exploring inside of Jasper.

Yeah, so I think it goes back to what Andrew was saying about building the foundation to empower these folks at OpenAI, Cohere, and others to build foundational large language models. They're now pushing the envelope on what AI is capable of in general capability and all the great things it can do. Where Jasper came in, and the success that we saw, started with understanding the customer. That was Jasper's first superpower, so to speak: understanding the actual customers and what their needs were, regardless of whether AI was part of the solution. Then, seeing that there were these needs that customers had and these incredible capabilities provided by this technology, we were able to marry the two. As we continue to do that, we see the feedback from customers and how they use AI for their different use cases. We can learn from that, they can learn from that, and it becomes better and better. Sometimes that means that, in partnership with the big model providers, we can learn from each other as they figure out how to push the envelope even further on their side with the foundational models. But in a lot of cases Jasper is able to invent smaller AI technologies within that, which may be more specific, in our case, to what our customers are asking for and what they need. And it's, again, built on the back of all of these other things that got us to this point.

So is the idea basically that the main Jasper module may be powered by a foundational model, but maybe the Jasper voice feature may have a different infrastructure that's unique to that customer? So you're going to have essentially a bifurcated world, where you just have a lot of different types of models running different parts of what the customer is doing?

Yeah, exactly. That's definitely one of the possibilities, one of the things we're building towards, and for a few reasons. I think one of the questions early on was, should everyone have a large language model?
And though that would be great for Cerebras's bottom line, and maybe it will get there to some extent when it's more economical, I think, to some extent, that's not needed. A lot of customers, a lot of users of AI, won't need a huge multi-billion or hundred-billion-parameter model to accomplish what they want. They need something smaller but more specific and honed for what they want to do. That's exactly what we're working on: making it so that, in this case, Jasper gets to know you as a user, as a customer, what your needs are, how you want to create content, and then putting those things together. Some of those things may be these large language models that are very powerful; some of the other things are going to be smaller pieces of the AI toolchain that we can add in to do different things to make the outputs better for the user and their specific use cases.

Aidan, one of the things you mentioned that was really interesting is that how we interface with these models is changing a lot, right? Peter, maybe you can give us a little bit of the story on ChatGPT, because if we look at that innovation, you'd already built the core innovation, you'd already put it out in the market, people were already interfacing with it, and yet when you put the chat format on top of it, it seems to have really exploded. So I'm curious: one, is that something that you expected? Give us a little bit of context there. And two, how did you prepare the infrastructure for the influx? I think we've all used it; we've all sometimes gotten the screen that says, hey, it's busy, come back later. So I'm really fascinated: did you forecast that demand, and how are you even catching up to it?

Yeah, that's a great question. To what Aidan said, the one thing that we have been thinking about a lot is, generally, how do you get more out of these language models? In the early days of GPT-3, if you wanted to get anything out of the models, you had to be kind of a prompt whisperer. You needed to know all the ins and outs, how to put yourself in the mind of a random person on the internet trying to write some random blog post on the topic you wanted. Then you could kind of get what you wanted out of it. And that just didn't seem like the right way to interact with an API. So we put in a lot of work to figure out what a much better interface would be. After those first models, what's the next step? The instruct, or command, kind of models. And what we noticed was that the mental model I often have for interacting with these models is that it's a little bit like you've hired a new person. How do you get them to do a task? The first thing you might do is give them a bit of an instruction: here's how, I don't know, we get customer inbounds, here's how I want you to categorize them, here are the criteria. But we don't try to get everything right in the first instruction.
We go in with the idea that, no, I won't be able to cover everything; I don't know exactly how the person will receive the instructions. And so the way we train each other is really through dialogue. So it's something we had been exploring for a while: how do we get to a place where you can actually have a bit of back and forth to put these models on the right track in terms of carrying out whatever task? But separately, we also started exploring what makes it fun to interact with, what makes interactions fun. These models already know a lot about conversations. They've been trained on so many parts of the internet, which are basically conversations between humans, but that part wasn't really brought forward in the models. That's what got us started building a more dialogue-based approach.

In terms of releasing it to the world, I wish there was a super clever product marketing plan here, but we definitely did not believe it would get the take-off that it did. If you ask anybody when to launch a product, right between Thanksgiving and Christmas is probably not the most ideal time to launch anything. Partly it was because we had internal debates of, how special is this? This has been out there; we had a playground; you could have a dialogue in the playground if you wanted to. So there was, I think, a fair bit of skepticism within OpenAI about how big of a deal this would really be if we released just a different skin on top of these models and made them a little bit better for dialogue. It was definitely a big surprise to see the way it took off, and it's been really delightful, because now so many more people get to experience what AI can do. I hear all these stories of people going to family gatherings and their grandmother asking them about it. It's not only that people are using this for so many different things now; people are starting to understand how it's going to impact us in the future. And I think that's really important, because this is a technology that will change so many things about what we do, so it's important that everybody starts to learn about it and how to use it.

In terms of the scaling side, even our head of engineering, here in the audience, would tell you it's been all hands on deck for basically two months to scale with all the demand. We don't like the fail whale either, and we're doing everything we can to remove it. That's our focus right now.
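A rough illustration of the shift being described here, from a one-shot prompt to a back-and-forth dialogue where each correction is appended to the conversation; chat_model() and the message format below are placeholders for whatever dialogue model is actually being called, not any specific product's API.

    def chat_model(history: list[dict]) -> str:
        """Stand-in for a hosted dialogue model; it just echoes the last request."""
        return "<draft responding to: " + history[-1]["content"] + ">"

    # One-shot: everything has to be right in a single prompt.
    one_shot = chat_model([{"role": "user", "content": "Write our launch blog post."}])

    # Dialogue: keep the running history and steer the model turn by turn,
    # the way you would onboard a new colleague.
    history = [{"role": "user", "content": "Write our launch blog post."}]
    history.append({"role": "assistant", "content": chat_model(history)})
    history.append({"role": "user", "content": "Close, but make it shorter and more playful."})
    revision = chat_model(history)
    print(revision)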
Well, I can tell you one takeaway for me, which I'll share with the audience: it is exciting to think that there's innovation at the model layer, obviously, and we know we'll have GPT-4 coming out pretty soon, and Cohere and lots of other things. But what it sounds like is that there's still a lot of innovation in how we interface with these models. To your point, I think what you were saying is you didn't want to develop an industry where you had to be an expert in prompting to get the best out of the model; that seems counter to what you guys are trying to build. And so, just by doing something in chat, all of a sudden you've expanded the reach. You no longer need to figure out the perfect prompt with 17 different adjectives and adverbs to get the best out of it. You really want to maximize the reach, I guess.

Right, and you can see there's so much low-hanging fruit here. Right now the interaction model is: I say something, the model says something back, I say something, the model says something back. It's a pretty awkward kind of conversation. We humans interrupt each other, we say much more, we do follow-ups. Also, these models at this point forget everything right after you've basically maxed out the prompt context. So I feel like we're just getting started as a community and an ecosystem in building on these models, and that will change quite a lot over just the coming year, because a lot of these ideas are not rocket-science ideas. They're pretty low-hanging fruit in terms of just how you put together the pieces in the right way. And I'm pretty excited about seeing what people build on top of these models and how they put them together in the right ways to make them much, much more fun to interact with.

I think someone was saying earlier that ChatGPT was really exciting because this has been going on for a while, developing these large language models and exploring this technology, and ChatGPT gave it this face and this approachability for people. Suddenly the conversation shifted from those of us designing these systems trying to explain to people, "this is what you could use AI for," to people coming to us saying, "this is what I want to use AI for; how can you help me do this?" I think it has shifted that conversation in a really fundamental way, where now people are starting to get it. Everyone sees, in a more practical way, what it means for them in their day-to-day lives or jobs or whatever.

Yeah, we were talking backstage about how crazy it was even four months ago, right? Just before ChatGPT, in the conversations you'd have with average people or executives in enterprises, they'd heard of GPT-3, but it was something out there, this nebulous something. Now they have first-person experience interacting with the model, and so they come to you with ideas: oh, I think I could plug this in here, I think we could do this over there. It's just such a shift in terms of awareness.

What's really fun about being at Jasper is that we're really close to those customers who are doing those small, nuanced things in their day-to-day jobs, and they're coming to us with all these ideas, and we're excited to do everything we can to power those things, again working with all of you on top of these powerful technologies. It's now at the point where this is all coming together; all the dots are starting to be connected.

Well, I'm sure, Greg, you see that in the Jasper product, right? Which I use all the time.
Even take the concept of the template, right? If you just create a little template for the user on how to create an Instagram ad or an Amazon product listing, those are again versions of what Peter was describing. It's the same foundational model at the bottom. I'm sure there's an engineer at OpenAI who's like, I don't get it, it's the exact same model, it's been out for a year, why is everyone going crazy now? But it is amazing how these little tweaks on the user interface, which I'm sure you're learning and getting feedback on, change things. So over the next 12 to 24 months, as you keep experimenting and building this great UI on top of it, it's going to be really exciting to see what users will do with the software.

Yeah, I think it's making a better, easier experience, crafting it, as you're saying, with templates and things, so people can use it for the job they're trying to accomplish. But another big piece of that, which we haven't talked about as much, is making it available in more places, like where it's now in your browser or in the other tools you're already using. You no longer have to change or alter your workflow, or how you like to do what you do, to take advantage of this technology. Rather, the technology is coming to you in places you're already familiar with. We're starting to see that across the board now, with software tools and services pulling it in as features or pieces within their software, or platforms like Jasper building things out so it's one place where you can have a lot of different ways to create content, not just a single, one-dimensional way, and then also having that go with you, so you have Jasper, in this case, in your browser or wherever you may be working in different tools.

So let's shift gears for a bit. If I think about Jasper, it's one of the first companies to really leverage AI that doesn't own its own infrastructure. If we think about AI companies, we might think about Google and Meta and ByteDance, and they obviously have all of their own engineering and infrastructure; it's really been an in-house technology stack. What is your assessment of the maturity of the stack? Do you think we've coalesced on a standard way an AI company has to be built, or are you seeing the industry adapt in real time to the needs of a company like yours?

I think it's definitely the latter. I have a lot of conversations with people that are already doing things in the AI space to that end, building tech stacks and infrastructure to support building AI technology and tools. Sometimes it's this attempt to take what they have been doing and adapt it to this new architecture, these new technologies, and figure out how to make it fit. But I think it's also opening up all these new ideas and, again, making AI more relevant for more people. So now it's almost like you have a different customer, a different person, that wants to do this, and they're not going to be the type that might go do what's already been made available.
They're not going to go create big teams and huge data centers like the big giants would. We're looking at new ways to make it possible to interact with AI and bring it in-house when necessary. In some cases customers may be able to level up to that capability, but it may not be as important. It goes back to the cloud architecture analogy: you could rack and stack servers, you could build your own data center, and in some cases that might make sense. But more and more, you see people either starting in the cloud or migrating to the cloud, because it's this new way of thinking about it.

I think the infrastructure stack has moved in exactly the same way you were describing the customer stack moving. We've been able to make it easier for customers to adopt in a hundred different ways: easier to engage with, easier to get the AI to do what you want it to do, and easier, from the hardware all the way up to the API, for companies to build on top of it. Two or three years ago you were predominantly buying or renting servers; now you can fine-tune, you can go to services, and there are many ways to consume. You can bring us your data and we give you trained parameters back. From the company perspective, you don't need to be an expert. You need to be a user, a consumer, somebody who takes it as a block and inserts it into your technology, and you can focus on your customer and the data that you gather, which is the new gold from which you can take customer insight and give other customers a better experience. That same pattern that Greg described for OpenAI and Cohere and Jasper also happens underneath, and now there are lots of ways for you to get to meaningful AI work, which you can then build upon.

I'm curious, I mean, if you look at Nvidia, it's really interesting. The market cap today, I think, is $570 billion; it's now three or four times Intel. For those of you who may not have been following the semiconductor industry for a long time, the idea that Nvidia would ever be larger than Intel, someone would have laughed at you 10 or 15 years ago. In fact, there was a point not too long ago where Nvidia was trading at the cash on its balance sheet.

Literally, and while Thomas won't tell you, that's when he began investing in them.

I wish. But I say that to say that Nvidia was in the business for over a decade, building things for AI when there was no market yet. So I'm curious, as it relates to your journey at Cerebras, six to seven years in: it feels to me like the innovation and the focus on AI has taken a different tenor even since Thanksgiving, thanks to Peter, right? You look at Google: $150 billion plus of market cap erased in just a few days on a bad demo, which is kind of interesting to think about. So do you feel like the tenor of the conversation, when you're talking to enterprises about your systems, is starting to change versus even where it was 12 or 24 months ago?

I think everybody's experienced the tenor of the conversation change. I mean, my brothers are doctors, and no one is more backward in technology than doctors. And, you know, you're the disappointment in the family.
My mother, an 80-year-old Jewish woman, introduces me to her friends as, "This is my son Andrew, the one that isn't a doctor." We sit with my brothers and it's, "He has a widget business." This goes on for years: "No, really, I mean, I save lives. I save lives. How's the widget business?" Then my brother calls me up and says, "Are you involved in this ChatGPT thing?" And I was like, a little bit, sort of, that's our industry. And he goes, "It's pretty cool," and he hangs up. And then you have conversations on airplanes: you sit down next to people and suddenly it's, "Yeah, I tried that. My kids are using it to write papers. Is that OK?" When it gets to random people who have no particular reason to have expertise in this, and they're interested and excited, you can be sure that every boardroom, every business, is asking itself, wow, should I be rethinking X? Should I be rethinking the way I do customer support, or the way I do marketing, or sales? And all of that trickles down to the infrastructure layer. So when we engage, it's no longer "should we be using this?" It's "we have 12 projects now; how can you help us?" And that's the best thing since sliced bread. When you don't have to explain why they need you, and instead you can explain why you're good at the things they know they need, they arrive with a prepared mind, and what you have to do is compete on the benefits of your product. That's awesome. And I'm sure it's the same for all of you, that suddenly family members know and grandmothers know. We were not there four years ago; it was, "You do what?" And now everybody knows. It's fun, and it cascades down into business decisions.

All right. So there's a question I've always wanted to ask you and never have, so I'm going to ask it now, and I'll let the others chime in. What happens if we have a model and it's learned every web page, every book, every TV show, every YouTube video; it's basically learned everything. Is it over? Do we go home and pack up and say, great job, we're done? I mean, eventually we will have trained on everything. So what's the next frontier after that?

Yeah, machine learning is solved; we can just pack it up.

And by the way, are we close to that? How far are we from having just trained on everything?

Maybe OpenAI is close to that; I don't feel like Cohere is at all. I think there's a lot of data, and the rate at which data is being created is staggering; it's hard to keep up to begin with. So I don't think we're heading towards a point where we've run out of tokens on the web, and certainly not where we've run out of the right tokens.

Right, there's a lot of garbage out there, and that doesn't help with training.

Yeah, it is very messy, and it's going to get messier, right, as the language models' outputs start appearing on the web and then get fed into the next generation of models that we're training. So we need to get very good at detecting the difference between synthetic media and human-generated media, and OpenAI has contributed to this. And then we need to find new sources of data, stuff like transcribing audio and video to get much more conversation- and dialogue-style text.
Yeah, and in the limit, when you exhaust that, when you're so good at consuming data that you can keep up to the very day or second, I think you have to start getting creative. Can you use the models at that stage to help augment your dataset? Can you ask them to just generate a new trillion tokens on a specific topic? So I think we need to get more creative, but I do view that as a ways out, at least.

My mind was just bent with your last comment, so we won't go too deep down that rabbit hole. Before we turn to Peter, I'm curious to get your point of view on this, Aidan. One of the things I find fascinating is that we don't seem to have reached the point of diminishing returns with models yet; it still feels like the more you feed them, you're getting an equivalent return in improvement. Maybe another way of asking it: when do you think we hit diminishing returns? At some point, right, we have to hit it. I'm curious how you think about that.

Yeah, I think it's fascinating. Initially, with the original GPT-3, which was 175 billion parameters, there was a very strong thesis that we needed to keep going larger along that axis, the parameter axis. Thinking has changed and evolved. I think what we've seen is that going further on the data axis, training for more tokens, taking more steps, was really underrated and under-appreciated. My sense, in terms of where we go: there are trends that models are getting smaller and need to be faster and more efficient; at the same time, hardware is accelerating, which enables us to train larger models. So there are two competing trends. I think we'll be able to deploy and train much larger models, and at the same time we're seeing that model scale doesn't need to be that much larger for specific capabilities, or for enterprise-relevant capabilities. So right now there's a very conflicting set of data.

Inference cost goes up with bigger models too, and latency. I want to touch on inference, so maybe we'll hit that next. But Peter, give it a shot; answer my naive question. What if we've trained on everything? What's left? How should we think about it, with OpenAI already having trained on so much of the internet? At some point, is there no more training data left, or how do we think about that?

Yeah, I think the data piece is maybe a little bit of a red herring. I mean, I feel like most humans are pretty smart, and we haven't read the whole internet; that's not really what makes us super versatile, right? So there's something different here that is required. We talk a lot about AGI and getting to the point where these models can do everything that humans can do, and sometimes I think maybe we'll look back five years from now and be like, oh my god, we were almost there already in 2023; we just didn't know how to use the models in the right way. That's probably a little bit of an exaggeration, but it's mind-blowing to me the number of things people keep finding with these models, the stuff they're able to pull out of them. Like the whole "step by step" thing recently: you add like five words or whatever, and suddenly the model gets like ten times smarter. It's pretty crazy.
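The "five words" alluded to here are a step-by-step cue of the kind popularized in the chain-of-thought prompting literature. A toy sketch, with ask_model() standing in for a real model call:

    def ask_model(prompt: str) -> str:
        """Stand-in for a call to a large language model."""
        return "<answer to: " + prompt + ">"

    question = "A train leaves at 3:00pm and arrives at 6:30pm. How long is the trip?"

    direct = ask_model(question)                                   # plain question
    with_cue = ask_model(question + "\nLet's think step by step.")  # same question plus the cue
    print(with_cue)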
So I think it's much less about the data and much more about how you put together the pieces in the right way. Some of the things Aidan said about tool use, like having the models actually go out and do research on a question. We had this demo a few months back: in a zero-shot way, getting a program written is pretty hard. But if you have the model write the program, then have the model look at the error message and go fix the code, and do a bit of iteration, you get to something that actually works. And that's how we work as humans as well, right? We don't do everything zero-shot. We iterate: we try something, and if it doesn't work, we try a different angle, and so on. I think this is much more of what we will see over the coming years. In the limit, I feel like the data should actually matter much, much less. It will be much more about the architectures and the way we train the models. And who knows what the next set of models will be like? The transformer model that Aidan and others developed, we're still on that, and it has been a pretty incredible breakthrough. I wonder whether there are a few things like that we're still missing, because I keep coming back to us as humans, and we don't necessarily need that much data to get really, really far.

Well, I think something interesting about ChatGPT is that, before ChatGPT, when GPT-3 was out, the whole focus of course was, what about 4? Is 4 going to be this much better, and this and that. And what ChatGPT showed is how much better 3 can be just with a different interface. To your point, does it even really matter how much better 4 is than 3? I'm sure it'll be great. But to me it showed that when you have a breakthrough on the interface side, and people are now using it in a lot of different ways, the models are already good enough to get incredible things out of them.

Andrew, I want to come back to a point that you made, which is inference. I would say that when I read about AI, or we talk about AI, it seems like 95% of the focus is still on training, how the models were built, and the incredible amount of data being fed into them. But you bring up a good point around inference. At the end of the day, the cost of inference is the cost of getting something out of that model. If we don't make inference cheaper, we're not going to have these models queried as much as we all would hope by everyone around the world. So what are your thoughts on inference, and generally the cost curve and the trends in that market?

Well, Peter is spending a little bit on inference right now, I think. And, you know, just a little bit; I think they're in danger of bringing down Microsoft with it. But what Aidan and Greg said really resonates with us. Big models are great, but inference on them is more expensive and there's more latency. So what you're going to want to do, I think, is right-size the model to the job.
You're going to either take a part of a big model, or develop your own, or take an open-source model, and find the right size for the problem at hand, where you can effectively serve results in a cost-effective manner at the latency your user demands. I think that's an interesting part that's frequently overlooked: once we get our models into production, it's an inference problem.

Are you a believer in inference at the edge, or do you think it's going to live in the data center, or will it live in the network?

It's going to live everywhere. I mean, if you look at the portion of your cell phone chip that's now dedicated to doing inference, it's about 14%, right? When you're a chip maker, all you have is real estate on your chip; that's it. So you can tell approximately how important something is by how much real estate they allocate to it on that fixed block of real estate. They took away space from everything else and put it on inference because they wanted it right here. And one of Stability's great steps forward was to take it away from the cloud and put it on your laptop, and all of a sudden you had Stable Diffusion. But laptops aren't big enough and good enough to do GPT too, so that still needs to be served out of the cloud. So just like everything else in our lives, some of it will be served off your cell phone, some of it will be served off your PC, and some will be served off of a cloud edge. Networks will be built to drive low latency, like other such networks, and then there will be some big clouds for the hardest work. It will be served to us the same way other things are; video is served the same way.

So I want to double-click on one thing there, which is super interesting: only ten-ish years ago, and I've been in research for a while, it used to be the case that for a lot of tasks the AI models were terrible. Image classification: you were happy if you got it right 10% of the time, or 20%, or 70% of the time. And suddenly now we're at this point where, for so many different tasks, we're approaching or surpassing human level. And we can now measure how much horsepower of a model you need for a particular task. If you're doing spelling correction, you don't need that big of a model to get really high-accuracy results. But if you're trying to solve a really complicated math problem, then you need a very big and expensive model. So it seems fairly clear that for different tasks that people want to do, at some point you're basically saturated; you don't need that much bigger a model. I think it would be really interesting for us as a field to figure out where those points are for different things. And in that case, it's pretty clear that more and more of those things will move to the edge, because they can. So I think that's one big aspect.
But another aspect of this, another thing that we haven't pushed on very much, is: can you use the smaller models to just do more, by having them do more reasoning, where you run more flops through them or run them for a longer time? And I share the view that what we'll probably end up seeing is a bit of a mixture: lots of models for the things you want in real time or where you're not connected to the internet, and then the stuff where you need the economies of scale, which needs to run in the cloud.

And I don't think we've begun to fully explore techniques to make those big models more efficient. We've adopted a sort of brute-force approach: bigger compute, more data. There are other techniques, whether it's sparsity, whether it's mixture of experts. There are all sorts of other ways we could try to make these models more efficient in training, then maybe prune them down or make them smaller and less costly to do inference on. That's an area where, as an industry, we've been so busy making it cool and interesting and usable first, and now we've got to step back and say, well, we have all these other algorithmic techniques that might make it more efficient. Now we know it can do cool stuff; can it do cool stuff off my phone? Can it do cool stuff off my laptop, instead of always going back to the cloud?

So Peter, just a follow-up on that: are you saying that one day we may have SpellGPT, a mini model that's just focused on spelling, or VoiceGPT, which might just be voice recognition? That we're going to see these kinds of smaller, task-oriented models, so that you're not pinging the big model every time, and it's broken down into much more nimble pieces?

I'm not sure. I think that will happen in some specialized domains, but it's more a matter of, it's pretty interesting to think about it in terms of the different intelligence levels you get at different model sizes. I still think you will probably have more general models running on devices, but you just have to learn that they will not be as smart; they will not be able to answer all the questions or do everything. And they will need to be able to communicate that: this is how far I can go; you have to go ask the big guy in the cloud if you want a better answer. But I'm pretty sure we'll see this kind of range of models. We already see it to some extent, where I think both us and Cohere and others offer different-sized models, and we see customers pick them based on what they need in terms of the latency profile and cost profile of the application.

And on this? Oh, sorry. No, no, go ahead.

Part of this whole scaling thing was that we were racing towards meeting performance criteria: you want to perform at a particular threshold on a wide array of tasks. Once you reach that, and you know that you have a big group of customers who are doing that one thing, maybe the use case is summarization, maybe it's something else, you can actually focus in on that specific use case and really compress model size: get a very specialized, small model that's good at that one thing. So I think that's a trend we haven't seen happening yet, but it will be coming soon. That's my prediction.
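One hedged sketch of how that compression might look in practice: use the big general model to label examples for the one narrow use case, then fit a much smaller specialized model on those labels. Every function here is a hypothetical placeholder; a real pipeline would use an actual model API and a training framework rather than these toy stand-ins.

    def big_model_summarize(document: str) -> str:
        """Stand-in for an expensive call to a large, general-purpose model."""
        return document[:60] + "..."

    def train_small_model(pairs: list[tuple[str, str]]):
        """Stand-in for fine-tuning; this toy version just memorizes the teacher's outputs."""
        memory = dict(pairs)
        return lambda doc: memory.get(doc, "summary unavailable")

    documents = [
        "Long customer support transcript about a delayed order ...",
        "Quarterly product update covering three new features ...",
    ]

    # 1. Distil: collect the big model's outputs for the one narrow task.
    pairs = [(doc, big_model_summarize(doc)) for doc in documents]

    # 2. Specialize: fit the small, cheap model once, then serve it for that job alone.
    small_summarizer = train_small_model(pairs)
    print(small_summarizer(documents[0]))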
That's what Jasper is working on, I think, in large part. A common thread to a lot of these answers is: it depends. Bigger isn't necessarily always better; it's more about what you're trying to accomplish.

"It depends" is the place where all good panels go to die. We don't want any "it depends" here.

That's true. Well, to make it concrete: yes, some things will need large language models, the biggest of them, to accomplish the job. Some more specialized jobs won't need bigger models and might actually benefit, for different reasons, from smaller models. Sort of like the processor analogy you made in the beginning: you don't want a huge processor in your phone, because it'll give you a half-hour battery life, and you don't need that processing power. For a lot of use cases, you don't need that type of horsepower from your model. So there's going to be this array of models and technologies that you can use. And for Jasper especially, we're working to make it so you don't have to figure all of that out yourself, go find the different pieces, distill the big models into small models, or find the specialized ones. Again, we try to go first to the customer to find out what they want and what use cases they're after, and then back that into whichever technologies match up to accomplish that job as best we can.

All right. Well, I think that's the end of our panel. I'll say this: I was given more raw talent on this panel than on any other panel I've ever moderated, so I hope we beat your expectations. I do want to thank Dave and Shane and the team at Jasper for putting this on. I remember when we discussed putting this conference on and no one was really doing conferences, and to see this many people assembled in person, in San Francisco by the way, is really amazing. What an exciting time to be a founder, an investor, and a user. So thanks for being great hosts, and I look forward to meeting everybody.

GenAI Conference

Hosted by Jasper
7 sessions, 5 hours