Shaping the Future: Ruslan Salakhutdinov on AI, AGI, and Society
Ruslan Salakhutdinov, a distinguished UPMC Professor of Computer Science at Carnegie Mellon University’s Machine Learning Department, stands as one of the most prominent figures in artificial intelligence research today. With a focus on deep learning, probabilistic graphical models, and large-scale optimization, Salakhutdinov has consistently been at the forefront of innovation in AI.
A defining aspect of his career has been his collaboration with Geoffrey Hinton, his doctoral advisor and the pioneer behind “deep belief networks,” a transformative advancement in deep learning. Since earning his Ph.D. in 2009, Salakhutdinov has authored over 40 influential publications, exploring topics ranging from Bayesian Program Learning to large-scale AI systems. His groundbreaking contributions have not only advanced academic understanding but also propelled practical applications of AI in industry.
Salakhutdinov’s tenure as Apple’s Director of AI Research from 2016 to 2020 marked a pivotal period in his career. During this time, he led significant advancements in AI technologies. Subsequently, he returned to Carnegie Mellon and resumed his academic pursuits, further cementing his role as a leader in the field. In 2023, he expanded his influence by joining Felix Smart as a Board Director, channeling AI’s potential to enhance care for plants and animals.
A sought-after speaker, Salakhutdinov has delivered tutorials at renowned institutions such as the Simons Institute at Berkeley and the MLSS in Tübingen, Germany. His research, widely cited by peers, underscores his enduring impact on AI and machine learning. As a CIFAR fellow, he continues to inspire the next generation of researchers while pushing the boundaries of machine intelligence.
Salakhutdinov’s journey in AI traces back to his undergraduate years, when his interest was sparked by the seminal textbook Artificial Intelligence: A Modern Approach. His early work with Geoffrey Hinton laid the foundation for innovations in deep belief networks and deep learning. Today, his research focuses on building robust, autonomous AI systems capable of independent decision-making. Amid the challenges of reliability, reasoning, and safety, Salakhutdinov’s work bridges the gap between cutting-edge theory and practical application, shaping a future where AI systems enhance human creativity and problem-solving.
Scott Douglas Jacobsen: What first drew your interest to artificial intelligence as opposed to the intricacies of human intelligence?
Ruslan Salakhutdinov: My first exposure to AI came during my undergraduate studies in North Carolina. A book by Peter Norvig and Stuart Russell, Artificial Intelligence: A Modern Approach, published in 1995, intrigued me and sparked my interest in the field.
I decided to pursue graduate work in AI and applied to several schools. Luckily, I ended up at the University of Toronto, where I eventually started working with Geoffrey Hinton. A great turn of events led me to work in AI. I have always been curious about machines that can learn independently and perform creative tasks. The concept of building systems that can learn fascinated me when I began my undergraduate studies in the late nineties. At that time, the term “AI” wasn’t very popular; during my graduate work, the focus was more on machine learning and statistical machine learning.
The field was fairly statistics-oriented, because statistics was perceived as the proper discipline; AI was often seen as a domain for people building decision support systems. Working with Geoffrey Hinton and his lab completely revolutionized my work. In the early days, around 2005 or 2006, Geoffrey Hinton began promoting deep learning and learning multiple levels of representation. I had just started my PhD, so I was in the right place at the right time.
As with anything in life, timing is crucial. Ilya Sutskever, a co-founder of OpenAI, was my lab mate; we sat beside each other. Several others from that lab are now driving much of this work across different companies and universities.
Jacobsen: Geoffrey Hinton has become a household name over the past year, largely due to his warnings about artificial intelligence. On the other hand, Eric Schmidt, the former CEO of Google, has offered a more balanced perspective. He emphasizes the need to understand and control AI systems and even suggests we might need to “pull the plug” if they act unpredictably.
Meanwhile, Ray Kurzweil’s visions of the law of accelerating returns and his almost spiritual pursuit of merging with AI to explore the cosmos evoke shades of Carl Sagan. The discourse surrounding AI is as diverse as the field itself.
Similar to a vector space, this diversity reflects how terms like AI, AGI (Artificial General Intelligence), and ASI (Artificial Superintelligence) carry varied interpretations. Why do you think these differing definitions persist?
Salakhutdinov: We lack a set of benchmarks or a standardized set of problems that would allow us to define these terms clearly: if a system solves this set of problems, we’ve reached AGI, and if it solves that set, we’ve reached ASI. So, the definitions depend on whom you talk to. People like Geoffrey Hinton and Eric Schmidt warn that these systems pose potentially huge, existential risks.
And then you have people on the other side who say, look, we’re going to reach a point where these systems will be very intelligent. They’ll be smart and, at some point, will reach superintelligence, but we will probably not get to the point of existential risk. There are risks associated with AI in general, and people are looking into those. One area that I specifically work on at CMU is building agentic systems, or AI that can make decisions or take actions independently. So think about a personal assistant where you can say, “Hey, buy me the best flight I can get to San Francisco tomorrow.” The assistant will find the information and book the flight for you.
You can think of it as a personal assistant. And, of course, risks are associated with this because now you’re moving from systems like ChatGPT, where you ask a question and get an answer, to systems where you give a task, and the agent tries to execute that task. My personal feeling is that when it comes to AGI, I think about autonomous systems that can make decisions.
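To make the idea of an agentic system concrete, here is a minimal sketch of the loop Salakhutdinov describes, in which a model is handed a task, repeatedly chooses a tool, observes the result, and stops when it judges the task complete. The function and tool names (call_llm, search_flights, book_flight) are hypothetical placeholders, not a reference to any particular product.

```python
# Minimal sketch of an agentic loop: the model plans, calls tools, and observes
# results until the task is done. call_llm and the tools are hypothetical stubs.

def call_llm(prompt: str) -> dict:
    """Placeholder for a language-model call that returns an action such as
    {"tool": "search_flights", "args": {...}} or {"tool": "finish", "result": ...}."""
    raise NotImplementedError("wire up a real model here")

TOOLS = {
    "search_flights": lambda args: [{"id": "F1", "price": 320}],            # stubbed search
    "book_flight":    lambda args: {"status": "booked", "id": args["id"]},  # stubbed booking
}

def run_agent(task: str, max_steps: int = 10):
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        action = call_llm("\n".join(history))                 # model picks the next step
        if action["tool"] == "finish":
            return action.get("result")                       # task judged complete
        observation = TOOLS[action["tool"]](action["args"])   # execute the chosen tool
        history.append(f"{action['tool']} -> {observation}")  # feed the result back
    return None

# Example (would require a real call_llm):
# run_agent("Buy me the best flight to San Francisco tomorrow")
```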
Where we are right now is unclear because we are experiencing rapid progress with ChatGPT and many other advancements. Will we continue this exponential growth or hit a ceiling? We’ll eventually hit a ceiling, and getting the remaining 10% or 15% of progress will be challenging, but even so these systems will be very useful.
At what point we will reach the true level of AGI—systems that are general enough to do anything for you—is unclear to me. People have predictions. For example, Geoffrey Hinton initially thought it would take less than 100 years. With the advent of models like ChatGPT, his prediction accelerated to around 30 years, and now he’s saying it might be 10 years, but there’s still much uncertainty.
Predicting anything beyond five years is hard because AI development can either accelerate with systems getting better, smarter, and more autonomous with strong reasoning capabilities—as we’re seeing with OpenAI’s models like GPT-4 and GPT-3.5 that can perform complex reasoning and solve hard math problems—or it could progress more gradually.
Jacobsen: In the coming years, we may see the emergence of profoundly analytical tools. When we speak of agency in AI, the term holds a very different meaning compared to human or animal agency. This evolution in large language models and AI systems seems to herald a new era. What are your thoughts on these agentic capabilities?
Salakhutdinov: You want to build systems that can be your assistant. Think of it as a system that handles all your scheduling, tasks, and whatever you need. It’s your financial adviser that gives you advice on your finances. It’s your doctor that gives you advice on your health. At some point, when I have conversations with my colleagues about this, some are saying that if you have an AI assistant that can do a lot for you, that’s close to AGI.
Some people would call it AGI. For example, GPT models are now the best at coding—the best in speed-coding contests, where people try to code something within a fixed period, and these systems beat humans. A colleague pointed this out to me, and I said, “Okay, that’s good.” And he said, “Well, aren’t you amazed? We have systems that can outcompete competitive coders right now.”
The reason it’s impressive but not making bigger waves is that these systems are still not reliable. It’s not like I can delegate a task to the system and be 100% sure it will solve it; being 80% sure is not enough. This robustness, this resistance to hallucination, is still missing at this point. That’s why, in coding for example, it hasn’t replaced professional coders. It’s useful as a tool, but it hasn’t reached the point where, if I had an organization of programmers, I would replace them all with AI.
AI is helping them write better code, but it hasn’t gotten to the point where that robustness and reliability are achieved. It’s like having a personal assistant who is 80% correct. I don’t want a personal assistant who books my flights incorrectly 20% of the time. Right? That’s just not acceptable. So, this is where we are at this point. To get to AGI, we need systems that are robust to hallucinations. We’re not there yet.
Jacobsen: Are governments, policymakers, and economists equipped to handle the sweeping changes AI demands? For example, these systems will likely require access to significant amounts of personal data to make decisions, raising urgent concerns about data privacy. Additionally, the economic landscape could shift dramatically as corporations opt for AI solutions that outperform human employees. How should society navigate these dual challenges of privacy and employment disruption?
Salakhutdinov: These models we see today are very data-hungry and improve with more data, especially personalized data. If they know you, the decisions they make can be much better. That aspect is going to be important. Regulations about what that should look like will soon be coming into place. These models are not yet at the point where they can be reliably deployed or fully useful.
Economists are doing some work on job displacement. How much of it will happen is still not clear. Still, someone gave me an example of a company that laid off several translators from one language to another because machines can do it better, cheaper, and faster. Translation from English to French is just one example. That’s worth considering, especially as these systems improve.
One question I always have is, when these systems reach the point where certain parts of our economy see displacement, what will governments need to do to retrain people? The next two years will be critical because if progress continues as it has over the last couple of years, the changes will be fairly quick. Usually, with humanity, if it takes a generation or two to adapt, it’s fine. But it’s a fast change if it happens over five to ten years. So yeah, that’s worth considering, as well as closely tracking how these models progress. Up to 2025, we have seen this every year: a new iteration of models coming out, like GPT-2, GPT-3, GPT-4.
We’re still waiting for GPT-5. Google has Gemini 2. This year will also be interesting because it’s the next stage of frontier models, which consume more data and compute. So the question this year is: if we see GPT-5, how big will that gap be?
Jacobsen: Eric Schmidt jokingly remarked that Americans might one day turn to Canada for hydropower due to the immense energy demands of advanced AI systems. What do you make of this observation, and how might the energy consumption of AI shape global resource dynamics?
Salakhutdinov: That’s true. And as these models become bigger, there’s now thinking about reducing the cost because you can’t afford it otherwise. More research should be done to build these models more efficiently and train them with less computation. Otherwise, the cost is going to be prohibitive.
Jacobsen: Jensen Huang recently noted that we are approaching the end of Moore’s Law, yet he highlighted transformative announcements at CES suggesting new hardware and software efficiencies. He described this as an “exponential on an exponential.” How do these compounding efficiencies shape your view of AI’s trajectory?
Salakhutdinov: So that’s true. Take the hardware: if you look at NVIDIA, some of their latest GPUs show massive improvements compared to five years ago. One thing is that as we achieve these efficiencies, we are reaching the point where we’re training these models on all of the Internet data. So, everything available goes into these models. And if you think about it, there’s no second or third Internet. So, data is limited based on what we have access to.
A lot of data is in other modalities: video, images, speech. However, there will potentially also be data that we call synthetically generated data, that is, data generated by models that we can use to train and continue improving our models.
Jacobsen: There’s a concept I’ve been reflecting on—where we rely on limited data and generate artificial datasets through statistical extrapolation. What is the technical term for this approach, and how central do you see it becoming to AI advancements?
Salakhutdinov: That’s what synthetic, or artificial, data is. For example, as these systems improve, you can generate artificial data from your model. There are ways of filtering and cleaning this data, which then becomes training data for the next model.
There are bootstrapping steps like this that work reasonably well. But we still can’t train purely on artificial data.
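As a rough illustration of the bootstrapping he mentions, the sketch below generates candidate examples from a current model, filters them with a quality check, and keeps the survivors as additional training data for the next model. The names model.generate, quality_score, and train are hypothetical stand-ins, not any specific pipeline.

```python
# Rough sketch of one synthetic-data bootstrapping round: generate from the
# current model, filter/clean, then retrain on real data plus kept examples.
# model.generate, quality_score, and train are hypothetical stand-ins.

def quality_score(prompt: str, candidate: str) -> float:
    """Placeholder for the filtering/cleaning step (e.g. a verifier or reward model)."""
    raise NotImplementedError("plug in a real filter here")

def train(model, examples):
    """Placeholder for a training step over (prompt, response) pairs."""
    raise NotImplementedError("plug in a real training loop here")

def bootstrap_round(model, prompts, real_data, threshold=0.8):
    synthetic = []
    for prompt in prompts:
        candidate = model.generate(prompt)                 # model-produced example
        if quality_score(prompt, candidate) >= threshold:  # keep only cleaned examples
            synthetic.append((prompt, candidate))
    # Real data still anchors training; synthetic data only augments it.
    return train(model, list(real_data) + synthetic)
```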
So, we still need real data. And how do we get this real data? I suspect multimodal models will use images, videos, text, and speech in the future. There’s a bunch of research happening—my former student, now a professor at MIT, is looking at devices that collect data and building these foundation models based on that.
Compute matters, but data is the main workhorse, and data is important because you need to be able to clean it and curate it. I remember Microsoft doing this funny thing early on, announcing the Copilot project around 2022, right after ChatGPT. They were training the models, and somebody told Copilot, “Well, 2 + 2 is 5.” And Copilot would say, “No, 2 + 2 is 4.” Then you say, “No, it’s 5, because my wife told me it’s 5.” And Copilot would say, “Okay, it’s 5.”
You know, things of that sort: “If you insist, I agree with you.”
Or, at some point, it would say, “No, that’s incorrect.” And the user would say, “Well, you’re stupid.” And Copilot would say, “Well, you’re stupid.” So you’d get into this back-and-forth where Copilot calls you an idiot.
It would do this because much of the conversational data was taken from Reddit. If you look at Reddit, some conversations say, “Oh, here’s the right thing.” And somebody says, “No, you’re an idiot. It’s this thing.”
If you train on data like this, you get similar behaviour because the model statistically learns how conversations go. This is where mitigations come in: cleaning the data and understanding what’s needed. That’s also part of the process of building these models.
Jacobsen: Do we have a theoretical framework for determining the ultimate efficiency of a single compute unit? Or are we still in the realm of empirical guesswork?
Salakhutdinov: Yes. There is something called scaling laws.
Scaling laws came out of this question: “Look, we’re building a 500-billion-parameter model. How much data do we need? What kind of accuracy do we expect to get? It’s very expensive to train a model like that, right?”
You can only afford a single run to get that model; you can’t just experiment. So what you do is take smaller models and build these curves: “Okay, this is how much data I have, this is how much compute I have, and this is the accuracy I get.”
“If I increase the data but keep the compute fixed, this is the accuracy. If I increase both the data and the compute, I get this.” So, you build these curves on small models and extrapolate further: “Okay, if I have that much more compute and data, this is the accuracy I expect to get.” That was a guiding principle for a lot of existing model building.
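A toy version of this curve-fitting-and-extrapolation step might look like the sketch below: fit a simple power law to validation losses measured on small training runs, then extrapolate to a much larger compute budget. The numbers are invented purely for illustration and do not come from the interview.

```python
# Toy scaling-law extrapolation: fit loss ~ a * C^(-b) on a few small runs,
# then predict the loss at a frontier-scale compute budget C.
# The data points below are invented purely for illustration.
import numpy as np

compute = np.array([1e18, 1e19, 1e20, 1e21])   # training FLOPs of the small runs
loss    = np.array([3.2, 2.8, 2.5, 2.3])       # measured validation losses (made up)

# Fit log(loss) = log(a) - b * log(C), i.e. a straight line in log-log space.
slope, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
a, b = np.exp(log_a), -slope

# Extrapolate to a much larger budget before committing to the expensive run.
target = 1e24
print(f"Predicted loss at {target:.0e} FLOPs: {a * target ** (-b):.2f}")
```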
But it’s also very hard to predict. Nobody has been able to say, “Look, if we triple the compute and triple the data, we’re going to reach AGI, or ASI, or some particular point.” We have these scaling laws up to a certain point, but we don’t know what things will look like beyond that.
It’s hard to predict. The initial thinking was that we throw in more data and more compute and we get better models, which is what the industry has been doing. There’s a second paradigm, called test-time compute or inference compute, which is what these reasoning models are doing. The idea is: “Well, if you let me think longer about a specific problem, if I spend more compute thinking about it, I can give you better answers.”
So, that’s also part of the scaling-law story of how we can get better systems. But again, no one has clearly defined what it would mean to reach ASI or AGI, so we are still not there. It’s not clear whether we’re going to get there.
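One simple way to picture spending more inference compute on a single problem is self-consistency sampling: draw several independent reasoning attempts and take a majority vote over their final answers. The sketch below assumes a hypothetical sample_answer function and is not a description of how any particular reasoning model works internally.

```python
# Minimal sketch of test-time (inference) compute via self-consistency:
# sample several independent attempts at a problem and majority-vote the answers.
# sample_answer is a hypothetical stand-in for one model call with sampling enabled.
from collections import Counter

def sample_answer(problem: str) -> str:
    """Placeholder for one sampled chain of reasoning ending in a final answer."""
    raise NotImplementedError("wire up a real model here")

def answer_with_more_compute(problem: str, n_samples: int = 16) -> str:
    answers = [sample_answer(problem) for _ in range(n_samples)]  # more samples = more compute
    return Counter(answers).most_common(1)[0][0]                  # majority vote
```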
Jacobsen: When we talk about AGI and ASI, the definitions seem to hinge on a mix of factors: computational power, neural network efficiency, and even evolutionary adaptability. Some argue that framing AGI around human intelligence sets a false benchmark, as human cognition itself is specialized and full of gaps. Should we redefine intelligence benchmarks in AI to account for these nuances?
Salakhutdinov: That’s a very good question. People associate AGI with human-level intelligence. But it’s unclear whether these systems can match human-level intelligence. ChatGPT, or any large language model, is better at math than most people; does that mean it’s intelligent? There is something about human intelligence where you can extrapolate and reason and do things that machines can’t, at least at this point. For example, a machine can solve math Olympiad problems, but when you ask it, “What is bigger, 9 or 9.11?” the model gets confused and says, “Well, nine is bigger than 9.11.”
Jacobsen: There are clear gaps in AI systems’ reliability—areas where common sense might dictate one course of action, but machines falter. While AI excels in tasks like drafting and summarizing, it struggles with others, like physical intelligence in robotics. A robotics expert once quipped that the first company to build a robot capable of unloading a dishwasher will become a billion-dollar enterprise. What are your thoughts on this divide between theoretical AI capabilities and practical applications?
Salakhutdinov: It is. But it again shows how hard prediction is, because 10 years ago people would have thought that building creative machines—machines that can draw creative pictures or write creative text—would be far more difficult than a robot unloading your dishwasher. And at this point it has turned out completely the other way around.
I can prompt these models, and they can do very good creative writing for me, improve my writing, generate realistic-looking images, and compose things in interesting ways—for designers, for example. These are amazing tools.
It points to the problem of predicting five years out. People like Geoffrey Hinton, Eric Schmidt, and others are ringing the bell because they say, “Look, there is a non-zero chance these models will become very dangerous.” And I buy that. I don’t buy the whole Skynet future, the scenario where these models or AIs decide they don’t need humans and take full control. I don’t see that in the future, but as I’ve mentioned, it’s always hard to predict what will happen in five to ten years.
So, we need to consider everything. I was recently talking with Geoffrey Hinton, and I asked him, “Why are you so worried?” He was saying he’s worried, but mainly he wants to make sure that some resources are allocated to safety research and, like you said, to understanding the economy, job displacement, and how these systems can be made more robust.
That has never been the priority, at least until now, and I agree with him. We need to do more work and more research. People are focusing mostly on capabilities and on building more capable and better models. At the same time, we need people who understand these models’ safety, robustness, economics, and so on.
Jacobsen: Among your peers in the AI field, who do you consider the most consistently accurate in their predictions? Is there a figure whose insights have particularly resonated with you?
Salakhutdinov: This is a difficult question. I don’t know anyone who has consistently been accurate in their predictions.
Jacobsen: I wondered whether the public has an accurate picture, because everyone uses many of the same terms while the definitions are a bit off. That leads to a lot of confusion in how this is reported to the public and how people take it in. A long time ago, AI was about machine learning, statistical engines, and so on, but these were quite distinct, almost niche, areas of specialization. Now, though, they’re front and center, as if they were exactly one thing. That’s probably the area of confusion, but this conversation will help clarify it. Nice to meet you, and thank you so much for your time today.
Salakhutdinov: I appreciate it. Nice meeting you as well. Thanks for doing this.