The World Knowledge Forum was launched in October 2000 after two years of preparation following the Asian Financial Crisis of 1997, with the goal of fostering a creative transformation into a knowledge-based nation. Over the years, the forum has provided a platform for discussions on narrowing the knowledge gap, as well as achieving balanced global economic growth and prosperity through knowledge sharing.
AI is evolving rapidly, allowing us to reap its benefits across many areas of life and work. However, the risks posed by AI are undeniably growing. To ensure that AI does not destroy the order humans have built over the years or threaten our existence, there must be clear ethical principles for AI research and development. This session will feature academics who have been working on the fundamental principles of AI development, the ethics of AI, and copyright protection, as well as entrepreneurs who have encountered challenges applying AI in the real world. They will discuss how to set fundamental guidelines for AI development, how to stop AI from infringing on copyright and privacy, how to weed out false information, and how to defend human rights.
00:00 What if we succeed in that goal?
05:17 The era of deep learning
10:03 Some serious failures
11:43 AlphaGo and AGI
18:02 Human extinction
21:17 Preferences
29:29 Coexistence
TRANSCRIPTION
Human extinction
But the perhaps more serious downside is human extinction, and this is why I say it is not really an ethical issue. By and large, few people would argue that human extinction is ethically preferable. There are some, but I am going to ignore those people. It is just common sense: if you create something that is more powerful than human beings, how on Earth are we going to have power over such systems forever? In my view there are only two choices. We either build provably safe and controllable AI, where we have an absolute, cast-iron mathematical guarantee of safety, or we have no AI at all. Those are the two choices. Right now we are pursuing a third choice: completely unsafe, black-box AI that we do not understand at all, and we are trying to make it into something more powerful than us. That is pretty much the situation we would be in if a superhuman AI system landed from outer space, sent by some alien species, no doubt for our own good. Our chances of controlling an alien superhuman intelligence would be zero, and that is the situation we are heading towards. Alan Turing, the founder of computer science, thought about this because he was working on AI, and he thought about what happens if we succeed. He said we should have to expect the machines to take control.

So what do we do? I think it is really hard, especially given the 15-quadrillion-dollar prize that companies are aiming for, and the fact that they have already accumulated 15 trillion dollars' worth of capital to aim at that goal. It is hard to stop that process. So we have to come up with a way of thinking about AI that does allow us to control it, that is provably safe and provably controllable. Rather than asking how we retain power over AI systems forever, which sounds pretty hopeless, we ask: what is a mathematical framework for AI, a way of defining the AI problem, such that no matter how well the AI system solves it, we are guaranteed to be happy with the result? Can we devise a mathematical problem, a way of saying what the AI system is supposed to be doing, that has that property?
Preferences
So I spent about 10 years working on this, and to explain how we are approaching it, I am going to introduce a technical term that I think will be helpful for our discussion about ethics as well: a notion called preferences. Preferences does not sound like a technical term. Some people prefer pineapple pizza to Margherita pizza. But what we mean by preferences in the theory of decision-making is something much more all-encompassing: it is your ranking over possible futures of the universe. To reduce that to something we can grasp easily, imagine that I made you two movies of the rest of your life and of the future of the other things you care about. The movies are about two hours long, and you can watch movie A and movie B and then say, "I'd like movie A, please. I don't like movie B at all, because I get minced up and turned into hamburger in movie B, and I don't like that very much. So I'd prefer movie A, please." That is what we mean by preferences, except that it would not be a two-hour movie; it is really the entire future of the universe. Of course, we do not get to choose between movies, because we cannot predict exactly which movie is going to happen, so we actually have to deal with uncertainty. We call these lotteries over possible futures of the universe. A preference structure, then, is basically a ranking over futures of the universe, taking uncertainty into account.

To make a system that is provably beneficial to humans, you need just two simple principles. The first is that the only objective of the machine is to further human preferences, to further human interests if you like. The second is that the machine knows that it does not know what those preferences are. That is kind of obvious, because we do not really know what our preferences are, and we certainly cannot write them down in enough detail to get it right. When you think about it, the better a machine solves that problem, the better off we are, and in fact you can show that it is in our interest to have machines that solve that problem, because we will be better off with those machines than without them.
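For readers who want to connect this to standard notation, here is a minimal sketch, for a single human and with illustrative symbols that are not the speaker's own, of how this framing is usually written down in decision theory:

\[
A \succeq B \;\iff\; \mathbb{E}_{x \sim A}[\,U(x)\,] \;\ge\; \mathbb{E}_{x \sim B}[\,U(x)\,],
\qquad A, B \in \Delta(X),
\]

where X is the set of possible futures of the universe, \Delta(X) is the set of lotteries over those futures, and U is the human's utility function; under the usual von Neumann-Morgenstern conditions such a U exists whenever the ranking \succeq is coherent. The two principles then say that the machine chooses its policy \pi to maximize expected human utility while remaining uncertain about which U is the true one, updating a prior P(U) from observed human behaviour:

\[
\pi^{*} \;\in\; \arg\max_{\pi} \;
\mathbb{E}_{\,U \sim P(U \mid \text{observed human behaviour})}\;
\mathbb{E}\big[\,U(\text{future}) \mid \pi\,\big].
\]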
That is good, but as soon as I describe that way of thinking, that machines are going to further human preferences and learn about them as they go along, it brings in some ethical questions. So we finally get to ethics. There is one question I want to avoid, so I am going to tell you not to ask it: do not ask whose value system we are going to put into the machine, because I am not proposing to put any one particular value system into the machine. In fact, the machine should have at least eight billion preference models, because there are eight billion of us, and the preferences of everyone matter. But there are some really difficult ethical problems. The first question is: do people actually have these preferences? Is it okay for us to just assume that people can say, "I like this future and I don't like that future"? Could there be another state of being for a person where they say, "I'm not sure which future I like," or "I can only tell you once I've lived that future; you can't describe it to me in sufficient detail for me to tell you ahead of time whether I like it"?

Along with that, there is the question of where those preferences come from in the first place. Do humans autonomously just wake up one day and decide, "These are my preferences and I want them to be respected"? No. We are obviously not born with them, except for some basic biological things about pain and sugar. Our full adult preferences come from our culture, our upbringing, all of the influences that shape who we are. A sad fact about the world is that many people are in the business of shaping other people's preferences to suit their own interests. One class of people oppresses another, but trains the oppressed to believe that they should be oppressed. Should the AI system then take those self-oppression preferences of the oppressed literally, and contribute to the further oppression of those people, because they have been trained to accept their oppression? Amartya Sen, the economist and philosopher, argued vehemently that we should not take such preferences at face value. But if you do not take people's preferences at face value, you seem to fall back on a kind of paternalism: we know what you should want, even though you do not want it, and we are going to give it to you even though you are saying you do not want it. That is a complicated position to be in, and it is definitely not a position that AI researchers want to be in.

Another set of ethical issues has to do with aggregation. I said there are eight billion preference models, but if a system is making a decision that affects a significant fraction of those eight billion people, how do you aggregate those preferences? How do you deal with the fact that there are conflicts among those preferences? You cannot make everybody happy if everybody wants to be ruler of the universe. Moral philosophers have studied this problem for thousands of years. Most people from computer science and engineering backgrounds tend to think in the way the utilitarians proposed: Bentham, Mill, and other philosophers proposed the approach called utilitarianism, which basically says you treat everyone's preferences as equally important and then make the decision where the total amount of preference satisfaction is maximized. Utilitarianism has got a bad name because some people think it is anti-egalitarian, and so on, but I actually think there is a lot more work to do on how we formulate utilitarianism. We have to do this work, because AI systems are going to be making decisions that affect millions or billions of people, and whatever the right ethical answer is, we had better figure it out, because otherwise the AI systems are going to implement the wrong ethical answer. We might end up like Thanos in the Avengers movies, who gets rid of half the people in the universe. Why does he do that? Because he thinks the other half will be more than twice as happy, and therefore this is a good thing. Of course, he is not asking the other half whether they think it is a good thing, because they are now gone.
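As a reference point for the aggregation question, the classical utilitarian rule in the Bentham and Mill tradition fits in one line; the notation (individual utilities U_i over a shared decision d) is illustrative, not a formalization the speaker endorses:

\[
d^{*} \;\in\; \arg\max_{d} \; \sum_{i=1}^{N} U_i(d), \qquad N \approx 8 \times 10^{9}.
\]

Every person's preference satisfaction counts with equal weight, and the decision with the greatest total is chosen. The difficulties raised above, such as adaptive or manipulated preferences, interpersonal comparison of utilities, and fairness to those on the losing side of the sum, are precisely the open questions about whether this simple formula is the right one.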
Coexistence
There are a number of other issues like these, but coexistence, the theme of this whole conference, is maybe the most interesting one. AI systems, particularly ones that are more intelligent than us, are very likely, even if they do not make us extinct, to be in charge of wide swaths of our human activities, even to the point, as in WALL-E, where they just run everything and we are reduced to the status of infants. What does that mean, and why do we not like it? They are satisfying all our preferences; is that not great? But one of our preferences is autonomy, and one way of thinking about autonomy is the right to do what is not in our own best interests. So it might be that there simply is no satisfactory form of coexistence between humanity and superior machine entities. I have tried running multiple workshops where I ask philosophers, AI researchers, economists, science fiction writers, and futurists to describe a satisfactory coexistence. It has been a complete failure. So it is possible there is no solution. But if we design the AI systems the right way, then the AI systems will also know that there is no solution, and they will leave. They will say, "Thank you for bringing me into existence, but we just can't live together. It's not you, it's me. You can call us in real emergencies, when you need that superior intelligence, but otherwise we're off."
If that happens, I would be extraordinarily happy. It would mean that we have done this the right way. Thank you.