Why 2016 Was Actually a Year of Hope

Just about everyone found something to dislike about 2016, from wars and politics to celebrity deaths. But hidden within this year’s news feeds were some genuinely exciting stories, and some of them can even give us hope for the future.

Artificial Intelligence

Though concerns about the future of AI still loom, 2016 was a great reminder that, when harnessed for good, AI can help humanity thrive.

AI and Health

Some of the most promising, and possibly most immediate, breakthroughs and announcements were related to health. Google’s DeepMind announced a new division that will focus on helping doctors improve patient care. Harvard Business Review considered what an AI-enabled hospital might look like, one that could improve the hospital experience for the patient, the doctor, and even the patient’s visitors and loved ones. A breakthrough from MIT researchers could see AI used to design new drug compounds more quickly and effectively, with applications across a range of health needs.

More specifically, Microsoft wants to cure cancer, and the company has been working with research labs and doctors around the country to use AI to improve cancer research and treatment. But Microsoft isn’t the only company that hopes to cure cancer. DeepMind Health also partnered with University College London’s hospitals to apply machine learning to diagnose and treat head and neck cancers.

AI and Society

Other researchers are turning to AI to help solve social issues. While AI has what is known as the “white guy problem” and examples of bias cropped up in many news articles, Fei-Fei Li has been working with girls in STEM at Stanford to bridge the gender gap. Stanford researchers also published research suggesting that artificial intelligence could help us use satellite data to combat global poverty.

It was also a big year for research on how to keep artificial intelligence safe as it continues to develop. Google and the Future of Humanity Institute made big headlines with their work to design a “kill switch” for AI. Google Brain also published a research agenda on various problems AI researchers should be studying now to help ensure safe AI for the future.

Even the White House got involved in AI this year, hosting four symposia on AI and releasing reports in October and December about the potential impact of AI and the necessary areas of research. The White House reports are especially focused on the possible impact of automation on the economy, but they also look at how the government can contribute to AI safety, especially in the near future.

AI in Action

And of course there was AlphaGo. In January, Google’s DeepMind published a paper announcing that the company had created a program, AlphaGo, that could beat one of Europe’s top Go players. Then, in March, in front of a live audience, AlphaGo beat the reigning world champion of Go in four out of five games. These results took the AI community by surprise and suggest that artificial intelligence may be progressing more rapidly than many in the field had realized.

And AI went beyond research labs this year to be applied practically and beneficially in the real world. Perhaps most hopeful was some of the news that came out about the ways AI has been used to address issues connected with pollution and climate change. For example, IBM has had increasing success with a program that can forecast pollution in China, giving residents advanced warning about days of especially bad air. Meanwhile, Google was able to reduce its power usage by using DeepMind’s AI to manipulate things like its cooling systems.

And speaking of addressing climate change…

Climate Change

With recent news from climate scientists indicating that climate change may be coming on faster and stronger than previously anticipated and with limited political action on the issue, 2016 may not have made climate activists happy. But even here, there was some hopeful news.

Among the biggest news was the ratification of the Paris Climate Agreement. But more generally, countries, communities, and businesses came together on various issues of global warming, and Voice of America offers five examples of how this was a year of incredible, global progress.

But there was also news of technological advancements that could soon help us address climate issues more effectively. Scientists at Oak Ridge National Laboratory have discovered a way to convert CO2 into ethanol. A researcher from UC Berkeley has developed a method for artificial photosynthesis, which could help us more effectively harness the energy of the sun. And a multi-disciplinary team has genetically engineered bacteria that could be used to help combat global warming.

Biotechnology

Biotechnology – with fears of designer babies and manmade pandemics – is easily one of the most feared technologies. But rather than causing harm, the latest biotech advances could help save millions of people.

CRISPR

In the course of about two years, CRISPR-Cas9 went from a new development to what could become one of the world’s greatest advances in biology. Results of studies early in the year were promising, but as the year progressed, the news just got better. CRISPR was used to successfully remove HIV from human immune cells. A team in China used CRISPR on a patient for the first time in an attempt to treat lung cancer (treatments are still ongoing), and researchers in the US have also received approval to test CRISPR cancer treatments in patients. And CRISPR was also used to partially restore sight to blind animals.

Gene Drive

Where CRISPR could have the most dramatic, life-saving effect is in gene drives. By using CRISPR to modify the genes of an invasive species, we could potentially eliminate the unwelcome plant or animal, reviving the local ecology and saving native species that may be on the brink of extinction. But perhaps most impressive is the hope that gene drive technology could be used to end mosquito- and tick-borne diseases such as malaria, dengue, and Lyme. Eliminating these diseases could easily save over a million lives every year.

Other Biotech News

The year saw other biotech advances as well. Researchers at MIT addressed a major problem in synthetic biology in which engineered genetic circuits interfere with each other. Another team at MIT engineered an antimicrobial peptide that can eliminate many types of bacteria, including some of the antibiotic-resistant “superbugs.” And various groups are also using CRISPR to create new ways to fight antibiotic-resistant bacteria.

Nuclear Weapons

If ever there was a topic that does little to inspire hope, it’s nuclear weapons. Yet even here we saw some positive signs this year. The Cambridge City Council voted to divest its $1 billion pension fund from any companies connected with nuclear weapons, earning the city an official commendation from the U.S. Conference of Mayors. Divestment may prove a useful tool for the general public to express displeasure with nuclear policy, and growing awareness of the nuclear weapons situation could help stigmatize the new nuclear arms race.

In February, Londoners held the largest anti-nuclear rally Britain had seen in decades, and the following month MinutePhysics posted a video about nuclear weapons that’s been seen by nearly 1.3 million people. In May, scientific and religious leaders came together to call for steps to reduce nuclear risks. And all of that pales in comparison to the attention the U.S. elections brought to the risks of nuclear weapons.

As awareness of nuclear risks grows, so do our chances of instigating the change necessary to reduce those risks.

The United Nations Takes on Weapons

But if awareness alone isn’t enough, then recent actions by the United Nations may instead be a source of hope. As October came to a close, the United Nations voted to begin negotiations on a treaty that would ban nuclear weapons. While this might not have an immediate impact on nuclear weapons arsenals, the stigmatization caused by such a ban could increase pressure on countries and companies driving the new nuclear arms race.

The U.N. also announced recently that it would officially begin looking into the possibility of a ban on lethal autonomous weapons, a cause that’s been championed by Elon Musk, Steve Wozniak, Stephen Hawking and thousands of AI researchers and roboticists in an open letter.

Looking Ahead

And why limit our hope and ambition to merely one planet? This year, a group of influential scientists led by Yuri Milner announced Breakthrough Starshot, a plan to send tiny, light-propelled probes to Alpha Centauri, our nearest star system. Elon Musk later announced his plans to colonize Mars. And an MIT scientist wants to make all of these trips possible for humans by using CRISPR to reengineer our own genes to keep us safe in space.

Yet for all of these exciting events and breakthroughs, perhaps what’s most inspiring and hopeful is that this represents only a tiny sampling of all of the amazing stories that made the news this year. If trends like these keep up, there’s plenty to look forward to in 2017.

Podcast: FLI 2016 – A Year In Review

For FLI, 2016 was a great year, full of our own success, but also great achievements from so many of the organizations we work with. Max, Meia, Anthony, Victoria, Richard, Lucas, David, and Ariel discuss what they were most excited to see in 2016 and what they’re looking forward to in 2017.

AGUIRRE: I’m Anthony Aguirre. I am a professor of physics at UC Santa Cruz, and I’m one of the founders of the Future of Life Institute.

STANLEY: I’m David Stanley, and I’m currently working with FLI as a Project Coordinator/Volunteer Coordinator.

PERRY: My name is Lucas Perry, and I’m a Project Coordinator with the Future of Life Institute.

TEGMARK: I’m Max Tegmark, and I have the fortune to be the President of the Future of Life Institute.

CHITA-TEGMARK: I’m Meia Chita-Tegmark, and I am a co-founder of the Future of Life Institute.

MALLAH: Hi, I’m Richard Mallah. I’m the Director of AI Projects at the Future of Life Institute.

KRAKOVNA: Hi everyone, I am Victoria Krakovna, and I am one of the co-founders of FLI. I’ve recently taken up a position at Google DeepMind working on AI safety.

CONN: And I’m Ariel Conn, the Director of Media and Communications for FLI. 2016 has certainly had its ups and downs, and so at FLI, we count ourselves especially lucky to have had such a successful year. We’ve continued to progress with the field of AI safety research, we’ve made incredible headway with our nuclear weapons efforts, and we’ve worked closely with many amazing groups and individuals. On that last note, much of what we’ve been most excited about throughout 2016 is the great work these other groups in our fields have also accomplished.

Over the last couple of weeks, I’ve sat down with our founders and core team to rehash their highlights from 2016 and also to learn what they’re all most looking forward to as we move into 2017.

To start things off, Max gave a summary of the work that FLI does and why 2016 was such a success.

TEGMARK: What I was most excited by in 2016 was the overall sense that people are taking seriously this idea – that we really need to win this race between the growing power of our technology and the wisdom with which we manage it. Every single way in which 2016 is better than the Stone Age is because of technology, and I’m optimistic that we can create a fantastic future with tech as long as we win this race. But in the past, the way we’ve kept one step ahead is always by learning from mistakes. We invented fire, messed up a bunch of times, and then invented the fire extinguisher. We at the Future of Life Institute feel that that strategy of learning from mistakes is a terrible idea for more powerful tech, like nuclear weapons, artificial intelligence, and things that can really alter the climate of our globe.

Now, in 2016 we saw multiple examples of people trying to plan ahead and to avoid problems with technology instead of just stumbling into them. In April, we had world leaders getting together and signing the Paris Climate Accords. In November, the United Nations General Assembly voted to start negotiations about nuclear weapons next year. The question is whether they should actually ultimately be phased out; whether the nations that don’t have nukes should work towards stigmatizing building more of them – with the idea that 14,000 is way more than anyone needs for deterrence. And – just the other day – the United Nations also decided to start negotiations on the possibility of banning lethal autonomous weapons, which is another arms race that could be very, very destabilizing. And if we keep this positive momentum, I think there’s really good hope that all of these technologies will end up having mainly beneficial uses.

Today, we think of our biologist friends as mainly responsible for the fact that we live longer and healthier lives, and not as those guys who make the bioweapons. We think of chemists as providing us with better materials and new ways of making medicines, not as the people who built chemical weapons and are all responsible for global warming. We think of AI scientists as – I hope, when we look back on them in the future – as people who helped make the world better, rather than the ones who just brought on the AI arms race. And it’s very encouraging to me that as much as people in general – but also the scientists in all these fields – are really stepping up and saying, “Hey, we’re not just going to invent this technology, and then let it be misused. We’re going to take responsibility for making sure that the technology is used beneficially.”

CONN: And beneficial AI is what FLI is primarily known for. So what did the other members have to say about AI safety in 2016? We’ll hear from Anthony first.

AGUIRRE: I would say that what has been great to see over the last year or so is the AI safety and beneficiality research field really growing into an actual research field. When we ran our first conference a couple of years ago, they were these tiny communities who had been thinking about the impact of artificial intelligence in the future and in the long-term future. They weren’t really talking to each other; they weren’t really doing much actual research – there wasn’t funding for it. So, to see in the last few years that transform into something where it takes a massive effort to keep track of all the stuff that’s being done in this space now. All the papers that are coming out, the research groups – you sort of used to be able to just find them all, easily identified. Now, there’s this huge worldwide effort and long lists, and it’s difficult to keep track of. And that’s an awesome problem to have.

As someone who’s not in the field, but sort of watching the dynamics of the research community, that’s what’s been so great to see. A research community that wasn’t there before really has started, and I think in the past year we’re seeing the actual results of that research start to come in. You know, it’s still early days. But it’s starting to come in, and we’re starting to see papers that have been basically created using these research talents and the funding that’s come through the Future of Life Institute. It’s been super gratifying. And seeing that it’s a fairly large amount of money – but fairly small compared to the total amount of research funding in artificial intelligence or other fields – but because it was so funding-starved and talent-starved before, it’s just made an enormous impact. And that’s been nice to see.

CONN: Not surprisingly, Richard was equally excited to see AI safety becoming a field of ever-increasing interest for many AI groups.

MALLAH: I’m most excited by the continued mainstreaming of AI safety research. There are more and more publications coming out by places like DeepMind and Google Brain that have really lent additional credibility to the space, as well as a continued uptake of more and more professors, and postdocs, and grad students from a wide variety of universities entering this space. And, of course, OpenAI has come out with a number of useful papers and resources.

I’m also excited that governments have really realized that this is an important issue. So, while the White House reports have come out recently focusing more on near-term AI safety research, they did note that longer-term concerns like superintelligence are not necessarily unreasonable for later this century. And that they do support – right now – funding safety work that can scale toward the future, which is really exciting. We really need more funding coming into the community for that type of research. Likewise, other governments – like the U.K. and Japan, Germany – have all made very positive statements about AI safety in one form or another. And other governments around the world.

CONN: In addition to seeing so many other groups get involved in AI safety, Victoria was also pleased to see FLI taking part in so many large AI conferences.

KRAKOVNA: I think I’ve been pretty excited to see us involved in these AI safety workshops at major conferences. So on the one hand, our conference in Puerto Rico that we organized ourselves was very influential and helped to kick-start making AI safety more mainstream in the AI community. On the other hand, it felt really good in 2016 to complement that with having events that are actually part of major conferences that were co-organized by a lot of mainstream AI researchers. I think that really was an integral part of the mainstreaming of the field. For example, I was really excited about the Reliable Machine Learning workshop at ICML that we helped to make happen. I think that was something that was quite positively received at the conference, and there was a lot of good AI safety material there.

CONN: And of course, Victoria was also pretty excited about some of the papers that were published this year connected to AI safety, many of which received at least partial funding from FLI.

KRAKOVNA: There were several excellent papers in AI safety this year, addressing core problems in safety for machine learning systems. For example, there was a paper from Stuart Russell’s lab published at NIPS, on cooperative IRL. This is about teaching AI what humans want – how to train an RL algorithm to learn the right reward function that reflects what humans want it to do. DeepMind and FHI published a paper at UAI on safely interruptible agents, that formalizes what it means for an RL agent not to have incentives to avoid shutdown. MIRI made an impressive breakthrough with their paper on logical inductors. I’m super excited about all these great papers coming out, and that our grant program contributed to these results.

CONN: For Meia, the excitement about AI safety went beyond just the technical aspects of artificial intelligence.

CHITA-TEGMARK: I am very excited about the dialogue that FLI has catalyzed – and also engaged in – throughout 2016, and especially regarding the impact of technology on society. My training is in psychology; I’m a psychologist. So I’m very interested in the human aspect of technology development. I’m very excited about questions like, how are new technologies changing us? How ready are we to embrace new technologies? Or how our psychological biases may be clouding our judgement about what we’re creating and the technologies that we’re putting out there. Are these technologies beneficial for our psychological well-being, or are they not?

So it has been extremely interesting for me to see that these questions are being asked more and more, especially by artificial intelligence developers and also researchers. I think it’s so exciting to be creating technologies that really force us to grapple with some of the most fundamental aspects, I would say, of our own psychological makeup. For example, our ethical values, our sense of purpose, our well-being, maybe our biases and shortsightedness and shortcomings as biological human beings. So I’m definitely very excited about how the conversation regarding technology – and especially artificial intelligence – has evolved over the last year. I like the way it has expanded to capture this human element, which I find so important. But I’m also so happy to feel that FLI has been an important contributor to this conversation.

CONN: Meanwhile, as Max described earlier, FLI has also gotten much more involved in decreasing the risk of nuclear weapons, and Lucas helped spearhead one of our greatest accomplishments of the year.

PERRY: One of the things that I was most excited about was our success with our divestment campaign. After a few months, we had great success in our own local Boston area with helping the City of Cambridge to divest its $1 billion portfolio from nuclear weapon producing companies. And we see this as a really big and important victory within our campaign to help institutions, persons, and universities to divest from nuclear weapons producing companies.

CONN: And in order to truly be effective we need to reach an international audience, which is something Dave has been happy to see grow this year.

STANLEY: I’m mainly excited about – at least, in my work – the increasing involvement and response we’ve had from the international community in terms of reaching out about these issues. I think it’s pretty important that we engage the international community more, and not just academics. Because these issues – things like nuclear weapons and the increasing capabilities of artificial intelligence – really will affect everybody. And they seem to be really underrepresented in mainstream media coverage as well.

So far, we’ve had pretty good responses just in terms of volunteers from many different countries around the world being interested in getting involved to help raise awareness in their respective communities, either through helping develop apps for us, or translation, or promoting just through social media these ideas in their little communities.

CONN: Many FLI members also participated in both local and global events and projects, as we’re about to hear from Victoria, Richard, Lucas, and Meia.

KRAKOVNA: The EAGX Oxford Conference was a fairly large conference. It was very well organized, and we had a panel there with Demis Hassabis, Nate Soares from MIRI, Murray Shanahan from Imperial, Toby Ord from FHI, and myself. I feel like overall, that conference did a good job of, for example, connecting the local EA community with the people at DeepMind, who are really thinking about AI safety concerns like Demis and also Sean Legassick, who also gave a talk about the ethics and impacts side of things. So I feel like that conference overall did a good job of connecting people who are thinking about these sorts of issues, which I think is always a great thing.  

MALLAH: I was involved in this endeavor with IEEE regarding autonomy and ethics in autonomous systems, sort of representing FLI’s positions on things like autonomous weapons and long-term AI safety. One thing that came out this year – just a few days ago, actually, due to this work from IEEE – is that the UN actually took the report pretty seriously, and it may have influenced their decision to take up the issue of autonomous weapons formally next year. That’s kind of heartening.

PERRY: A few different things that I really enjoyed doing were giving a few different talks at Duke and Boston College, and a local effective altruism conference. I’m also really excited about all the progress we’re making on our nuclear divestment application. So this is an application that will allow anyone to search their mutual fund and see whether or not their mutual funds have direct or indirect holdings in nuclear weapons-producing companies.

CHITA-TEGMARK:  So, a wonderful moment for me was at the conference organized by Yann LeCun in New York at NYU, when Daniel Kahneman, one of my thinker-heroes, asked a very important question that really left the whole audience in silence. He asked, “Does this make you happy? Would AI make you happy? Would the development of a human-level artificial intelligence make you happy?” I think that was one of the defining moments, and I was very happy to participate in this conference.

Later on, David Chalmers, another one of my thinker-heroes – this time, not the psychologist but the philosopher – organized another conference, again at NYU, trying to bring philosophers into this very important conversation about the development of artificial intelligence. And again, I felt there too, that FLI was able to contribute and bring in this perspective of the social sciences on this issue.

CONN: Now, with 2016 coming to an end, it’s time to turn our sights to 2017, and FLI is excited for this new year to be even more productive and beneficial.

TEGMARK: We at the Future of Life Institute are planning to focus primarily on artificial intelligence, and on reducing the risk of accidental nuclear war in various ways. We’re kicking off by having an international conference on artificial intelligence, and then we want to continue throughout the year providing really high-quality and easily accessible information on all these key topics, to help inform on what happens with climate change, with nuclear weapons, with lethal autonomous weapons, and so on.

And looking ahead here, I think it’s important right now – especially since a lot of people are very stressed out about the political situation in the world, about terrorism, and so on – to not ignore the positive trends and the glimmers of hope we can see as well.

CONN: As optimistic as FLI members are about 2017, we’re all also especially hopeful and curious to see what will happen with continued AI safety research.

AGUIRRE: I would say I’m looking forward to seeing in the next year more of the research that comes out, and really sort of delving into it myself, and understanding how the field of artificial intelligence and artificial intelligence safety is developing. And I’m very interested in this from the forecast and prediction standpoint.

I’m interested in trying to draw some of the AI community into really understanding how artificial intelligence is unfolding – in the short term and the medium term – as a way to understand, how long do we have? Is it, you know, if it’s really infinity, then let’s not worry about that so much, and spend a little bit more on nuclear weapons and global warming and biotech, because those are definitely happening. If human-level AI were 8 years away… honestly, I think we should be freaking out right now. And most people don’t believe that, I think most people are in the middle it seems, of thirty years or fifty years or something, which feels kind of comfortable. Although it’s not that long, really, on the big scheme of things. But I think it’s quite important to know now, which is it? How fast are these things, how long do we really have to think about all of the issues that FLI has been thinking about in AI? How long do we have before most jobs in industry and manufacturing are replaceable by a robot being slotted in for a human? That may be 5 years, it may be fifteen… It’s probably not fifty years at all. And having a good forecast on those good short-term questions I think also tells us what sort of things we have to be thinking about now.

And I’m interested in seeing how this massive AI safety community that’s started develops. It’s amazing to see centers kind of popping up like mushrooms after a rain all over and thinking about artificial intelligence safety. This partnership on AI between Google and Facebook and a number of other large companies getting started. So to see how those different individual centers will develop and how they interact with each other. Is there an overall consensus on where things should go? Or is it a bunch of different organizations doing their own thing? Where will governments come in on all of this? I think it will be interesting times. So I look forward to seeing what happens, and I will reserve judgement in terms of my optimism.

KRAKOVNA: I’m really looking forward to AI safety becoming even more mainstream, and even more of the really good researchers in AI giving it serious thought. Something that happened in the past year that I was really excited about, that I think is also pointing in this direction, is the research agenda that came out of Google Brain called “Concrete Problems in AI Safety.” And I think I’m looking forward to more things like that happening, where AI safety becomes sufficiently mainstream that people who are working in AI just feel inspired to do things like that and just think from their own perspectives: what are the important problems to solve in AI safety? And work on them.

I’m a believer in the portfolio approach with regards to AI safety research, where I think we need a lot of different research teams approaching the problems from different angles and making different assumptions, and hopefully some of them will make the right assumption. I think we are really moving in the direction in terms of more people working on these problems, and coming up with different ideas. And I look forward to seeing more of that in 2017. I think FLI can also help continue to make this happen.

MALLAH: So, we’re in the process of fostering additional collaboration among people in the AI safety space. And we will have more announcements about this early next year. We’re also working on resources to help people better visualize and better understand the space of AI safety work, and the opportunities there and the work that has been done. Because it’s actually quite a lot.

I’m also pretty excited about fostering continued theoretical work and practical work in making AI more robust and beneficial. The work in value alignment, for instance, is not something we see supported in mainstream AI research. And this is something that is pretty crucial to the way that advanced AIs will need to function. It won’t be very explicit instructions to them; they’ll have to be making decisions based on what they think is right. And what is right? It’s something that… or even structuring the way to think about what is right requires some more research.

STANLEY: We’ve had pretty good success at FLI in the past few years helping to legitimize the field of AI safety. And I think it’s going to be important because AI is playing a large role in industry and there’s a lot of companies working on this, and not just in the US. So I think increasing international awareness about AI safety is going to be really important.

CHITA-TEGMARK: I believe that the AI community has raised some very important questions in 2016 regarding the impact of AI on society. I feel like 2017 should be the year to make progress on these questions, and actually research them and have some answers to them. For this, I think we need more social scientists – among people from other disciplines – to join this effort of really systematically investigating what would be the optimal impact of AI on people. I hope that in 2017 we will have more research initiatives, that we will attempt to systematically study other burning questions regarding the impact of AI on society. Some examples are: how can we ensure the psychological well-being of people while AI creates, as many people predict, a lot of displacement on the job market? How do we optimize engagement with technology, and withdrawal from it also? Will some people be left behind, like the elderly or the economically disadvantaged? How will this affect them, and how will this affect society at large?

What about withdrawal from technology? What about satisfying our need for privacy? Will we be able to do that, or is the price of having more and more customized technologies and more and more personalization of the technologies we engage with… will that mean that we will have no privacy anymore, or that our expectations of privacy will be very seriously violated? I think these are some very important questions that I would love to get some answers to. And my wish, and also my resolution, for 2017 is to see more progress on these questions, and to hopefully also be part of this work and answering them.

PERRY: In 2017 I’m very interested in pursuing the landscape of different policy and principle recommendations from different groups regarding artificial intelligence. I’m also looking forward to expanding our nuclear divestment campaign by trying to introduce divestment to new universities, institutions, communities, and cities.

CONN: In fact, some experts believe nuclear weapons pose a greater threat now than at any time during our history.

TEGMARK: I personally feel that the greatest threat to the world in 2017 is one that the newspapers almost never write about. It’s not terrorist attacks, for example. It’s the small but horrible risk that the U.S. and Russia for some stupid reason get into an accidental nuclear war against each other. We have 14,000 nuclear weapons, and this war has almost happened many, many times. So, actually what’s quite remarkable and really gives a glimmer of hope is that – however people may feel about Putin and Trump – the fact is they are both signaling strongly that they are eager to get along better. And if that actually pans out and they manage to make some serious progress in nuclear arms reduction, that would make 2017 the best year for nuclear weapons we’ve had in a long, long time, reversing this trend of ever greater risks with ever more lethal weapons.

CONN: Some FLI members are also looking beyond nuclear weapons and artificial intelligence, as I learned when I asked Dave about other goals he hopes to accomplish with FLI this year.

STANLEY: Definitely having the volunteer team – particularly the international volunteers – continue to grow, and then scale things up. Right now, we have a fairly committed core of people who are helping out, and we think that they can start recruiting more people to help out in their little communities, and really making this stuff accessible. Not just to academics, but to everybody. And that’s also reflected in the types of people we have working for us as volunteers. They’re not just academics. We have programmers, linguists, people having just high school degrees all the way up to Ph.D.’s, so I think it’s pretty good that this varied group of people can get involved and contribute, and also reach out to other people they can relate to.

CONN: In addition to getting more people involved, Meia also pointed out that one of the best ways we can help ensure a positive future is to continue to offer people more informative content.

CHITA-TEGMARK: Another thing that I’m very excited about regarding our work here at the Future of Life Institute is this mission of empowering people with information. I think information is very powerful and can change the way people approach things: they can change their beliefs, their attitudes, and their behaviors as well. And by creating ways in which information can be readily distributed to the people, and with which they can engage very easily, I hope that we can create changes. For example, we’ve had a series of different apps regarding nuclear weapons that I think have contributed a lot to people’s knowledge and have brought this issue to the forefront of their thinking.

CONN: Yet as important as it is to highlight the existential risks we must address to keep humanity safe, perhaps it’s equally important to draw attention to the incredible hope we have for the future if we can solve these problems. Which is something both Richard and Lucas brought up for 2017.

MALLAH: I’m excited about trying to foster more positive visions of the future, so focusing on existential hope aspects of the future. Which are kind of the flip side of existential risks. So we’re looking at various ways of getting people to be creative about understanding some of the possibilities, and how to differentiate the paths between the risks and the benefits.

PERRY: Yeah, I’m also interested in creating and generating a lot more content that has to do with existential hope. Given the current global political climate, it’s all the more important to focus on how we can make the world better.

CONN: And on that note, I want to mention one of the most amazing things I discovered this past year. It had nothing to do with technology, and everything to do with people. Since starting at FLI, I’ve met countless individuals who are dedicating their lives to trying to make the world a better place. We may have a lot of problems to solve, but with so many groups focusing solely on solving them, I’m far more hopeful for the future. There are truly too many individuals that I’ve met this year to name them all, so instead, I’d like to provide a rather long list of groups and organizations I’ve had the pleasure to work with this year. A link to each group can be found at futureoflife.org/2016, and I encourage you to visit them all to learn more about the wonderful work they’re doing. In no particular order, they are:

Machine Intelligence Research Institute

Future of Humanity Institute

Global Catastrophic Risk Institute

Center for the Study of Existential Risk

Ploughshares Fund

Bulletin of the Atomic Scientists

Open Philanthropy Project

Union of Concerned Scientists

The William Perry Project

ReThink Media

Don’t Bank on the Bomb

Federation of American Scientists

Massachusetts Peace Action

IEEE (Institute of Electrical and Electronics Engineers)

Center for Human-Compatible Artificial Intelligence

Center for Effective Altruism

Center for Applied Rationality

Foresight Institute

Leverhulme Centre for the Future of Intelligence

Global Priorities Project

Association for the Advancement of Artificial Intelligence

International Joint Conference on Artificial Intelligence

Partnership on AI

The White House Office of Science and Technology Policy

The Future Society at Harvard Kennedy School


I couldn’t be more excited to see what 2017 holds in store for us, and all of us at FLI look forward to doing all we can to help create a safe and beneficial future for everyone. But to end on an even more optimistic note, I turn back to Max.

TEGMARK: Finally, I’d like – because I spend a lot of my time thinking about our universe – to remind everybody that we shouldn’t just be focused on the next election cycle. We have not decades, but billions of years of potentially awesome future for life, on Earth and far beyond. And it’s so important to not let ourselves get so distracted by our everyday little frustrations that we lose sight of these incredible opportunities that we all stand to gain from if we can get along, and focus, and collaborate, and use technology for good.

AI Safety Highlights from NIPS 2016

This year’s Neural Information Processing Systems (NIPS) conference was larger than ever, with almost 6000 people attending, hosted in a huge convention center in Barcelona, Spain. The conference started off with two exciting announcements on open-sourcing collections of environments for training and testing general AI capabilities – the DeepMind Lab and the OpenAI Universe. Among other things, this is promising for testing safety properties of ML algorithms. OpenAI has already used their Universe environment to give an entertaining and instructive demonstration of reward hacking that illustrates the challenge of designing robust reward functions.

I was happy to see a lot of AI-safety-related content at NIPS this year. The ML and the Law symposium and Interpretable ML for Complex Systems workshop focused on near-term AI safety issues, while the Reliable ML in the Wild workshop also covered long-term problems. Here are some papers relevant to long-term AI safety:

Inverse Reinforcement Learning

Cooperative Inverse Reinforcement Learning (CIRL) by Hadfield-Menell, Russell, Abbeel, and Dragan (main conference). This paper addresses the value alignment problem by teaching the artificial agent about the human’s reward function, using instructive demonstrations rather than optimal demonstrations like in classical IRL (e.g. showing the robot how to make coffee vs having it observe coffee being made). (3-minute video)


Generalizing Skills with Semi-Supervised Reinforcement Learning by Finn, Yu, Fu, Abbeel, and Levine (Deep RL workshop). This work addresses the scalable oversight problem by proposing the first tractable algorithm for semi-supervised RL. This allows artificial agents to robustly learn reward functions from limited human feedback. The algorithm uses an IRL-like approach to infer the reward function, using the agent’s own prior experiences in the supervised setting as an expert demonstration.


Towards Interactive Inverse Reinforcement Learning by Armstrong and Leike (Reliable ML workshop). This paper studies the incentives of an agent that is trying to learn about the reward function while simultaneously maximizing the reward. The authors discuss some ways to reduce the agent’s incentive to manipulate the reward learning process.


Should Robots Have Off Switches? by Milli, Hadfield-Menell, and Russell (Reliable ML workshop). This poster examines some adverse effects of incentivizing artificial agents to be compliant in the off-switch game (a variant of CIRL).



Safe Exploration

Safe Exploration in Finite Markov Decision Processes with Gaussian Processes by Turchetta, Berkenkamp, and Krause (main conference). This paper develops a reinforcement learning algorithm called Safe MDP that can explore an unknown environment without getting into irreversible situations, unlike classical RL approaches.
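To make the idea concrete, here is a minimal sketch of the core intuition, assuming a toy deterministic grid world with hand-picked hazard cells and a stand-in `is_safe` function. The actual SafeMDP algorithm works with Gaussian-process safety estimates and explicit reachability and returnability checks, so this is an illustration of the concept rather than the paper’s method:

```python
# Illustrative sketch: only explore states that are believed safe AND from
# which the agent can return to the already-certified safe set. The grid size,
# hazard cells, and is_safe() below are assumptions made for the example.
from collections import deque

GRID = 5                            # 5x5 toy grid world
UNSAFE = {(2, 2), (2, 3), (3, 2)}   # hypothetical hazardous cells

def is_safe(state):
    """Stand-in for the paper's learned (Gaussian-process) safety estimate."""
    return state not in UNSAFE

def neighbors(state):
    r, c = state
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < GRID and 0 <= nc < GRID:
            yield (nr, nc)

def certified_safe_states(start):
    """States worth exploring: believed safe, reachable from `start` through
    safe states, and able to return to `start` the same way. In this
    deterministic, reversible grid a single breadth-first search covers both
    conditions; the real algorithm must verify returnability explicitly."""
    frontier, seen = deque([start]), {start}
    while frontier:
        s = frontier.popleft()
        for n in neighbors(s):
            if n not in seen and is_safe(n):
                seen.add(n)
                frontier.append(n)
    return seen

safe_set = certified_safe_states((0, 0))
print(f"{len(safe_set)} of {GRID * GRID} states certified safe to explore")
```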


Combating Reinforcement Learning’s Sisyphean Curse with Intrinsic Fear by Lipton, Gao, Li, Chen, and Deng (Reliable ML workshop). This work addresses the ‘Sisyphean curse’ of DQN algorithms forgetting past experiences, as they become increasingly unlikely under a new policy, and therefore eventually repeating catastrophic mistakes. The paper introduces an approach called ‘intrinsic fear’, which maintains a model for how likely different states are to lead to a catastrophe within some number of steps.
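As a rough illustration of how such a penalty could enter a Q-learning target (my own simplified sketch, not the paper’s implementation; the `fear_model` placeholder and the constants are assumptions), the target for a transition is reduced in proportion to the estimated catastrophe risk of the next state:

```python
# Illustrative sketch of the 'intrinsic fear' idea: penalize the Q-learning
# target by a learned estimate of how likely the next state is to precede a
# catastrophe. The paper trains a separate classifier for this; fear_model()
# below is a hypothetical stand-in, and all numbers are illustrative.
import numpy as np

GAMMA = 0.99          # discount factor
FEAR_LAMBDA = 1.0     # weight on the intrinsic-fear penalty

def fear_model(next_state):
    """Placeholder for a classifier estimating P(catastrophe within k steps)."""
    return float(np.clip(np.linalg.norm(next_state), 0.0, 1.0))

def q_target(reward, next_state, next_q_values, done):
    """Standard DQN target minus an intrinsic-fear penalty."""
    bootstrap = 0.0 if done else GAMMA * np.max(next_q_values)
    penalty = FEAR_LAMBDA * fear_model(next_state)
    return reward + bootstrap - penalty

# A transition whose next state looks risky gets a reduced target, steering
# the learned policy away from states that preceded catastrophes.
print(q_target(reward=1.0,
               next_state=np.array([0.6, 0.5]),
               next_q_values=np.array([0.2, 0.8, 0.4]),
               done=False))
```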


Most of these papers were related to inverse reinforcement learning – while IRL is a promising approach, it would be great to see more varied safety material at the next NIPS. There were some more safety papers on other topics at UAI this summer: Safely Interruptible Agents (formalizing what it means to incentivize an agent to obey shutdown signals) and A Formal Solution to the Grain of Truth Problem (providing a broad theoretical framework for multiple agents learning to predict each other in arbitrary computable games).

These highlights were originally posted here and cross-posted to Approximately Correct. Thanks to Jan Leike, Zachary Lipton, and Janos Kramar for providing feedback on this post.


MIRI December 2016 Newsletter

We’re in the final weeks of our push to cover our funding shortfall, and we’re now halfway to our $160,000 goal. For potential donors who are interested in an outside perspective, Future of Humanity Institute (FHI) researcher Owen Cotton-Barratt has written up why he’s donating to MIRI this year. (Donation page.)

Research updates

General updates

  • We teamed up with a number of AI safety researchers to help compile a list of recommended AI safety readings for the Center for Human-Compatible AI. See this page if you would like to get involved with CHCAI’s research.
  • Investment analyst Ben Hoskin reviews MIRI and other organizations involved in AI safety.

News and links

  • “The Off-Switch Game”: Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, and Stuart Russell show that an AI agent’s corrigibility is closely tied to the uncertainty it has about its utility function.
  • Russell and Allan Dafoe critique an inaccurate summary by Oren Etzioni of a new survey of AI experts on superintelligence.
  • Sam Harris interviews Russell on the basics of AI risk (video). See also Russell’s new Q&A on the future of AI.
  • Future of Life Institute co-founder Viktoriya Krakovna and FHI researcher Jan Leike join Google DeepMind’s safety team.
  • GoodAI sponsors a challenge to “accelerate the search for general artificial intelligence”.
  • OpenAI releases Universe, “a software platform for measuring and training an AI’s general intelligence across the world’s supply of games”. Meanwhile, DeepMind has open-sourced their own platform for general AI research, DeepMind Lab.
  • Staff at GiveWell and the Centre for Effective Altruism, along with others in the effective altruism community, explain where they’re donating this year.
  • FHI is seeking AI safety interns, researchers, and admins: jobs page.

This newsletter was originally posted here.

Silo Busting in AI Research

Artificial intelligence may seem like a computer science project, but if it’s going to successfully integrate with society, then social scientists must be more involved.

Developing an intelligent machine is not merely a problem of modifying algorithms in a lab. These machines must be aligned with human values, and this requires a deep understanding of ethics and the social consequences of deploying intelligent machines.

Getting people with a variety of backgrounds together seems logical enough in theory, but in practice, what happens when computer scientists, AI developers, economists, philosophers, and psychologists try to discuss AI issues? Do any of them even speak the same language?

Social scientists and computer scientists will come at AI problems from very different directions. And if they collaborate, everybody wins. Social scientists can learn about the complex tools and algorithms used in computer science labs, and computer scientists can become more attuned to the social and ethical implications of advanced AI.

Through transdisciplinary learning, both fields will be better equipped to handle the challenges of developing AI, and society as a whole will be safer.


Silo Busting

Too often, researchers focus on their narrow area of expertise, rarely reaching out to experts in other fields to solve common problems. AI is no different, with thick walls – sometimes literally – separating the social sciences from the computer sciences. Breaking down these walls between research fields is often called silo-busting.

If AI researchers largely operate in silos, they may lose opportunities to learn from other perspectives and collaborate with potential colleagues. Scientists might miss gaps in their research or reproduce work already completed by others, because they were secluded away in their silo. This can significantly hamper the development of value-aligned AI.

To bust these silos, Wendell Wallach organized workshops to facilitate knowledge-sharing among leading computer and social scientists. Wallach, a consultant, ethicist, and scholar at Yale University’s Interdisciplinary Center for Bioethics, holds these workshops at The Hastings Center, where he is a senior advisor.

With co-chairs Gary Marchant, Stuart Russell, and Bart Selman, Wallach held the first workshop in April 2016. “The first workshop was very much about exposing people to what experts in all of these different fields were thinking about,” Wallach explains. “My intention was just to put all of these people in a room and hopefully they’d see that they weren’t all reinventing the wheel, and recognize that there were other people who were engaged in similar projects.”

The workshop intentionally brought together experts from a variety of viewpoints, including engineering ethics, philosophy, and resilience engineering, as well as participants from the Institute of Electrical and Electronics Engineers (IEEE), the Office of Naval Research, and the World Economic Forum (WEF). Wallach recounts, “some were very interested in how you implement sensitivity to moral considerations in AI computationally, and others were more interested in how AI changes the societal context.”

Other participants studied how the engineers of these systems may be susceptible to harmful cognitive biases and conflicts of interest, while still others focused on governance issues surrounding AI. Each of these viewpoints is necessary for developing beneficial AI, and The Hastings Center’s workshop gave participants the opportunity to learn from and teach each other.

But silo-busting is not easy. Wallach explains, “everybody has their own goals, their own projects, their own intentions, and it’s hard to hear someone say, ‘maybe you’re being a little naïve about this.’” When researchers operate exclusively in silos, “it’s almost impossible to understand how people outside of those silos did what they did,” he adds.

The intention of the first workshop was not to develop concrete strategies or proposals, but rather to open researchers’ minds to the broad challenges of developing AI with human values. “My suspicion is, the most valuable things that came out of this workshop would be hard to quantify,” Wallach clarifies. “It’s more like people’s minds were being stretched and opened. That was, for me, what this was primarily about.”

The workshop did yield some tangible results. For example, Marchant and Wallach introduced a pilot project for the international governance of AI, and nearly everyone at the workshop agreed to work on it. Since then, the IEEE, the International Committee of the Red Cross, the UN, the World Economic Forum, and other institutions have agreed to become active partners with The Hastings Center in building global infrastructure to ensure that AI and Robotics are beneficial.

This transdisciplinary cooperation is a promising sign that Wallach’s efforts are succeeding in strengthening the global response to AI challenges.


Value Alignment

Wallach and his co-chairs held a second workshop at the end of October. The participants were mostly scientists, but also included social theorists, a legal scholar, philosophers, and ethicists. The overall goal remained – to bust AI silos and facilitate transdisciplinary cooperation – but this workshop had a narrower focus.

“We made it more about value alignment and machine ethics,” he explains. “The tension in the room was between those who thought the problem [of value alignment] was imminently solvable and those who were deeply skeptical about solving the problem at all.”

In general, Wallach observed that “the social scientists and philosophers tend to overplay the difficulties [of creating AI with full value alignment] and computer scientists tend to underplay the difficulties.”

Wallach believes that while computer scientists will build the algorithms and utility functions for AI, they will need input from social scientists to ensure value alignment. “If a utility function represents 100,000 inputs, social theorists will help the AI researchers understand what those 100,000 inputs are,” he explains. “The AI researchers might be able to come up with 50,000-60,000 on their own, but they’re suddenly going to realize that people who have thought much more deeply about applied ethics are perhaps sensitive to things that they never considered.”

“I’m hoping that enough of [these researchers] learn each other’s language and how to communicate with each other, that they’ll recognize the value they can get from collaborating together,” he says. “I think I see evidence of that beginning to take place.”


Moving Forward

Developing value-aligned AI is a monumental task with existential risks. Experts from various perspectives must be willing to learn from each other and adapt their understanding of the issue.

In this spirit, The Hastings Center is leading the charge to bring the various AI silos together. After two successful events that resulted in promising partnerships, Wallach and his co-chairs will hold their third workshop in Spring 2018. And while these workshops are a small effort to facilitate transdisciplinary cooperation on AI, Wallach is hopeful.

“It’s a small group,” he admits, “but it’s people who are leaders in these various fields, so hopefully that permeates through the whole field, on both sides.”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Artificial Intelligence and the King Midas Problem

Value alignment. It’s a phrase that often pops up in discussions about the safety and ethics of artificial intelligence. How can scientists create AI with goals and values that align with those of the people it interacts with?

Very simple robots with very constrained tasks do not need goals or values at all. Although the Roomba’s designers know you want a clean floor, Roomba doesn’t: it simply executes a procedure that the Roomba’s designers predict will work—most of the time. If your kitten leaves a messy pile on the carpet, Roomba will dutifully smear it all over the living room. If we keep programming smarter and smarter robots, then by the late 2020s, you may be able to ask your wonderful domestic robot to cook a tasty, high-protein dinner. But if you forgot to buy any meat, you may come home to a hot meal but find the aforementioned cat has mysteriously vanished. The robot, designed for chores, doesn’t understand that the sentimental value of the cat exceeds its nutritional value.

AI and King Midas

Stuart Russell, a renowned AI researcher, compares the challenge of defining a robot’s objective to the King Midas myth. “The robot,” says Russell, “has some objective and pursues it brilliantly to the destruction of mankind. And it’s because it’s the wrong objective. It’s the old King Midas problem.”

This is one of the big problems in AI safety that Russell is trying to solve. “We’ve got to get the right objective,” he explains, “and since we don’t seem to know how to program it, the right answer seems to be that the robot should learn – from interacting with and watching humans – what it is humans care about.”

Russell works from the assumption that the robot will solve whatever formal problem we define. Rather than assuming that the robot should optimize a given objective, Russell defines the problem as a two-player game (“game” as used by economists, meaning a decision problem with multiple agents) called cooperative inverse reinforcement learning (CIRL).

A CIRL game includes a person and a robot: the robot’s only purpose is to make the person happy, but it doesn’t know what the person wants. Fortunately, it can learn more about what the person wants by observing her behavior. For example, if a robot observed the human’s morning routine, it should discover how important coffee is—not to itself, of course (we don’t want robots drinking coffee), but to the human. Then, it will make coffee for the person without being asked.
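As a rough illustration of learning what a person wants by watching them (a toy example of my own, not the CIRL algorithm from the paper; the candidate reward functions and the noisy-rationality model are assumptions), the robot can keep a belief over hypotheses about the human’s values and update it from observed behavior:

```python
# Toy sketch: infer what the human values from observed actions by Bayesian
# updating over candidate reward functions, assuming the human is noisily
# rational. All hypotheses, rewards, and constants are illustrative.
import numpy as np

actions = ["make_coffee", "make_tea", "do_nothing"]

# Candidate hypotheses about what the human values (reward per action).
hypotheses = {
    "likes_coffee": {"make_coffee": 1.0, "make_tea": 0.2, "do_nothing": 0.0},
    "likes_tea":    {"make_coffee": 0.2, "make_tea": 1.0, "do_nothing": 0.0},
    "likes_quiet":  {"make_coffee": 0.0, "make_tea": 0.0, "do_nothing": 1.0},
}
belief = {h: 1.0 / len(hypotheses) for h in hypotheses}   # uniform prior

def likelihood(action, rewards, beta=3.0):
    """Noisily-rational human: higher-reward actions are more probable."""
    prefs = np.array([rewards[a] for a in actions])
    probs = np.exp(beta * prefs) / np.exp(beta * prefs).sum()
    return probs[actions.index(action)]

def update(belief, observed_action):
    posterior = {h: belief[h] * likelihood(observed_action, r)
                 for h, r in hypotheses.items()}
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}

# Watching the person make coffee every morning shifts belief toward
# "likes_coffee", so the robot can later act on that inferred preference.
for _ in range(3):
    belief = update(belief, "make_coffee")
print(belief)
```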

What about the example of King Midas, who thought he wanted as much gold as possible? How does the approach deal with someone stating their wishes incorrectly?

“We don’t have to know what we want. That’s the whole point. It’s hard to know what we want, and what we ask for is evidence for what we really want, but it’s not ground truth,” says Russell.

The robot should learn what people want by watching their actions. Then, if someone asks for something that’s technically incorrect, the robot will understand their underlying wish. Russell adds, “The robot—if it had been there—would have told Midas that he didn’t really want everything turned to gold, maybe just a few choice objects that he might point at from time to time and say the magic word.”

AI Off-Switch

Russell and his Berkeley colleagues also recently announced further progress toward ensuring safe AI, with a paper on ensuring an AI’s off-switch is always accessible. An off-switch is often suggested to guarantee a robot doesn’t harm humans – if the robot goes bad, just turn it off! But what if the AI anticipates that and disables its off switch?

Russell and his team propose solving this by providing a level of uncertainty in the robot’s objective. They created another CIRL game in which the robot chooses to announce its intentions, which gives the human a chance to switch it off, or to plunge ahead with its plan, bypassing human oversight.

They showed that a robot with a fixed objective will take the second choice and disable its off-switch. But if there is uncertainty about the objective, the robot will reason that a human decision to switch the robot off would imply the announced intention is undesirable. Because its goal is to make the human happy, the robot accepts being switched off. The robot has a positive incentive not to disable the off-switch, and that incentive is directly tied to the robot’s uncertainty about the human’s true objective.
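A back-of-the-envelope version of that argument, with made-up utilities and beliefs purely for illustration (this is not the paper’s formal game), compares the robot’s expected value of acting unilaterally with deferring to a human who hits the switch only when the plan is actually bad:

```python
# Illustrative numbers only: the robot is unsure whether its announced plan
# helps or harms the human, and evaluates both options by the human's utility.
p_plan_good = 0.6          # robot's belief that the plan has utility +1
U_GOOD, U_BAD = 1.0, -1.5  # utility of the plan to the human
U_OFF = 0.0                # utility if the robot is switched off

# Option 1: bypass the human and execute the plan (disable the off-switch).
ev_act = p_plan_good * U_GOOD + (1 - p_plan_good) * U_BAD

# Option 2: announce the plan and defer. The human lets the plan proceed only
# when it is actually good, and hits the switch otherwise.
ev_defer = p_plan_good * U_GOOD + (1 - p_plan_good) * U_OFF

print(f"act without oversight: {ev_act:+.2f}")   # 0.6*1 + 0.4*(-1.5) = +0.00
print(f"defer to the human:    {ev_defer:+.2f}") # 0.6*1 + 0.4*0      = +0.60
# Because the robot is uncertain and optimizes the human's utility, keeping
# the off-switch usable is (weakly) preferred; with no uncertainty the two
# options tie and the incentive to preserve the switch disappears.
```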

Ensuring AI Safety

In addition to his research, Russell is one of the most vocal and active AI safety researchers working to build a stronger public understanding of the potential issues surrounding AI development.

He recently co-authored a rebuttal to an article in the MIT Technology Review, which claimed that real AI scientists weren’t worried about the existential threat of AI. Russell and his co-author summed up why it’s better to be cautious than to simply assume all will turn out for the best:

“Our experience with Chernobyl suggests it may be unwise to claim that a powerful technology entails no risks. It may also be unwise to claim that a powerful technology will never come to fruition. On September 11, 1933, Lord Rutherford, perhaps the world’s most eminent nuclear physicist, described the prospect of extracting energy from atoms as nothing but “moonshine.” Less than 24 hours later, Leo Szilard invented the neutron-induced nuclear chain reaction; detailed designs for nuclear reactors and nuclear weapons followed a few years later. Surely it is better to anticipate human ingenuity than to underestimate it, better to acknowledge the risks than to deny them. … [T]he risk [of AI] arises from the unpredictability and potential irreversibility of deploying an optimization process more intelligent than the humans who specified its objectives.”

This summer, Russell received a grant of over $5.5 million from the Open Philanthropy Project for a new research center, the Center for Human-Compatible Artificial Intelligence, in Berkeley. Among the primary objectives of the Center will be to study this problem of value alignment, to continue his efforts toward provably beneficial AI, and to ensure we don’t make the same mistakes as King Midas.

“Look,” he says, “if you were King Midas, would you want your robot to say, ‘Everything turns to gold? OK, boss, you got it.’ No! You’d want it to say, ‘Are you sure? Including your food, drink, and relatives? I’m pretty sure you wouldn’t like that. How about this: you point to something and say ‘Abracadabra Aurificio’ or something, and then I’ll turn it to gold, OK?’”

This article is part of a Future of Life series on the AI safety research grants, which were funded by generous donations from Elon Musk and the Open Philanthropy Project.

Effective Altruism and Existential Risks: a talk with Lucas Perry

What are the greatest problems of our time? And how can we best address them?

FLI’s Lucas Perry recently spoke at Duke University and Boston College to address these questions. Perry presented two major ideas in these talks – effective altruism and existential risk – and explained how they work together.

As Perry explained to his audiences, effective altruism is a movement in philanthropy that seeks to use evidence, analysis, and reason to take actions that will do the greatest good in the world. Since each person has limited resources, effective altruists argue it is essential to focus resources where they can do the most good. As such, effective altruists tend to focus on neglected, large-scale problems where their efforts can yield the greatest positive change.

Effective altruists focus on issues including poverty alleviation, animal suffering, and global health through various organizations. Nonprofits such as 80,000 Hours help people find jobs within effective altruism, and charity evaluators such as GiveWell investigate and rank the most effective ways to donate money. These groups and many others are all dedicated to using evidence to address neglected problems that cause, or threaten to cause, immense suffering.

Some of these neglected problems happen to be existential risks – they represent threats that could permanently and drastically harm intelligent life on Earth. Since existential risks, by definition, put our very existence at risk, and have the potential to create immense suffering, effective altruists consider these risks extremely important to address.

Perry explained to his audiences that the greatest existential risks arise due to humans’ ability to manipulate the world through technology. These risks include artificial intelligence, nuclear war, and synthetic biology. But Perry also cautioned that some of the greatest existential threats might remain unknown. As such, he and effective altruists believe the topic deserves more attention.

Perry learned about these issues while he was in college, which helped redirect his own career goals, and he wants to share this opportunity with other students. He explains, “In order for effective altruism to spread and the study of existential risks to be taken seriously, it’s critical that the next generation of thought leaders are in touch with their importance.”

College students often want to do more to address humanity’s greatest threats, but many students are unsure where to go. Perry hopes that learning about effective altruism and existential risks might give them direction. Realizing the urgency of existential risks and how underfunded they are – academics spend more time on the dung fly than on existential risks – can motivate students to use their education where it can make a difference.

As such, Perry’s talks are a small effort to open the field to students who want to help the world and who also crave a sense of purpose. He provided concrete strategies to show students where they can be most effective, whether they choose to donate money, work directly on these issues, conduct research, or advocate.

By understanding the intersection between effective altruism and existential risks, these students can do their part to ensure that humanity continues to prosper in the face of our greatest threats yet.

As Perry explains, “When we consider what existential risks represent for the future of intelligent life, it becomes clear that working to mitigate them is an essential part of being an effective altruist.”

Westworld Op-Ed: Are Conscious AI Dangerous?

“These violent delights have violent ends.”

With the help of Shakespeare and Michael Crichton, HBO’s Westworld has brought to light some of the concerns about creating advanced artificial intelligence.

If you haven’t seen it already, Westworld is a show in which human-like AI populate a park designed to look like America’s Wild West. Visitors spend huge amounts of money to visit the park and live out old west adventures, in which they can fight, rape, and kill the AI. Each time one of the robots “dies,” its body is cleaned up, its memory is wiped, and it starts a new iteration of its script.

The show’s season finale aired Sunday evening, and it certainly went out with a bang – but not to worry, there are no spoilers in this article.

AI Safety Issues in Westworld

Westworld was inspired by an old Crichton movie of the same name, and leave it to him – the writer of Jurassic Park – to create a storyline that would have us questioning the level of control we’ll be able to maintain over advanced scientific endeavors. But unlike the original movie, in which the robot is the bad guy, in the TV show, the robots are depicted as the most sympathetic and even the most human characters.

Not surprisingly, concerns about the safety of the park show up almost immediately. The park is overseen by one man who can make whatever program updates he wants without running it by anyone for a safety check. The robots show signs of remembering their mistreatment. One of the characters mentions that only one line of code keeps the robots from being able to harm humans.

These issues are just some of the problems the show touches on that present real AI safety concerns: A single “bad agent” who uses advanced AI to intentionally cause harm to people; small glitches in the software that turn deadly; and a lack of redundancy and robustness in the code to keep people safe.

But to really get your brain working, many of the safety and ethics issues that crop up during the show hinge on whether or not the robots are conscious. In fact, the show whole-heartedly delves into one of the hardest questions of all: what is consciousness? On top of that, can humans create a conscious being? If so, can we control it? Do we want to find out?

To consider these questions, I turned to Georgia Tech AI researcher Mark Riedl, whose work focuses on creative AI, and to NYU philosopher David Chalmers, who is most famous for his formulation of the “hard problem of consciousness.”

Can AI Feel Pain?

I spoke with Riedl first, asking him about the extent to which a robot would feel pain if it were programmed to do so. “First,” he said, “I do not condone violence against humans, animals, or anthropomorphized robots or AI.” He then explained that humans and animals feel pain as a warning signal to “avoid a particular stimulus.”

For robots, however, “the closest analogy might be what happens in reinforcement learning agents, which engage in trial-and-error learning.” The AI would receive a positive or negative reward for some action and it would adjust its future behavior accordingly. Rather than feeling like pain, Riedl suggests that the negative reward would be more “akin to losing points in a computer game.”
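A toy example makes the comparison concrete. The snippet below is an illustrative sketch of the kind of trial-and-error update Riedl is describing – a plain tabular value update with made-up action names and numbers – in which a negative reward simply lowers an action’s estimated value, so the agent picks it less often.

    # An illustrative sketch of trial-and-error value updates (made-up action
    # names and numbers). A negative reward just lowers the estimate for an
    # action, so the agent chooses it less often.

    q_values = {"approach_guest": 0.0, "avoid_guest": 0.0}
    learning_rate = 0.5

    def update(action, reward):
        """Nudge the action's value estimate toward the reward received."""
        q_values[action] += learning_rate * (reward - q_values[action])

    update("approach_guest", -1.0)   # a "painful" interaction is just a negative number
    update("avoid_guest", 0.2)

    # If this table persists, the agent now prefers avoiding whatever triggered
    # the negative reward; wiping the table resets everything to zero.
    print(max(q_values, key=q_values.get))   # -> "avoid_guest"

Wiping that table each night is the software equivalent of the show’s nightly memory wipe: nothing carries over for the agent to learn from or avoid.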

“Robots and AI can be programmed to ‘express’ pain in a human-like fashion,” says Riedl, “but it would be an illusion. There is one reason for creating this illusion: for the robot to communicate its internal state to humans in a way that is instantly understandable and invokes empathy.”

Riedl isn’t worried that the AI would feel real pain, and if the robot’s memory is completely erased each night, then he suggests it would be as though nothing happened. However, he does see one possible safety issue here. For reinforcement learning to work properly, the AI needs to take actions that optimize for the positive reward. If the robot’s memory isn’t completely erased – if the robot starts to remember the bad things that happened to it – then it could try to avoid those actions or people that trigger the negative reward.

“In theory,” says Riedl, “these agents can learn to plan ahead to reduce the possibility of receiving negative reward in the most cost-effective way possible. … If robots don’t understand the implications of their actions in terms other than reward gain or loss, this can also mean acting in advance to stop humans from harming them.”

Riedl points out, though, that for the foreseeable future we won’t have robots with sufficient capabilities to pose an immediate concern. But assuming such robots do arrive, problems with negative rewards could be dangerous for humans. (Possibly even more dangerous, as the show depicts, is a scenario in which the robots do understand the implications of their actions against the humans who have been mistreating them for decades.)

Can AI Be Conscious?

Chalmers sees things a bit differently. “The way I think about consciousness,” says Chalmers, “the way most people think about consciousness – there just doesn’t seem to be any question that these beings are conscious. … They’re presented as having fairly rich emotional lives – that’s presented as feeling pain and thinking thoughts. … They’re not just exhibiting reflexive behavior. They’re thinking about their situations. They’re reasoning.”

“Obviously, they’re sentient,” he adds.

Chalmers suggests that instead of trying to define what makes the robots conscious, we should consider what it is they’re lacking. Most notably, says Chalmers, they lack free will and memory. However, many of us live in routines that we’re unable to break out of. And there have been numerous cases of people with extreme memory problems, but no one thinks that makes it okay to rape or kill them.

“If it is regarded as okay to mistreat the AIs on this show, is it because of some deficit they have or because of something else?” Chalmers asks.

The specific scenarios portrayed in Westworld may not be realistic, because Chalmers considers the bicameral-mind theory unlikely to lead to consciousness, even for robots. “I think it’s hopeless as a theory,” he says, “even of robot consciousness – or of robot self-consciousness, which seems more what’s intended. It would be so much easier just to program the robots to monitor their own thoughts directly.”

But this still presents risks. “If you had a situation that was as complex and as brain-like as these, would it also be so easily controllable?” asks Chalmers.

In any case, treating robots badly could easily pose a risk to human safety. We risk creating unconscious robots that learn the wrong lessons from negative feedback, or we risk inadvertently (or intentionally, as in the case of Westworld) creating conscious entities who will eventually fight back against their abuse and oppression.

When a host in episode two is asked if she’s “real,” she responds, “If you can’t tell, does it matter?”

These seem like the safest words to live by.