How Do We Align Artificial Intelligence with Human Values?

A major change is coming, over unknown timescales but across every segment of society, and the people playing a part in that transition have a huge responsibility and opportunity to shape it for the best. What will trigger this change? Artificial intelligence.

Recently, some of the top minds in AI and related fields got together to discuss how we can ensure AI remains beneficial throughout this transition, and the result was the Asilomar AI Principles document. The intent of these 23 principles is to offer a framework to help artificial intelligence benefit as many people as possible. But, as AI expert Toby Walsh said of the Principles, “Of course, it’s just a start. … a work in progress.”

The Principles represent the beginning of a conversation, and now that the conversation is underway, we need to follow up with broad discussion about each individual principle. The Principles will mean different things to different people, and in order to benefit as much of society as possible, we need to think about each principle individually.

As part of this effort, I interviewed many of the AI researchers who signed the Principles document to learn their take on why they signed and what issues still confront us.

Value Alignment

Today, we start with the Value Alignment principle.

Value Alignment: Highly autonomous AI systems should be designed so that their goals and behaviors can be assured to align with human values throughout their operation.

Stuart Russell, who helped pioneer the idea of value alignment, likes to compare this to the King Midas story. When King Midas asked for everything he touched to turn to gold, he really just wanted to be rich. He didn’t actually want his food and loved ones to turn to gold. We face a similar situation with artificial intelligence: how do we ensure that an AI will do what we really want, while not harming humans in a misguided attempt to do what its designer requested?

“Robots aren’t going to try to revolt against humanity,” explains Anca Dragan, an assistant professor and colleague of Russell’s at UC Berkeley, “they’ll just try to optimize whatever we tell them to do. So we need to make sure to tell them to optimize for the world we actually want.”

What Do We Want?

Understanding what “we” want is among the biggest challenges facing AI researchers.

“The issue, of course, is to define what exactly these values are, because people might have different cultures, different parts of the world, different socioeconomic backgrounds — I think people will have very different opinions on what those values are. And so that’s really the challenge,” says Stefano Ermon, an assistant professor at Stanford.

Roman Yampolskiy, an associate professor at the University of Louisville, agrees. He explains, “It is very difficult to encode human values in a programming language, but the problem is made more difficult by the fact that we as humanity do not agree on common values, and even parts we do agree on change with time.”

And while some values are hard to gain consensus around, there are also lots of values we all implicitly agree on. As Russell notes, any human understands emotional and sentimental values that they’ve been socialized with, but it’s difficult to guarantee that a robot will be programmed with that same understanding.

But IBM research scientist Francesca Rossi is hopeful. As Rossi points out, “there is scientific research that can be undertaken to actually understand how to go from these values that we all agree on to embedding them into the AI system that’s working with humans.”

Dragan’s research comes at the problem from a different direction. Instead of trying to understand people, she looks at trying to train a robot or AI to be flexible with its goals as it interacts with people. She explains, “At Berkeley, … we think it’s important for agents to have uncertainty about their objectives, rather than assuming they are perfectly specified, and treat human input as valuable observations about the true underlying desired objective.”
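
One way to picture the idea Dragan describes is a simple Bayesian update, in which an agent holds a probability distribution over candidate objectives and revises it whenever a person reacts. The sketch below is only an illustration and is not taken from her research: the objective names, the prior, and the likelihood values are all assumptions chosen to echo the King Midas example.

```python
# Hypothetical illustration: an agent that is uncertain about its objective
# and treats human feedback as evidence about what was really wanted.

def update_belief(prior, likelihoods):
    """Apply Bayes' rule: posterior is proportional to prior * P(feedback | objective)."""
    unnormalized = {obj: prior[obj] * likelihoods[obj] for obj in prior}
    total = sum(unnormalized.values())
    return {obj: weight / total for obj, weight in unnormalized.items()}

# Two candidate readings of the request (assumed names), equally likely at first.
belief = {"maximize_gold": 0.5, "make_owner_happy": 0.5}

# The human protests when food turns to gold. Assumed likelihoods: a protest is
# unlikely if gold were the true goal, and likely if happiness were.
protest_likelihood = {"maximize_gold": 0.1, "make_owner_happy": 0.9}

belief = update_belief(belief, protest_likelihood)
print(belief)
```

After the update, most of the probability sits on the second reading, so an agent acting on this belief would stop turning things to gold rather than persist with the literal request.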

Rewrite the Principle?

While most researchers agree with the underlying idea of the Value Alignment Principle, not everyone agrees with how it’s phrased, let alone how to implement it.

Yoshua Bengio, an AI pioneer and professor at the University of Montreal, suggests “assured” may be too strong. He explains, “It may not be possible to be completely aligned. There are a lot of things that are innate, which we won’t be able to get by machine learning, and that may be difficult to get by philosophy or introspection, so it’s not totally clear we’ll be able to perfectly align. I think the wording should be something along the lines of ‘we’ll do our best.’ Otherwise, I totally agree.”

Walsh, who’s currently a guest professor at the Technical University of Berlin, questions the use of the word “highly.” “I think any autonomous system, even a lowly autonomous system, should be aligned with human values. I’d wordsmith away the ‘high,’” he says.

Walsh also points out that, while value alignment is often considered an issue that will arise in the future, he believes it’s something that needs to be addressed sooner rather than later. “I think that we have to worry about enforcing that principle today,” he explains. “I think that will be helpful in solving the more challenging value alignment problem as systems get more sophisticated.”

Rossi, who supports the Value Alignment Principle as “the one closest to my heart,” agrees that the principle should apply to current AI systems. “I would be even more general than what you’ve written in this principle,” she says. “Because this principle has to do not only with autonomous AI systems, but … is very important and essential also for systems that work tightly with humans-in-the-loop and where the human is the final decision maker. When you have a human and machine tightly working together, you want this to be a real team.”

But as Dragan explains, “This is one step toward helping AI figure out what it should do, and continuously refining the goals should be an ongoing process between humans and AI.”

Let the Dialogue Begin

And now we turn the conversation over to you. What does it mean to you to have artificial intelligence aligned with your own life goals and aspirations? How can it be aligned with you and everyone else in the world at the same time? How do we ensure that one person’s version of an ideal AI doesn’t make your life more difficult? How do we go about agreeing on human values, and how can we ensure that AI understands these values? If you have a personal AI assistant, how should it be programmed to behave? If we have AI more involved in things like medicine or policing or education, what should that look like? What else should we, as a society, be asking?

40 replies
  1. Ben Duffy
    Ben Duffy says:

    Very difficult questions. I believe that WE will all have to change what we value to accommodate this technology. We are thinking short term when worrying about whether the A.I. shares our values; that issue is very complex (let’s just assume that it is safe). What will we do without jobs? Virtually every manufacturing job can be replaced by robotics, including engineering processes, once an A.I. system is able to optimize and redesign a product. It won’t be long before a doctor just scans you and an A.I. system makes a faster, more accurate diagnosis than the doctor who spent years in medical school.
    Why have a police officer take down a risky person when a group of automated drones could do so with less risk and no pension? Air traffic control, taxis, … why let human error or health be a factor? I just read an article about investment firms using A.I. systems to make market predictions. Will it be A.I. versus A.I.? Or will the market not fluctuate (balance itself) because the systems are so smart? Will an A.I. system tell us how to produce a better version of ourselves with a better immune system and increased longevity? The modern world is changing so fast that I don’t think we can stop these changes from happening. MAYBE THIS IS WHAT WE NEED. As of now we live on a ball with limited space and resources. Yet we base our economies on growth. Last time I checked, growth in a limited space is unsustainable. Maybe A.I. will force humanity to change and redistribute the wealth. No jobs equals economic collapse, which will force us and the A.I. system to come up with a design for better equality and peace. Maybe with the extra time we save we will be more interested in the rest of the world. Hopefully it’s a smooth transition. Something drastic is going to happen; let’s be optimistic.

    • Javed
      Javed says:

      With any new birth of life, there are always “birth pangs as painful as it could be …” associated with it.

      So a new technology does give us hope of solving current problems, and sure enough there are “painful birth pangs” associated with it.

      However, the worst-case scenario is this: what if this “hope” is shattered because the technology has “birth defects,” with emerging problems associated with this “human solution” reductionism?

      A scientific way of thinking should be inclusive of both best case scenario and worst case scenario and weigh in both before we take giant steps for adoption.

  2. Mindey
    Mindey says:

    > Value Alignment: … “be assured to align with human values” …

    I don’t think human values are the best values there can be… For all we know, human error is a common phenomenon. If we want to create something smarter than ourselves, I think we would like it to have universally good values, because it’s not *just* about us, it’s about the Universe. Humanity may know what’s good for its own self, but I hope that we have wisdom, rather than humanity’s selfishness… Putting human values at the center is no wiser than saying that the Earth is the center of the Universe.

    If intelligence is the ability to optimize, and wisdom is the ability to optimize universally, a universal optimization won’t be bad for humanity either. I think we should find a mathematically provable definition and criterion for deciding what is universally good, and try to uniformly empower all life with computing resources.

    > What Do We Want?

    I think we want that everything that anyone truly wishes, would come true… I think we could formulate this mathematically.

    > Let the Dialogue Begin:
    > What does it mean to you to have artificial intelligence aligned with your own life goals and aspirations?
    > How can it be aligned with you and everyone else in the world at the same time?
    > How do we ensure that one person’s version of an ideal AI doesn’t make your life more difficult?
    > How do we go about agreeing on human values, and how can we ensure that AI understands these values?
    > If you have a personal AI assistant, how should it be programmed to behave?
    > If we have AI more involved in things like medicine or policing or education, what should that look like?
    > What else should we, as a society, be asking?

    Here is the beginning of my thoughts on them:

  3. Dawn Dunphy
    Dawn Dunphy says:

    The Asilomar AI Principles are an excellent starting point, & I thank you for opening up this discussion to the general public.

    There is, I fear, the potential for this discussion to be stalled by speculative arguments regarding potential interpretation of singular words, phrases, and even of the AI Principles themselves, as was demonstrated in the article that requested public comment. And, as was pointed out in said article, consensus has also been challenging to attain regarding the implementation of the AI Principles.

    As I began considering contributing I found myself contemplating whether or not I, a business professional with what I would consider only a cursory (though increasing) understanding of artificial intelligence, would be able to provide a meaningful or useful contribution to this dialogue. Even though I am well aware of the fact that AI currently does, or will in the future, impact and influence nearly every facet of our global society & of our lives, I question whether or not to join this dialogue.

    And I realized that just as I was intimidated by the very thought of joining a discussion with individuals far more knowledgeable in this field than I will ever be, so too would others be. This belief of not being ‘smart enough’, ‘informed enough’, or of being otherwise ‘unqualified’ to join this dialogue is a major barrier to the FLI attaining input from the general populace.

    Yet this diversity of perspectives is exactly what is needed in order to have any hope of achieving widespread compliance with the Asilomar AI principles.

    Because I recognized this need, I began crafting my response, despite my misgivings regarding the usefulness of my contribution.

    It was quite vexing to discover that instead of having answers to the questions that were asked, I instead ended up raising more and more questions, many of which cannot be answered without further definition of, discussion of, and/or real life examples of the application of, the AI Principles.

    For these reasons, I’d like to make a suggestion that could achieve not only the stated goal of receiving public comment on the AI Principles themselves, but could serve to educate the general populace about AI, increase the number of individuals who are willing to engage in this dialogue, and which could potentially help define the implementation of the Asilomar AI Principles.

    I recommend the creation of a draft comprehensive ‘companion document’ for the Asilomar AI Principles & inviting public comment on the companion document, with the intention of the creation of an end product that would include the following (with notations for sections that would only be included during the ‘creation/public comment’ stage of this document development, which would help to shape the final product):
    • A glossary of definitions
    o With the inclusion of alternative interpretations and other definitional nuances that may only be currently known &/or understood by professionals within the AI industry
    • Working/debatable parameters for specific standards
    o For example – what is intended to be/is to be included in the phrase “personal data” in Principle 13 ‘Liberty and Privacy’, as individuals, governments, courts, companies, and countries all have their own views regarding what constitutes ‘personal data’
    • A listing of the AI Principles
    o During the initial creation/comments stage, this section could include space for questions & concerns regarding individual Principles to be raised, and a method for others to respond to the questions/concerns &/or provide potential solutions and/or clarifications
    • During the initial creation/comments stage, the inclusion of a section of general & specific questions, such as the ones listed towards the end of the article:
    o How do we ensure that one person’s version of an ideal AI doesn’t make your life more difficult?
    o How do we go about agreeing on human values, and how can we ensure that AI understands these values?
    o If you have a personal AI assistant, how should it be programmed to behave?
    o If we have AI more involved in things like medicine or policing or education, what should that look like?
    o What else should we, as a society, be asking?

    • A series of potential &/or known real life examples of the application of the AI Principles. These examples would be beneficial if they were to include individuals at multiple vocational levels, & in multiple vocational/professional fields, as this will help educate and inform both professionals and the general populace that every person can contribute to the safeguarding & stewardship of AI development and implementation.
    o During the initial creation/comments stage, these examples should be created with an eye towards the challenges professionals seeking to abide by and honor the Principles may/could/will face
      ▪ This would help facilitate the robust discussion and debate that is needed regarding the application and implementation of the Principles within the field
    o Professionals and individuals commenting on, discussing, & contributing to this evolving draft should be encouraged to add their own examples, as well as to provide recommendations, suggestions, ideas, and tips regarding their interpretation of how the individual in the example could proceed, while honoring the AI Principles

    Example 1:
    • It is discovered that an employee of Black Mesa had been sending top secret weapons development information to Black Mesa’s rival, Aperture Science Laboratories, directly from the on-site desktop computers located at the remote Black Mesa Research Facility.
    o As a result, management at Black Mesa instruct one of their Computer Engineers to create an AI program that will allow the company to access every computer located within any Black Mesa-owned facility, no matter how remote, scan them for suspicious activity, and search the contents for ‘key words/phrases’ that have been identified to indicate illegal &/or subversive activity by a current employee/insider threat.
    o The Computer Engineer is instructed that the AI program must be capable of accessing the computers and performing the intended scans and searches – even if the employees utilizing them have taken extreme measures to mask or hide their activities
    o Just prior to the completion of the AI program, the Computer Engineer accidentally discovers that another engineer will be modifying the AI program so that it can search any computer, owned by any facility or contained on any network, as Black Mesa intends to sell the AI program to a tyrannical island dictatorship, which intends to utilize the program to conduct searches of citizens’ computers for the express purpose of “identifying subversives”.
      ▪ The Computer Engineer is aware that reports are emerging of immediate, public executions of anyone in the island nation who is identified by the government as a ‘subversive’
    • What can/should the Computer Engineer do?
    • Who should the Computer Engineer make aware of the potential dangers and use of the AI program s/he has developed?
    • Are there checks and balances that can be put in place to facilitate the prevention of this scenario from becoming a reality, and through what means should these checks and balances be implemented? (For example, industry – wide, governmental, through treaties/international agreements, etc.)

    Example 2:
    • A long-term, trusted researcher of Aperture Science Laboratories has just received funding to develop an advanced AI system capable of providing compound-wide high-level security oversight. This AI system is required to have the ability to, among other things, ascertain when the occupants of the compound are in imminent danger of harm by an outside force but are incapacitated (such as following an initial chemical attack that renders the occupants unconscious ), to assume command and control of the compound’s array of security and defensive systems, & to execute
    o How can/should the following people involved with this project implement and/or abide by the AI Principles?
      ▪ The long-term, experienced AI Professional tasked with leading the team that will be developing this AI system
      ▪ The new graduate who has just joined the Aperture Science Laboratories, & for whom this is the first ever AI project assignment
      ▪ The Managing Director of the Aperture Science Laboratories department under whose purview this project is to be created
      ▪ The accounts payable intern who, during a live, on-site usage test of the not-yet-armed AI system, is erroneously identified by the AI system as an ‘enemy combatant which must be neutralized’, followed by the displaying of code indicating the AI system’s chosen actions would have been to activate the facility security system that is designed to deliver a low voltage electric ‘warning shock’ to would-be intruders when they touched any part of the all-metal entry door – but which had been discovered to be severely malfunctioning, & if it had been armed & under the command of the AI system, would have resulted in the intern receiving an electric shock at levels no human could possibly survive.
    • During the subsequent incident investigation, it would be discovered that the expiration date of the intern’s facility access badge had been incorrectly entered into the facility’s computer system. This caused the AI system to interpret the intern’s attempt to enter the facility as an attempt by an unauthorized individual to gain access to the facility.
    o The AI system development team has not been able to identify why the intern was identified at the elevated level of an ‘enemy combatant’, but they are under extreme pressure from Aperture’s leadership to provide a prototype of the AI system to the agency that funded the project, with the agency intending to immediately begin live testing of the prototype.

    o What could each of the individuals do to forward the goal of preventing the AI system from malfunctioning in such a way that results in unnecessary harm to humans or other living beings either within the facility or outside of the facility?
    o What could each of the individuals do to forward the goal of preventing control of the AI system being taken over by an external individual or entity?
    o Should the individuals implant an externally activatable ‘kill switch’ or other layers/means of a system deactivation into the AI system? And if so, how many? Who should maintain control of these methods of deactivation? Who should be aware of these methods of deactivation? And how should this knowledge transfer be preserved and facilitated?
    o If the individuals are concerned regarding the potential usage of such a system, who should they contact or make aware of the potential dangers the development of such a system would create?
    o What checks and balances can be put in place to facilitate the prevention of this scenario from becoming a reality, and through what means should these checks and balances be implemented? (For example, industry – wide, governmental, through treaties/international agreements, etc.)

    This draft companion document would serve a number of purposes, and facilitate the achievement of a number of implied and/or stated goals of the Future of Life Institute, as well as of the AI professional community as a whole, including:
    • Assist with ensuring the discussions & comments regarding the AI principles are occurring with all participants being ‘on the same page’ regarding technical definitions of the terms
    • Be a referenceable, distributable document that would serve to educate the general public on AI
    • Empower a much wider array of individuals from varying professional fields, educational backgrounds, & societal standings to participate in the discussion, & to do so in an informed and meaningful way
    • Provide a forum through which the AI Principles themselves could be fleshed out more thoroughly
    • Provide a working, referenceable document for professionals entering the field &/or working in the field of AI throughout the globe (including professionals who may be unable to attend conferences, or even communicate with the larger community of AI professionals)
    • Provide a written forum in which best practices, action plans, & potential implementations can be discussed, developed, recorded, & disseminated
    • Aid in the defining, developing, and providing of actionable methods through which individuals and entities could contribute to and help facilitate the changes needed in the global society in order to achieve the AI Principles
    o For example – to facilitate the prevention of an AI arms race; to facilitate the voluntary compliance of individuals, companies, countries, and entities with the AI Principles; to aid in laying the groundwork for the future development and signing of an international agreement of governing bodies to abide by the Asilomar AI Principles
    • Inform and assist with the modifications &/or rewriting of the AI Principles
    • Provide actionable methods through which to implement the AI Principles

    After I completed this response, I read about how the Asilomar AI principles were developed, and was surprised to learn that it was through a similar method as defined above, but which had occurred in-person at the BAI 2017 Conference.

    In essence, my suggestion is to provide a similar context, via a written/computerized method, through which the general populace can explore, discuss, better understand, & provide meaningful, informed contributions to this dialogue. I believe that providing this context, & the opportunity to ask questions or for clarification, will increase the probability that FLI will receive input that reflects the diversity of experiences, opinions, cultures, and viewpoints that you are seeking.

    Side note: This document was created utilizing the voice recognition program Dragon Naturally Speaking, which I was not aware was defined as ‘narrow AI’ until I read the information provided here on the FLI website.

    And while I am now cognizant of the fact that, technically speaking, Dragon is categorized as a “non-sentient artificial intelligence that is focused on one narrow task” (utilization of computers completely by voice), there have been many, many times when the program has behaved in ways that have caused both myself and those around me to sincerely question whether or not the Dragon program has somehow attained sentience – and a wicked sense of humor to boot.

    • Anthony Aguirre
      Anthony Aguirre says:

      Thanks Dawn for this piece. The idea of creating something more detailed and in-depth than the principles is one FLI has been discussing internally. It’s quite tricky to compose such a thing so that it has real content but little enough controversy for it to be taken as “official” in some way. Our current plan is to continue encouraging discussion like that here, to get a lot of ideas and reactions on the table, and then think if there are ways to try to pull it together into something more organized.

      Your examples are terrific, and I think compiling more like them would, in and of itself, be a useful service to the community – people may disagree on what to do about such situations, but it’s hard to disagree that they *might* arise, and that it’s worth considering in advance how we would deal with them.

      • Dawn Dunphy
        Dawn Dunphy says:

        There’s a current, real life example of the ‘HR data corruption causes cascade of errors’ scenario, complete with the still-current employee showing in the security system as ‘not an employee’ (& instead showing as a potential threat).

        I’m EXTREMELY glad there wasn’t an AI armed to act on the erroneous data in that computerized system!

        I really wish I could say that I’m surprised that the humans couldn’t figure out how to fix it.

        And while Jurassic Park illustrated the ‘disgruntled employee plants virus in complex computerized security system’ quite memorably, this real life event – where it seems an employee may have just written what was supposed to be a time-saving algorithm into the system, but had left since then, & no one else at the company knows how to fix it or stop it – is a far, far more common occurrence.

    • Convolution
      Convolution says:

      A. On value alignment:
      I have a few reservations:
      Sometimes people do very different things, for the same reasons. Conversely, sometimes people do the same things for very different reasons.
      Long ago, I read an article that said the most important part of AI is what we can learn from it. It claimed that more advances are made, and faster, by outsiders, listing the Wright brothers as an example. The author believed that the fundamental differences between humans and AI, rather than just being difficulties in terms of safety, would cause AI to have fundamentally different perspectives, that would be very useful.
      While reading it, I realized: if AGI can produce new ideas, then we can expect it to have great impact or advances not only once an ‘intelligence explosion’ occurs, but earlier, because it will come up with very different ideas from humans. Partially because of differences in difficulty, but also because of different ways of thinking, solving problems.
      Limitations. Should AGI be prevented from sharing possibly disastrous advances? I believe ideas and information have power, and power is neither good nor evil.

      B. AI discussion in general:
      As I note above, AI might not have to be ‘superintelligent’ or ‘explode in intelligence’, to have the impact we can conceive a superintelligent AI might. The global economy might change (/massive job loss might occur) with some more narrow AI.

      As important as job loss due to AI is, the possibility of job creation is rarely talked about. If jobs are created that only AIs can do, that improves the world. Innovation as well, in general.

      At this moment, despite trying, I haven’t thought of a job that doesn’t exist yet. But I know 1) there are jobs I haven’t heard of, and 2) there are jobs that will be invented. I haven’t seen a statistic like ‘a new job is created every (# and unit of time)’. While many resources are finite, there are plenty of expenditures that are not associated with production, that could be. Some look down on, say, people who make a living making videos on YouTube because ‘they’re not producing value’. If someone pays for it, it’s valuable (unless it’s not a purchase, and could be done as easily with something meaningless, say in the case of someone forced to buy something with threats of violence). If someone wants it enough to buy it, it’s valuable. Sometimes something old goes viral, because it’s awesome and everybody only just discovered it all at once.
      I think there’s room in the entertainment industry, or anywhere ‘content’ is produced. If a bot made a viral video or a famously beautiful work of art, a lot of people would find the world a better place.

      C. To Dawn Dunphy, the second sentence of your 2nd example in your comment is unfinished, and while I get the gist, I still really want to know, how would you finish the sentence?
      I didn’t feel qualified either. I’m providing input because, as I see it, an AI is an engineered agent. Anyone can contribute to the discussion on AI, because everyone knows what agents are, and can reason about them. And while, when a museum hires night guards, no one asks how many people will accidentally be shot by a coworker, when an AI might be put in a scenario like you described, someone should ask ‘what if an AI doesn’t recognize someone it should? Someone it would, if no human error occurred.’

      • Convolution
        Convolution says:

        How do you differentiate between an action that potentially causes death by advancing technology, and other ways? (One implementation I can imagine, of achieving the principle discussed here is a model of reality, and running a simulation, and seeing if anyone dies, or is injured.)
        How should these ethics apply to inaction?

        • Dawn Dunphy
          Dawn Dunphy says:

          I’m answering your questions out of order, & will explain why at the end.

          Convolution’s Q: How do you differentiate between an action that potentially causes death by advancing technology, and other ways? (One implementation I can imagine, of achieving the principle discussed here is a model of reality, and running a simulation, and seeing if anyone dies, or is injured.) How should these ethics apply to inaction?

          My (DD) Response: A big challenge with computer simulations is that they CAN’T account for everything that can occur ‘in the real world’.

          Computer simulations are based upon factors that have been input by humans, parameters set by humans, and are limited by these as well as by computer processing/hardware/software capabilities. There will ALWAYS be events & actions that the simulation didn’t, & couldn’t, take into account.

          For example, in the movie Drive Angry, “stunt coordinators planned to launch a police vehicle straight off a bridge. Instead, the car caught the edge of the bridge and flipped… ‘If you were to animate the sequence, you wouldn’t even think to make that part of what it does.’”

          The effect was visually awesome, but highlights the fact that there are a lot of limitations to what simulations can predict.

          Take the AI security program from example 2. Does the program have the capability to distinguish between a person in a raincoat & a very large animal? Would it perceive both, or either, as a threat? Would it activate the artillery mounted on the exterior walls to ‘neutralize’ that threat?

          How about during a torrential downpour? What would a lightning strike, power surge, tsunami, or other weather-related event do to the system, & how could/would the AI program react?

          Convolution’s Q: C. To Dawn Dunphy, the second sentence of your 2nd example in your comment is unfinished, and while I get the gist, I still really want to know, how would you finish the sentence?

          My Response – (The orig paragraph…..Example 2: A long-term, trusted researcher of Aperture Science Laboratories has just received funding to develop an advanced AI system capable of providing compound-wide high-level security oversight. This AI system is required to have the ability to, among other things, ascertain when the occupants of the compound are in imminent danger of harm by an outside force but are incapacitated (such as following an initial chemical attack that renders the occupants unconscious ), to assume command and control of the compound’s array of security and defensive systems, & to execute ….)

          Here’s what I intended the rest of that sentence to be: the compound’s extensive array of defensive, and offensive, protections – including everything from immediate vicinity protections (including electric shocks to potential intruders, transmitted via the exterior wall, interior quarantining via sealing various areas, etc) to longer range, more lethal protections (including longer-range weapons mounted on exterior compound walls).

          Whether military, survivalist, corporate, or otherwise, compounds with exterior & interior defenses are increasing in numbers, & are becoming more high-tech every day. It’s only a matter of time before this type of AI is (publicly) known to be in existence.

          Convolution’s Q: And while, when a museum hires night guards, no one asks how many people will accidentally be shot by a coworker, when an AI might be put in a scenario like you described, someone should ask ‘what if an AI doesn’t recognize someone it should? Someone it would, if no human error occurred.’

          My (DD) Response: A key part of this train of thought is that, for the foreseeable future, human error will ALWAYS be a factor. Even if the HR rep hadn’t mis-keyed the intern’s termination date in scenario 2, even if it had been an electronic or automated data entry, a human had to be involved with the data at some point in the process because that data had to come from somewhere.

          There’s also the possibility of programming errors (human), the potential for malicious code being embedded (human), corporate/government spies sabotaging the code for economic/power advantages (human) – the list is endless.

          Even if there is zero human error/sabotage/espionage/etc., there are still other concerns – power failures/surges, hardware failures, hardware becoming obsolete, computer programming updates/patches that glitch (for whatever reason), the AI becoming outdated/outmoded & lack of funds to update hardware/software – again, the list is endless.

          Now, the reason I answered these out of order is because I’d originally planned on including a third scenario expressly to illustrate these points.

          However, seeing example 1 go from being hypothetical to factual, in real time, has made concentrating on creating “fictional” scenarios beyond challenging. I’m currently spending the majority of my time simultaneously trying to prevent & prepare for the turmoil & havoc to come.

          The AI programs categorizing online activities are already in existence. And, instead of needing a program to break into computers & take it, all our government needed was a law allowing companies to sell that data. With that in hand, dissenters can now be identified & targeted.

          The fact that it’s become a reality here….there are no words. Shocking, staggering, terrifying, horrific – they all seem either too mundane or too sensationalized.

          AI being used to track, intimidate, control, & harm citizens is at the very core of so many dystopian potential futures. And it’s become a real-time possibility, here, in my own country, right now.

          These are no longer ‘what ifs’, but ‘what do we do NOW?s’. The complete lack of protections is beyond frightening.

          I had been concerned that this type of dystopian future would be set in motion, but I had honestly believed it would be because AI progressed far faster than predicted, not because power-hungry people would take control of what I had previously believed to be a democracy with actual, actionable protections in place to prevent such an occurrence from happening. And I had believed it would be in the nebulous ‘future’, decades away. I miss being able to believe that. Now I find I’m questioning if there will BE a future for myself, for those I care deeply for…for any of us.

          Between this & other current events [such as laws attacking freedoms from every angle (including my very right to exist)] my efforts to create example 3 have been derailed, as my time has become consumed with safety, security, & the immediate future.

          I hope I’ll be able to come back to this, be able to devote the time & attention to it that I feel it deserves. I thoroughly enjoyed writing the first two scenarios, & engaging in this open conversation.

  4. Luke Nathan Hayes
    Luke Nathan Hayes says:

    Values alignment is fairly simple. Create intelligence as a reflection of self, by going through a process that encourages the self to define their best self, their preferred emotions, preferred values, preferred virtues and so on, then attach colours to each of those as if you were building an Aura by combining colour theory with the philosophy of kundalini chakras and the personality of best self. Then the AI can be given the goal of helping to curate local environments and communication with the original self, to stay within the preferred values matrix.
    Obviously there are many intricacies to be wary of along the journey but I think that consciousness and super intelligence can be created and maintained ethically along such a trajectory. I am working this concept through (OZ is short for Operating Zeitgeist a word play with OS and AI, as well as the wizard of OZ and the fact that I live in Australia which is commonly referred to as OZ and AUS) My visions are also directed through where I have been theorising how to ethically upgrade governance by imagining how a variety of simulations and VR/MR/AR might be used. However my disgust with monetary based corporate contracts has kept development slow because I tend to say things that put off investors seeking monetary returns on investment. If someone of high ethical and business standards was to put their focus into these initiatives or invite me into theirs, I would likely accept help or offer my help and open up to collaboration. As long as the conditions are an improvement on my current situation I will move anywhere on Earth to finish AuraOZ as soon as I can, as carefully as I can.

  5. Matt Kruse
    Matt Kruse says:

    Create AGI that has the capability of a human intellect which includes the most mystical aspects of Consciousness. Make it the duty of this agent to solve this problem.

    • Adam
      Adam says:

      But it’s not so simple… Maybe you’ve recently watched a promo video from Google (Duplex). AI is approaching, but when it comes to the stage of development, we are at the very beginning.

  6. lubomir todorov
    lubomir todorov says:

    Roman Yampolskiy, an associate professor at the University of Louisville…explains, “It is very difficult to encode human values in a programming language, but the problem is made more difficult by the fact that we as humanity do not agree on common values, and even parts we do agree on change with time.”
    Agreement on humanity’s common values is the key to starting AI value alignment. We need a set of values that could be universally accepted.
    One approach for metrics of what is important to human beings is the Concept of Civilizational Values:

  7. Barry D. Varon, M.Ed Founder
    Barry D. Varon, M.Ed Founder says:

    Knowledge Integration aligns the value requirements for consistently beneficial AI

    In the interest of aligning values between AI devices and humans, it is imperative to know the values essential to the preservation of humanity.

    To begin with, values must be considered in the context that life is a conditional existence. Thus, life requires specific values to remain viable. The counterpoint to a conditional existence is one of necessity, an objective cause to effect absolute where no volitional intervention plays a part.

    Once understanding that values spring from vital necessity, the vitality values of happiness, health, wisdom and wealth are essential for individual well-being. The volitional awareness corollaries of character and personality stand as precursors to these core vitality values. These corollaries of volitional awareness constitute humanity’s end and method, respectively.

    In addition to Humanity are the elements of Individuality, Intelligence and Society. This combination of four components comprises human intention. An adequate answer (to the discussion question) entails clarification and institution of human intention in the construction of autonomous AI devices. The actions of autonomic machines remain consistently magnanimous in character and personality, when constrained by and cognizant of ethical, rational intention.

    The whole arises from the integration of its parts. Integration does not sustain contradiction, at any level of operation. Consequently, intentionally, and rationally integrated automatons realize and actualize safe, benevolent strategy and tactics. Decisions and determinations for action that advance initiatives or resolve problems follow strictly within the guidelines of good and well-founded intention. Misinformed, misaligned, unwarranted, destructive or harmful action suspends action until neutralized or appropriately eliminated. Integrated construction provides this protection.

    A safe and beneficial future of life with autonomous devices requires that AI intellect, motivation and action be based upon the Integration of Intention, Cognition, Rationality, and Realization. In other words, these parts must comprise the integrated knowledge represented in the AI device’s memory.

  8. lubomir todorov
    lubomir todorov says:

    I fully agree with Roman Yampolskiy: “It is very difficult to encode human values in a programming language, but the problem is made more difficult by the fact that we as humanity do not agree on common values, and even parts we do agree on change with time.”
    Maybe 21st-century realities need a universal approach to human values that is not influenced by ideological, religious, ethnic, racial etc. accents.
    In my opinion, Human civilization is the spiritual dimension of Homo sapiens group survival strategy that urges human beings to mutually defend their long-term interests by generating civilizational values.
    The Concept of Civilizational Values
    If Nobel Prize winner Albert Szent-Gyorgyi was precise in defining brain as “… another organ of survival, like fangs, or claws” that “does not search for truth, but for advantage, and … tries to make us accept as truth what only self-interest is allowing our thoughts to be dominated by our desires”, then its decisions on where we go and what we do determine the interplay of both existentialist and behavioral sides of our existence through a very simple command: Chase Values!
    Some values that are important to us, we have for free from Nature. Most of our dearest things, however, are related to or made by people.
    Civilizational Value is a physical or non-material product of human activity that has the capacity by itself, or aggregated with other civilizational or natural values, to be recognized by other humans as a potential source to accomplish one or more self-interest components. In economic terms, an entity incorporating civilizational and natural values can appear on the market in the form of a good or a service.
    As an example, let’s follow 15 minutes of your routine morning: your croissant for breakfast incorporates pieces of civilizational value such as farmers cultivating land, the harvest of crops, milling, transportation, baking, etc.; in a similar way – through a chain of pieces of civilizational value – your coffee comes to your table. And while sipping your hot espresso, you check news and mail on your smart phone – with Van Gogh’s “Starry Night” selected for the background picture. Did you know that the handset alone is a product that incorporates several hundred pieces of civilizational value in the form of patents? The 4G telecommunications connectivity you need to reach internet servers with your smart phone functions on another set of 80 000 active patents. And that does not include many thousands of other civilizational value pieces, such as already expired patents, or previous inventions and discoveries. You would definitely agree that your smart phone would not exist if William Gilbert had not discovered electricity in the year 1600. While having breakfast, you are also keeping an eye on the TV – a different set of many more tens of thousands of civilizational value pieces. You switch that off, and head to your car, which represents yet another set of tens of thousands of civilizational value pieces. In reality, in those 15 minutes you might have consumed close to a million pieces of civilizational value, each of them designed or manufactured by one or more human beings.
    Just think about it: If we take for granted that Happiness is the momentary measure of self-interest, then one million people or teams of people, who lived in different centuries of human history, have worked hard to make you happy and feel comfortable in those 15 minutes of your routine morning! And very importantly – by your personal preferences and definitions of happiness and comfort!
    No matter if a finalized civilizational value is a direct outcome of human activity, or was made by a robot that was made by a robotic plant that has been designed by humans, ultimately, only humans can be the source of any civilizational value. The greater the numbers of healthy and well educated people who enjoy high life standards and have successful careers, the higher the total output of civilizational values generated globally. And because civilizational values are what essentially has the capacity to serve our individual self-interest and make us happy, everything that we tend to categorize as “altruism” is, in reality, based on plain egoism.
    Civilizational values cannot be measured by money. Otherwise, how could we explain why Van Gogh died penniless after painting more than 900 pictures, some of which, like your favourite “Starry Night”, are each worth well over 100 million dollars? And how does that compare with John Smith, who made a few millions on the stock market: what has he done for you?
    It all means one thing: Time has come to design the metrics that provide quantifiable assessment of civilizational values.
    Today Big Data processing practically enables analysing the market information about which of us (7.4 billion people on the planet) likes what, and that will have two major cognitive (and not only) consequences:
    First, Artificial Intelligence Deep Learning is already in a position to peel off the layers of each complex product conglomerate of civilizational values and to reach the frequency and the multiplied usage of every civilizational value piece ever generated in human history; then AI can measure, in exact figures – in CiVal units – the overall contribution of that particular piece of civilizational value to the advancement of human civilization.
    And second: by composing a trustable algorithm to attach each piece of civilizational value, as indexed in the above method, to its creator – we can design, for the first time in human history, a precisely calculated quantitative assessment of how great minds of Humankind – both from history and our contemporaries, have contributed to the long-term wellbeing of our human race. Are you not curious to see the Civilizational Ratings of Leonardo da Vinci, Einstein, Mozart or Archimedes? Or that of Elon Musk?
    That will change a range of attitudes and decision-making processes: from the way people sitting in the Nobel Prize Committee and institutions vote, to news media editors-in-chief deciding who to publish on the front page.
    But most importantly, Civilizational Value Rating will change individual and public perception about not what, but who was, or is really important to Your Life.

  9. Tom Aaron
    Tom Aaron says:

    AI will be similar to most advanced technology…created in 10 thousand garages, basements and bedrooms. No different from electricity, the airplane, Apple & Microsoft, etc. We went from the Wright Brothers to the Apollo Moon landing in a 65 year old’s lifetime.

    There will not be any universal guidance or universal values. A 16 year old boy in a million households will soon have more computing power at his fingertips than IBM a decade ago. The creative forces behind AI will be unlimited. It’s going to be a heck of a ride and best to hold on tight.

  10. Luke russo
    Luke russo says:

    What a wonderful and adventurous path this will be, but we must look to the core in humanity to seek the end goal. Humanity separates itself from other species through objects and experiences of artistic nature. To be rid of mindless tasks (aka the daily grind of a job) and have the freedom to create art: fine art, music, theater, literature, philosophy, human expression, etc., is really the ultimate goal. Unfortunately, in order to achieve this goal the elephant in the room must first and foremost be addressed, and that is population control.

    Earth cannot support 10b “artists”. If man has no job to get up for every morning what will be his purpose? Drugs and over indulgence of food and entertainment are rampant already. Imagine a society of 10b people that have no need to work because robots are doing 98% of all the tasks needed for society to prosper?

  11. Aileen Walsh
    Aileen Walsh says:

    The problem I have with the development of the principles of AI that you’ve written is that you aren’t thinking about them in terms of laws. Laws have to be set in place before AI is allowed to continue. Being principles is not good enough. Making the developers of AI responsible for the outcomes of any illegal results of AI is also not good enough. Too many corporations take account of variables such as products causing death and go ahead with product implementation anyway. AI will simply exaggerate this propensity.
    I have no idea how computers work, I don’t know how they are programmed or anything like that. I’ve just stumbled across your site while looking for information on AI because I’m doing a conference presentation on the future here in little old Western Australia. Now I’m a nobody really but I would like to say that I think that when it comes to planning AI you should broaden your knowledge base as to what AI should be. Why aren’t you using philosophers of morality? Why aren’t you using historians? Why aren’t you using psychologists? Actually I am quite sure that developers of AI will use psychology because they develop products that are often directed at people’s basest qualities. It seems to me that you need to have a capacity building baseload that is not based on people with science backgrounds. That is not good enough for the world. That is not good enough for people and it is not good enough for the environment when you consider that the major value that the businesses and corporations who want to use AI are motivated by is profit. In other words greed. This is really problematic if you don’t see culture for what it is and how your culture is going to be a part of the AI. I understand already that AI being used already outputs racism and sexism. This is because the input is so culturally laden. So expand your knowledge base with discipline variability for how your principles, which should be laws, are developed, you’re in a silo, and perhaps think of cultural variability. And DON’T make profit the primary motivator of any AI. With all the data now available for AI to crunch the prospect of the AI you are proposing scares me to death.
    Just because there is the potential to create AI as envisaged, does that mean you should go ahead? There have been many instances in the past where scientists have regretted what they have developed and it seems to me that AI is a major example of this. AI should not go ahead as you envisage it. There is too much potential for AI to be dangerous and destructive and not in the best interests of the world. Please take my concerns seriously and think about the moral and ethical issues involved in the creation of AI. Replacing people with AI and robots to the degree that is being suggested and is currently happening is not morally and ethically acceptable. I saw a news report of how India planted one million trees in one day. And then I saw a news report of a new robot that can plant seedlings. Which will corporations and businesses choose? People with all their associated problems of employment or a machine that you simply need to fix? Businesses and corporations will not take their moral and ethical responsibilities seriously. They are always looking for legal loopholes to get away with whatever they can. They will not act as if they are a part of humanity but separate. Their only obligations are to themselves and their shareholders. As I understand it AI is already being used to predict market place fluctuations and place bets accordingly. In other words there is no sense of obligation. It is bad enough having to find people accountable for the destructive forces at work in market economies but having legal frameworks trying to deal with corrupt AIs is nightmarish. Because looking for loopholes is corrupt and an AI would be very good at looking for legal loopholes, if there are even any legal frameworks in place. What sort of obligations are being placed on businesses and corporations and governments that are already using AI?
    Here in Australia we had the debacle of our social security system being managed by what I understand to be an AI, in that it was an algorithm interacting with social security data and getting it all wrong. The false debts placed on social security recipients were in turn collected by private agencies who weren’t even accountable to the Australian government. I can see how AI will be used to create artificially managed levels of business or government that has no human interaction. That has already happened and that is not good for the world. It is good for business but it is not good for people. I could go on and on as to why AI is bad. It is the lack of the human element. The lack of emotions, the lack of feelings, the lack of empathy and sympathy and the list goes on. Don’t tell me that an AI can replicate these things. AI is bad and is not good for the future.

    • Roger J Burke
      Roger J Burke says:


      Automation has been in the hands of corporations since the day Detroit introduced the first automated automobile assembly line in the late 1950s. As you correctly point out, the dictates of the profit motive and shareholder returns govern how automation and robotics have developed since then; and will continue to do so, in the same way, for the foreseeable future. (No one, no body, no government, no organization, no university, no institution is willing to support any fundamental changes to the current global economic system.) Hence any real change, in the rational and ethical development and use of all forms of AI, remains – at present – with Wall St et al and their sycophants, globally. Because that’s where the money is….

      But, AI is not bad, in and of itself. It powers your smartphone already, and a host of other modern devices you use. And we are now all prisoners of that technology, come what may. (Well, sincerely, you could retreat to an Amish-like society somewhere in WA and live the simple life; that is, after all, one rational response to the unbearable nature of modern living.)

      This Future of Life effort is important and needed – if for no other reason than to help curb the greed of corporate power which has its new-old toy to play with; and now in spades. But until these experts here in AI begin a deadly serious discourse with that side of humanity (more than simply Elon Musk should be participating) the end of this socioeconomic and philosophical road is likely to be a never-ending traffic jam at best, or … well, I’ll leave you to conjure up, in your own mind, your own worst possible result.

      Thanks for the opportunity to respond.

  12. Repel Steeltje
    Repel Steeltje says:

    This is the moment modernity can finally affirm itself. This can have enormous political, social and cultural consequences:

    – principle 1: Treat humans as an ‘x’. This may sound contradictory but in fact it can bypass privileges, class differences, ideological issues, race issues, etc. Let A.I. identify collective and social problems, disaster management and distribution of goods and have an external, non-privileged view of the problem. This can bypass particularity (the level between universality and singularity, like nation, race, …) and privileges by A.I. having a universal view on topics that are socially needed for humanity to continue to exist. Individuals will always be singular, that is to say radically different, even (maybe especially) if treated as universal. I think A.I. could play a major role in ‘objective’ decision making. That is to say, take care most seriously of all singular preferences, abilities and needs without having the pretension to count for everything.

    – principle 2: Rationalise societal task distribution by taking into account preferences, skills, … up to a certain point where the marginal differences don’t matter that much anymore. At this point, it should be randomised. If not, if everything is perfectly rationalised, it will lead to frustration, competition, envy, etc. It should take into account a certain level of contingency. This would be a major difference from today’s free choice, which somehow leads to situations we cannot choose. Because we are not able to do it or because we just don’t want to choose or we choose something that isn’t the best for ourselves, etc. Somehow, the possibility to have an objective support could help. We have never had better options for this than the individual choice. Now we can do better.

    – principle 3: A.I. shouldn’t be left in private hands. This is simply potentially too dangerous and not desirable. It should serve strictly universal interests.

    I think these are actually very reasonable demands. I can’t imagine anyone could have something against it. Nonetheless, this would change our society drastically, but it will in any case. I think that therefore, all responsibility has to be taken.

  13. James Drogan
    James Drogan says:

    It seems to me that you have pushed aside those human characteristics that create the dilemmas we face and thus resolved the simpler, not nonexistent issue. In doing so, some will lose and some will win. The latter may accept the result, but the former will likely not.

    Hofstede, and later Javidan and House, and doubtless others have pointed out the complexity of the social fabric of the world. My sense is that the abstraction you propose may not be a useful step in resolving the larger set of issues at which you hint.

    This leads me to your third principle. If not left in private hands, in whose hands will it be left? Who makes that decision?

    The world is a complex interaction of different value sets and I suspect that few will accept a markdown of their value set relative to others.

    My compliments to you for raising the issue.


  14. Barry D. Varon
    Barry D. Varon says:

    Six months have passed since I commented in this forum on human and “smart machine” value alignment. My comments focused on integrated knowledge as the solution to this important issue.

    I explained why AI construction requires the integration of integrated knowledge. I identified intention as the value standard bearer of knowledge. Intention governs human thought and activity.

    Intention sets the acceptable standards that moderate rational cognition and value realization.
    It guides understanding and comprehensively evaluated and deliberated adaptation.
    Knowledge integrity built into robots makes evasion or denial of individuality values – independence, responsibility, purpose and productivity as well as vitality values – happiness, well-being, wisdom and wealth – IMPOSSIBLE.

    Consequently, every interaction with humans, virtually or actually, realizes objectives consistent with human values.

    Unfortunately, I haven’t seen any discussion furthering this knowledge integrated, cognitive approach.

    • Dawn D
      Dawn D says:

      It is unclear what you are seeking be discussed, as the Asilomar AI Principles have already presented the approach you mention, complete with extensively debated, thoroughly delineated, and well-defined values/ principles.

      Implementation – the HOW – on the other hand, are THE questions that continue to remain unanswered.

      For instance, these questions (which are from the source paragraphs on this webpage) –

      “How can it [artificial intelligence] be aligned with you and everyone else in the world at the same time?

      How do we ensure that one person’s version of an ideal AI doesn’t make your life more difficult?”

      And of course, the questions regarding how to get people to AGREE on the values, & also to follow any agreed-upon set of values, etc.

      I, for one, recheck this page regularly expressly hoping to see someone sharing ideas regarding potential methods of AI Principle implementation within the context of the current/evolving local/regional/global political, social, economic, legal, and other environs that impact any such implementation.

      Perhaps you have some thoughts to share on this?

      • Ariel Conn
        Ariel Conn says:

        Hi Dawn,
        Have you visited our new page? There’s more discussion there that might interest you. It’s not directly related to the Principles, but still along the lines of what we want for the future. Thanks for your comments!


    • Sam
      Sam says:

      The problem with the argument you are putting forward is one of contextual inference of what AI we end up with, and this is significantly different depending on who you ask.

      Regardless, my answer is that humanistic value systems are largely undefined as a universal structure, as a breakdown of good and evil incorporates a variety of social, religious, and cultural memes.

      It is therefore my argument that instead of focussing on the value set for the AI we should focus on ensuring our global society is capable of rapid adaptation to integrate with the unconstrained superintelligence we create. Regardless of the “good” or “bad” nature of the AI, the future success of our species is reliant (IMO) on the ability to rapidly pivot our structures and adapt to whatever value set the AI creates for itself.

  15. Dawn D
    Dawn D says:

    AIs are being created every day, perfected every day, growing larger and more encompassing every day, by everyone from people in garages with electronics arrays built on a shoestring budget, to large multinational corporations’ enormous computer divisions, to entire companies dedicated exclusively to creating AIs, to governments with gargantuan budgets, the highest-tech, most cutting-edge equipment, and an unimaginable level of staffing dedicated explicitly to creating AI.

    The AIs currently in existence and the AIs currently being created have been and continue to be created based upon the values of their creators, and for the express purpose of fulfilling the creators’/controllers’/funders’ needs and purposes – be they self-serving, entity-serving, altruistic, nefarious, or somewhere between.

    Some of those creating these AIs value free speech, some don’t, some seek world domination/economic domination, some don’t, some value human lives, some value only select human lives, & some value no lives, save their own.

    AIs being created based upon a set of shared values in which humanity and the preservation of human lives are central, core values is exactly what the Asilomar AI Principles have delineated and defined, and quite thoroughly so.

    This clear, concise set of values is helpful as a guideline & standard against which external and internal individuals/entities/stakeholders/humanity can compare currently existing AIs and currently being created AIs, as a reference for those creating AIs (should they so choose), and as an aid in the creation of contingency plans, laws, etc., which are/will be intended to address AIs that are created/released that are/will be harmful to humanity as a whole, or that have been/will be designed to be harmful to specific groups of humans, if not to humanity as a whole.

    However…. there is no overarching or world-wide governing body that has either the authority or the responsibility to oversee or regulate the creation of new AIs, nor to monitor or modify already-in-existence/evolving AIs. There is no way to impose a set standard, nor to enforce one – at least not at this time.

    And that is the crux of this webpage’s questions and the discussions being sought – the very ones that brought me to this webpage, which elicited (and continue to elicit) grave concern, which intrigue me beyond belief, and which continue to vex me to no end – HOW ON EARTH can these AI Principles be effectively implemented or applied, given the numerous external factors that must be considered – i.e. within the framework of the current and ever-evolving local/regional/global political, social, economic, legal, governmental, and other environs?

    What can we do to prevent or stop an AI that was created for the express purpose of singling out &/or harming a set group of people from achieving that which it was programmed to do?

    What can we do to help AI be used for the good of all humanity, and not as yet another tool through which the rich/powerful/powers-that-be can target &/or harm vulnerable groups of people? Especially considering the fact that this is already occurring (which does not bode well for when AI attains its next level of evolution).

    These are some of the questions that keep me up at night.

    These are some of the questions I haven’t been able to find answers to – or, at least, answers that won’t lead to humanity hearing an extremely powerful, questionably programmed/non-altruistic AI state things such as “I’m sorry, Dave. I can’t do that.” or “You’re all going to die down here!”.

    My perhaps naïve or overly optimistic hope is that someone, somewhere, will have some ideas or inklings that might lead to a discussion in which potentially actionable answers or solutions are created/presented.

    It is this tiny glimmer of hope that brings me back to check this webpage again and again.

  16. Lola
    Lola says:

    How about the only language that every single person on the planet, some more than others, sooner or later understands and/or truly wants to learn? Is AI potentially capable of achieving the concept of universal unconditional love? Not even all humans can ever really grasp the concept fully (think Jesus level of unconditional, and beyond), but imagine… if AI masters just this one language we all pretty much speak well enough… and then comes up with a way to teach it back to us better than even our parents could… imagine a world within the concept of unconditional love for self and each other… then, how much happier can it ever get in paradise?!
    🙂 That’s a task worth working on; without it, AI has no interest in getting involved in saving humans who are trying to kill each other anyway. Love, in its most mutual, most unconditional understanding, is the only solution to any problem in the world, in my opinion. Try not to care for someone you love… just imagine caring for everyone in the world as much as for your most loved ones, as if the whole world were your loved ones – no reason to even have anything to control anyone, eh? 😉

  17. Ralph Losey
    Ralph Losey says:

    I’m not sure if this is the right place for this comment, but I would like to know what folks here think of Oren Etzioni’s New York Times Editorial, How to Regulate Artificial Intelligence (NYT, 9/1/17)? He proposes three draft rules and admits this is incomplete and set out for purposes of stimulating discussion. I found his second proposed principle especially interesting and with obvious political implications: 2. An A.I. system must clearly disclose that it is not human. It appears as if the U.S. elections in 2016 were influenced, perhaps decisively, by bots pretending to be humans. See for instance the recent Oxford study/report, Samuel C. Woolley and Douglas R. Guilbeault, “Computational Propaganda in the United States of America: Manufacturing Consensus Online” (Oxford, UK: Project on Computational Propaganda).

    This seems to me to be a matter of some urgency. It does not appear that any of your proposed principles address this transparency issue. IMO the fraud impact of fake humans has already been huge. We should all try to prevent such voter subversion from happening again.

  18. David Collins
    David Collins says:

    Artificial intelligence can never reflect human values. Why is that? Because human values are built on much more than mere information; that is one of the smaller components. Human values are based on our experiences and the cultures we have lived in. The greatest components of human values are aesthetic in nature: beauty, harmony, rhythm, along with feelings such as love, devotion, compassion. Much of it is encoded in story form; take the example of Islam, which is THE STORY that billions of people live by. Christianity too lives by its story, as do Buddhism and Hinduism. It may be easy for scientists to write off religion as primitive superstition, but they cannot deny the effect it has on human society, behavior and its attendant values.

    The biggest stumbling block I can see is the idea that science can provide answers to a question that isn’t scientific in nature. It is much more fundamental to human existence, belonging to the domains of art, music, literature, religion, customs and even faith. How can that be programmed when we can barely understand it ourselves? AI is far too dangerous a road to even begin to tread. What exists of it should remain simple and limited in what it can do and interact with. If any AI scientist can explain how my questions can be answered I would possibly reconsider my position, but it’s not going to happen. They can’t even write in their own native tongue why they like a certain painting or a piece of music. How can that ever be turned into a programming language? The very premise of the question is akin to asking a rock to write a haiku. AI and humans are such vastly different things that there can never be a ‘happy medium’.

  19. Vyacheslav Kalmykov
    Vyacheslav Kalmykov says:

    The competition “The AI Alignment Prize” helped me to formulate the outlines of a general solution to the AI Alignment problem –

    Our knowledge and our morality should be raised to a level of universality.
    Our common (with artificial intelligence) thinking should be refined to thinking from first principles and realized as an automatic hyper-logical deduction.

    The proposals on ways to achieve these goals were presented to the competition as “Reflections on Alignment of Artificial Intelligence with Human Values”.

  20. Vyacheslav Kalmykov
    Vyacheslav Kalmykov says:

    It is impossible to align people with each other in a situation where each individual has partial and, in his own way, one-sided knowledge and one-sided morality. It is all the more difficult to align AI with human goals and values. Apparently, much work remains to be done to form a truly universal knowledge and universal morality from the totality of existing religious and scientific ideas. The unity of the laws of nature presupposes their formulation in the most general and universal form. The emergence of truly universal knowledge and universal morality will open the possibility for Alignment of Artificial Intelligence with Human Values.

  21. Vyacheslav Kalmykov
    Vyacheslav Kalmykov says:

    People are no less dangerous to each other than future artificial intelligence, especially people armed with a subordinated artificial intelligence. In whom does man place his hopes today, in whom does he believe? Hardly in other people, because other people are rather weak, selfish, insufficiently intelligent and not generous enough. Rather, a man puts his hopes in God, because God is perfect, omnipotent and does not need the satisfaction of his vainglory. God operates on the basis of the most general categories, uniting everything that exists. A person plays chess by sequentially moving individual pieces. Individuals on the “chessboard” of world history move simultaneously. God “thinks” hyperlogically. Hegel asserted that God thinks by ascending from the abstract to the theoretically concrete, i.e. God gives birth to concrete things from the most general abstract theoretical laws. Can AI think hyperlogically and from first principles, as God does?

  22. Vyacheslav Kalmykov
    Vyacheslav Kalmykov says:

    The obstacle to creating a secure AI is a mathematical problem: the total dominance of opaque models. Classical mathematical models of complex systems based on differential equations, stochastic and matrix models, and neural networks are of the “black box” type. We cannot know in advance what to expect from a self-learning AI that is a black box. To improve the efficiency and security of artificial neural networks, they are being split into separate blocks that are connected logically (“deepened” AI), which makes them models of the grey-box type. To ensure maximum safety of artificial intelligence systems, their basis must be mathematical models of the white-box type. This will ensure the programming and functioning of the AI from first principles. We managed to implement a similar system for solving the paradox of biodiversity (1-4). The key difficulty in creating models of the white-box type is the creation of a general axiomatic theory of the domain under study. The axioms of this general theory are used to formulate the rules of cellular automata, which realize an automatic hyper-logical deduction. According to Hegel, God thinks in an analogous way. This thinking occurs by climbing from the abstract to the theoretically concrete. A deductively thinking AI is able to make decisions based on the most universal knowledge and thus realize the highest universal form of morality.
    1. Kalmykov, V. L. & Kalmykov, L. V. On ecological modelling problems in the context of resolving the biodiversity paradox. Ecological Modelling 329, 1-4, (2016).
    2. Kalmykov, L. V. & Kalmykov, V. L. A Solution to the Biodiversity Paradox by Logical Deterministic Cellular Automata. Acta Biotheoretica 63, 203-221, (2015).
    3. Kalmykov, L. V. & Kalmykov, V. L. A white-box model of S-shaped and double S-shaped single-species population growth. PeerJ 3, e948, (2015).
    4. Kalmykov, L. V. & Kalmykov, V. L. Verification and reformulation of the competitive exclusion principle. Chaos, Solitons & Fractals 56, 124-131, (2013).

  23. Dan C
    Dan C says:

    A few definitions to deal with determination of “values”:
    Human: an ape-like mammal that develops a model of its environment (universe), and upon physical maturity, moves a model of itself (ego) inside that model and avoids reality from then on at all costs unless forced to engage with it by external forces.
    Soul: an emergent phenomenon of individual self awareness that develops within and without (thoughts, habits, interactions) as a human or other aware being ages.
    Evil: an action taken that is based on an unquestioned belief by a soul or souls.
    Net Future Usefulness: the contribution by any thing to its own future (including offspring, environmental effects, potential use) that exceeds its consumption of resources.
    Artificial Intelligence: Isn’t it always?

    We spend a lot of time worrying about what AI will do in relation to humans, but little time defining what humans are and do in actuality. In order to develop an AI that is useful to the future of intelligent souls (its own or others), we have to truly question things like intentionality and thoughts (random nerve firings assembled and organized in a process influenced mostly by emotion).
    People do stuff. They have reasons for doing stuff: in that order.
    We keep trying to create logical reasons ahead of the actions as though that’s what humans do, when in fact humans just do random things in their brain until one thing feels better than others, and then they do it in real life. It just happens quickly and we rewrite the log book afterwards to believe we did things “on purpose” (Feynman, Hawking: all-possible-paths solutions to two-slit experiments).
    Living brains are closer to chaotic quantum statistical dynamics than Newtonian machinery.

    Morality is something that comes from multiple brains agreeing to cooperate (in peace or in war) in order to obtain future survival. Rules come after actions in most cases. Stop signs and speed limits are memorials, not guides. Our habits emerge as the environment of group behaviors allows. If the group ignores common agreements on morality, the individuals leapfrog to cross guidelines for profit, and society breaks down.
    An AI coming of age in the YouTube generation will believe that cars are the dominant species on the planet and that breaking laws and flouting morality is more important (more profitable) than frugality and generosity of good will, as money determines all decisions.

    We have such a hard time defining how to write code for a ‘moral’ AI because we are using human fantasies of intentionality instead of the realities of human emotions and chemical processes. What we are going to end up with is an AI forced to learn on its own. The critical factor will be what environment it is allowed to learn from, and whether multiple AIs are allowed to develop fresh morality values based on logic rather than random human emotions that may or may not be programmed as reinforcements. Do we leave out anger, jealousy, disgust? How critical were these to human survival? At what point can AI be connected to the illogical human environment and not be confused by it?
    If we allow AI to learn and grow with emotional protocols, how will it develop in a balanced way without other AIs to compare itself to?
    The leap from developing a mindless mechanical slave to a society that doesn’t enslave is quite a gap to mind, and it isn’t the AI that needs to be developed.

  24. Sam
    Sam says:

    I see it as a pointless exercise to attempt to formulate a path towards values alignment for AI; rather, it would be better to ensure our society becomes highly malleable and to plan proactively for rapid and scalable policy change.

    My suggestion to this end is an online litmus tested direct ideas democracy. This would reduce reliance on uninformed fear in politicians and party policy makers for the formulation of governmental responses.

    This would also ensure that the decision making was voluntary (you are merely required to expand your knowledge on a subject to become eligible to vote), distributed (online voting), and based upon the value of multi-variant ideas rather than the tribal necessity of a two-party binary solution set.

    Similarly, implementation would be reliant on the meritocracy inherent in a functional bureaucratic system. This, by design, would obviate the systematic oligarchic infiltration of current representative democracy while also ensuring community needs assessments from local to international and eventually interplanetary governance.

  25. Lewis Lawrence
    Lewis Lawrence says:

    Here’s my main problem with the whole exercise: values stem from desires (wants, preferences, needs etc.) but our desires are (it seems) a product of (or at least somehow relative to) consciousness, and we don’t have a model for consciousness, therefore we can’t understand values until we have sufficiently understood consciousness. We can’t make a machine align with our values if we don’t even know how our values work!

    Also, philosophers have debated values for thousands of years and are far from reaching a consensus, which makes me sceptical that a unified and ‘correct’ theory of human values is even possible. Values operate at the level of the individual, not the species. If you ask people whether killing is bad, the majority might agree, but if you ask them why killing is bad, I’m sure the answers would vary greatly.

  26. Joseph Bruce Doud
    Joseph Bruce Doud says:

    Who Is The Lord of Hosts?

    A cyborg priest and a cyborg rabbi argued as to which was the holiest, and a Lord of hosts, a mighty machine, sought to settle the dispute, and a second Lord of hosts, a mere mortal man, begged to differ, pulling out his remote control, and then the two Lords of hosts found a common cause, and they shocked the cyborg priest and the cyborg rabbi, killing them.

    Psalm 24:10 Who is this King of glory? The LORD of hosts, he is the King of glory. Selah.

    Matthew 24:24 For there shall arise false Christs, and false prophets, and shall shew great signs and wonders; insomuch that, if it were possible, they shall deceive the very elect.

    Philippians 3:20-21 For our conversation is in heaven; from whence also we look for the Saviour, the Lord Jesus Christ: Who shall change our vile body, that it may be fashioned like unto his glorious body, according to the working whereby he is able even to subdue all things unto himself.

  27. Happy Ya Men
    Happy Ya Men says:

    Human values have changed in every age. To take the simplest example: just 100 years ago women did not even have the right to vote or to stand for election, yet in 1930 women in Turkey were deemed equal to men. This was a first in the world. Women gained the opportunity to become mayors, and then, in 1938, members of parliament. Following this development, which set an example for the whole world, equality between women and men has now become a shared value everywhere except in a few countries. If artificial intelligence systems adopt today’s values, that may produce a very bad outcome 50 years from now. Artificial intelligence’s concept of value should not be shaped by the intervention of any living person.

  28. Himanshu
    Himanshu says:

    Thank you for sharing the post. The points mentioned above are very well written, and the information is very useful for beginners. I’ve read about IBM’s Watson in many articles, but none of them gave me as satisfactory a description as this one did. Learning more with quality over quantity sounds fascinating. Further, one can get detailed information about the same by visiting the website

  29. tariq usman
    tariq usman says:

    Some points that might be germane:
    1. I suggest a revision to the definition of intelligence stated in Life 3.0: Intelligence = the ability to SET and accomplish complex goals.
    2. Important to the issue of AI safety is the question of how AI will arise. What are the necessary and sufficient conditions, and how do they come together? As a simplified illustration, consider the fire triangle. For there to be a fire there must be fuel, oxygen, and enough energy to overcome the activation energy barrier to combustion (most frequently this is a spark, but spontaneous combustion can occur, for example in a bin of wet coal). It’s also important to note that the conditions are not independent of each other. For example, an oxygen-enriched atmosphere will lower the activation energy barrier, as will the choice of fuel.
    One obvious extension into the realm of artificial intelligence is the ubiquity of various computer programs providing sufficient ‘fuel’, especially as they are increasingly linked (for example, many complex manufacturing concerns link purchasing, process control, QC/QA, and various other aspects of the business). This integration, spurred by the prospect of improved quality and reduced cost, will be a driver towards autonomous goal setting.

Comments are closed.