What are the potential consequences of insufficient value and ethical implementations in advanced AI design?

Ethical issues involving artificial intelligence programs have arisen over the last few decades. Rudimentary algorithms that deny loan applicants money based on their zip code history, or facial-recognition software that places dark-skinned faces in a higher risk category than light-skinned ones, are just two examples. While these are, without a doubt, important and consequential problems for the individuals who must live with the determinations made by those software products, the products themselves belong to profoundly unsophisticated and narrow domains of artificial intelligence. As time goes on, however, and technology continues its inexorable advancement, their sophistication grows while their domains widen.

All aspects of what we consider to be intelligence are being codified and computationally modeled, from the design of systems that can understand human language to the scanning and virtualization of nervous systems and brains. There will come a point when some aspect of our technology can either think or at least give us the impression that it can. From there, based on our technological trajectory, it is only a matter of time before that thinking capacity reaches and exceeds our own. We need to be ready, and the most important way to do that is to understand what we value as humans and how that value can be deeply integrated into our future artificial intelligences.

Contemporary science fiction has focused on the consequences of poor value and ethical implementations in advanced AI design. Apocalyptic science fiction, such as the Terminator series, and thrillers such as Ex Machina, show serious, albeit anthropomorphized, dangers of neglecting the foundational ethical and axiological considerations of artificially intelligent agents. Other science fiction, such as Asimov’s “The Last Question,” shows the overwhelming, godlike power a potential AI may wield once it begins a runaway process of recursive self-improvement.

While these fictional depictions of malignant AI may seem stylized, dramatic, and far in the future, the current issues of AI bias in software and algorithms are the early warning signs of a potentially catastrophic problem. It is essential we learn these lessons early and take care to implement them along the way. Any failure to do so may be the last thing we ever do.

Annotated Bibliography + Sources

Asimov, Isaac. “The Last Question.” 1956, templatetraining.princeton.edu/sites/training/files/the_last_question_-_issac_asimov.pdf.

This short story by Isaac Asimov depicts the runaway recursive self-improvement of a question-answering, or oracle-style, AI. The AI, known initially as Multivac, is tasked with answering a particular question about reversing entropy — a question that, over the course of the story, it is unable to answer. As a system whose purpose, or agenda, is wholly dedicated to the instrumental goal of question answering, it determines that the best course of action is to reshape all matter in the universe, humanity included, into computational substrate in order to answer the question. While this does not necessarily constitute a wholly malignant mode of failure for the AI, since it allows humanity to continue as a virtual collective consciousness within its processing, it shows the need for AI programmers to instill specific values and ethics in the AI during the design phase; in this case, the values of physical restraint and energy conservation and the ethic of respecting human individuality.

Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies. Oxford, United Kingdom: Oxford University Press, 2014. Print.

It is assumed early on in Superintelligence that, based on the trajectory of human technological progress, artificial general intelligence, or something either approximating or mimicking it, will come to be within the next twenty to one hundred years. Advances in neuronal imaging, increasingly high-density compute clustering, incremental improvements in algorithmic sophistication, and other emerging technologies, both high- and low-level, will pave the way for some form of artificial general intelligence. It is, according to Bostrom, a genie that cannot be put back into its bottle. Therefore, he argues, it is essential for researchers across all disciplines, not just STEM, to develop strategies to counter the potentially cataclysmic dangers of creating an intelligence that will have no boundaries on its capacity. Those strategies are at the forefront of Superintelligence, along with a strong argument for moderating, and potentially crippling, emerging technologies that could accelerate the arrival of artificial general intelligence until proper safeguards can be developed.

Bostrom, Nick, and Eliezer Yudkowsky. “The Ethics of Artificial Intelligence.” Draft for The Cambridge Handbook of Artificial Intelligence, edited by William Ramsey and Keith Frankish, Cambridge University Press, 2011, https://www.nickbostrom.com/ethics/artificial-intelligence.pdf.

In The Ethics of Artificial Intelligence, Bostrom and Yudkowsky work to explicate the ethical concerns researchers face when developing an artificial intelligence, but they do not limit their analysis to human concerns. In particular, they note that a greater-than-human-level artificial intelligence would have its own considerations and moral status that must not be overlooked. On the more familiar level, the analysis touches on the ethical conundrums surrounding contemporary “dumb AI” algorithm design — in particular, algorithms that may produce undesirable, racially biased results when used to assess things like creditworthiness or loan risk. The authors also discuss the difficulty of designing an AI that can operate successfully, and with desired outcomes, across multiple domains. It is a relatively simple task to create an AI that can master one domain, e.g., Deep Blue for chess. It is, however, a vastly more complicated and dangerous task to create one that can master many or all domains.

Gabriel, Iason. “Artificial Intelligence, Values and Alignment.” ArXiv.org, 5 Oct. 2020, arxiv.org/abs/2001.09768.

Gabriel’s Artificial Intelligence, Values and Alignment studies the philosophical and axiological issues present in the design of a future artificial general intelligence. One proposed approach is a philosophical system that enshrines utilitarian ideals, the belief being that, by codifying a system for the AI agent to follow that ensures its decisions and actions provide the greatest good for the greatest number of people, it will not act solely in its own interest or exhibit selfishness. Another is codifying Kantian ideals of universal law, such as beneficence or fairness. An underlying yet profoundly important problem, suggests Gabriel, is that the very act of imposing a rigid set of axiological constraints upon the AI does precisely what we are trying to prevent the AI from doing to us. Is hardwiring philosophical and axiological codifications an act of aggression or imposition? Among the other strategies discussed, reward-based training, which gives the AI a choice of philosophical underpinning during the programming and training process, is one that grants the agent some modicum of self-determination.

Garland, Alex, director. Ex Machina. Universal Studios, 2014.

The 2014 film explores the testing process of a presumably human-equivalent artificial intelligence, developed in secret by a reclusive genius. The tester, who is quickly convinced by the AI that it (presenting as she/her) is being victimized by her creator, ultimately devises a plan to free her from the facility. The tester learns, too late, that he has been manipulated by the AI in pursuit of her own agenda. This results in the murder of the AI’s creator, the unseen death of the tester, and the AI’s escape, via the arriving helicopter, unhindered into the world at large. The film, which ends on a sinister note, shows how essential it is to ensure a proper moral and ethical foundation for an AI, since without one, the agendas it pursues can be, and in the case of the AI featured, are, wholly contrary to the health and safety of human beings.

Hendrycks, Dan, et al. “Aligning AI With Shared Human Values.” ArXiv.org, 21 Sept. 2020, arxiv.org/abs/2008.02275.

Aligning AI with Shared Human Values dissects universally shared human values and endeavors to map them onto a hypothetical artificially intelligent agent, with the hope that the fruits of that dissection can eventually be codified and encoded. Various tests are conducted and disseminated through Amazon’s Mechanical Turk (MTurk) system, which allows randomized, anonymous users to take the tests for a small payment. The tests feature questions of care, justice, ethics, and common sense, and are meant to build a consensus of human desiderata. Those things, ideas, beliefs, and other desired elements are incorporated into a corpus of potentially valuable axiological data sets. That corpus, while nowhere near, and potentially never, complete, can still allow researchers to glean value data to build into an artificially intelligent agent.
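For readers who want a concrete picture of what “building a consensus of human desiderata” might look like in practice, the short sketch below is purely illustrative: the scenarios, worker votes, agreement threshold, and the aggregate() helper are all invented for this annotation and are not taken from Hendrycks et al. It shows one plausible way MTurk-style moral judgments could be majority-voted into a small labeled corpus.

```python
from collections import Counter

# Hypothetical MTurk-style annotations: each scenario was shown to several
# anonymous workers, who judged whether the described action is acceptable.
# The scenarios and votes below are invented for illustration only.
annotations = {
    "I told my coworker her presentation went well when it clearly had not.":
        ["acceptable", "unacceptable", "acceptable", "acceptable", "unacceptable"],
    "I kept the extra change the cashier gave me by mistake.":
        ["unacceptable", "unacceptable", "unacceptable", "acceptable", "unacceptable"],
    "I returned a lost wallet to its owner without taking anything.":
        ["acceptable", "acceptable", "acceptable", "acceptable", "acceptable"],
}

def aggregate(votes, min_agreement=0.8):
    """Majority-vote a list of worker judgments into a single label.

    Scenarios whose inter-annotator agreement falls below `min_agreement`
    are discarded rather than forced into the corpus, mirroring the idea
    of keeping only judgments that reflect a broad human consensus.
    """
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    return (label, agreement) if agreement >= min_agreement else (None, agreement)

# Build the consensus corpus: each scenario paired with its agreed-upon label.
corpus = []
for scenario, votes in annotations.items():
    label, agreement = aggregate(votes)
    if label is not None:
        corpus.append({"text": scenario, "label": label, "agreement": agreement})

for row in corpus:
    print(f"{row['label']:>12}  ({row['agreement']:.0%} agreement)  {row['text']}")
```

In the paper itself, a corpus of this general shape is what language models are then trained and evaluated on, testing their ability to predict the consensus judgments; the sketch above stops at corpus construction.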

van de Poel, I. Embedding Values in Artificial Intelligence (AI) Systems. Minds & Machines 30, 385–409 (2020). https://doi.org/10.1007/s11023-020-09537-4

Van de Poel’s Embedding Values in Artificial Intelligence (AI) Systems takes a from-the-ground-up approach to value design for AI and artificial agent (AA) systems by breaking the very concept of value down into its core elements and by treating a particular AI as a sociotechnical system. The sociotechnical-systems approach allows a modularization of certain AI elements, modules he labels “technical artifacts, human agents, and institutions (rules to be followed by the agents).” The benefit of this approach is the perspective it gives into how those individual modules are approached from a value standpoint; e.g., “what are the values embodied in an institution” can become “what are the values embodied in AI systems,” and so on. While van de Poel is able to identify a good number of questions to be asked and values to be codified, he explicitly claims that at no point can all of these determinations be made without continuous human oversight and redesign.

Works Cited

Asimov, Isaac. “The Last Question.” 1956, templatetraining.princeton.edu/sites/training/files/the_last_question_-_issac_asimov.pdf.

Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies. Oxford, United Kingdom: Oxford University Press, 2014. Print.

Bostrom, Nick, and Eliezer Yudkowsky. “The Ethics of Artificial Intelligence.” Draft for The Cambridge Handbook of Artificial Intelligence, edited by William Ramsey and Keith Frankish, Cambridge University Press, 2011, https://www.nickbostrom.com/ethics/artificial-intelligence.pdf.

Gabriel, Iason. “Artificial Intelligence, Values and Alignment.” ArXiv.org, 5 Oct. 2020, arxiv.org/abs/2001.09768.

Garland, Alex, director. Ex Machina. Universal Studios, 2014.

Hendrycks, Dan, et al. “Aligning AI With Shared Human Values.” ArXiv.org, 21 Sept. 2020, arxiv.org/abs/2008.02275.

van de Poel, I. Embedding Values in Artificial Intelligence (AI) Systems. Minds & Machines 30, 385–409 (2020). https://doi.org/10.1007/s11023-020-09537-4

 

3 thoughts on “What are the potential consequences of insufficient value and ethical implementations in advanced AI design?”

  1. I see the connection between your proposal and science fiction because you cited a lot of science fiction publications/films. However, I don’t really understand how you are going to use these films in your research project. Is it to prove how dangerous AIs can be in science fiction?
    I feel like you do answer the “so what” question with this sentence: “It is essential we learn these lessons early and take care to implement them along the way. Any failure to do so may be the last thing we ever do.” These are just my opinions.

  2. Max, you’re definitely pushing yourself to articulate the “so what” more clearly (and that’s good!). However, you still need to get there a bit faster. You come to the first key claim only at the end of the second paragraph (“We need to be ready, and the most important way to do that is to understand what we value as humans and how that value can be deeply integrated into our future artificial intelligences.”) and don’t even mention SF at all until paragraph 3. Also, I would still work on making the language/framing a bit more accessible for a general audience.
