The Ethical Algorithm: The Science of Socially Aware Algorithm Design

Author:

Michael Kearns, Aaron Roth

Abstract

THE ETHICAL ALGORITHM: The Science of Socially Aware Algorithm Design by Michael Kearns and Aaron Roth. New York: Oxford University Press, 2019. 232 pages. Hardcover; $24.95. ISBN: 9780190948207.

Can an algorithm be ethical? That question appears similar to asking whether a hammer can be ethical. Isn't the ethics solely a matter of how the hammer is used? Using it to build a house seems ethical; using it to harm another person would be immoral.

That line of thinking would be appropriate if the algorithm were something as simple as a sorting routine. Sorting the names in a wedding guest book so that thank-you cards can be sent more systematically would be acceptable; sorting a list of email addresses by education level in order to target people with a scam would be immoral.

The algorithms under consideration in The Ethical Algorithm are of a different nature, and the ethical issues are more complex. These algorithms are of fairly recent origin. They arise as we try to use vast collections of data to make more-accurate decisions: for example, using income, credit history, current debt level, and education level to approve or reject a loan application. A second example would be the use of high school GPA, ACT or SAT scores, and extracurricular activities to determine college admissions.

In the admissions example, a machine-learning technique (machine learning is a branch of artificial intelligence) examines the outcomes of past student admissions and determines a set of criteria that distinguish, with minimal errors, between those past students who graduated and those who didn't. That set of criteria (called a "model") can then be used to predict the success of future applicants.

The ethical component is important because such machine-learning algorithms optimize with particular goals as targets.
And there tend to be unintended consequences--such as higher rates of rejection of applicants of color who would actually have succeeded. The solution to this problem requires more than just adding social equity goals to what is being optimized--although that is an important step.

The authors advocate the development of precise definitions of the social goals we seek, and then the development of algorithmic techniques that help achieve those goals. One important example is the social goal of privacy. What follows leaves out many important ideas found in the book, but illustrates the key points. Kearns and Roth cite the release in the mid-1990s of a dataset containing medical records for all state employees of Massachusetts. The dataset was intended for the use of medical researchers. The governor assured the employees that identifying information had been removed--names, social security numbers, and addresses. Two weeks later, Latanya Sweeney, a PhD student at MIT, sent the governor his medical records from that dataset. It cost her $20 to legally purchase the voter rolls for the city of Cambridge, MA. She then correlated the voter rolls with other publicly available information to rule out every person in the medical dataset except the governor himself.

Achieving data privacy is not as simple as was originally thought. To make progress, a good definition of privacy is needed. One useful definition is the notion of differential privacy: "nothing about an individual should be learnable from a dataset that cannot be learned from the same dataset but with the individual's data removed" (p. 36). The protection must also hold when multiple datasets are merged (for example, the medical records from several hospitals, from which we might be able to identify an individual by looking for intersections on a few key attributes such as age, gender, and illness). One way to achieve this goal is to add randomness to the data.
This can be done in such a way that the probability of identifying an individual changes very little when that person's data is added to or removed from the dataset.

A very clever technique for adding this random noise is randomized response, an idea introduced in the 1960s to get accurate information in polls about sensitive topics (such as, "have you cheated on your taxes?"). The respondent is told to flip a coin. If it is a head, answer truthfully. If it is a tail, flip a second time and answer "yes" if it is a head and "no" if it is a tail. Suppose the true proportion of people who cheat on their taxes is p. About half of the respondents answer truthfully, contributing ½ p to the "yes" responses, and the other half answer "yes" half of the time, contributing ¼. So with a sufficiently large sample size (larger than needed for surveys that are less sensitive), the measured proportion, m, of "yes" responses will be close to ¼ + ½ p. We can then approximate p as 2m - ½, and still give individuals reasonable deniability. If I answer "yes" and a hacker finds my record, the answer proves little: a person whose true answer is "no" still reports "yes" 25% of the time. My privacy has been effectively protected. So we can achieve reasonable privacy at the cost of needing a larger dataset.

This short book discusses privacy, fairness, multiplayer games (such as using apps to direct your morning commute), pitfalls in scientific research, accountability, the singularity (a future time when machines might become "smarter" than humans), and more. Sufficient detail is given so that the reader can understand the ideas and the fundamental aspects of the algorithms without requiring a degree in mathematics or computer science.

One of the fundamental issues driving the need for ethical algorithms is the unintended consequences that result from well-intended choices. This is not a new phenomenon--Lot made a choice based on the data he had available: "Lot looked about him, and saw that the plain of the Jordan was well watered everywhere like the garden of the Lord, like the land of Egypt ..." Genesis 13:10 (NRSV).
But by choosing that apparently desirable location, Lot brought harm to his family.

I have often pondered the command of Jesus in Matthew 10:16 where he instructs us to "be wise as serpents and innocent as doves." Perhaps one way to apply this command is to be wise as we are devising algorithms to make sure that they do no harm. We should be willing to give up some efficiency in order to achieve more equitable results.

Reviewed by Eric Gossett, Department of Mathematics and Computer Science, Bethel University, St. Paul, MN 55112.
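
The randomized-response procedure described in the review can be illustrated with a short simulation. What follows is only a minimal sketch in Python and is not taken from the book: the function names, the seed, and the sample size are illustrative assumptions. Each simulated respondent follows the two-coin-flip rule, and the estimate of p is recovered from the measured "yes" rate m as 2m - ½.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report an answer using the two-coin-flip rule described in the review."""
    if rng.random() < 0.5:
        # First flip is a head: answer truthfully.
        return truth
    # First flip is a tail: a second flip decides the answer ("yes" on a head).
    return rng.random() < 0.5


def estimate_true_proportion(responses: list[bool]) -> float:
    """Invert m = 1/4 + p/2 to estimate p from the observed 'yes' rate m."""
    m = sum(responses) / len(responses)
    return 2 * m - 0.5


def simulate(p: float, n: int, seed: int = 0) -> float:
    """Simulate n respondents, a fraction p of whom truly answer 'yes'."""
    rng = random.Random(seed)  # seed is an illustrative choice for repeatability
    truths = [rng.random() < p for _ in range(n)]
    responses = [randomized_response(t, rng) for t in truths]
    return estimate_true_proportion(responses)


if __name__ == "__main__":
    # With a large sample, the estimate should land near the true value 0.10.
    print(simulate(p=0.10, n=200_000))
```

Running the simulation with a much smaller n gives noticeably noisier estimates, which matches the review's point that the privacy gained is paid for with a larger sample.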

Publisher

American Scientific Affiliation, Inc.
