Please provide a short (approximately 100 word) summary of the following web Content, written in the voice of the original author. If there is anything controversial please highlight the controversy. If there is something surprising, unique, or clever, please highlight that as well.

Content:
Title: More Than Just Algorithms
Site: queue.acm.org

March 27, 2023
Volume 21, issue 1

Case Study

More Than Just Algorithms

A discussion with Alfred Spector, Peter Norvig, Chris Wiggins, Jeannette Wing, Ben Fried, and Michael Tingley

Dramatic advances in the ability to gather, store, and process data have led to the rapid growth of data science and its mushrooming impact on nearly all aspects of the economy and society. Data science has also had a huge effect on academic disciplines with new research agendas, new degrees, and organizational entities.

Recognizing the complexity and impact of the field, Alfred Spector, Peter Norvig, Chris Wiggins, and Jeannette Wing have completed a new textbook on data science, Data Science in Context: Foundations, Challenges, Opportunities, published in October 2022 [6]. With deep and diverse experience in both research and practice, across academia, government, and industry, the authors present a holistic view of what is needed to apply data science well.

Ben Fried, a venture partner at Rally Ventures and formerly Google's CIO for 14 years, and Michael Tingley, a software engineering manager at Meta, gathered the authors together as they were finishing up the manuscript to discuss the motivation for their work and some of its key points.
Norvig is a Distinguished Education Fellow at Stanford HAI (Human-centered Artificial Intelligence) and a research director at Google; Spector is a visiting scholar at MIT with previous positions leading engineering and research organizations; Wiggins is an associate professor of applied mathematics at Columbia University; and Jeannette Wing is executive vice president for research and professor of computer science at Columbia University. (More biographic detail on the panelists is available at the conclusion of this article.)

Ben Fried You've come to data science from very different backgrounds. Was there a shared inspiration to write the book?

Alfred Spector In one way or another, I think we all saw a deep and growing polarity in data science. On the one hand, it has enormous, unprecedented power for positive impact, which we'd each been lucky enough to contribute to; on the other hand, we had seen serious downsides emerge even with the best of intentions, often for reasons having little to do with the technical skills of the practitioner. There are many excellent texts and courses on the science and engineering of the field, but it seems like there is something in the headlines every day that demonstrates there is an urgent need to educate on what you, Ben, have called the "extrinsics" of the field.

Peter Norvig Throughout the rapid growth in applications of data science, there have been serious issues to confront: click-fraud, the early Google bombs, data leaks, abusive manipulation of applications, amplification of misinformation, overinterpretation of correlations, and so many more, all things we read about daily. Some problems are more serious than others, but we feel education will help us to lessen their frequency and severity, while simultaneously allowing us to understand their significance.

BF Why the word Context in the title of your book?

Chris Wiggins It was our primary motivator.
In a nutshell, we wanted to provide some inclusive "context" for the data-science discipline. We felt the term data science is often used too narrowly.

AS We think of context in three ways. It refers to the topics beyond just the data and the model. These include dependability, clarity of objectives, interpretability, and other things I'm sure we'll get into. It also refers to the domain in which data science is being applied. What is crucial for certain applications isn't needed for others. Teams practicing data science must be particularly sensitive to the uses to which their work will be put. Finally, context refers to the societal views and norms that govern the acceptance of data-science results. Just as we have seen changing views and norms regarding privacy and fairness, data science will increasingly be expected to solve challenging problems, where societal views vary by region and over time. Some of these problems are "wicked," in C. West Churchman's language [2], and they are so very different from the problems that computing first addressed.

Jeannette Wing While data science draws from the disciplines of computer science, statistics, and operations research to provide methods, tools, and techniques we can apply, what we do will vary according to whether we're working on a healthcare issue, something related to autonomous driving, or perhaps exploring some particular aspect of climate change. Just as each discipline comes with its own constraints, the same might be said of each of these different problem domains. Which is why the application of data science is largely defined by the nature of the problem we're looking to solve or the task we're trying to complete.

PN Beyond this, I personally wanted to reach a broader audience than I had with my more mathematical and algorithmic textbook. To do data science, we need to know many techniques, but we also need to be conversant with larger, societal issues. We all shared this motivation.
BF All this leads to the question of how you define data science.

JW By the time Alfred and I first started talking about working on a book, I was already writing papers and giving talks where I defined data science as the study of "extracting value from data." But we agreed that this definition was too high level and insufficiently operational.

AS So, we started with "extracting value from data," then added prose to address the two personalities of the field: one where data is used to provide insight to people (as in many uses of statistics) and the other having to do with data science's ability to enable programs to reach conclusions.

CW We also recognized we needed a capacious definition [see sidebar] to respect what people are doing in the name of data science within industry and academia, as well as the rapidity of change in the field.

Definition of Data Science

Data science is the study of extracting value from data, value in the form of insights or conclusions.

A data-derived insight could be:
• An hypothesis, testable with more data.
• An "aha!" that comes from a succinct statistic or an apt visual chart.
• A plausible relationship among variables of interest, uncovered by examining the data and the implications of different scenarios.

A conclusion could be in an analyst's head or in a computer program. To be useful, a conclusion should lead us to make good decisions about how to act in the world, with those actions taken either automatically by a program or by a human who consults with the program.

A conclusion may be in the form of a:
• Prediction of a consequence.
• Recommendation of a useful action.
• Clustering that groups similar elements.
• Classification that labels elements in groupings.
• Transformation that converts data to a more useful form.
• Optimization that moves a system to a better state.

Taken from Data Science in Context: Foundations, Challenges, Opportunities [6].

BF It's a very fluid definition.
Not only does data science mean different things to different people, it also has fuzzy boundaries.

CW Exactly! We're at that time in the creation of a new field where it does have fuzzy boundaries. It touches on many different subjects: privacy/security, resilience, public policy, ethics, etc. But it's also clearly taking form with the creation of job titles, degrees, and departments. We saw an opportunity to take a stab at defining its breadth, starting with the diverse challenges its practitioners must overcome.

Michael Tingley Do you make a distinction between data science and machine learning?

AS As a domain, data science is broader than machine learning, in that machine learning is only one of the techniques it employs. Data science encompasses many techniques from statistics, operations research, visualization, and many more areas: in fact, all the things needed to bring insights and conclusions to a worthwhile end. That being said, the revolutionary growth in machine learning has absolutely catalyzed the most change: incredible successes but some challenges too.

PN One difference is that, in the machine-learning arena, a researcher's focus might be to write a paper that touts some new algorithm or some tweak to an existing algorithm. Whereas, in the data-science sphere, research is more likely to talk about a new dataset and how to apply a collection of techniques to use it.

BF So, you were motivated by the breadth of challenges we face. Where did you end up? Are there approaches that can help?

Analysis Rubric
• Tractable data
• Technical approach
• Dependability
• Understandability
• Clear objectives
• Tolerance of failures
• Ethical, legal, societal implications

PN After lots of give and take, we came up with something we call an analysis rubric, where we enumerate the elements a data scientist needs to take into account.
As Atul Gawande writes in The Checklist Manifesto [3], checklists such as our rubric make for better solutions, and we hope ours might help people avoid some of the mistakes we have made in past projects. But because each project is different, it's hard to come up with one checklist that will work across all of them, so we'll see how well it holds up to the test of time.

AS Let's be specific. The analysis rubric addresses the challenges in seven categories. Some relate more to how we implement or apply data science. The others relate more to the requirements we are trying to satisfy.

PN The rubric starts with data: getting and storing it, wrangling it into a useful form, ensuring privacy, ensuring integrity and consistency, managing sharing and deletion, etc. In some ways, this may be the hardest part of a data-science project. For me, the first big revelation of data science was that data can be a key asset that offers real value [4]. But the second revelation was that data can be a liability if you're not a good shepherd for it.

BF Are there hidden costs to holding onto data?

PN I've learned something in this regard from all the efforts that have been made in recent years to advance federated learning. In earlier days, if a team wanted to build a better speech recognition system, it would import all the data into one location and then run and optimize a model there until they had something they could launch to users. But then that would have meant holding onto all these people's private conversations, with concomitant risks. As a field, we decided it would be best if you didn't hold onto that information but instead optimized each person's data privately while figuring out some clever way to share the optimizations made individually with multiple people in a federated learning framework. This federated approach seems to be working out pretty well. The privacy concerns have ended up leading to a pretty good scientific advancement.
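To make the federated idea above concrete, here is a minimal sketch of federated averaging, one commonly used scheme. The interview does not name a specific algorithm, so this choice is an assumption. Every name in it (`local_update`, `federated_average`, `train`, `make_client`) and the synthetic linear data are hypothetical illustrations, not anything from the book or a production system. Each simulated client fits a tiny model on its own data, and only the fitted parameters, never the raw data, are sent back to be averaged.

```python
import random


def local_update(weights, data, lr=0.5, epochs=5):
    """Run a few gradient-descent steps on one client's private data.

    `data` is a list of (x, y) pairs for the toy model y ~ w*x + b.
    Only the updated (w, b) leaves the client; `data` never does.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Combine client models, weighting each by its number of examples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=30):
    """Server loop: broadcast the model, collect local updates, average."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in clients]
        weights = federated_average(updates, [len(d) for d in clients])
    return weights


def make_client(rng, n):
    """Synthetic 'private' data drawn from y = 2x + 1 plus small noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 2 * x + 1 + rng.gauss(0, 0.01)))
    return data


rng = random.Random(0)
clients = [make_client(rng, n) for n in (20, 30, 50)]
w, b = train(clients)
print(f"learned w={w:.3f}, b={b:.3f}")
```

In this toy setup the averaged model should land close to the generating values w = 2 and b = 1. Real deployments typically layer further protections on top, such as secure aggregation or added noise, which this sketch leaves out.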
AS Our second rubric element is the most obvious. There needs to be a technical approach, which can come from machine learning, statistics, operations research, or visualization. This offers a way to provide valuable insight and conclusions, whether prediction, recommendation, or the others. It isn't easy to find a model in some situations. Sometimes there is just too much inherent uncertainty, and other times the world may continually change and make modeling efforts ineffective. Some situations are game-theoretic, and a model's conclusions themselves generate feedback that makes the world less predictable. One example of the limitations of modeling has been to predict what might happen due to Covid-19. For many reasons relating to limitations of data, rapidly changing policy, variations in human behavior, and virus mutations, the ability to make long-term predictions of mortality has been poor.

BF Are you saying data science didn't help at all in the war on Covid?

PN I was involved in a project with an intern and some statisticians at UC Berkeley where we were trying to give hospitals advance notice of how many staffers they would need to bring in three days ahead of time. We couldn't give them accurate predictions 30 days in advance, but we could do useful short-term predictions.

JW And for sure, data science was applied successfully in many other areas, most obviously in the vaccine and therapeutics trials.

BF We could devote our whole time to models, but given the topic's broad coverage, let's move to the next rubric element: dependability.

JW With data science being used in ever more important ways, dependability is of increasing importance, and we include four subtopics under it: Are the privacy imp