November 17, 2017
In this 2016 talk, Oxford University's Owen Cotton-Barratt discusses how effective altruists can improve the world, using the metaphor of someone looking for gold. He discusses a series of key effective altruist concepts, such as heavy-tailed distributions, diminishing marginal returns, and comparative advantage.
This write-up of Owen's talk has been moderately edited for clarity. It isn't meant to be a complete transcript.
The central metaphor running through this article is thinking about Effective Altruism as 'mining for gold'. I'm going to use this metaphor to illustrate different points. Gold, here, is standing in for whatever it is that we truly value. This could include making more people happy and well-educated, trying to avert a lot of suffering, or trying to increase the probability that humanity makes it out to the stars. When you read "gold", take a moment to think about what you value (it may not be just one particular thing) and put that in place of the gold.
Figure 2 is a photo of Viktor Zhdanov, whom I learned about in Will MacAskill's book 'Doing Good Better'. He was a Ukrainian biologist who was instrumental in getting an eradication program for smallpox started. As a result, he was probably counterfactually responsible for saving tens of millions of lives.
Obviously, we don't all achieve as much as this. But by looking at examples like Zhdanov we can notice that some people manage to get a lot more gold, or whatever it is we altruistically value, than others. This is reason enough to make us ask questions like:
What is it that gives some people better opportunities than others? How can we go and find opportunities like that?
Others have discussed where the gold is and treasure maps to find it. I'm not going to do that in this article. Instead I'm going to focus on the tools and techniques that we can use for locating gold, rather than trying to give my view of where it is directly.
I want to begin by saying a bit about why I am even using a metaphor. The things we care about are big, complicated, and valuable; so why would I try and reduce that down to gold? It's because I want this article to focus on techniques, tools, and approaches that we can use. If you have complex values, these would just keep pulling your attention. But a lot of the things that we might do to identify where valuable things are, and how to go and achieve them, are constant, regardless of what the valuable thing is. In replacing them with a super simple stand-in for value, I think it helps to put the focus on a level above discussions of what is valuable.
My first point is that our gold, like literal gold, is pretty unevenly spread around the world. There are lots of places with almost no gold at all, and then there are a few places where there is a big seam of gold running into the ground. This has a number of implications. One is that we would really like to find those seams.
The next point is about sampling. For example, if I want to know roughly how tall people are, sampling five people and taking their average height would not be a bad methodology. However, if I want to know on average how much gold there is in the world, sampling five random places and measuring that is not a great methodology. It's quite likely I will find five places where there is no gold and significantly underestimate. Or possibly, one of them will have a load of gold and I will have an inflated sense of how much gold there is in the world.
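A small simulation can make the sampling problem concrete. This is only a sketch: the normal distribution used for heights (mean 170 cm, standard deviation 10 cm) and the Pareto distribution used for gold (shape 1.1) are illustrative assumptions, not figures from the talk.

```python
import random

def fraction_close(draw, true_mean, trials=10_000, sample_size=5, tolerance=0.10):
    """Fraction of small-sample averages that land within `tolerance` of the true mean."""
    hits = 0
    for _ in range(trials):
        estimate = sum(draw() for _ in range(sample_size)) / sample_size
        if abs(estimate - true_mean) <= tolerance * true_mean:
            hits += 1
    return hits / trials

random.seed(0)  # fixed seed so the run is repeatable

# Assumed height distribution: normal, mean 170 cm, standard deviation 10 cm.
heights_close = fraction_close(lambda: random.gauss(170, 10), true_mean=170)

# Assumed gold distribution: Pareto with shape 1.1, whose true mean is shape / (shape - 1).
GOLD_SHAPE = 1.1
gold_mean = GOLD_SHAPE / (GOLD_SHAPE - 1)
gold_close = fraction_close(lambda: random.paretovariate(GOLD_SHAPE), true_mean=gold_mean)

print(f"height estimates within 10% of the truth: {heights_close:.0%}")
print(f"gold estimates within 10% of the truth:   {gold_close:.0%}")
```

Under these assumptions, almost every five-person sample gives a good estimate of average height, while only a small fraction of five-site samples comes close to the true average amount of gold.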
This statistical property is often referred to as having a heavy-tail on the distribution. On the left in Figure 6 we have a distribution without a heavy-tail. This corresponds to a range of different amounts of gold in different places, but none of them has substantially more or less than typical.
In contrast, on the right, we have a heavy-tail distribution. It looks similar to the distribution on the left, apart from this long tail, reaching vast amounts of gold with probabilities that are not dying off very fast. This has implications.
Figure 7 is another way of looking at these distributions. In this case, I have ordered, from left to right, places in increasing amounts of how much gold they have. The percentiles are on the horizontal axis and the amount of gold on the vertical axis. In this case, the coloured in area beneath the graph corresponds to the total amount of gold. On the left, for the distribution that is not heavy-tailed, we see the gold is evenly spread across different places. If we just want to get most of the gold, what is important is getting to as many different places as possible.
Solar power is like the graph on the left. Some places get more sunlight than others, but the amount of solar power you generate depends more on the total number of solar panels, rather than on exactly where you place them.
On the right-hand graph, however, we have a distribution where a lot of the area is in the spike on the right-hand side. This means a lot of the gold, or whatever it is we find valuable, comes from the top percentiles of the distribution, which are just unusually good.
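The difference between the two graphs can be put into numbers by asking what share of the total sits in the top 1% of places. The sketch below assumes a normal distribution as the stand-in without a heavy tail and a Pareto distribution (shape 1.5) as the heavy-tailed stand-in; both choices and all parameters are illustrative, not taken from the figures.

```python
from statistics import NormalDist

def top_share(quantile, n=10_000, top_fraction=0.01):
    """Share of the total held by the top `top_fraction` of places.

    `quantile(p)` returns the amount at percentile p (0 < p < 1); the places
    are approximated by n evenly spaced percentiles.
    """
    amounts = [quantile((i + 0.5) / n) for i in range(n)]
    cutoff = n - round(n * top_fraction)
    return sum(amounts[cutoff:]) / sum(amounts)

# Assumed stand-in without a heavy tail: normal, mean 100, standard deviation 15.
thin = NormalDist(mu=100, sigma=15)
thin_share = top_share(thin.inv_cdf)

# Assumed heavy-tailed stand-in: Pareto with shape 1.5,
# whose quantile function is (1 - p) ** (-1 / shape).
PARETO_SHAPE = 1.5
heavy_share = top_share(lambda p: (1 - p) ** (-1 / PARETO_SHAPE))

print(f"top 1% share without a heavy tail: {thin_share:.1%}")
print(f"top 1% share with a heavy tail:    {heavy_share:.1%}")
```

With these stand-ins, the top 1% of places holds a little over 1% of the total in the first case and a much larger share in the heavy-tailed case.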
I'm not a geologist and I don't know much about gold, but I understand that literal gold is distributed like the heavy-tailed graph on the right. We might then ask, is this also true of opportunities to do good in the world? Here is some support for this.
When we look at the world in all its complexity, we see distributions with this heavy-tail property appearing in a number of different places. There are some theoretical reasons to expect certain types of distributions to arise. Empirically, looking at things like income distributions around the world, we see the heavy-tail property. [Figure 8]
Obviously, there are lots of things that don't have this property. But the more we look at things that are complex systems with lots of interactions, the degree to which we see this property increases. This is a big feature of lots of ways that we try and interact to improve the world.
Simply looking explicitly at opportunities to do good, I can find a couple of reasons why I am convinced we get some of this property.
One reason is just convincing arguments. If I care about stopping people starving, which I do, I could ask: Should I be interested in direct famine relief and trying to get food to people who are starving today compared to something more speculative? Some argue that it would be more effective to focus on researching solutions for feeding vast numbers of people if agriculture collapses, which I personally have found convincing. This is an extreme example and not something we usually think about. But, limiting myself to just trying to feed people, one of the mechanisms looks much more effective than the other.
The data in Figure 9, from DCP2, estimates the cost-effectiveness of many different developing-world health interventions. The x-axis is on a log scale: the interventions have been put into buckets, and each column is on average ten times more effective than the one to its left. Thus, the rightmost column is about 10,000 times more effective than the leftmost column. Just within this one area of global health, where we have managed to get good enough data to estimate these things, there is a very wide range of cost-effectiveness.
The implication of this is that if we want to get gold, we should focus on finding seams which contain huge amounts of gold. This might give us surprising conclusions. We may be less excited to discover that something is at the 90th percentile, because before we knew anything, it might have been anywhere on the distribution. Since most of the possible value comes from the 99th percentile, discovering that something is only at the 90th percentile would be good to know but would make us think less well of it. This is only the case if you have a fairly extreme distribution, but it's interesting to see how you can get these counterintuitive properties with heavy-tailed distributions.
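The 90th-percentile point can be checked with a quick calculation. The sketch below assumes opportunities follow a lognormal distribution (an illustrative choice, not one made in the talk) and compares the mean, which is what we expect before knowing anything, with the value at the 90th percentile. The two `sigma` values are made-up spreads.

```python
import math
from statistics import NormalDist

def lognormal_mean(sigma, mu=0.0):
    """Mean of a lognormal distribution: our expectation before we know anything."""
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_quantile(p, sigma, mu=0.0):
    """Value sitting at percentile p of the same lognormal distribution."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))

for sigma in (1.0, 3.0):  # assumed spreads: moderate versus fairly extreme
    prior = lognormal_mean(sigma)
    at_90th = lognormal_quantile(0.90, sigma)
    verdict = "bad news" if at_90th < prior else "good news"
    print(f"sigma={sigma}: prior mean {prior:.1f}, 90th percentile {at_90th:.1f} -> {verdict}")
```

With the moderate spread, learning that something is at the 90th percentile is good news. With the extreme spread, the mean is dominated by the far tail, so the same discovery is a disappointment.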
Another implication is that naïve empiricism ("we'll just do a load of stuff and see what comes out best") is not going to be enough, because of the sampling issue. It's not possible to sample enough times and measure the outcomes well enough to judge how effective things are really going to be.
If we want to get as much gold as possible, we want to go to a place where there is lots of gold. We want to have the right tools for getting the gold out and we want to have a great team that is going to be using those tools. We can port this analogy over to opportunities to do good as well. We can roughly measure the effectiveness of the area or type of thing that we are doing. We can measure the effectiveness of the interventions we are implementing to create value in an area, relative to other interventions in the area. We can also measure the effectiveness of the team or organisation which is running the implementation, relative to how well other teams might implement such an intervention.
If we have these different dimensions then the total value of the work done is equal to the product of these dimensions. In Figure 11 this is represented by volume, which we want to be maximising. This means we want to be doing reasonably well on each of the different dimensions, or at least not terribly on any of the dimensions. Some implications here might be that if we have an area and an intervention that we are excited about, but we can only find a mediocre team working on it, it may be better not to support them, but to try and get somebody else working on it. Alternatively, we could do something to really improve that team. Similarly, we might not want to support even a great team if they are working in an area that doesn't seem important.
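As a rough illustration of the product idea, the sketch below scores three hypothetical opportunities on each dimension. The numbers are invented and sit on an arbitrary scale; the only point is that a very low score on one dimension drags down the whole product.

```python
from dataclasses import dataclass

@dataclass
class Opportunity:
    area: float          # effectiveness of the cause area
    intervention: float  # effectiveness of the intervention, relative to others in the area
    team: float          # effectiveness of the team, relative to other possible teams

    def value(self) -> float:
        """Total value as the product of the three dimensions."""
        return self.area * self.intervention * self.team

# Hypothetical scores, chosen only for illustration.
strong_all_round = Opportunity(area=5, intervention=5, team=5)
great_team_weak_area = Opportunity(area=0.5, intervention=5, team=10)
great_area_weak_team = Opportunity(area=10, intervention=5, team=0.5)

for name, opp in [
    ("strong all round", strong_all_round),
    ("great team, weak area", great_team_weak_area),
    ("great area, weak team", great_area_weak_team),
]:
    print(f"{name}: {opp.value()}")
```

In this made-up example, the opportunity that is merely good on every dimension beats both opportunities that are excellent on one dimension and poor on another.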
In this section, I am going to discuss the tools and techniques for identifying where the gold is. A nice property of literal gold is that when you dig it up, it's easy to recognise. In our altruistic efforts, we often have to deal with cases where we don't have this. We cannot see the gold directly, so we have to try to infer its existence by using different tools. It is like the dark matter of value.
This fact increases the importance of applying those tools diligently. Actually, the picture in Figure 12 is iron pyrite, not gold. So just because someone says, "Hey, this is gold", it doesn't mean we should always take their word for it, although it may provide some evidence. We want to have great tools for identifying particularly valuable opportunities and being able to differentiate and say, "Okay, actually this thing, although it has some aspects of value, may not be what we want to pursue."
If you first go to an area where nobody has been before, then the seams of gold often have little nuggets of gold just lying on the ground, and it's extremely easy to get gold. So you have some people go in, they do this for a bit and they run out of all the gold on the ground.
Now, if they want to get more gold, maybe more people come along and bring more shovels. It's a bit more work, but you can still get gold out. [Figure 15].
Then you dig deeper, until you can't get in with shovels anymore, so you need bigger teams and heavier machinery to get the gold [Figure 16]. You can still get gold, but it is more work for each little bit that you are getting. This is the general phenomenon of diminishing returns on work that you are putting in. This concept comes up in a lot of different places, so it is worth having an understanding of it.
This, like several of the things I am going to be talking about, is a concept native to economics. In some cases I am simply pulling the concept from economics, but in others I have modified it a little more.
For instance, I think this is particularly the case in global health. I understand that 15-20 years ago, mass vaccinations were extremely cost-effective and probably the best thing to be doing. Then the Gates Foundation came in and funded a lot of the mass vaccination interventions. Now, the most cost-effective intervention is less cost-effective than mass vaccinations were, because we have picked the low-hanging fruit. Similarly, in AI safety, writing the first book on superintelligence is a pretty big deal compared to writing the 101st book on the topic.
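Diminishing returns can be sketched with any concave returns curve. The logarithmic curve and the `richness` parameter below are assumptions chosen for simplicity, not a model proposed in the talk.

```python
import math

def total_gold(work, richness=100.0):
    """Assumed concave returns curve: gold extracted after `work` units of effort."""
    return richness * math.log1p(work)

def marginal_gold(work_already_done, extra=1.0):
    """Gold gained from one more unit of work, given what has already been done."""
    return total_gold(work_already_done + extra) - total_gold(work_already_done)

for done in (0, 1, 10, 100):
    print(f"after {done:>3} units of work, the next unit yields {marginal_gold(done):.1f}")
```

Each extra unit of work still yields some gold, but less than the unit before it, which is the pattern described for nuggets, shovels, and heavy machinery.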
Previously, I discussed how we could factor the effectiveness of an organisation into the area it is working in, the intervention it is pursuing, and the team working on it. Now, I am going to focus on the first factor - how to assess the effectiveness of a cause area. I'm going to give a further factorisation, splitting this into three different dimensions.
The first of these dimensions is scale. All else being equal, we would prefer to go somewhere where there is a lot of gold, rather than a little bit of gold. Most likely, per unit of effort, we are going to get more gold if we do that.
The second dimension is tractability. We would like to go somewhere where we make more progress per unit work. Ideally, where it is easy to dig that ground, rather than trying to get your gold out of a swamp.
The third dimension is uncrowdedness. This has sometimes been called neglectedness. Personally, I find that term ambiguous because sometimes people use 'neglectedness' to mean that this is an area which we should allocate more resources to. What I mean here is that there are not many people looking at it. All else being equal, we would rather go to an area where people have not already picked up the nuggets of gold on the ground, than one where they have (where the remaining gold is quite hard to extract).
Ideally, we would like to be in the world where there is lots of gold, where it's easy to get out, and nobody has taken any of it. But, we are rarely going to be in that exact ideal circumstance.
I'm going to present one attempted way to make the above question more precise [Figure 20].
If you're not used to thinking in terms of derivatives, just ignore the 'ds' here. On the left in Figure 20, is the value of a little bit of extra work. This is what we care about if we are trying to assess which of these different areas we should do more work on.
On the right is a factorisation which is mathematically trivial and looks like it just makes things more complicated. I've taken the expression on the left and added in a load of things which cancel each other out. But I hope I can justify this decomposition by virtue of it being easier to interpret and measure. So I'm going to present the case for why I think it is.
The first term is measuring the amount of value you get for solving an extra one percent of a solution. This roughly tracks how much of a big deal the whole problem area that you are looking at is. I think that is a pretty precise version of the notion of scale.
The second term is a bit more complicated. It is an elasticity that is measuring, for a proportional increase in the amount of work being done, what proportion of a solution that gives you.
The final term just cancels to one over the total amount of work being done, so it is very naturally a measure of uncrowdedness.
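The three terms can be checked numerically. The sketch below assumes a simple curve for how much of a problem is solved as work accumulates, and it measures progress as a fraction rather than a percentage. The functions `solved_fraction` and `value`, and all of their parameters, are hypothetical stand-ins and are not taken from Figure 20. The check is that scale times tractability times uncrowdedness equals the value of a little extra work.

```python
def solved_fraction(work, half_point=1_000.0):
    """Assumed curve: fraction of the problem solved after `work` total units."""
    return work / (work + half_point)

def value(work, value_if_solved=1_000_000.0):
    """Assumed total value: proportional to the fraction of the problem solved."""
    return value_if_solved * solved_fraction(work)

def derivative(f, x, step=1e-3):
    """Central finite-difference approximation of df/dx."""
    return (f(x + step) - f(x - step)) / (2 * step)

def factors(work, value_if_solved=1_000_000.0):
    """Return (scale, tractability, uncrowdedness) at the current level of work."""
    scale = value_if_solved                                  # d(value) / d(fraction solved)
    tractability = work * derivative(solved_fraction, work)  # d(fraction solved) / d(ln work)
    uncrowdedness = 1.0 / work                               # d(ln work) / d(work)
    return scale, tractability, uncrowdedness

work_so_far = 500.0
scale, tractability, uncrowdedness = factors(work_so_far)
marginal_value = derivative(value, work_so_far)

print(f"product of the three factors: {scale * tractability * uncrowdedness:.2f}")
print(f"value of a little extra work: {marginal_value:.2f}")
```

The two printed numbers agree up to numerical error, which is all the factorisation claims: the middle terms cancel, and what remains is the derivative of value with respect to work.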
By making a precise version of this scale, tractability, uncrowdedness framework, we can avoid people having different characterisations for different terms. Although there have been some notions of tractability that don't all line up with this, the idea of measuring how much more work gets you towards a solution is fairly well captured here.
I think all of these dimensions matter. This means we probably don't want to work on something that does terribly on any of the dimensions. I am not going to spend an hour helping a bee, even if nobody else is helping it and it would be pretty easy to help, because the scale of it is pretty small. I also don't think we should work on perpetual motion machines, even though basically nobody is working on them and it would be fantastic if we succeeded, because it seems like it is not tractable.
This might give us a warning against working on climate change, because at a global scale it gets a lot of attention as a problem. I am going to add some caveats to this point. One is that this is only going to be true while we think that there are other problems which are significantly more under-resourced. Another is that you might be an exception if you have a much better way of making progress on the problem of climate change than the typical work that is done on it.
Even so, we should find it surprising that I am making a statement like "climate change is not a high priority area". This just sounds controversial and we should be sceptical of this. But I think the term 'high-priority' is a bit overloaded, so I want to distinguish that more.
Suppose we have two places where there is gold in the ground and we ask: "Where should we send people if we want to get gold?" The answer is going to depend. Perhaps we send the first person to the place on the right in Figure 22; there is only a little bit of gold but it is really easy to get out. Then we send the next ten people to the place on the left, because there is more total gold there. The first person will already have retrieved most of the gold on the right and we want more people in total working on the place on the left.
Which of these two places is the higher priority? Well, it depends on which question you're asking.
These numbers are made up, but we might have some distribution like the one on the left in Figure 23. When asking the question "How much should the world spend on an area in total?" we get a distribution where perhaps climate change looks very big.
If we ask instead, "How valuable is marginal spending?", the graph might look quite different because it significantly depends on how much is already being spent. The dotted lines in Figure 23 might represent how much is already being spent. Then the graph on the right is a function of how much should be spent in total, how much is already being spent and what the marginal returns are - what the curve looks like there.
I think both of these are important notions, and which one we should use should depend on what we are talking about. If we are having a conversation about what we as individuals or as small groups should do, it is appropriate to use this notion of marginal priority and how much extra resources would help. If we are talking about what we collectively as a society or the world should do, it is often correct to talk about absolute priority and how much ought to be invested in total.
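The two-site example can be turned into a toy allocation. The sketch below sends people one at a time to whichever site gives the largest marginal gain. The site totals, the `ease` parameters, and the exponential returns curve are all invented for illustration.

```python
import math

# Hypothetical sites: total gold in the ground and how quickly workers extract it.
SITES = {
    "left": {"total_gold": 100.0, "ease": 0.05},  # lots of gold, hard to extract
    "right": {"total_gold": 10.0, "ease": 2.0},   # little gold, easy to extract
}

def gold_extracted(site, workers):
    """Assumed returns curve: extraction approaches the site's total as workers are added."""
    s = SITES[site]
    return s["total_gold"] * (1 - math.exp(-s["ease"] * workers))

def marginal_gold(site, workers_already_there):
    """Extra gold obtained by adding one more worker to the site."""
    return gold_extracted(site, workers_already_there + 1) - gold_extracted(site, workers_already_there)

def allocate(people):
    """Send each person, one at a time, to the site where they add the most gold."""
    counts = {site: 0 for site in SITES}
    order = []
    for _ in range(people):
        best = max(SITES, key=lambda site: marginal_gold(site, counts[site]))
        counts[best] += 1
        order.append(best)
    return order

order = allocate(11)
print(order)
```

With these made-up numbers, the first person goes to the small, easy site (the marginal priority at that moment), and the next ten go to the large site, which deserves more people in total (the absolute priority).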
For most things here I have been extremely agnostic about what our view of value is. However, for this point, I am going to start making more assumptions. Many people have the view that we want to try and make as much value over the long term as we can. If you don't have that view, you can just treat this as a hypothetical. If you have not thought about it, it is a pretty interesting and important question, and worth spending some time on.
Suppose we do care about creating as much value in the long term as possible. In our gold metaphor, that might mean wanting to get as much gold as possible eventually, rather than just trying to get as much gold out of the ground this year.
Some technologies may be destructive; we can use dynamite, which gets us lots of gold now, but it also blows up some gold which we can never get later. That could be good if the focus is on trying to get gold in the short term, but it could be bad from this eventual gold perspective.
We could develop some technologies that are similarly efficient but less destructive. There are going to be some people in the world who do care about creating as much gold as possible in the short term. They are going to use whichever technology is the most efficient for that. So one of the major drivers of how much gold is eventually extracted is the order and sequence in which the technologies are developed. If we discover dynamite first, people are going to use dynamite and destroy a lot of the gold. On the other hand, if we discover the drill first, then by the time dynamite comes along the attitude will be "Well, why would we use that? We have this fantastic drill."
Philosophers like Nick Bostrom have used this to argue for trying to develop societal wisdom and good institutions for decision making before developing technologies or progress which might threaten the long-run trajectory of civilization. They also argue for differentially aiming to develop technologies which enhance the safety of new developments before anything that is driving risk.
In this section I am going to discuss how all this is a collaborative endeavour. The idea is that we are not all individually saying: "I need to work out where the most gold is, that's the most neglected and tractable. I personally am going to do just that." There are many people who are thinking like this and there are more every year. I am really excited about this, but we need to learn how to cooperate.
Largely, we have the same or similar views on what to value. Maybe some think that silver matters too, not just gold, but we all agree that gold matters. We want to be able to coordinate and make sure that we are getting people working on the things which make the most sense for them.
In Figure 25, I have Harry, Hermione, and Ron and they have three tasks they need to do to get some gold. They need to do some research, mix some potions, and do some wand work. Hermione is the best at everything, but she doesn't have a time-turner so she can't do everything. So we need to have some way of distributing the work.
This is the idea of comparative advantage. Hermione has an absolute advantage on all of these tasks, but it would be a waste for her to go and work on potions because Harry is not so bad at potions. Nobody else is good at doing research, so we should probably put her on this.
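A toy version of the assignment problem shows why Hermione ends up on research even though she is best at everything. The skill scores below are invented for illustration, and the sketch assumes each person takes exactly one task.

```python
from itertools import permutations

TASKS = ("research", "potions", "wand work")

# Hypothetical skill scores (higher is better). Hermione is best at every task.
SKILL = {
    "Hermione": {"research": 10, "potions": 8, "wand work": 9},
    "Harry": {"research": 3, "potions": 7, "wand work": 6},
    "Ron": {"research": 2, "potions": 3, "wand work": 5},
}

def best_assignment(skill, tasks):
    """Try every one-task-per-person assignment and keep the one with the highest total skill."""
    people = list(skill)
    best_total, best_plan = None, None
    for ordering in permutations(tasks):
        plan = dict(zip(people, ordering))
        total = sum(skill[person][task] for person, task in plan.items())
        if best_total is None or total > best_total:
            best_total, best_plan = total, plan
    return best_plan, best_total

plan, total = best_assignment(SKILL, TASKS)
print(plan, total)
```

With these scores the best plan puts Hermione on research, Harry on potions, and Ron on wand work. What matters is not who is best at a task in absolute terms, but where each person's contribution costs the group the least elsewhere.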
This is a tool that we can use to help guide our thinking about what we should do as individuals. For example, if I think some technical work is the most valuable thing to be doing but I would be pretty mediocre at it, and instead I am a great communicator, then maybe I should go into trying to help technical researchers in that domain communicate their work, to get more people to engage with it and bring in more fantastic people.
Now that we have applied this at an individual level, we can also apply it at the group level. We can notice that different organisations or groups may be better placed to take different opportunities.
On an even more speculative level, we can apply this idea to time. We can ask ourselves: "What are we, today, particularly well suited to do, compared to people in the past and people in the future?" We cannot change what those in the past did, but we can make a comparison of what our comparative advantage is relative to people in the future. If there are going to be different possible challenges in the future that we need to meet, it makes sense that we should be working on the early ones, because if a challenge is coming next year, people five years later simply will not have a chance to work on it.
Another consideration is that we are in a position to influence how many future people there will be who are interested in and work on these challenges. We have more influence over that than people in the future do, so it may make sense as a thing for us to focus on.
Another particularly important question is how to work stuff out. The world is big, complicated, and messy. We cannot expect all of us, individually, to work out perfect models of it. In fact, it is too complicated for us to expect anybody to do this. Maybe we are all walking around with little ideas, which, in my metaphor, are puzzle pieces for a map to find the gold. We want institutions for assembling these into a map. It is complicated because some people have puzzle pieces which are from the wrong puzzle and they don't track where the gold is. Ideally, we want our institutions to filter these out and only assemble the correct pieces to guide us where we want to go.
As a society, we have to deal with this problem in a number of different domains and we have developed different institutions for doing this. There is the peer review process in science, Wikipedia for aggregating knowledge, Amazon reviews aggregate knowledge on which products are good, and democracy lets us aggregate preferences over many different people to try and choose what is going to be good.
Of course, none of these institutions is perfect, and this is a challenge. It is like a wrong puzzle piece that made it into the dialogue. This comes up in many cases: for example, the replication crisis in parts of psychology, vandalism on Wikipedia, and fake reviews on Amazon that make some products look good and others bad.
It may be the case that we can adapt an existing institution for our purpose of trying to aggregate knowledge about what are the ways to go and do the most good. But we may want something a bit different and maybe someone reading this is going to do some work on coming up with valuable institutions for this. I think this is a really important problem that is going to become more important for us to deal with as the effective altruism community grows.
Another thing which can help construct a better picture is trying to have good local norms. We tell people the ideas we have and other people may start listening. They may just listen based on the charisma of the person talking, rather than the truthfulness of the puzzle piece. But we would like to promote the spread of good ideas, inhibit the spread of bad ones, and encourage original contributions.
To achieve the first two aims, we can rely on authority. We can take the attitude: "Well, we've worked out this stuff. We're totally confident about this and now we won't accept anything else." But this isn't helpful in getting new ideas.
Something we can do here is pay attention to why we believe something. Do you believe it because somebody else told you? Do you believe it because you have really thought this through carefully and worked it out for yourself? There is often a blur between these two cases and we often accept reasons given to us without deeply examining the argument.
It is useful to be honest with yourself about this and then also communicate it to other people. If it is the case that you believe this because Joe Bloggs has told you, but Joe is a pretty careful guy and he is diligent about checking his arguments, you can communicate this. Or it could be the case that you cut this puzzle piece yourself.
Cutting it yourself doesn't mean we should necessarily have higher credence in it. Previously, I have worked things out and thought I have proved things, before finding a mistake in my proof. So you can separately keep track of the level of credence you have in something and why you believe it.
Moreover, our individual and collective reasons for believing things can differ. Consider the statement: "It costs about $3,500 to save a life from malaria." This is broadly believed across the effective altruism community. I think, collectively, the reason we believe this is due to a number of randomized controlled trials. Then some pretty smart, reasonable analysts at GiveWell have looked carefully at this, dived into the counterfactuals, produced their analysis and come to the conclusion: "On net, it is about $3,500".
But that is not why I believed it. I believed it because people have told me that GiveWell have done this analysis and they say it is $3,500 on their website. In writing this article, I went and read it myself on the website. Although it is more work for me, it is of value for the community because I am shortening the chain of Chinese whispers. As messages get passed along, it is possible that mistakes enter and then get repeated. By going back and checking earlier sources in the chain, we can try to reduce that and make ourselves more robustly confident in these statements.
When you notice that you disagree with somebody, you can see that perhaps parts of their jigsaw puzzle are wrong. You might dismiss what they have to say, but I want to argue that this is often not the most productive thing to do. Although part of what they have to say is wrong, they may have some other aspect of their thinking process that would fill a gap in your perspective and help you get a better picture of what is going on.
I often do this when I find that someone has a perspective that I think is unlikely to be correct. I am interested in the process of how they got there and how they think about it, in part because people and the way they think are fascinating, but also because it is polite and useful. It helps me to build a deeper picture of all the different bits of evidence we collectively have.
In this section, I am going to apply the ideas I have just mentioned. Throughout this article, I've discussed a range of different things without touching on my level of confidence in these areas, or why I believe them. So I am going to do that here.
Heavy-tailed distributions: I think it is pretty robust that the baseline distribution of opportunities in the world does follow a distribution with a heavy-tailed property. Seeing it in many different domains and understanding some of the theory behind why it should arise makes it extremely likely. However, there's an open empirical question as to exactly how far that tail goes out, as the property is not binary but rather a continuum.
There is an important caveat here: there is a mechanism which might push against this. If people are good at seeking out and identifying the best opportunities and they are uniformly seeking out and taking them, then the best things that are left might not be so much better.
This occurs in regular markets; ways to make money start out distributed across a wide range. In Figure 36 we use a log scale on the x-axis to represent a heavy-tailed distribution. Those that are losing money stop doing that, and observe others doing activities which are making lots of money and pursue those instead. More people move into this area and by diminishing returns, you actually make less money than you used to in that area. Eventually, you end up with a narrower distribution of the value that is being produced by people doing these different things than we started with.
We might get a push in that direction among opportunities to create altruistic value. Certainly, I don't think we are in a properly efficient market, but I am not sure how efficient it is. I hope that as this community grows and more people are actively trying to choose very valuable things, that will mean the distribution does get less heavy-tailed.
One of the mechanisms that leads to efficiency in regular markets is feedback loops; people notice if they are getting rich or losing money. Another mechanism is people doing analysis as a result of feedback loops. They try to work out whether additional resources in a particular opportunity will make us richer or not. I think that doing this sort of analysis is an important part of the project we are collectively embarking on.
Overall, I don't think we have an efficient market for this, and I do believe we have heavy-tailed distributions. I am not sure how extreme they are, but that is because the distribution responds to the actions people are taking.
Factoring cost-effectiveness: I think this is a simple point without room for error. However, there is an empirical question as to how much these different dimensions matter. We may have more variation in one of the dimensions than others. I don't have a clear view on the matter. We saw that the intervention effectiveness within global health varies by 3-4 orders of magnitude. I think area effectiveness could vary by more than that, and as for organisational effectiveness, I'm not an expert and cannot claim much of a view on that.
Diminishing returns: I think this is a robust point. In some domains, there are increasing returns to scale, where you get efficiencies of scale and that helps you. This more often applies to the organization scale or organization within a domain, whereas diminishing returns often apply at the domain scale. Despite this, I would add a note of caution as I do know some smart people who think I am overstating the case for diminishing returns.
Scale, tractability, neglectedness: I think it is obvious that they all matter and that this factorisation is correct as a factorisation. What is less clear is whether this breaks it up into things that are easier to measure and whether this is a helpful way of doing it. I would argue that it is, since it loosely matches up with an informal framework that people have been using and seem to find helpful.
Absolute and marginal priority: Again, this point is somewhat trivial, but I made this point because I think not everybody has these separate notions and we can confuse each other if we blur them.
Differential progress: This argument appears in a few academic papers and is believed by some of the smartest and most reasonable people I know, which gives me some evidence that it might be true, outside of my personal introspection. It is a bit counterintuitive and has not had much scrutiny, so we may want to expose it to more.
Comparative advantage is a standard idea from economics. Normally markets try to work to push people into working in the way that utilizes their comparative advantage. We don't necessarily have that when we are aiming for altruistic value.
The application across time is more speculative. I am one of the main people who have been trying to reason this way and as of yet I haven't had anybody push back on it. But it should be taken with a bit more of a pinch of salt as it has faced less scrutiny.
Aggregating knowledge: I think everyone tends to think we want institutions for this. There is also a broad consensus that existing institutions are not perfect, but whether we can build better institutions, I am less certain.
Sharing reasons for beliefs: This is something that I think is common sense and, all else being equal, a good thing. There are costs to doing it: it slows down our communication, and it may sound less glamorous and make it harder to get people on board. I think we want to nudge people in the direction of sharing reasons for beliefs, but I am not sure how far, and do not want to be overwhelmingly demanding on this. To some extent, I believe this because some smart, reasonable people I know think we should head in this direction, and I weigh in the opinions of others when I don't see a reason that I have a notably better perspective on it than them.
Finally, why have I been sharing this with you?
People can mine gold without understanding all these theoretical arguments about the distribution of gold in the world.
But, because what we actually value is invisible, we need to be more careful about aiming at the right things.
I think it is important for our community to have this knowledge broadly spread; and as we are still in the early days of the community, it is particularly important to try and get this knowledge into the foundations and work out better versions.
We don't want to have the kind of gold rush phenomenon where people charge off after something and it turns out there wasn't actually that much value there.