Fallibilism, Bias, and the Rule of Law
Effective Altruism can learn from the rule of law.
Effective Altruism (EA) aims to apply rationality, science, math and evidence to doing good. One of its main activities is evaluating the cost effectiveness of different charitable causes and encouraging donations to effective charities.
Since rationality is important to EA’s mission, I asked whether they have rational debate methods or an alternative way to rationally resolve disagreements. The answer was, in my words:
EA lacks formal rationality policies, but individuals informally do many things to try to be rational and the community tries to encourage rationality. EA’s rationality efforts are larger than in most communities and include funding some rationality-related projects.
Many people find that kind of answer satisfactory, but I think it’s actually a major problem. I’ll analyze it using two main concepts: fallibilism and political philosophy.
The goal of this article is to explain the problem and, in principle, what sort of solutions to pursue. I do have some concrete suggestions but, to keep this shorter and more focused, I’ll save them for a later article.
I’ll discuss fallibilist principles, then political philosophy, then advocate a way of using political philosophy concepts to be more rational. Finally, using those ideas, I’ll criticize some of EA’s ways of trying to be rational.
Fallibilism Summary
Fallibilism says that people are capable of making mistakes and there’s no way to get a 100% guarantee that any particular idea is not mistaken. Every idea we have is fallible. There are logical arguments for that. Fallibilists also commonly believe that mistakes are common and an important issue: our epistemology should have a lot to say about mistakes rather than treating them as merely a technical possibility.
Making mistakes without realizing you’re making mistakes is common. Being biased without realizing you’re biased is common. Being irrational about an issue without realizing it is common.
I’m guessing that the majority of EAs agree so far, and possibly there will be significant agreement with the rest of this section as well as the political philosophy section.
I go further by rejecting answers in general categories like: “try not to be mistaken, biased or irrational” or “trust yourself not to be mistaken, biased or irrational” or “bet high stakes on you not being mistaken, biased or irrational in this case”.
I think it’s important to assume you will fail at bias/rationality sometimes, and to plan for that. What kind of failsafes can help you in cases where your trying doesn’t work? What can you do so that you aren’t betting anything important on your fallibility not striking in a particular case? I regard many intellectuals, including scientists, as betting the productivity of their careers on not being mistaken about certain issues, which I think is unnecessary and unwise. The career betting is due to the combination of working based on certain premises, which is OK alone, and also being unwilling to consider and discuss some criticisms of those premises. Even if you can’t find ways to avoid all risky bets, you can minimize them and take steps to make the remaining bets less risky. (It’s one thing to make a mistake that no one knows is a mistake. It’s worse to make a mistake that someone already understands is a mistake, when they were willing to share that information, but you wouldn’t listen due to e.g. using gatekeeping that blocks the criticism from reaching you. You should have robust policies to prevent that scenario from happening.)
Put another way, when you think you’re being unbiased, you should not trust yourself; you should regard that self-evaluation as unreliable and prefer an approach that’s more objective than trusting your rationality. If you consistently trust yourself and your judgment, you’re likely to go wrong sometimes. A strong fallibilist who refuses to trust himself will design things to mitigate the harm when he’s biased or irrational, while a person who thinks trusting himself is OK will not work on many, if any, mitigations or failsafes, so his irrationality or bias will do more harm.
Political Philosophy
A key idea in political philosophy is that government officials cannot be simply trusted. They may be biased, irrational, corrupt, selfish, greedy, etc. So you shouldn’t give them unlimited power to arbitrarily do whatever they want.
Even if they promise to try really hard to be fair, government officials should not be trusted. Even if they really do try, they still should not be trusted. Good faith trying to be fair is not good enough. They’ll be biased sometimes and unaware of their own bias.
So we design government systems with features to help with this problem. Those features include assigning officials only limited powers, written rules and policies, checks and balances, transparency, accountability and elections. Each of these design ideas can be used, not only for politics, but also by fallibilist individuals to help deal with cases where they’re biased or irrational.
Note: To be effective, these ideas must be used in general, at all times, not only in the cases where you’re biased, because you don’t accurately know which cases you’re biased about. We can’t just activate these policies, as needed, to deal with bias. We have to use them all the time because bias could strike at any time without us realizing it. Similarly, we ask government officials to use these things all the time, not just in the cases where they think they might be biased. We do also have extra policies that can be used when someone does recognize their own potential bias. An example is a person abstaining from participating in something because they have a conflict of interest. That is something that other people can request, but people also sometimes voluntarily choose it for themselves.
Perhaps the most important innovation to help with abuse of power by officials and elites is called the rule of law. The idea is to write down laws which apply to everyone. Officials, instead of deciding whatever they want, are tasked with following and applying the laws as written. Instead of using their own judgment about the issue, they instead have the more limited task of interpreting the law. If the law is written clearly, then in many cases it’s pretty obvious to everyone what the law says. That means if a government official gives the wrong ruling, he’s doing something wrong which everyone can see, so there will be backlash (e.g. voting him out of power; if it’s a military dictatorship or hereditary monarchy then citizens have worse options such as violent revolution or complaining).
Clear, written laws provide predictability for citizens. They know in advance what actions will get them in trouble and what actions are allowed. The laws also limit the actions government officials may take. In order to provide predictability in advance, laws should be changed infrequently, and you should only get in trouble if you do something bad after there was already a law against it. New or changed laws shouldn’t be applied retroactively.
Despite the importance of rule of law, people haven’t done a great job of using the same concept outside of political matters. For example, online forums frequently have vague rules and then moderators do things like ban someone without even claiming that he broke a rule that already existed when he did whatever action the moderator didn’t like. Warnings help a lot here; if someone breaks something you think should be a rule, but that isn’t clearly written down, then you can tell him that’s not allowed and ban him if he does it again after being warned. That gives him reasonable predictability so that he can avoid being banned if he wants to (as long as he gets a warning for every different thing and could get a dozen separate warnings without being banned, which would be atypical moderator behavior). Many moderators use warnings part of the time but are inconsistent about it, which allows them to unpredictably ban people sometimes. Unfortunately, the cases where moderators don’t issue warnings tend to be the very same cases where the moderators are biased, irrational or unreasonable. If you follow good policies 95% of the time, but you’re biased sometimes, it’s likely that the times you’re biased will be some of the same times you decide not to follow your general policies. It’s really important to follow policies 100% of the time because the exceptions you make are not random; they are likely to be correlated with the times you’re biased.
(Note: I reviewed some EA moderator policies and past actions. Having some publicly documented history of moderator actions is great. I did not find any major moderator errors. However, overall, I also didn’t find enough clear differentiators which would convince me that EA moderators are definitely better than the bad moderators I’ve seen at some other forums. I think there’s room for improvement in communicating about moderator policies, if not in improving the policies themselves. I think a lot of people have had bad experiences with moderators at some other sites, and if EA is actually better it’d be worth communicating that in more convincing ways that show understanding of the problems elsewhere and explain clear differentiators. Superior transparency is a good differentiator but isn’t enough alone and also isn’t emphasized. I found the rules too vague; however, I also noticed EA moderators giving warnings, which does a lot to make up for vague rules. For governments, I don’t think warnings are a reasonable replacement for laws. But for forums, especially small forums, they work much better and I do use warnings at my own forum. Writing down really clear rules is hard, but it’s absolutely worth the effort for governments with millions of citizens; it’s reasonable for small forums to do less there. Social media sites with more users than the populations of many countries should have rules just as clear and high effort as governments do. But they very much do not and aren’t trying to. And when those social media sites function partially as extensions of the government which are significantly influenced or controlled by the government, then their poor written policies actually erode the rule of law and give the government a way to have power over people’s lives outside of the legal system.)
The idea has been explained as replacing the rule of man with the rule of law. An article titled Rule of Man or Rule of Law? argues that the rule of law is the “single most important factor in quality governance” (and also says that most educated Western adults, whose lives benefit from the rule of law, don’t understand its importance). Wikipedia has entries for rule of law and rule of man. Many political philosophy books have discussed these issues.
If you trust yourself to be unbiased, or think trying to be unbiased is adequate, you’re relying on the rule of man. If you pre-commit yourself to follow specific written rules (laws), then you’re applying the superior political philosophy of the rule of law to your individual, personal life. It also helps to use related policies like transparency and accountability. If you have rules that no one else knows about, or no one can tell when you’re following them or not, then you can more easily break your own rules and find a way to rationalize it to yourself. Your bias or irrationality might get you to excuse a special exception. And if you make exceptions, they’re likely to correlate with when you’re doing something wrong, so saying “I follow my rules 99% of the time” isn’t actually very good. (If you made special exceptions 1% of the time and the 1% were chosen by random dice roll, that’d actually be pretty good. The point of claiming 99% rule following is to fool yourself into thinking you’re doing just as well as that, but you’re not.)
If you’re not comfortable making public policies and pre-commitments, I still recommend doing it privately, in writing, and trying to hold yourself accountable. That’s much better than nothing and may also lead to increased comfort with doing it publicly in the future.
Overcoming Bias Using Written Policies
Let’s return now to EA’s viewpoint, which I consider unsatisfactory:
EA lacks formal rationality policies, but individuals informally do many things to try to be rational and the community tries to encourage rationality. EA’s rationality efforts are larger than in most communities and include funding some rationality-related projects.
I view this as failing to apply the concept of the rule of law. Having formal, written policies is like having the rule of law. It constrains your arbitrary actions. It limits your own power. It means that, in cases where you’re biased or irrational, you have a defense mechanism (a.k.a. failsafe): your pre-commitment to some written policies.
I view EA as not designing around the fallibilist assumption that we’ll fail at rationality and bias sometimes. Just as a government official trying his best to be fair is not good enough, trying your best to be rational is not good enough either. I think EA as a group, and individual EAs, should use written policies to take a more “rule of law” style approach to rationality and bias. (I also have the same suggestion for approximately all other groups besides EA. None of this is intended as a claim that EA is worse in comparison to some other group.)
I propose that EA individuals and groups should work on good policies to publicly write down and pre-commit to in the rule of law style. I think that would make EA significantly more effective than can be achieved by trying hard to be rational or trusting yourself to be rational. Effort and trust are bad solutions to bias, irrationality and fallibility.
It’s great that EAs are more interested than most people in reading and writing about rationality, and they put more effort into overcoming their biases. That is a positive trait. But I suggest adding rule-of-law-inspired written policies and being more suspicious of anyone, including yourself, who thinks anything that fits into the trying or trusting categories is good enough.
Critique: Examples of EAs Trying and Trusting
In response to my questions, Thomas Kwa wrote a list of ways EA tries to be rational. To help clarify what I mean above, I’m going to quote and comment critically, primarily about why his points fit into the trying and/or trusting categories. I’m responding to Kwa’s comments in particular because I think they were the best and clearest answer.
– taking weird ideas seriously; being willing to think carefully about them and dedicate careers to them
This means trying to take weird ideas seriously. It involves making an effort to be fair, reasonable, curious, etc. There is an EA norm, which people try to follow, which favors this. People put effort into it.
But there are no concrete, written policies. Neither EA as a group, nor individuals, write policies that they pre-commit to follow in order to achieve this goal. They just try to do it and perhaps trust their ability to do it pretty well (though not perfectly or infallibly).
– being unusually goal-directed
This means trying to be goal-directed but not following a written policy.
– being unusually truth-seeking
– this makes debates non-adversarial, which is easy mode
Being good at truth-seeking is something people try to do. This makes some debates less adversarial, but we shouldn’t trust it to make all debates non-adversarial with no qualifiers.
– openness to criticism, plus a decent method of filtering it
EAs try to be open to criticism. I also think any method of filtering criticism (a.k.a. gatekeeping) is especially important to write down because it could systematically block consideration of some broad categories of important ideas. It could also explicitly or implicitly do gatekeeping based on social status. Irrational filtering is a widespread problem in society today.
– high average intelligence. Doesn't imply rationality but doesn't hurt.
This basically means trying to be smart. And there are no written (or unwritten?) policies to only let smart people join EA, so I don’t think it’s very reliable. (Note: I think such policies would be bad.) This also implicitly relies on a small, fairly homogeneous community, which I comment more on below.
– numeracy and scope-sensitivity
– willingness to use math in decisions when appropriate (e.g. EV calculations) is only part of this
Dealing with math and scopes well is something EAs try for.
– less human misalignment: EAs have similar goals and so EA doesn't waste tons of energy on corruption, preventing corruption, negotiation, etc.
This claims that having more anti-corruption effort would waste energy. It seems contrary to rule of law attitudes. We’re similar to each other and our goals are aligned, so we don’t need to do much to prevent corruption!? No. Even in a small town or a small, homogeneous, low-immigration, high-trust society, you still should not trust government officials that much. Similarly, people who run charities should not be trusted with the money; it’s worth the energy to have transparency and accountability.
This point implicitly advocates more trust on the basis of having a small, fairly homogeneous community. That attitude makes it harder for outsiders to immigrate to the community (or join and fit in, in less political terms). It’s suggesting that people who are different from me are more likely to be corrupt: right now, everyone can be mostly trusted, but if the community becomes more intellectually diverse, then I won’t be so trusting. In other words, people similar to me are morally superior (less corruptible). That kind of belief has a long, biased history. It’s important to aim for intellectual diversity from the start, plan around it, and be welcoming towards the out-group, the heretics, the dissenters, the people who are harder for me to be friends with, the people who rub me the wrong way, etc. We need more tolerance and respect for culture clash, even when it takes effort, instead of wanting to need little negotiation because everyone is already so similar that things feel easy.
I considered that this point might only refer to corruption in a narrow sense like embezzling money. But I think it covers e.g. decision makers abusing their power to make biased decisions. Even if it only meant stuff like embezzling, I still think EA should spend effort preventing that. I think it’d be really naively trusting to lack robust defenses against embezzlement and then believe that was efficiently saving effort.
– relative lack of bureaucracy
Whether this is good depends on how it’s achieved and what it means. If it means having less rule of law, I think it’s bad. If it means having fewer bad policies, that’s good. Bureaucracy can refer to lots of negative things including bad laws, bloated policies, unproductive paperwork requirements, office politics, power struggles, and people who won’t use their brain. It’s bad to mechanically follow rules instead of using rules to enhance creativity or constrain actions.
It’s common for organizations to become more bureaucratic as they get larger. Why? One of the main reasons is that when everyone is culturally similar and friends with each other, they find it relatively easy to get along. They can settle many disputes by talking it out, and they have fewer disputes in the first place. However, as a group becomes larger and therefore more intellectually diverse, the method of trying to discuss problems and trusting others to engage in good-faith problem solving becomes less effective. In other words, the larger the group, the more necessary some sort of rule of law is.
As organizations get bigger, written rules and policies can help a lot with fair, predictable dispute resolution and can help with other issues such as spreading useful knowledge and ways of doing things to newcomers. For example, a policy of writing things in a knowledge base, instead of spreading that knowledge to new people by word of mouth, can make an organization or community more welcoming to people who don’t quickly fit in and have social rapport with existing members. Barriers to entry, like needing to ask other people for help in informal ways, help keep groups less intellectually and culturally diverse. Lack of clear, written rules and expectations (and of the community actually following the rules instead of writing down one thing then doing something else) is also extremely stressful for many people.
Making things more explicit increases their legibility or understandability for people who are different from you. The more similar you are to people, the less you need them to explain things; you can pick up on hints better and make better guesses about what they mean. Groups which aren’t very explicit are inhospitable to other ideas besides the ideas that all the members believe but don’t write down.
Whether policies seem bureaucratic primarily depends on how good they are. If they’re designed well enough, and cost effective enough, then people will like them instead of complaining about bureaucracy. As it grows, EA will have a better chance to have good policies if it starts developing them now instead of waiting until later. It’s easier to experiment with policies, and figure out how to make them work well, when smaller. (EA’s lack of a clear central authority will make this a bit more complicated, and will mean more different actors need to think about policies. I think that can work well but will require some extra experimentation to get right.)
– various epistemic technologies taken from other communities: double-crux, forecasting
– ideas from EA and its predecessors: crucial considerations, the ITN framework, etc.
I think these are good, but without written policies that pre-commit about when to use them, their effectiveness is limited. I’m concerned that people rely on trying to use them when they should be used, or trusting themselves to use them at the appropriate times.
– taste: for some reason, EAs are able to (hopefully correctly) allocate more resources to AI alignment than overpopulation or the energy decline, for reasons not explained by the above.
This reads to me as viewing EA as higher in rationality because you agree with some of its conclusions. I consider that irrational. Rationality should be judged by ability to make progress and openness to error correction, not by whether you agree with a person’s or group’s views.
Overall, I read this list as ways EA tries to be rational and sometimes trusts its rationality instead of putting effort into mitigating the harm of rationality failures. I think pre-commitment to written policies, analogous to the rule of law, would be more effective. It would also expose those rationality ideas to critical discussion and enable an ongoing project of coming up with policy (and therefore rationality) improvements. This is especially important because rationality failures correlate significantly with key issues rather than being random, so even if you make a pretty good effort at rationality, and keep the quantity of failures fairly low, the impact is still large.
So, learn from the rule of law, and pre-commit to written rationality policies instead of trying to be rational or trusting yourself. (If you don’t want to do that, it could be a sign that you want the flexibility to act capriciously. It could also be a sign that you don’t know how to write policies well. Poorly written policies can do more harm than good.)