In her seminal paper, “What is the Point of Equality?”, Elizabeth Anderson writes:
Would democratic equality support a wage-squeezing policy as demanding as Rawls’s difference principle? This would forbid all income inequalities that do not improve the incomes of the worst off. In giving absolute priority to the worst off, the difference principle might require considerable sacrifices in the lower middle ranks for trifling gains at the lowest levels. Democratic equality would urge a less demanding form of reciprocity. Once all citizens enjoy a decent set of freedoms, sufficient for functioning as an equal in society, income inequalities beyond that point do not seem so troubling in themselves. (p. 326)
Is this really an accurate representation of the difference principle? The vanilla version of the difference principle states that “social and economic inequalities are to be arranged so that they are to the greatest benefit of the least advantaged.” But in his discussion of chain connection (p. 80-83 of the original edition of A Theory of Justice), Rawls sets out “a more general principle” which he terms the lexical difference principle:
…in a basic structure with n relevant representatives, first maximize the welfare of the worst off representative man; second, for equal welfare of the worst-off representative, maximize the welfare of the second worst-off representative man, and so on until the last case which is, for equal welfare of all the preceding n-1 representatives, maximize the welfare of the best-off representative man. (p. 83)
The lexical difference principle clearly permits inequalities that do not improve the incomes of the worst off, as long as the incomes of the worst off are already as high as they can be and are not reduced by further gains to the best-off. But there is some ambiguity as to whether this lexicality should be read into the difference principle as it is normally formulated. Rawls says that he assumes that the incomes of different groups always affect each other (“close-knittedness”) “in order to simplify the statement of the difference principle”; in cases where the assumption of close-knittedness doesn’t hold, the difference principle must be expressed in the lexical form. I think Rawls is strongly implying that a precise statement of the difference principle would take the lexical form, but because (he thinks) close-knittedness is the rule, the vanilla difference principle and the lexical difference principle are practically equivalent. And because the vanilla difference principle is so much shorter and clearer, he prefers to express the principle in its vanilla form.
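To make the contrast concrete, here is a minimal sketch of the two readings as selection rules over a handful of candidate distributions. Everything in it is an illustrative assumption rather than anything found in Rawls: the function names, the three made-up distributions, and the simplification that each distribution is just a list of welfare levels for the same number of representative positions.

```python
from typing import Dict, List, Sequence


def maximin_choices(options: Dict[str, Sequence[float]]) -> List[str]:
    """Names of every option whose worst-off position is as high as possible.

    This looks only at the bottom position, so it can leave ties unresolved.
    """
    best_floor = max(min(levels) for levels in options.values())
    return [name for name, levels in options.items() if min(levels) == best_floor]


def leximin_choice(options: Dict[str, Sequence[float]]) -> str:
    """Name of the option favored by the lexical rule.

    Sort each option from worst-off to best-off and compare the sorted lists
    position by position: first the worst off, then (on a tie) the second
    worst off, and so on up to the best off. Assumes all options have the
    same number of positions.
    """
    return max(options, key=lambda name: sorted(options[name]))


# Hypothetical welfare levels for three representative positions.
options = {
    "A": (10, 20, 30),
    "B": (10, 20, 50),  # same as A except the best-off position gains
    "C": (9, 40, 90),   # larger gains higher up, but the worst off lose
}

print(maximin_choices(options))  # ['A', 'B']
print(leximin_choice(options))   # B
```

On these made-up numbers, the bottom position alone cannot separate A from B, while the lexical rule selects B: it allows a gain to the best off that neither helps nor harms the worst off. C is rejected on both readings because it lowers the floor.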
However, the issue isn’t so cut and dried. As G. A. Cohen points out in Rescuing Justice and Equality (p. 156-158), Rawls elsewhere — before and after introducing the lexical difference principle — discusses the difference principle in terms that explicitly rule out inequalities that do not benefit the least advantaged, even if the least advantaged are already as well off as they can be and are made no worse off by gains further up the distribution. Cohen thinks that there are really two distinct difference principles in Rawls’s work, both of which are sometimes expressed in the vanilla form, but one of which should actually be interpreted as including the lexicality condition. Attempts to identify which one is the “real” difference principle are pointless; for there to be even one determinate difference principle, there have to be at least two.
Regardless of which difference principle we apply, however, Rawls already has an answer for Anderson’s complaint that “the difference principle might require considerable sacrifices in the lower middle ranks for trifling gains at the lowest levels.” The response is roughly that although this kind of situation is conceivable, it is not likely. We can expect that when gains at the top of the distribution maximize incomes at the bottom, gains will also be made in all social positions between them. The same is likely to hold for gains by those in the middle class; if these gains to the middle are also maximally advantageous to the poor, the lower middle class probably stands to benefit as well. Rawls calls this phenomenon “chain connection.” (p. 80)
Later in A Theory of Justice, Rawls takes up the question again:
I want to conclude this section by taking up an objection which is likely to be made against the difference principle and which leads into an important question. The objection is that since we are to maximize (subject to the usual constraints) the long-term prospects of the least advantaged, it seems that the justice of large increases or decreases in the expectations of the more advantaged may depend upon small changes in the prospects of those worst off. To illustrate: the most extreme disparities in wealth and income are allowed provided that the expectations of the least fortunate are raised in the slightest degree. But at the same time similar inequalities favoring the more advantaged are forbidden when those in the worst position lose by the least amount. Yet it seems extraordinary that the justice of increasing the expectations of the better placed by a billion dollars, say, should turn on whether the prospects of the least favored increase or decrease by a penny….
Part of the answer is that the difference principle is not intended to apply to such abstract possibilities. As I have said, the problem of social justice is not that of allocating ad libitum various amounts of something, whether it be money, or property, or whatever, among given individuals. Nor is there some substance of which expectations are made that can be shuffled from one representative man to another in all possible combinations. The possibilities which the objection envisages cannot arise in real cases; the feasible set is so restricted that they are excluded. The reason for this is that the two principles are tied together as one conception of justice which applies to the basic structure of society as a whole. The operation of the principles of equal liberty and open positions prevents these contingencies from occurring. For as we raise the expectations of the more advantaged the situation of the worst off is continuously improved. Each such increase is in the latter’s interest, up to a certain point anyway. For the greater expectations of the more favored presumably cover the costs of training and encourage better performance thereby contributing to the general advantage. While nothing guarantees that inequalities will not be significant, there is a persistent tendency for them to be leveled down by the increasing availability of educated talent and ever widening opportunities. (p. 157-158)
And against the complaint that his response makes the content of justice too dependent on contingent natural facts, Rawls has a rejoinder that I imagine the pragmatist Anderson would strongly sympathize with:
Some philosophers have thought that ethical first principles should be independent of all contingent assumptions, that they should take for granted no truths except those of logic and others that follow from these by an analysis of concepts. Moral conceptions should hold for all possible worlds. Now this view makes moral philosophy the study of the ethics of creation: an examination of the reflections an omnipotent deity might entertain in determining which is the best of all possible worlds. Even the general facts of nature are to be chosen. Certainly we have a natural religious interest in the ethics of creation. But it would appear to outrun human comprehension. From the point of view of contract theory it amounts to supposing that the persons in the original position know nothing at all about themselves or their world. How, then, can they possibly make a decision? A problem of choice is well defined only if the alternatives are suitably restricted by natural laws and other constraints, and those deciding already have certain inclinations to choose among them. Without a definite structure of this kind the question posed is indeterminate. For this reason we need have no hesitation in making the choice of the principles of justice presuppose a certain theory of social institutions. (p. 159-160)
So Rawls anticipated objections of the sort Anderson raises, and I think he effectively knocked them down. This doesn’t suffice to show that those committed to the doctrine Anderson calls democratic equality should endorse the difference principle. But at least it neutralizes the negative case against the difference principle as a requirement of democratic equality.