
@Article{kraines-1993a,
  author = {David Kraines and Vivian Kraines},
  title = {Learning to Cooperate with Pavlov: An adaptive strategy for the Iterated Prisoner's Dilemma with Noise},
  year = {1993},
  volume = {35},
  value = {aa},
  review-dates = {2005-02-26},
  pages = {107--150},
  read-status = {reviewed},
  journal = {Theory and Decision},
  hardcopy = {no},
  key = {kraines-1993a}
}

Pavlov is successful against other strategies, even those that are not rational or not cooperative. It succeeds wherever TFT does, and in additional cases. However, it is not ideal by Kraines' criteria for an ideal strategy, since it a) can be exploited, esp. early in a limited game, and b) may exploit others.

Denotes Pavlov variants as P*n*, where P3 and P4 are the most
generally recommended strategies. P3 and P4 are fast learners but,
unlike TFT and lower-order Pavlovs (P1 and P2), are not hopelessly
prone to reactionary defections. The article then elaborates on this,
using P(k,n) to denote a fully specified strategy with learning step
*1/n*, where *k* determines the initial probability of cooperation,
which is *k/n*.
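
A minimal sketch of how a P(k,n) player might be simulated. The update rule is an assumption based on the usual win-stay/lose-shift reading of Pavlov (move one step of 1/n toward cooperation after mutual cooperation or mutual defection, one step away after either unilateral defection); the article itself should be consulted for the exact rule. The class and method names are illustrative, not taken from the article.

```python
import random


class Pavlov:
    """Sketch of P(k,n): cooperate with probability level/n, starting at k/n."""

    def __init__(self, k, n, rng=None):
        if not 0 <= k <= n:
            raise ValueError("need 0 <= k <= n")
        self.n = n
        self.level = k  # cooperation probability is level / n
        self.rng = rng or random.Random()

    @property
    def p_cooperate(self):
        return self.level / self.n

    def move(self):
        return "C" if self.rng.random() < self.p_cooperate else "D"

    def learn(self, own, other):
        # Assumed rule: matching moves (CC or DD) push toward cooperation,
        # mismatched moves (CD or DC) push toward defection.
        step = 1 if own == other else -1
        self.level = min(self.n, max(0, self.level + step))
```

Under this reading, `Pavlov(1, 1)` behaves like the deterministic P1: it cooperates after CC or DD and defects after CD or DC. Larger *n* makes each outcome shift the probability by a smaller amount.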

The chance of meeting again, or discount rate, is again specified
as *w*.
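
Assuming *w* is the probability that each round is followed by another (the standard reading, consistent with "chance of meeting again"), the expected game length and the discounted total payoff can be written as follows. The symbols $u_t$ (payoff in round $t$) and $V$ are illustrative notation, not taken from the article.

```latex
\mathbb{E}[\text{rounds}] = \sum_{t=0}^{\infty} w^{t} = \frac{1}{1-w},
\qquad
V = \sum_{t=0}^{\infty} w^{t}\, u_{t},
\qquad 0 \le w < 1 .
```

For example, $w = 0.9$ gives an expected 10 rounds.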

Per Kraines' definition (p 114), an *ideal strategy* will:

- Approach the average score of mutual cooperation (+1; Kraines have different payoffs, see kraines-1989a review)
- In the long run, have an average difference in payoffs that is 0 or in the player's favor
- Quickly recover from noise (perceived defections); it can re-learn cooperation

Kraines' definition of a *fraternal strategy* (p 115) is that,
given a sequence of outcomes, the players will reach mutual
cooperation after some finite period. [also defines clan]. All-C and
TFT are fraternal.

All-D outscores PT in the short-run (p 135, p 136).

Notes prior applications / simulations:

- Trench warfare - axelrod-1984a
- Tree swallows - Lombardo 1985
- International trade and tariffs - numerous, no cites
- *More examples* - axelrod-1987a

According to patchen-1987a, [exploitation usually occurs in the absence of retaliation.] (quote or paraphrase?)

Notes that TFT can never outscore any opponent and does poorly against its clone in a noisy environment. TFT is not fraternal with itself (presumably under noise, that is, or a variant with initial defection).
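
A small deterministic sketch of why TFT struggles against its clone once noise enters: a single flipped move is echoed back and forth indefinitely. The function names and the choice of which round is flipped are illustrative.

```python
def flip(move):
    return "D" if move == "C" else "C"


def tft_vs_tft(rounds, error_round):
    """Two TFT players; player A's move is flipped once at error_round."""
    history = []
    a_prev, b_prev = "C", "C"  # both open as if the past were cooperative
    for t in range(rounds):
        a, b = b_prev, a_prev  # each copies the other's previous move
        if t == error_round:
            a = flip(a)  # the single noisy move
        history.append(a + b)
        a_prev, b_prev = a, b
    return history


print(tft_vs_tft(rounds=8, error_round=2))
# → ['CC', 'CC', 'DC', 'CD', 'DC', 'CD', 'DC', 'CD']
```

After the one error the pair alternates between DC and CD and never returns to mutual cooperation without a second error.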

Abstract claims that Pavlov is not forgiving because it "will exploit altruistic strategies until punished by mutual defection." [But it is forgiving if it doesn't learn too fast and cooperation has already been established]. Can think of:

- All-D as "short-sighted"
- NT as "far-sighted"
- Pavlov as "hind-sighted"

Recap of strategies discussed:

- *P(k,n)*: Pavlov strategy with learning step 1/n and initial probability k/n of cooperation; kraines-1989a, kraines-1991a
- *All-D*: Always defect; p=0
- *All-C*: Always cooperate; p=1
- *Cp*: Cooperate with probability p
- *TFT*: Tit-for-tat
- *TFTT*: Tit-for-two-tats
- *PT*: Perfect Trainer: If opponent is more likely to Cooperate, then play C. Otherwise play D.
- *NT*: Normal Trainer: Approximate but more realistic version of PT
- *E(k,n)*: Evolutionary strategy of n individuals k of whom cooperate
- *EF*: Error Forgiving; molander-1986a
- *CC*: Conditional Cooperating; mueller-1987a
- *NS(y,p,q)*: Reactive strategy; nowak-1989a
- *Downing and C-Downing*: axelrod-1984a and donninger-1986a
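
The simpler strategies in this recap can be sketched as functions of the opponent's move history. This is an illustrative sketch using the textbook definitions (TFT and TFTT open with cooperation); the function names and the noise-free `play` helper are hypothetical. The learning and trainer strategies (P(k,n), PT, NT, E(k,n)) are omitted because they need more state than the opponent's history alone.

```python
import random


def all_d(opp_history):
    return "D"


def all_c(opp_history):
    return "C"


def tft(opp_history):
    # Cooperate first, then copy the opponent's last move.
    return opp_history[-1] if opp_history else "C"


def tftt(opp_history):
    # Defect only after two consecutive defections by the opponent.
    return "D" if opp_history[-2:] == ["D", "D"] else "C"


def make_cp(p, rng=None):
    """Build a Cp player that cooperates with probability p each round."""
    rng = rng or random.Random()

    def cp(opp_history):
        return "C" if rng.random() < p else "D"

    return cp


def play(strategy_a, strategy_b, rounds):
    """Return the list of (a_move, b_move) pairs for a noise-free match."""
    seen_by_a, seen_by_b = [], []  # each player's record of the other's moves
    outcomes = []
    for _ in range(rounds):
        a, b = strategy_a(seen_by_a), strategy_b(seen_by_b)
        seen_by_a.append(b)
        seen_by_b.append(a)
        outcomes.append((a, b))
    return outcomes
```

For instance, `play(tft, all_d, 4)` has TFT cooperate once and then defect, while `play(tftt, all_d, 4)` has TFTT cooperate twice before it starts defecting.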

This article has an appendix describing Markov chain methods used.
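
As a rough illustration of the Markov chain approach (not the appendix's own derivation), the sketch below builds the four-state chain over joint outcomes for two identical players and approximates its long-run distribution. Assumptions: noise is modeled as each move being flipped independently with a fixed probability, P1 is read as deterministic win-stay/lose-shift, and all names are hypothetical. Higher-order Pavlovs would need a larger state space that also tracks each player's cooperation probability.

```python
STATES = ["CC", "CD", "DC", "DD"]  # (player A's move, player B's move)

# Intended next joint move in each state, before noise is applied.
TFT_NEXT = {"CC": "CC", "CD": "DC", "DC": "CD", "DD": "DD"}
P1_NEXT = {"CC": "CC", "CD": "DD", "DC": "DD", "DD": "CC"}


def transition_matrix(next_state, noise):
    """Row-stochastic matrix; each move flips independently with prob. noise."""
    matrix = []
    for state in STATES:
        intended = next_state[state]
        row = []
        for actual in STATES:
            prob = 1.0
            for want, got in zip(intended, actual):
                prob *= (1 - noise) if want == got else noise
            row.append(prob)
        matrix.append(row)
    return matrix


def stationary(matrix, iterations=2000):
    """Approximate the long-run distribution by repeated multiplication."""
    dist = [1.0, 0.0, 0.0, 0.0]  # start in mutual cooperation
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return dict(zip(STATES, dist))


tft_long_run = stationary(transition_matrix(TFT_NEXT, noise=0.05))
p1_long_run = stationary(transition_matrix(P1_NEXT, noise=0.05))
```

Under these assumptions and 5% noise, TFT against its clone ends up spread evenly over the four outcomes, whereas P1 against its clone spends roughly 82% of rounds in mutual cooperation.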

**!!** NT similar to L. Downing. Downing and Downing-C strategies were very effective under noise and variable payoff - via axelrod-1984a and donninger-1986a.

Many references, including:

- *simpleton* or P1 - rapoport-1965a
- Continuum of payoffs (but not amenable to Markov chain analysis) - may-1989a, may-1991a
- w.r.t. IPD with noise: "there is a tradeoff: unnecessary conflict can be avoided by generosity, but generosity invites exploitation." - axelrod-1988a
- Error Forgiving (EF) (exploitable) - molander-1985a
- Conditional Cooperating (CC) (exploitable) - mueller-1987a
- Stereo wars - opp-1988a
- Bounded rationality and social learning leads to altruism [or cooperation?]: - simon-1991a

Problem Addressed: What is a natural model for real-life conflict-of-interest encounters that will lead to mutual cooperation? How does one tune P(k,n) to make it a nearly ideal cooperator?

Main Claim and Evidence: Pavlov is close to an ideal strategy, and P3 or P4 are the best candidates for the ideal. Generally, a Pavlov should start out fairly cooperative, such as P(n-2,n). Pavlov can be exploited and will exploit, however, but no known simple strategy does better. Possibly, in the face of noise, an NT-like strategy, such as Downing's entries in axelrod-1984a or donninger-1986a, would be the nearest to ideal.

Assumptions:

- Only strategies amenable to Markov chain analysis were analyzed, though this doesn't seem to be a major problem.

Next Steps: None noted.

Remaining Open Questions:

- Still some question whether Pavlov is a good model for biological entities, especially complex ones.
- Why is aiming for cooperation the ideal, when the goal of individuals would be to maximize the outcome of their strategy? Aren't All-C chumps just an invitation to evil? For instance, should a criterion for the ideal be that there are no non-retaliating strategies left?

Contribution/Significance is outstanding.

Quality of organization is outstanding.

Quality of writing is outstanding.
