When you play darts, you can throw at 62 regions, * z* on the dartboard. Namely, the singles regions

*S1, …, S20*, the double regions

*D1, …, D20*, the treble regions

*T2, …, T20*and the single- and double bullseye,

*SB*and

*DB*. Every region has a numerical score,

*. Lets say, that*

**h***represents the location on the dartboard where the player aims at. In addition, we know the probability of hitting any*

**mu***when aiming for*

**z***.*

**mu**Ignoring turns, a player wants to minimize the expected number of throws to get to zero. This can be formulated as a stochastic shortest path DP. The state space for this DP is **s***{1, 2, …, 501}* and represents the players score. **i** denotes the number of darts thrown. The goal is to determine the aiming location that minimizes the expected number of throws.

When a player throws, there are three possible outcomes:

This can be formulated as:

In addition, we have a Bellman equation that satisfies

The terminal condition is equal to

```
V(0)=0
```

We can solve the DP successively for **s***= 2, …, 501*. All of the above can be found on page 13 of the following paper.

Just for my understanding, please let me know if I understand the methodology correctly:

If we have **s***= 2*, there is only one legal action, *D1*. This the only * z* where where we consider all

*p(z; mu)*for, correct? Because other possibilities of

*results in busts. And then you have basically a circular reference where the value of*

**z***V(2)*is

```
V(f(2,0) = V(2)
```

In addition, throwing *D1* when aiming for * mu* gives the terminal condition

```
V(f(2, D1)) = 0
```

So, the Bellman equation will always be equal to 1, independent of **mu**? Although, the probability of throwing D1 is larger when the player aims for D1 than for DB, the product of the probabilities and the terminal condition is always 0 (and next we add 1).So, it doesn’t matter where you aim on the dartboard when **s***= 2*, but this feels incorrect.

Please advice!