Completed: / exercises

HW4: Value iteration

Due date: 2026-04-20 23:59.
Recommended Resources:

Question 1

Choose the correct statements.
Warning: You have not logged in. You cannot answer.

Question 2

The region of inevitable collision is defined as
X_{\mathrm{ric}} = \left\{ x(0) \in X \,\middle|\, \forall \tilde{u} \in \mathcal{U}_{\infty},\ \exists\, t > 0 \text{ such that } x(t) \in X_{\mathrm{obs}} \right\}. \\\
That is the set of states that will inevitably make the robot collide with the obstacle region, regardless of what actions you apply. Imagine a car that is in the free space with an obstacle one meter in front of it. If the car is going 100 km/h, its certain that it will collide, no mater how hard you brake or turn.
Choose the correct statements.
Warning: You have not logged in. You cannot answer.

Value iteration

Lets give the recurrence equations for different variations of value iteration.
\text{a) }G_k^*(x_k) = \min_{u_k \in U(x_k)} \left\{\min_{\theta_k} \left\{ l(x_k, u_k,\theta_k) + G_{k+1}^*(x_{k+1}) \right\} \right\}. \\\
\text{b) }G_k^*(x_k) = \min_{u_k \in U(x_k)} \left\{\max_{\theta_k} \left\{ l(x_k, u_k,\theta_k) + G_{k+1}^*(x_{k+1}) \right\} \right\}. \\\
\text{c) }G_k^*(x_k) = \min_{u_k} \left\{ l(x_k, u_k) + G_{k+1}^*(x_{k+1}) \right\}. \\\
\text{d) }C_k^{*}(x_k) = \min_{u_k^{-1}\in U^{-1}(x_k)} \left\{ l(x_{k-1}, u_{k-1}) + C_{k-1}^*(x_{k-1}) \right\}. \\\
\text{e) }G_k^*(x_k) = \max_{u_k \in U(x_k)} \left\{ l(x_k, u_k) + \sum_{x_{k+1} \in X} G_{k+1}^*(x_{k+1})\, P(x_{k+1} \mid x_k, u_k) \right\}. \\\
\text{f) }G_k^*(x_k) = \min_{u_k \in U(x_k)} \left\{ l(x_k, u_k) + \sum_{x_{k+1} \in X} G_{k+1}^*(x_{k+1})\, P(x_{k+1} \mid x_k, u_k) \right\}. \\\

Question 3

Which is the recurrence equation for basic backward value iteration?
Warning: You have not logged in. You cannot answer.

Question 4

Which is the recurrence equation for a nondeterministic value iteration that is pessimistic about nature?
Warning: You have not logged in. You cannot answer.

Question 5

Which is the recurrence equation for a nondeterministic value iteration that is optimistic about nature?
Warning: You have not logged in. You cannot answer.

Question 6

Which is the recurrence function for a probabilistic value iteration?
Warning: You have not logged in. You cannot answer.

Python notebook

Download and open the following python notebook hw4_nb.ipynb
In valit_prob, each action u (graph edge) chosen by the robot, nature allows it to be chosen with probability p. The remaining possible k actions (graph edges) are each chosen with probability (1-p)/k.

Question 7

Modify the function valit_prob so that it stops if the difference between the best cost and the cost to stay is less than a given value epsilon.
How many iterations does valit_prob performs with g_length=5, p=0.8, failure_cost=1000 and epsilon = 0.01?
Warning: You have not logged in. You cannot answer.

Question 8

How many iterations does valit_prob performs with g_length=5, p=0.8, failure_cost=1000 and epsilon = 0.1?
Warning: You have not logged in. You cannot answer.

Question 9

valit_prob uses more iterations to converge if:
Warning: You have not logged in. You cannot answer.

Question 10

Look at the heatmap of the python notebook. Which are correct conclusions?
Warning: You have not logged in. You cannot answer.

Author

"Emil F. Awad"
?