How is “policy iteration” used in Markov Decision Processes?
a) To minimize the number of states
b) To iteratively calculate the shortest path
c) To find an optimal policy by repeatedly improving a given policy
d) To determine the communication classes
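
For reference, the idea described in option c) can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the two-state MDP `P`, the discount factor `gamma=0.9`, and the helper names `evaluate_policy`, `q_value`, and `policy_iteration` are all assumptions made up for this example. The loop alternates between evaluating the current policy and improving it greedily, and stops when the policy no longer changes.

```python
# Hypothetical toy MDP (not from the question): P[s][a] is a list of
# (probability, next_state, reward) outcomes for taking action a in state s.
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 1.0)]],  # state 0: stay (reward 0) or move to 1 (reward 1)
    [[(1.0, 0, 0.0)], [(1.0, 1, 2.0)]],  # state 1: move to 0 (reward 0) or stay (reward 2)
]


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following values V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate_policy(P, policy, gamma, tol=1e-10):
    """Policy evaluation: sweep the states until the value estimates stop changing."""
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            v = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * len(P)  # arbitrary starting policy: action 0 everywhere
    while True:
        V = evaluate_policy(P, policy, gamma)
        # Policy improvement: pick the action with the highest one-step lookahead value.
        new_policy = [
            max(range(len(P[s])), key=lambda a: q_value(P, V, s, a, gamma))
            for s in range(len(P))
        ]
        if new_policy == policy:
            return policy, V
        policy = new_policy


if __name__ == "__main__":
    best_policy, values = policy_iteration(P)
    print(best_policy, values)
```

In this toy problem the starting policy (action 0 in both states) earns nothing, and a single improvement step switches both states to action 1, after which the policy is stable.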