An artificial intelligence program that plays single-player Pong.
Pong AI is a single-agent AI trained with Q-Learning, a reinforcement learning algorithm. Q-Learning is very simple. All we need is a set of states, a set of actions the agent can take, and a metric to measure the ‘quality’ of the agent’s decisions. For the states, we discretize the board into a 12×12 grid, discretize the position of the agent into 12 positions, track the direction of the ball (left or right), and account for the possible actions the agent can take (up/down/stay). Thus our state space is (144 × 2 × 3 × 12) + 1 = 10369, with the extra 1 being the state where the ball has passed the agent. For actions, the agent can move up, move down, or stay at its current position. For the metric, we give +1 point every time the agent successfully bounces the ball, and -1 point every time the agent misses the ball.
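To make the counting and the learning rule concrete, here is a minimal sketch in Python. It is not the project's actual code: it assumes the 10369 entries are stored as one flat Q-table (one value per board cell, ball direction, paddle position, and action, plus one terminal entry), and the names `index`, `update`, `choose_action`, as well as the values of `ALPHA`, `GAMMA`, and `EPSILON`, are hypothetical choices made for illustration.

```python
import random

GRID = 12               # board is discretized into a 12 x 12 grid
PADDLE_POSITIONS = 12   # discretized paddle positions
DIRECTIONS = 2          # ball moving left (0) or right (1)
ACTIONS = 3             # 0 = up, 1 = stay, 2 = down

# (144 x 2 x 3 x 12) regular entries, plus one for "ball passed the agent".
TERMINAL = GRID * GRID * DIRECTIONS * ACTIONS * PADDLE_POSITIONS  # 10368
TABLE_SIZE = TERMINAL + 1                                         # 10369

# Assumed hyperparameters; the original text does not state them.
ALPHA = 0.1     # learning rate
GAMMA = 0.9     # discount factor
EPSILON = 0.05  # exploration probability


def index(ball_x, ball_y, direction, paddle, action):
    """Map a discretized situation and an action to a slot in the flat table."""
    cell = ball_y * GRID + ball_x
    return ((cell * DIRECTIONS + direction) * PADDLE_POSITIONS + paddle) * ACTIONS + action


def best_value(q, state):
    """Highest Q-value reachable from `state`; `None` means the ball was missed."""
    if state is None:
        return q[TERMINAL]
    return max(q[index(*state, a)] for a in range(ACTIONS))


def choose_action(q, state, rng=random):
    """Epsilon-greedy choice: mostly the best known action, sometimes a random one."""
    if rng.random() < EPSILON:
        return rng.randrange(ACTIONS)
    values = [q[index(*state, a)] for a in range(ACTIONS)]
    return values.index(max(values))


def update(q, state, action, reward, next_state):
    """Standard Q-Learning update toward reward + discounted best next value."""
    i = index(*state, action)
    target = reward + GAMMA * best_value(q, next_state)
    q[i] += ALPHA * (target - q[i])


q = [0.0] * TABLE_SIZE  # every entry starts at zero
```

In a training loop, each step would call `choose_action`, advance the game, and then call `update` with a reward of +1 for a bounce, -1 for a miss (passing `None` as the next state), and 0 otherwise.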
As we can see, it is surprisingly easy to model. Training takes about 20 minutes, or around 100,000 iterations, to reach 8 bounces per ball. A quick video is available on my YouTube channel.