DeepRacer and learning

At a conference we learned about reinforcement learning by racing with DeepRacer. Our algorithm was vanilla policy gradient: we have a model that is trained with positive reinforcement, receiving a reward each time it does something right.
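To make the idea of "a reward each time it does something right" concrete, here is a minimal sketch of a vanilla policy gradient update on a toy problem. It is not the DeepRacer training code: the two actions, the reward probabilities, the learning rate and the function names are all assumptions chosen for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities that sum to 1."""
    highest = max(prefs)
    exps = [math.exp(p - highest) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=2000, lr=0.1, seed=0):
    """Toy vanilla policy gradient with two possible actions."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]        # one preference per action, both equal at the start
    reward_prob = [0.2, 0.8]  # assumption: action 1 is "right" more often
    for _ in range(steps):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        # Gradient of log pi(action) with respect to prefs[i]
        # is (1 if i == action else 0) - probs[i].
        for i in range(len(prefs)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            prefs[i] += lr * reward * grad
    return softmax(prefs)


policy = train()
print(policy)  # probabilities for action 0 and action 1 after training
```

Every rewarded action becomes a little more likely the next time, so after training the model should prefer the action that is rewarded more often.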

Well, we are not dogs or cats, but it seems humans experience positive rewards in the same way animals do. We started by learning some core concepts of psychology to understand the purpose of the simulation and the training of our models.

The reinforcement strategy is used only while training and creating the model, not while we are in the race.

RL vs robotic racing

In the first one we collect data by observing a driver as they drive. In the second one we control the movements in a simulation and extract the data, which is explored later.

Using different measurements from the environment, participants can practice with their own virtual model and explore the training before the big race.
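In DeepRacer, those measurements reach the participant through a reward function that receives a dictionary of values from the simulator. The sketch below is only an illustration, not the function we used: it relies on three of the standard input parameters (`all_wheels_on_track`, `distance_from_center`, `track_width`), and the scoring rule itself is an assumption.

```python
def reward_function(params):
    """Reward the car for staying on the track and close to the center line."""
    # Measurements supplied by the simulator at each step
    all_wheels_on_track = params["all_wheels_on_track"]
    distance_from_center = params["distance_from_center"]
    track_width = params["track_width"]

    # Leaving the track earns almost nothing
    if not all_wheels_on_track:
        return 1e-3

    # Illustrative rule (an assumption): 1.0 on the center line,
    # falling linearly towards the edge of the track
    half_width = 0.5 * track_width
    closeness = max(0.0, 1.0 - distance_from_center / half_width)
    return float(max(1e-3, closeness))
```

For example, a car exactly on the center line with all wheels on the track gets the full reward, while a car that has left the track gets only the minimum.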

Published by


Journalist. Student of Global Media Communication. Interested in Politics, Economy, Social Media, Technology. Feminist. Like walking, talking and swimming.
