What is imitation learning?

So, our environment is ready and we have explored how we can interact with it. Let’s start by trying to develop an imitation learning agent.

That being said, let’s establish the groundwork. Imitation learning is especially useful when it is easier for an actor to demonstrate the desired behaviour than to specify a reward function that would produce the same behaviour once learned. The simplest form of imitation learning is “behavioural cloning”, which is essentially learning a task through plain supervised learning. One of the earliest examples is ALVINN (1989), where a vehicle equipped with sensors learned to steer from its input data and effectively learned to drive autonomously.

So, effectively, how does this work? We watch the expert do something, turn those observations into state-action pairs, treat those pairs as labelled examples and throw some supervised learning at them. Easy, right? Well, in theory yes, but the problem is that this kind of learning is not adaptive at all: if the agent drifts out of a “known state”, it is doomed to fail because it no longer knows what to do. A single mistake can easily put our agent into a state that the expert (the one we copy) has never visited, and therefore one our agent has never been trained on. When this happens our agent is simply lost!
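
To make this concrete, here is a toy, self-contained sketch of behavioural cloning (nothing to do with Carla yet, and purely illustrative): a fake “expert” controller generates state-action pairs, and we clone it with ordinary supervised learning.

import numpy as np
from sklearn.linear_model import LinearRegression

#Toy "expert": a noisy linear controller mapping 4 observations to 1 action
rng = np.random.default_rng(0)
states = rng.uniform(-1, 1, size=(1000, 4))
expert_actions = states @ np.array([0.5, -0.2, 0.1, 0.3]) + 0.01 * rng.normal(size=1000)

#Behavioural cloning = fit a supervised model on the (state, action) pairs
clone = LinearRegression().fit(states, expert_actions)
print("Cloned controller weights:", clone.coef_.round(2))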

So, when should you use imitation learning and when not? The consensus is roughly this: use imitation learning when you need something simple and efficient, when you don’t need long-term planning, when your expert can cover most of the states your agent will face and, finally, when committing an error does not lead to dramatic outcomes such as death. It should be clear to everyone that we cannot use imitation learning alone to train real self-driving cars, as any mistake could cost the lives of our passengers.

Ok, we will probably never develop a fully functional self-driving car that knows what to do in every situation, but this looks like a challenge we might be able to tackle. What are the implications, though? Will we have to drive the agent ourselves in order to feed data into our neural network? That sounds fun but highly constraining, and I am not particularly good at driving in the Carla environment. If only we had an efficient way to witness good driving of a car in the Carla environment… But wait, there is! Carla provides an autopilot method that you can enable on the vehicle:

#enable auto pilot
ego_vehicle.set_autopilot(True)

We have our “expert model” and we know how to clone its behaviour! We are all set to start.

Gathering the training data

So what exactly are we going to do here? We create a client and a world, create and spawn a car, then create and attach a sensor to collect data for each frame (the state of the environment). Finally, we record what the autopilot is doing (the action space) and treat it as labelled data for our supervised learning model.

We therefore need :

  • RGB images from the RGB Camera
  • Information concerning the controls applied to our autopilot vehicle (Steering, Braking, Throttle)

Let’s start from the beginning and initialize the world :

# Creating a client
client = carla.Client("127.0.0.1", 2000)
client.set_timeout(10.0)
client.reload_world()
for mapName in client.get_available_maps():
    print(mapName)
vehicle = None
cam = None
world = client.get_world()

We have created our client, and the client connects to the server. The IP and port here depend on your environment; in most local setups client = carla.Client("localhost", 2000) will do. We then printed the names of the available maps and initialized a couple of variables that we will use later on. Finally we created a world object that lets us retrieve the blueprints and the different elements we want to interact with in the Carla world.

Next, we want to somehow save the state and the action of our autopilot. For the sake of simplicity we will use only the RGB camera as input and three actions as output: the steering (is the agent going left, right or straight?), the throttle (is our agent accelerating?) and the brake (is our agent braking?). At first I was saving the image feed from the RGB camera as individual image files and the controls as a .csv file.
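
Roughly, that first attempt looked like the sketch below (illustrative only; it assumes the directory, vehicle and cam objects that we create further down, and the file names are made up):

import csv
#Naive first approach: one image file per frame plus one CSV row of controls
csv_file = open(directory + "/controls.csv", "w", newline="")
csv_writer = csv.writer(csv_file)
csv_writer.writerow(["frame", "steer", "throttle", "brake"])

def save_frame_to_disk(carla_image):
    #carla.Image.save_to_disk writes the frame to an image file
    carla_image.save_to_disk(directory + "/frame_%08d.png" % carla_image.frame)
    control = vehicle.get_control()
    csv_writer.writerow([carla_image.frame, control.steer, control.throttle, control.brake])

#cam.listen(save_frame_to_disk)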

But as you can imagine this solution is heavy on storage, not very optimized from an I/O perspective and rather inefficient. So we will instead convert each image into a numpy array and append that array to an .npy file. We will therefore have:

  • One inputs.npy file that will contain every frame captured by the RGB camera, as numpy arrays
  • One outputs.npy file that will contain the associated set of controls that our autopilot agent was applying to the vehicle at the moment of each frame

As you can imagine, it is crucial to preserve the ordering of frames and controls so that supervised learning on our datasets remains possible.
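
If you have never appended several arrays to one .npy file before, this tiny demo (the /tmp path is only for the demo) shows that np.save and np.load keep the order for us:

import numpy as np

#Append two arrays to the same open file, then read them back in order
with open("/tmp/demo.npy", "wb") as f:
    np.save(f, np.array([1, 2, 3]))
    np.save(f, np.array([4, 5, 6]))

with open("/tmp/demo.npy", "rb") as f:
    print(np.load(f))   #[1 2 3]
    print(np.load(f))   #[4 5 6]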

Anyway, here is the code to create the folder, create both files and open them so they are ready to receive our data.

#Create Folder to store data
today = datetime.now()
#strftime zero-pads the hour and minute for us
directory = "/home/juliendo/TESTDATA/" + today.strftime('%Y%m%d_%H%M') + "_npy"
print(directory)

try:
    os.makedirs(directory)
except:
    print("Directory already exists")
try:
    inputs_file = open(directory + "/inputs.npy","ba+") 
    outputs_file = open(directory + "/outputs.npy","ba+")     
except:
    print("Files could not be opened")

Alright, our framework is ready; now we will spawn our car. First we retrieve the blueprint for a Tesla Model 3 from the world object we created earlier. We then set its “role_name” attribute and pick a random color for our car.

Then we retrieve the available spawn points on the map, pick one at random and spawn our vehicle there. If no spawn points are available, we log a warning.

#Spawn vehicle
#Get the blueprint concerning a tesla model 3 car
bp = world.get_blueprint_library().find('vehicle.tesla.model3')
#we attribute the role name brax to our blueprint
bp.set_attribute('role_name','brax')
#get a random color
color = random.choice(bp.get_attribute('color').recommended_values)
#put the selected color on our blueprint
bp.set_attribute('color',color)

#get all spawn points
spawn_points = world.get_map().get_spawn_points()
number_of_spawn_points = len(spawn_points)

#select a random spawn point
if 0 < number_of_spawn_points:
    random.shuffle(spawn_points)
    transform = spawn_points[0]
    #spawn our vehicle !
    vehicle = world.spawn_actor(bp,transform)
    print('\nVehicle spawned')
else: 
    #no spawn points 
    logging.warning('Could not find any spawn points')

This should do the trick. Next we want to add an RGB camera to our car. We retrieve the RGB camera blueprint from the blueprint library of our world object, then set the width and height of the images we want to capture: 200x88 pixels, which should be enough to train our model. We also define the field of view and finally create the location and rotation of our camera, i.e. where we want it to sit on the car and which way we want it to face. Finally we attach it to our car (the vehicle object).

#Adding a RGB camera sensor
WIDTH = 200
HEIGHT = 88
cam_bp = None
#Get blueprint of a camera
cam_bp = world.get_blueprint_library().find('sensor.camera.rgb')
#Set attributes 
cam_bp.set_attribute("image_size_x",str(WIDTH))
cam_bp.set_attribute("image_size_y",str(HEIGHT))
cam_bp.set_attribute("fov",str(105))
#Location to attach the camera on the car
cam_location = carla.Location(2,0,1)
cam_rotation = carla.Rotation(0,0,0)
cam_transform = carla.Transform(cam_location,cam_rotation)
#Spawn the camera and attach it to our vehicle 
cam = world.spawn_actor(cam_bp,cam_transform,attach_to=vehicle, attachment_type=carla.AttachmentType.Rigid)

Alright, that’s good. Now we need two functions: one to process the image and one to save it. First, the processing. The goal is to pass an image to the function, which will turn it into a numpy array, reshape it to the width and height we specified earlier and then do two final operations:

Drop the alpha channel from our camera output, as it is useless here, and normalize the data. Normalizing is crucial because every pixel we save will be used as a feature by our model, so we need them all on the same scale. Since a channel value can range from 0 to 255, we simply divide by 255 so that every value lands between 0 and 1:

#Function to convert image to a numpy array
def process_image(image):
    #Get raw image in 8bit format
    raw_image = np.frombuffer(image.raw_data, dtype=np.dtype("uint8"))
    #Reshape image to RGBA
    raw_image = np.reshape(raw_image, (image.height, image.width, 4))
    #Taking only RGB
    processed_image = raw_image[:, :, :3]/255
    return processed_image

Next we need to actually save the image and its associated controls into the inputs and outputs .npy files we created and opened earlier. We do this with a function that takes an image as input, calls process_image to get a processed image, retrieves the associated control data (what the autopilot is doing at the moment this frame is captured) and then saves both into our files.

#Save required data
def save_image(carla_image):
    image = process_image(carla_image)
    control = vehicle.get_control()
    data = [control.steer, control.throttle, control.brake]
    np.save(inputs_file, image)
    np.save(outputs_file, data)

Well, I think we are all set! Time to make our vehicle move and capture the data!

#enable auto pilot
vehicle.set_autopilot(True)
#Attach event listeners
cam.listen(save_image)

try:
    i = 0
    while i < 25000:
        world_snapshot = world.wait_for_tick()
        clear_output(wait=True)
        display(f"{str(i)} frames saved")
        i += 1
except:
    print('\nSimulation error.')
        
if vehicle is not None:
    if cam is not None:
        cam.stop()
        cam.destroy()
    vehicle.destroy()
inputs_file.close()
outputs_file.close()
print("Data retrieval finished")
print(directory)

There we go: we capture 25,000 frames with this piece of code (I would advise starting with something smaller to make sure everything works). Once finished, we destroy all the actors and close our inputs and outputs files. The data gathering is done!

I was fortunate enough to have access to a cloud setup to run all of this, but you should be able to run it from a terminal/cmd as well.

Now that our data gathering is done, it is time to move on to the training itself. To give you an idea of scale, I ran the data collection on maybe 40 or 50 agents, 25K frames each, and used all of it to train my model.
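
For reference, merging several collection runs into one training set can be done with something along these lines (the folder layout is the one produced by the script above; adjust the paths to yours):

import glob
import numpy as np

def load_session(path):
    #Read back every array appended to a single .npy file, in order
    arrays = []
    with open(path, "rb") as f:
        while True:
            try:
                arrays.append(np.load(f))
            except Exception:
                #End of file reached
                break
    return arrays

all_inputs, all_outputs = [], []
for session in sorted(glob.glob("/home/juliendo/TESTDATA/*_npy")):
    all_inputs += load_session(session + "/inputs.npy")
    all_outputs += load_session(session + "/outputs.npy")

input_np = np.array(all_inputs)
output_np = np.array(all_outputs)
print(input_np.shape, output_np.shape)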

Full Code :

#Dependencies
import glob
import os
import sys
import time
import numpy as np
import carla
from IPython.display import display, clear_output
import logging
import random
from datetime import datetime


vehicle = None
cam = None

#enable logging
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)

# Creating a client
client = carla.Client("127.0.0.1", 2000)
client.set_timeout(10.0)
client.reload_world()
for mapName in client.get_available_maps():
    print(mapName)
world = client.get_world()

#Create Folder to store data
today = datetime.now()
#strftime zero-pads the hour and minute for us
directory = "/home/juliendo/TESTDATA/" + today.strftime('%Y%m%d_%H%M') + "_npy"

print(directory)

try:
    os.makedirs(directory)
except:
    print("Directory already exists")
try:
    inputs_file = open(directory + "/inputs.npy","ba+") 
    outputs_file = open(directory + "/outputs.npy","ba+")     
except:
    print("Files could not be opened")
    
#Spawn vehicle
#Get the blueprint concerning a tesla model 3 car
bp = world.get_blueprint_library().find('vehicle.tesla.model3')
#we attribute the role name brax to our blueprint
bp.set_attribute('role_name','brax')
#get a random color
color = random.choice(bp.get_attribute('color').recommended_values)
#put the selected color on our blueprint
bp.set_attribute('color',color)

#get all spawn points
spawn_points = world.get_map().get_spawn_points()
number_of_spawn_points = len(spawn_points)

#select a random spawn point
if 0 < number_of_spawn_points:
    random.shuffle(spawn_points)
    transform = spawn_points[0]
    #spawn our vehicle !
    vehicle = world.spawn_actor(bp,transform)
    print('\nVehicle spawned')
else: 
    #no spawn points 
    logging.warning('Could not find any spawn points')
     
#Adding a RGB camera sensor
WIDTH = 200
HEIGHT = 88
cam_bp = None
#Get blueprint of a camera
cam_bp = world.get_blueprint_library().find('sensor.camera.rgb')
#Set attributes 
cam_bp.set_attribute("image_size_x",str(WIDTH))
cam_bp.set_attribute("image_size_y",str(HEIGHT))
cam_bp.set_attribute("fov",str(105))
#Location to attach the camera on the car
cam_location = carla.Location(2,0,1)
cam_rotation = carla.Rotation(0,0,0)
cam_transform = carla.Transform(cam_location,cam_rotation)
#Spawn the camera and attach it to our vehicle 
cam = world.spawn_actor(cam_bp,cam_transform,attach_to=vehicle, attachment_type=carla.AttachmentType.Rigid)

#Function to convert image to a numpy array
def process_image(image):
    #Get raw image in 8bit format
    raw_image = np.frombuffer(image.raw_data, dtype=np.dtype("uint8"))
    #Reshape image to RGBA
    raw_image = np.reshape(raw_image, (image.height, image.width, 4))
    #Taking only RGB
    processed_image = raw_image[:, :, :3]/255
    return processed_image

#Save required data
def save_image(carla_image):
    image = process_image(carla_image)
    control = vehicle.get_control()
    data = [control.steer, control.throttle, control.brake]
    np.save(inputs_file, image)
    np.save(outputs_file, data)
     
#enable auto pilot
vehicle.set_autopilot(True)
#Attach event listeners
cam.listen(save_image)

try:
    i = 0
    #How many frames we want to save
    while i < 25000:
        world_snapshot = world.wait_for_tick()
        clear_output(wait=True)
        display(f"{str(i)} frames saved")
        i += 1
except:
    print('\nSimulation error.')

#Destroy everything     
if vehicle is not None:
    if cam is not None:
        cam.stop()
        cam.destroy()
    vehicle.destroy()

#Close everything   
inputs_file.close()
outputs_file.close()
print("Data retrieval finished")
print(directory)

Training the model

Once we have gathered the data from our autopilot agent cruising around the Carla town, what do we do with it? Use it as labelled data for supervised training, of course! The inputs file contains our training data and the outputs file contains the associated labels: in situation A (the captured frame), the expert agent does B (throttle: 0.8, steering: 0.4, brake: 0.0).
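
A quick note before the code: the snippets in this section come from a notebook whose import cell is not shown here, so below is my reconstruction of what they rely on (TensorFlow/Keras plus scikit-learn’s train_test_split); treat it as an assumption rather than the exact original cell.

#Assumed imports for the training notebook
import numpy as np
import tensorflow as tf
from tensorflow.keras import models, Model
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split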

Alright, so what do we want to achieve in this section? First, we want to open our saved data (the inputs and outputs from the previous section). Then we load the contents into arrays and print some metrics about the training data to make sure everything is fine (note that DIRECTORY and STRATEGY are defined a bit further down, in the variables and strategy cells):

#We open the training data
INPUTS_FILE = open(DIRECTORY + "/inputs.npy","br") 
OUTPUTS_FILE = open(DIRECTORY + "/outputs.npy","br")  

#We get the data in
inputs = []
outputs = []

#We put the data into arrays
while True:
    try:
        input = np.load(INPUTS_FILE)
        inputs.append(input)
    except: 
        break
while True:
    try:
        output = np.load(OUTPUTS_FILE)
        outputs.append(output)
    except: 
        break

with STRATEGY.scope():
    input_np = np.array(inputs)
    output_np = np.array(outputs)

print(input_np.shape)    
    
#we close everything
inputs = None
outputs = None

INPUTS_FILE.close()
OUTPUTS_FILE.close()

#We take out the first 400 frames to avoid having the car idle
input_np = input_np[400:,:,:]
output_np = output_np[400:,:]

#Let's print some metrics
print("Input Shape")
print(input_np.shape)
print("-------------------------------------------------------------------------------------")
    
print("Output Shape")
print(output_np.shape)
print("-------------------------------------------------------------------------------------")

print("Input min axis 0")
print(input_np.min(axis=0))
print("-------------------------------------------------------------------------------------")

print("Input max axis 0")
print(input_np.max(axis=0))
print("-------------------------------------------------------------------------------------")

print("First line of input")
print(input_np[0])
print("-------------------------------------------------------------------------------------")

print("Output Shape")
print(output_np.shape)
print("-------------------------------------------------------------------------------------")

print("First line of output")
print(output_np[0])
print("-------------------------------------------------------------------------------------")

We also filter out the first 400 recordings, as those are usually captured while the car is just starting and not really moving; we don’t want to train on that, so let’s ditch it. Next up: the creation of the model itself!

I tried numerous types of CNNs with various numbers of layers and convolutional layers, but was not achieving much: the amount of training data required is simply huge, and it takes time to train and retrain new versions of the models. So in the end I just used the Xception model. Xception is particularly well suited to recognizing edges and patterns in images. It was built by François Chollet at Google and is an extension of the Inception architecture that replaces the standard Inception modules with depthwise separable convolutions. It is a very deep convolutional neural network with over 70 layers.
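
For intuition, the depthwise separable convolution at the heart of Xception is available directly in Keras: it first filters each channel separately, then mixes channels with a 1x1 convolution. A tiny illustration on an input the size of our frames (not part of the training code):

import tensorflow as tf

#One depthwise separable convolution on a dummy 88x200 RGB frame
x = tf.random.uniform((1, 88, 200, 3))
sep = tf.keras.layers.SeparableConv2D(filters=32, kernel_size=3, padding="same")
print(sep(x).shape)   #(1, 88, 200, 32)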

We also allow our model to start from a previously trained model, in case we want to continue training an existing one. Anyway, we first specify a strategy. As I was on a shared cloud, the first step was to limit the GPU memory that our script would consume. This is called VM etiquette: we don’t take everything, we leave some for the others.

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Create 4 virtual GPUs with 1GB memory each
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024),
         tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    tf.config.experimental.set_virtual_device_configuration(
        gpus[1],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024),
         tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

#Strategy
STRATEGY = tf.distribute.experimental.MultiWorkerMirroredStrategy()


Then we specify whether we want to load a previous model or start from scratch. We define a few variables that will come in handy for customizing our model, define the training set and the test set, and load the previous model if required. If no previous model is loaded, we create a new model, specifying of course what it expects as input (a 200x88 RGB image) and what we expect as output (our 3 actions).

Alright, once done we display a summary of the model, for fun.

#Previous model to continue training
#load_previous = None
load_previous = f"/home/juliendo/MODELS/First_model.20201011_1739_npy.testing.h5"

#Variables
MEMORY_FRACTION = 0.6
DATASET = "20201012_1653_npy_used"
DIRECTORY = f"/home/juliendo/TESTDATA/{DATASET}"
WIDTH = 200
HEIGHT = 88
EPOCHS=10
MODEL_NAME = "Xception"
TRAINING_BATCH_SIZE = 16

#Split of the training data
with STRATEGY.scope():
    x_train, x_test, y_train, y_test = train_test_split(input_np, output_np)
    
#do we start from scratch or not ?
with STRATEGY.scope():
    if load_previous is not None:
        model = models.load_model(load_previous)
        print(f"loaded model:{load_previous}")
    else:
              
        #BRAX Model
        base_model= Xception(weights=None, include_top=False, input_shape=(HEIGHT, WIDTH,3))
        x = base_model.output
        x = GlobalAveragePooling2D()(x)
        #3 actions, throttle, brake, steering
        predictions = Dense(3, activation="linear")(x)
        model = Model(inputs = base_model.input, outputs = predictions)      
        
model.summary()

And finally we compile the model (only if it is a new one) and train it with a call to .fit!

#We compile the model if we start from scratch
with STRATEGY.scope():
    if load_previous is None:
        #model.compile(optimizer=OPT, loss="categorical_crossentropy", metrics=['accuracy'])
        model.compile(loss="mse", optimizer=Adam(lr=0.001), metrics=["accuracy"])
    my_callbacks = [
        #tf.keras.callbacks.EarlyStopping(patience=10, monitor='val_loss'),
        #tf.keras.callbacks.ModelCheckpoint(filepath='models/model.' + DATASET + '.{epoch:02d}.{val_accuracy:.2f}-{val_loss:.2f}.h5', save_weights_only=False),
        tf.keras.callbacks.TensorBoard(log_dir='./logs'),
    ]
    
#my_callbacks = []
#WE TRAIN ! 
with STRATEGY.scope():
    #history = model.fit(x_train, y_train, epochs=EPOCHS, validation_data=(x_test, y_test), callbacks=my_callbacks)
    history = model.fit(x_train, y_train, epochs=EPOCHS, batch_size=TRAINING_BATCH_SIZE, verbose=0, shuffle=False, callbacks=my_callbacks)
    
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(test_loss, test_acc)
#model_file = f"models/workshop_model.{DATASET}.{test_acc:.2f}-{test_loss:.2f}.h5"
model_file = f"/home/juliendo/MODELS/First_model.{DATASET}.testing.h5"
model.save(model_file)
print(model_file)

Once I had the script working and the training actually running, I played with the parameters of the model and finally settled on the Adam optimizer with a learning rate of 0.001 and a linear activation function in the final dense layer.

Now it is time to test the model.

Testing the model

We have a model; now, how do we test it? Testing it in a Jupyter notebook seemed like a good choice, as I was working in a cloud environment.

I will not explain in detail the Jupyter notebook I used to test the model, as it is quite straightforward and rather long. You can find it on my git repo (link).

We start by importing a whole bunch of packages and defining some variables. Then we create a class Xview in order to display, inside our Jupyter notebook, what is happening to the dear agent we are about to test. Please refer to the Jupyter notebook on the git for this part.
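
If you only need something that satisfies the calls made below, a hypothetical minimal stand-in for Xview could look like this (the real class in the repo does more, and its last argument may behave differently):

import matplotlib.pyplot as plt
from IPython.display import clear_output

class Xview:
    #Minimal display helper: show two sensor images side by side in the notebook
    @staticmethod
    def img_show2(title, img_left, img_right, delay_ms=100):
        clear_output(wait=True)
        fig, axes = plt.subplots(1, 2, figsize=(8, 3))
        fig.suptitle(title)
        axes[0].imshow(img_left)
        axes[1].imshow(img_right)
        for ax in axes:
            ax.axis("off")
        plt.show()
        plt.close(fig)
        #delay_ms is accepted only to match the real class' signature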

Next we want to create our agent, and this part is interesting, so let’s talk about it:

    
class ModelAgent(Agent):
    def __init__(self, model_path):
        self.model = keras.models.load_model(model_path)
        print("model loaded")
    
    @staticmethod
    def process_image(image):
        #raw_image = np.frombuffer(image.raw_data, dtype=np.dtype("uint8"))
        #raw_image = np.reshape(raw_image, (HEIGHT, WIDTH, 4))
        processed_image = image[:, :, :3]
        return processed_image
        
    def run_step(self, measurements, sensor_data, directions, target):
        current_speed = measurements['speed']
        #Normalize the frame exactly like during training
        input = sensor_data.camera['rgb']/255
        #The model predicts the three controls for this single frame
        [[steer, throttle, brake]] = self.model.predict(np.array([input]))
        #IPython.display.display(f"Steer: {steer}\t Throttle: {throttle}\t Brake:{brake}")
        control = carla.VehicleControl()
        Xview.img_show2('img-from-sensor', sensor_data.camera['depth'], sensor_data.camera['semantic'], 100)
        control.throttle = throttle.item()
        control.steer = steer.item()
        #Braking is disabled for now: the predicted brake kept the car frozen
        control.brake = 0

        #These local tweaks never reach 'control', so they currently have no effect
        if brake < 0.1:
            brake = 0.0

        if throttle > brake:
            brake = 0.0

        #Crude attempt at limiting the speed ('acc' is never applied either)
        if current_speed > 10.0 and brake == 0.0:
            acc = 0.0

        control.hand_brake = 0
        control.reverse = 0

        return control
    
print('Task Done')

Our agent starts by loading the model we want to test. We also define a processing method that keeps only the RGB channels of an image, and then a run_step function that takes as input the current measurements, the data from the sensors, the directions and the target (we do not use these last two parameters).

We retrieve the current speed from the measurements and build our input from the image captured by the RGB camera (normalized, of course). We then store in steer, throttle and brake the values predicted by our model for that image. Remember, we are doing supervised learning: for each frame captured by the RGB camera, our model spits out values for those three actions based on what the autopilot expert was doing.
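
To make the shapes explicit, the interaction with the saved model boils down to this offline sanity check (the model path is the one from my setup, and a random array stands in for a real frame):

import numpy as np
from tensorflow import keras

#One 88x200 RGB frame scaled to [0, 1] in, three numbers out
model = keras.models.load_model("/home/juliendo/MODELS/First_model.20201012_1653_npy.testing.h5")
dummy_frame = np.random.rand(1, 88, 200, 3)
steer, throttle, brake = model.predict(dummy_frame)[0]
print(steer, throttle, brake)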

We then apply those controls to our agent and tweak them a little to improve the driving: I simply set the brake, hand_brake and reverse options to 0, for instance, and tried to cap the speed in order to get some results. The function returns the control values for our agent.

Next we create the world and the sensors. And finally the client :

log_level = logging.DEBUG
logging.basicConfig(format='%(levelname)s: %(message)s', level=log_level)
world = None

try:
    logging.info('BRAX Starting test BRAX')
    client = carla.Client('127.0.0.1', 2000)
    client.set_timeout(5.0)
    world = World(client) 
    #world.player.set_autopilot(True)
    #agent = ForwardAgent()
    agent = ModelAgent("/home/juliendo/MODELS/First_model.20201012_1653_npy.testing.h5")
        
    while True:
        world.render()
        sensor_data = world.get_sensors()
        measurements = world.get_measurements()
        control = agent.run_step(measurements, sensor_data, None, None)
        world.player.apply_control(control)
        
except KeyboardInterrupt:
    logging.info('\nTschussikovski!')
    if world is not None:
        world.destroy()

And there we go! We connect our client, specify the agent we want to use (based on the model trained in the previous step) and initialize the world. The sensors stream their data, we retrieve the current measurements and, based on those, we get the controls predicted by our model through the agent.run_step function. We then apply those controls to our car.

And that’s it! The Jupyter notebook is over-complicated at the moment: it contains multiple sensor feeds that are not really used, as well as other types of agents that are also unused. It could clearly be simplified, and if I find the time I will probably make a lightweight script version.

Alright, but what are the results of all this? Well, at the beginning the car would only go right or left. This was due to various errors in the data collection process (normalization, for instance). Then the agent braked all the time and did not move at all, so I just disabled the brake.

And finally, after around 50 GB of data, the agent made its first big step: it was no longer only going left or right! The agent was able to enter a roundabout and actually follow the curve. Ultimately it could not get out of the roundabout and crashed near the exit, but it was very clear that some progress had been made: the agent understood there was a roundabout, entered it and followed the curve.

After a further 200 GB of training data, the agent is now able to drive for longer without crashing. It still crashes, and does very badly when it is raining or dark, as it was mostly trained on sunny days. We also clearly see that once it finds itself in a situation it never explored during training, it panics and does not know what to do. Nevertheless the agent has improved a lot since the first version, which only went left or right, and this shows that imitation learning actually works.

Subsequent examples of our Agent driving with our model :

Next Steps

In order to further improve the imitation learning model, multiple options are available. The first one would be to add new sensor types. Currently the agent uses only the RGB camera as input; while useful, it permits only a limited interpretation of the state of the world and of what is in front of the car. Incorporating a semantic segmentation camera and a depth camera into the training data (see the sketch below) would have a big impact on the model’s ability to drive safely in the Carla town.
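
For reference, attaching those extra cameras mirrors the RGB setup from the data-gathering script; this sketch reuses world, WIDTH, HEIGHT, cam_transform and vehicle from there and is untested in this exact form:

#Semantic segmentation and depth cameras, attached like the RGB one
seg_bp = world.get_blueprint_library().find('sensor.camera.semantic_segmentation')
depth_bp = world.get_blueprint_library().find('sensor.camera.depth')
for extra_bp in (seg_bp, depth_bp):
    extra_bp.set_attribute("image_size_x", str(WIDTH))
    extra_bp.set_attribute("image_size_y", str(HEIGHT))
    extra_bp.set_attribute("fov", str(105))

seg_cam = world.spawn_actor(seg_bp, cam_transform, attach_to=vehicle,
                            attachment_type=carla.AttachmentType.Rigid)
depth_cam = world.spawn_actor(depth_bp, cam_transform, attach_to=vehicle,
                              attachment_type=carla.AttachmentType.Rigid)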

After adding those new sensors we would have to train a new model from scratch and gather extensive data, in all conditions (not only when it is sunny; see the weather sketch below) and from multiple concurrent agents, in order to train a good self-driving imitation learning model.
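
Varying the conditions between runs is cheap in Carla; a small sketch, again assuming the world object (and carla import) from the data-gathering script:

import random
#Pick a different weather preset for each collection run
presets = [carla.WeatherParameters.ClearNoon,
           carla.WeatherParameters.WetNoon,
           carla.WeatherParameters.HardRainNoon,
           carla.WeatherParameters.ClearSunset]
world.set_weather(random.choice(presets))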

Another option would be to extend how current parameters and measurements are integrated into the decision for the next action. In other words: how can we use the current speed, steering and brake to decide how those parameters should change in the next frame?
The following diagram describes how our imitation learning model behaves today and how it could behave tomorrow: green represents what it does today, orange what should be added in the future (a rough sketch of such a model follows).
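
A rough sketch of what that “tomorrow” architecture could look like (purely illustrative; the 16-unit measurement branch is an arbitrary choice):

from tensorflow.keras import Model
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import Concatenate, Dense, GlobalAveragePooling2D, Input

#Image branch as before, plus a small branch for the current measurements
image_in = Input(shape=(88, 200, 3))
meas_in = Input(shape=(4,))   #speed, steer, throttle, brake
x = Xception(weights=None, include_top=False)(image_in)
x = GlobalAveragePooling2D()(x)
x = Concatenate()([x, Dense(16, activation="relu")(meas_in)])
predictions = Dense(3, activation="linear")(x)
model = Model(inputs=[image_in, meas_in], outputs=predictions)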

An obvious last step would be to improve the level of automation and scalability. Implementing threading and data collection pipelines across multiple concurrent agents, along with automated training of the model on the dynamically collected data, would tremendously improve the speed at which we can launch and train new models while expanding the training data we collect. Introducing a UI to efficiently track the running agents and display clear metrics about the models being trained would also clearly be a path worth exploring.

Alright, in this section we have shown that imitation learning works in the context of self-driving cars in Carla. Next time we will try to actually bring some reinforcement learning into it.


Brax

Dude in his 30s starting his digital notepad