From YOLO annotation to using the weights with darknet_ros

Anna Carolina Soares Medeiros
4 min read · Aug 31, 2020

This is a step by step explanation on:

  1. How to annotate images for YOLO processing
  2. How to train YOLO V3 TINY
  3. How to test the weights, asap
  4. How to use the weights with darknet_ros

In practice, I used these steps to train object recognition for drones. I am using Ubuntu 16.

How to annotate images for YOLO processing

I assume you have a video of the object you want to recognize. Make sure the object is recorded under different lighting conditions and from different angles, so YOLO can recognize it well.

First you need to extract frames from the video, you can use ffmpeg for that:

ffmpeg -i myfile.mp4 extracted_frames%04d.jpg -hide_banner
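If the video is long, you probably don't need every frame. ffmpeg's fps filter can subsample; for example, to keep 5 frames per second (pick a rate that suits your video):

ffmpeg -i myfile.mp4 -vf fps=5 extracted_frames%04d.jpg -hide_banner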

You can download ffmpeg here.

Download the YOLO Annotation Tool, extract it, and put the extracted frames under the folder:

Yolo-Annotation-Tool-New-/Images

Make sure to edit the file “classes.txt”, with the classes you want to recognize. Classes are the “types” of objects you want to recognize, e.g. window, human, etc.

Make sure to put one class per line.
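For example, a classes.txt for the two classes mentioned above might look like this:

window
human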

To start annotating, open a terminal and run:

cd Yolo-Annotation-Tool-New-/
python main.py

A GUI should open showing your frames, and you can start drawing bounding boxes on them.

For every frame you draw boxes on, a label file is created under the "Labels" folder. After you finish with every frame, make sure to put the frames and labels in the same folder (you can copy them both into a new folder).
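For reference, each YOLO label file has one line per box in the form "<class_id> <x_center> <y_center> <width> <height>", with coordinates normalized to the image size. Assuming the tool writes this format directly, a single "window" box (class 0) might produce a line like (numbers are illustrative):

0 0.512 0.430 0.210 0.180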

OK, now edit "process.py" with the path to your folder (labels + images) and run it. This will create train.txt and test.txt listing your image files.
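Both files simply list one image path per line. Assuming your combined folder is called "data" (the name is up to you) and sits inside the darknet folder, train.txt might start like:

data/extracted_frames0001.jpg
data/extracted_frames0002.jpg
data/extracted_frames0003.jpg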

How to train YOLO V3 TINY

Let's download Darknet, extract it, and copy the following inside "/darknet" (example contents for the last two files follow the list):

  • your folder (labels+images)
  • train.txt
  • test.txt
  • obj.names
  • obj.data
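Here is a minimal sketch of those last two files, assuming the two classes from earlier and a "backup" folder where Darknet saves weights. obj.names lists one class name per line, in the same order as classes.txt:

window
human

obj.data tells Darknet where everything lives:

classes = 2
train = train.txt
valid = test.txt
names = obj.names
backup = backup/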

Now we have to edit the .cfg of a specific YOLO version. I will demonstrate with YOLO v3 tiny; please take a look at the comments:

[net]
# Testing
#batch=1 ##### <<<< uncomment these two lines to train,
#subdivisions=1 ##### <<<< uncomment these two lines to train
# Training
batch=32 ##### <<<< comment these two lines to test,
subdivisions=8 ##### <<<< comment these two lines to test
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1
[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=1
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[convolutional]
size=1
stride=1
pad=1
#filters=255
filters=21 ############# 3x (num_of_classes+5) <<<<<< edit here
activation=linear
[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
#classes=80
classes=2
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
[route]
layers = -4
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[upsample]
stride=2
[route]
layers = -1, 8
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[convolutional]
size=1
stride=1
pad=1
#filters=255
filters=21 ############# 3x (num_of_classes+5) <<<<< edit here
activation=linear
[yolo]
mask = 0,1,2
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
#classes=80
classes=2
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
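A quick sanity check on the two "edit here" lines: with classes=2, filters = 3 x (2 + 5) = 21, which matches the value above. For a single class you would set classes=1 and filters = 3 x (1 + 5) = 18, in both [yolo] sections and the [convolutional] layers right before them.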

After you modify the .cfg you can train:

./darknet detector train obj.data yolov3_tiny_1class.cfg darknet19_448.conv.23

Additional comments: you have to download "darknet19_448.conv.23" (the pre-trained convolutional weights), and also edit the Makefile with your GPU settings before compiling.
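As a sketch, the relevant switches sit at the top of the Makefile. Enabling GPU support typically looks like this (ARCH must match your card's compute capability; sm_61 here is just an example):

GPU=1
CUDNN=1
OPENCV=1
ARCH= -gencode arch=compute_61,code=sm_61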

How to test the weights, asap

After training the desired number of batches (4000 is a good point to take a first look at your weights), we can test the weights on an image (remember to switch the .cfg back to testing mode: uncomment batch=1 and subdivisions=1 and comment out the training values):

./darknet detector test obj.data yolov3_tiny_1class.cfg yolov3_tiny_1class_4000.weights testingimage.jpg

or test with a video:

./darknet detector demo obj.data yolov3_tiny_1class.cfg yolov3_tiny_1class_4000.weights testingvideo.avi -i 0 -out_filename output.avi

How to use the weights with darknet_ros

Download darknet_ros here.

catkin build it in your ROS workspace.
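Assuming a standard workspace at ~/catkin_ws, the build step might look like this (the darknet_ros README recommends a Release build for detection speed):

cd ~/catkin_ws
catkin build darknet_ros -DCMAKE_BUILD_TYPE=Release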

Copy your .cfg file to:

darknet_ros/darknet_ros/yolo_network_config/cfg/

Copy your weights file to:

darknet_ros/darknet_ros/yolo_network_config/weights/

Edit darknet_ros/darknet_ros/config/ros.yaml with your camera topic:

topic: /bebop/image_raw
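For context, that topic sits under the subscribers section of ros.yaml; assuming the stock darknet_ros layout, the edited block looks roughly like:

subscribers:
  camera_reading:
    topic: /bebop/image_raw
    queue_size: 1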

Create a myfile.yaml file under darknet_ros/darknet_ros/config/:

yolo_model:
  config_file:
    name: yolov3_tiny_1class.cfg
  weight_file:
    name: yolov3_tiny_1class_4000.weights
  threshold:
    value: 0.7
  detection_classes:
    names:
      - window

Create a myObjectRecognition.launch launch file under darknet_ros/darknet_ros/launch:

<?xml version="1.0" encoding="utf-8"?>

<launch>
  <!-- Console launch prefix -->
  <arg name="launch_prefix" default=""/>

  <!-- Config and weights folder. -->
  <arg name="yolo_weights_path" default="$(find darknet_ros)/yolo_network_config/weights"/>
  <arg name="yolo_config_path" default="$(find darknet_ros)/yolo_network_config/cfg"/>

  <!-- ROS and network parameter files -->
  <arg name="ros_param_file" default="$(find darknet_ros)/config/ros.yaml"/>
  <arg name="network_param_file" default="$(find darknet_ros)/config/myfile.yaml"/>

  <!-- Load parameters -->
  <rosparam command="load" ns="darknet_ros" file="$(arg ros_param_file)"/>
  <rosparam command="load" ns="darknet_ros" file="$(arg network_param_file)"/>

  <!-- Start darknet and ros wrapper -->
  <node pkg="darknet_ros" type="darknet_ros" name="darknet_ros" output="screen" launch-prefix="$(arg launch_prefix)">
    <param name="weights_path" value="$(arg yolo_weights_path)"/>
    <param name="config_path" value="$(arg yolo_config_path)"/>
  </node>

  <!-- <node name="republish" type="republish" pkg="image_transport" output="screen" args="compressed in:=/front_camera/image_raw raw out:=/camera/image_raw"/> -->
</launch>

catkin build your darknet_ros package again and run:

roslaunch darknet_ros myObjectRecognition.launch
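To confirm detections are flowing, you can watch the wrapper's detection topic (assuming darknet_ros's default topic names):

rostopic echo /darknet_ros/bounding_boxes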

That’s it! Nice Job!
