Object detection with the YOLO technique using Deeplearning4j

Detecting objects in an image can be accomplished in a variety of ways, but among them YOLO (You Only Look Once) is one of the easiest and most efficient.

Since YOLO is based on deep learning, and deep learning has two phases (training and testing/execution), you may be wondering which side of the coin we will focus on here.
Training a neural network can be a complex task that requires time, powerful hardware (possibly a GPU with CUDA), expertise in the specific field and a trial-and-error scientific approach.

Here, instead, we will see how to use a pre-trained YOLO network included in Deeplearning4j, the powerful open source Java library for deep learning that joined the Eclipse ecosystem in 2017.

Start by creating a new simple Maven project with the following dependencies.

Then create a new Java class with the following imports.

Now define a String array that contains a label for each of the 20 object classes that the pre-trained YOLO network we will use can detect.
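
As a rough sketch of what such an array can look like, the class below lists 20 labels. It assumes that the pre-trained network was trained on the Pascal VOC dataset and that its class indices follow the usual alphabetical VOC order; both points are assumptions to check against the model you actually load. The names `VocLabels` and `labelFor` are made up for this illustration.

```java
// Sketch only: assumes the 20 Pascal VOC classes in their usual alphabetical order.
final class VocLabels {

    // Assumption: index i of this array matches class index i predicted by the network.
    static final String[] LABELS = {
        "aeroplane", "bicycle", "bird", "boat", "bottle",
        "bus", "car", "cat", "chair", "cow",
        "diningtable", "dog", "horse", "motorbike", "person",
        "pottedplant", "sheep", "sofa", "train", "tvmonitor"
    };

    // Maps a predicted class index to a readable label, or "unknown" if out of range.
    static String labelFor(int classIndex) {
        if (classIndex < 0 || classIndex >= LABELS.length) {
            return "unknown";
        }
        return LABELS[classIndex];
    }

    public static void main(String[] args) {
        System.out.println(LABELS.length + " labels, index 14 is " + labelFor(14));
    }
}
```

Guarding the lookup keeps an unexpected class index from crashing the drawing code later on.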

Then create the main method, where the TinyYOLO network included in the deeplearning4j-zoo dependency is created and initialized.

Printing model.summary() is there just to show some details about the layers of the network.

dt is the detection threshold, a double value between 0 and 1: the minimum probability level a detection must reach in order to be reported.
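
To make the effect of the threshold concrete, here is a minimal plain-Java sketch that does not use Deeplearning4j at all. `Candidate` and `keepConfident` are hypothetical names invented for this illustration, and keeping candidates whose confidence is greater than or equal to dt is an assumption about the comparison, not a statement about the library's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: shows how a detection threshold filters candidate detections.
final class ThresholdDemo {

    // Hypothetical stand-in for a detection: a class index plus a confidence in [0, 1].
    static final class Candidate {
        final int classIndex;
        final double confidence;

        Candidate(int classIndex, double confidence) {
            this.classIndex = classIndex;
            this.confidence = confidence;
        }
    }

    // Keeps only the candidates whose confidence reaches the threshold dt.
    static List<Candidate> keepConfident(List<Candidate> candidates, double dt) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            if (c.confidence >= dt) {
                kept.add(c);
            }
        }
        return kept;
    }
}
```

A higher dt yields fewer but more reliable detections; a lower dt yields more detections, including more false positives.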

The other three variables will be used to:

  1. load the image file that we will provide as input
  2. scale each pixel value into a 0..1 range
  3. and extract the results from the network
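
Step 2 of the list above is simple arithmetic. The plain-Java sketch below shows it on a handful of channel values, assuming 8-bit input (0..255); in the real program the library's own pre-processor is expected to do the equivalent on the whole image tensor. `PixelScaling` and `scaleToUnitRange` are names made up for this illustration.

```java
// Sketch only: scaling 8-bit channel values into the 0..1 range.
final class PixelScaling {

    // Divides each 8-bit value (assumed to be in 0..255) by 255 to obtain a float in 0..1.
    static float[] scaleToUnitRange(int[] channelValues) {
        float[] scaled = new float[channelValues.length];
        for (int i = 0; i < channelValues.length; i++) {
            scaled[i] = channelValues[i] / 255.0f;
        }
        return scaled;
    }

    public static void main(String[] args) {
        float[] scaled = scaleToUnitRange(new int[] {0, 128, 255});
        System.out.println(java.util.Arrays.toString(scaled));
    }
}
```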

 

Here comes the real stuff: the object detection phase with a given image file

In the above lines we are:

  • loading and scaling an image file (remember to replace the path with a real path on your system)
  • feeding the pre-trained YOLO neural network
  • getting the detection results
  • measuring and printing how many objects were detected and how long the detection phase took

Ok, but can we be satisfied without seeing what the network has really detected with our own eyes? Of course not!

So let’s add a method that draws bounding boxes around the detected objects and prints the label on each.
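
One detail that is easy to get wrong when drawing is the coordinate system. The sketch below assumes that the network reports box corners in grid units (for example on a 13 x 13 grid) rather than in pixels, which is an assumption to verify against the library you use. It converts such corners into a pixel rectangle. `BoxMapper` and `toPixels` are hypothetical names, and the code is plain Java with no Deeplearning4j calls.

```java
// Sketch only: converts a box given in grid units into pixel coordinates.
final class BoxMapper {

    // Returns {x, y, width, height} in pixels, clamped to the image bounds.
    // (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner, in grid units.
    static int[] toPixels(double x1, double y1, double x2, double y2,
                          int gridW, int gridH, int imageW, int imageH) {
        int left = clamp((int) Math.round(imageW * x1 / gridW), imageW);
        int top = clamp((int) Math.round(imageH * y1 / gridH), imageH);
        int right = clamp((int) Math.round(imageW * x2 / gridW), imageW);
        int bottom = clamp((int) Math.round(imageH * y2 / gridH), imageH);
        return new int[] {left, top, right - left, bottom - top};
    }

    // Limits a coordinate to the range 0..max.
    private static int clamp(int value, int max) {
        return Math.max(0, Math.min(value, max));
    }
}
```

The four resulting values can be passed to the drawRect method of java.awt.Graphics2D, and the top-left corner is a natural position for the label drawn with drawString.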

It’s time to run it!
You choose an image; we’ll take this one, taken from https://commons.wikimedia.org/wiki/File:Lex_Av_E_92_St_06.jpg
(licensed under the Creative Commons Attribution-Share Alike 4.0 International license)

and as a result we get this

in a fraction of a second:

Other techniques may then be applied to remove overlapping detections and to improve the overall process, but we wanted to keep the code as simple and straightforward as possible.
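
One common technique for removing overlapping detections is non-maximum suppression: sort the boxes by confidence, then drop any box that overlaps an already accepted box of the same class by more than a chosen intersection-over-union (IoU) value. The plain-Java sketch below is not part of the original program; `Box`, `iou`, `suppress` and the IoU threshold are all introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: greedy per-class non-maximum suppression.
final class NonMaxSuppression {

    // Hypothetical detection record: corners, confidence and class index.
    static final class Box {
        final double x1, y1, x2, y2, confidence;
        final int classIndex;

        Box(double x1, double y1, double x2, double y2, double confidence, int classIndex) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
            this.confidence = confidence;
            this.classIndex = classIndex;
        }

        double area() {
            return Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
        }
    }

    // Intersection over union of two boxes; 0 when they do not overlap.
    static double iou(Box a, Box b) {
        double w = Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1);
        double h = Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1);
        if (w <= 0 || h <= 0) {
            return 0.0;
        }
        double intersection = w * h;
        return intersection / (a.area() + b.area() - intersection);
    }

    // Keeps the most confident box of each group of strongly overlapping same-class boxes.
    static List<Box> suppress(List<Box> boxes, double iouThreshold) {
        List<Box> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingDouble((Box b) -> b.confidence).reversed());
        List<Box> kept = new ArrayList<>();
        for (Box candidate : sorted) {
            boolean suppressed = false;
            for (Box k : kept) {
                if (k.classIndex == candidate.classIndex && iou(k, candidate) > iouThreshold) {
                    suppressed = true;
                    break;
                }
            }
            if (!suppressed) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

An IoU threshold around 0.5 is a common starting point, but like dt it usually needs tuning on your own images.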

Now try with your own images and remember that you may have to tune the dt (detection threshold) value for best results.

 

 

June 30th, 2020
