Object Detection with OpenCV and EfficientDet
We will first download the files needed to run object detection with an EfficientDet model. We download the following files and save them in the model_data directory.
# Download weights
!wget -q https://www.dropbox.com/s/9mqp99fd2tpuqn6/efficientdet-d0.pb?dl=1 -O model_data/efficientdet-d0.pb
#Download names or label file
!wget -q https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-paper.txt -O model_data/coco-labels-paper.txt
#Download the configuration file
!wget -q https://raw.githubusercontent.com/opencv/opencv_extra/master/testdata/dnn/efficientdet-d0.pbtxt -O model_data/efficientdet-d0.pbtxt
import numpy as np
import cv2
print(cv2.__file__)
print(cv2.__version__)
import time
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)
# Routine to convert BGR (OpenCV's default) to RGB for matplotlib display
def fixColor(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
/usr/local/anaconda/envs/tensorflow2/lib/python3.6/site-packages/cv2/cv2.cpython-36m-x86_64-linux-gnu.so 4.5.1
labelsFile="model_data/coco-labels-paper.txt"
LABELS = open(labelsFile).read().strip().split("\n")
print ("No. of supported classes", len(LABELS))
No. of supported classes 91
Create an array COLORS populated with a distinct color for each class.
np.random.seed(42)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3), dtype="uint8")
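As a quick sanity check (a minimal sketch; the 91 below matches the COCO label count printed earlier), fixing the seed makes the palette reproducible, so each class keeps the same color across runs:

```python
import numpy as np

np.random.seed(42)
palette = np.random.randint(0, 255, size=(91, 3), dtype="uint8")
print(palette.shape)  # (91, 3) — one BGR triple per class

# Re-seeding regenerates the identical palette
np.random.seed(42)
palette2 = np.random.randint(0, 255, size=(91, 3), dtype="uint8")
print(np.array_equal(palette, palette2))  # True
```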
We read the image, resize it, and display it.
img=cv2.imread("images/soccer.jpg")
img=cv2.resize(img, (608, 608))
plt.imshow(fixColor(img))
(H, W) = img.shape[:2]
print (H, W)
608 608
The input blob must be 512x512 for this model. We also normalise all pixel values to [0, 1] by dividing by 255. OpenCV loads images in BGR order; the swapRB=True flag swaps the channels to the RGB order the model expects.
inp = cv2.dnn.blobFromImage(img, 1 / 255.0, (512, 512), swapRB=True, crop=False)
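Under the hood, blobFromImage scales the pixels, swaps the channels, and reorders the data into NCHW layout (batch, channels, height, width). A minimal NumPy sketch of the equivalent transform, assuming the input is already at the target size:

```python
import numpy as np

def blob_from_image_sketch(img_bgr):
    """Approximate cv2.dnn.blobFromImage(img, 1/255.0, swapRB=True, crop=False)
    for an image that is already 512x512 (real blobFromImage also resizes)."""
    rgb = img_bgr[:, :, ::-1]                  # swapRB: BGR -> RGB
    scaled = rgb.astype(np.float32) / 255.0    # scalefactor = 1/255
    nchw = scaled.transpose(2, 0, 1)[None]     # HWC -> NCHW with a batch dim
    return nchw

dummy = np.random.randint(0, 256, (512, 512, 3), dtype=np.uint8)
print(blob_from_image_sketch(dummy).shape)  # (1, 3, 512, 512)
```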
The model is loaded with OpenCV's DNN module, which takes the weights file and the configuration file.
net = cv2.dnn.readNet('model_data/efficientdet-d0.pb', 'model_data/efficientdet-d0.pbtxt')
Here the blob is set as the input to the net and we do a forward pass. We also measure the time the pass takes.
net.setInput(inp)
t0 = time.time()
boxes = net.forward()
t = time.time()
print('time=', t-t0)
print(boxes.shape)
time= 0.3522500991821289
(1, 1, 100, 7)
There are 100 detections. Each detection includes the class ID, the confidence, and the box parameters.
We iterate through all the predictions. If the confidence is greater than the threshold, we calculate the box coordinates. We then use OpenCV to draw the rectangle and write the class name and confidence above it. The color is unique for each class.
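To make the row layout concrete, here is a sketch that parses one detection from a dummy array of the same (1, 1, 100, 7) shape; the values are made up for illustration:

```python
import numpy as np

# Dummy output mimicking the (1, 1, 100, 7) array returned by net.forward().
# Each row: [batch_id, classID, confidence, min_x, min_y, max_x, max_y],
# with the box coordinates normalised to [0, 1].
boxes = np.zeros((1, 1, 100, 7), dtype=np.float32)
boxes[0, 0, 0] = [0, 1, 0.92, 0.10, 0.20, 0.45, 0.80]  # one fake detection

W, H = 608, 608
classID = int(boxes[0, 0, 0, 1])
confidence = float(boxes[0, 0, 0, 2])
# Scale the normalised coordinates back to pixel space
min_x, min_y = int(boxes[0, 0, 0, 3] * W), int(boxes[0, 0, 0, 4] * H)
max_x, max_y = int(boxes[0, 0, 0, 5] * W), int(boxes[0, 0, 0, 6] * H)
print(classID, round(confidence, 2), (min_x, min_y, max_x, max_y))
# 1 0.92 (60, 121, 273, 486)
```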
for i in range(0, boxes.shape[2]):
    classID = int(boxes[0, 0, i, 1])  # Class ID
    confidence = boxes[0, 0, i, 2]    # Confidence
    #print ("Confidence", confidence)
    if confidence > 0.3:  # Set a threshold beyond which you want detections
        # Box coordinates are normalised. Multiply by the actual
        # width and height to get pixel dimensions
        min_x = int(boxes[0, 0, i, 3] * W)
        min_y = int(boxes[0, 0, i, 4] * H)
        max_x = int(boxes[0, 0, i, 5] * W)
        max_y = int(boxes[0, 0, i, 6] * H)
        # Convert the color values to plain ints for OpenCV
        color = COLORS[classID]
        color = [int(c) for c in color]
        # Draw the bounding box
        cv2.rectangle(img, (min_x, min_y), (max_x, max_y), color, 6)
        # Text to display includes the label and confidence
        text = "{}: {:.2f}".format(LABELS[classID], confidence)
        cv2.putText(img, text, (min_x, min_y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, color, 3)
plt.imshow(fixColor(img))