Pan-Tilt HAT repo


#1

Dearest hearties,

Am very much enjoying the Pan Tilt HAT. Have managed to put together a very juddery potentiometer-controlled script at present (much like the one from the latest Bilge Tank).

Do you have any plans to add any further servo related examples to the repo? I am struggling to find a sensible way to combine the old adafruit face tracker example with the pantilthat library, given how (relatively) complex the old facetracker seems to make the servo control.

Thaaaaaanks.


#2

I’m fully planning to port our PanTiltFacetracker example over to PanTilt HAT. In fact I should probably prioritize that :D Or maybe do it now… hang in there :D


#3

Thankoooo. Prob post an official joystick example too?


#4

Done!

Will see if I can find a sensible way to include this in the repo.


#6

With the pan tilt hat the camera image is upside down.

I had a look at the code and changed cv2.flip(frame, 1) to cv2.flip(frame, -1).
At first it wasn’t working because the flip command came after the detection code, so I moved it up to just after the frame capture, as below.
Works really well after that, it’s brilliant.


while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()
    frame = cv2.flip(frame, -1)
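
For anyone unsure what the flip codes do, here is a minimal pure-Python sketch (no OpenCV needed) on a tiny 2x3 "frame". The `flip` function is a hypothetical stand-in written for illustration; it mimics how cv2.flip treats its flip code: 0 flips top to bottom, a positive value flips left to right, and a negative value does both, which amounts to a 180 degree rotation.

```python
def flip(rows, flip_code):
    """Hypothetical stand-in mimicking cv2.flip on a list of pixel rows."""
    if flip_code == 0:
        return rows[::-1]                       # top <-> bottom
    if flip_code > 0:
        return [row[::-1] for row in rows]      # left <-> right (mirror)
    return [row[::-1] for row in rows[::-1]]    # both, i.e. rotate 180 degrees


frame = [[1, 2, 3],
         [4, 5, 6]]
mirrored = flip(frame, 1)
upside_down = flip(frame, -1)
print(mirrored)      # [[3, 2, 1], [6, 5, 4]]
print(upside_down)   # [[6, 5, 4], [3, 2, 1]]
```

Doing the flip straight after the capture means the detector sees an upright image and any face coordinates it reports match the frame that is displayed.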

#7

Ooh, I was looking into what to do about this. I was short on time so I just flipped the camera around to get it working :D Thank you!


#8

p.s. Was the little shim (that Sandy described in bilge tank 065 at around 11:40) supposed to be in the kit?


#9

I found that with the two plates compressing the camera it clips in perfectly, totally solid, but no shim; I wasn’t aware of one.

To be honest I was totally confused by them and just went off the shape of the pan-tilt mechanism where the curves are on the top and also that is the best point for ribbon cable entry, it wasn’t intuitive and I did have doubts.
The slight compression of the 4 screws holds it together and it just clips in perfect.

Pimoroni I have given you an honest top review for this bit of kit, but honestly think you are missing a golden opportunity not selling these ready assembled.
I will have to ask some of the community but my recent play with a Chromebit has me thinking the Pi3 might be just a tad faster.
You have a tracking cam, pico computer (chromebit beater) with 4 usb, ethernet, bluetooth, with gpio that is fun and educational for £99!
This is actually for my daughter and I know she is going to love it, the face-tracking is essential as it instantly makes your own personal desktop Wall-E.
It’s all in my review, but this thing has so much character that I still find something inherently hilarious about it, and I know that is why she will love it.
My only fear now is that I will have to be on constant look out for covert surveillance operations :)

I have been blown away by the OpenCV stuff and some of the science there is well out of my league, but to have a simple library where I can employ it, is just fantastic.
http://opencv.org/

Before the daughter gets her pico Wall-E I did a cursory Google on motion detection, as I am thinking that if the face detection fails for a number of frames it will default to motion detection.
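
A minimal sketch of that fallback idea, assuming a simple counter of consecutive frames without a face. The class name `TrackerMode` and the threshold are hypothetical and not taken from any of the scripts in this thread.

```python
class TrackerMode:
    """Hypothetical helper: choose 'face' or 'motion' from recent misses."""

    def __init__(self, max_missed_frames=10):
        self.max_missed_frames = max_missed_frames  # assumed threshold
        self.missed = 0

    def update(self, face_found):
        """Record one frame's result and return the mode for the next frame."""
        if face_found:
            self.missed = 0
        else:
            self.missed += 1
        return "motion" if self.missed >= self.max_missed_frames else "face"


mode = TrackerMode(max_missed_frames=3)
results = [mode.update(found) for found in [True, False, False, False, True]]
print(results)  # ['face', 'face', 'face', 'motion', 'face']
```

The main loop would call `update()` once per frame with the face detector's result and run the motion detector only while the returned mode is "motion".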

I have done my usual fall asleep after a few Saturday Bevies, and hopefully I will have something to post that you might want to include and that anyone else might want to collaborate on.
I stopped Dev work 13 years ago so here goes :)

But guys, flog this assembled and ready to work. It’s just a shame you didn’t have this ready and marketed for Xmas, but wow, I am sure it has mileage.
Amazing educational tool with huge scope, functionality, usability and packs the big importance of fun!


#10

Also got lost with autostart: I was adding my line to the global file whilst a pi user one exists, so it never ran!
Here is a much better place to tack on the line pointing to wherever you have placed the facetracker script:

sudo nano ~/.config/lxsession/LXDE-pi/autostart

@lxpanel --profile LXDE-pi
@pcmanfm --desktop --profile LXDE-pi
@xscreensaver -no-splash
@point-rpi
@/usr/bin/python /home/pi/Pimoroni/pantilthat/examples/PanTiltFacetracker-master/facetracker.py

Just add that last line with your script location. I am currently lost in info overload at https://github.com/opencv/opencv/tree/master/samples/python

http://simplecv.org/
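
If you would rather script the autostart change than edit the file by hand, here is a rough sketch. The function name `add_autostart_entry` and the script location are made up for illustration; the demo writes to a scratch file so nothing real is touched.

```python
import os
import tempfile


def add_autostart_entry(autostart_path, command):
    """Append '@command' to an LXDE autostart file unless it is already there.

    Returns True if the file was changed, False otherwise.
    """
    entry = "@" + command
    lines = []
    if os.path.exists(autostart_path):
        with open(autostart_path) as f:
            lines = f.read().splitlines()
    if entry in lines:
        return False
    lines.append(entry)
    with open(autostart_path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return True


# Demo on a scratch file. On the Pi the path would be
# os.path.expanduser("~/.config/lxsession/LXDE-pi/autostart").
scratch = os.path.join(tempfile.mkdtemp(), "autostart")
cmd = "/usr/bin/python /home/pi/facetracker.py"  # hypothetical script location
print(add_autostart_entry(scratch, cmd))  # True, line appended
print(add_autostart_entry(scratch, cmd))  # False, already present
```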


#11

Ooof OpenCV can seriously twist your melon!

Try this. Most of it I hacked from another example and I haven’t a clue why the API is slightly different.
I am going to do the same with the standard API and see if the efficiencies are the same!?
I haven’t bothered with the panning, as the main problem is the latency caused by the Haar cascades, but I think I have halved the processor hit and latency.
Also, going the other way and scaling down only for face detection means the final image is not upscaled.

import numpy as np
import cv2.cv as cv

cascade = cv.Load('/usr/share/opencv/haarcascades/haarcascade_frontalface_default.xml')
#cascade = cv.Load('/usr/share/opencv/lbpcascades/lbpcascade_frontalface.xml')

min_size = (15, 15)
image_scale = 5
haar_scale = 1.2
min_neighbors = 2
haar_flags = cv.CV_HAAR_DO_CANNY_PRUNING

cap = cv.CreateCameraCapture(0)
cv.NamedWindow("Tracker", 1)
 
if cap:
    frame_copy = None
    
while(True):
    # Capture frame-by-frame
    frame = cv.QueryFrame(cap)
    if not frame:
        cv.WaitKey(0)
        break
    if not frame_copy:
        frame_copy = cv.CreateImage((frame.width,frame.height),
                                            cv.IPL_DEPTH_8U, frame.nChannels)
    if frame.origin == cv.IPL_ORIGIN_TL:
        cv.Flip(frame, frame, -1)
   
    # Our operations on the frame come here
    gray = cv.CreateImage((frame.width,frame.height), 8, 1)
    small_img = cv.CreateImage((cv.Round(frame.width / image_scale),
                   cv.Round (frame.height / image_scale)), 8, 1)
 
    # convert color input image to grayscale
    cv.CvtColor(frame, gray, cv.CV_BGR2GRAY)
 
    # scale input image for faster processing
    cv.Resize(gray, small_img, cv.CV_INTER_LINEAR)
 
    cv.EqualizeHist(small_img, small_img)

    midFace = None
 
    if(cascade):
        t = cv.GetTickCount()
        # HaarDetectObjects takes 0.02s
        faces = cv.HaarDetectObjects(small_img, cascade, cv.CreateMemStorage(0),
                                     haar_scale, min_neighbors, haar_flags, min_size)
        t = cv.GetTickCount() - t
        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(frame, pt1, pt2, cv.RGB(100, 220, 255), 1, 8, 0)
                # get the xy corner co-ords, calc the midFace location
                x1 = pt1[0]
                x2 = pt2[0]
                y1 = pt1[1]
                y2 = pt2[1]
                midFaceX = x1+((x2-x1)/2)
                midFaceY = y1+((y2-y1)/2)
                midFace = (midFaceX, midFaceY)
                print midFace
                break
                
    # Display the resulting frame
    cv.ShowImage('Tracker',frame)
    if cv.WaitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cv.DestroyWindow("Tracker")
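
The scaling and midFace arithmetic from the loop above can be pulled out into small functions, sketched here in plain Python. The function names are hypothetical. Floor division (`//`) is used so the result is the same under Python 2 and Python 3, whereas the script above relies on Python 2 integer division.

```python
def face_center(rect, image_scale):
    """Map a detection (x, y, w, h) found on the shrunken image back to
    full-size frame coordinates and return the centre of the box."""
    x, y, w, h = rect
    x1, y1 = int(x * image_scale), int(y * image_scale)
    x2, y2 = int((x + w) * image_scale), int((y + h) * image_scale)
    return (x1 + (x2 - x1) // 2, y1 + (y2 - y1) // 2)


def offset_from_middle(center, frame_size):
    """Pixel distance of the face centre from the middle of the frame."""
    return (center[0] - frame_size[0] // 2, center[1] - frame_size[1] // 2)


center = face_center((10, 20, 30, 40), image_scale=5)
print(center)                                  # (125, 200)
print(offset_from_middle(center, (320, 240)))  # (-35, 80)
```

The offset is the number a panning routine would care about: zero means the face is already in the middle of the picture.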

#12

Lols, found some info about the flags and I think these will work, even though I had to glean them from the .NET library for OpenCV.
http://www.emgu.com/wiki/files/2.4.10/document/index.html

Members

Member name | Value | Description
DEFAULT | 0 | The default type where no optimization is done.
DO_CANNY_PRUNING | 1 | If it is set, the function uses Canny edge detector to reject some image regions that contain too few or too much edges and thus can not contain the searched object. The particular threshold values are tuned for face detection and in this case the pruning speeds up the processing
SCALE_IMAGE | 2 | For each scale factor used the function will downscale the image rather than “zoom” the feature coordinates in the classifier cascade. Currently, the option can only be used alone, i.e. the flag can not be set together with the others
FIND_BIGGEST_OBJECT | 4 | If it is set, the function finds the largest object (if any) in the image. That is, the output sequence will contain one (or zero) element(s)
DO_ROUGH_SEARCH | 8 | It should be used only when CV_HAAR_FIND_BIGGEST_OBJECT is set and min_neighbors > 0. If the flag is set, the function does not look for candidates of a smaller size as soon as it has found the object (with enough neighbor candidates) at the current scale. Typically, when min_neighbors is fixed, the mode yields less accurate (a bit larger) object rectangle than the regular single-object mode (flags=CV_HAAR_FIND_BIGGEST_OBJECT), but it is much faster, up to an order of magnitude. A greater value of min_neighbors may be specified to improve the accuracy

It’s sort of a two-way street: you either use SCALE_IMAGE on its own or use the others.
Now I am off to play.
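
To make the "one or the others" rule concrete, here is a small sketch that combines the flag values from the member list above with bitwise OR and checks a combination against the two restrictions the descriptions mention. The constants are plain integers copied from that list, and `flags_ok` is a hypothetical helper, not an OpenCV function.

```python
# Values copied from the member list above.
DO_CANNY_PRUNING = 1
SCALE_IMAGE = 2
FIND_BIGGEST_OBJECT = 4
DO_ROUGH_SEARCH = 8


def flags_ok(flags):
    """Hypothetical sanity check based on the flag descriptions."""
    if flags & SCALE_IMAGE and flags != SCALE_IMAGE:
        return False  # SCALE_IMAGE can only be used alone
    if flags & DO_ROUGH_SEARCH and not flags & FIND_BIGGEST_OBJECT:
        return False  # rough search needs FIND_BIGGEST_OBJECT as well
    return True


combined = DO_CANNY_PRUNING | FIND_BIGGEST_OBJECT | DO_ROUGH_SEARCH
print(combined, flags_ok(combined))              # 13 True
print(flags_ok(SCALE_IMAGE | DO_CANNY_PRUNING))  # False
```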


#13

Also movement detection, and a great example from http://www.pyimagesearch.com/

Only one lib to install, so: sudo pip install imutils
http://t.dripemail2.com/c/eyJhY2NvdW50X2lkIjoiNDc2ODQyOSIsImRlbGl2ZXJ5X2lkIjoiNDA5MjUyMTE4IiwidXJsIjoiaHR0cDovL3B5aW1nLmNvLzJoMmhvP19fcz1zazZheHpueXJhcTRtN3h5ZndjdyJ9

# USAGE
# python motion_detector.py
# python motion_detector.py --video videos/example_01.mp4

# import the necessary packages
import argparse
import datetime
import imutils
import time
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", help="path to the video file")
ap.add_argument("-a", "--min-area", type=int, default=500, help="minimum area size")
args = vars(ap.parse_args())

# if the video argument is None, then we are reading from webcam
if args.get("video", None) is None:
	camera = cv2.VideoCapture(0)
	time.sleep(0.25)

# otherwise, we are reading from a video file
else:
	camera = cv2.VideoCapture(args["video"])

# initialize the first frame in the video stream
firstFrame = None

# loop over the frames of the video
while True:
	# grab the current frame and initialize the occupied/unoccupied
	# text
	(grabbed, frame) = camera.read()
	text = "Unoccupied"

	# if the frame could not be grabbed, then we have reached the end
	# of the video
	if not grabbed:
		break

	# resize the frame, convert it to grayscale, and blur it
	frame = imutils.resize(frame, width=500)
	gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	gray = cv2.GaussianBlur(gray, (21, 21), 0)

	# if the first frame is None, initialize it
	if firstFrame is None:
		firstFrame = gray
		continue

	# compute the absolute difference between the current frame and
	# first frame
	frameDelta = cv2.absdiff(firstFrame, gray)
	thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]

	# dilate the thresholded image to fill in holes, then find contours
	# on thresholded image
	thresh = cv2.dilate(thresh, None, iterations=2)
	(cnts, _) = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
		cv2.CHAIN_APPROX_SIMPLE)

	# loop over the contours
	for c in cnts:
		# if the contour is too small, ignore it
		if cv2.contourArea(c) < args["min_area"]:
			continue

		# compute the bounding box for the contour, draw it on the frame,
		# and update the text
		(x, y, w, h) = cv2.boundingRect(c)
		cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
		text = "Occupied"

	# draw the text and timestamp on the frame
	cv2.putText(frame, "Room Status: {}".format(text), (10, 20),
		cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
	cv2.putText(frame, datetime.datetime.now().strftime("%A %d %B %Y %I:%M:%S%p"),
		(10, frame.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.35, (0, 0, 255), 1)

	# show the frame and record if the user presses a key
	cv2.imshow("Security Feed", frame)
	cv2.imshow("Thresh", thresh)
	cv2.imshow("Frame Delta", frameDelta)
	key = cv2.waitKey(1) & 0xFF

	# if the `q` key is pressed, break from the loop
	if key == ord("q"):
		break

# cleanup the camera and close any open windows
camera.release()
cv2.destroyAllWindows()
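
One thing to watch with the script above: cv2.findContours returns two values in OpenCV 2.4 and 4.x but three values in OpenCV 3.x, so the `(cnts, _) = ...` line fails to unpack on 3.x. A minimal sketch of a version-tolerant helper follows; the name `pick_contours` is made up here, and newer imutils releases ship a similar `grab_contours` function. The demo uses dummy tuples instead of real OpenCV output.

```python
def pick_contours(result):
    """Return the contour list from whatever cv2.findContours handed back."""
    if len(result) == 2:    # (contours, hierarchy): OpenCV 2.4 and 4.x
        return result[0]
    if len(result) == 3:    # (image, contours, hierarchy): OpenCV 3.x
        return result[1]
    raise ValueError("unexpected findContours result of length %d" % len(result))


# Dummy stand-ins for the two return shapes
two_values = (["contour_a", "contour_b"], "hierarchy")
three_values = ("image", ["contour_a", "contour_b"], "hierarchy")
print(pick_contours(two_values))    # ['contour_a', 'contour_b']
print(pick_contours(three_values))  # ['contour_a', 'contour_b']
```

In the script this would look like `cnts = pick_contours(cv2.findContours(...))` in place of the tuple unpacking.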

#14

I did a quick and dirty modification of your code and tucked it into our PanTiltFacetracker GitHub: https://github.com/pimoroni/PanTiltFacetracker

Needs tidying up, but it works nicely. Need to set up a battle between the two methods ;D


#15

Phil, did you ever get your fake face to work (that you were drawing on the bilge tank)? I have an idea…


#16

That’s the first fake face I’ve drawn that didn’t work! I even tested the camera with a freddo frog!


#17

This was my first python programming experience so apols [😊]

I am just hacking the original as… is the same and its all Ninja this side.

faces = faceCascade.detectMultiScale(
    gray,
    scaleFactor=1.2,
    minNeighbors=3,
    minSize=(20, 20),
    flags=cv2.cv.CV_HAAR_DO_CANNY_PRUNING | cv2.cv.CV_HAAR_FIND_BIGGEST_OBJECT | cv2.cv.CV_HAAR_DO_ROUGH_SEARCH
)

I modded the original slightly and noticed that your code is a bit of a fudge in terms of pan-tilt.

I had a go myself at detection -> move to absolute position, and will get round to it.

#!/usr/bin/env python
# USAGE
# python motion_detector.py
# python motion_detector.py --video videos/example_01.mp4

# import the necessary packages
import argparse
import datetime
import imutils
import time
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", help="path to the video file")
ap.add_argument("-a", "--min-area", type=int, default=500, help="minimum area size")
args = vars(ap.parse_args())

# if the video argument is None, then we are reading from webcam
if args.get("video", None) is None:
    camera = cv2.VideoCapture(0)
    time.sleep(0.25)

# otherwise, we are reading from a video file
else:
    camera = cv2.VideoCapture(args["video"])

# initialize the first frame in the video stream
firstFrame = None

# loop over the frames of the video
while True:
    # grab the current frame and initialize the occupied/unoccupied
    # text
    (grabbed, frame) = camera.read()
    text = "Unoccupied"

    # if the frame could not be grabbed, then we have reached the end
    # of the video
    if not grabbed:
        break

    # resize the frame, flip it (the camera is mounted upside down),
    # convert it to grayscale, and blur it
    frame = imutils.resize(frame, width=500)
    frame = cv2.flip(frame, -1)

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)

    # if the first frame is None, initialize it
    if firstFrame is None:
        firstFrame = gray
        continue

    # compute the absolute difference between the current frame and
    # first frame
    frameDelta = cv2.absdiff(firstFrame, gray)
    thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]

    # dilate the thresholded image to fill in holes, then find contours
    # on thresholded image
    thresh = cv2.dilate(thresh, None, iterations=2)
    (cnts, _) = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
        cv2.CHAIN_APPROX_SIMPLE)

    # loop over the contours
    for c in cnts:
        # if the contour is too small, ignore it
        if cv2.contourArea(c) < args["min_area"]:
            continue

        # compute the bounding box for the contour, draw it on the frame,
        # and update the text
        (x, y, w, h) = cv2.boundingRect(c)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        text = "Occupied"

    # draw the text and timestamp on the frame
    cv2.putText(frame, "Room Status: {}".format(text), (10, 20),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    cv2.putText(frame, datetime.datetime.now().strftime("%A %d %B %Y %I:%M:%S%p"),
        (10, frame.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.35, (0, 0, 255), 1)

    # show the frame and record if the user presses a key
    cv2.imshow("Security Feed", frame)
    cv2.imshow("Thresh", thresh)
    cv2.imshow("Frame Delta", frameDelta)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key is pressed, break from the loop
    if key == ord("q"):
        break

# cleanup the camera and close any open windows
camera.release()
cv2.destroyAllWindows()

Then the alternative C api rather than the C++ one

[code]
#!/usr/bin/env python
# You might want to comment out the tick-count calls and print statements after bug checking and analysis

import numpy as np
import cv2.cv as cv

cascade = cv.Load('/usr/share/opencv/haarcascades/haarcascade_frontalface_default.xml')


min_size = (20, 20)
image_scale = 4
haar_scale = 1.2
min_neighbors = 2
haar_flags = cv.CV_HAAR_DO_CANNY_PRUNING | cv.CV_HAAR_FIND_BIGGEST_OBJECT | cv.CV_HAAR_DO_ROUGH_SEARCH
t = 0

cap = cv.CreateCameraCapture(0)
cv.NamedWindow("Tracker", 1)

if cap:
    frame_copy = None

while(True):
    # Capture frame-by-frame
    t = cv.GetTickCount()
    frame = cv.QueryFrame(cap)
    if not frame:
        cv.WaitKey(0)
        break
    if not frame_copy:
        frame_copy = cv.CreateImage((frame.width,frame.height),
                                            cv.IPL_DEPTH_8U, frame.nChannels)
    if frame.origin == cv.IPL_ORIGIN_TL:
        cv.Flip(frame, frame, -1)
    t = cv.GetTickCount() - t
    print(t), 'Tickcount get image'
    # Our operations on the frame come here
    t = cv.GetTickCount()
    gray = cv.CreateImage((frame.width,frame.height), 8, 1)
    small_img = cv.CreateImage((cv.Round(frame.width / image_scale),
                   cv.Round (frame.height / image_scale)), 8, 1)

    # convert color input image to grayscale
    cv.CvtColor(frame, gray, cv.CV_BGR2GRAY)

    # scale input image for faster processing
    cv.Resize(gray, small_img, cv.CV_INTER_LINEAR)

    cv.EqualizeHist(small_img, small_img)

    midFace = None
    t = cv.GetTickCount() - t
    print(t), 'Tickcount create detect image'
    if(cascade):
        t = cv.GetTickCount()
        # HaarDetectObjects takes 0.02s
        faces = cv.HaarDetectObjects(small_img, cascade, cv.CreateMemStorage(0),
                                     haar_scale, min_neighbors, haar_flags, min_size)
        t = cv.GetTickCount() - t
        print(t), 'Tickcount Detect objects'

        # restart the timer for the face loop below
        t = cv.GetTickCount()

        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(frame, pt1, pt2, cv.RGB(100, 220, 255), 1, 8, 0)
                # get the xy corner co-ords, calc the midFace location
                x1 = pt1[0]
                x2 = pt2[0]
                y1 = pt1[1]
                y2 = pt2[1]
                midFaceX = x1+((x2-x1)/2)
                midFaceY = y1+((y2-y1)/2)
                midFace = (midFaceX, midFaceY)
                print midFace, 'Face center'
            t = cv.GetTickCount() - t
            print(t), 'Tickcount faces'

    # Display the resulting frame
    t = cv.GetTickCount()
    cv.ShowImage('Tracker',frame)
    t = cv.GetTickCount() - t
    print(t), 'Tickcount Show image'



    if cv.WaitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cv.DestroyWindow("Tracker")
[/code]
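
On the "detection -> move to absolute position" idea mentioned earlier in this post, here is a rough sketch of turning a face centre into absolute pan and tilt angles. Everything in it is an assumption: the function name `target_angles`, the field-of-view numbers (measure your own camera), the sign conventions, and the linear pixel-to-degree mapping, which is only an approximation. The returned values are meant to be handed to pantilthat.pan() and pantilthat.tilt(), with the current angles coming from pantilthat.get_pan() and pantilthat.get_tilt(); the hardware calls are left out so the sketch runs anywhere.

```python
def clamp(angle, low=-90.0, high=90.0):
    """Keep an angle inside the servo range."""
    return max(low, min(high, angle))


def target_angles(center, frame_size, current, fov=(54.0, 41.0)):
    """Return absolute (pan, tilt) in degrees that should centre the face.

    center:     face centre in pixels (x, y)
    frame_size: (width, height) of the frame in pixels
    current:    current (pan, tilt) in degrees
    fov:        ASSUMED horizontal and vertical field of view in degrees
    """
    cx, cy = center
    width, height = frame_size
    pan, tilt = current
    # Offset from the frame middle as a fraction of the frame (-0.5 to 0.5)
    dx = (cx - width / 2.0) / width
    dy = (cy - height / 2.0) / height
    # Assumed sign convention: swap the signs if the head turns away from the face
    return clamp(pan - dx * fov[0]), clamp(tilt + dy * fov[1])


print(target_angles((160, 120), (320, 240), (10.0, -5.0)))  # (10.0, -5.0)
print(target_angles((320, 120), (320, 240), (0.0, 0.0)))    # (-27.0, 0.0)
```

Jumping straight to the computed angle is what makes this "absolute"; the alternative is nudging the servos a degree at a time towards the face.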


#19

Back again from my adventures with Pi Cam and OpenCV.

Not mine, but I was searching for a detection script that used threading, and in fact this goes one better and uses multiprocessing.

https://github.com/vmlaker/sherlock either git clone or copy and extract the zip.
It creates a Python virtualenv so it will not affect anything, and I guess you could just do what it does to your main env.
Part of it grabs a lib from Mercurial so >>sudo pip install Mercurial
Otherwise the hg clone bit of the script will fail.
In the root directory of the download >>make
Then for each example just drop the .py and >>make object2… and so on

It’s that one you should see, as it’s OpenCV doing exactly what we were doing but blisteringly fast.
object1 is face detection without parallelism and gets 0.7 FPS
object2 is the same face detection with parallelism and gets 30!
It’s not doing any scaling either!!! With zero latency.

PS as to the code, wow, not worthy! Ain’t got a clue really, but as a Pimoroni noob I am pretty sure I can hack that with the pantilt stuff, prob you guys could also.
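
To give a feel for the parallel idea, here is a very small sketch that hands frames to a pool of workers and collects the results in frame order. It is not how sherlock is implemented. The `detect` function is a dummy stand-in for a real detector, and a thread pool is used so the snippet runs anywhere; for CPU-bound pure-Python work a ProcessPoolExecutor from the same module could be passed in instead (started under an `if __name__ == "__main__":` guard).

```python
from concurrent.futures import ThreadPoolExecutor


def detect(frame):
    """Dummy stand-in for a slow per-frame detector: just sums the pixels."""
    return sum(frame)


def detect_all(frames, executor):
    """Run detect() on every frame through the executor, keeping frame order."""
    return list(executor.map(detect, frames))


frames = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
with ThreadPoolExecutor(max_workers=4) as pool:
    print(detect_all(frames, pool))  # [6, 15, 24]
```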