Original post author: Ryan
I've come to the conclusion that the performance of my vehicle detection neural network is being severely limited by the dataset of car images I used to initially train it. Those car images were almost always taken from the side, front, or back of the cars. So whenever the neural network looks at a car or truck from a diagonal angle, it struggles to classify it.
I got that batch of training images off the internet, piecing together a couple of different pre-made datasets for a total of about 6,500 images. If I want better performance, I will need A LOT more data. I need to collect it myself.
I wrote a program to detect movement in a video feed (from YouTube, for example). If the movement meets certain criteria, the program saves a picture of the localized movement area and resizes it to 64 by 64 pixels (the size I am using for the neural network input). Using a video feed from a traffic camera, I can easily collect 1,500 images per hour (or a lot more if I increase the screenshot frequency). The images often contain cars, but they sometimes contain other movement as well (shadows, moving trees, clouds, etc.). That's good, because the neural network will benefit from learning what non-car images look like.
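The motion-capture step can be sketched roughly as follows. This is only an illustration of the general idea (frame differencing, a bounding box around the changed pixels, then a resize to 64 by 64), not the actual program. The function names, the `threshold` and `min_pixels` values, and the nearest-neighbor resize are all assumptions, and plain Python lists of grayscale values stand in for real video frames so the sketch runs without any libraries. A real version would more likely read frames with something like OpenCV.

```python
def changed_bbox(prev, curr, threshold=25, min_pixels=20):
    """Return (top, left, bottom, right) around pixels that changed
    between two grayscale frames, or None if too little changed.

    threshold and min_pixels are illustrative stand-ins for the
    unspecified "certain criteria" in the post.
    """
    rows, cols = [], []
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            if abs(c - p) > threshold:
                rows.append(y)
                cols.append(x)
    if len(rows) < min_pixels:
        return None  # not enough movement to bother saving
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def crop_and_resize(frame, bbox, size=64):
    """Crop frame to bbox and resize to size x size (nearest neighbor)."""
    top, left, bottom, right = bbox
    height, width = bottom - top, right - left
    return [
        [frame[top + (y * height) // size][left + (x * width) // size]
         for x in range(size)]
        for y in range(size)
    ]


# Two synthetic 100x120 frames: a bright block "moves into" the second one.
previous_frame = [[0] * 120 for _ in range(100)]
current_frame = [row[:] for row in previous_frame]
for y in range(30, 50):
    for x in range(40, 70):
        current_frame[y][x] = 200

box = changed_bbox(previous_frame, current_frame)
if box is not None:
    patch = crop_and_resize(current_frame, box)  # 64x64 network input
```

Comparing each frame only to the one before it is the simplest choice. A running-average background model would likely cope better with slow changes such as drifting shadows, at the cost of more bookkeeping.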
Now I am capable of collecting (comparatively) massive amounts of image data. I can set this thing to run overnight and wake up with tens of thousands of images. The problem is, the images aren’t labeled. The neural network can't learn from them unless it knows which images contain cars and which do not.
So I wrote a program for quickly flipping through images and manually labeling each one as car or no-car. The program is designed from the ground up to be quick. It works with the mouse, but it also works with just key presses. I am working on getting it to output the labels directly into a SQLite database.
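For the SQLite part, here is a minimal sketch of how key presses might be turned into stored labels using Python's built-in `sqlite3` module. The table layout, the `c`/`n` key mapping, and the function names are all hypothetical, since the post does not describe the real schema or key bindings.

```python
import sqlite3

# Hypothetical key bindings: "c" = car, "n" = no-car.
KEY_TO_LABEL = {"c": 1, "n": 0}


def open_label_db(path=":memory:"):
    """Open (or create) the label database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS labels ("
        "filename TEXT PRIMARY KEY, "
        "is_car INTEGER NOT NULL CHECK (is_car IN (0, 1)))"
    )
    return conn


def record_keypress(conn, filename, key):
    """Store the label for one image; return False for unmapped keys."""
    label = KEY_TO_LABEL.get(key)
    if label is None:
        return False
    # INSERT OR REPLACE lets a second key press correct an earlier mistake.
    conn.execute(
        "INSERT OR REPLACE INTO labels (filename, is_car) VALUES (?, ?)",
        (filename, label),
    )
    conn.commit()
    return True


conn = open_label_db()  # in-memory here; pass a file path to keep labels
record_keypress(conn, "img_0001.png", "c")
record_keypress(conn, "img_0002.png", "n")
record_keypress(conn, "img_0001.png", "n")  # relabel after a mis-press
rows = conn.execute(
    "SELECT filename, is_car FROM labels ORDER BY filename"
).fetchall()
```

Keying the table on the filename means each image has at most one label, so flipping back and relabeling simply overwrites the earlier entry.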
With these new tools, I will be able to expand my training dataset.