In a somewhat recent post, I mentioned that I was making a program to quickly flip through large collections of unlabeled images and assign labels to them. I finished developing that program about a month ago, and I've used it quite a bit since then. I named the program tkteach, and I've made it free for anybody to download, use, and modify on GitHub (a website for sharing and improving code for programs). As I mentioned, the program is designed to be fast, easy, and reliable. It outputs the labels to a database, and it saves as you go, so you shouldn't have to worry about losing progress.
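For anyone curious what "saving as you go" can look like in practice, here is a minimal sketch. It is not tkteach's actual code: I'm assuming an SQLite database here, and the table name, column names, and function names (`open_label_store`, `save_label`, `count_labeled`) are all made up for illustration. The idea is simply to commit every label to disk the moment it is assigned, so a crash or a closed window costs at most the image you were looking at.

```python
import sqlite3


def open_label_store(path=":memory:"):
    """Open (or create) a label database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS labels ("
        "image_path TEXT PRIMARY KEY, "
        "label TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_label(conn, image_path, label):
    """Store one label and commit right away, so progress is never lost."""
    # INSERT OR REPLACE lets you change your mind about an image later.
    conn.execute(
        "INSERT OR REPLACE INTO labels (image_path, label) VALUES (?, ?)",
        (image_path, label),
    )
    conn.commit()


def count_labeled(conn):
    """Return how many images have a label so far."""
    return conn.execute("SELECT COUNT(*) FROM labels").fetchone()[0]


if __name__ == "__main__":
    conn = open_label_store()  # in-memory database for the demo
    save_label(conn, "img_0001.png", "cat")
    save_label(conn, "img_0002.png", "dog")
    save_label(conn, "img_0001.png", "dog")  # relabel the first image
    print(count_labeled(conn))
```

Committing after every single label is slower than batching, but at human labeling speed the cost is not noticeable, and it keeps the "never lose progress" promise simple.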
Over the past month or so, my time for working on machine learning has been limited, and I've spent what time I had labeling as many images as possible using tkteach. I've labeled about 5,800 images so far, and I still have about 14,000 unlabeled images left. It's not necessary to label ALL of those images, but the more the better. Labeling the images is a little boring, so I usually do it while watching a Twitch stream or listening to music.
The next step (which I'll likely start this upcoming weekend) will be re-training the neural network using all of these newly labeled images. I anticipate a very big jump in accuracy since this new training data is much more representative of the actual images that the network will be seeing.