2020 Project: Building a depth sensing vision system from scratch

I'm starting out 2020 with a plan to be a little more rigorous about how I approach my side projects. That includes writing publicly about the details of what I'm working on, both to share what I learn and to hold myself somewhat accountable.

With that being said, some of my general goals for 2020 include:

  • Getting more experience with computer vision applications to balance my machine learning knowledge, which up to this point has mostly focused on natural language applications
  • Gaining a better understanding of the process involved in building custom electronics and hardware
  • Re-visiting a failed side project from 2018 (more on that below)

My 2020 project is to build a cheap camera-based depth sensing system that can create detailed 3D point clouds, accurately measure distance, detect motion and do simple object identification and classification.

Project Inspiration

A few years ago I was looking for a side project to keep me challenged and, through equal parts hubris and ignorance, decided that building an “AI-powered Robot” was something I could knock out in a few months in my spare time. I'm a software developer and I have been working on machine learning projects for several years, so I figured once I did the easy part of putting together the hardware (some sensors, some motors, some wheels) the rest would just be writing some awesome software to control it all. Let's just say it didn't go as planned.

After reading up on some electronics basics and several trips to Micro Center, I did end up with an ugly box on wheels with a camera that could control two servo motors and drive around. The project failed (as in, I gave up) when I tried to incorporate sensors to measure the distance to obstacles. Most of the DIY obstacle-avoidance robots I came across used ultrasonic sensors, infrared sensors, or both, depending on the use case. The better ones had arrays of sensors and used arbitrary “averaging” algorithms to account for sensor noise and for bad data caused by the angle of an obstacle's surfaces relative to the robot.

I remember sitting in front of my disassembled wannabe-robot, trying yet again to figure out which of the half-dozen wires for the ultrasonic sensor was not hooked up properly, and thinking to myself: “WTF! There has to be a better way.”

Requirements

In order to consider this project a success, I want to end up with a system that:

  • Uses cheap off-the-shelf cameras
  • Can be built for a lot less than comparable stereo cameras, time-of-flight (TOF) cameras or LIDAR systems
  • Creates a simple and abstracted API that is not tied to specific hardware or the microcontroller used to interact with it
  • Is preferably plug and play to avoid circuit board wiring
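To make the hardware-agnostic API requirement a little more concrete, here is a minimal Python sketch of what such an abstraction might look like. Nothing here is designed yet: the names DepthFrame, DepthSensor, FakeDepthSensor and nearest_obstacle_m, and the choice of a flat row-major list of distances in meters, are all assumptions made up for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DepthFrame:
    """Hypothetical depth frame: a row-major grid of distances in meters."""
    width: int
    height: int
    distances_m: List[float]

    def distance_at(self, x: int, y: int) -> float:
        """Return the distance at pixel (x, y), with bounds checking."""
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError(f"pixel ({x}, {y}) is outside the frame")
        return self.distances_m[y * self.width + x]


class DepthSensor(ABC):
    """Hypothetical interface that application code depends on.

    Each concrete backend (a camera pair, a simulator, ...) would
    implement read_frame() without the caller knowing the difference.
    """

    @abstractmethod
    def read_frame(self) -> DepthFrame:
        ...


class FakeDepthSensor(DepthSensor):
    """Stand-in backend that returns a fixed frame, useful without hardware."""

    def __init__(self, frame: DepthFrame) -> None:
        self._frame = frame

    def read_frame(self) -> DepthFrame:
        return self._frame


def nearest_obstacle_m(sensor: DepthSensor) -> float:
    """Distance to the closest point in the current frame, in meters."""
    return min(sensor.read_frame().distances_m)


if __name__ == "__main__":
    frame = DepthFrame(width=2, height=2, distances_m=[1.5, 0.8, 2.0, 3.2])
    print(nearest_obstacle_m(FakeDepthSensor(frame)))
```

The point of the sketch is only that code like nearest_obstacle_m talks to the DepthSensor interface, so swapping the camera hardware or the microcontroller would mean writing a new backend class rather than touching the callers.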

As for specific capabilities and features, given the use case (a DIY robot for home/office), I think it is important that the system can:

  • accurately measure distances to multiple objects and know if any obstacles are in the way (for path finding)
  • detect motion (for tracking, object detection, etc.)
  • identify categories of objects a robot is likely to encounter in a home/office environment, such as people, faces, animals, furniture, etc.

This list of specs and features is going to evolve once I get started on this project, but it should be good enough to act as a guide and help me avoid significant compromises or feature creep.

Next Steps

Research

I already have some ideas for how to implement some of the features, but for now I need to take a step back and do a deep dive into the computer vision literature to learn the best approaches for each of the specific capabilities.

Hardware & Design

While the camera/hardware part of this project is biased toward using cheap off-the-shelf components, I still plan on coming up with something that resembles a polished product. It should be something that, in terms of effort and complexity, falls somewhere between designing my own CMOS sensor and duct taping two webcams to a piece of wood.

I will provide project updates and share what I learn as part of the research phase on this blog.
