Real-Time Intelligent Audio and Object Detection for Low-Cost Robotics Systems

A University of Chicago M.Sc. Analytics Capstone project in collaboration with R3 Robotics and Ilykei Software Corp.

Python software developed for the Dexter GoPiGo3 Educational Robot Car to complete a scavenger hunt using audio and visual cues. The GoPiGo uses its camera and sensors, an external microphone, and inference from deep neural networks to complete the scavenger hunt by:

  1. Autonomously locating and navigating to defined checkpoints with a simple shape and solid color, such as a traffic cone,
  2. Avoiding obstacles in its path,
  3. Identifying the object hidden behind the checkpoint,
  4. Identifying and responding to audio cues during operation, and
  5. Providing a report about what the robot saw and heard.
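
The five tasks above can be read as a small state machine. The following is a minimal sketch under that assumption, not the project's actual control code: the state names, the `Observation` fields, the `arrive_cm` threshold, and the `step` function are all hypothetical, and the sensor and model outputs are stubbed as plain values.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class State(Enum):
    SEARCH = auto()    # look for a checkpoint
    APPROACH = auto()  # drive toward a visible checkpoint
    AVOID = auto()     # steer around an obstacle
    INSPECT = auto()   # identify the object behind the checkpoint
    DONE = auto()      # hunt finished, report is complete


@dataclass
class Observation:
    """One tick of (hypothetical) sensor and model outputs."""
    checkpoint_visible: bool = False
    distance_cm: float = 999.0
    obstacle_ahead: bool = False
    hidden_object: Optional[str] = None
    audio_cue: Optional[str] = None


@dataclass
class Report:
    """What the robot saw and heard during the run."""
    seen: List[str] = field(default_factory=list)
    heard: List[str] = field(default_factory=list)


def step(state: State, obs: Observation, report: Report,
         arrive_cm: float = 30.0) -> State:
    """Return the next state and update the report in place."""
    # Audio cues are only logged here; a real robot would also react to them.
    if obs.audio_cue is not None:
        report.heard.append(obs.audio_cue)

    if state is State.SEARCH:
        return State.APPROACH if obs.checkpoint_visible else State.SEARCH
    if state is State.APPROACH:
        if obs.obstacle_ahead:
            return State.AVOID
        if not obs.checkpoint_visible:
            return State.SEARCH
        if obs.distance_cm <= arrive_cm:
            return State.INSPECT
        return State.APPROACH
    if state is State.AVOID:
        return State.AVOID if obs.obstacle_ahead else State.SEARCH
    if state is State.INSPECT:
        if obs.hidden_object is not None:
            report.seen.append(obs.hidden_object)
            return State.DONE
        return State.INSPECT
    return State.DONE
```

Calling `step` once per control tick with a fresh `Observation` keeps navigation decisions separate from the models that produce the observations, so either side can be tested on its own.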

Various model architectures were explored to accomplish the recognition tasks. The final architectures were efficient, mobile-optimized deep neural networks: a MobileNet-SSD was trained to recognize the checkpoints and identify the hidden objects behind the checkpoints, and an EfficientNet-Lite CNN was customized to perform audio recognition, in conjunction with real-time audio signal processing for noise reduction. A control structure processed the inputs and outputs of the models and sensors, and drove the robot to navigate and complete the scavenger hunt.
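
As one illustration of how a control structure might turn detector output into motion, the sketch below picks the highest-scoring checkpoint detection and converts its bounding box into a steering value. It assumes boxes arrive as normalized [ymin, xmin, ymax, xmax] values, a common layout for SSD detectors; the function names, the dictionary keys, the label string, and the thresholds are hypothetical and not taken from the project.

```python
from typing import Dict, List, Optional, Sequence


def pick_best(detections: List[Dict], label: str,
              min_score: float = 0.5) -> Optional[Dict]:
    """Return the highest-scoring detection with the given label, or None."""
    candidates = [
        d for d in detections
        if d["label"] == label and d["score"] >= min_score
    ]
    return max(candidates, key=lambda d: d["score"], default=None)


def steering_from_box(box: Sequence[float], max_turn: float = 1.0,
                      deadband: float = 0.05) -> float:
    """Map a normalized box to a turn value in [-max_turn, max_turn].

    Negative values mean "turn left", positive values mean "turn right",
    and 0.0 means the target is close enough to the image center.
    """
    _ymin, xmin, _ymax, xmax = box
    center_x = (xmin + xmax) / 2.0
    error = center_x - 0.5  # offset of the box center from the image center
    if abs(error) < deadband:
        return 0.0
    # Proportional control: a target at the image edge gives a full turn.
    turn = 2.0 * error * max_turn
    return max(-max_turn, min(max_turn, turn))
```

A proportional rule like this is only one option; it is shown because it needs nothing beyond the box coordinates the detector already returns.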

Our team's approach focused on model efficiency and inference speed while maintaining high classification accuracy. The models were integer-quantized and converted to TensorFlow Lite (TFLite) format, which the Edge TPU requires for on-device inference.
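
To make the idea of integer quantization concrete, the sketch below shows the affine mapping used by 8-bit quantization schemes such as TensorFlow Lite's, where real_value = (int8_value - zero_point) * scale. It is plain Python for illustration only: the function names and example ranges are assumptions, and in practice a converter typically derives these parameters per tensor, for example from a representative dataset.

```python
from typing import List, Tuple


def quant_params(x_min: float, x_max: float,
                 qmin: int = -128, qmax: int = 127) -> Tuple[float, int]:
    """Compute (scale, zero_point) for an affine int8 mapping."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int,
             qmin: int = -128, qmax: int = 127) -> List[int]:
    """Map real values to integers, clamping to the int8 range."""
    return [
        max(qmin, min(qmax, int(round(v / scale)) + zero_point))
        for v in values
    ]


def dequantize(q_values: List[int], scale: float,
               zero_point: int) -> List[float]:
    """Map integers back to approximate real values."""
    return [(q - zero_point) * scale for q in q_values]
```

Values outside the calibrated range saturate at the int8 limits, which is one reason the range used for calibration matters for accuracy after quantization.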

Won Best in Show at the Spring 2020 M.Sc. Analytics Capstone Showcase.

Teammates: Raghav Atal, Dan Dobrzynski, Scott Shepard, & Kevin Stutenberg.

Advisor: Dr. Yuri Balasanov

Audrey Salerno
Data Scientist

My research interests include deep learning for computer vision and audio signal processing.
