Full-motion video use case for semi-autonomous driving
Companies in the security, consumer, retail, telecom, and autonomous driving industries are most likely to benefit from automating the extraction, analysis, and understanding of key information from frame-by-frame video images. For example, AI in semi-autonomous cars can track potentially dangerous objects and alert the driver. Split-second detection is critical to safety in this case, so it’s essential to optimize both hardware and software for speed.
State-of-the-art object detection can identify and categorize every instance of an object in each frame of a full-motion video. The most common technique for this task is the convolutional neural network (CNN). A CNN breaks an image down into smaller pieces (called features) and scans the image, position by position, for matches to each feature. The math used to calculate these feature matches is called a convolution, which gives the algorithm its name. A CNN also makes use of pooling, which compresses the results into a smaller form while retaining key information. Because any given CNN may be searching for multiple image matches, the resulting algorithm is composed of many different layers.
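The convolution and pooling steps can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function names `convolve2d` and `max_pool`, the 5x5 test image, and the hand-written edge feature are all assumptions made for the example. A real CNN learns its features from training data and runs on optimized libraries and hardware. As in most deep learning libraries, the "convolution" here slides the feature over the image without flipping it.

```python
def convolve2d(image, kernel):
    """Slide the kernel over every position where it fits and
    return a grid of match scores (one score per position)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep only the largest score in each non-overlapping
    size x size window, shrinking the grid."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 5x5 grayscale image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical hand-written feature that responds to a dark-to-bright vertical edge.
edge_feature = [[-1, 1],
                [-1, 1]]

feature_map = convolve2d(image, edge_feature)  # 4x4 grid of match scores
pooled = max_pool(feature_map)                 # 2x2 summary of the strongest matches

print(feature_map)
print(pooled)
```

The feature map is nonzero only where the window straddles the edge, and the pooled grid keeps that signal in a quarter of the space. Stacking many such feature and pooling steps is what produces the layers described above.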
As CNN techniques have advanced, many variants have emerged. One breakthrough is the region-based CNN (R-CNN). This approach gives each feature of the CNN more detailed data than the classic CNN approach. To reduce computation costs, the algorithm scores each part of the image and focuses on the areas most likely to contain vital information, called regions of interest (RoIs). The algorithm then uses these RoIs to classify objects and localize them. Each localization is highlighted on the image with a bounding box.
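The idea of scoring regions and keeping only the most promising ones can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names `score_regions` and `top_rois`, the fixed 2x2 window, the sample activation grid, and the use of mean activation as the score. Published R-CNN variants propose regions with more sophisticated methods (selective search in the original R-CNN, a learned region proposal network in Faster R-CNN), so this shows the shape of the idea rather than the actual algorithm.

```python
def score_regions(activation, win=2):
    """Score every win x win window by its mean activation.
    This is a toy stand-in for a learned region scorer."""
    regions = []
    for top in range(len(activation) - win + 1):
        for left in range(len(activation[0]) - win + 1):
            total = sum(activation[top + i][left + j]
                        for i in range(win) for j in range(win))
            # Bounding box as (x_min, y_min, x_max, y_max).
            box = (left, top, left + win, top + win)
            regions.append((total / (win * win), box))
    return regions


def top_rois(regions, k=1):
    """Keep only the k highest-scoring regions so later stages
    spend their computation where an object is most likely."""
    return sorted(regions, key=lambda r: r[0], reverse=True)[:k]


# Hypothetical activation grid: high values where an object is likely.
activation = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 9],
    [0, 0, 0, 0],
]

for score, box in top_rois(score_regions(activation), k=2):
    print(f"score={score:.2f} box={box}")
```

In a full detector, each retained box would then be passed to a classifier, and the box coordinates would be drawn on the frame as the bounding box.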
By: Charis Loveland