In:
IPSJ Transactions on Computer Vision and Applications, Springer Science and Business Media LLC, Vol. 10, No. 1 (2018-12)
Abstract:
Motion information can be important for detecting objects, but it has been used less for pedestrian detection, particularly in deep-learning-based methods. Following the success of two-stream convolutional networks, we propose a method that uses deep motion features as well as deep still-image features, with each stream trained separately for spatial and temporal inputs. To extract motion cues for detection that are differentiated from other background motions, the temporal stream takes as input the difference between frames that are weakly stabilized by optical flow. To make the networks applicable to bounding-box-level detection, the mid-level features of the two streams are concatenated and combined with a sliding-window detector. We also introduce transfer learning from multiple sources in the two-stream networks, which transfers still-image and motion features from ImageNet and an action recognition dataset, respectively, to overcome the insufficiency of training data for convolutional neural networks in pedestrian datasets. We conducted an evaluation on two popular large-scale pedestrian benchmarks, namely the Caltech Pedestrian Detection Benchmark and the Daimler Mono Pedestrian Detection Benchmark. We observed a 10% improvement over the same method without motion features.
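The two core ideas in the abstract (a temporal-stream input built from weakly stabilized frame differences, and concatenation of mid-level features from the two streams before a sliding-window detector) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names are hypothetical, optical-flow estimation and warping are assumed to happen upstream, and NumPy arrays stand in for CNN feature maps.

```python
import numpy as np

def stabilized_difference(frame_t, frame_prev_warped):
    """Temporal-stream input: the difference between the current frame and
    the previous frame after weak stabilization (i.e., after the previous
    frame has been warped toward the current one using coarse optical flow).
    Flow estimation and warping are assumed to be done beforehand."""
    return frame_t.astype(np.float32) - frame_prev_warped.astype(np.float32)

def fuse_mid_level_features(spatial_feat, temporal_feat):
    """Bounding-box-level fusion: concatenate mid-level feature maps from
    the spatial and temporal streams along the channel axis; the combined
    map would then feed a sliding-window detector."""
    return np.concatenate([spatial_feat, temporal_feat], axis=-1)
```

For example, fusing an (H, W, C) spatial feature map with an (H, W, C) temporal one yields an (H, W, 2C) map, over which detection windows are scored.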
Type of Medium:
Online Resource
ISSN:
1882-6695
DOI:
10.1186/s41074-018-0048-5
Language:
English
Publisher:
Springer Science and Business Media LLC
Publication Date:
2018
ZDB ID:
2769752-6