Depth estimation

Depth estimation with event cameras is possible by applying the usual disparity calculation approach to a calibrated stereo camera rig. The most straightforward method is to accumulate frames from the events of both cameras and run a conventional disparity estimation algorithm on them. This approach has limitations, since accumulated frames can contain little texture, which degrades the quality of the estimated disparity.
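On a rectified stereo pair, disparity relates to metric depth through the pinhole model: depth is the focal length times the stereo baseline divided by disparity. A minimal sketch of that relation follows; focalLengthPx and baselineMeters are placeholders standing in for values taken from the calibration.

// Depth from disparity on a rectified stereo pair: Z = f * B / d.
// focalLengthPx (pixels) and baselineMeters (meters) are placeholders
// for values that come from the stereo calibration.
float depthFromDisparity(const float disparityPx, const float focalLengthPx, const float baselineMeters) {
    // A zero or negative disparity carries no depth information
    if (disparityPx <= 0.0f) {
        return -1.0f; // no valid depth
    }
    return (focalLengthPx * baselineMeters) / disparityPx;
}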

The dv-processing library provides the dv::camera::StereoGeometry class and a few disparity estimation algorithms that, in combination, can be used to build a depth estimation pipeline.

Semi-dense stereo block matching

Dense block matching here refers to the most straightforward approach: accumulating full frames and running conventional disparity estimation on top of them. Since accumulated frames contain only limited texture, because pixels react only to brightness changes, this approach is referred to as semi-dense. The SemiDenseStereoMatcher class wraps the disparity estimation part; the estimated disparity can then be used to calculate depth with dv::camera::StereoGeometry.
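Stripped of the library wrapper, the underlying idea can be sketched with plain OpenCV: two time-synchronized frames accumulated from the left and right event streams are fed to a conventional stereo matcher. This is an illustration only, not the library's internal implementation, and the matcher parameters below are illustrative assumptions.

#include <opencv2/calib3d.hpp>

// Sketch: conventional disparity estimation on two accumulated frames.
// Parameter values are illustrative, not tuned recommendations.
cv::Mat disparityFromAccumulatedFrames(const cv::Mat &leftFrame, const cv::Mat &rightFrame) {
    // 48 disparity levels, 11x11 matching blocks (illustrative values)
    auto sgbm = cv::StereoSGBM::create(0, 48, 11);
    cv::Mat disparity;
    // Output is 16-bit fixed-point disparity with 4 fractional bits
    sgbm->compute(leftFrame, rightFrame, disparity);
    return disparity;
}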

The following sample code shows the use of SemiDenseStereoMatcher together with dv::camera::StereoGeometry to run a real-time depth estimation pipeline on a calibrated stereo camera.

#include <dv-processing/camera/calibration_set.hpp>
#include <dv-processing/core/stereo_event_stream_slicer.hpp>
#include <dv-processing/depth/semi_dense_stereo_matcher.hpp>
#include <dv-processing/io/stereo_capture.hpp>
#include <dv-processing/noise/background_activity_noise_filter.hpp>

#include <opencv2/highgui.hpp>

int main() {
    using namespace std::chrono_literals;

    // Path to a stereo calibration file, replace with a file path on your local file system
    const std::string calibrationFilePath = "path/to/calibration.json";

    // Load the calibration file
    auto calibration = dv::camera::CalibrationSet::LoadFromFile(calibrationFilePath);

    // It is expected that the calibration file will have "C0" as the left camera
    auto leftCamera = calibration.getCameraCalibration("C0").value();

    // The second camera is assumed to be the right-side camera
    auto rightCamera = calibration.getCameraCalibration("C1").value();

    // Open the stereo camera with camera names from calibration
    dv::io::StereoCapture capture(leftCamera.name, rightCamera.name);

    // Make sure both cameras support event stream output, throw an error otherwise
    if (!capture.left.isEventStreamAvailable() || !capture.right.isEventStreamAvailable()) {
        throw dv::exceptions::RuntimeError("Input camera does not provide an event stream.");
    }

    // Initialize a stereo block matcher with a stereo geometry from calibration and the preconfigured SGBM instance
    dv::SemiDenseStereoMatcher blockMatcher(std::make_unique<dv::camera::StereoGeometry>(leftCamera, rightCamera));

    // Initialization of a stereo event stream slicer
    dv::StereoEventStreamSlicer slicer;

    // Initialize a window to show previews of the output
    cv::namedWindow("Preview", cv::WINDOW_NORMAL);

    // Local event buffers to implement an overlapping window of events for accumulation
    dv::EventStore leftEventBuffer, rightEventBuffer;

    // Use one third of the resolution as the count of events per accumulated frame
    const size_t eventCount = static_cast<size_t>(leftCamera.resolution.area()) / 3;

    // Register a callback to be called every 33 milliseconds (~30Hz)
    slicer.doEveryTimeInterval(33ms, [&blockMatcher, &leftEventBuffer, &rightEventBuffer, eventCount](
                                         const auto &leftEvents, const auto &rightEvents) {
        // Push input events into the local buffers
        leftEventBuffer.add(leftEvents);
        rightEventBuffer.add(rightEvents);

        // If the number of events is above the count, just keep the latest events
        if (leftEventBuffer.size() > eventCount) {
            leftEventBuffer = leftEventBuffer.sliceBack(eventCount);
        }
        if (rightEventBuffer.size() > eventCount) {
            rightEventBuffer = rightEventBuffer.sliceBack(eventCount);
        }

        // Pass these events into the block matcher and estimate disparity; the matcher accumulates
        // frames internally. The disparity output is a 16-bit integer with sub-pixel precision.
        const auto disparity = blockMatcher.computeDisparity(leftEventBuffer, rightEventBuffer);

        // Convert disparity into 8-bit integers with scaling and normalize the output for a nice preview.
        // This loses the actual numeric value of the disparity, but it's a nice way to visualize it.
        cv::Mat disparityU8;
        disparity.convertTo(disparityU8, CV_8UC1, 1.0 / 16.0);
        cv::normalize(disparityU8, disparityU8, 0, 255, cv::NORM_MINMAX);

        // Convert the accumulated frames into colored images for preview.
        std::vector<cv::Mat> images(3);
        cv::cvtColor(blockMatcher.getLeftFrame().image, images[0], cv::COLOR_GRAY2BGR);
        cv::cvtColor(blockMatcher.getRightFrame().image, images[1], cv::COLOR_GRAY2BGR);

        // Apply color-mapping to the disparity image; this encodes depth with color: red is close, blue is far.
        cv::applyColorMap(disparityU8, images[2], cv::COLORMAP_JET);

        // Concatenate images and show them in a window
        cv::Mat preview;
        cv::hconcat(images, preview);
        cv::imshow("Preview", preview);
    });

    // Buffer input events in these variables to synchronize the inputs
    std::optional<dv::EventStore> leftEvents  = std::nullopt;
    std::optional<dv::EventStore> rightEvents = std::nullopt;

    // Run the processing loop while both cameras are connected
    while (capture.left.isRunning() && capture.right.isRunning()) {
        // Read events from the respective left / right cameras
        if (!leftEvents.has_value()) {
            leftEvents = capture.left.getNextEventBatch();
        }
        if (!rightEvents.has_value()) {
            rightEvents = capture.right.getNextEventBatch();
        }

        // Feed the data into the slicer and reset the buffers
        if (leftEvents && rightEvents) {
            slicer.accept(*leftEvents, *rightEvents);
            leftEvents  = std::nullopt;
            rightEvents = std::nullopt;
        }

        // Wait for a small amount of time to avoid CPU overload
        cv::waitKey(1);
    }

    return 0;
}
[Figure: _images/semi-dense.png]

Expected result of semi-dense disparity estimation. The output shows the two accumulated frames and a color-coded disparity map.

Note

The disparity map yields results only in areas with visible texture; areas without texture contain speckle noise.
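One common way to suppress this speckle noise is OpenCV's cv::filterSpeckles, which invalidates small connected components in a 16-bit disparity map. A minimal sketch follows, assuming the 16-bit fixed-point disparity produced by the matcher above; the size and difference thresholds are illustrative values, not tuned recommendations.

#include <opencv2/calib3d.hpp>

// 'disparity' is the 16-bit fixed-point disparity map from the sample above
// (4 fractional bits, i.e. a scale factor of 16)
cv::Mat filtered = disparity.clone();

// Value used to mark invalidated speckle pixels (an arbitrary choice here)
constexpr int16_t invalidDisparity = -16;

// Invalidate connected components smaller than 50 pixels whose internal
// disparity varies by more than 2 pixels (2 * 16 in fixed-point units)
cv::filterSpeckles(filtered, invalidDisparity, 50, 2 * 16);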

Sparse disparity estimation

The semi-dense approach is the most straightforward way to do stereo disparity estimation. An alternative is to perform disparity estimation only on selected sparse regions within the accumulated image. The sparse approach lets the implementation select regions with enough texture for disparity matching, reducing computational complexity and improving quality. It takes the coordinates of the points where disparity needs to be estimated, performs sparse accumulation only in the regions where disparity matching actually needs to happen, and runs correlation-based template matching of left-image patches on the right camera image. Each template is matched against the other image along a horizontal line using the normalized correlation coefficient (Pearson correlation); the best-scoring match is considered the correct one, and the corresponding disparity is assigned to that point.
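Conceptually, the per-point matching boils down to the following sketch (not the library's actual implementation): a patch around the point of interest in the left image is scored against a horizontal strip of the right image using OpenCV's normalized correlation coefficient, and the best-scoring location wins.

#include <opencv2/imgproc.hpp>

// Sketch of the per-point matching idea: 'templatePatch' is a patch cut out
// of the left image around a point of interest; 'searchStrip' is a strip of
// the right image on the same rectified row, spanning the disparity range.
cv::Point bestMatch(const cv::Mat &templatePatch, const cv::Mat &searchStrip) {
    cv::Mat scores;
    // TM_CCOEFF_NORMED computes the normalized (Pearson) correlation
    // coefficient between the template and every candidate location
    cv::matchTemplate(searchStrip, templatePatch, scores, cv::TM_CCOEFF_NORMED);

    // The highest-scoring location is taken as the match
    cv::Point maxLoc;
    cv::minMaxLoc(scores, nullptr, nullptr, nullptr, &maxLoc);
    return maxLoc;
}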

The following sample code shows the use of sparse disparity block matcher with a live calibrated stereo camera:

#include <dv-processing/camera/calibration_set.hpp>
#include <dv-processing/cluster/mean_shift/event_store_adaptor.hpp>
#include <dv-processing/core/stereo_event_stream_slicer.hpp>
#include <dv-processing/data/utilities.hpp>
#include <dv-processing/depth/sparse_event_block_matcher.hpp>
#include <dv-processing/io/stereo_capture.hpp>
#include <dv-processing/visualization/colors.hpp>

#include <opencv2/highgui.hpp>

int main() {
    using namespace std::chrono_literals;

    // Path to a stereo calibration file, replace with a file path on your local file system
    const std::string calibrationFilePath = "path/to/calibration.json";

    // Load the calibration file
    auto calibration = dv::camera::CalibrationSet::LoadFromFile(calibrationFilePath);

    // It is expected that the calibration file will have "C0" as the left camera
    auto leftCamera = calibration.getCameraCalibration("C0").value();

    // The second camera is assumed to be the right-side camera
    auto rightCamera = calibration.getCameraCalibration("C1").value();

    // Open the stereo camera with camera names from calibration
    dv::io::StereoCapture capture(leftCamera.name, rightCamera.name);

    // Make sure both cameras support event stream output, throw an error otherwise
    if (!capture.left.isEventStreamAvailable() || !capture.right.isEventStreamAvailable()) {
        throw dv::exceptions::RuntimeError("Input camera does not provide an event stream.");
    }

    // Matching window size for the block matcher
    const cv::Size window(24, 24);
    // Minimum disparity value to measure
    const int minDisparity = 0;
    // Maximum disparity value
    const int maxDisparity = 40;
    // Minimum z-score value that a valid match can have
    const float minScore = 0.0f;

    // Initialize the block matcher with rectification
    auto matcher = dv::SparseEventBlockMatcher(std::make_unique<dv::camera::StereoGeometry>(leftCamera, rightCamera),
        window, maxDisparity, minDisparity, minScore);

    // Initialization of a stereo event stream slicer
    dv::StereoEventStreamSlicer slicer;

    // Initialize a window to show previews of the output
    cv::namedWindow("Preview", cv::WINDOW_NORMAL);

    // Local event buffers to implement an overlapping window of events for accumulation
    dv::EventStore leftEventBuffer, rightEventBuffer;

    // Use one third of the resolution as the count of events per accumulated frame
    const size_t eventCount = static_cast<size_t>(leftCamera.resolution.area()) / 3;

    // Register a callback to be called every 20 milliseconds (50Hz)
    slicer.doEveryTimeInterval(20ms, [&matcher, &leftEventBuffer, &rightEventBuffer, eventCount, &window](
                                         const auto &leftEvents, const auto &rightEvents) {
        // Push input events into the local buffers
        leftEventBuffer.add(leftEvents);
        rightEventBuffer.add(rightEvents);

        // If the number of events is above the count, just keep the latest events
        if (leftEventBuffer.size() > eventCount) {
            leftEventBuffer = leftEventBuffer.sliceBack(eventCount);
        }
        if (rightEventBuffer.size() > eventCount) {
            rightEventBuffer = rightEventBuffer.sliceBack(eventCount);
        }

        // Number of clusters to extract
        constexpr int numClusters = 100;

        // Initialize the mean-shift clustering algorithm
        dv::cluster::mean_shift::MeanShiftEventStoreAdaptor meanShift(leftEventBuffer, 10.f, 1.0f, 20, numClusters);

        // Find cluster centers which are going to be used for disparity estimation
        auto centers = meanShift.findClusterCentres<dv::cluster::mean_shift::kernel::Epanechnikov>();

        // Run disparity estimation; the output will contain a disparity estimate for each of the given points.
        const std::vector<dv::SparseEventBlockMatcher::PixelDisparity> estimates
            = matcher.computeDisparitySparse(leftEventBuffer, rightEventBuffer, dv::data::convertToCvPoints(centers));

        // Convert the accumulated frames into colored images for preview.
        std::vector<cv::Mat> images(2);
        cv::cvtColor(matcher.getLeftFrame().image, images[0], cv::COLOR_GRAY2BGR);
        cv::cvtColor(matcher.getRightFrame().image, images[1], cv::COLOR_GRAY2BGR);

        // Visualize the matched blocks
        int32_t index = 0;
        for (const auto &point : estimates) {
            // If the point estimation is invalid, do not show a preview of it
            if (!point.valid) {
                continue;
            }

            // The rest of the code draws the match according to the disparity value on the
            // preview images.
            const cv::Scalar color = dv::visualization::colors::someNeonColor(index++);
            // Draw some nicely colored markers and rectangles.
            cv::drawMarker(images[1], *point.matchedPosition, color, cv::MARKER_CROSS, 7);
            cv::rectangle(images[1],
                cv::Rect(point.matchedPosition->x - (window.width / 2), point.matchedPosition->y - (window.height / 2),
                    window.width, window.height),
                color);
            cv::rectangle(images[0],
                cv::Rect(point.templatePosition->x - (window.width / 2),
                    point.templatePosition->y - (window.height / 2), window.width, window.height),
                color);
        }

        // Concatenate images and show them in a window
        cv::Mat preview;
        cv::hconcat(images, preview);
        cv::imshow("Preview", preview);
    });

    // Buffer input events in these variables to synchronize the inputs
    std::optional<dv::EventStore> leftEvents  = std::nullopt;
    std::optional<dv::EventStore> rightEvents = std::nullopt;

    // Run the processing loop while both cameras are connected
    while (capture.left.isRunning() && capture.right.isRunning()) {
        // Read events from the respective left / right cameras
        if (!leftEvents.has_value()) {
            leftEvents = capture.left.getNextEventBatch();
        }
        if (!rightEvents.has_value()) {
            rightEvents = capture.right.getNextEventBatch();
        }

        // Feed the data into the slicer and reset the buffers
        if (leftEvents && rightEvents) {
            slicer.accept(*leftEvents, *rightEvents);
            leftEvents  = std::nullopt;
            rightEvents = std::nullopt;
        }

        // Wait for a small amount of time to avoid CPU overload
        cv::waitKey(1);
    }

    return 0;
}
[Figure: _images/sparse-disparity.png]

Expected result of sparse disparity estimation. The colored rectangles represent sparse blocks that are matched on the right-side image; block colors correspond across both images. Note that the frames are sparse as well: accumulation happens only in relevant areas around the points of interest, which are selected in areas of high event density by mean-shift cluster extraction.
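To turn these sparse matches into metric depth, the horizontal offset between the template position and the matched position can be treated as disparity and triangulated with the calibrated focal length and baseline. The helper below is a hypothetical sketch: it assumes both positions are expressed in rectified image coordinates, and focalLengthPx and baselineMeters are placeholders for values taken from the calibration.

#include <dv-processing/depth/sparse_event_block_matcher.hpp>

#include <optional>

// Hypothetical helper: triangulate depth for one sparse match. Assumes both
// positions are in rectified image coordinates; focalLengthPx (pixels) and
// baselineMeters (meters) stand in for values from the stereo calibration.
std::optional<float> depthOfMatch(const dv::SparseEventBlockMatcher::PixelDisparity &point,
    const float focalLengthPx, const float baselineMeters) {
    if (!point.valid) {
        return std::nullopt;
    }

    // Disparity is the horizontal offset between the left-image template
    // position and its match in the right image
    const float disparityPx = point.templatePosition->x - point.matchedPosition->x;
    if (disparityPx <= 0.f) {
        return std::nullopt;
    }

    // Pinhole model: Z = f * B / d
    return (focalLengthPx * baselineMeters) / disparityPx;
}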