Starting the build
Building the whole SDK and its dependencies takes a long time, so let’s start the build right away, before looking at what the archive contains.
$ cd realsense_sdk_beta3
$ sudo ./install_script.sh
This process will take a very long time. Let’s use that opportunity to open a second SSH connection to the board and make sure that your Intel RealSense camera will be recognized:
Note: this is the only change that needs to be done outside the classic snap, as we need system services like udev to know about it.
You can now unplug your Intel RealSense camera and plug it back in. The system will then detect it.
RealSense SDK Content
While the build is proceeding, let’s look at the files extracted from the bz2 archive:
~/realsense_sdk_beta3$ ls
README.md
beignet-1.2.1-source.tar.gz
build
install_script.sh
librealsense_persontracking_20161220.tar.gz
librealsense_slam_20161221.tar.bz2
librealsense_v1.11.2.tar.gz
opencv_3.1.0.tar.gz
realsense_object_recognition20161208.tar.bz2
realsense_samples-v0.6.2.tar.gz
realsense_sdk_07_12_2016WW50.tar.gz
release_notes
zr300_firmware_update_prq_2.zip
In addition to the README.md, the install script, the release notes, the ZR300 firmware and the build directory, the archive embeds several projects. Let’s detail them in build dependency order:
beignet-1.2.1-source.tar.gz
Beignet is an open source implementation of the OpenCL specification, a generic compute-oriented API. Its code base contains the code needed to run OpenCL programs on Intel GPUs: it defines and implements the OpenCL host functions required to initialize the device, create the command queues, the kernels and the programs, and run them on the GPU. It also contains the compiler part of the stack, which lives in backend/.
opencv_3.1.0.tar.gz
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision. It has C++, C, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. OpenCV was designed for computational efficiency, with a strong focus on real-time applications. Written in optimized C/C++, the library can take advantage of multi-core processing. When enabled with OpenCL, it can also take advantage of the hardware acceleration of the underlying heterogeneous compute platform.
librealsense_v1.11.2.tar.gz
This project is a cross-platform library (Linux, Windows, Mac) for capturing data from the Intel® RealSense™ F200, SR300, R200, LR200 and the ZR300 cameras. This effort was initiated to better support researchers, creative coders, and app developers in domains such as robotics, virtual reality, and the internet of things. Several often-requested features of RealSense™ devices are implemented in this project, including multi-camera capture.
The following features are available in this release:
- Native streams: depth, color, infrared and fisheye.
- Synthetic streams: rectified images, depth aligned to color and vice versa, etc.
- Intrinsic/extrinsic calibration information.
- Majority of hardware-specific functionality for individual camera generations (UVC XU controls).
- Multi-camera capture across heterogeneous camera architectures (e.g. mix R200 and F200 in the same application).
- Motion-tracking sensors acquisition (ZR300 only)
This project also includes a number of samples that show how to get raw data from the camera. They are built by the general build script. We’ll detail some of them in the next step.
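To make the "depth aligned to color" synthetic stream and the role of the intrinsic/extrinsic calibration more concrete, here is a minimal sketch of the underlying geometry. It assumes an ideal pinhole camera with no lens distortion. This is not librealsense code: the Intrinsics class, the function names and all calibration values are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Hypothetical pinhole intrinsics (no distortion model)."""
    fx: float   # focal length in pixels, x axis
    fy: float   # focal length in pixels, y axis
    ppx: float  # principal point, x coordinate in pixels
    ppy: float  # principal point, y coordinate in pixels


def deproject(intr, u, v, depth_m):
    """Pixel (u, v) plus a depth in meters -> 3D point in that camera's frame."""
    x = (u - intr.ppx) / intr.fx * depth_m
    y = (v - intr.ppy) / intr.fy * depth_m
    return (x, y, depth_m)


def transform(rotation, translation, point):
    """Apply extrinsics: a 3x3 rotation (nested lists) followed by a translation."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def project(intr, point):
    """3D point in a camera's frame -> pixel coordinates in that camera's image."""
    x, y, z = point
    return (x / z * intr.fx + intr.ppx, y / z * intr.fy + intr.ppy)


def depth_pixel_to_color_pixel(depth_intr, color_intr, rotation, translation,
                               u, v, depth_m):
    """Find where a depth pixel lands in the color image."""
    point_in_depth_frame = deproject(depth_intr, u, v, depth_m)
    point_in_color_frame = transform(rotation, translation, point_in_depth_frame)
    return project(color_intr, point_in_color_frame)


# Made-up calibration: identical intrinsics, color sensor offset by 5 cm on x.
depth_intr = Intrinsics(fx=600.0, fy=600.0, ppx=320.0, ppy=240.0)
color_intr = Intrinsics(fx=600.0, fy=600.0, ppx=320.0, ppy=240.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
offset = (0.05, 0.0, 0.0)

print(depth_pixel_to_color_pixel(depth_intr, color_intr, identity, offset,
                                 320, 240, 1.0))
```

With these made-up values, the center depth pixel seen at 1 m lands about 30 pixels to the right of the center of the color image. Repeating the mapping for every depth pixel is the basic idea behind an aligned stream.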
librealsense_persontracking_20161220.tar.gz
Intel® RealSense™ Person Library provides the ability to sense and recognize people, understand what they do, and interact with them.
The following features are available in this release:
- Person Detection: Detects people in a scene. Provides their location through a bounding rectangle.
- Person Tracking: Follows people within a space. Provides the person’s Center of Mass (COM) and handles occlusions and cases in which the person exits the field of view (FOV).
- Person Recognition: Identifies a person by ID. Supports databases of up to 30 registered users.
- Body Tracking: Tracks body part movements. Provides six points of interest on the body (head, chest, shoulders, and extremities of the hands). Body tracking is supported up to a distance of 2.8m from the camera.
- Body Gestures: Understands pre-defined body movements. Supports a pointing gesture (including the pointing vector) and a wave gesture.
- Face Tracking: Tracks face movements. Provides the face direction in values of yaw, pitch, and roll.
realsense_object_recognition20161208.tar.bz2
Intel® RealSense™ Object Library enables machines to understand what they are viewing and provides meaning to computer vision made possible by Intel RealSense cameras. This ability facilitates more dynamic human machine interaction. Intel® RealSense™ Object Library uses a CNN-based architecture that utilizes depth to efficiently and accurately classify and localize objects. This middleware includes features for recognizing, localizing, and tracking objects from a pre-defined library.
The following features are available in this release:
- Object Recognition (OR): Identifies a single object in the scene based on pre-trained classifiers in a pre-defined ROI.
- Object Localization (OD): Identifies and locates multiple objects in a scene.
- 3D Object Position: Provides the x,y,z world coordinates of the center of the object. The more exact the bounding box, the more exact this value will be. This is only available in instances where depth is available and will not be provided as a direct output of localization.
- Object Tracking: Enables the camera to keep track of objects that have been previously localized.
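The remark that a more exact bounding box gives a more exact 3D position can be illustrated with a small sketch. It is an assumption about how such a value could be computed, not the Object Library's actual method: the function name, the use of a median depth and the toy intrinsics are all hypothetical.

```python
from statistics import median


def object_center_3d(depth_m, box, fx, fy, ppx, ppy):
    """Estimate the 3D center of an object from its bounding box.

    depth_m: 2D list of depth values in meters (0 means "no depth").
    box: (left, top, right, bottom) in pixels, right and bottom exclusive.
    fx, fy, ppx, ppy: hypothetical pinhole intrinsics of the depth image.
    Returns (x, y, z) in meters, or None when the box holds no valid depth.
    """
    left, top, right, bottom = box
    valid = [
        depth_m[v][u]
        for v in range(top, bottom)
        for u in range(left, right)
        if depth_m[v][u] > 0
    ]
    if not valid:
        return None
    z = median(valid)                 # depth of the "typical" pixel in the box
    center_u = (left + right - 1) / 2
    center_v = (top + bottom - 1) / 2
    return ((center_u - ppx) / fx * z, (center_v - ppy) / fy * z, z)


# Toy 6x6 depth image: background at 3 m, a small object at 1 m in the middle.
depth = [[3.0] * 6 for _ in range(6)]
for row in (2, 3):
    for col in (2, 3):
        depth[row][col] = 1.0

tight = object_center_3d(depth, (2, 2, 4, 4), fx=10.0, fy=10.0, ppx=2.5, ppy=2.5)
loose = object_center_3d(depth, (0, 0, 6, 6), fx=10.0, fy=10.0, ppx=2.5, ppy=2.5)
print("tight box:", tight)
print("loose box:", loose)
```

In this toy scene the tight box covers only object pixels, so its depth estimate is the object's own distance. The loose box is dominated by background pixels, which pull the estimate towards the wall behind the object.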
librealsense_slam_20161221.tar.bz2
Intel® RealSense™ SLAM Library provides Simultaneous Localization and Mapping capabilities.
The following features are available in this release:
- Real-time 6 degrees of freedom (6DoF) camera tracking.
- Learning an area by associating its appearance with 6DoF coordinates, which enables re-localization.
- Real-time construction of a 2D occupancy map of a captured environment.
SLAM is tailored for, although not limited to, robots: Autonomous robot navigation requires the ability to build a high precision map of the surrounding environment and enable the robot to locate itself in the map for path planning and routing purposes.
SLAM uses the visual-inertial sensor of the RealSense camera to estimate odometry and concurrently builds a map. The constructed map can be used by applications to detect obstacles, label locations, and plan paths.
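As a rough illustration of what a 2D occupancy map is, the sketch below marks grid cells as occupied from obstacle points observed at known robot poses. It only shows the data structure: the SLAM library estimates the poses itself and its real map format and API are not shown here. The function name, the grid layout and the sample values are assumptions.

```python
import math


def build_occupancy_grid(observations, size, resolution_m):
    """Build a size x size grid; 1 = occupied, 0 = unknown or free.

    observations: list of (pose, points) pairs, where
      pose   = (x, y, heading_rad) of the robot in the world frame,
      points = obstacle hits as (forward, left) offsets in the robot frame,
               in meters.
    The world origin is placed at the center of the grid.
    """
    grid = [[0] * size for _ in range(size)]
    half = size // 2
    for (px, py, heading), points in observations:
        cos_h, sin_h = math.cos(heading), math.sin(heading)
        for forward, left in points:
            # Rotate the robot-frame offset by the heading, then translate.
            world_x = px + cos_h * forward - sin_h * left
            world_y = py + sin_h * forward + cos_h * left
            col = math.floor(world_x / resolution_m) + half
            row = math.floor(world_y / resolution_m) + half
            if 0 <= row < size and 0 <= col < size:
                grid[row][col] = 1
    return grid


# Two made-up observations from the origin: one facing +x, one facing +y,
# each seeing an obstacle 1.2 m straight ahead.
grid = build_occupancy_grid(
    [((0.0, 0.0, 0.0), [(1.2, 0.0)]),
     ((0.0, 0.0, math.pi / 2), [(1.2, 0.0)])],
    size=10,
    resolution_m=0.5,
)
for row in grid:
    print("".join("#" if cell else "." for cell in row))
```

An application can then query such a grid to detect obstacles or plan a path around the occupied cells.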
realsense_sdk_07_12_2016WW50.tar.gz
The Intel® RealSense™ SDK for Linux provides libraries, tools, and samples to develop applications using Intel® RealSense™ cameras, over the Intel librealsense API.
The SDK provides record and playback functionality for camera streams, for test and validation.
The SDK includes libraries which support the projection of camera streams into a common world-space viewpoint, and libraries which enable the use of multiple middleware modules simultaneously for common multi-modal scenarios.
It features:
- Record and Play:
  - Record: The record module provides a utility to create a file, which can be used by the playback module to create a video source. The record module provides the same camera API as defined by the SDK (librealsense) and the record API to configure recording parameters such as output file and state (pause and resume). The record module loads librealsense to access the camera device and execute the set requests and reads, while writing the configuration and changes to the file.
  - Playback: The playback module provides a utility to create a video source from a file. The playback module provides the same camera API as defined by the SDK (librealsense), and the playback API to configure playback parameters such as input file, playback mode, seek, and playback state (pause and resume). The playback module supports files that were recorded using the Linux SDK recorder and the Windows RSSDK recorder (up to version 2016 R2).
- Frame data container: The SDK provides an image container for raw image access and basic image processing services, such as format conversion, mirror, rotation, and more. It caches the processing output to optimize multiple requests of the same operation. The image container includes image metadata, which may be used by any pipeline component to attach additional data or computer vision (CV) module processing output to be used by other pipeline components. The SDK uses a correlated samples container to provide access to camera images and motion sensor samples from the relevant streams, which are time-synchronized. The correlated samples container includes all relevant raw buffers, metadata, and information required to access the attached images.
- Spatial correlation and projection: The Spatial Correlation and Projection library provides utilities for spatial mapping:
  - Map between color or depth image pixel coordinates and real world coordinates.
  - Correlate depth and color images and align them in space.
- Pipeline: The pipeline is a class, which abstracts the details of how the cognitive data is produced by the computer vision modules. The application can focus on consuming the computer vision output, leaving the camera configuration and streaming details for the pipeline to handle.
- Samples:
  - Projection: The sample demonstrates how to use the different spatial correlation and projection functions, from a live camera and from a recorded file.
  - Record and Playback: The sample demonstrates how to record and play back a file while the application is streaming, with and without an active CV module, with minimal changes to the application compared to live streaming.
  - Video module, asynchronized: The sample demonstrates an application usage of a Computer Vision module, which implements asynchronous sample processing.
  - Video module, synchronized: The sample demonstrates an application usage of a Computer Vision module, which implements synchronous sample processing.
  - Fatal error recovery: The sample demonstrates how the application can recover from a fatal error in one of the SDK components (CV module or core module), without having to terminate.
- Tools:
  - Capture tool: Provides a GUI to view camera streams, create a new file from a live camera, and play a file in the supported formats. The tool provides options to render the camera or file images.
  - Projection tool: Provides simple visualization of the projection functions output, to allow human eye detection of major offsets in the projection computation.
  - System Info tool: Presents system data such as Linux name, Linux kernel version, CPU information, and more.
- Utilities:
  - Log: The SDK provides a logging library, which can be used by the SDK components and the application to log meaningful events.
  - Time Sync utility: Provides methods to synchronize multiple streams of images and motion samples based on the sample timestamps or sample numbers.
  - SDK Data Path utility: The SDK provides a utility to locate SDK files in the system. The utility is used by Computer Vision modules, which need to locate data files in the system that are constant for all applications (not application- or algorithm-instance specific).
  - FPS counter: Measures the actual FPS in a specific period. You can use this utility to check the actual FPS in all software stack layers in your applications, to analyze FPS latency in those layers.
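To give an idea of what the Time Sync utility listed above does, here is a minimal sketch of timestamp-based synchronization: each image is paired with the nearest motion sample, and the pair is dropped when the two are too far apart. This is a simplified stand-in written for illustration, not the SDK's implementation. The function name, the max_gap_ms parameter and the timestamps are hypothetical.

```python
from bisect import bisect_left


def pair_by_timestamp(image_ts, motion_ts, max_gap_ms):
    """Pair each image timestamp with the closest motion timestamp.

    image_ts, motion_ts: timestamps in milliseconds, sorted ascending.
    max_gap_ms: largest accepted distance between the two members of a pair.
    Returns a list of (image_timestamp, motion_timestamp) tuples.
    """
    pairs = []
    for t in image_ts:
        # Index of the first motion sample at or after the image timestamp.
        i = bisect_left(motion_ts, t)
        # The nearest sample is either that one or the one just before it.
        candidates = motion_ts[max(i - 1, 0):i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda m: abs(m - t))
        if abs(best - t) <= max_gap_ms:
            pairs.append((t, best))
    return pairs


# Made-up timestamps: images at roughly 30 FPS, motion samples in between.
images = [0, 33, 66, 200]
motion = [1, 5, 30, 35, 64, 70]
print(pair_by_timestamp(images, motion, max_gap_ms=5))
```

In this example the last image has no motion sample close enough to it, so it is left unpaired rather than being matched with stale data.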
realsense_samples-v0.6.2.tar.gz
We are going to detail the available samples in the next section!