YOLO and its descendants draw rectangles. SpiraVision 3D draws masks. The difference matters when two avocados are touching, when a bell pepper's true centroid is offset from the center of its bounding box, or when orientation determines whether the gripper closes on fruit or on stem. Our proprietary CNN, licensed from Spira Vision Systems, segments every produce class at the pixel level, fuses depth to locate the 3D centroid, and fits a PCA principal axis for orientation. It runs on commodity hardware and streams 6-DoF pick poses today.
Per-pixel masks from a single forward pass, then centroid and axis fusion into a 6-DoF pose: outputs that bounding-box detectors alone do not provide.
Active IR stereo at up to 90 fps, color-agnostic depth, global shutter for moving conveyors.
Dilated CNN delivers per-pixel multiclass masks in a single forward pass — not bounding boxes.
Mask-averaged depth fusion gives 3D centroid in robot frame, calibrated by Kabsch SVD.
PCA principal axis plus 6-DoF pose streamed via TCP, ROS 2, or JSON to the robot controller.
The Intel RealSense D435 projects a Class 1 IR speckle pattern at 850 nm and computes per-pixel depth from active stereo at up to 90 fps. The IR pattern is color-agnostic, so touching fruit is separated by physical geometry rather than RGB contrast.
A pixel-to-pixel dilated CNN runs in a GPU-accelerated Docker container, outputting multiclass masks in a single forward pass. The accompanying VCC tool lets non-experts add new SKUs.
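To make the step from mask to 3D centroid concrete, here is a minimal sketch of one plausible reading of mask-averaged depth fusion: deproject every masked pixel with a pinhole camera model and average the resulting points. The function name `mask_centroid_camera`, the intrinsics `fx, fy, cx, cy`, and the convention that zero depth means "no reading" are assumptions for illustration, not the product's actual API.

```python
import numpy as np

def mask_centroid_camera(depth_m, mask, fx, fy, cx, cy):
    """Return the 3D centroid (camera frame, metres) of the masked pixels.

    depth_m : (H, W) depth image in metres; 0 is assumed to mean "no reading".
    mask    : (H, W) boolean mask for one object instance.
    fx, fy, cx, cy : pinhole intrinsics in pixels (hypothetical values).
    """
    valid = mask.astype(bool) & (depth_m > 0)   # drop pixels without depth
    if not valid.any():
        raise ValueError("mask contains no valid depth pixels")
    v, u = np.nonzero(valid)                    # row (v) and column (u) indices
    z = depth_m[v, u]
    x = (u - cx) * z / fx                       # pinhole deprojection
    y = (v - cy) * z / fy
    return np.array([x.mean(), y.mean(), z.mean()])
```

Averaging over the whole mask, rather than sampling a single pixel, makes the estimate less sensitive to depth holes and speckle noise. The result is still in the camera frame at this stage.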
Sensate's calibration pipeline registers the depth frame into the robot's coordinate system via Kabsch SVD, fits PCA principal axes, and streams 6-DoF pick poses over TCP, ROS 2, or JSON.
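The two geometric steps named above can be sketched with NumPy. This is a generic textbook version of the Kabsch algorithm and of a PCA principal-axis fit, written under the assumption that calibration uses paired 3D points seen in both the camera and robot frames. The function names `kabsch` and `principal_axis` are hypothetical and do not reflect Sensate's code.

```python
import numpy as np

def kabsch(P, Q):
    """Rigid transform (R, t) that best maps points P onto paired points Q.

    P, Q : (N, 3) arrays, e.g. calibration targets in camera and robot frames.
    Returns R (3x3 rotation) and t (3,) such that Q is approximately P @ R.T + t.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p0).T @ (Q - q0)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q0 - R @ p0
    return R, t

def principal_axis(points):
    """Unit vector along the direction of largest variance of (N, 3) points.

    The sign is arbitrary: PCA cannot tell stem end from blossom end.
    """
    points = np.asarray(points, dtype=float)
    centred = points - points.mean(axis=0)
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    return Vt[0]
```

With `R, t` from calibration, a camera-frame centroid `c` maps to the robot frame as `R @ c + t`, and an axis `a` maps as `R @ a`. Because the axis sign is ambiguous, a real pick pose would need an additional cue to choose the approach direction.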