Like its predecessors, ROBART III was a laboratory surrogate, never intended for real-world operation: 1) it was not waterproof; 2) its mobility was constrained to planar surfaces, so it could not ascend or descend stairs; 3) it was not defensively armored; 4) it was not rugged; and, 5) it could not right itself in the event it flipped over. Instead, ROBART III was a concept-development platform built in my garage, optimally configured to support a development role in a laboratory environment.
The 12-volt ABEC drive motors were identical to those used on ROBART II, but equipped with higher-traction snow tires. System power was supplied by an 80-amp-hour 12-volt gel-cell battery that provided several hours of continuous operation between charges. Numerous hardware upgrades were made over ROBART III’s 15-year lifetime in support of more sophisticated navigation, collision avoidance, and mapping schemes, to include a MicroStrain gyro-stabilized magnetic compass, a KVH fiber-optic rate gyro, and a 2D Sick LMS-200 scanning lidar. Full-duplex data communication with the PC-based control station was accomplished via a 9600-baud Telesystems spread-spectrum RF link.
As seen in Figure a above, early versions of ROBART III had a temporary black/yellow pigtail hanging from the mobility base for manual battery charging. While the front bumper design seen in Figure b was added in 2002 primarily to support the Sick lidar, it also accommodated automatic recharging and tactile sensing as seen below. For compatibility with the existing battery chargers for ROBART I and ROBART II, the 1-inch aluminum contact strip on the front tactile-bumper segment (image top center) served as the negative recharging contact. Similarly, the descending vertical-spring contact (seen horizontal at image center) provided the positive connection.
In the mid-2000s, three 16-bit computers were added to the onboard architecture to support more advanced autonomy: 1) the Torso Computer was responsible for processing sonar range data, speech output, and integrated motion of the surrogate weapon and head; 2) the Vision Computer in the head was responsible for processing live video from the omnidirectional and pan-and-tilt cameras; and, 3) the Drive Computer in the mobility base controlled the drive motors in response to data from the Torso Computer, Sick lidar, KVH fiber-optic gyro, and the MicroStrain compass. Multiple 8-bit microcontrollers were still employed for low-level sensor processing and actuator control.
Intended for concept development and demonstration, the non-lethal-response weapon shown below was a pneumatically powered dart gun capable of firing a variety of 3/16-inch-diameter projectiles. Simulated tranquilizer darts (20-gauge spring-steel wires terminated with 3/16-inch nylon balls) illustrated a potential response application involving remote firing of incapacitating rounds by military or law enforcement personnel. A rotating-barrel arrangement allowed for multiple firings (six) with minimal mechanical complexity. The spinning barrel also imparted a rather sobering psychological message during system initialization.
The darts (or alternatively steel balls) were fired by a release of compressed air stored in a small pressurized accumulator at the rear of the gun assembly. This accumulator was monitored by a Micro Switch 242PC150G electronic pressure transducer and maintained at a constant pressure of 120 psi by a solenoid valve. To minimize air loss, another solenoid valve linking the gun accumulator to the active barrel was opened for precisely the amount of time required to expel the projectile. All six darts could be fired in approximately 1.5 seconds under repeatable launch conditions to ensure accurate performance. A visible-red laser gunsight was provided to facilitate manual as well as automatic targeting.
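The regulation and firing scheme described above can be sketched in a few lines. This is a hedged illustration, not the actual controller code: the hysteresis band, the timing model in `fire_pulse_ms`, and all function names are assumptions; only the 120-psi setpoint comes from the text.

```python
# Sketch of the accumulator pressure-maintenance and timed-release firing
# logic. The 120-psi setpoint is from the source; the deadband and the
# pulse-duration model are invented placeholders for illustration.

SETPOINT_PSI = 120.0
HYSTERESIS_PSI = 5.0  # assumed deadband to prevent valve chatter


def charge_valve_command(pressure_psi, valve_open):
    """Bang-bang control of the charging solenoid with hysteresis.

    Returns True to open the valve, False to close it."""
    if pressure_psi < SETPOINT_PSI - HYSTERESIS_PSI:
        return True            # accumulator low: admit air
    if pressure_psi >= SETPOINT_PSI:
        return False           # setpoint reached: seal accumulator
    return valve_open          # inside deadband: hold current state


def fire_pulse_ms(projectile_mass_g):
    """Open the barrel valve only as long as needed to expel the
    projectile, minimizing air loss. Linear model is a placeholder."""
    return 10.0 + 2.0 * projectile_mass_g
```

The hysteresis band is the standard way to keep a single solenoid from cycling rapidly around the setpoint; the original mechanical arrangement may have handled this differently.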
The surrogate weapon was designed for laboratory use only, supporting vision-based weapon control without undue risk to personnel. A fiber-optic sensor on the gun determined load status for each barrel (Figure a below). A local Ready/Standby switch enabled the air compressor and secondary accumulator charging, and a local Arm/Safe switch physically interrupted power to the trigger solenoid (Figure b). Parallel software disables for both of these functions were provided on the remote OCU. Two separate control lines were employed for the trigger solenoid, one active-high and the other active-low, ANDed together to minimize inadvertent activation during initialization or in the event of a computer reset. Redundant emergency overrides were also provided: 1) two local E-Stop buttons on the mobility base; 2) an RF-kill pendant; and, 3) a remote E-Stop button at the control station.
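The rationale behind the paired active-high/active-low trigger lines can be shown in a minimal truth-table sketch. Signal names are assumptions; the point is that a reset or power-up glitch tends to drive both lines to the same rail, a state that can never satisfy the ANDed condition.

```python
# Sketch of the redundant trigger-interlock logic described above. The
# solenoid driver effectively computes (fire_high AND NOT fire_low_n):
# the active-low line must be pulled low *and* the active-high line
# pulled high, deliberately, before any interlock even matters.
# All signal and parameter names are illustrative assumptions.

def trigger_energized(fire_high, fire_low_n, arm_switch, estop_ok):
    """True only when both control lines are deliberately asserted and
    the hardware Arm/Safe switch and E-Stop chain permit firing."""
    lines_asserted = fire_high and not fire_low_n  # ANDed control lines
    return lines_asserted and arm_switch and estop_ok

# During initialization or a computer reset, outputs typically default
# to the same logic level (both high or both low). Either way,
# (fire_high AND NOT fire_low_n) is False, so the solenoid stays off.
```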
Assisted weapon control was an extension of the reflexive-teleoperation concept developed on ROBART II. The issue of concern was the difficulty encountered when remotely driving a mobile robot equipped with surveillance and/or targeting cameras, plus an articulated weapon system. Experience gained through extended use of conventional UGVs revealed considerable shortcomings from a man/machine interface point of view. Simply put, if a remote operator has to master simultaneous manipulation of three different joysticks (i.e., drive and steering, camera pan and tilt, plus weapon control), the chances of hitting a moving target are minimal.
Our initial approach in addressing this problem involved making two of the three controllable elements (i.e., drive, camera, and weapon) slaves to the third, so the human operator only had to deal with one entity. For example, the head-mounted surveillance camera could be slaved to the weapon so the camera looked wherever the operator pointed the gun. If either the weapon pan-axis controller or the camera pan-axis controller approached their respective limits of travel, the mobility base automatically rotated in place or turned in the proper direction to restore the necessary range of payload motion. Alternatively, the weapon could be slaved to the surveillance camera, and so forth.
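The slaving scheme above can be summarized as a single control-cycle function: the camera pan mirrors the weapon pan, and once the commanded angle nears the travel limit, the base is commanded to rotate so the pan axes can re-center. The limit, margin, and gain values below are invented for illustration; the original controller was not specified at this level of detail.

```python
# Sketch of camera-slaved-to-weapon control with automatic base rotation
# near the pan-axis travel limit. All numeric values are assumptions.

PAN_LIMIT_DEG = 100.0    # assumed mechanical travel of either pan axis
LIMIT_MARGIN_DEG = 10.0  # begin rotating the base inside this margin
BASE_GAIN = 0.5          # assumed proportional gain (deg/s per deg)


def slave_update(weapon_pan_deg):
    """One control cycle: return (camera_pan_cmd_deg, base_rate_cmd)."""
    camera_pan = weapon_pan_deg  # camera slaved directly to the weapon
    base_rate = 0.0
    soft_limit = PAN_LIMIT_DEG - LIMIT_MARGIN_DEG
    if abs(weapon_pan_deg) > soft_limit:
        # Rotate the base toward the target so both payload pan axes
        # drift back toward their centers of travel.
        base_rate = BASE_GAIN * (abs(weapon_pan_deg) - soft_limit)
        if weapon_pan_deg < 0:
            base_rate = -base_rate
    return camera_pan, base_rate
```

Swapping which element is the master (weapon slaved to camera, and so forth) only changes which angle feeds the function; the limit-recovery logic is identical.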
Taking things a step further, final closed-loop control of weapon pan-and-tilt could be provided by the video target-acquisition system. Initial 360-degree motion detection for this behavior was supported by a ring of passive infrared sensors around the neck, an AM Sensors microwave motion detector behind the faceplate, and an omnidirectional camera mounted on the head. Fused outputs from these sensors were used to cue a high-resolution pan-tilt-zoom (PTZ) camera in azimuth and elevation, which provided video of potential targets for further assessment and classification. Final closed-loop weapon control was provided by gun-camera imagery.
The license-plate-detection method of Dlagnekov was investigated to evaluate the AdaBoost machine-learning/training algorithm for generic object detection, using ordinary soda cans to quantify performance with respect to scale, background clutter, and changes in specularity. To build a training set, the cans were randomly emplaced in a cluttered lab area at a variety of distances and orientations (Figure a below). Footage from ROBART III’s head-mounted camera was captured under three different lighting configurations as three separate MPEG videos, at a pixel resolution of 320×240 and a frame rate of 3 frames per second.
Each video was divided into two equal segments, with a portion from each segment pair randomly selected, resized to a resolution of 720×480, and extracted as a sequence of bitmap images, which were set aside to build the training set. Soda-can images were manually labeled in every 8th frame of the bitmap training images, then sorted into four separate groups based on their image dimensions (Figure a below). The remaining portions were merged together and used as test footage for our detection algorithm. Despite a few anomalies, the detection windows seen in Figure b below allowed ROBART III to effectively aim the pneumatically-powered weapon surrogate, albeit under ideal laboratory conditions.
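The training-set bookkeeping described above (label every 8th frame, then bin each labeled window by size) can be sketched as follows. Note that the actual size-grouping criteria were not specified; the pixel thresholds below are invented purely for illustration.

```python
# Illustrative sketch of training-set construction: frames selected for
# manual labeling at a fixed stride, and labeled soda-can windows binned
# into four size groups. The bin thresholds are assumptions.

SIZE_BINS = [24, 36, 48]  # assumed width thresholds (px) -> 4 groups


def label_frames(total_frames, stride=8):
    """Indices of the frames selected for manual labeling."""
    return list(range(0, total_frames, stride))


def size_group(box_width_px):
    """Assign a labeled detection window to one of four size groups,
    so a separate classifier stage can be trained per scale."""
    for group, threshold in enumerate(SIZE_BINS):
        if box_width_px <= threshold:
            return group
    return len(SIZE_BINS)  # widest windows fall in the last group
```

Training scale-specific groups is one common way to handle the large apparent-size variation of cans seen at different distances; the source does not state that this was the exact mechanism used.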
The robot’s PTZ camera protocol was later integrated with a two-stage search-and-engage algorithm that performed a wide-area scan for a pre-taught class of objects, then zoomed in to look for specific “vulnerabilities” assigned to that particular target. For the scenario depicted below, the sought object was the dark wooden box atop a mobile pedestal shown earlier in Figure b above, for which the designated “vulnerability” was a soda can inside this box. The surrogate weapon was automatically trained accordingly, using the gun-camera video and bore-sighted targeting laser, then fired under operator supervision.
Detection of the laser spot was disambiguated by grabbing two successive frames of video with the laser sight both on (Figure a below) and off (Figure b), then subtracting the two images to yield a binary difference image. This process continued as the Weapon Controller sent real-time error-based pan-and-tilt velocity commands to reposition the laser spot on target (note the spring-steel darts terminated by plastic balls embedded in the target).
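The on/off frame-differencing step and the error-based velocity commands lend themselves to a compact sketch. This is an assumed NumPy reconstruction of the technique the text describes, not the original Weapon Controller code; the threshold and gain are placeholders.

```python
# Sketch of laser-spot disambiguation by frame differencing, plus the
# proportional pan/tilt velocity commands that re-center the spot on the
# target. Threshold and gain values are illustrative assumptions.
import numpy as np


def find_laser_spot(frame_on, frame_off, threshold=60):
    """Subtract the laser-off frame from the laser-on frame, threshold
    the result into a binary difference image, and return the (row, col)
    centroid of the surviving pixels, or None if nothing survives."""
    diff = frame_on.astype(np.int16) - frame_off.astype(np.int16)
    mask = diff > threshold
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    return float(rows.mean()), float(cols.mean())


def pan_tilt_velocity(spot, target, gain=0.1):
    """Error-based velocity commands driving the spot onto the target."""
    if spot is None:
        return 0.0, 0.0
    tilt = gain * (target[0] - spot[0])
    pan = gain * (target[1] - spot[1])
    return pan, tilt
```

Differencing two frames taken milliseconds apart makes the modulated laser spot the dominant change in the scene, which is what lets a simple fixed threshold work even against bright clutter.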
In the summer of 2006, a two-rail surrogate-missile payload that fired rubber-tipped 8.5-inch plastic missiles was added to the left shoulder pod (Figure a below). Sold as part of a toy rocket launcher, these missiles could attain an altitude of 400 feet, powered by a child stomping on a rubber air bag connected to the vertical launch tube. The addition of this payload required a larger main accumulator, and a higher-throughput mechanical regulator to maintain the secondary accumulator at a constant pressure of 120 psi (Figure b below). When fired at a 45-degree angle, these surrogate missiles could easily travel about half a block downrange.
Rather than relying upon a large catch-all object-detection method, we implemented numerous specialized detection strategies that could be selectively applied once the robot had established suitable context for their deployment. While the traditional approach often views computer vision as a more isolated task, our vision development took full advantage of the robot’s navigational and perception capabilities to simplify the procedure. The figure below shows the results of a doorway-detection algorithm that analyzed the SLAM data collected by the Sick lidar during building exploration, with icons inserted into the map representation to flag open doorways that subsequently cued the vision system to look for room signs.
Rather than continuously search the full field of regard for signs from which to extract information, the robot instead optimally positioned itself to investigate the much smaller wall surface to either side of any detected doorway, using the previously discussed boosted classifier to look for a rectangular shape. If a suitable shape were detected, the robot would next verify there were characters within this shape, then interpret the sign using optical character recognition as illustrated below. Under this heuristic task-decomposition approach, conditions that significantly enhanced the performance of visual detection were optimally provided by the robot, thereby enabling more intelligent autonomous behavior.
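The staged pipeline just described (position, detect rectangle, verify characters, then OCR) is essentially a chain of gates, each stage run only after the previous one succeeds. The sketch below captures that structure; every callable passed in is a placeholder standing in for the real component (SLAM doorway analysis, boosted classifier, character check, OCR engine).

```python
# Schematic sketch of the heuristic task decomposition for reading room
# signs. All names are placeholders; the real detectors are injected as
# callables so the gating logic itself is what this illustrates.

def read_room_sign(doorway, position_robot, detect_rectangle,
                   contains_characters, run_ocr):
    """Cue the vision system only after the robot has positioned itself
    to view the small wall area beside a detected doorway."""
    position_robot(doorway)             # optimal viewing pose first
    rect = detect_rectangle()           # boosted rectangle classifier
    if rect is None:
        return None                     # no sign-like shape found
    if not contains_characters(rect):   # cheap check before OCR
        return None
    return run_ocr(rect)                # interpret the sign text
```

Because each stage narrows the search space for the next, the expensive steps (classification, OCR) only ever run against small, well-framed image regions, which is precisely why the robot-supplied context improves detection performance.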
A natural-language interface allowed ROBART III to receive fairly unstructured verbal direction, no different from the procedures used to instruct a human to perform the same task. For example, suppose the robot had penetrated an underground bunker and was streaming back video that showed an open doorway in the center of the far wall. A human monitoring this video might converse with the robot as follows: “Enter the doorway in front of you.” The robot would then look for predefined scene attributes that suggested a door frame or opening, highlighting its choice with a graphic overlay, whereupon the human could confirm or redirect as needed.
To eliminate voice-recognition problems during the initial stages of development, ROBART III was assigned a working e-mail account, thus enabling human-robot interaction via simple text strings, with the added ability to enclose return attachments. Figure a above shows an outgoing e-mail from Estrellina Pacis to ROBART III requesting an image dump from the head-mounted PTZ video camera. The automatic response seen in Figure b returned a JPG frame grab of the current camera view, in this particular case depicting the robot’s approach to its linear docking station in an adjacent building (Figure a below). The desired video source and camera pose could be specified by e-mail or automatically determined by the behavior under execution, as in this particular case involving a recharging event (Figure b below).
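The core of such a text-string interface is just parsing a command line out of the message body and dispatching it. The sketch below shows that shape only; the command vocabulary, handler names, and dispatch-table design are all assumptions, not the robot's actual message format.

```python
# Hedged sketch of a text-command e-mail interface: pull the first
# non-empty line from the message body, split it into a verb and
# arguments, and dispatch to a handler. Commands and replies are
# invented placeholders for illustration.

def parse_command(body):
    """Extract (verb, args) from the first non-empty line of a body."""
    for line in body.splitlines():
        words = line.strip().lower().split()
        if words:
            return words[0], words[1:]
    return None, []


HANDLERS = {
    "image": lambda args: "frame-grab.jpg attached",  # placeholder reply
    "status": lambda args: "all systems nominal",
}


def dispatch(body):
    """Route an incoming message body to the matching handler."""
    verb, args = parse_command(body)
    handler = HANDLERS.get(verb)
    return handler(args) if handler else "unknown command"
```

Using e-mail as the transport sidesteps speech recognition entirely while keeping the interaction asynchronous, and the reply path naturally supports attachments such as the JPG frame grabs mentioned above.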
In 2005, ROBART III was given the ability to deploy and recover its own sacrificial drone scout. The envisioned application for this upgrade was autonomous structure exploration, so we decided to use a slave UAV versus a UGV to facilitate mobility in damaged or cluttered environments. Initial attempts involved a toy R/C helicopter (Figure a below), which proved too unstable, so we switched to an R/C blimp (Figure b), which although stable, was too susceptible to air currents to be practical. The video-based remote-control software for these aircraft was developed in a matter of days by electrical engineer Greg Kogut using a single-chip RS-232-to-R/C converter.
As one of the more sophisticated mobile-robot research platforms of its time, ROBART III was featured numerous times on the Learning, History, and Discovery Channels, and in January 2006 was ranked number 16 in Wired Magazine’s survey of the 50 best robots ever, beating out R2D2. The continuously evolving prototype made many contributions to the fields of both supervised and fully autonomous robotics, to include perception, localization, navigation, and response.