Autonomous underwater vehicles (AUVs) are used in a variety of aquatic applications, such as environmental monitoring, cable and wreckage inspection, and search-and-rescue missions. These missions are usually carried out in teams that may involve multiple AUVs and/or professional scuba divers. During a mission, scuba divers mainly use sign language to communicate among themselves. However, to communicate with AUVs or to understand what the AUVs are communicating to one another, the divers either need extra transmitter/receiver devices or must seek assistance from surface vessels. A common language comprehensible to both AUVs and humans would therefore significantly enhance underwater multi-human-robot collaborative missions.
As a PhD candidate and UMII MnDRIVE PhD Graduate Assistant, Sadman Sakib Enan (Computer Science & Engineering) worked on a project called “Robotic Detection of a Human-Comprehensible Gestural Language for Underwater Multi-Human-Robot Collaboration” that proposed such a language, in which AUVs communicate underwater using robot motions that represent human-understandable gestures. A nodding motion of the robot’s body means “Yes,” for instance. The work shows that AUVs can interpret the meaning of different gestural messages with high accuracy and reasonable inference times, and that humans can understand this gestural language when two AUVs communicate with each other.
The gestural messages were first designed in various simulated underwater environments using computer-aided design renderings of a six-legged AUV named Aqua. Later, they were implemented on board the actual Aqua robot using the Robot Operating System (ROS). A robot-to-robot communication language dataset (RRComm) was then created by implementing a total of 15 gestural messages in different underwater environments, both simulated and real.
To allow an AUV to visually understand a gesture from another AUV, a deep network (RRCommNet) was proposed that exploits the self-attention mechanism to learn to recognize each message by extracting maximally discriminative spatio-temporal features. RRCommNet was trained on the RRComm dataset. Experimental evaluations on both simulated and real data (closed-water robot trials) demonstrate that the proposed architecture can decipher the gesture-based messages with an average accuracy of 88-94% on simulated data and 73-83% on real data.
Finally, to check whether humans actually understand the gestural messages, a message transcription study was carried out. In the study, a total of 34 human participants were shown a conversation between two underwater robots using the proposed gestural language and were asked to transcribe it. The participants correctly transcribed the conversation with an average accuracy of 88% and an average confidence of 7.9 (out of 10).
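For readers unfamiliar with the self-attention mechanism mentioned above, the following is a minimal, illustrative sketch of the general idea: each video frame is represented by a feature vector, every frame attends to every other frame, and the pooled result is scored against the 15 message classes. This is not the RRCommNet architecture. The function names, the feature sizes (`T`, `D`, `H`), and the random weights are all assumptions made purely for illustration; a real model would learn its weights from a dataset such as RRComm.

```python
import math
import random

NUM_MESSAGES = 15  # the article reports 15 gestural messages


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(vec, mat):
    """Multiply a row vector (length D) by a D x H matrix, giving length H."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(frames, w_q, w_k, w_v):
    """Scaled dot-product self-attention over per-frame feature vectors."""
    queries = [matvec(f, w_q) for f in frames]
    keys = [matvec(f, w_k) for f in frames]
    values = [matvec(f, w_v) for f in frames]
    scale = math.sqrt(len(keys[0]))
    attended, weights = [], []
    for q in queries:
        # How strongly this frame attends to every frame in the clip.
        row = softmax([dot(q, k) / scale for k in keys])
        weights.append(row)
        attended.append([sum(w * v[j] for w, v in zip(row, values))
                         for j in range(len(values[0]))])
    return attended, weights


def classify_clip(frames, params):
    """Attend over frames, average over time, and score each message class."""
    attended, _ = self_attention(
        frames, params["w_q"], params["w_k"], params["w_v"])
    clip_feature = [sum(col) / len(attended) for col in zip(*attended)]
    return softmax(matvec(clip_feature, params["w_out"]))


def random_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 16 frames, 32-dim frame features, 8-dim attention space.
rng = random.Random(0)
T, D, H = 16, 32, 8
params = {
    "w_q": random_matrix(rng, D, H),
    "w_k": random_matrix(rng, D, H),
    "w_v": random_matrix(rng, D, H),
    "w_out": random_matrix(rng, H, NUM_MESSAGES),
}
frames = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(T)]
probs = classify_clip(frames, params)
predicted = max(range(NUM_MESSAGES), key=probs.__getitem__)
```

Here `probs` holds one probability per message class and `predicted` is the index of the most likely message. With untrained random weights the prediction is meaningless; the sketch only shows how attention lets information from all frames of a gesture contribute to a single decision.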
Dr. Enan was awarded his PhD in April 2024. He carried out this research under the supervision of Associate Professor Junaed Sattar (Computer Science & Engineering; MSI PI).
Publication:
- S. S. Enan, M. Fulton, and J. Sattar, “Robotic Detection of a Human-Comprehensible Gestural Language for Underwater Multi-Human-Robot Collaboration,” 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 2022, pp. 3085-3092.
Award:
- The above paper was nominated for the IROS Best Paper Award on Cognitive Robotics.
A video related to this research can be found on YouTube.
The UMII MnDRIVE Graduate Assistantship program supported U of M PhD candidates pursuing research at the intersection of informatics and any of the five MnDRIVE areas:
- Robotics
- Global Food
- Environment
- Brain Conditions
- Cancer Clinical Trials
This project was part of the Robotics MnDRIVE area.
The UMII program was converted to the Data Science Initiative-MnDRIVE Graduate Assistantship program in the fall of 2023. Research supported by the program is at the intersection of data science and the five MnDRIVE areas. The most recent assistantships were announced in January 2024. The application period for the next awards will be announced in the fall of 2024.
Image description: An underwater gestural communication framework in which the speaker robot responds to the listener robot with a nodding motion that means YES, which a human observer is also able to understand.