Last January, I was on the jury for Joerg Rett’s PhD thesis defense at the University of Coimbra in Portugal. Joerg’s presentation was very professional, interesting and ceremonious. It took place in a truly magnificent room: the "Sala Grande dos Actos", also known as the "Sala dos Capelos". I’m sure that the ceremony was an unforgettable experience for both Joerg and the jury. Here I reproduce my evaluation of the thesis document, which I found very interesting.
Photo, from left to right: Joerg Rett, Professor Pierre Bessière and myself (Juan-Manuel Ahuactzin).
REPORT ON THE THESIS OF MR. JÖRG RETT: ROBOT-HUMAN INTERFACE USING LABAN MOVEMENT ANALYSIS INSIDE A BAYESIAN FRAMEWORK. BY JUAN-MANUEL AHUACTZIN. SEPTEMBER 2008.
Bayesian gesture recognition is a key issue in information technology. Communication between humans and machines will increase once our gesture languages are recognized in real time and independently of the actor. The Bayesian paradigm and Laban movement analysis are two fields that have proven to be very promising for the development of gesture recognition systems. Combining the two fields is a response to the current lack of ability to analyze, represent and recognize human movements. This is demonstrated by Mr. Joerg Rett’s thesis, which goes from the analysis of human movements to a real implementation: a gesture recognition system that controls a mobile robot through body language.
Mr. Rett’s thesis starts with an introduction to the different fields involved in his work: notation for describing human movements, human-robot interaction, social robots, computational human movement analysis and an examination of the use of Laban Movement Analysis for social robots. The introduction ends with a discussion of the contributions of the thesis.
The main body of the thesis is composed of four chapters: i) human movement analysis, ii) human movement tracking and description, iii) Bayesian models and iv) recognition system.
In Chapter 2, Mr. Rett introduces Laban movement analysis (LMA): a method for observing, describing, notating and interpreting human movements. LMA includes two kinds of components: kinematic (Body and Space) and non-kinematic (Effort and Shape). In this work, Space and Effort are selected for describing human body gestures. Space concerns the location of the body parts, and Effort refers to the dynamics of the movements and their energy. A tagged database was created with different gesture examples capturing the Space and Effort properties. The database is an important contribution to the scientific community. Indeed, to the best of my knowledge, it is the first gesture database to include LMA.
Mr. Rett chose to describe the Space and Effort components using low-level features: measurable entities obtained from sensing the body parts in either 2D or 3D. Chapter 3 contains a technical review of the sensing and tracking methods and of the geometric problems involved in the task. Four low-level features are introduced: direction, velocity, acceleration and curvature. The pertinence of the low-level features for describing Effort and Space is evaluated. The frequency “signatures” of the features for a particular kind of movement are analyzed for a single actor as well as for multiple actors. The problem of segmentation is also discussed. An original three-phase segmentation model has been adopted in order to improve the robustness of the gesture recognition task. This chapter is a pertinent demonstration of how difficult it is to represent a particular gesture without taking uncertainty into account. A purely symbolic language cannot describe what is happening in the real world. The execution of a movement is never exact; low-level features can vary considerably between executions of the same gesture, and the human observer or actor may or may not notice this variation.
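To make the four low-level features concrete, here is a minimal sketch of how they could be estimated from a tracked 2D trajectory by finite differences. It is only an illustration: the function name `low_level_features`, the fixed sampling interval `dt` and the use of NumPy are my assumptions, and the thesis may well compute these quantities differently.

```python
import numpy as np

def low_level_features(points, dt):
    """Estimate direction, speed, acceleration and curvature.

    points: (N, 2) array of tracked 2D positions (hypothetical input format).
    dt: sampling interval in seconds, assumed constant.
    """
    p = np.asarray(points, dtype=float)
    v = np.gradient(p, dt, axis=0)             # velocity vectors
    a = np.gradient(v, dt, axis=0)             # acceleration vectors
    speed = np.linalg.norm(v, axis=1)          # magnitude of the velocity
    direction = np.arctan2(v[:, 1], v[:, 0])   # heading angle in radians
    accel = np.linalg.norm(a, axis=1)          # magnitude of the acceleration
    # Curvature of a planar curve: |v x a| / |v|^3 (guarded against zero speed).
    cross = v[:, 0] * a[:, 1] - v[:, 1] * a[:, 0]
    curvature = np.abs(cross) / np.maximum(speed, 1e-9) ** 3
    return direction, speed, accel, curvature

# Example: a hand moving along a circle of radius 2 at unit angular speed.
t = np.linspace(0.0, 2.0 * np.pi, 2001)
circle = np.column_stack([2.0 * np.cos(t), 2.0 * np.sin(t)])
direction, speed, accel, curvature = low_level_features(circle, t[1] - t[0])
```

Away from the end points of the trajectory, the estimated curvature of this circle is close to 1/2, the inverse of its radius. With real sensor data the derivatives are noisy, which is exactly why a probabilistic treatment of these features is needed.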
In Chapter 4, Mr. Rett presents a global model for gesture recognition based on three levels of abstraction: concept space, Laban space and physical space. These levels tie together the concepts presented in the previous chapters: (i) the movement and frame number, (ii) the components of LMA and (iii) the low-level features. The global model follows a sensor-fusion-like approach in which each LMA component is analogous to a virtual, independent sensor. The Space, Effort and Shape sub-models are developed in detail. An additional temporal model is introduced in order to deal with the speed and the length of a gesture description. The identification of the parametric forms of the sub-models is also presented. This concerns learning the distributions of the low-level features given the gesture (movement) and of the effort qualities given the gesture. The developed models are pertinent and easy to understand. In addition to the gesture model, Mr. Rett contributes a proposal for measuring the quality of anticipation. The interpretation is made possible by three entropy-based measures: speed, confidence and stability. From my point of view, the three entropy-based measures are an important contribution that is validated by experimentation.
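The sensor-fusion idea can be illustrated with a small sketch in which each LMA component contributes a likelihood and the components are treated as conditionally independent given the gesture. The numbers, the function names `fuse` and `entropy`, and the three-gesture example are hypothetical. The report does not give the exact definitions of the speed, confidence and stability measures, so the sketch only shows the posterior entropy on which such measures could be built.

```python
import numpy as np

def fuse(prior, likelihoods):
    """Posterior over gestures from per-component likelihoods.

    prior: P(gesture), strictly positive.
    likelihoods: list of arrays P(observation_k | gesture), one per
    virtual sensor, assumed conditionally independent given the gesture.
    """
    log_post = np.log(np.asarray(prior, dtype=float))
    for lik in likelihoods:
        log_post = log_post + np.log(np.asarray(lik, dtype=float))
    log_post -= log_post.max()        # numerical stability before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def entropy(p):
    """Shannon entropy in bits; low entropy means a confident posterior."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

# Hypothetical example with three candidate gestures.
prior = np.array([1 / 3, 1 / 3, 1 / 3])
space_lik = np.array([0.6, 0.3, 0.1])    # P(Space observation | gesture)
effort_lik = np.array([0.5, 0.4, 0.1])   # P(Effort observation | gesture)
posterior = fuse(prior, [space_lik, effort_lik])
```

In this example both virtual sensors favor the first gesture, so the fused posterior is more peaked than the prior and its entropy is lower. Tracking how quickly the entropy drops over the frames of a gesture is one plausible way to quantify anticipation.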
Chapter 5 presents the results obtained with the gesture recognition model. This includes the analysis, the experimentation and a final application in which a real robot follows human gesture instructions in a natural environment. The chapter is clear, convincing and as complete as one could expect of a strict validation. The robustness of the approach is tested in different situations, showing the fundamental importance of both LMA and the Bayesian approach.
The last chapter of the thesis presents a summary of the document as well as the future work.
From my point of view, Mr. Rett’s thesis is a complete work: it includes a deep reflection on the problem, an exceptional evaluation of the use of LMA for solving this task, an original integrated model, and pertinent experimentation. Key subjects of Artificial Intelligence are addressed throughout the thesis, for example: symbolic languages, automatic analysis, probabilistic modeling, learning, classification, and robotics.
Gesture recognition will be adopted, in a very short time, by numerous applications and devices. A diversity of gesture languages will be developed to serve as the main means of communication between humans and computers. Mr. Rett’s thesis makes important contributions in this direction.
In summary, I think that Mr. Rett’s thesis makes important contributions to the subject of gesture recognition using Laban movement analysis. Let me unequivocally state my strong recommendation for proceeding to his final defense.
Juan Manuel Ahuactzin, Research Director, Probayes SAS
