Face Detection of Thermal Images in Various Standing Body-Pose using Facial Geometry

Automatic detection of frontal faces in thermal images is a primary task in health systems (e.g., febrile identification) and security systems (e.g., intruder recognition). In daily conditions, however, the scanned person does not always face the camera. This paper develops an algorithm to identify a frontal face in various standing body poses. The algorithm uses an image processing method that first segments the face based on the temperature of human skin. Because some exposed non-face body parts can also be included in the segmentation result, discriminant shape features of a face are then applied. These features are based on the characteristics of a frontal face: (1) the size of the face, (2) the facial Golden Ratio, and (3) the oval shape of the face. The algorithm was tested on standing body poses rotating through 360° at camera-to-object distances of 2 meters and 4 meters. The face detection accuracy in a managed environment is 95.8%. The algorithm detected the face whether or not the person was wearing glasses.


INTRODUCTION
Face detection in thermal images is used in various applications, such as health systems. The emergence of virus outbreaks such as SARS, MERS, and most recently COVID-19 has increased the use of thermal imaging to measure human body temperature. The thermal camera is convenient because it can measure temperature from a distance. It also enables mass measurement that can be deployed in public areas. Thermal imaging has also been used to detect humans in very low light conditions [1][2]. In security systems, the thermal camera is used for intruder recognition in complete darkness. This advantage also enables the thermal camera to be used in Driver Assistance Systems, e.g., for pedestrian detection [3][4][5][6]. In the field of Driver Assistance Systems, pedestrians are the most vulnerable traffic participants because they often suffer injury in traffic accidents [7].
However, detecting a pedestrian's face is challenging, because some background objects may have temperatures similar to human body temperature. Many researchers have proposed methods to overcome this. Research by [8][9][10] detected pedestrian faces using the Template Matching algorithm. The method defines templates of human faces and then runs a similarity calculation for each face candidate. A candidate whose similarity value meets a defined threshold is assigned as a face.
Other research by [11] used a high-temperature threshold to detect faces. A face is defined as an uncovered body part that has a higher temperature than the covered body parts, so it appears brighter in thermal images. In a thermal image, the presence of a human or another warm living object can be recognized easily even in the absence of illumination.
To increase face detection accuracy, the human body could be detected beforehand. Defining a body frame narrows the search area. Within the body frame, the face is simply the body part located at the top. However, this assumption needs further investigation, since a person is not always upright (e.g., lying down), in which case the face is not the highest part of the body. Human body temperature is itself high and distinct from the environment, so in thermal images a human usually appears brighter than the surroundings [12]. Unfortunately, other objects might have similar temperatures as well, e.g., a light bulb, a computer, or a running car engine. These conditions make human body or face detection challenging.
In face detection, a thresholding method using human body temperature was also suggested by [13]. That research sets 151 as the threshold on a grayscale intensity range of 0 to 255. However, uncovered body parts such as feet and hands can have temperatures similar to faces. Some thermal images even show the chest, back, and thighs at this temperature. Thus, discriminant features of a face must be used to distinguish faces from other body parts. Research by [14] defined the face as an elliptical object. Although that method was applied to visual RGB images, it can still be used as a shape feature of faces. Therefore, this paper presents a study on detecting frontal faces using facial geometry as discriminant shape features. Compared to the research in [11][12][13], the developed algorithm adds features after the Thresholding process to define frontal faces. It applies the oval shape of a face, as suggested by [14], to thermal images. It also uses the Golden Ratio of faces as a second shape feature. Geometric shape features, i.e., size, ratio, and shape, are simple to implement. They are also computationally cheap compared to the learning algorithms used by [8][9][10]. Furthermore, the algorithm was tested on various standing body poses. Tests in real, unmanaged conditions and environments were also carried out.

METHODS
Facial skin has a temperature similar to that of other body parts, so segmenting the face using temperature alone would be inaccurate. Fortunately, in normal situations people wear clothes that cover their bodies, which leaves only the head, neck, hands, and lower legs exposed. This study developed an algorithm that adds shape features of faces as discriminant features to the segmented thermal images. Thermal images were acquired using a thermal camera with the scanned temperature range fixed at 25°C to 38°C. This range was chosen because the ambient temperature was 25°C, whilst the surface temperature of human skin varies from 31°C to 35°C [15]. The grayscale mode was chosen for image coloring, rather than other modes such as Ironbow and Arctic, because the image processing algorithm operates on grayscale values of 0 to 255. Figure 1 shows an example of an acquired image. The distance between the camera and the person was fixed at either 2 meters or 4 meters. During data acquisition, the person was captured both with and without glasses, since glasses block infrared radiation. This tests whether the algorithm can still detect a frontal face when the person wears face accessories.

Figure 1 Acquired thermal image
The algorithm consists of three stages. First, it segments the area that falls within the temperature range of human skin. Second, it applies morphological filters that help to eliminate the false positives and false negatives of binarization. Third, it applies facial geometry as shape features to detect a face among the candidates. The steps of the developed algorithm are shown in Figure 2.

Segmentation using human skin's temperature-range as a threshold
Thresholding is the most commonly used segmentation method [16]. It classifies each pixel into a foreground or background class using a threshold value [17]. The segmentation yields a binary image with values of 0 and 1. The threshold can be selected manually or statistically based on the image's histogram. In this study, the lower bound of human skin temperature, 31°C [15], was chosen as a manual threshold value. This temperature corresponds to 118 in grayscale intensity. Pixels at or above this threshold were given a value of 1, and all others 0, as shown in equation (1). Figure 3(a) shows the Segmented Image obtained from the Acquired Thermal Image in Figure 1. The two arguments of the image functions in equation (1) are the pixel coordinates.
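Equation (1) is not reproduced in this text. As a minimal sketch of the rule described above, the following Python snippet assumes that the camera maps its 25°C to 38°C range linearly onto gray levels 0 to 255, which reproduces the stated threshold of 118 for 31°C. The function names `temp_to_gray` and `segment` are hypothetical and are not taken from the original implementation.

```python
import numpy as np

T_MIN, T_MAX = 25.0, 38.0  # fixed camera temperature range in °C (from the text)
SKIN_T = 31.0              # lower bound of skin surface temperature in °C (from the text)


def temp_to_gray(t_celsius, t_min=T_MIN, t_max=T_MAX):
    """Map a temperature to an 8-bit gray level, ASSUMING a linear scale."""
    level = (t_celsius - t_min) / (t_max - t_min) * 255.0
    return int(round(level))


def segment(gray, threshold):
    """Return a binary image: 1 where intensity >= threshold, 0 elsewhere."""
    return (np.asarray(gray) >= threshold).astype(np.uint8)


# (31 - 25) / 13 * 255 = 117.69..., which rounds to 118
threshold = temp_to_gray(SKIN_T)
```

Under this assumption, `segment(image, threshold)` would yield the Segmented Image from an 8-bit grayscale thermal frame `image`.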

Morphological Filtering
Since hands, feet, or other exposed non-face body parts might fall within the temperature threshold (intensity ≥118), a filtering method was applied to the Segmented Image. There are two types of false detection: false positives and false negatives. The false-positive error can be seen in Figure 3(b), where the hand, thigh, and feet are detected as a face. The false-negative error usually appears when the face is covered, e.g., by glasses, as seen in Figure 3(a).
A morphological filter is suitable for eliminating these false detections. It is widely used for binary images [18]. It applies a Structure Element to each sub-image of the same pixel size using the binary operators AND or OR. The Structure Element is filled with values of 1. The basic morphological filters are Erosion and Dilation. Erosion applies the AND operation between the sub-image and the Structure Element, whereas Dilation applies the OR operation. In this study, the Structure Element was kept small, since a face at an object-camera distance of 4 meters is small. The Structure Element is a 5×5 square.
The first filter applied was the Opening. This operation, which consists of Erosion followed by Dilation, aims to eliminate the false-positive error. The result of the Opening Filter for the Segmented Image in Figure 3(a) is shown in Figure 3(b). The next filter applied was the Closing. This operation, which consists of Dilation followed by Erosion, aims to eliminate the false-negative error. The result of the Closing Filter applied to Figure 3(b) is shown in Figure 3(c). As can be seen in the final result, the morphological filters can fill holes or gaps. They can also eliminate small areas of false-positive error.
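The Opening and Closing filters described above could be sketched as follows, assuming a NumPy-only implementation with a 5×5 all-ones Structure Element. The helper names and the border handling (padding with 1 for Erosion and with 0 for Dilation) are assumptions made for illustration, not details given in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(binary, size, pad_value):
    """Return the size x size neighbourhood of every pixel of a padded image."""
    r = size // 2
    padded = np.pad(np.asarray(binary).astype(bool), r, constant_values=pad_value)
    return sliding_window_view(padded, (size, size))


def erode(binary, size=5):
    # AND over the neighbourhood: a pixel stays 1 only if every pixel
    # under the Structure Element is 1.
    return _windows(binary, size, True).all(axis=(2, 3))


def dilate(binary, size=5):
    # OR over the neighbourhood: a pixel becomes 1 if any pixel
    # under the Structure Element is 1.
    return _windows(binary, size, False).any(axis=(2, 3))


def opening(binary, size=5):
    """Erosion followed by Dilation: removes regions smaller than the element."""
    return dilate(erode(binary, size), size)


def closing(binary, size=5):
    """Dilation followed by Erosion: fills holes smaller than the element."""
    return erode(dilate(binary, size), size)
```

Following the order in the text, a segmented image `seg` would be cleaned with `closing(opening(seg))`.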

3 Applying Facial Geometry as shape features of a face
Three pieces of prior knowledge about facial geometry were used in the algorithm to detect the face among the candidates: (1) the size of the head, (2) the face ratio, and (3) the oval shape of the face. Together, these three can distinguish a frontal face from non-face body parts. Each was translated into the criteria below:

3.1 Size of head
The head size of a normal adult is homogeneous. Studies comparing head length and width across the globe show that heads have a similar size [19][20]. At an object-camera distance of 4 meters, heads appear to have an area of 450 pixels (out of a total of 57,600 pixels). This area was then used as the first selection criterion for frontal face candidates. A connected component whose area was smaller than this size, such as an arm, was eliminated.

3.2 Face ratio
A normal face follows the Golden Ratio, which is approximately 1.6 [21]. It is the ratio between the face's height and width. Nevertheless, several studies showed that some normal faces are shorter or longer than this ratio [22][23]. This study therefore used a ratio range around 1.6, namely 1.4 to 2. This range was selected manually. The upper limit of 2 was selected because the uncovered neck appears to be connected with the face. The arm and leg usually have a ratio above this range, and the hand has a ratio below it. Thus, non-face candidates were eliminated based on the Golden Ratio. Figure 4 shows an example of a face's width (W) and height (H) ratio. The segmented face is shown in a red rectangle, and the ratio is written in the image.
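The size and ratio criteria of Sections 3.1 and 3.2 can be sketched as a filter over connected components. The snippet below is a minimal illustration, assuming 4-connectivity and a height and width taken from each component's bounding box; neither choice is stated in the text, and the names `connected_components` and `face_candidates` are hypothetical.

```python
from collections import deque

import numpy as np

MIN_AREA = 450            # head area in pixels at 4 m (from the text)
RATIO_RANGE = (1.4, 2.0)  # accepted height/width range (from the text)


def connected_components(binary):
    """Yield area, bounding-box height/width, and bbox of each 4-connected region."""
    binary = np.asarray(binary, dtype=bool)
    seen = np.zeros_like(binary)
    rows, cols = binary.shape
    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r0, c0)])
            seen[r0, c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr, nc] and not seen[nr, nc]):
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            rs, cs = zip(*pixels)
            top, bottom, left, right = min(rs), max(rs), min(cs), max(cs)
            yield {
                "area": len(pixels),
                "height": bottom - top + 1,
                "width": right - left + 1,
                "bbox": (top, left, bottom, right),
            }


def face_candidates(binary, min_area=MIN_AREA, ratio_range=RATIO_RANGE):
    """Keep components that pass both the size and the height/width ratio test."""
    lo, hi = ratio_range
    kept = []
    for comp in connected_components(binary):
        ratio = comp["height"] / comp["width"]
        if comp["area"] >= min_area and lo <= ratio <= hi:
            kept.append(comp)
    return kept
```

In this sketch a long, thin region such as an arm fails the ratio test, while a small region such as a hand fails the size test, mirroring the eliminations described above.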

3.3 Face shape
Research by [14] used the oval shape of the head to detect faces in RGB images. Studies in head anthropometry show variations in face shape among races, and research by [20] found that some adults have a shorter face length. Because of this large variation, matching the oval shape itself would require a large set of templates. This study therefore presents a simple oval detection that uses the area of the oval rather than the shape itself. The area was defined to follow the well-known formula for an oval, shown in equation (2). Candidates whose area was smaller than this size were eliminated. The result of applying the three facial geometry criteria is shown in Figure 5, where only a frontal face is detected in the image.
The algorithm was tested in a managed and an unmanaged environment. In the managed environment, only one person appeared in the frame, and there were no other heat sources in the frame except the person. The image acquisition was set as follows:
1) The body-camera distance is 2 meters and 4 meters. The adult human body does not fully appear at a distance of 2 meters from the camera but fully appears at 4 meters. The distance of 3 meters was not tested since the body does not fully appear there either.
2) The body pose rotates through 360°. The body rotates from the frontal pose, gradually spinning clockwise to the rear, then continuing to spin back toward the front.
3) The face is captured with and without glasses. In practice, people may wear accessories such as glasses that block infrared radiation from the face.
The result of the automatic detection of frontal faces in various standing body poses is shown in Figure 6. The detected frontal faces are shown as black rectangles. The first row in Figure 6 is for 4 meters distance without glasses, the second row is for 4 meters distance with glasses, the third row is for 2 meters distance rotating pose 360° without glasses, and the fourth row is for 4 meters distance rotating pose 360° with glasses. In total, 24 images were tested. The algorithm failed to detect the frontal face in only 1 of them (first row, fourth image), which yields a detection accuracy of 95.8%.
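As a quick check of the reported figure, and assuming that a correct non-detection (no frontal face present, none reported) counts as a success, the accuracy follows from the image counts:

```latex
\text{accuracy} = \frac{24 - 1}{24} = \frac{23}{24} \approx 0.958 = 95.8\%
```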
The results show that the algorithm detects a face only when it appears fully and in frontal view. It also successfully detected faces wearing glasses. The algorithm was designed to detect only a full face. Thus, at a distance of 2 meters, where faces did not fully appear, the algorithm correctly reported no frontal face. This behavior is required for security applications, where intruder recognition needs a full frontal face.
Using the range of body temperature to segment the face proved to be accurate and simple to implement. Other body parts such as the chest, back, stomach, and thighs are generally covered by clothing, so they appear to have a lower temperature than the range. When the person faces away from the camera, hair covers the head, so it also appears to have a lower temperature.
The disadvantage of thresholding that relies only on the body temperature range, and not on the presence of facial components such as the eyes, mouth, and nose, is that the algorithm would fail when the person is bald. It would detect the back of the head, whose temperature falls within the range, as a frontal face. Choosing the size of the head as a shape feature also limits the implementation to a maximum object-to-camera distance of 4 meters. At a distance of more than 4 meters, the head appears to have an area smaller than 450 pixels.
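The distance limit can be illustrated with a rough estimate. Assuming an ideal pinhole camera, in which apparent width and height each scale inversely with distance d (an assumption not stated in the text), the projected head area A scales with the inverse square of d:

```latex
A(d) \approx A(4\,\text{m}) \left(\frac{4}{d}\right)^{2} = \frac{450 \times 16}{d^{2}}\ \text{pixels},
\qquad
A(5\,\text{m}) \approx 288\ \text{pixels},
\qquad
A(6\,\text{m}) \approx 200\ \text{pixels}
```

Under this assumption, a head at 5 meters already falls well below the 450-pixel size criterion.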
The algorithm was also tested in real outdoor and indoor conditions, where the image acquisition took place in an unmanaged environment. The challenge in outdoor conditions is that surrounding objects can fall within the same range as human body temperature. The result is shown in Figure 7. As can be seen in Figure 7 (a, d, e, g, h, j, l), the algorithm was able to detect one or more faces against various backgrounds, indoor or outdoor. In Figure 7 (b, f), the algorithm was only able to detect the lower half of a face due to the presence of eye-glasses. At a farther distance, as in Figure 8, the morphological Closing filter can fill the hole caused by eye-glasses, and a small Structure Element was sufficient to perform the filling. At a closer distance, such as in Figure 7 (b, f), the hole is bigger, so the small Structure Element could not fill it. The result of segmentation followed by filtering for Figure 7 (b, f) is shown in Figure 8. Although a bigger Structure Element could fill the bigger hole, it was not used in the algorithm since it would also fill gaps between other body parts, such as the neck and chest. This remains a limitation of the algorithm: the person to be scanned is advised to take off their eye-glasses, especially at a short object-to-camera distance.

Figure 8 Morphological filter result of Persons with Eye-glasses
A covered body part can still appear at the same temperature as the face. An example is shown in Figure 7 (k), where the back was falsely segmented as a face candidate. The result of segmentation followed by the morphological filters is shown in Figure 9. The chest could experience this situation as well. Nevertheless, the facial geometry successfully eliminated the back and chest from the face candidates, since their ratio and shape did not match those of a face.

Figure 9 Morphological filter result of a person's back that is segmented as a face candidate

CONCLUSIONS
This paper presents an algorithm to detect frontal faces in thermal images in various standing body poses. The image processing steps are Thresholding and Morphological Filtering. The paper proposes a simple detection algorithm that does not require a learning process. The discriminative features of a face are based solely on facial characteristics: size, ratio, and shape. The facial geometry of normal adults is largely consistent and thus can be used as a global feature to detect frontal faces. Future research will cover a wider variety of genders, ages, and races, whose facial geometries differ slightly.