Automatic Detection of Cow’s Oestrus in Audio Surveillance System
Abstract
Early detection of anomalies is an important issue in the management of group-housed livestock. In particular, failure to detect oestrus in a timely and accurate way can become a limiting factor in achieving efficient reproductive performance. Although a rich variety of methods has been introduced for the detection of oestrus, a more accurate and practical method is still required. In this paper, we propose an efficient data mining solution for the detection of oestrus, using the sound data of Korean native cows (Bos taurus coreanea). In this method, we extract the mel frequency cepstrum coefficients from the sound data, apply feature dimension reduction, and use the support vector data description as an early anomaly detector. Our experimental results show that this method can detect oestrus both economically (even with a cheap microphone) and accurately (over 94% accuracy), either as a standalone solution or as a complement to known methods.
INTRODUCTION
Early detection of anomalies is an important issue in the management of group-housed livestock. In particular, the damage caused by the recent outbreak of livestock diseases in Korea such as foot-and-mouth disease was serious. In order to minimize the damage incurred from such diseases, it is necessary to develop the technology for collecting and analyzing livestock data. Although some progress in monitoring livestock has been made recently in Korea (Hwang and Yoe, 2010; Hwang et al., 2010), an automated analysis of anomalies using a sound sensor has not yet been reported. This is also true in other countries, although some research studies on applying information technology (IT) to livestock management have been reported in the last decades (Frost et al., 1997; Cox, 2003; Berckmans, 2004; Wathes et al., 2008; Davies, 2009; Gutiérrez et al., 2009; Hancock et al., 2009; Ruiz-Garcia et al., 2009).
In this paper, we focus on detecting oestrus in cows. Early detection of oestrus is an important issue in the herd management of cows (Lehrer et al., 1992; Jiménez et al., 2011; Firk et al., 2002; Saint-Dizier and Chastant-Maillard, 2012). When oestrus is undetected or detected late, the profitability of the farm can be significantly affected. Recently, two interesting papers using cows’ sound data were published. Ikeda and Ishii (2008) claimed that the vocalization of a cow contains information not only about the animal’s extraordinary condition, such as pain, oestrus, separation from her calf, and hunger or thirst, but also about the animal’s individuality. Consequently, vocalization can be used as a signal for the detection of these conditions by objective, non-contact, and remote sensing techniques. Applying linear discriminant analysis to the resonant frequencies of the cows’ vocalized signal, the authors recognized two states: the hungry state before feeding, and the weaning state when the cow is separated from her calf. Jahns (2008) tried to classify the meaning of the cow’s specific calls into seven emotional states, including oestrus, using hidden Markov models, which statistically model acoustic patterns and are commonly used in human speech recognition.
In this paper, we propose an efficient data mining solution for the detection of oestrus using the sound data of Korean native cows (Bos taurus coreanea). First, the Mel Frequency Cepstrum Coefficient (MFCC) (Mahdi and Azizollah, 2009), the most widely used feature in sound analysis, was extracted, and feature dimension reduction was performed in the data preprocessing phase. The Support Vector Data Description (SVDD) (Cristianini and Shawe-Taylor, 2000), which is often described as an anomaly or novelty detector, was then used to detect the oestrus sound. To the best of our knowledge, this is the first report of the detection of oestrus using sound data and SVDD learning for early anomaly detection. Moreover, the application of a data mining method is appropriate considering the continuous and large incoming data stream that is a characteristic of audio surveillance systems for a cattle barn. This new method was verified using the real sound data of Korean native cows.
MATERIALS AND METHODS
Sample sound collection
In our experiments, multiparous Korean native cows (Bos taurus coreanea) aged 24 to 70 months were housed in a commercial loose barn with open sides, located in Jinju, South Korea. Oestrus occurs around the time of ovulation. Since the precise time of ovulation cannot be measured in practice, in this study we based our evaluation on visually observable signs known to be related to the time of ovulation, such as mucous vaginal discharge, cajoling, restlessness, sniffing the vagina of another cow, resting the chin on another cow, and, most especially, standing heat, which is considered the most accurate sign of oestrus (Lyimo et al., 2000). Cows that made no effort to escape for more than 3 s when mounted by others were regarded as being in oestrus. Oestrus was later confirmed when the selected cow was diagnosed as pregnant by an ultrasound diagnostic test 30 d after artificial insemination. The sounds emitted by fifteen oestrus animals were recorded using a digital camcorder (HDR-XR160, Sony, Japan). The recorded oestrus sounds were extracted using a PC with a standard Realtek AC97 soundcard at a 16-bit resolution and a 44.1 kHz sampling rate using Cool Edit (Adobe, San Jose, CA, USA). They were then used as reference data for the detection of oestrus cow calls. The normal sounds of thirty-two head of cattle that were not in oestrus were also recorded and extracted using the same methods as those for oestrus sounds.
Proposed audio-based cow oestrus detection system
The proposed real-time Korean native cow oestrus detection system is composed of four modules (Figure 1): two online modules, feature extraction and Korean native cow oestrus detection, and two offline modules, attribute subset selection and SVDD training. In the feature extraction module, the MFCC algorithm is used to extract features from the audio signal acquired via sound sensors or a CCTV camera. The attribute subset selection module selects the optimal feature attribute subset, which improves the detection speed of the entire detection system. The SVDD training module performs training offline on the selected MFCC attributes. In the Korean native cow oestrus detection module, the detector produced by the SVDD training module classifies the incoming audio signal. Each module is described in detail below.
Mel frequency cepstrum coefficients
In this paper, the MFCC algorithm is used in the feature extraction module. The MFCC is the most widely used feature in sound analysis, owing to its simple calculation, good discriminative ability, robustness to noise, and other advantages (Peipei et al., 2011).
A block diagram of the structure of an MFCC processor is shown in Figure 2. In the first step, the continuous speech signal is blocked into frames of N samples, with adjacent frames being separated by M samples (M<N). Typical values for N and M are N = 256 and M = 100. The next step in the processing is to window each individual frame to minimize the signal discontinuities at the beginning and end of each frame. Typically, the Hamming window is used. In the next processing step, the Fast Fourier Transform is used to convert each frame of N samples from the time domain into the frequency domain. The frequency scale is then converted from the linear scale to the mel scale, and the logarithm of the result is taken. In the final step, the log mel spectrum is converted back to the time domain. The result is called the MFCC (Figure 2).
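The processing chain described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the paper: the function names are hypothetical, the mel conversion uses the common 2595·log10(1 + f/700) formula, the final "back to the time domain" step is assumed to be a type-II discrete cosine transform (the usual choice, which the text does not name), and the first 12 DCT coefficients are kept.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the paper does not state one).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for k in range(n_filters):
        lo, mid, hi = hz_edges[k], hz_edges[k + 1], hz_edges[k + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc(signal, sample_rate, frame_len=256, hop=100, n_filters=24, n_ceps=12):
    """Frame blocking -> Hamming window -> FFT -> mel filters -> log -> DCT."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Row i of idx holds the sample indices of frame i (frames overlap).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)  # small offset avoids log(0)
    # Type-II DCT basis (unnormalised), one row per cepstral coefficient.
    j = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct_basis = np.cos(np.pi * k * (j + 0.5) / n_filters)
    return log_mel @ dct_basis.T
```

Calling `mfcc(x, 44100)` on a one-dimensional NumPy array `x` returns one row of 12 coefficients per frame. The defaults N = 256 and M = 100 are the typical values quoted above.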
Attribute subset selection
The efficient selection of attribute subsets for pattern recognition is an important issue and is addressed in many studies. Attribute selection is the problem of selecting a subset of attributes from a feature set in order to provide a compact, precise, and fast recognizer, with minimal performance degradation, by removing the attributes that are useless, redundant, or least-used (Hall, 1998). Reducing the dimensionality of the data reduces the size of the hypothesis space and allows algorithms to operate faster and more effectively.
In this paper, we used Correlation-based Feature Selection (CFS), which has been verified as the best among the attribute subset selection methods (Hall, 1998). CFS uses the features’ predictive performances and inter-correlations to guide its search for a good feature subset. It can drastically reduce the dimensionality of data sets while maintaining or improving the performance of learning algorithms. At the heart of the CFS algorithm is a heuristic for evaluating the worth or merit of a subset of features. This heuristic takes into account the usefulness of individual features for predicting the class label, along with the level of inter-correlation among them. CFS first calculates a matrix of feature-class and feature-feature correlations from the training data, and then searches the feature subset space using a best first search algorithm. The version of CFS used in this paper includes a heuristic to include locally predictive features and avoid the reintroduction of redundancies (Hall, 1998).
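The subset-evaluation heuristic at the heart of CFS can be illustrated as follows. For a subset of k features with mean feature-class correlation r_cf and mean feature-feature correlation r_ff, Hall's merit is k·r_cf / sqrt(k + k(k−1)·r_ff). The sketch below makes two simplifying assumptions that differ from the Weka implementation used in this paper: correlations are measured with the absolute Pearson coefficient, and the search is a plain greedy forward selection rather than best-first search. The function names are hypothetical.

```python
import numpy as np

def cfs_merit(X, y, subset):
    """Hall's merit of a feature subset, using |Pearson r| as the correlation."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return r_cf
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)

def cfs_forward_select(X, y):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        score, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if score <= best + 1e-9:   # stop when no candidate improves the merit
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best
```

A feature that merely duplicates an already selected one leaves the merit unchanged, and an uninformative feature lowers it, so neither is added. This is the behaviour that lets CFS shrink a feature set without discarding predictive attributes.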
Support vector data description
Recently, the support vector learning method has reached maturity as a viable tool in the area of intelligent systems (Cristianini and Shawe-Taylor, 2000). One of the important application areas for support vector learning is one-class classification. In one-class classification problems, generally only the training data for the normal class are given, and after the training phase is finished, we are required to decide whether each test vector belongs to the normal or abnormal class. One-class classification problems are often called outlier detection problems or novelty detection problems. One of the most well-known support vector learning methods for one-class problems is the SVDD.
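The SVDD method, which approximates the existence area of objects belonging to a normal class, is derived as follows. Consider a ball B with center $a \in \mathbb{R}^d$ and radius R, and the training data set D consisting of objects $x_i \in \mathbb{R}^d$, i = 1, …, N. It should be noted that, since the training data are usually prone to noise, some part of the training data D could consist of abnormal objects. The main idea of the SVDD is to find a ball that achieves two conflicting goals simultaneously: the ball should be as small as possible, and, more importantly, it should contain as many training data as possible. Balls that meet these objectives reasonably well may be obtained by solving the optimization problem:

$$\min_{R,\,a,\,\xi} \; R^2 + C\sum_{i=1}^{N}\xi_i \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, \dots, N. \tag{1}$$

Here, the slack variable $\xi_i$ represents the penalty associated with the deviation of the i-th training pattern outside the ball. The objective function of the above optimization problem consists of two conflicting terms: the square of the radius, $R^2$, and the total penalty $\sum_{i}\xi_i$. The constant C controls the trade-off between these two terms. Introducing the Lagrange multipliers $\alpha_i \ge 0$ and $\eta_i \ge 0$, the Lagrange function is

$$L = R^2 + C\sum_{i}\xi_i + \sum_{i}\alpha_i\left(\|x_i - a\|^2 - R^2 - \xi_i\right) - \sum_{i}\eta_i\xi_i.$$

From the saddle point condition, the optimal solution of (1) should satisfy

$$\sum_{i}\alpha_i = 1, \qquad a = \sum_{i}\alpha_i x_i, \qquad \alpha_i \in [0, C], \;\; i = 1, \dots, N. \tag{2}$$

With the substitution of the above into L, the Lagrange function can be expressed in terms of the dual variables:

$$L = \sum_{i}\alpha_i\langle x_i, x_i\rangle - \sum_{i}\sum_{j}\alpha_i\alpha_j\langle x_i, x_j\rangle.$$

Thus, the dual problem can be written as

$$\max_{\alpha} \; \sum_{i}\alpha_i\langle x_i, x_i\rangle - \sum_{i}\sum_{j}\alpha_i\alpha_j\langle x_i, x_j\rangle \quad \text{s.t.} \quad \sum_{i}\alpha_i = 1, \;\; \alpha_i \in [0, C]. \tag{3}$$

It should be noted that Eq. (3) is equivalent to the quadratic programming (QP) problem:

$$\min_{\alpha} \; \sum_{i}\sum_{j}\alpha_i\alpha_j\langle x_i, x_j\rangle - \sum_{i}\alpha_i\langle x_i, x_i\rangle \quad \text{s.t.} \quad \sum_{i}\alpha_i = 1, \;\; \alpha_i \in [0, C]. \tag{4}$$

In addition, from the Kuhn-Tucker complementarity condition, it should hold true that

$$\alpha_i\left(\|x_i - a\|^2 - R^2 - \xi_i\right) = 0, \quad i = 1, \dots, N. \tag{5}$$

From the above, we can easily show that ultimately only the data points on the boundary or outside the ball can have positive $\alpha_i$ values. These data points are called the support vectors. Once the $\alpha_i$ are obtained by solving the problem (4), the optimal center is given by Eq. (2). In addition, the optimal value of $R^2$ is acquired by applying condition (5) to the support vectors that lie on the boundary of the ball. After the training phase has been completed, we decide whether a given test point $x \in \mathbb{R}^d$ belongs to the normal class utilizing the criterion:

$$\|x - a\|^2 = \langle x, x\rangle - 2\sum_{i}\alpha_i\langle x_i, x\rangle + \sum_{i}\sum_{j}\alpha_i\alpha_j\langle x_i, x_j\rangle \le R^2. \tag{6}$$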
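As a rough illustration of the training and decision steps, the sketch below solves the SVDD dual for plain inner products and then labels a test point as normal when its squared distance to the center does not exceed R². Several choices here are assumptions made for the example and are not taken from the paper: the dual is solved approximately with a simple Frank-Wolfe iteration rather than a dedicated QP solver, no kernel is used, R² is estimated as the weighted mean squared distance of the support vectors whose weight is below C, and the function names are hypothetical.

```python
import numpy as np

def svdd_fit(X, C=1.0, n_iter=2000, tol=1e-6):
    """Approximately solve the SVDD dual QP with a Frank-Wolfe iteration."""
    n = X.shape[0]
    if n * C < 1.0:
        raise ValueError("n * C must be at least 1 so that sum(alpha) = 1 is feasible")
    K = X @ X.T                       # inner products <x_i, x_j>
    k_diag = np.diag(K).copy()
    alpha = np.full(n, 1.0 / n)       # feasible start: uniform weights
    for _ in range(n_iter):
        grad = 2.0 * (K @ alpha) - k_diag
        # Best feasible vertex for the linearised objective: give mass C to
        # the smallest-gradient coordinates until the total mass reaches 1.
        s = np.zeros(n)
        budget = 1.0
        for i in np.argsort(grad):
            s[i] = min(C, budget)
            budget -= s[i]
            if budget <= 0.0:
                break
        d = s - alpha
        gap = -(grad @ d)             # Frank-Wolfe duality gap
        if gap <= tol:
            break
        curvature = 2.0 * (d @ K @ d)
        step = 1.0 if curvature <= 0.0 else min(1.0, gap / curvature)
        alpha = alpha + step * d      # exact line search for the quadratic
    center = alpha @ X                # a = sum_i alpha_i x_i
    dist2 = np.sum((X - center) ** 2, axis=1)
    # Points with 0 < alpha_i < C sit on the boundary; average their squared
    # distances, weighted by alpha_i, to estimate R^2.
    weights = np.where(alpha < C - 1e-9, alpha, 0.0)
    if weights.sum() <= 0.0:
        weights = alpha
    r2 = float(np.sum(weights * dist2) / np.sum(weights))
    return {"alpha": alpha, "center": center, "r2": r2}

def svdd_is_normal(X_new, model):
    """True where the squared distance to the center is at most R^2."""
    dist2 = np.sum((np.atleast_2d(X_new) - model["center"]) ** 2, axis=1)
    return dist2 <= model["r2"]
```

With C = 1 the box constraint is inactive and the problem reduces to finding the smallest ball enclosing all training points. Smaller values of C allow some training points to fall outside the ball. Replacing the inner products with a kernel function would give the kernelised variant, which this sketch does not cover.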
RESULTS AND DISCUSSION
For the proposed system, we performed an identification test by comparing the oestrus sound and the normal sound. In our experiment, 100 oestrus sound data and 180 normal sound data were used. Figure 3 shows two spectrum samples: oestrus sound and normal sound. The vocalization during oestrus was similar to a roaring sound and showed a harsh energy distribution. For the MFCC features, the frame length applied was 2 s. We used 24 triangular filters and 12 cepstral coefficients, excluding the 13th coefficient. The MFCC vectors were appended with delta and double delta coefficients in order to yield 360-dimensional features. For the selection of the optimal attribute subset, the CFS of Weka 3.6 (http://www.cs.waikato.ac.nz/ml) was used. The trade-off constant C of the SVDD was set to C = 0.05, and the value of the parameter σ in the Gaussian kernel function was chosen as 4.7.
We used three measures (Han et al., 2012) to evaluate the performance of the proposed system: the oestrus detection rate (ODR), false positive rate (FPR), and false negative rate (FNR). They are given as follows:

$$\mathrm{ODR} = \frac{T}{I} \times 100, \qquad \mathrm{FPR} = \frac{P}{N} \times 100, \qquad \mathrm{FNR} = \frac{F}{I} \times 100$$

In the above equations, I is the number of oestrus sound data, while N is the number of normal sound data. T is the number of oestrus sound data that are classified as such by the system. P is the number of normal sound data that are misclassified as oestrus sound data. F is the number of oestrus sound data that are misclassified as normal sound data.
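The three measures can be computed directly from the true and predicted labels. The short sketch below assumes that each rate is the corresponding count ratio expressed as a percentage (T/I for ODR, P/N for FPR, F/I for FNR), which matches the definitions of I, N, T, P, and F given above. The function name is hypothetical.

```python
def detection_rates(is_oestrus, predicted_oestrus):
    """Return (ODR, FPR, FNR) in percent from true and predicted labels."""
    pairs = list(zip(is_oestrus, predicted_oestrus))
    n_oestrus = sum(1 for truth, _ in pairs if truth)           # I
    n_normal = len(pairs) - n_oestrus                           # N
    hits = sum(1 for truth, pred in pairs if truth and pred)    # T
    false_pos = sum(1 for truth, pred in pairs if not truth and pred)   # P
    false_neg = sum(1 for truth, pred in pairs if truth and not pred)   # F
    odr = 100.0 * hits / n_oestrus
    fpr = 100.0 * false_pos / n_normal
    fnr = 100.0 * false_neg / n_oestrus
    return odr, fpr, fnr
```

Under these definitions ODR and FNR always sum to 100%, since every oestrus sample is either detected or missed.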
A summary of the detection results for the entire group of cows studied is shown in Table 1. Our experimental results show that, using all 360 features, the average detection accuracy of the proposed system reached 96.86%, with the FPR and FNR averaging 7.46% and 3.14%, respectively, when 80% of the data, randomly chosen, were used for training. In our experiment, we tested the system five times using all the data (Table 1). For CFS, the dimension of the selected optimal attribute subset was reduced to 62. The detection results when only these 62 features were used are also shown in Table 1, which shows that the overall accuracy with all 360 features is slightly greater than that with the CFS-selected features. Even when only 62 attributes were used, the detection accuracy, FPR, and FNR were confirmed to be satisfactory.
Moreover, in order to evaluate the effect of reducing the feature vectors on resource consumption, we measured the memory requirement and execution time of each approach. The measurement was conducted using a PC (2.9 GHz Intel Core i5, 4 GB memory) and Matlab R2010a software. Since the offline SVM training phase is performed only once, we measured the execution time of the online SVM recognition phase using the clock function supported by Matlab. Table 2 shows a comparison of the memory requirement and execution time of each approach. From Table 2, it can be seen that the CFS approach reduced both the memory requirement (by a factor of 5.7) and the execution time (by a factor of 5.1) in comparison with the typical approach in which all the features were used. It should be noted that the resource requirement is determined by the combination of the sampling rate and the number of cows monitored. For any such combination, the CFS approach requires fewer resources than the typical approach while providing an acceptable level of accuracy.
According to the literature, a rich variety of methods has already been introduced for the detection of oestrus in cattle. For example, automatic detection of standing heat is one possible solution for automatic oestrus detection (Xu et al., 1998; Nebel et al., 2000; At-Taras and Spahr, 2001; Saumande, 2002; Alawneh et al., 2006). However, this method is difficult to maintain in group-housed animals, and furthermore its detection accuracy rate is relatively low. Another observable change that occurs in cows in oestrus is an increase in their behavioral activity. Although monitoring the behavioral activity of a cow using a pedometer (Koelsch et al., 1994; Maatje et al., 1997; Roelofs et al., 2005; Brehme et al., 2008) or an activity meter attached to her neck or leg (Hockey et al., 2010; Jonsson et al., 2011) is a possible solution, the accuracy of this method may vary according to the devices and the algorithm used (Nebel et al., 2000; At-Taras and Spahr, 2001; Firk et al., 2002; Roelofs et al., 2005; Hockey et al., 2010). Cows in oestrus display the following behaviors more intensively: restlessness, mounting, allowing mounting without standing, sniffing the vulva of another cow, resting the chin on the back of another cow, licking, rubbing, and aggressiveness (Saint-Dizier and Chastant-Maillard, 2012). Thus, video recording and automated image analysis can be an alternative solution for the detection of this phenomenon (Firk et al., 2002; Saint-Dizier and Chastant-Maillard, 2012). However, this alternative currently requires expensive infrared cameras, and the accuracy of its results is relatively low. Monitoring cows’ body temperature is another possible solution (Firk et al., 2002; Fisher et al., 2008), but this may result in a number of false positives.
Finally, an automated milk-related measurement can be used to detect oestrus in dairy cows (Mitchell et al., 1996; Van Asseldonk et al., 1998; De Mol et al., 1999; Firk et al., 2003; Friggens et al., 2008). However, a fully automatic system is expensive and its use is limited to dairy cows. All these known methods have some limiting factors, and we therefore still require another accurate and economical solution to this detection problem.
In this paper, we used the MFCC, which has proven to be a good feature in human speech recognition because it models the human perception of sound, and which is therefore also a widely used feature in sound analysis. However, animal sound perception may differ from human sound perception, and other feature vector representation methods may be more suitable for the detection of the cow’s oestrus sound. This has yet to be investigated.
The Support Vector Machine (SVM), which has matured into a viable tool in the area of intelligent systems, may also be used in the detection of the oestrus sound. However, the SVDD is more suitable, as it is a natural anomaly or novelty detector. To the best of our knowledge, this is the first report of the detection of oestrus using sound data and SVM (including SVDD) learning for early anomaly detection. Moreover, the application of a data mining method is appropriate considering the continuous and large incoming data stream that is a characteristic of audio surveillance systems for a cattle barn.
CONCLUSIONS
Automatic detection of oestrus in cows is an important issue as it can save a significant part of the breeder’s work hours. From the experiments, we found that automatic detection of oestrus using sound data can be an efficient and economical solution. A combination of MFCC with feature dimension reduction and SVDD can automatically detect oestrus at an accuracy level of over 94%. As oestrus can be detected accurately from sound data acquired even with a cheap microphone, our method can be used either as a standalone solution or as a complement to other known methods to obtain a more accurate solution. Moreover, this study suggests that the analysis of cow calls is a viable way to understand the animal’s present condition. For future work, we will consider the multi-modality of the video and audio data. Further testing and refinement of our proposed system, as needed, in commercial production settings are also warranted. That is, a complete real-time system capable of incorporating the automatic recognition of the cow’s oestrus call is a part of our ongoing research.
Acknowledgements
This research was supported by the Advanced Production Technology Development Program, Ministry for Food, Agriculture, Forestry and Fisheries, Korea.