Poster Session W1

Wednesday W1 Poster Session 9:45 – 11:15 am

Image Processing, Video Processing, Hyperspectral and Multispectral Image Processing

 

W1-1 Face Recognition in Vehicles with Near Infrared Frame Differencing

Jinwoo Kang, David Anderson, Monson Hayes – Georgia Institute of Technology, USA

This paper describes a system of practical technologies for implementing an illumination-robust, consumer-grade biometric system based on face recognition for the automotive market. Most current face recognition systems lose accuracy under ambient illumination changes. Outdoor applications, including driver identification, pose the most challenging environment for face recognition. The aim of this research is to investigate practical face recognition for identity management that minimizes algorithmic complexity while remaining robust to ambient illumination changes. First, we present a frame differencing method with active near-infrared illumination control that produces images independent of the ambient illumination. Second, an end-to-end face recognition system is presented, including foreground/background segmentation, face detection, pose clustering and face recognition modules. It is shown that the frame differencing method makes the modules more robust to ambient illumination variation. Vehicular application videos were taken in extremely challenging outdoor illumination and shadowing conditions and used to test each module. Finally, extensive test results for the vehicular scenario are provided to evaluate the end-to-end system.
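
The frame differencing idea can be illustrated with a minimal sketch. It assumes a simple additive model (observed frame = ambient light + active near-infrared return), no sensor saturation, and a static scene between the two exposures. The function name `nir_frame_difference` and the list-of-rows image format are illustrative choices, not taken from the paper.

```python
def nir_frame_difference(lit_frame, unlit_frame):
    """Subtract the frame taken with the NIR illuminator off from the frame
    taken with it on. Under an additive model the ambient term cancels.
    Frames are lists of rows of pixel intensities; results are clipped at 0."""
    return [
        [max(on - off, 0) for on, off in zip(row_on, row_off)]
        for row_on, row_off in zip(lit_frame, unlit_frame)
    ]


# Hypothetical 2x2 example: the same NIR return under a given ambient level.
nir_return = [[10, 20], [30, 40]]
ambient = [[100, 50], [0, 80]]
lit = [[n + a for n, a in zip(rn, ra)] for rn, ra in zip(nir_return, ambient)]
difference = nir_frame_difference(lit, ambient)
```

Under these assumptions the difference image equals the NIR return regardless of the ambient level, which is the property the abstract relies on.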

 

W1-2 Signal Processing Techniques for Enhancing Multispectral Images of Ancient Documents

Trace Griffiths – Hewlett Packard, USA; Gene Ware, Todd Moon – Utah State University, USA

Digital multispectral imaging (MSI) has been widely adopted to aid in the study of ancient artefacts, including paintings and documents. MSI captures views of the subject at multiple narrowband wavelengths ranging from the ultraviolet through the infrared. Stacking the imagery data in three dimensions creates a large image data cube that can be processed using statistical signal and image processing techniques. This paper is a brief review of how signal processing can help address three general problem areas that may arise in MSI data sets of ancient documents: image fusion, ink identification, and bleed-through removal.

 

W1-3 Eigen-gap of Structure Transition Matrix: A New Criterion for Image Quality Assessment

Mohsen Joneidi, Mostafa Rahmani – University of Central Florida, USA; Hossein Bakhshi Golestani – Sharif University of Technology, Iran; Mohammad Ghanbari – University of Essex, United Kingdom

A new approach to Image Quality Assessment (IQA) is presented. The idea is based on the observation that two images are similar if the structural relationship within their blocks is preserved. To this end, a transition matrix is defined that captures structural transitions between corresponding blocks of the two images. The matrix contains valuable information about the differences between the two images, which must then be transformed into a quality index. Eigenvalue analysis of the transition matrix leads to a new distance measure called the Eigen-gap. According to simulation results, the Eigen-gap is highly correlated with subjective scores and performs as well as SSIM, a trusted index.

 

W1-4 Image Loss Concealment Using Edge-Guided Interpolation and Multi-scale Transformation

Bahareh Langari, John Stonham, Alireza Mousavi – Brunel University, United Kingdom

A novel global edge interpolation, based on new edge-guided interpolation methods for image gap restoration, is presented. The gap restoration is achieved by incorporating edge-based directional interpolation within a multi-scale DCT/DWT pyramid transform. Two categories of image edges are proposed and utilised in the image gap reconstruction process. First, the local edges, or textures inferred from estimates of the gradients of the neighbouring pixels in various directions, are measured. Second, the global edges, or boundaries between image objects or segments, are estimated using the Canny edge detector. Evaluations over a range of images, with regular and random loss patterns and loss rates of up to 40%, show that the proposed method improves image quality and increases PSNR by 1 to 5 dB compared with a range of the best published works.

 

W1-5 A Practical Strategy for Spectral Library Partitioning and Least-Squares Identification

Shawn Higbee – Lawrence Livermore National Laboratory, USA

This paper proposes a method of partitioning large data libraries into smaller sub-partitions, in such a way that a least-squares-based identification process will be numerically better behaved. An example from a well-known remote sensing spectral library is used to illustrate various seed strategies for the partitioning as well as various assignment strategies. In the example shown, the seed strategy is relatively unimportant for a library of this size, but there is a substantial improvement in least-squares performance with SVD-based partitioning for both point and interval estimates. Several context-dependent variants of this strategy are also proposed.

 

W1-6 Temperature Emissivity Separation: Estimation with a Parameter Affecting Both the Mean and Variance of the Observation

Todd Moon, David Neal, Jake Gunther – Utah State University, USA; Gustavious P Williams – Brigham Young University, USA

We consider a model for temperature-emissivity separation (TES) in hyperspectral image processing. The emissivity is modulated by both the black body function and the atmospheric downwelling. This interaction has made it difficult to extract both temperature and emissivity, since offsets in one can be compensated by the other. Working with only a single wavelength component, we propose a model in which the downwelling is considered as a random variable (or vector). The emissivity thus contributes to both the variance and the mean of the observations. This leads to a maximum likelihood estimator for the emissivity. We compute an expression for the bias of this estimator and show how it can be used to produce an unbiased estimator. An estimator for the temperature is also given. These two estimators can be used iteratively, providing separation of the temperature and emissivity components.

 

W1-7 Evaluating the performance of Max Current AC-DCT based colored Digital Image Fusion for Visual Sensor Network’s

Arun Begill, Shruti Puniani, Kamaljot Singh – DAV University, India; Navjot Kaur – DAVIET, India

This paper presents an efficient digital image fusion strategy designed for Visual Sensor Networks (VSNs) that operate in resource-restricted, hazardous environments such as battlefields. We use multiple partially unfocused colored images to produce a single multi-focus image using the Discrete Cosine Transform (DCT), based on the maximum-valued alternating current (AC) coefficients. This technique helps devices with limited computational power achieve higher image quality in computation-restricted environments. Our experiments show that the evaluated technique produces better-quality images than other available DCT-domain fusion methods.
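
A rough sketch of block-wise fusion driven by AC coefficients follows. The abstract does not give the exact selection rule, so this assumes one plausible reading: for each pair of co-located blocks, keep the block whose DCT has the largest-magnitude AC coefficient, taken here as a proxy for sharper focus. The helper names (`dct2`, `max_ac`, `fuse_blocks`) are hypothetical, and the code handles a single grayscale block rather than full color images.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block (list of rows)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def max_ac(block):
    """Largest absolute DCT coefficient, excluding the DC term at (0, 0)."""
    coeffs = dct2(block)
    n = len(coeffs)
    return max(abs(coeffs[u][v])
               for u in range(n) for v in range(n) if (u, v) != (0, 0))


def fuse_blocks(block_a, block_b):
    """Assumed rule: keep the block with the larger maximum AC magnitude."""
    return block_a if max_ac(block_a) >= max_ac(block_b) else block_b
```

A full fusion would tile each input image into blocks (8x8 is common in DCT-based codecs) and apply `fuse_blocks` to every co-located pair; that tiling and any consistency check between neighbouring blocks are left out here.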

 

W1-8 Body markers detection based on 3D video processing oriented to children gait analysis

Mario Chacon – Chihuahua Institute of Technology, Mexico; Carlos Avalos – Intel Guadalajara, Mexico; Omar Arias – Bosch, Mexico

This paper describes preliminary results of an auxiliary system designed to obtain a gait kinematics standard for children aged 6 to 12 years in a specific population. It is expected that the system may help children from vulnerable social groups who have disabilities due to accidents or illness. The system is based on the Microsoft Kinect 3D sensor. Corporal segments and markers are determined by extracting the body silhouette using a background subtraction technique and morphological operations on the depth plane. Results show that the proposed system is able to estimate the main corporal markers needed in gait analysis. The estimates showed good correlation with a manual ground truth. The maximum average deviation of the relative angle was 1.63°, indicating acceptable marker tracking.
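
The silhouette extraction step can be sketched as a per-pixel comparison against a stored background depth map. This is a minimal illustration, assuming depth values in millimetres and a fixed threshold; the function name and the threshold value are hypothetical, and the morphological clean-up mentioned in the abstract is omitted.

```python
def depth_foreground_mask(depth, background, threshold):
    """Mark a pixel as foreground (1) when its depth differs from the
    background depth map by more than `threshold`, otherwise 0."""
    return [
        [1 if abs(d - b) > threshold else 0 for d, b in zip(d_row, b_row)]
        for d_row, b_row in zip(depth, background)
    ]


# Hypothetical 3x3 depth maps in millimetres: an empty room at 3000 mm,
# then a subject at 1500 mm in the centre pixel plus slight sensor noise.
background = [[3000] * 3 for _ in range(3)]
depth = [[3000, 3000, 3000], [3000, 1500, 3010], [3000, 3000, 3000]]
mask = depth_foreground_mask(depth, background, threshold=100)
```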

 

Neural Networks, Machine Learning

 

W1-9 A neural bio-inspired scheme for head pose recognition

Mario Chacon, Huber Orozco, Juan Ramirez – Chihuahua Institute of Technology, Mexico

A bio-inspired model for head pose recognition is described in this paper. The bio-inspired model recognizes the head by using gray scale information as well as the silhouette of the person. A set of descriptors is generated from this analysis by a hierarchical model based on the visual cortex. Then the descriptors are classified by a multilayer perceptron artificial neural network to identify the position of the head. The model is able to recognize five distinct poses with a precision of 98% in images with a resolution of 640×480.

 

W1-10 Mapping Arabic Acoustic Parameters to Their Articulatory Features Using Neural Networks

Yousef A. Alotaibi – King Saud University, Saudi Arabia; Yasser M. Seddiq – King Abdulaziz City for Science and Technology (KACST) and King Saud University, Saudi Arabia

A mapping system based on an artificial neural network was designed, trained, and tested to map Arabic acoustic parameters to their corresponding articulatory features. The main objective of the study was to find the correlation between these two different types of features. To train and test the system, an in-house database was created using all 29 letters of the Arabic alphabet as carrier words for the intended Arabic phonemes. Fifty native Arabic speakers were asked to utter each letter 10 times. Hence, the database consisted of 10 repetitions of each letter produced by each speaker, resulting in 14,500 tokens. The system was tested on extracting Arabic articulatory features using a separate, disjoint speech data subset. The overall accuracy of the system was 64.06% for all articulatory feature elements and all Arabic phonemes.

 

W1-11 A Novel Method for Blind Segmentation of Thai Continuous Speech

Siripong Potisuk – The Citadel, USA

This paper describes an acoustical investigation of Thai speech segmentation using a combination of average level crossing rate (ALCR) and root-mean-square (RMS) energy. Simple and easy to compute, ALCR information alone has been used successfully in an automatic speech segmentation system for English. However, ALCR has never been applied to Thai. The objective of the study is therefore to apply ALCR information and ascertain its usefulness in detecting significant temporal changes in continuous Thai speech. An experiment was conducted on a small speech corpus containing 21 sentences. Preliminary results suggest that ALCR and RMS energy can be used to detect the phonetic boundary between an obstruent initial consonant and the preceding or following vowel. In addition, they can be used to detect the boundary between the final consonant of the preceding syllable and the initial consonant of the following syllable, except in cases involving two successive non-identical nasals. The overall accuracy is around 81% for data from four speakers.
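
The two frame-level features can be sketched as follows. The abstract does not give the exact ALCR definition, so this assumes a common form: count the crossings of each amplitude level within a frame, then average over the levels and normalize by the frame length. The function names and the choice of levels are illustrative only.

```python
import math


def rms_energy(frame):
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def level_crossings(frame, level):
    """Number of times consecutive samples fall on opposite sides of `level`."""
    return sum(1 for a, b in zip(frame, frame[1:])
               if (a - level) * (b - level) < 0)


def average_level_crossing_rate(frame, levels):
    """Crossings averaged over all levels, per sample (assumed normalization)."""
    total = sum(level_crossings(frame, lv) for lv in levels)
    return total / (len(levels) * len(frame))


# Hypothetical 4-sample frame alternating between +1 and -1.
frame = [1.0, -1.0, 1.0, -1.0]
alcr = average_level_crossing_rate(frame, levels=[-0.5, 0.0, 0.5])
energy = rms_energy(frame)
```

A segmentation pass would compute both features over short sliding frames and look for abrupt changes in either contour; how the two are combined into boundary decisions is not described in the abstract and is not attempted here.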

 

W1-12 Deep Emotion Recognition using Prosodic and Spectral Feature Extraction and Classification based on Cross Validation and Bootstrap

Ayush Sharma, David Anderson – Georgia Institute of Technology, USA

Despite the existence of a robust model to identify basic emotions, the ability to classify a large group of emotions reliably is yet to be developed. Hence, the objective of this paper is to develop an efficient technique to identify emotions with an accuracy comparable to that of humans. The array of emotions addressed in this paper goes far beyond those present on the circumplex diagram. Due to the correlation and ambiguity present in emotions, both prosodic and spectral features of speech are considered during feature extraction. Feature selection algorithms are applied to work on a subset of relevant features. Owing to the low dimensionality of the feature space, several cross validation methods are employed in combination with different classifiers and their performances are compared. In addition to cross validation, the bootstrap error estimate is also calculated, and a combination of both is used as an overall estimate of the classification accuracy of the model.

 

W1-13 Traffic Flow Forecasting Research Based on Bayesian Normalized Elman Neural Network

Wenchi Ma – Harbin Institute of Technology, P.R. China

Forecasting traffic flow accurately, scientifically and reasonably is a key technology of ITS. In this paper, a single, separate section is used as an example to forecast traffic flow over a long period. The advantage of an artificial neural network is its ability to learn, in other words to be trained. Through learning, the network can give an appropriate output when it receives an input. Thus, an artificial neural network is a good model for predicting transportation flow. This paper proposes the Bayesian normalized Elman neural network as the prediction model. It retains the reliability and stability of the Elman neural network and overcomes the influence of the hidden layer nodes on prediction accuracy, which improves the generalization ability of the network. Then, based on the long-term traffic forecasting results of different algorithms, error statistics and a comparative analysis lead to the conclusion that the Elman neural network combined with the Bayesian normalized method is more suitable for long-term traffic forecasting.