Improving Data Reliability for Process Monitoring with Fuzzy Outlier Detection

Abstract

To implement on-line process monitoring techniques that use principal component analysis (PCA) or partial least squares (PLS) models, the models must be constructed from reliable data that represent normal process operation. In this paper, a novel, flexible fuzzy treatment method is developed for the detection of outliers in process data. The method combines a fuzzy C-means clustering algorithm, which separates the data into clusters, with a fuzzy inference engine, which assigns a degree of outlier to each data point. The current version of the fuzzy inference engine determines the degree of outlier from two inputs: distance, expressed in units of standard deviation, and fuzzy membership values. The method can therefore be considered a hybrid that incorporates statistical parameters into a fuzzy strategy. Decisions on how to handle the data can then be made based on the degree of outlier, which can be conveniently translated into a relative weight assigned to each data point entering downstream data processing. The fuzzy treatment method was applied to benchmark penicillin production process data containing artificial outlier data points. The proposed method was able to detect the outliers in the process data, although data points in the transient region were prone to a high degree of outlier. Additionally, the fuzzy inference engine can be modified to use different criteria, such as data density or spread, to evaluate outlierness. The results are presented along with a discussion of the advantages of this method as a flexible treatment of process data. The methodology will be applied in future investigations of PCA-based process monitoring.
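The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the rule base or membership functions of the fuzzy inference engine, so a simple linear ramp stands in for it here. The function names, the ramp thresholds `lo` and `hi`, the membership-weighted distance score, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy C-means. Returns (centers, U), U has shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # memberships sum to 1 per point
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                 # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def degree_of_outlier(X, centers, U, m=2.0, lo=2.0, hi=4.0):
    """Hypothetical scoring, NOT the paper's inference engine.

    Distance to each cluster is expressed in units of that cluster's
    spread, averaged using the fuzzy memberships, and mapped to [0, 1]
    by a linear ramp between `lo` and `hi` (assumed thresholds).
    """
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    W = U ** m
    # Fuzzy-weighted RMS distance per cluster, used as a spread estimate.
    sigma = np.sqrt((W * d ** 2).sum(axis=0) / W.sum(axis=0))
    z = (U * d / sigma).sum(axis=1)              # membership-weighted distance
    return np.clip((z - lo) / (hi - lo), 0.0, 1.0)

# Synthetic stand-in data (assumption): two operating regions plus one
# artificial outlier appended as the last row.
rng = np.random.default_rng(1)
normal_a = rng.normal([0.0, 0.0], 0.3, size=(50, 2))
normal_b = rng.normal([6.0, 0.0], 0.3, size=(50, 2))
X = np.vstack([normal_a, normal_b, [[3.0, 8.0]]])

centers, U = fuzzy_c_means(X, c=2)
degree = degree_of_outlier(X, centers, U)
weights = 1.0 - degree   # one possible translation into a relative weight
```

In this sketch a point with a degree of outlier near 1 receives a weight near 0, so it contributes little to any downstream model fit, while points with a degree of 0 keep full weight. Other mappings from degree to weight are equally plausible.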

Publication
Computer Aided Chemical Engineering