Knowledge mining sensory evaluation data: genetic programming, statistical techniques, and swarm optimization

Knowledge mining of sensory evaluation data is challenging because the data are extremely sparse and responses vary widely across panel members (called assessors). The main goals of knowledge mining in sensory science are to understand how the perceived liking score depends on the concentration levels of a flavor's ingredients, identify the ingredients that drive liking, segment the panel into groups with similar liking preferences, and optimize flavors to maximize liking within each group. Our approach employs (1) genetic programming (symbolic regression) and ensemble methods to generate multiple diverse explanations of assessor liking preferences, together with confidence information; (2) statistical techniques that use the resulting ensembles to extrapolate into unobserved regions of the flavor space and to segment the assessors into groups that either share the same propensity to like flavors or are driven by the same ingredients; and (3) two-objective swarm optimization to identify flavors that are both well and consistently liked by a selected segment of assessors.
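A minimal sketch of the ensemble idea that underlies steps (1) and (3): an ensemble of diverse regression models is evaluated at a candidate flavor, and the disagreement among the models serves as a confidence signal. The toy surrogate models below are hypothetical hand-written stand-ins for evolved GP expressions, used only for illustration.

```python
# Toy stand-in for a GP ensemble: each "model" maps a vector of
# ingredient concentrations to a predicted liking score. These are
# hypothetical surrogates, not real evolved symbolic expressions.
def make_toy_ensemble():
    return [
        lambda x: 2.0 * x[0] + 0.5 * x[1],
        lambda x: 1.8 * x[0] + 0.7 * x[1] + 0.1,
        lambda x: 2.2 * x[0] + 0.4 * x[1] - 0.1,
    ]

def ensemble_score(models, flavor):
    """Return (mean prediction, spread) over the ensemble.

    The mean is the predicted liking; the spread (standard deviation)
    acts as a confidence signal: a large spread means the ensemble
    members disagree, so the prediction should be trusted less, e.g.
    in regions of the flavor space far from the observed data.
    """
    preds = [m(flavor) for m in models]
    n = len(preds)
    mean = sum(preds) / n
    var = sum((p - mean) ** 2 for p in preds) / n
    return mean, var ** 0.5

models = make_toy_ensemble()
mean, spread = ensemble_score(models, (1.0, 1.0))
```

In the two-objective optimization of step (3), these two quantities can serve as the objectives: maximize the mean predicted liking while minimizing the spread, so that the search favors flavors that the ensemble agrees are well liked.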
