Analysis and classification of proton NMR spectra of lipoprotein fractions from healthy volunteers and patients with cancer or CHD.

Human blood plasma samples were collected from 52 subjects, and the very low density lipoprotein (VLDL), intermediate density lipoprotein (IDL), low density lipoprotein (LDL) and high density lipoprotein (HDL) fractions were isolated by serial ultracentrifugation. 600 MHz 1H NMR spectra of the lipoprotein fractions were acquired. The methyl and methylene regions of the VLDL, LDL and HDL spectra were analysed further with Kohonen neural networks (KNN) and generative topographic mapping (GTM), two related unsupervised, self-organising feature mapping techniques. Both KNN and GTM allowed systematic variations in the lipoprotein profiles to be visualised. The relationship between the sample positions in the Kohonen plot was visualised by surface plots of the corresponding VLDL and HDL cholesterol and VLDL triglyceride contents. The GTM maps of the VLDL and HDL fractions were used to investigate the individual properties of selected samples. Many of the cancer patients clustered together in the VLDL GTM map, and map positions of samples related to CHD, diabetes and renal failure could also be identified. Although the study group considered here is heterogeneous with respect to age, sex, type of disease and medication within each defined class, classification of the VLDL and HDL data with a probabilistic neural network (PNN) was quite successful for the four groupings: cancer, CHD, volunteers and other (comprising patients with other diseases). Statistics based on 15 independent sets of PNN calculations gave true positive fractions usually higher than 0.83 and false positive fractions lower than 0.088. Attempts to classify the corresponding LDL data into the same four classes gave uniformly poor results, although some two-class separations (e.g., volunteer versus CHD) were easily performed.
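As an illustration of the classification step and of how true and false positive fractions are obtained, the following is a minimal sketch of a Gaussian-kernel (Parzen window) PNN in Python with NumPy. It is not the implementation used in the study: the function names, the smoothing parameter sigma and the synthetic two-class data are assumptions made purely for illustration, and the real inputs would be the methyl and methylene regions of the NMR spectra.

```python
import numpy as np


def pnn_predict(train_x, train_y, test_x, sigma=0.5):
    """Classify rows of test_x with a Gaussian-kernel PNN (sigma is an assumed value)."""
    classes = np.unique(train_y)
    # Squared Euclidean distance between every test sample and every training sample.
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    # The mean kernel activation per class estimates that class's density at each test point.
    scores = np.stack(
        [kernel[:, train_y == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]


def tpf_fpf(y_true, y_pred, positive):
    """True and false positive fractions for one class against all others."""
    is_pos = y_true == positive
    called_pos = y_pred == positive
    tpf = (is_pos & called_pos).sum() / is_pos.sum()
    fpf = (~is_pos & called_pos).sum() / (~is_pos).sum()
    return tpf, fpf


# Synthetic stand-in data (hypothetical): two classes, four features each.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 1.0, (20, 4)), rng.normal(3.0, 1.0, (20, 4))])
y = np.array(["volunteer"] * 20 + ["patient"] * 20)

# Hold out every fourth sample as an independent test set.
test_mask = np.arange(len(y)) % 4 == 0
pred = pnn_predict(x[~test_mask], y[~test_mask], x[test_mask], sigma=1.0)
print(tpf_fpf(y[test_mask], pred, positive="patient"))
```

Repeating the hold-out split with different random partitions and averaging the resulting fractions would mirror, in spirit, the statistics over independent sets of PNN calculations reported above.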