Baseline filtering in LIBS

A recent article [1] by Yaroshchyk and Eberhardt deals with some problems of baseline (continuum background) filtering in LIBS spectral data. As a starting point for the general topic of baseline removal, I recommend this article [2] together with the further discussion [3] [4] and newer findings [5] [6].


  1. P. Yaroshchyk, and J.E. Eberhardt, "Automatic correction of continuum background in LIBS using a model-free algorithm", Spectrochimica Acta Part B: Atomic Spectroscopy, 2014.
  2. Ł. Komsta, "Comparison of Several Methods of Chromatographic Baseline Removal with a New Approach Based on Quantile Regression", Chromatographia, vol. 73, pp. 721-731, 2011.
  3. Z. Zhang, and Y. Liang, "Comments on the Baseline Removal Method Based on Quantile Regression and Comparison of Several Methods", Chromatographia, vol. 75, pp. 313-314, 2012.
  4. Ł. Komsta, "Response to Letter to the Editor Regarding: Comparison of Several Methods of Chromatographic Baseline Removal with a New Approach Based on Quantile Regression", Chromatographia, vol. 75, pp. 315-316, 2012.
  5. Ł. Górski, F. Ciepiela, and M. Jakubowska, "Automatic baseline correction in voltammetry", Electrochimica Acta, vol. 136, pp. 195-203, 2014.
  6. K.H. Liland, E. Rukke, E.F. Olsen, and T. Isaksson, "Customized baseline correction", Chemometrics and Intelligent Laboratory Systems, vol. 109, pp. 51-56, 2011.
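For readers new to the topic, the general idea of baseline removal can be illustrated with a very small sketch. It is not the algorithm of any of the papers listed above; it is a generic moving-window approach (windowed minimum followed by a moving average), and the function names, the synthetic signal and the window half-width are assumptions chosen only for illustration. The window must be wider than the peaks, otherwise the peaks themselves are treated as baseline.

```python
import math


def rolling(values, half_width, reducer):
    """Apply `reducer` over a centered window that is truncated at the edges."""
    n = len(values)
    return [
        reducer(values[max(0, i - half_width): min(n, i + half_width + 1)])
        for i in range(n)
    ]


def mean(values):
    return sum(values) / len(values)


def estimate_baseline(signal, half_width):
    """Generic baseline estimate: windowed minimum, then moving average."""
    lower_envelope = rolling(signal, half_width, min)
    smoothed = rolling(lower_envelope, half_width, mean)
    # The baseline is never allowed to rise above the signal itself.
    return [min(b, s) for b, s in zip(smoothed, signal)]


def correct_baseline(signal, half_width):
    """Subtract the estimated baseline from the signal."""
    baseline = estimate_baseline(signal, half_width)
    return [s - b for s, b in zip(signal, baseline)]


# Synthetic example (assumed data): linear drift plus one Gaussian peak.
n_points = 200
signal = [
    0.01 * i + 5.0 * math.exp(-((i - 100) ** 2) / (2 * 3.0 ** 2))
    for i in range(n_points)
]
corrected = correct_baseline(signal, half_width=15)
peak_index = max(range(n_points), key=corrected.__getitem__)
print(peak_index, round(corrected[peak_index], 2))
```

On a sloping baseline this simple estimate lags slightly below the true drift, so a small positive offset remains after correction. The methods compared in [2] and [6] are designed to handle such cases much better.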

Random Forests with missing data

Random Forests are not a very popular technique in chemometrics, but there are reports of their use in QSRR [1], NIR multivariate calibration [2] and metabolomics [3]. The problem of missing data in this technique (together with variable selection) is the topic of a new article in CSDA [4] by Hapfelmeier and Ulm. Enjoy reading!


  1. T. Hancock, R. Put, D. Coomans, Y. Vander Heyden, and Y. Everingham, "A performance comparison of modern statistical techniques for molecular descriptor selection and retention prediction in chromatographic QSRR studies", Chemometrics and Intelligent Laboratory Systems, vol. 76, pp. 185-196, 2005.
  2. D. Donald, D. Coomans, Y. Everingham, D. Cozzolino, M. Gishen, and T. Hancock, "Adaptive wavelet modelling of a nested 3 factor experimental design in NIR chemometrics", Chemometrics and Intelligent Laboratory Systems, vol. 82, pp. 122-129, 2006.
  3. M. Eliasson, S. Rännar, and J. Trygg, "From Data Processing to Multivariate Validation - Essential Steps in Extracting Interpretable Information from Metabolomics Data", Current Pharmaceutical Biotechnology, vol. 12, pp. 996-1004, 2011.
  4. A. Hapfelmeier, and K. Ulm, "Variable selection by Random Forests using data with missing values", Computational Statistics & Data Analysis, vol. 80, pp. 129-139, 2014.

K-CM neural network

The group of Prof. Todeschini has presented a new method called K-CM [1]. It combines a neural network approach with fuzzy profiling of samples and k-NN. An interesting idea indeed.


  1. M. Buscema, V. Consonni, D. Ballabio, A. Mauri, G. Massini, M. Breda, and R. Todeschini, "K-CM: a new artificial neural network. Application to supervised pattern recognition", Chemometrics and Intelligent Laboratory Systems, 2014.

New approaches in MCR-ALS

People involved in MCR-ALS methodology should take a look at two interesting recent papers. The first describes the performance of the method with the quadrilinear constraint on noisy datasets [1]; the second proposes an algorithm for incomplete datasets [2]. Both papers are authored by inventors and main developers of the MCR-ALS methodology, so enjoy reading!


  1. A. Malik, and R. Tauler, "Performance and validation of MCR-ALS with quadrilinear constraint in the analysis of noisy datasets", Chemometrics and Intelligent Laboratory Systems, vol. 135, pp. 223-234, 2014.
  2. M. De Luca, G. Ragno, G. Ioele, and R. Tauler, "Multivariate curve resolution of incomplete fused multiset data from chromatographic and spectrophotometric analyses for drug photostability studies", Analytica Chimica Acta, 2014.

Independence in high dimensional data

In the new issue of Statistics & Probability Letters, Mao proposes a new test of independence for high-dimensional data [1]. This approach can be useful in some chemometric applications, even though it is only one more alternative to existing tests such as the one in this referenced article [2].


  1. G. Mao, "A new test of independence for high-dimensional data", Statistics & Probability Letters, vol. 93, pp. 14-18, 2014.
  2. J.R. Schott, "Testing for complete independence in high dimensions", Biometrika, vol. 92, pp. 951-956, 2005.
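To give a feeling for what such a test does, here is a rough sketch based on the sum of squared pairwise sample correlations, compared with its expectation under independence. It is not the statistic proposed in [1], and the details of [2] should be checked against the paper. The moments used below follow from the fact that, for independent Gaussian variables, a squared sample correlation has a Beta(1/2, (N-2)/2) distribution. The normal approximation, the function names and the simulated data are assumptions for illustration only.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlations between the columns of `data` (a list of rows)."""
    n_rows, n_cols = len(data), len(data[0])
    centered = []
    for j in range(n_cols):
        column = [row[j] for row in data]
        col_mean = sum(column) / n_rows
        centered.append([v - col_mean for v in column])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(n_cols)
        ]
        for i in range(n_cols)
    ]


def independence_test(data):
    """Return (z, p_value) for H0: all variables are mutually independent."""
    n = len(data) - 1                     # degrees of freedom, N - 1
    p = len(data[0])
    corr = correlation_matrix(data)
    sum_sq = sum(corr[i][j] ** 2 for i in range(p) for j in range(i))
    n_pairs = p * (p - 1) / 2
    expected = n_pairs / n                # E[r^2] = 1/n for each pair
    variance = 2 * n_pairs * (n - 1) / (n ** 2 * (n + 2))
    z = (sum_sq - expected) / math.sqrt(variance)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))   # one-sided, upper tail
    return z, p_value


# Simulated example (assumed data): 100 samples, 20 variables.
rng = random.Random(0)
independent = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(100)]
dependent = []
for _ in range(100):
    common = rng.gauss(0, 1)              # shared factor induces correlation
    dependent.append([common + rng.gauss(0, 1) for _ in range(20)])

z_indep, p_indep = independence_test(independent)
z_dep, p_dep = independence_test(dependent)
print(z_indep, p_indep)
print(z_dep, p_dep)
```

Large positive values of z indicate that the correlations are, taken together, stronger than expected under independence. In real chemometric data with strongly collinear variables such a test will reject almost always, so it is most informative for residuals or for variables expected to be unrelated.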

LMM and goodness of fit

Linear Mixed Models (LMM) are not often used in chemometrics. However, I would like to draw attention to a very interesting article proposing goodness-of-fit tests for such models [1]. It really widens the understanding of them and encourages further study.


  1. M. Tang, E.V. Slud, and R.M. Pfeiffer, "Goodness of fit tests for linear mixed models", Journal of Multivariate Analysis, vol. 130, pp. 176-193, 2014.

SIMCA extension

In the newest issue of the Journal of Chemometrics, A. Pomerantsev and O. Rodionova propose a new method for calculating the type II error in SIMCA [1]. Anyone who is new to SIMCA should start with this chapter [2].


  1. A.L. Pomerantsev, and O.Y. Rodionova, "On the type II error in SIMCA method", Journal of Chemometrics, vol. 28, pp. 518-522, 2014.
  2. S. Wold, and M. Sjöström, "SIMCA: A Method for Analyzing Chemical Data in Terms of Similarity and Analogy", ACS Symposium Series, pp. 243-282, 1977.

Orthogonality problem in chromatography

There is a continuous need to search for orthogonal chromatographic systems. Camenzuli and Schoenmakers propose a new measure of orthogonality for this purpose [1]. Besides the references cited there (which are well worth reading), it is a very good idea to read this article [2] as a broader supplement.


  1. M. Camenzuli, and P.J. Schoenmakers, "A new measure of orthogonality for multi-dimensional chromatography", Analytica Chimica Acta, 2014.
  2. Z. Zeng, H.M. Hugel, and P.J. Marriott, "Chemometrics in comprehensive multidimensional separations", Analytical and Bioanalytical Chemistry, vol. 401, pp. 2373-2386, 2011.

Importance, influence and variable selection

Variable selection and feature selection ... a never-ending story! Today we have two interesting articles in a special issue of the Journal of Chemometrics: a discussion of variable importance [1] and a paper on variable influence on projection (VIP) in OPLS [2]. As usual, I recommend following the references cited in these articles to improve overall knowledge of this topic.


  1. O.M. Kvalheim, R. Arneberg, O. Bleie, T. Rajalahti, A.K. Smilde, and J.A. Westerhuis, "Variable importance in latent variable regression models", Journal of Chemometrics, 2014.
  2. B. Galindo-Prieto, L. Eriksson, and J. Trygg, "Variable influence on projection (VIP) for orthogonal projections to latent structures (OPLS)", Journal of Chemometrics, 2014.