Papers by Paulina De La Mata
Region of interest selection for GC×GC–MS data using a pseudo fisher ratio moving window with connected components segmentation
Journal of Chromatography Open

Journal of Chromatography A, Oct 1, 2022
There are many challenges associated with analysing gas chromatography-mass spectrometry (GC-MS) data. Many of these challenges stem from the fact that electron ionisation (EI) can make it difficult to recover molecular information due to the high degree of fragmentation with concomitant loss of molecular ion signal. With GC-MS data there are often many common fragment ions shared among closely-eluting peaks, necessitating sophisticated methods for analysis. Some of these methods are fully automated, but make some assumptions about the data which can introduce artifacts during the analysis. Chemometric methods such as Multivariate Curve Resolution (MCR), or Parallel Factor Analysis (PARAFAC/PARAFAC2) are particularly attractive, since they are flexible and make relatively few assumptions about the data, ideally resulting in fewer artifacts. These methods do require expert user intervention to determine the most relevant regions of interest and an appropriate number of components, k, for each region. Automated region of interest selection is needed to permit automated batch processing of chromatographic data with advanced signal deconvolution. Here, we propose a new method for automated, untargeted region of interest selection that accounts for the multivariate information present in GC-MS data to select regions of interest based on the ratio of the squared first and second singular values from the Singular Value Decomposition (SVD) of a window that moves across the chromatogram. Assuming that the first singular value accounts largely for signal, and that the second singular value accounts largely for noise, it is possible to interpret the relationship between these two values as a probabilistic distribution of Fisher Ratios.
The sensitivity of the algorithm was tested by investigating the concentration at which the algorithm can no longer pick out chromatographic regions known to contain signal. The algorithm achieved detection of features in a GC-MS chromatogram at concentrations below 10 pg on-column. The resultant probabilities can be interpreted as regions that contain features of interest.
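The moving-window idea in this abstract can be illustrated with a minimal sketch. It assumes the chromatogram is stored as a scans × mass-channels matrix and computes only the raw ratio of the squared first and second singular values per window; the function name `pseudo_fisher_ratio`, the window length, and the synthetic data are all hypothetical, and the step that converts ratios into probabilities is not reproduced here.

```python
import numpy as np

def pseudo_fisher_ratio(data, window):
    """Return s1^2 / s2^2 for every position of a window moving along the scans.

    data   : (n_scans, n_mass_channels) array (assumed layout)
    window : number of consecutive scans per window (must be >= 2)
    """
    n_positions = data.shape[0] - window + 1
    ratios = np.empty(n_positions)
    for start in range(n_positions):
        # Singular values only; they are returned in descending order.
        s = np.linalg.svd(data[start:start + window], compute_uv=False)
        ratios[start] = s[0] ** 2 / s[1] ** 2
    return ratios

# Synthetic example: Gaussian noise plus one peak centred on scan 100.
rng = np.random.default_rng(0)
n_scans, n_channels = 200, 30
noise = rng.normal(scale=1.0, size=(n_scans, n_channels))
elution = np.exp(-0.5 * ((np.arange(n_scans) - 100) / 4.0) ** 2)
spectrum = rng.uniform(size=n_channels)
data = noise + 50.0 * np.outer(elution, spectrum)

ratios = pseudo_fisher_ratio(data, window=10)
peak_window = int(np.argmax(ratios))  # start index of the highest-ratio window
```

In this toy case the ratio stays close to 1 where only noise is present and rises sharply for windows that overlap the simulated peak, which is the behaviour the abstract relies on to flag regions of interest.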

arXiv (Cornell University), May 6, 2022
Reliable analysis of comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry (GC×GC-TOFMS) data is considered to be a major bottleneck for its widespread application. For multiple samples, GC×GC-TOFMS data for specific chromatographic regions manifests as a 4th order tensor of I mass spectral acquisitions, J mass channels, K modulations, and L samples. Chromatographic drift is common along both the first dimension (modulations) and the second dimension (mass spectral acquisitions), while drift along the mass channel and sample dimensions is for all practical purposes nonexistent. A number of solutions to handling GC×GC-TOFMS data have been proposed: these involve reshaping the data to make it amenable to either 2nd order decomposition techniques based on Multivariate Curve Resolution (MCR), or 3rd order decomposition techniques such as Parallel Factor Analysis 2 (PARAFAC2). PARAFAC2 has been utilised to model chromatographic drift along one mode, which has enabled its use for robust decomposition of multiple GC-MS experiments. Although extensible, it is not straightforward to implement a PARAFAC2 model that accounts for drift along multiple modes. In this submission, we demonstrate a new approach and a general theory for modelling data with drift along multiple modes, for applications in multidimensional chromatography with multivariate detection.
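The reshaping step mentioned in this abstract can be sketched as follows. The axis order (I, J, K, L), the toy dimensions, and the particular unfoldings are assumptions chosen for illustration; the abstract does not specify which unfoldings the cited approaches use.

```python
import numpy as np

# Toy sizes: I acquisitions, J mass channels, K modulations, L samples.
I, J, K, L = 5, 4, 3, 2
tensor = np.arange(I * J * K * L, dtype=float).reshape(I, J, K, L)

# 2nd order (matrix) form, as might be passed to an MCR-type method:
# every (acquisition, modulation, sample) combination becomes a row,
# and the mass channels become the columns.
matrix = tensor.transpose(0, 2, 3, 1).reshape(I * K * L, J)

# 3rd order form, as might be passed to a PARAFAC2-type method:
# modulations and samples are merged into a single mode of length K * L.
third_order = tensor.reshape(I, J, K * L)
```

Either unfolding keeps the mass spectral mode intact, which is the mode the abstract describes as free of drift; the merged modes are where drift would have to be handled by the model.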

Frontiers in analytical science, May 19, 2022
Discriminant-type analyses arise from the need to classify samples based on their measured characteristics (variables), usually with respect to some observable property. In the case of samples that are difficult to obtain, or using advanced instrumentation, it is very common to encounter situations with many more measured characteristics than samples. The method of Partial Least Squares Regression (PLS-R), and its variant for discriminant-type analyses (PLS-DA) are among the most ubiquitous of these tools. PLS utilises a rank-deficient method to solve the inverse least-squares problem in a way that maximises the co-variance between the known properties of the samples (commonly referred to as the Y-block), and their measured characteristics (the X-block). A relatively small subset of highly co-variate variables are weighted more strongly than those that are poorly co-variate, in such a way that an ill-posed matrix inverse problem is circumvented. Feature selection is another common way of reducing the dimensionality of the data to a relatively small, robust subset of variables for use in subsequent modelling. The utility of these features can be inferred and tested in any number of ways; these are the subject of this review.
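The covariance-maximising step described in this abstract can be illustrated with a one-component sketch. This is a simplified PLS1-style calculation on synthetic data with many more variables than samples, not the full PLS-DA procedure covered by the review; the data, the five informative variables, and the 0.5 decision threshold are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_vars = 12, 200                # far more variables than samples
y = np.repeat([0.0, 1.0], n_samples // 2)  # two classes coded as 0 and 1
X = rng.normal(size=(n_samples, n_vars))
X[:, :5] += 3.0 * y[:, None]               # five informative variables (assumed)

# Mean-centre both blocks.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# The unit weight vector that maximises the covariance between Xc @ w and yc.
w = Xc.T @ yc
w /= np.linalg.norm(w)

t = Xc @ w                   # sample scores on the single latent variable
q = (t @ yc) / (t @ t)       # least-squares regression of yc on the scores
y_hat = t * q + y.mean()     # fitted class values
predicted = (y_hat > 0.5).astype(float)
```

Only a projection and a scalar regression are needed, so no inverse of the rank-deficient matrix is ever formed, and variables that co-vary strongly with the class labels receive the largest weights in `w`.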
Use of reconstituted kefir consortia to determine the impact of microbial composition on kefir metabolite profiles
Food Research International
Comparing GC×GC-TOFMS-Based Metabolomic Profiling and Wood Anatomy for Forensic Identification of Five Meliaceae (Mahogany) Species
Wood and Fiber Science

Metabolomics
Fecal samples are highly complex and heterogeneous, containing materials at various stages of digestion. The heterogeneity and complexity of feces make stool metabolomics inherently challenging. The level of homogenization influences the outcome of the study, affecting the metabolite profiles and reproducibility; however, there is no consensus on how fecal samples should be prepared to overcome the topographical discrepancy and obtain data representative of the stool as a whole. Various combinations of homogenization conditions were compared to investigate the effects of bead size, addition of solvents and the differences between wet-frozen and lyophilized feces. The homogenization parameters were systematically altered to evaluate the solvent usage, bead size, and whether lyophilization is required in homogenization. The metabolic coverage and reproducibility were compared among the different conditions. The current work revealed that a combination of mechanical and chemical lysis, obtained by bead-beating with a mixture of large and small beads in an organic solvent, is an effective way to homogenize fecal samples with adequate reproducibility and metabolic coverage. Lyophilization is required when bead-beating is not available. A comprehensive and systematic evaluation of various fecal matter homogenization conditions provides a deeper understanding of the effects of different homogenization methods. Our findings should assist with the standardization of fecal sample homogenization protocols.

Metabolites
The essential oil (EO) from the leaves of Zanthoxylum caribaeum (syn. Chiloperone) (Rutaceae) was studied previously for its acaricidal, antimicrobial, antioxidant, and insecticidal properties. In prior studies, the most abundant compound class found in leaf oils from Brazil, Costa Rica, and Paraguay was terpenoids. Herein, essential oil from the leaves of Zanthoxylum caribaeum (prickly yellow, bois chandelle blanc (FWI), peñas Blancas (Costa Rica), and tembetary hu (Paraguay)) growing in Guadeloupe was analyzed with comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC × GC-TOFMS), and thirty molecules were identified. A comparison with previously published leaf EO compositions of the same species growing in Brazil, Costa Rica, and Paraguay revealed a number of molecules in common such as β-myrcene, limonene, β-caryophyllene, α-humulene, and spathulenol. Some molecules identified in Zanthoxylum caribaeum from Guadeloupe showed some antimet...

Frontiers in Analytical Science
Discriminant-type analyses arise from the need to classify samples based on their measured characteristics (variables), usually with respect to some observable property. In the case of samples that are difficult to obtain, or using advanced instrumentation, it is very common to encounter situations with many more measured characteristics than samples. The method of Partial Least Squares Regression (PLS-R), and its variant for discriminant-type analyses (PLS-DA) are among the most ubiquitous of these tools. PLS utilises a rank-deficient method to solve the inverse least-squares problem in a way that maximises the co-variance between the known properties of the samples (commonly referred to as the Y-block), and their measured characteristics (the X-block). A relatively small subset of highly co-variate variables are weighted more strongly than those that are poorly co-variate, in such a way that an ill-posed matrix inverse problem is circumvented. Feature selection is another common w...

Introduction Fecal samples are highly complex and heterogeneous, containing materials at various stages of digestion. The heterogeneity and complexity of feces make stool metabolomics inherently challenging. The level of homogenization influences the outcome of the study, affecting the metabolite profiles and reproducibility; however, there is no consensus on how fecal samples should be prepared to overcome the topographical discrepancy and obtain data representative of the stool as a whole. Objective Various combinations of homogenization conditions were compared to investigate the effects of bead size, addition of solvents and the differences between wet-frozen and lyophilized feces. Methods The homogenization parameters were systematically altered to evaluate the solvent usage, bead size, and whether lyophilization is required in homogenization. The metabolic coverage and reproducibility were compared among the different conditions. Results The current work revealed that a combin...
Improved sample storage, preparation and extraction of blueberry aroma volatile organic compounds for gas chromatography
Journal of Chromatography Open
Exploration of Extraction and Separation Techniques for Routine Trace Analysis of Organic Compounds in Water: Dispersive Liquid-Liquid Microextraction vs Liquid-Liquid Extraction
Journal of Chromatography Open
Metabolomic analysis of secondary metabolites from Caribbean crab gills using comprehensive two-dimensional gas chromatography - time-of-flight mass spectrometry—New inputs for a better understanding of symbiotic associations in crustaceans
Journal of Chromatography Open

Dietary benzoic acid and supplemental enzymes alter fiber-fermenting taxa and metabolites in the cecum of weaned pigs
Journal of Animal Science
Inclusion of enzymes and organic acids in pig diets is an important strategy supporting decreased antibiotic usage in pork production. However, limited knowledge exists about how these additives impact intestinal microbes and their metabolites. To examine the effects of benzoic acid and enzymes on gut microbiota and metabolome, 160 pigs were assigned to one of four diets 7 days after weaning: a control diet or the addition of 0.5% benzoic acid, 0.045% dietary enzymes (phytase, β-glucanase, xylanase, and α-amylase), or both, and were fed ad libitum for 21 to 22 d. Individual growth performance and group diarrhea incidence data were collected throughout the experimental period. Pen-level diarrhea incidence decreased by 20% from days 8 to 14 in pigs fed both benzoic acid and enzymes compared to the control diet (P = 0.047). Cecal digesta samples were collected at the end of the experimental period from 40 piglets (n = 10 per group) and evaluated for differences using 16S rRNA sequencing ...

Untargeted Region of Interest Selection for GC-MS Data using a Pseudo F-Ratio Moving Window ($\psi$FRMV)
There are many challenges associated with analysing gas chromatography-mass spectrometry (GC-MS) data. Many of these challenges stem from the fact that electron ionisation can make it difficult to recover molecular information due to the high degree of fragmentation with concomitant loss of molecular ion signal. With GC-MS data there are often many common fragment ions shared among closely-eluting peaks, necessitating sophisticated methods for analysis. Some of these methods are fully automated, but make some assumptions about the data which can introduce artifacts during the analysis. Chemometric methods such as Multivariate Curve Resolution, or Parallel Factor Analysis are particularly attractive, since they are flexible and make relatively few assumptions about the data, ideally resulting in fewer artifacts. These methods do require expert user intervention to determine the most relevant regions of interest and an appropriate number of components, $k$, for each region. Automated region of interest selection is needed to permit automated batch processing of chromatographic data with advanced signal deconvolution. Here, we propose a new method for automated, untargeted region of interest selection that accounts for the multivariate information present in GC-MS data to select regions of interest based on the ratio of the squared first and second singular values from the Singular Value Decomposition of a window that moves across the chromatogram. Assuming that the first singular value accounts largely for signal, and that the second singular value accounts largely for noise, it is possible to interpret the relationship between these two values as a probabilistic distribution of Fisher Ratios.
The sensitivity of the algorithm was tested by investigating the concentration at which the algorithm can no longer pick out chromatographic regions known to contain signal.

Body odour consists of different compounds that interact in various ways with textile materials due to differences in their chemical properties. Clothing made from hydrophilic fibres (e.g., cotton) can be more easily laundered, and odorants more effectively removed than those made from hydrophobic fibres (e.g., polyester). Therefore, the purpose of this research was to examine the interactions between textile materials and odorous compounds when washed several times with different detergents. Both test fabrics had an interlock knit structure, with a fibre content of either 100% cotton (234 g/m²) or 100% polyester (224 g/m²). Test compounds were 4-ethyl octanoic acid (octanoic acid) and 2-nonenal (nonenal). Fabric samples were spiked with 10 µL of a solution of octanoic acid (0.1 g), nonenal (0.1 g) and dichloromethane (solvent) and left to sit for 24 hours. Inoculated samples were washed with either Tide® Free and Gentle detergent or Tide® Febreze Sports detergent. Residual odorants were measured using gas chromatography with flame ionization detector. Headspace analysis of volatiles was conducted using solid-phase micro-extraction (SPME); direct extraction of compounds remaining in fabrics was done using dichloromethane as the solvent. Considering the peak area of odorants, findings show that cotton generally retained and desorbed smaller amounts of odorous compounds than polyester did. Interestingly, the non-polar nonenal was difficult to remove from the non-polar hydrophobic polyester by washing, which resulted in higher quantities of nonenal compared with octanoic acid in the headspace. Also, while multiple washes eventually became more efficient in removing odorants from cotton, polyester did not clean as well. Preface: This thesis is an original work by Mohammed Mukhtar Abdul-Bari. No part of this thesis has been previously published.

Evaluation of fresh, frozen, and lyophilized fecal samples by SPME and derivatization methods using GC×GC-TOFMS
Metabolomics
INTRODUCTION Feces is a highly complex matrix containing thousands of metabolites. It also contains live bacteria and enzymes, and does not have a static chemistry. Consequently, proper control of pre-analytical parameters is critical to minimize unwanted variations in the samples. However, no consensus currently exists on how fecal samples should be stored/processed prior to analysis. OBJECTIVE The effects of sample handling conditions on fecal metabolite profiles and abundances were examined using comprehensive two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GC×GC-TOFMS). METHODS Solid-phase microextraction (SPME) and derivatization via trimethylsilylation (TMS) were employed as complementary techniques to evaluate fresh, frozen, and lyophilized fecal samples with expanded coverage of the fecal metabolome. The total number of detected peaks and the signal intensities were compared among the different handling conditions. RESULTS Our analysis revealed that the metabolic profiles of fecal samples depend greatly on sample handling and processing conditions, which had a more pronounced effect on results obtained by SPME than by TMS derivatization. Overall, lyophilization resulted in a greater amount of total and class-specific metabolites, which may be attributed to cell lysis and/or membrane disintegration. CONCLUSIONS A comprehensive comparison of the sample handling conditions provides a deeper understanding of the physicochemical changes that occur within the samples during freezing and lyophilization. Based on our results, snap-freezing at -80 °C would be preferred over lyophilization for handling samples in the field of fecal metabolomics as this imparts the least change from the fresh condition.

An efficient and accurate numerical determination of the cluster resolution metric in two dimensions
Journal of Chemometrics, 2021
Cluster resolution (CR) is a useful metric for guiding automated feature selection of classification models. CR is a measure of class separation in a linear subspace for variable subsets via the determination of maximal, non‐intersecting confidence ellipses. Feature selection by cluster resolution (FS‐CR) is most commonly used to extract panels of useful, discriminating features from sparsely populated chromatographic peak tables, optimizing models from raw signals, or when working with datasets with many more variables than samples. The absence of a numerical method for calculating CR necessitates a great deal of dynamic programming and algorithmic complexity. In this work, we present a numerical determination of the CR metric, which reduces computation time by about 65 times when compared with the dynamic programming approach and simplifies the operating principles of the FS‐CR algorithm.
Investigation of the accelerated thermal aging behavior of polyetherimide and lifetime prediction at elevated temperature
Journal of Applied Polymer Science, 2021
Talanta, 2011
The present work studies the effectiveness of the use of triacylglycerols (TAGs) for the quantification of olive oil in blends with vegetable oils. The determinations were obtained using high-performance liquid chromatography (HPLC) coupled to a Charged Aerosol Detector (CAD), in combination with Partial Least Squares (PLS) regression and using interval PLS (iPLS) for variable selection. Results revealed that PLS models can predict olive oil concentrations with reasonable errors. Variable selection through iPLS did not improve predictions significantly, but revealed the chemical information important in the chromatogram to quantify olive oil in vegetable oil blends.