Dear editors,
We are writing in response to the second letter by IROA technologies requesting the retraction of our publication “Isotope Ratio Outlier Analysis (IROA) for HPLC-TOFMS-Based Metabolomics of Human Urine” in Metabolites (2022, 12(8):741). Paragraphs taken from the retraction request letters are marked in brown.
The readers (IROA) further insist: “The changes to the protocol the authors made were extreme.” The comparison with the constructed CRISPR-cas9 example is irrelevant. The changes are not extreme, and we described all experimental details clearly in our manuscript for the reader to evaluate the data in the light of the experimental conditions.
Nevertheless, changes to the IROA protocol are well documented in the literature. Interestingly, in a protocol (!) published by Mendez et al with both IROA technologies founders (chief executive officer, chief scientific officer) as coauthors we find the following protocol employing the IROA TruQuant IQQ kit to analyze liver tissue extracts (Mendez R et al. 2019):
“…
8. Dry samples in a vacuum centrifuge for ~1 h or until fully dry. Flush the microcentrifuge tubes containing the sample with nitrogen gas (see Note 9).
9. Reconstitute sample immediately prior to use by adding 160 μL 2% methanol. Vortex (see Note 10).
10. Prepare the IROA®-IS by adding 1.25 mL HPLC-grade water to the vial.
11. To each sample, add 40 μL of IROA®-IS.
12. Prepare IROA®-LTRS by adding 50 μL HPLC-grade water to the vial.
13. Inject 5 μL of IROA®-IS supplemented samples or IROA®-LTRS solution into the Q-Exactive injection port and run the LC gradient according to Fig. 2 (also see Note 11).
…” (Mendez R et al. 2019).
First: No calibration experiment is performed or even mentioned!
Second: The volumes from the protocol were not used (marked blue: IS diluted in 1.25 mL water instead of 1.2 mL; LTRS diluted in 50 µL water instead of 40 µL). Most importantly, dry samples are first dissolved in 160 µL 2% methanol and then further diluted with 40 µL IS (marked yellow). This results in an IS in the final sample that is only at 19.2% strength and an LTRS that is at 80% strength compared to the original protocol! These are extreme changes.
Third: Injection of IS blank samples is not performed or at least not mentioned!
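The dilution arithmetic behind the second point can be retraced with a minimal sketch. The function name and the reference point (IS and LTRS reconstituted in the nominal 1.2 mL and 40 µL, undiluted) are our illustrative assumptions; the volumes are those quoted from the protocol above.

```python
def relative_strength(nominal_reconstitution_ul, actual_reconstitution_ul,
                      aliquot_ul=None, final_volume_ul=None):
    """Strength of a standard relative to the nominal reconstitution (1.0 = 100%)."""
    # Effect of reconstituting the vial in a different volume than specified.
    strength = nominal_reconstitution_ul / actual_reconstitution_ul
    # Optional further dilution when an aliquot is added to the sample.
    if aliquot_ul is not None and final_volume_ul is not None:
        strength *= aliquot_ul / final_volume_ul
    return strength


# IS: reconstituted in 1.25 mL instead of 1.2 mL, then 40 uL added to 160 uL sample.
is_strength = relative_strength(1200, 1250, aliquot_ul=40, final_volume_ul=160 + 40)
# LTRS: reconstituted in 50 uL instead of 40 uL, no further dilution.
ltrs_strength = relative_strength(40, 50)

print(f"IS strength:   {is_strength:.1%}")    # 19.2%
print(f"LTRS strength: {ltrs_strength:.1%}")  # 80.0%
```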
“The critical element here is that all three be the same concentration to minimize any chromatographic shifts that are concentration dependent. The fact that they are all very dilute and were injected as larger injections simply served to broaden what would already be broad peaks, may minimize the concentration effects, but as we have seen it minimized the ability to see the peaks. Had the authors done the calibration experiment to test their conditions they would have seen and corrected these errors up front. The IS blank needs to be at the same concentration as the LTRS is to create RT linkages between the LTRS, and the IS containing samples, of which the IS blank is the most critical.”
Referring to their own published protocol for the IROA TruQuant IQQ kit, this argument is rejected!
Nevertheless, to further stress this point, we analyzed other publications. The protocol cited above has been used by Kesh, Mendez (first author of the protocol) et al. in a more recent publication (Kesh K et al. 2022) to analyze serum samples. No calibration is mentioned and only the Mendez protocol is cited. Assuming that according to protocol (Mendez R et al. 2019) both positive and negative mode data were acquired, only 217 metabolites were detected and apparently ratio data (according to supplementary table 1) were used. Pallikkuth, Mendez et al. cite the Mendez protocol (Mendez R et al. 2019) to analyze rhesus macaque plasma and feces (Pallikkuth S et al. 2021). Again, no calibration is mentioned, and experimental details are sparse.
Myer et al. used yet a different protocol to analyze aqueous humor (Myer C et al. 2020a; Myer C et al. 2020b). Again, no calibration is mentioned, the samples are precipitated, dried, reconstituted in water, and 10 µL sample is combined with 20 µL of IROA-IS (according to IROA protocol reconstituted in 1.2 mL water). This results in an IS at 66.6% strength, exactly the same as we used! The LTRS was reconstituted in 50 µL water (not according to IROA protocol) resulting in an LTRS at 80% strength, which is different from the IS strength in the samples. The number of metabolites detected using both positive and negative mode is also not that high with 260 (Myer C et al. 2020a) or 206 (Myer C et al. 2020b). Myer is also a coauthor of the protocol “Isotopic Ratio Outlier Analysis (IROA) of Aqueous Humor for Metabolites” published in the same “Metabolomics, Methods and protocols” book publication mentioned above (Piqueras MDC et al. 2019). Again, no calibration is reported, the reconstituted samples are spiked with the IS (not dissolved with the IS), and details are sparse.
None of the publications outlined above reports a calibration experiment or uses the IROA protocol; the IS and the LTRS are at different strengths, and the number of reported metabolites is low. Why did IROA technologies not demand a retraction of all these publications, including their own protocol? In total, we find in the mentioned book publication 3 different protocols - all for the IROA TruQuant IQQ kit (the kit we used): the original protocol from IROA technologies (Beecher C and de Jong FA. 2019), the Mendez protocol with IROA technologies coauthors (Mendez R et al. 2019) and the Piqueras protocol (Piqueras MDC et al. 2019), apparently later used by Myer et al. (Myer C et al. 2020a; Myer C et al. 2020b).
The readers (IROA) claim that our chromatography is “bad” because we use an injection volume of 5 µL:
Fact check: We refer to a common rule of thumb that can be found in textbooks on HPLC/UPLC:
“Normally, the maximum injection volume should not exceed 10%, preferably 5%, of the column void volume.” (Kromidas 2017). The column void volume is estimated as follows:
$$V_m = \frac{d_c^2}{4} \times \pi \times L_c \times \varepsilon$$
With a column diameter (d_c) of 2.1 mm, a column length (L_c) of 150 mm and ε = 0.65 – 0.7, a void volume of 338 – 364 µL is estimated. An injection volume of 5 µL amounts to 1.48% of an estimated void volume of 338 µL!
If we refer to Waters, the manufacturer of the column used, an even less stringent recommendation can be found based on the empty column volume: “The recommended injection volume for an ACQUITY UPLC column is 1-3% of the total empty column volume.”
(https://support.waters.com/KB_Chem/Columns/WKB66208_What_is_the_suggested_injection_volume_for_an_ACQUITY_UPLC_column, accessed 2022/12/13)
The empty column volume of the column used is 519 µL and an injection volume of 5 µL amounts to 0.96%.
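For readers who want to retrace these numbers, the following short sketch reproduces the estimate. The function names are illustrative; the column dimensions, the porosity range and the 5 µL injection volume are the values given above. Note that 1 mm³ equals 1 µL.

```python
import math


def empty_column_volume_ul(diameter_mm, length_mm):
    """Geometric volume of the empty column tube (1 mm^3 = 1 uL)."""
    return math.pi * diameter_mm ** 2 / 4 * length_mm


def void_volume_ul(diameter_mm, length_mm, porosity):
    """Estimated column void volume: empty volume scaled by the porosity factor."""
    return empty_column_volume_ul(diameter_mm, length_mm) * porosity


injection_ul = 5.0
v_empty = empty_column_volume_ul(2.1, 150)      # about 519.5 uL
v_void_low = void_volume_ul(2.1, 150, 0.65)     # about 338 uL
v_void_high = void_volume_ul(2.1, 150, 0.70)    # about 364 uL

print(f"empty column volume: {v_empty:.1f} uL")
print(f"void volume: {v_void_low:.0f} to {v_void_high:.0f} uL")
print(f"injection / void volume (low estimate): {injection_ul / v_void_low:.2%}")
print(f"injection / empty column volume:        {injection_ul / v_empty:.2%}")
```

Even with the smaller void volume estimate, the injection stays well below the 5% guideline, and relative to the empty column volume it falls just below the 1-3% range quoted from Waters.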
Having established that the injection volume we used is ok, it must be pointed out that the influence of injection volume on peak shape depends on the retention factor k, e.g., it is less pronounced for analytes with higher retention time compared to early eluting solutes. The generalized statements found throughout the second letter that peaks are broad due to an injection volume of 5 µL are just wrong.
Most importantly, two main forms of column overload can occur: volume and mass overload. We aimed to reduce the latter as much as possible. The peak of trigonelline used by IROA technologies to point out a “bad chromatography” (“An examination of Figure 6 in the paper clearly shows that the author’s peaks are unfortunately broad (Fig 6 D =1.8 minutes wide)”) is overloaded, as shown by the peak shape.
Unfortunately, here the readers (IROA) inflated the value (underlined statement). The overloaded peak is about 0.8 min broad (at max 1 min), not 1.8 min. The whole chromatogram window shown in figure 6d is only about 1.5 min wide!
When the non-linear region of the distribution isotherm is reached due to mass overload the peak shape will be distorted and this is the case for trigonelline. To avoid mass overload, urine with a creatinine concentration of 6.5 mM should not be injected straight onto the column but samples should be diluted as we did. There are many metabolites in urine, e.g. conjugates such as hippuric acid, sulfate conjugates, glucuronides, etc. that will easily overload the column, but for which we will not find a corresponding IS in the yeast based IS.
The company further claims the mass spectrometer is poorly tuned because the 40,000 resolution from the specification is not reached. They conveniently overlook that the specification is given for m/z 1022, but we give the resolution for a low m/z:
“A mass spectral resolution of R = 21,000 (for m/z 90.977) was obtained.” While the resolution over m/z curve is fairly flat for higher masses with TOF instruments, it is well known that it drops down in the low m/z range (Beck S et al. 2015), where digitizer speed becomes a limiting factor. This is also nicely illustrated in a white paper by Thermo Scientific (M. Bromirski, white paper 65146).
The company claims that ClusterFinder 4 (CF4) was released in June 2019. Why is the build of ClusterFinder version 3.1.12 dated to August 2019 (see screenshot, Figure 1), when the new release was already out?
Why was CF4 not used in a publication (Taylor NJ et al. 2020) with the IROA founder and chief scientific officer as coauthor published in October 2020, submitted to the journal in April 2020? On a side note, if we examine that publication closer, we find that a Thermo Scientific Q Exactive orbitrap (full scan mode) was used at 35,000 resolution (m/z 200). However, for instrumental details an older publication (Ulmer CZ et al. 2015) is cited where the instrument was operated at 70,000 resolution (m/z 200) – quite a discrepancy? Should that paper be retracted because a poorly tuned instrument was used? It might also be noted that according to (Ulmer CZ et al. 2015) an injection volume of 5 µL was used (Ace Excel C18-pfp column; 100 x 2.1 mm, 2 μm), not to mention other severe flaws in the paper?
Figure 1: Screenshot – About ClusterFinder version 3.1.12.
Second letter: “We should also point out that Figure 6 demonstrates some of the instrumental issues the authors faced. In this figure they suggest that at a resolution of 21,000 they were unable to clearly see a 12 ppm mass difference despite the fact that half height these should separate. Clearly, they had instrumentation and chromatographic issues.”
The figure shows that the signal of protonated trigonelline interferes with the 13C Asp signal.
13C Aspartate: ([M+H]+: 138.058 m/z, [13]C4H8N1O4)
Trigonelline: ([M+H]+: 138.055 m/z, C7H8N1O2)
This is a 3 mDa difference! This is in the range of the specified mass accuracy of the MaxisImpact (2 ppm for 1221.9906 m/z, external calibration) and below common m/z extraction windows for data analysis (e.g., 5 mDa). Using the 10 % valley definition for the resolution according to IUPAC:
“Let two peaks of equal height in a mass spectrum at masses m and m−Δm be separated by a valley which at its lowest point is just 10 per cent of the height of either peak. For similar peaks at a mass exceeding m, let the height of the valley at its lowest point be more (by any amount) than ten per cent of either peak height. Then the resolution (10 per cent valley definition) is m/Δm.” (The International Union of Pure and Applied Chemistry 2023), we would need a resolution (138.058/0.003) much higher than 46,000 to resolve them properly, taking into consideration that they are not of equal height!
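The two numbers used in this argument can be checked with a few lines. The helper names are illustrative; the m/z values and the 2 ppm specification are those quoted above.

```python
# m/z values as quoted above.
MZ_13C_ASPARTATE = 138.058
MZ_TRIGONELLINE = 138.055


def required_resolution(mz, delta_mz):
    """Resolution m / delta-m needed to separate two peaks of equal height."""
    return mz / delta_mz


def ppm_to_mda(mz, ppm):
    """Convert a relative mass accuracy in ppm into an absolute value in mDa."""
    return mz * ppm * 1e-6 * 1000.0


delta_mz = MZ_13C_ASPARTATE - MZ_TRIGONELLINE
resolution = required_resolution(MZ_13C_ASPARTATE, delta_mz)
accuracy_mda = ppm_to_mda(1221.9906, 2.0)

print(f"mass difference:     {delta_mz * 1000:.1f} mDa")
print(f"required resolution: {resolution:,.0f}")
print(f"2 ppm at m/z 1221.9906 corresponds to {accuracy_mda:.2f} mDa")
```

The required value of roughly 46,000 holds for peaks of equal height only; for peaks of unequal height, as in this case, the requirement is higher.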
The readers (IROA) further claim that using the ratios of monoisotopic peaks of endogenous compound to IS is wrong and that the M and M+1 (endogenous) or M and M-1 (IS) must be summed. While this can be done, we insist that this is not necessary and, in some cases, even introduces massive errors.
Example 1: Glycine; the M+1 of the endogenous compound is the M-1 of the internal standard – Is it added to both? Probably not.
Example 2: Cysteine; the M+2 of the endogenous compound amounts up to 5.18% (protonated, m/z 140.018: 4.477% and m/z 140.026: 0.698% of base peak, CompassIsotopePattern tool, Bruker) of the base peak but would be added as M-1 peak to the IS area. This would introduce tremendous variability in the IS summed area depending on the abundance of the endogenous cysteine.
Example 3: Even other metabolites with 3 carbon atoms have a low percentage of the M+2 isotopologue (e.g., serine [protonated, 0.7% of base peak, CompassIsotopePattern tool, Bruker]) that will contribute to the M-1 isotopologue of the IS, especially if the endogenous concentration is much higher compared to the IS concentration.
We can only assume that C3 metabolites are treated differently, but these exemptions are not discussed by IROA technologies.
The letters repeatedly point out that we do not seem to be able to understand the different isotopic patterns in IS as a function (binomial distribution) of the C number in the metabolite. To be thorough, letter 1 from IROA technologies states “In the case of natural abundance, the error is real but minimal. In the case of both 5% and 95% labeling the isotopic peak envelopes (the collection of Isotopologues, “peaks”, and the number of isotopomers that make up each of them) are very closely approximated by a binomial distribution and is completely dependent on the number of carbons in a molecule. Thus, for a molecule containing 6 carbons the M+1 and M-1 peaks alone represent 37% of the height of the base peak, …” Wrong. It is 32%.
(They even state that in their own contribution; (Beecher C and de Jong FA. 2019): Figure 1)
Letter 2 states: “… the authors ratio includes only monoisotopic peaks. As discussed previously, this is flawed because the monoisotopic peak in the IS, at 5% 12C will lose increasingly large percentages as the number of carbons in any molecule changes, for instance, at 5 carbons the 13C M-1 isotopologues other than the monoisotopic will have 23% of the IS contributed isotopologues, while at 10 carbons the 13C M-1 isotopologue will represent almost 40% of the IS contributed isotopologues.” Wrong: for a C10-molecule it is 32% of the isotopologues distribution or 53% of the monoisotopic peak set as base peak.
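The percentages we give can be reproduced from the binomial distribution that the letters themselves invoke. The sketch below is a simplification that considers carbon only (contributions from H, N, O and S isotopes are ignored) and assumes a 5% minority isotope, i.e. 5% 13C in the 5% labeled material or 5% 12C in the 95% labeled IS. The function names are illustrative.

```python
from math import comb


def isotopologue_fraction(n_carbons, k_minor, p_minor=0.05):
    """Fraction of molecules with exactly k_minor minority-isotope carbons."""
    return (comb(n_carbons, k_minor)
            * p_minor ** k_minor
            * (1 - p_minor) ** (n_carbons - k_minor))


def first_satellite_vs_mono(n_carbons, p_minor=0.05):
    """Height of the M+1 (or M-1) peak relative to the monoisotopic peak."""
    mono = isotopologue_fraction(n_carbons, 0, p_minor)
    satellite = isotopologue_fraction(n_carbons, 1, p_minor)
    return satellite / mono


# C6 molecule: first satellite relative to the monoisotopic (base) peak.
print(f"C6,  M-1 vs. base peak:          {first_satellite_vs_mono(6):.0%}")
# C10 molecule: share of the whole distribution and height relative to the base peak.
print(f"C10, M-1 share of distribution:  {isotopologue_fraction(10, 1):.0%}")
print(f"C10, M-1 vs. monoisotopic peak:  {first_satellite_vs_mono(10):.0%}")
```

Under these assumptions the sketch gives 32% for the C6 case, and 32% of the distribution or 53% of the monoisotopic peak for the C10 case.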
Letter 2 further states: “This becomes quite critical when comparing a natural abundance peak (which will have only a 1.1% loss in height), with an IROA monoisotopic peak height (which will have lost a significant but variable formula-defined amount of its height).”
The “loss in height” of the natural abundance peak also depends on the formula.
“This ratio of monoisotopic, as used by the author is not quantitatively accurate. For the more accurate quantitative comparison the natural abundance isotopologues must be summed and compared to the sum of the internal standards isotopologues, i.e. each contribution must be considered as a whole and not as some variable fragment.”
It is not a quantitative analysis! IUPAC defines quantitative analysis as “Analyses in which the amount or concentration of an analyte may be determined (estimated) and expressed as a numerical value in appropriate units.” (IUPAC 2019). The procedure neither delivers a concentration nor an amount because the concentration of the labeled analytes in the yeast IS are simply not known! Hence, the term “quantitative” is wrongly used.
Summing the isotopologue envelope would be mandatory only if it allowed us to compare metabolites with each other, but that is not possible. Based on the ratios calculated from the isotopologue area sum we cannot compare metabolite A to metabolite B in sample X and deduce that metabolite A has a higher concentration because its ratio is higher. The concentration of the IS is not known, and the concentrations are different from one IS compound to another. A high ratio can be caused by a high abundance of the endogenous compound or a low concentration of the IS. Moreover, ionization efficiency can differ tremendously between metabolites, hence a higher peak area of metabolite A compared to metabolite B can be caused by a higher concentration or a better ionization efficiency of metabolite A over B. Hence, we can only compare the abundance of the same metabolite among samples. Since the isotope pattern of the respective IS does not change, the monoisotopic peak ratio can be used.
Remember, we compare the ranks of the same metabolite across different samples, meaning the isotope pattern is the same. We elaborate here with this simple example.
Two urine samples: S1, S2, with IROA IS added to them.
Consider a metabolite X exhibiting two isotopes above the accepted intensity threshold, for both C12 and C13 isotopic envelopes. The isotope intensities are the following:
S1:
C12 M : 100 000
C12 M+1 : 10 000
C13 M : 50 000
C13 M-1 : 25 000
S2:
C12 M : 200 000
C12 M+1 : 20 000
C13 M : 40 000
C13 M-1 : 20 000
Consider the ratios Rmono (a monoisotopic ratio of the M0 isotopes) and Rtot for a total ratio (summing each envelope first)
In S1:
RmonoS1=100 000/50 000=2
RtotS1=(100 000+10 000)/(50 000+25 000)= 110 000/75 000= 1.467
Surely, we get different ratios here, but as shown below this is irrelevant when comparing samples using the same method.
In S2:
RmonoS2=200 000/40 000=5
RtotS2=(200 000+20 000)/(40 000+20 000)= 220 000/60 000= 3.667
RmonoS1/RmonoS2=2/5=0.4
RtotS1/RtotS2=1.467/3.667=0.4
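The same example can be written out as a short sketch. The dictionary keys and function names are illustrative; the intensities are the ones listed above. As long as the isotope pattern of each envelope is the same in both samples, the summed ratio differs from the monoisotopic ratio only by a constant factor, which cancels when the two samples are compared.

```python
samples = {
    "S1": {"c12_m": 100_000, "c12_m1": 10_000, "c13_m": 50_000, "c13_m1": 25_000},
    "S2": {"c12_m": 200_000, "c12_m1": 20_000, "c13_m": 40_000, "c13_m1": 20_000},
}


def r_mono(s):
    """Ratio of the monoisotopic peaks (C12 M over C13 M)."""
    return s["c12_m"] / s["c13_m"]


def r_tot(s):
    """Ratio of the summed envelopes (C12 M + M+1 over C13 M + M-1)."""
    return (s["c12_m"] + s["c12_m1"]) / (s["c13_m"] + s["c13_m1"])


fold_mono = r_mono(samples["S1"]) / r_mono(samples["S2"])
fold_tot = r_tot(samples["S1"]) / r_tot(samples["S2"])

for name, s in samples.items():
    print(f"{name}: Rmono = {r_mono(s):.3f}, Rtot = {r_tot(s):.3f}")
print(f"S1/S2 from Rmono: {fold_mono:.3f}")
print(f"S1/S2 from Rtot:  {fold_tot:.3f}")
```

Both approaches give the same S1/S2 fold change of 0.4, although the individual ratios differ.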
Moreover, we showed in our previous communication that both ratios correlate well, which was ignored in the second letter.
Second letter: The readers (IROA) further wondered why we showed the percentage of features with a ratio between 0.8 and 1.2 (using CF4) in our initial response letter for the different data sets. Well, in the first letter by IROA we find the following statement: “The C12 and C13 peaks for most compounds are optimally measured when they are closer to a 1:1 ratio. A calibration experiment would have determined the creatinine equivalence value for the IS that would have given the greatest number of peaks found and assured their accuracy.” This clearly implies that a ratio of 1:1 is preferred, and we merely showed that the preferred ratio is reached for less than 10% of the features. At no time did we imply that we have limited our analysis to this range of ratios!
A major concern seems to be the low number of reported peaks. Generally, a metabolite can be represented by different peaks/features. This inflates the reported number of features, as seen in (Carey J et al. 2019).
As pointed out earlier, the numbers detected in other matrices (Kesh et al.; Myer et al.) are also not high. Urine is not a yeast extract: of course, there are many more metabolites present, often in high concentration, that do not have a corresponding IS. This alone precludes choosing the sample amount based only on the calibration experiment. No one would inject a urine sample with 6.5 mM creatinine directly onto the column; samples are almost always diluted.
Nevertheless, we would again like to point out that we clearly stated in the publication: “In general, using a yeast extract for urine analysis is not optimal. We, however, did not address the other aspects of using the kit here, such as the range of covered metabolites, which would be readily affected by the matrix type. We, instead, focus on a few identified metabolites and investigate the use of a complex IS and its ability to improve the data correlation to a reference dataset.”
Again, IROA technologies insist that we use the so-called ion suppression corrected data. There is no real proof of this concept. The IS area from the IS blank is used, which is considered the least suppressed value, but there is no proof. The IS itself is a biological matrix that might suffer from ion suppression. (Ion suppression corrected areas: x*C12/IROAIS, with x being the least suppressed value of this analyte.) Overall, it is just a metabolite-dependent factor by which the metabolite ratio in each sample is multiplied.
Regarding the selection of the IS area: “In the current software the default is selected as the average across all of the IS-Blank samples”. In the software version we were provided with, we did not find that option. But if it is the default setting, the result table should deliver an average value for the IS in the sheet “Suppr. Corrected C13 data”, but in fact the value comes from an individual IS sample.
Interestingly, the output from the data analysis with CF 4 delivers the raw peak areas for C12 and C13, the ratios, the suppression corrected data for C12 and C13, and the MSTUS normalized data.
The pdf document “IROA Technology Primer and ClusterFinder™ V3 Software User Manual” delivered with the CF 4 software package (we can only assume it is the correct manual) advises: “When you, the user, exports your data for statistical analysis the ClusterFinder data export will provide you with the raw data (as seen), the ratio, the suppression corrected data, and the normalized data. Based on your needs and experimental design you should choose the appropriate dataset.” Hence, the user is free to use the ratios as well, as also seen in Kesh et al. in a publication from 2022 (Kesh K et al. 2022).
The readers (IROA) further attack the use of PQN normalization: “PQN and RBE make appropriateness assumptions that most users barely understand concerning the quality and nature of the underlying data. How appropriate are techniques which have been developed for genetic data and are now used for metabolomic data?”
Just as a reminder, PQN was developed for metabolomics data as the title of the publication tells us: “Probabilistic quotient normalization as robust method to account for dilution of complex biological mixtures. Application in 1H NMR metabonomics” (Dieterle F et al. 2006) and the paper has already been cited over 1,000 times including many publications working with mass spectrometry data.
As an additional point, we would also like to point out that for the metabolites shown in the paper, we already get a good correlation between the raw peak area and the quantitative data, and there is not much room for further improvement using the ratio data. This is not surprising: there are thousands of untargeted LC-MS studies out there working with the metabolite peak areas without an IS for each feature, including the work by well-known companies such as Metabolon. Should all these studies be retracted because IROA postulates that one cannot work with raw peak areas?
Nevertheless, we did mention that the kit improved the data. We also said that under other conditions it might outperform the post-hoc batch correction.
Publication: “Regarding the batch effects, the application of the kit helped to reduce them. Nevertheless, the IROA–ratio list might show even better results when comparing several batches measured over a prolonged time or even measured on different instruments. Additionally, since RBE works when there are distinct batches, IROA ratios would be more beneficial if samples are measured in a single batch over a long period of time and a gradual decline in instrument performance occurs throughout the batch. This, however, is beyond the scope of this study.”
In summary, the arguments made for a retraction do not hold, not least in light of other publications, including a protocol with coauthors from IROA technologies itself. As mentioned in our previous letter, we are happy to publish an addendum to show the data analyzed with CF4 and the calibration experiment where we exactly followed the protocol suggested by IROA.
However, if the editors decide that the work must be retracted, we will accept that decision. But, in the interest of transparency, the complete correspondence must be available to the readers so that they can form their own opinion. The unnecessarily derogatory and insulting language in the retraction requests together with contradictions and inconsistencies that we pointed out must be disclosed to the readers.
The authors
Beck S; Michalski A; Raether O; Lubeck M; Kaspar S; Goedecke N; Baessmann C et al. (2015): The Impact II, a Very High-Resolution Quadrupole Time-of-Flight Instrument (QTOF) for Deep Shotgun Proteomics. In Molecular & Cellular Proteomics 14 (7), pp. 2014–2029. DOI: 10.1074/mcp.M114.047407.
Beecher C; de Jong FA. (2019): Isotopic Ratio Outlier Analysis (IROA) for Quantitative Analysis: Springer New York (Methods in Molecular Biology).
Carey J; Nguyen T; Korchak J; Beecher C; de Jong F; Lane AL. (2019): An Isotopic Ratio Outlier Analysis Approach for Global Metabolomics of Biosynthetically Talented Actinomycetes. In Metabolites 9 (9), p. 181. DOI: 10.3390/metabo9090181.
Dieterle F; Ross A; Schlotterbeck G; Senn H. (2006): Probabilistic Quotient Normalization as Robust Method to Account for Dilution of Complex Biological Mixtures. Application in 1H NMR Metabonomics. In Anal. Chem. 78 (13), pp. 4281–4290. DOI: 10.1021/ac051632c.
IUPAC (2019): quantitative analysis. In Victor Gold (Ed.): The IUPAC Compendium of Chemical Terminology. Research Triangle Park, NC: International Union of Pure and Applied Chemistry (IUPAC).
Kesh K; Mendez R; Mateo-Victoriano B; Garrido VT; Durden B; Gupta VK et al. (2022): Obesity enriches for tumor protective microbial metabolites and treatment refractory cells to confer therapy resistance in PDAC. In Gut Microbes 14 (1), p. 245. DOI: 10.1080/19490976.2022.2096328.
Kromidas, S. (2017): The HPLC-MS handbook for practitioners. Weinheim, Germany: Wiley-VCH.
Mendez R; Del Carmen Piqueras M; Raskind A; de Jong FA; Beecher C; Bhattacharya SK; Banerjee S. (2019): Quantitative Metabolomics Using Isotope Residue Outlier Analysis (IROA®) with Internal Standards: Springer New York (Methods in Molecular Biology).
Myer C; Abdelrahman L; Banerjee S; Khattri RB; Merritt ME; Junk AK et al. (2020a): Aqueous humor metabolite profile of pseudoexfoliation glaucoma is distinctive. In Mol. Omics 16 (5), pp. 425–435. DOI: 10.1039/C9MO00192A.
Myer C; Perez J; Abdelrahman L; Mendez R; Khattri RB; Junk AK; Bhattacharya SK. (2020b): Differentiation of soluble aqueous humor metabolites in primary open angle glaucoma and controls. In Experimental Eye Research 194 (Suppl. 3), p. 108024. DOI: 10.1016/j.exer.2020.108024.
Pallikkuth S; Mendez R; Russell K; Sirupangi T; Kvistad D; Pahwa R et al. (2021): Age Associated Microbiome and Microbial Metabolites Modulation and Its Association With Systemic Inflammation in a Rhesus Macaque Model. In Front. Immunol. 12, p. 25. DOI: 10.3389/fimmu.2021.748397.
Piqueras MDC; Myer C; Junk A; Bhattacharya SK. (2019): Isotopic Ratio Outlier Analysis (IROA) of Aqueous Humor for Metabolites: Springer New York (Methods in Molecular Biology).
Taylor NJ; Gaynanova I; Eschrich SA; Welsh EA; Garrett TJ; Beecher C et al. (2020): Metabolomics of primary cutaneous melanoma and matched adjacent extratumoral microenvironment. In PLoS ONE 15 (10), e0240849. DOI: 10.1371/journal.pone.0240849.
The International Union of Pure and Applied Chemistry (2023): IUPAC - resolution (R05318). Available online at https://goldbook.iupac.org/terms/view/R05318, updated on 1/13/2023, checked on 1/13/2023.
Ulmer CZ; Yost RA; Chen J; Mathews CE; Garrett TJ. (2015): Liquid Chromatography-Mass Spectrometry Metabolic and Lipidomic Sample Preparation Workflow for Suspension-Cultured Mammalian Cells using Jurkat T lymphocyte Cells. In J Proteomics Bioinform 08 (06). DOI: 10.4172/jpb.1000360.