Alexis St-Gelais, chemist – Popularization
Today, we want to give you a glimpse of how we manage to identify compounds that we previously reported as unknowns in our gas chromatographic profiles, using cannabis terpenes as a case study.
How can a Compound Noted as Unknown in GC Profiles Finally be Identified?
We have previously discussed on these pages the concept of unknown compounds, often seen in essential oil analyses. Some unknown molecules are also part of our terpenes screening for cannabis, for very similar reasons.
Of course, for a chemist, the most satisfying part of unknowns is when one can finally be identified!
There are two ways to do so. The first is to isolate and characterize the molecule, as we did in a paper from 2018. However, this can be a difficult undertaking for various reasons (sample availability, challenges in separating the compound from the others, insufficient amounts, impurities, etc.), and it is time-consuming.
Another possibility is that, even though no match can be found in available mass spectral databases, the compound has nevertheless been identified earlier in the scientific literature. Not all literature is included in databases. And since no one knows all published science, and since searching the literature is time-consuming, sometimes a bit of luck must be involved before one starts connecting the dots.
Identification: Case of the Unknown CIAU VII or Selina-4,7(11)-diene
The case we want to present today is of the latter type. In both our main and full terpenes analyses for cannabis, we made a habit of reporting an often quantitatively important unknown compound, which eluted close to selina-4(15),7(11)-diene and selina-3,7(11)-diene. We knew it, in-house, as Unknown CIAU VII [m/z 189, 204 (92), 161 (65), 133 (51), 105 (51), 91 (51), 119 (45)]. The last part of that name, contained within brackets, is a list of the main ions read on the mass spectrum of the unknown, each followed in parentheses by its relative abundance against the most prominent ion, which is always the first one. This information can be found in our reports, because it can be useful to any analytical chemist attempting to locate the same substance. The code CIAU VII is an internal label we use for the sake of simplicity. It is derived from the name of the botanical source of the essential oil where we first observed the unknown, followed by an incrementing Roman numeral. CIAU here stands for Citrus aurantifolia, or lime essential oil, where it was the 7th (VII) unknown registered at PhytoChemia for the species.
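For readers who track such unknowns in their own databases, the bracketed descriptor can be read programmatically. The following is only a minimal sketch: the function name `parse_unknown_descriptor` is a hypothetical helper written for this illustration, not an existing tool, and it assumes the descriptor always follows the exact layout shown above, with the first ion taken as the base peak at 100.

```python
import re


def parse_unknown_descriptor(descriptor):
    """Turn '[m/z 189, 204 (92), ...]' into a list of (m/z, relative abundance) pairs.

    Hypothetical helper: assumes the first ion is the base peak (abundance 100)
    and that every following ion carries its abundance in parentheses.
    """
    # Drop the surrounding brackets and the leading "m/z" label.
    body = descriptor.strip().strip("[]").strip()
    body = re.sub(r"^m/z\s*", "", body)

    peaks = []
    for index, item in enumerate(body.split(",")):
        match = re.fullmatch(r"\s*(\d+)\s*(?:\((\d+)\))?\s*", item)
        if match is None:
            raise ValueError(f"Unrecognized peak entry: {item!r}")
        mz = int(match.group(1))
        if match.group(2) is not None:
            abundance = int(match.group(2))
        elif index == 0:
            abundance = 100  # the first ion is the most prominent one
        else:
            raise ValueError(f"Missing relative abundance for m/z {mz}")
        peaks.append((mz, abundance))
    return peaks


CIAU_VII = "[m/z 189, 204 (92), 161 (65), 133 (51), 105 (51), 91 (51), 119 (45)]"
peaks = parse_unknown_descriptor(CIAU_VII)
for mz, abundance in peaks:
    print(f"m/z {mz}: {abundance}")
```

Storing the ions as pairs rather than as free text makes it easier to search a database for the same substance under a different label.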
This molecule can be observed in a number of other essential oils, but rarely in large amounts except in cannabis. Even in lime oils, it rarely exceeds 0.25%, which is relatively minor. Since we deal with thousands of unknowns in our databases, it was not given particular attention for a long time.
This summer, though, we decided to painstakingly dissect the results of a guaiacwood essential oil to enrich our databases with the large number of new compounds reported in an excellent and detailed research paper by Tissandié, Viciana, Brevard, Meierhenrich & Filippi from 2018 for that essential oil, with a self-explanatory title: Towards a complete characterization of guaiacwood oil. When cross-checking the results from that paper against our own observations, we stumbled upon selina-4,7(11)-diene, a compound which was not listed in the reference databases to which we had access. Tissandié's paper referred to another work by de Lima et al. from 2012 on Croton hirtus essential oil from Brazil, where the molecule is indeed listed at a retention index of 1534 on their DB-5 column. This last study, however, did not provide much more information, and in particular no mass spectrum for this compound.
Searching deeper in the literature for reference data on this compound, we found a paper by Feger, Brandauer & Ziegler from 1999 on lime essential oil. This study reports the isolation of selina-4,7(11)-diene and its characterization by nuclear magnetic resonance, which allowed the authors to establish its structure unequivocally. They indicate that this compound, on a non-polar gas chromatographic column, elutes close to selina-4(15),7(11)-diene*. Interestingly, the latter is a recurring and prominent sesquiterpene of cannabis, and it elutes very near to unknown CIAU VII, which was itself first recorded in lime oil. At this point, we began to suspect that CIAU VII might be selina-4,7(11)-diene.
However, that article still did not provide a mass spectrum to finally pin down the compound. The authors instead referred to a much older piece of literature from Sun & Erickson in 1978, who isolated this molecule for the first time in nature from the alga Laurencia nidifica. Happily, these authors did take the time to report mass spectral data for their newly identified compound. This aspect is sadly too often overlooked in studies reporting the isolation of sesquiterpene compounds, and its absence has jeopardized more than one of our attempts at identifying unknowns. The reported data allow for a comparison between the published mass spectral signals and our CIAU VII (Figure 1).
This match is somewhat ambiguous at first glance. All major peaks are found in both spectra, but the ratios do not fully agree. One can notice that small ions are more abundant in the literature spectrum. This phenomenon is, however, not that uncommon, especially when dealing with older data (and 1978, for GC-MS, is fairly old). Smaller ions often appear more intense in older publications, possibly owing to the evolution of instrumentation and/or of the instrument parameters being used. This ambiguity must also be weighed against the fact that CIAU VII is found in the same concentration range as that reported for selina-4,7(11)-diene, both in distilled lime oil. Furthermore, the retention index of 1534 reported by de Lima et al. is the same as that which we recorded for CIAU VII on the same type of chromatographic column. Finally, CIAU VII was indeed observed as traces in the guaiacwood oil we were studying, which is in accordance with the reporting of selina-4,7(11)-diene by Tissandié et al.
Taken alone, none of these observations would be solid enough to support the hypothesis that CIAU VII is selina-4,7(11)-diene. Taken together, however, we feel that they give a credible basis to at least propose that identification, followed by a question mark, in future reports.
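To illustrate why inflated small ions do not necessarily rule out a match, here is a rough sketch using cosine similarity, a common way of scoring two mass spectra against each other. This is not the method we used (our comparison was visual), and the "older-style" spectrum below is entirely made up: it simply takes the CIAU VII ions and boosts those below m/z 150 by an arbitrary factor of 1.5. It does not reproduce the values published by Sun & Erickson.

```python
import math


def cosine_similarity(spectrum_a, spectrum_b):
    """Cosine similarity between two spectra given as {m/z: relative abundance}."""
    all_mz = set(spectrum_a) | set(spectrum_b)
    dot = sum(spectrum_a.get(mz, 0) * spectrum_b.get(mz, 0) for mz in all_mz)
    norm_a = math.sqrt(sum(value * value for value in spectrum_a.values()))
    norm_b = math.sqrt(sum(value * value for value in spectrum_b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cannot compare an empty spectrum")
    return dot / (norm_a * norm_b)


# Main ions of CIAU VII as given in our reports (base peak set to 100).
ciau_vii = {189: 100, 204: 92, 161: 65, 133: 51, 105: 51, 91: 51, 119: 45}

# HYPOTHETICAL spectrum, for illustration only: same ions, with the lighter
# ones (m/z < 150) inflated by an arbitrary factor of 1.5.
older_style = {
    mz: abundance * (1.5 if mz < 150 else 1.0)
    for mz, abundance in ciau_vii.items()
}

shared_ions = sorted(set(ciau_vii) & set(older_style))
score = cosine_similarity(ciau_vii, older_style)
print("Shared ions:", shared_ions)
print(f"Cosine similarity: {score:.3f}")
```

In this made-up case the score stays close to 1 despite the distorted ratios, which is why such a discrepancy weakens a match without excluding it.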
Another argument can be obtained from observation of cannabis terpenes themselves. Over the course of hundreds of analyses, clear correlations of concentrations arise among some of the terpenes. This is to be expected, since plants produce such metabolites following biosynthetic pathways: if a given pathway is enabled for a cannabis strain, typically meaning that a given enzyme is available to the plant to process a given compound in a derivative, all compounds that are compatible with that enzyme can undergo the same transformation. This generally leads to compound groups, where the members share a similar origin and structural resemblance. Running some statistics on our cannabis terpenes data, we noticed that such a group formed around several similar terpenes (Figure 2).
This group includes several compounds whose names contain the “selin-” prefix and which belong to the eudesmane-type sesquiterpenes. Whenever these compounds were more abundant in a strain, the unknown CIAU VII was also more prominent. Comparing the structure of selina-4,7(11)-diene with those of the other members of this group clearly shows the structural similarity. The proposed identification of CIAU VII is therefore also plausible from a biosynthetic perspective, since cannabis can be expected to generate this compound alongside the others.
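The kind of statistic behind such a grouping can be sketched with a simple Pearson correlation. The concentrations below are invented for illustration and are not taken from our dataset, the "unrelated terpene" is a placeholder, and the actual analysis behind Figure 2 may well have used a different method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# MADE-UP relative concentrations (%) in six hypothetical cannabis strains.
selina_3_7_11_diene = [0.1, 0.8, 1.5, 0.3, 2.1, 1.0]
unknown_ciau_vii = [0.05, 0.5, 0.9, 0.2, 1.3, 0.6]
unrelated_terpene = [1.2, 0.4, 0.9, 1.1, 0.7, 0.3]  # placeholder compound

r_related = pearson(selina_3_7_11_diene, unknown_ciau_vii)
r_unrelated = pearson(selina_3_7_11_diene, unrelated_terpene)
print(f"selina-3,7(11)-diene vs CIAU VII: r = {r_related:.2f}")
print(f"selina-3,7(11)-diene vs unrelated terpene: r = {r_unrelated:.2f}")
```

Compounds that share a biosynthetic origin would be expected to behave like the first pair, rising and falling together across strains.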
What is changing in the analysis reports
From now on, we will therefore be reporting “selina-4,7(11)-diene?” in place of “Unknown [m/z 189, 204 (92), 161 (65), 133 (51), 105 (51), 91 (51), 119 (45)]” in our cannabis terpenes screens, as well as in future essential oil analyses. Customers who track compound-based statistics for their strains might want to arrange for the same translation to be made in their databases, too. This case illustrates the kind of inference process at play when cross-checking literature data against empirical observations to identify unknown compounds. We hope that you enjoyed the demonstration!
*Feger, Brandauer & Ziegler list the compound as selina-4(14),7(11)-diene. Irritating as these nomenclatural twists may be, this is a synonym of selina-4(15),7(11)-diene!
 St-Gelais, A.; Roger, B.; Alsarraf, J.; Legault, J.; Massé, D.; Pichette, A. Aromas from Quebec. VI. Morella pensylvanica from the Magdalen Islands: A (-)-α-Bisabolol-Rich Oil Featuring a New Bisabolane Ether. J. Essent. Oil Res., 2018, 30 (5), 319–329. https://doi.org/10.1080/10412905.2018.1470039.
 Tissandié, L.; Viciana, S.; Brevard, H.; Meierhenrich, U. J.; Filippi, J. J. Towards a Complete Characterisation of Guaiacwood Oil. Phytochemistry, 2018, 149, 64–81. https://doi.org/10.1016/j.phytochem.2018.02.007.
 de Lima, S. G.; Medeiros, L. B. P.; Cunha, C. N. L. C.; da Silva, D.; de Andrade, N. C.; Neto, J. M. M.; Lopes, J. A. D.; Steffen, R. A.; Araújo, B. Q.; Reis, F. de A. M. Chemical Composition of Essential Oils of Croton hirtus L’Her from Piauí (Brazil). J. Essent. Oil Res., 2012, 24 (4), 371–376. https://doi.org/10.1080/10412905.2012.692908.
 Feger, W.; Brandauer, H.; Ziegler, M. Analytical Investigation of the Sesquiterpene Hydrocarbons of Distilled Lime Oil (Citrus aurantifolia Swingle). J. Essent. Oil Res., 1999, 11 (5), 556–562. https://doi.org/10.1080/10412905.1999.9701213.
 Sun, H. H.; Erickson, K. L. Sesquiterpenoids from the Hawaiian Marine Alga Laurencia nidifica. 7. (+)-Selina-4,7(11)-Diene. J. Org. Chem., 1978, 43 (8), 1613–1614. https://doi.org/10.1021/jo00402a039.