Stephen Bustin a,⁎, Jim Huggett b,c
a Faculty of Medical Science, Anglia Ruskin University, Chelmsford CM1 1SQ, United Kingdom
b Molecular and Cell Biology Team, LGC, Queens Road, Teddington, Middlesex TW11 0LY, United Kingdom
c School of Biosciences & Medicine, Faculty of Health & Medical Science, University of Surrey, Guildford, GU2 7XH, United Kingdom
ARTICLE INFO
Handled by: Justin O’Grady
Keywords: Real-time PCR; Assay design; MIQE; Oligonucleotide
ABSTRACT
Primers are arguably the single most critical components of any PCR assay, as their properties control the exquisite specificity and sensitivity that make this method uniquely powerful. Consequently, poor design combined with failure to optimise reaction conditions is likely to result in reduced technical precision and false positive or negative detection of amplification targets. Despite the framework provided by the MIQE guidelines and the accessibility of wide-ranging support from peer-reviewed publications, books and online sources as well as commercial companies, the design of many published assays continues to be less than optimal: primers often lack intended specificity, can form dimers, compete with template secondary structures at the primer binding sites or hybridise only within a narrow temperature range. We present an overview of the main steps in the primer design workflow, with data that illustrate some of the unexpected variability that often occurs when theory is translated into practice. We also strongly urge researchers to report as much information about their assays as possible in their publications.
1. Introduction
The peer-reviewed literature contains references to tens, if not hundreds, of thousands of oligonucleotide primer sequences for use with the polymerase chain reaction (PCR) and hundreds more are available from primer databases or can be bought from commercial suppliers. Oligonucleotide synthesis is available at bargain prices, enzymes are becoming faster, more reliable and cheaper, there are task-specific master mixes (e.g. for multiplexing) and thermal cyclers are becoming more affordable and user-friendly. This makes it possible to generate huge amounts of data with comparatively little effort. Since these data find their way into over 15,000 qPCR-related publications every year, it is essential to try and ensure that publications report real results, rather than technical bias [1]. With so many ready-made assays available, one might wonder why anyone would want to go to the trouble of designing yet another assay, especially as the perception is that designing one’s own assay is a lot more complex and inconvenient than simply buying it from a commercial supplier, who in any case will have validated every one of their assays. That perception is wrong for two reasons. First, commercial primers or assay conditions may not have been experimentally validated or optimised. Second, it cannot be presumed that a primer set will generate the same results under different experimental conditions, since assay performance can vary depending on what extraction method was used to purify the templates [2], what reagents were used for the PCR reaction [3–5] and what thermal cycler was used to run the assays [6,7]. Hence researchers can be sure of an assay’s performance only by
performing their own validation and optimisation experiments. Doing this before working with precious samples will save time, expense and help avoid failed runs or inconsistent experimental data. Given the significance of empirical validation, it is important that any publication include that essential information [8–11]. Several reports have been published recently that together scored thousands of peer-reviewed papers in a wide collection of journals ranging from low to high impact factors [12–17]. All concluded that the amount of critical information provided with papers reporting qPCR data is inadequate for the purpose of evaluating the validity of conclusions arising from those data, with many not reporting primer sequences, validation data or including wrong information. The main concerns with regards to designing assays usually relate to researchers being unfamiliar with the primer design process or unsure
about the key parameters most likely to generate optimal primers, lacking the appropriate design tools, or being apprehensive that the design process will take too long. However, assay design is usually quite straightforward, suitable tools are freely available online and it takes less time to design a robust, sensitive and specific assay than to troubleshoot a poorly designed one. We provide a concise overview of the main primer-related issues that confront anyone wanting to design a qPCR assay, consider the main criteria that have an impact on assay performance, dissect the individual steps of the assay design workflow and analyse the performance of some real-life assays.
2. The importance of primers
Appropriately validated primers are crucial in determining the specificity, sensitivity and robustness of a PCR reaction [18]. Whilst it is nearly always possible to get a result with a PCR assay, this is not the same as getting a correct result, be that a present/absent call for the detection of a pathogen or mutation using an endpoint assay or an accurate quantification of RNA copy numbers using a real-time method. In reality, PCR is not as robust as many people believe and there is a need to consider the science underlying DNA folding and match versus mismatch hybridisation. Having said that, it is not always obvious why some primer combinations work well and others do not. The critical variable for primer performance is its annealing temperature (Ta), rather than its melting temperature (Tm), as the Ta defines the temperature at which the maximum amount of primer is bound to its target. The optimal primer Ta must be established experimentally as primer design programs generally calculate Tms and, in any case, many use wrong prediction parameters [19]. Furthermore, since optimal annealing temperatures vary with different buffers, results obtained with one master mix cannot necessarily be extrapolated to a second one. Even at the optimal Ta, non-specific amplification can occur, especially with “proofreading” enzymes, caused not just by primer dimers but by physical closeness of primer pairs at mismatched sites. In addition, reliance on BLAST searches alone does not guarantee primer specificity: whilst the BLAST algorithm returns fast results, it may miss thermodynamically important hybridisation events as it does not correctly score the gaps that generate duplex bulges [19]. Furthermore, the effects of mismatches on duplex stability are sequence
context dependent and are not correctly called by sequence independent approximations.
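To illustrate why a calculated Tm should be treated only as a starting point for an experimentally determined Ta, the following minimal Python sketch applies two widely quoted rule-of-thumb Tm formulas to the Asp-F primer listed in the legend to Fig. 2. Neither formula is taken from this article or from the design tools it cites, the function names are hypothetical, and both approximations ignore salt, Mg2+, primer concentration and sequence context, which is precisely the kind of simplification cautioned against above.

```python
def base_counts(primer: str) -> dict:
    """Count A, C, G and T in a primer (case-insensitive)."""
    seq = primer.upper()
    return {base: seq.count(base) for base in "ACGT"}


def tm_wallace(primer: str) -> float:
    """Rule-of-thumb Tm: 2 degC per A/T plus 4 degC per G/C.

    A crude approximation that ignores buffer composition and
    nearest-neighbour effects.
    """
    n = base_counts(primer)
    return 2.0 * (n["A"] + n["T"]) + 4.0 * (n["G"] + n["C"])


def tm_gc_length(primer: str) -> float:
    """Another common approximation based on GC count and length:
    Tm = 64.9 + 41 * (G + C - 16.4) / N.
    """
    n = base_counts(primer)
    length = sum(n.values())
    return 64.9 + 41.0 * (n["G"] + n["C"] - 16.4) / length


if __name__ == "__main__":
    asp_f = "CTTGGATTTGCTGAAGACTAAC"  # Asp-F, from the legend to Fig. 2
    print(f"Rule-of-thumb Tm:  {tm_wallace(asp_f):.1f} degC")
    print(f"GC/length Tm:      {tm_gc_length(asp_f):.1f} degC")
```

The two estimates for the same primer do not agree with each other, which underlines that neither can substitute for a temperature gradient experiment in the master mix that will actually be used.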
3. Principal considerations for assay design
A good assay will not create primer dimers, will be close to 100% efficient and will be exquisitely specific. Such an assay will also be robust, which means that if conditions are not quite optimal, for example if a sample contains traces of an inhibitor, or if the thermal cycler has uneven thermal profiles across its block, then the assay may still perform reliably and generate usable data. In contrast, a poor assay will be much more susceptible to variable conditions, and is virtually guaranteed to result in wasted time and considerable frustration on the part of the researcher. As a rule of thumb, if primers perform well over a broad temperature gradient, the assay tends to be robust, whereas if amplification is restricted to a narrow temperature optimum, it is not. When designing assays in-house, the design process comprises a comprehensive workflow that demands careful consideration not just of the primers themselves but also of amplicon uniqueness, structure and location, with the aim of bringing about an optimal primer/amplicon combination for accurate quantification of nucleic acids. Attention to such detail makes it more likely that the assays will yield data that are sufficiently reliable and sensitive to generate consistent as well as biologically/clinically relevant results. Importantly, even when the primers have been designed by a colleague, tracked down from a peer-reviewed publication, acquired from a primer database or purchased from a commercial source, reliable qPCR demands a retrospective evaluation of most of the in silico criteria and assiduous validation of all of the wet lab parameters. Achieving these objectives is not difficult when following the workflow shown in Fig. 1, which involves four major steps: (i) target identification, (ii) definition of assay properties, (iii) characterisation of primers and (iv) assay optimisation. The first two steps are carried out by in silico analyses, the latter two by experimental investigation.
4. Target identification
It is self-evident that an assay is useful only if the correct target has been identified and used for assay design. Hence the more that is known about the DNA or RNA of interest, the better. Accordingly, the first step involves accumulating as much information as possible from sequence databases. This can appear to be quite daunting, but will soon become second nature with a systematic, step-by-step approach. One problem with searching for sequences is that there are often numerous identical, closely-related or, more surprisingly, significantly different sequences listed under the same common name. An NCBI search for “Aspergillus terreus 18S” brings up 161 sequences of varying lengths, descriptions and accession numbers. A search for “Aspergillus terreus 28S” returns
142 sequences, some the same as the previous search, but again all with different accession numbers and lengths. There are two lessons here: (i) it is essential to have absolute clarity about the amplification target and (ii) it is crucial always to refer to the accession or individual transcript number of any sequence used for assay design, as this minimises the risk of confusion and makes life much simpler for reviewers and readers. Furthermore, many databases are not curated, so a given sequence name is solely based on what the individual who uploaded it thought it was. Consequently, if the original sequence was incorrect, for example due to a nonspecific PCR, but this was not known when it was uploaded, then a circular problem will arise that further propagates the error.
Database mining assumes familiarity with its nomenclature: e.g. NCBI sequences prefixed with NC_, NG_ are curated genomic sequences, NM_ is a curated mRNA sequence, whereas NT_ and NW_ are automated genomic and XM_ automated RNA sequences. Hence the information with regards to some sequences (NM_) is likely to be more reliable than that for others (XM_) and judicious choice of sequence information is advised. For example, if the aim is to amplify a cellular mRNA, it is important to ascertain whether there are transcript or splice variants or additional closely related paralogues and whether the assay should target all of those or be able to distinguish between them. One important consideration is to ensure that the assay does not inadvertently amplify pseudogenes. Designing PCR primers for miRNAs is somewhat more challenging, since a typical miRNA is only 22 bases long, which is
about the same size as a conventional PCR primer. Genotyping assays, on the other hand, place an obvious restriction on the position of the amplicon as it must include the site of the polymorphism or mutation. This highlights an important point in that while designing an optimal assay is desirable the designer is ultimately at the mercy of the sequence in question and the ‘best’ assay may not be ideal. This may not preclude
its use as long as the limitations of such a choice (such as possible reduced efficiency, sensitivity or precision) are understood and, crucially, reported in any publication. Absolute certainty as to what is being targeted is of particular importance when utilising PCR as a diagnostic or forensic test. Whilst assays used in clinical applications are heavily regulated (although mistakes are still made), research diagnostic assays are not, so factors such as specificity must be considered when reaching conclusions. Human papillomavirus (HPV) is the primary aetiological factor that transforms cervical epithelia into cervical cancer and causes most anal and oropharyngeal as well as some vaginal, vulvar, and penile cancers. According to epidemiological case-control studies, 15 high-risk HPV types have been acknowledged, while three types have been designated as probable high-risk and 12 types have been classified as low-risk [21]. Depending on the purpose of an associated study it could be essential to distinguish between these subtypes and so appropriately designed assays will be paramount [22]. Similarly, designs targeting bacterial and fungal pathogens require careful consideration prior to carrying out any diagnostic experiments [23]. The recent interest in using RT-PCR to target RNA for the tissue profiling of human forensic samples.
Fig. 1. Workflow for PCR primer design. The following web sites are pertinent: PrimerBlast: https://www.ncbi.nlm.nih.gov/tools/primer-blast; DINAmelt: http://unafold.rna.albany.edu/?q=DINAMelt; Mfold: http://unafold.rna.albany.edu/?q=mfold; BLAST: https://blast.ncbi.nlm.nih.gov/Blast.cgi; Ensembl: http://www.ensembl.org/index.html.
Fig. 2. Temperature gradient analysis of three assays targeting fungal rRNA genes. Effect of different commercial master mixes on gradient profile. Amplification was carried out in 10 μL using BioRad’s iTaq Universal SYBR Green Supermix (172-5121) with 300 nM final primer concentration on a BioRad CFX instrument, with a three minute 95 °C denaturation step followed by 40 cycles of 5 s at 95 °C and 30 s at 62 °C. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) A. Primers were Asp-F: CTTGGATTTGCTGAAGACTAAC and Asp-R: CTAACTTTCGTTCCCTGATTAATG, amplicon size 76 bp. This assay is robust, with the amplification plots virtually indistinguishable at the eight temperatures tested. B. Melt curve. C. Primers were FS1-F: GAGGATGCTTTTGGTGAG and FS1-R: GAGCTTTACAGAGGATCG, amplicon size 99 bp. This assay is somewhat less robust, recording visible differences in Cq. D. Melt curve. E. Primers were FS2-F: CCCGAGTTGTAATTTGTAG and FS2-R: GAAGGAGCTTTACAGAGG, amplicon size 121 bp. This assay is poor, with significantly higher Cqs at temperatures away from the optimum. F. Melt curve. G. Table of the Cqs recorded for the three assays, with the assays recording ΔCqs of 0.53, 1.63 and 8.93, respectively, between optimal (highlighted in red) and least optimal temperatures. Amplification conditions were as described in the legend to Fig. 5.
Fig. 3. Effect of different commercial master mixes on gradient profile. Amplification reactions were carried out as described in the legend to Fig. 2, except that seven different master mixes were used and annealing was carried out using a temperature gradient from 60 °C–65 °C.
A. Primers CA-F: GTTTGGTGTTGAGCAATAC and CA-R: CTACCTGATTTGAGGTCAAA were used to amplify fungal genomic DNA and PCR amplicons were detected with a hydrolysis probe CA-Pr: FAM-ACAATGGCTTAGGTCTAAC-BHQ. All master mixes record similarly robust gradient profiles, although the Cqs from one master mix are lower than those from the rest. The optimal annealing temperatures are highlighted in red and differ between the master mixes.
B. Primers BAR-1F: CATGCTCCAAAATGCCCTA and BAR-1R: CTTGGTAGCACACCCAAA were used to amplify bacterial genomic DNA and PCR amplicons were detected with SYBR Green chemistry. The quality of the gradient profile depends on the master mix used, although the specificity is not affected, as determined by the melt curves. The optimal annealing temperatures are highlighted in red and differ between the master mixes.
makes it necessary to ensure that the new assays being developed are species-, RNA- and tissue-specific. It is particularly worth noting that when the intention is to achieve RNA-specificity by placing the primer binding sites in separate exons, intron size is important. At least one fast Taq polymerase in use today can extend at a rate of 155 nucleotides/second [25], and with annealing and polymerisation times of tens of seconds, as is still common, this can result in polymerisation through the intron and the detection of amplified DNA, eliminating any chance
of reliably distinguishing tissue-specific RNA expression from DNA contamination. Surprisingly, many assays in clinical use are not as specific and/or sensitive as one might expect, making one wonder whether or how they
were ever optimised. A publication reporting the evaluation of the performance of eight candidate working standards developed for diagnostic qPCR-based assays of clinically relevant viral targets by Central Veterinary Laboratories in the UK showed a high level of variability in intra- and inter-assay detection of their targets, with the
highest levels of variation amounting to almost 3 logs of virus [26]. This is rather disturbing, seeing that these results come from laboratories carrying out routine RT-qPCR analyses on patient samples. Other reports describe exon-specific primers that may also bind to introns [27], assays in which the probe binds to sequences upstream from the forward primer [28] or publications in which some of the primer sequences are incorrect [29].
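The intron-size caveat raised earlier in this section can be made concrete with simple arithmetic. The sketch below is an illustration only: it multiplies the extension rate quoted above (155 nucleotides/second [25]) by a combined annealing/polymerisation hold time, here the 30 s step given in the legend to Fig. 2, to obtain an upper bound on how far the polymerase could extend in one cycle. The function names are hypothetical, and the calculation assumes a constant extension rate for the whole hold, which real reactions will not achieve.

```python
def max_extension_nt(rate_nt_per_s: float, hold_time_s: float) -> float:
    """Upper bound on nucleotides synthesised in one annealing/extension step.

    Assumes the polymerase extends at a constant rate for the whole hold
    time, so the result is an optimistic ceiling rather than a prediction.
    """
    return rate_nt_per_s * hold_time_s


def intron_may_be_bridged(genomic_span_nt: int,
                          rate_nt_per_s: float,
                          hold_time_s: float) -> bool:
    """Return True if the genomic distance between the primers (exonic
    amplicon plus intervening intron) lies within the extension ceiling,
    i.e. contaminating genomic DNA could in principle be amplified.
    """
    return genomic_span_nt <= max_extension_nt(rate_nt_per_s, hold_time_s)


if __name__ == "__main__":
    rate = 155.0   # nucleotides/second, as quoted in the text [25]
    hold = 30.0    # seconds, the annealing/extension step in Fig. 2
    print(f"Extension ceiling: {max_extension_nt(rate, hold):.0f} nt")

    # Hypothetical example: a 76 bp exonic amplicon split by introns
    # of two different sizes (illustrative values, not from the article).
    for intron in (1_000, 10_000):
        span = 76 + intron
        print(intron, intron_may_be_bridged(span, rate, hold))
```

Under these assumptions an intron of a few kilobases offers no protection against amplification from genomic DNA, so exon-spanning designs need to take the intron length and the cycling times into account together.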
5. Assay properties
5.1. Primers
The discussion in the previous section highlights the importance of considering assay design as an integration of amplicon and primer characteristics. Primers are the pivotal component of a qPCR assay: their raison d’être is to prime the specific amplification of a single target in a background of more-or-less related potential alternatives. Most
assays are designed to run under constant PCR conditions, consisting of a brief 95 °C melt followed by a single annealing and elongation step at 60 °C, as this lends itself to automation and the running of different assays using the same parameters. However, as most polymerases used for PCR work best at 72 °C, our advice is to consider designing assays where primers hybridise at higher temperatures (63 ± 2 °C). This also helps reduce the polymerisation time and so shortens the time required to complete a PCR run [30]. Since oligonucleotides are cheap, it is worth designing several primers for each assay and choosing the combination giving the lowest Cq and the least amount (ideally none) of primer dimer. Primer dimer formation can be a particular problem if the aim of the assay is ultimate sensitivity. The Ta characteristics of PCR primers can vary widely. Some assays are not sufficiently robust and fall apart quickly if they are not performed at the optimal Ta of the primers. The temperature profiles of three assays amplifying different targets are demonstrated in Fig. 2,
Fig. 4. Master mix-dependent effects of primer concentration. Amplification was carried out using five commercial master mixes and conditions, except for primer concentration, were as described in the legend to Fig. 2.
A. Primers CA-F and CA-R were used at either 300 nM or 600 nM final concentration to amplify fungal genomic DNA and PCR amplicons were detected with SYBR Green chemistry. Doubling the primer concentration has a small deleterious effect with most master mixes, with the maximum effect an increase in Cq of 1.3.
B. Primers BAR-1F and BAR-1R were used at either 300 nM or 600 nM final concentration to amplify bacterial genomic DNA and PCR amplicons were detected with SYBR Green chemistry. Doubling the primer concentration has an enhancing effect with all master mixes, with the maximum effect a decrease in Cq of 2.3.
where there is a clear difference in the results obtained for the three primer sets. The assay in panel A is optimal and characterised by a broad optimal Ta range with similar Cqs (maximum ΔCq = 0.53) over a 59 °C–64 °C range. The assay in panel C is not quite as robust, with a maximum ΔCq of 1.63, whereas the assay in panel E has a narrow optimum range between 59 °C and 60 °C, resulting in significantly higher Cqs away from that optimum (ΔCq = 8.93). This is quite separate from the specificity of the amplification reaction, where the melt curves reveal a single peak indicative of the assay remaining specific at the temperatures tested (panels B, D and F). Unfortunately, it is not possible to come up with a fail-safe prediction of which primers or combinations of primers will generate the most temperature-tolerant, efficient, or specific assays. Hence any in silico design must be followed by experimental validation and optimisation. This takes time and effort and costs money, but it is an integral part of performing a PCR assay, and the more thorough the experimental validation, the more likely it is that any subsequent results will be accurate and not the product of measurement error that has no true biological meaning. A major consideration is the fact that different assays perform differently with different master mixes, so there is no one-size-fits-all rule. This is shown in Fig. 3, where two primer sets show significantly different properties. Primer set A records a robust temperature gradient profile with seven master mixes, although one of the master mixes records much higher Cqs than the rest. In contrast, the results obtained with primer set B depend on what master mix was used: three master mixes have a profile that is similarly robust to primer set A, the other four record significantly less robust profiles, with one especially poor.
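The robustness comparison described above reduces to a simple calculation: find the annealing temperature with the lowest Cq and report the largest Cq difference (ΔCq) across the gradient. The following Python sketch shows one way this might be done. The Cq values in the example are invented for illustration and are not the data behind Fig. 2; the function names are hypothetical, and the fold-difference conversion assumes a stated amplification efficiency (1.0 meaning perfect doubling per cycle).

```python
from typing import Dict, Tuple


def gradient_summary(cq_by_temp: Dict[float, float]) -> Tuple[float, float]:
    """Return (optimal annealing temperature, maximum delta-Cq).

    The optimum is taken as the temperature giving the lowest Cq; the
    delta-Cq is the gap between that Cq and the highest Cq on the gradient.
    """
    best_temp = min(cq_by_temp, key=cq_by_temp.get)
    delta_cq = max(cq_by_temp.values()) - cq_by_temp[best_temp]
    return best_temp, delta_cq


def fold_difference(delta_cq: float, efficiency: float = 1.0) -> float:
    """Convert a delta-Cq into an approximate fold difference in signal,
    assuming each cycle multiplies the product by (1 + efficiency)."""
    return (1.0 + efficiency) ** delta_cq


if __name__ == "__main__":
    # Hypothetical gradient data: annealing temperature (degC) -> Cq.
    narrow_optimum_assay = {59.0: 24.1, 60.0: 24.3, 61.0: 26.0,
                            62.0: 28.4, 63.0: 30.9, 64.0: 33.0}
    temp, dcq = gradient_summary(narrow_optimum_assay)
    print(f"Optimum {temp} degC, max delta-Cq {dcq:.2f}, "
          f"approx. {fold_difference(dcq):.0f}-fold")
```

A small maximum ΔCq across the gradient corresponds to the broad, robust profile described for the first assay, whereas a large ΔCq flags an assay that depends on hitting a narrow temperature window.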
Primer concentrations for SYBR Green I assays tend to be lower (100–400 nM) than for probe-based assays (300–900 nM), but there are always exceptions that prove that rule. The effects of varying primer concentrations can differ dramatically between different primer pairs. Sometimes they extend over several orders of magnitude, with combinations of primer concentrations from 50 to 600 nM resulting in a Cq range of more than 20 cycles [31]. Generally, however, the effects are less pronounced as long as primer concentration does not dip below 100 nM, and are of concern mainly if sensitivity is the major consideration. This is demonstrated in Fig. 4, where two primer sets at two concentrations were used with five different master mixes. The results demonstrate that optimal primer annealing and concentration conditions are affected by the master mix, with different suppliers using different Mg2+ concentrations and adding undisclosed stabilisers to their buffers. Unfortunately, the results are not consistent between primers and master mixes. Master mixes B–E record higher Cqs with higher primer concentrations for one of the primer sets, whereas master mix A records lower Cqs (Fig. 4A). Yet master mixes B–E record lower Cqs with higher primer concentrations for the second set of primers (Fig. 4B). Hence annealing conditions may need to be re-evaluated when switching from one master mix to another. qPCR assays generally use symmetric primers. However, this results in the reactions typically slowing down and entering the plateau phase in a stochastic manner, because reannealing of the template strands gradually outcompetes primer and probe binding to the template
Fig. 5. Comparison of Clostridium difficile assays targeting the toxin gene tcdB. A. Location of the assays. Amplicon A (blue) is 149 bp, amplicon B (green) is 121 bp and amplicon C (black) is 76 bp. The primers for assay A are tcdB-FA: GTCCATCCTGTTTCCCAAGCAA, tcdB-RA: AGCCACACTTATCTATATATGACGTATTGGA, those for assay B are tcdB-FB: CAACTGAACAAGAAATGGCTAGCTT and tcdB-RB: CTCCTTGTCAACTACTATATTTTGAG, those for assay C are tcdB-FC: GCGGCAGCTTATCAAGATTT and tcdB-RC: TTCTTAAATCAGCTTCTATCAAATGG.
B. Mfold analysis indicates no secondary structure issues at the primer binding sites.
C. Amplification plots and melt curves for the PCR amplicons A and B. D. Amplification plots and melt curves for the PCR amplicons B and C. The blue, green and black data were obtained for amplicons A, B or C, respectively. Amplification conditions were as described in the legend to Fig. 2.
strands and sequesters the polymerase. This is a particular problem when the aim is to detect specific DNA targets down to alleles of single-copy genes in single cells. Asymmetric PCR potentially circumvents the problem of amplicon strand reannealing by using unequal primer concentrations. However, asymmetric amplification can be much less efficient and requires extensive optimisation to identify the proper primer ratios, the amounts of starting material, and the number of amplification cycles that can generate reasonable amounts of product for individual template/target combinations. This issue is addressed by LATE-PCR, which uses unequal primer concentrations but takes into account the effect of the actual primer concentrations on primer annealing [32]. It corrects for the fact that the optimal Ta of the limiting primer is often several degrees below the Ta of the excess primer and allows
the asymmetric PCR to proceed as efficiently as symmetric PCR [33]. Furthermore, single-stranded amplicons are generated with predictable kinetics for many cycles beyond the exponential phase. This permits uncoupling of primer annealing from product detection via a fluorescent probe. As a result, the Tm of the probe no longer needs to be higher than the Tm of either primer. This permits the use of low-Tm probes, which are inherently more allele-discriminating, generate lower background, and can be used at saturating concentrations without interfering with the efficiency of amplification [34].
5.2. Amplicons
Reliable amplification and precise quantification require complete doubling of target at each PCR cycle. Hence PCR amplicons detected by probes are generally short, as the suboptimal elongation temperature, generally 60 °C, does not always result in completed products that can be used as templates in further amplifications [35]. Nevertheless, the choice of amplicon size depends on usage and for some applications longer amplicons work better [36,37]. With SYBR Green, one might expect longer PCR amplicons to result in lower Cqs, since they can bind more SYBR Green molecules. However, this expectation is not always borne out in practice and shorter amplicons can record lower Cqs than longer ones [35]. This is illustrated by a comparison of the performance of three assays A, B and C, which target the same gene in the human intestinal pathogen Clostridium difficile. The assays amplify sequences 1532–1680, 3589–3710 and 1423–1498 respectively of the bacterial toxin B (tcdB) gene (NC_009089) (Fig. 5A). An Mfold analysis [38] suggests that the three PCR amplicons, comprising 149, 121 and 76 bp, are free from significant secondary structure, with all primer binding sites accessible for primer annealing (Fig. 5B). This is probably helped
by the high AT content of the bacterial DNA. Assay A records a Cq of 27.25, whereas assay B records a Cq of 26.14, i.e. the shorter amplicon is detected 1.1 cycles earlier than the longer amplicon (Fig. 5C). The melt curves are similar, showing a single peak for either assay, with amplicon B having a slightly lower peak at 75 °C, compared with the peak for amplicon A at 76 °C. In contrast, a comparison between amplicons B and C generates the expected result: B is significantly longer (121 bp) than amplicon C (76 bp) and records a Cq of 26.14, whereas assay C records a Cq of
Fig. 6. Effects of different master mixes on amplification. Amplification reactions were carried out as described in the legend to Fig. 2.
A. A qPCR assay targeting fungal DNA was used with two sets of forward and reverse primers, which differ mainly at their 3′-ends. The PCR amplicon has no secondary structure issues at the primer binding sites.
B. When used with master mix A, the maximum ΔCq between the primer combinations was 4.64, the equivalent of a 25-fold difference in sensitivity. The assays using CA-R recorded the lowest Cqs, whereas those using CA-RB recorded higher Cqs. When used with master mix B, the maximum ΔCq between the primer combinations was 2.98, the equivalent of an 8-fold difference in sensitivity. The assay using CA-F/CA-R recorded the lowest Cqs, with the other three broadly equivalent.
C. Melt curves for master mixes A (green) and B (blue) show that the specificity of the assays is the same for all primer combinations
27.87, i.e. the longer PCR amplicon is detected 1.7 cycles earlier than the short one. In general, too short an amplicon (< 80 bp) in SYBR Green assays may result in difficulties when differentiating amplicon and primer dimer(s) and can result in later Cq readings. Hence it is a good idea to keep SYBR Green amplicons a little longer (80–150 bp) than those targeted by probe-based assays (60–90 bp). In addition, SYBR Green has a greater affinity for AT-rich than GC-rich DNA, hence GC-rich PCR amplicons may record higher Cqs than AT-rich ones. Melt curve analysis is also not always what it seems, since whereas SYBR Green intercalates at low dye:base pair ratios, at higher
ratios its conformation changes and it interacts with the minor groove [39]. It is this interaction that results in a significant increase in fluorescence. An important characteristic of SYBR Green binding to DNA is that in PCR the dye:base pair ratio is not constant, since it changes with cycle number as more double stranded DNA is produced. Consequently, melt curve analysis can be influenced by the number of cycles and the amount of DNA present after amplification. Amplicons should always be checked for secondary structure, as areas of extensive secondary structure can be an important cause of reduced amplification efficiency. Possible structures must be checked using the actual reaction conditions, especially the Ta, salt and Mg2+ ion concentrations and the simplest way to check secondary structures is by using the Mfold website. The hybridisation kinetics of the annealing reaction will favour intramolecular binding, obstruct primer binding, reduce priming efficiency and hence reduce the amplification efficiency [40]. Hence the binding regions for primers should be completely accessible; if the predictions suggest any secondary structures at those sites, the amplicon should, where possible, be moved. If the price of avoiding secondary structures at the primer binding sites is a longer amplicon, then that may be a price worth paying, especially with the introduction of the latest fast reagents. Of course, the secondary structures are only predictive, and it may also be necessary to consider the sequences directly upstream or downstream from the amplicon, as they could interfere with the initial stages of the PCR assay [41]. The GC content of amplicons should be as close to 50% as possible and the inclusion of Guanine (G) repeats should be avoided, since they may prevent complete strand dissociation and so also reduce amplification efficiency. 
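The sequence-composition guidelines in this paragraph (GC content close to 50% and no guanine repeats) are easy to screen for before any structure prediction is run. The following Python sketch is a minimal illustration using only the standard library. The function names and the run-length threshold of four are assumptions made for the example and are not recommendations from this article, and the check is no substitute for a secondary-structure prediction under the actual reaction conditions.

```python
import re


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence (case-insensitive)."""
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def longest_run(seq: str, base: str = "G") -> int:
    """Length of the longest uninterrupted run of `base` in the sequence."""
    runs = re.findall(f"{base}+", seq.upper())
    return max((len(r) for r in runs), default=0)


def composition_report(name: str, seq: str, max_g_run: int = 4) -> str:
    """One-line summary; `max_g_run` is an arbitrary illustrative threshold."""
    g_run = longest_run(seq, "G")
    flag = "check G repeat" if g_run >= max_g_run else "ok"
    return (f"{name}: {len(seq)} nt, GC {gc_percent(seq):.0f}%, "
            f"longest G run {g_run} ({flag})")


if __name__ == "__main__":
    # Primer sequences taken from the legend to Fig. 5.
    primers = {
        "tcdB-FA": "GTCCATCCTGTTTCCCAAGCAA",
        "tcdB-FC": "GCGGCAGCTTATCAAGATTT",
    }
    for name, seq in primers.items():
        print(composition_report(name, seq))
```

The same functions can be applied to a full amplicon sequence once it has been retrieved from the relevant accession.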
However, it is important to realise that these are general rules and in practice, an assay that amplifies a longer product can perform better than a shorter one, many bacterial PCR amplicons will be AT rich and sometimes secondary structures just cannot be avoided. Consequently, although there are guidelines for everything, in practice things often work out differently from what theory asserts and in the end, all that matters is how a design works in the laboratory. If all else fails, as long as the assay is specific, most other conditions can be tweaked to achieve a satisfactory efficiency. Most importantly, if despite all best efforts an assay is only 85% efficient, then it is acceptable to report the results together with supporting evidence to show standard curves, report precision and limits of quantification and of detection. Thus, the limitations of such an approach can be determined and the validity of associated conclusions evaluated.
Fig. 7. Linearity of a well-designed qPCR assay. Amplification reactions were carried out as described in the legend to Fig. 2. Ten-fold serial dilutions of target DNA were subjected to amplification with the Asp-F and Asp-R primers. The variability recorded by the four replicates increases with decreasing target copy number, until the nominal single copy target fails to record a Cq in 4/5 replicates.
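A standard curve of the kind referred to above, built from ten-fold serial dilutions, can be summarised by a straight-line fit of Cq against the logarithm of the template quantity. The sketch below is one way this might be done in plain Python. It uses the conventional relationship efficiency = 10^(−1/slope) − 1, which is standard qPCR practice rather than a formula stated in this article; the dilution data are invented for illustration and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_standard_curve(copies: Sequence[float],
                       cqs: Sequence[float]) -> Tuple[float, float, float, float]:
    """Least-squares fit of Cq against log10(copies).

    Returns (slope, intercept, r_squared, efficiency), where efficiency
    is 10 ** (-1 / slope) - 1, so 1.0 corresponds to perfect doubling.
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in cqs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency


if __name__ == "__main__":
    # Hypothetical ten-fold dilution series (copies per reaction -> mean Cq).
    copies = [1e6, 1e5, 1e4, 1e3, 1e2, 1e1]
    cqs = [17.1, 20.5, 23.9, 27.4, 30.8, 34.5]
    slope, intercept, r2, eff = fit_standard_curve(copies, cqs)
    print(f"slope {slope:.2f}, intercept {intercept:.2f}, "
          f"R^2 {r2:.3f}, efficiency {eff * 100:.0f}%")
```

Reporting the slope, intercept, R² and derived efficiency alongside the replicate variability at the lowest dilutions gives readers the information needed to judge an assay whose efficiency falls short of 100%.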
6. Assay optimisation
The precision and reliability of a qPCR assay depend on rigorous optimisation of every aspect of the PCR reaction. Some assays may not require extensive optimisation, but this is not the case for qPCR assays that perform challenging measurements such as the precise quantification of small differences in nucleic acid quantities (be they DNA or RNA), reliable detection of pathogens with good sensitivity or specific discrimination of polymorphisms or mutations. Thorough optimisation of the PCR protocol, reagents, instrumentation and analysis methodologies is a critical prerequisite for obtaining valid, reproducible results with maximum specificity and sensitivity. Incidentally, sensitivity is not dependent on a given Cq value; high sensitivity is achieved if an assay can reliably amplify and detect low copy number targets. While high detection sensitivity may not be necessary for a given experiment, it is worth remembering that a well-optimised qPCR is able to amplify single molecules of DNA and this ability usually goes hand in hand with high efficiency and quantitative precision. The results shown in Fig. 6 demonstrate the effects of using an optimised primer/master mix combination. The assay targets fungal DNA, and the forward and reverse primers bind to regions of the PCR amplicon that are free from secondary structure. There are two forward and two reverse primers and, when they are used in all possible combinations, it is evident that with master mix A the CA-F/CA-RB combination performs worst, as the Cqs translate into assays that are between 8- and 25-fold less sensitive than those using the CA-R primer. In contrast, only the CA-F/CA-R combination works optimally with master mix B, with the other three recording similar, higher Cqs. The melt curves obtained using the two master mixes are slightly different, which is not unexpected as they contain different ingredients.
Quantification by qPCR assumes a linear relationship between the logarithm of the initial template quantity and the Cq value obtained during amplification. This permits calculation of an assay’s amplification efficiency and delineation of its limits of detection and quantification [42]. The hallmarks of an optimised qPCR assay are:
• High amplification efficiency (95–105%)
• Linear standard curve (R² > 0.980)
• High precision between replicates
• Consistency across replicate experiments
• No primer dimers
• Wide dynamic range
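The fold differences in sensitivity quoted for the primer and master mix combinations are expressed in terms of differences in Cq. As a rough illustration, and assuming that the reactions being compared amplify with the same efficiency, a Cq difference of ΔCq corresponds to a (1 + E)^ΔCq-fold difference in apparent starting quantity, where E is the efficiency expressed as a fraction (1.0 for perfect doubling). The function names below are hypothetical, and the numbers are chosen only to match the 8- to 25-fold range mentioned in the text; they are not measured values.

```python
import math


def fold_difference(delta_cq: float, efficiency: float = 1.0) -> float:
    """Fold difference in apparent starting quantity for a given Cq difference.

    efficiency is a fraction: 1.0 means perfect doubling in every cycle.
    """
    return (1.0 + efficiency) ** delta_cq


def delta_cq_for_fold(fold: float, efficiency: float = 1.0) -> float:
    """Cq difference that corresponds to a given fold difference."""
    return math.log(fold) / math.log(1.0 + efficiency)


# With perfect doubling, an 8-fold difference is 3 cycles and a
# 25-fold difference is log2(25) cycles.
for fold in (8.0, 25.0):
    print(f"{fold:.0f}-fold ~ {delta_cq_for_fold(fold):.2f} cycles")
```

If the efficiencies of the two reactions differ, this simple conversion no longer holds and a standard curve for each assay is needed.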
Fig. 8. Comparability of two optimised qPCR assays targeting the same gene, HIF-1α (NM_181054.2). Amplification reactions were carried out as described in the legend to Fig. 2, except that annealing was carried out using a temperature gradient from 55 °C to 65 °C.
A. Exon/intron structure of the gene, with assay 1 (HIF-AF: CCGAGGAAGAACTATGAA and HIF-AR: TGGTTACTGTTGGTATCA) amplifying sequences in exons 5 and 6 and assay 2 (HIF-BF: AAGAACTTTTAGGCCGCTCA and HIF-BR: TGTCCTGTGGTGACTTGTCC) amplifying sequences in exons 7 and 8.
B. There are no secondary structure issues at the primer binding sites.
C. Both assays are robust, with the 55 °C–65 °C gradient recording similar Cqs of 24.68 ± 0.07 and 24.32 ± 0.12, respectively. Melt curve analysis shows a single peak.
D. Standard curves are comparable, showing linearity over at least five orders of magnitude.
Amplification efficiency is best determined by generating a standard curve using serial dilutions of a template and determining the slope from the linear regression of a plot of Cq (y-axis) vs log [quantity] (x-axis) [43]. If perfect doubling occurs with each amplification cycle, the spacing of the fluorescence curves will be determined by the equation 2^n = dilution factor, where n is the number of cycles between curves. For example, with a 10-fold serial dilution of DNA, 2^n = 10. Therefore, n = log2(10) ≈ 3.32 and the Cq values should increase by approximately 3.32 cycles for every ten-fold dilution. An acceptable evaluation of PCR efficiency requires a minimum of three replicates and four, ideally five, orders of magnitude of template concentration. This is because even if the assay is 100% efficient, variability in the dilutions will result in a range of efficiencies when testing a dilution series of a single log. A slope of −3.32 reflects an efficiency of around 100%. A PCR reaction with lower efficiency will have lower sensitivity [5,44]. The R² value is the square of the correlation between the response values and the predicted response values and measures how successful a fit is in explaining the variation of the data. R² can take on any value between 0 and 1, with a value closer to 1 indicating that a greater proportion of variance is accounted for by the model. For example, an R² value of 0.998 means that the fit explains 99.8% of the total variation in the data about the average. An R² value > 0.980 provides good confidence in correlating Cq and target copy number. The amplification plots and standard curve shown in Fig. 7 illustrate a typical optimised qPCR assay. It shows good linearity over a wide dynamic range from 1 × 10⁸ copies all the way down to a nominal single copy. A wide dynamic range is one of the key features of a well-designed qPCR assay and ensures that target copy numbers varying by many orders of magnitude can be accurately quantified.
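The standard curve analysis described above can be sketched in a few lines of Python. The sketch fits Cq against log10 quantity by ordinary least squares, reports R², and converts the slope to an efficiency using the commonly used relation E = 10^(−1/slope) − 1, which gives E = 1 (100%) for a slope of about −3.32, in agreement with the text. The function names and the example Cq values are hypothetical and serve only as an illustration; they are not data from this article.

```python
def fit_standard_curve(log10_quantity, cq):
    """Least-squares fit of Cq (y) against log10 quantity (x).

    Returns (slope, intercept, r_squared).
    """
    n = len(cq)
    if n < 2 or n != len(log10_quantity):
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(log10_quantity) / n
    mean_y = sum(cq) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantity)
    syy = sum((y - mean_y) ** 2 for y in cq)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_quantity, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def efficiency_from_slope(slope):
    """Amplification efficiency as a fraction (1.0 = 100%)."""
    return 10.0 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series over six orders of magnitude.
log10_copies = [7, 6, 5, 4, 3, 2]
cq_values = [14.1, 17.5, 20.8, 24.2, 27.4, 30.9]

slope, intercept, r2 = fit_standard_curve(log10_copies, cq_values)
print(f"slope = {slope:.3f}, R^2 = {r2:.4f}, "
      f"efficiency = {100 * efficiency_from_slope(slope):.1f}%")
```

In practice each dilution would be run in replicate and every replicate Cq entered as its own point, so that the fit reflects the variability at low copy numbers.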
If assays are well-designed and properly optimised, it should be possible to get comparable results from separate assays that use the same master mix to detect the same target gene or genome but amplify different regions of that target. This implies that if two laboratories use optimised assays to amplify, for example, the same mRNA, their results should be equivalent. The example in Fig. 8 demonstrates this. Fig. 8A shows the location of assays amplifying exons 5/6 (A) or exons 7/8 (B) of transcript variant 2 of the hypoxia inducible factor 1 alpha subunit gene (NM_181054). An analysis of the secondary structures indicates that there are none at the primer annealing sites (Fig. 8B). The amplification plots in Fig. 8C, obtained over a temperature gradient from 55 °C to 65 °C on a Bio-Rad CFX, confirm that both assays are robust and record comparable Cqs, with a single peak detected in the melt curve analysis. Standard curves are also similar, with comparable amplification efficiencies (Fig. 8D). However, it must be remembered that while general fold-change differences, especially large ones, may be easier to reproduce, it is far more challenging to obtain comparable absolute quantities. This is because steps upstream of amplification, such as nucleic acid extraction and reverse transcription, can contribute considerable uncertainty [45] and employing comparable calibration standards is far from trivial.
7. Conclusions
Knowledgeable and consistent assay design is at the heart of any research project designed to quantify nucleic acids. It must be carried out with care, but can be simplified by following a straightforward workflow, as detailed above. Reliable qPCR demands good primers. This usually means absolute specificity, absence of hairpin structures or cross-dimerisation potential, and temperature tolerance. Good assay design must consider amplicon structure and ensure that primer target sites are free from secondary structure. There are numerous opinions and guidelines available; an internet search for the terms “qPCR assay design” lists 695,000 pages. However, many of these are based on myths or may have been appropriate for legacy PCR but require subtle (or not so subtle) modifications for use with qPCR. Each “new” assay must be properly validated, with in silico validation acting as an initial filter to screen designs and dismiss those that do not fulfil clearly defined quality criteria. Empirical optimisation and validation are an essential, yet frequently neglected, part of any qPCR experiment. This applies both to newly designed assays and to assays obtained from elsewhere. Finally, results of optimisation and validation processes should be reported when qPCR data are published.
References
[1] S.A. Bustin, The reproducibility of biomedical research: sleepers awake, Biomol.
Detect. Quantif. 2 (2014) 35–42.
[2] K. Cankar, D. Stebih, T. Dreo, J. Zel, K. Gruden, Critical points of DNA quantification by real-time PCR – effects of DNA extraction method and sample matrix on quantification of genetically modified organisms, BMC Biotechnol. 6 (2006) 37.
[3] C.C. Raggi, P. Verderio, M. Pazzagli, E. Marubini, L. Simi, P. Pinzani, et al., An Italian program of external quality control for quantitative assays based on real-time PCR with Taq-Man probes, Clin. Chem. Lab. Med. 43 (2005) 542–548.
[4] G.S. Buzard, D. Baker, M.J. Wolcott, D.A. Norwood, L.A. Dauphin, Multi-platform
comparison of ten commercial master mixes for probe-based real-time polymerase
chain reaction detection of bioterrorism threat agents for surge preparedness,
Forensic Sci. Int. 223 (2012) 292–297.
[5] S. Alemayehu, K.C. Feghali, J. Cowden, J. Komisar, C.F. Ockenhouse, E. Kamau,
Comparative evaluation of published real-time PCR assays for the detection of
malaria following MIQE guidelines, Malar. J. 12 (2013) 277.
[6] S. Lu, A.P. Smith, D. Moore, N.M. Lee, Different real-time PCR systems yield different gene expression values, Mol. Cell. Probes 24 (2010) 315–320.
[7] E. Picard-Meyer, C. Peytavin de Garam, J.L. Schereffer, C. Marchal, E. Robardet,
F. Cliquet, Cross-platform evaluation of commercial real-time SYBR green RT-PCR
kits for sensitive and rapid detection of European bat Lyssavirus type 1, BioMed Res.
Int. 2015 (2015) 839518.
[8] S.A. Bustin, V. Benes, J.A. Garson, J. Hellemans, J. Huggett, M. Kubista, et al., The
MIQE guidelines: minimum information for publication of quantitative real-time
PCR experiments, Clin. Chem. 55 (2009) 611–622.
[9] S.A. Bustin, J.F. Beaulieu, J. Huggett, R. Jaggi, F.S. Kibenge, P.A. Olsvik, et al.,
MIQE precis: practical implementation of minimum standard guidelines for fluorescence-based quantitative real-time PCR experiments, BMC Mol. Biol. 11
(2010) 74.
[10] S.A. Bustin, V. Benes, J.A. Garson, J. Hellemans, J. Huggett, M. Kubista, et al.,
Primer sequence disclosure: a clarification of the MIQE guidelines, Clin. Chem. 57
(2011) 919–921.
[11] J.F. Huggett, C.A. Foy, V. Benes, K. Emslie, J.A. Garson, R. Haynes, et al., The
digital MIQE guidelines: minimum information for publication of quantitative digital PCR experiments, Clin. Chem. 59 (2013) 892–902.
[12] S. Bustin, Transparency of reporting in molecular diagnostics, Int. J. Mol. Sci. 14
(2013) 15878–15884.
[13] J.R. Dijkstra, L.C. van Kempen, I.D. Nagtegaal, S.A. Bustin, Critical appraisal of quantitative PCR results in colorectal cancer research: can we rely on published qPCR results? Mol. Oncol. 8 (2014) 813–818.
[14] A.M. Abdel Nour, E. Azhar, G. Damanhouri, S.A. Bustin, Five years MIQE guidelines: the case of the Arabian countries, PLoS One 9 (2014) e88266.
[15] J. Huggett, S.A. Bustin, Standardisation and reporting for nucleic acid quantification, Accredit. Qual. Assur. 16 (2011) 399–405.
[16] S.A. Bustin, T. Nolan, Improving the reliability of peer-reviewed publications: we
are all in it together, Biomol. Detect. Quantif. 7 (2016) A1–5.
[17] S.A. Bustin, T. Nolan, Talking the talk, but not walking the walk: RT-qPCR as a
paradigm for the lack of reproducibility in molecular research, Eur. J. Clin. Invest.
47 (2017) 756–774.
[18] J.M. Robertson, J. Walsh-Weller, An introduction to PCR primer design and optimization of amplification reactions, Methods Mol. Biol. 98 (1998) 121–154.
[19] J. SantaLucia, Physical principles and visual-OMP software for optimal PCR design,
Methods Mol. Biol. 402 (2007) 3–34.
[20] J.J. SantaLucia, D. Hicks, The thermodynamics of DNA structural motifs, Annu.
Rev. Biophys. Biomol. Struct. 33 (2004) 415–440.
[21] N. Muñoz, F.X. Bosch, S. de Sanjosé, R. Herrero, X. Castellsagué, K.V. Shah, et al.,
Epidemiologic classification of human papillomavirus types associated with cervical
cancer, N. Engl. J. Med. 348 (2003) 518–527.
[22] H. Ikenberg, Laboratory diagnosis of human papillomavirus infection, Curr. Probl.
Dermatol. 45 (2014) 166–174.
[23] S.A. Bustin, H.H. Kessler, Amplification and detection methods, in: H.H. Kessler
(Ed.), Molecular Diagnostics of Infectious Diseases, 3rd edition, De Gruyter, Berlin,
2014, pp. 63–84.
[24] Z. Li, P. Bai, D. Peng, B. Long, L. Zhang, W. Liang, Influences of different RT-qPCR
methods on forensic body fluid identification by microRNA, Forensic Sci. Int.:
Genet. Suppl. Series 5 (2015) e295–e297.
[25] J.L. Montgomery, N. Rejali, C.T. Wittwer, Stopped-flow DNA polymerase assay by
continuous monitoring of dNTP incorporation by fluorescence, Anal. Biochem. 441
(2013) 133–139.
[26] J.F. Fryer, S.A. Baylis, A.L. Gottlieb, M. Ferguson, G.A. Vincini, V.M. Bevan, et al.,
Development of working reference materials for clinical virology, J. Clin. Virol. 43
(2008) 367–371.
[27] T.K. Lee, S.R. Murthy, N.X. Cawley, S. Dhanvantari, S.M. Hewitt, H. Lou, et al., An
N-terminal truncated carboxypeptidase E splice isoform induces tumor growth and
is a biomarker for predicting future metastasis in human cancers, J. Clin. Invest. 121
(2011) 880–892.
[28] R. Torelli, M. Sanguinetti, A. Moody, L. Pagano, M. Caira, E. De Carolis, et al.,
Diagnosis of invasive aspergillosis by a commercial real-time PCR assay for
Aspergillus DNA in bronchoalveolar lavage fluid samples from high-risk patients
compared to a galactomannan enzyme immunoassay, J. Clin. Microbiol. 49 (2011)
4273–4278.
[29] K.B. Brajao de Oliveira, R. Losi Guembarovski, A.M.F. Losi Guembarovski, A.C. da
Silva do Amaral Herrera, W.J. Sobrinho, C. Batista Ariza, M.A. Ehara Watanabe,
CXCL12, CXCR4 and IFNγ genes expression: implications for proinflammatory microenvironment of breast cancer, Clin. Exp. Med. 13 (2013) 211–219.
[30] S.A. Bustin, How to speed up the polymerase chain reaction, Biomol. Detect.
Quantif. 12 (2017) 10–14.
[31] T. Nolan, R.E. Hands, S.A. Bustin, Quantification of mRNA using real-time RT-PCR,
Nat. Protoc. 1 (2006) 1559–1582.
[32] J.A. Sanchez, K.E. Pierce, J.E. Rice, L.J. Wangh, Linear-after-the-exponential
(LATE)-PCR: an advanced method of asymmetric PCR and its uses in quantitative
real-time analysis, Proc. Natl. Acad. Sci. U. S. A. 101 (2004) 1933–1938.
[33] K.E. Pierce, J.A. Sanchez, J.E. Rice, L.J. Wangh, Linear-After-The-Exponential
(LATE)-PCR: primer design criteria for high yields of specific single-stranded DNA
and improved real-time detection, Proc. Natl. Acad. Sci. U. S. A. 102 (2005)
8609–8614.
[34] K.E. Pierce, L.J. Wangh, LATE-PCR and allied technologies: real-time detection
strategies for rapid, reliable diagnosis from single cells, Methods Mol. Biol. 688
(2011) 47–66.
[35] F. Debode, A. Marien, E. Janssen, C. Bragard, G. Berben, The influence of amplicon
length on real-time PCR results, Biotechnol. Agron. Soc. Environ. 21 (2017) 3–11.
[36] P.J. Contreras, H. Urrutia, K. Sossa, A. Nocker, Effect of PCR amplicon length on
suppressing signals from membrane-compromised cells by propidium monoazide
treatment, J. Microbiol. Methods 87 (2011) 89–95.
[37] B.K. Martin, S. Raurich, M. Garriga, T. Aymerich, Effect of amplicon length in propidium monoazide quantitative PCR for the enumeration of viable cells of Salmonella in cooked ham, Food Anal. Methods 6 (2013) 683–690.
[38] M. Zuker, Mfold web server for nucleic acid folding and hybridization prediction,
Nucleic Acids Res. 31 (2003) 3406–3415.
[39] H. Zipper, H. Brunner, J. Bernhagen, F. Vitzthum, Investigations on DNA intercalation and surface binding by SYBR Green I, its structure determination and
methodological implications, Nucleic Acids Res. 32 (2004) e103.
[40] Y. Gao, L.K. Wolf, R.M. Georgiadis, Secondary structure effects on DNA hybridization kinetics: a solution versus surface comparison, Nucleic Acids Res. 34
(2006) 3370–3377.
[41] J. Wilhelm, M. Hahn, A. Pingoud, Influence of DNA target melting behavior on real-time PCR quantification, Clin. Chem. 46 (2000) 1738–1743.
[42] A. Forootan, R. Sjöback, J. Björkman, B. Sjögreen, L. Linz, M. Kubista, Methods to
determine limit of detection and limit of quantification in quantitative real-time
PCR (qPCR), Biomol. Detect. Quantif. 12 (2017) 1–6.
[43] S.A. Bustin, Why the need for qPCR publication guidelines?–The case for MIQE,
Methods 50 (2010) 217–226.
[44] C. Hilscher, W. Vahrson, D.P. Dittmer, Faster quantitative real-time PCR protocols
may lose sensitivity and show increased variability, Nucleic Acids Res. 33 (2005)
e182.
[45] R. Sanders, D.J. Mason, C.A. Foy, J.F. Huggett, Considerations for accurate gene
expression measurement by reverse transcription quantitative PCR when analysing
clinical samples, Anal. Bioanal. Chem. 406 (2014) 6471–6483.