Validation of Random Library for Aptamer Selection

By Gregory Penner (NeoVentures Biotechnology) and Natasha Paul, PhD (TriLink Biotechnologies, LLC)

As an oligo manufacturer specializing in randomers, TriLink is committed to providing all aptamer identification groups with the best starting libraries possible by employing its unique A/C/G/T blending and purification expertise. NeoVentures is committed to providing its clients with the best aptamers possible for all targets.


One of the key assumptions of aptamer selection is that the composition of the starting random library is in fact random. Any deviation from randomness skews the representation of sequence space, which can in turn limit the search for aptamers that bind the desired target with high specificity. NeoVentures and TriLink came together to confirm the randomness of the TriLink library manufacturing process.


TriLink prepared a DNA library template containing a 40mer random region flanked by 20mer fixed-sequence regions. The 40mer random region was prepared to contain an equal ratio of all four nucleotides (A/C/G/T). Once the library was prepared, NeoVentures and TriLink characterized its randomness. In particular, two measures of the deviation from randomness were assessed:

  1. Deviation from expected nucleotide frequency.
  2. Deviation in regard to nucleotide coupling.

First, the DNA library was analyzed using TriLink's in-house enzymatic digestion assay to assess the overall nucleotide composition (Figure 1A). This digestion assay is coupled with an HPLC analysis step to determine the relative number of A/C/G/T nucleotides across the entire sequence. Next, NeoVentures sequenced 61 unselected random sequences from this library and analyzed the sequences using their proprietary consensus motif software to assess randomness (Figure 1B). In contrast to the enzymatic digestion method, this approach provided a deeper look at the random region itself. The same sequencing and digestion methods were applied to a DNA library obtained from another commercial source ('Other').

Figure 1A. Distribution of nucleotide frequency by enzymatic digestion.
Figure 1B. Distribution of nucleotide frequency by sequencing.

When the TriLink library was analyzed using the enzymatic digestion assay (Figure 1A), the observed frequencies correlated tightly with the expected frequencies for the total nucleotides within the sequence. Sequencing of 2,431 nucleotides within the random region (Figure 1B) likewise showed that the TriLink library did not deviate significantly from the expected nucleotide frequency. In comparison, both digestion and sequencing analyses of the 'Other' library revealed significant deviations from the expected random nucleotide frequency, with over-representation of T and under-representation of A.
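As an illustration of the first measure, the sketch below tests whether the A/C/G/T counts in a set of sequenced random regions depart from the expected equal ratio. It uses a chi-square goodness-of-fit test, which is a stand-in chosen for this example; the analysis described above was performed with NeoVentures' proprietary software, and its exact statistic is not stated here. The function name and the input format (a list of random-region strings) are assumptions.

```python
import math
from collections import Counter

BASES = "ACGT"


def nucleotide_chi_square(sequences):
    """Chi-square goodness-of-fit test against an equal A/C/G/T ratio.

    Returns (counts, chi2, p_value). Hypothetical helper for illustration.
    """
    # Pool the counts of each base over all sequenced random regions.
    counts = Counter(
        base for seq in sequences for base in seq.upper() if base in BASES
    )
    total = sum(counts[b] for b in BASES)
    if total == 0:
        raise ValueError("no A/C/G/T characters found in the input")

    # Under an equal ratio, each base is expected 25% of the time.
    expected = total / len(BASES)
    chi2 = sum((counts[b] - expected) ** 2 / expected for b in BASES)

    # Four categories give 3 degrees of freedom, for which the chi-square
    # upper-tail probability has a closed form (valid for 3 df only).
    p_value = math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(
        2 * chi2 / math.pi
    ) * math.exp(-chi2 / 2)
    return {b: counts[b] for b in BASES}, chi2, p_value


if __name__ == "__main__":
    # Toy input, not data from the study.
    toy_library = ["ACGTTGCAACGT", "TTGACCAGTGCA", "GATCGATCCTAG"]
    counts, chi2, p_value = nucleotide_chi_square(toy_library)
    print(counts, round(chi2, 3), round(p_value, 3))
```

A small p-value would point to a skewed base composition, such as the T excess and A deficit reported for the 'Other' library. A large p-value only means that no deviation was detected at the given sample size.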

Next, NeoVentures determined the expected frequency of each trinucleotide motif based on the observed frequency of the individual nucleotides within each library. For each of the 64 possible three-nucleotide motifs, this expected frequency was subtracted from the observed frequency and the difference was divided by the theoretical standard deviation to generate a Z value. The Z values were converted into probability estimates (p), and the distribution of these values is graphed in Figure 2.
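The calculation described above can be sketched as follows. This is a minimal illustration, not the NeoVentures software: it assumes that the expected frequency of a motif is the product of the observed frequencies of its three bases, that the theoretical standard deviation is the binomial one, and that the probability estimate is the cumulative normal probability of Z. It also treats overlapping windows as independent, which is a simplification. The function name and the input format are hypothetical.

```python
import math
from collections import Counter
from itertools import product

BASES = "ACGT"


def trinucleotide_z_scores(sequences):
    """Return {motif: (z, p)} for all 64 trinucleotide motifs."""
    seqs = [s.upper() for s in sequences]

    # Observed single-nucleotide frequencies within this library.
    base_counts = Counter(b for s in seqs for b in s if b in BASES)
    total_bases = sum(base_counts.values())
    if total_bases == 0:
        raise ValueError("no A/C/G/T characters found in the input")
    freq = {b: base_counts[b] / total_bases for b in BASES}

    # Observed motif counts from a sliding window of width 3.
    motifs = ["".join(t) for t in product(BASES, repeat=3)]
    window_counts = Counter(
        s[i:i + 3] for s in seqs for i in range(len(s) - 2)
    )
    n_windows = sum(window_counts[m] for m in motifs)

    results = {}
    for motif in motifs:
        # Expected probability if bases are coupled independently.
        p_motif = freq[motif[0]] * freq[motif[1]] * freq[motif[2]]
        expected = n_windows * p_motif
        # Binomial standard deviation (an assumption of this sketch).
        sd = math.sqrt(n_windows * p_motif * (1 - p_motif))
        if sd == 0:
            z = 0.0  # motif cannot occur, so there is nothing to test
        else:
            z = (window_counts[motif] - expected) / sd
        # Cumulative normal probability of Z, between 0 and 1.
        p = 0.5 * (1 + math.erf(z / math.sqrt(2)))
        results[motif] = (z, p)
    return results


if __name__ == "__main__":
    # Toy input, not data from the study.
    toy_library = ["ACGTTGCAACGTAGCT", "TTGACCAGTGCATCGA"]
    scores = trinucleotide_z_scores(toy_library)
    for motif in ("AAA", "ACG", "TTG"):
        z, p = scores[motif]
        print(motif, round(z, 2), round(p, 3))
```

Because the expected values are built from each library's own base frequencies, a motif flagged here is over- or under-represented beyond what a skewed base composition alone would explain. Probability values close to 0 or 1 mark the motifs that deviate most.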

Figure 2. Deviation in regard to nucleotide coupling.

In the TriLink library, 60 of the 64 possible trinucleotide motifs were observed at a frequency that indicated no deviation from the mean. The remaining four motifs had probability values that stayed within the range p = 0.01 to p = 0.99, which is within acceptable limits for sampling error.

The analysis of the 'Other' library showed 15 motifs at probability levels beyond those observed for the TriLink library. Because the expected motif frequencies were calculated from each library's own observed nucleotide frequencies, this indicates that the 'Other' library is also biased in motif frequency, even after its skewed overall nucleotide distribution is taken into account. The TriLink library did not exhibit statistically significant bias in the manner in which nucleotides are coupled.

The TriLink library passed both tests for random distribution.

To learn more about this analysis, please contact Natasha Paul at [email protected]. To learn more about TriLink's aptamer offering, or to price and order custom libraries, visit the TriLink website. To learn more about next generation sequencing and custom aptamer selection, or to place an order for a custom aptamer selection project, contact NeoVentures at [email protected].