Molecular biologists have long dreamed of being able to manipulate DNA in living cells with ease and precision, and by now that dream is nearly a reality. TALE-based designer proteins, introduced just a few years ago, are arguably the most user-friendly and precise DNA-directed tools yet invented. Designer TALEs (transcription-activator-like effectors) are based on natural TALE proteins that are produced by some plant-infecting bacteria. These natural TALEs help bacteria subvert their plant hosts by binding to specific sites on plant DNA and boosting the activity of certain genes, thereby enhancing the growth and survival of the invading bacteria. Scientists have found that they can easily engineer the DNA-grabbing segment of TALE proteins to bind precisely to a DNA sequence of interest. Typically they join that DNA-binding segment to another protein segment that can perform some desired function at the site of interest, for example an enzyme fragment that cuts through DNA. Collectively, the Barbas Laboratory and others in this field have already engineered thousands of these powerful TALE-based DNA-editing proteins.
However, TALE-based DNA editing has been seen as having a significant limitation. Virtually all the natural TALE proteins discovered so far target DNA sequences that begin with the nucleoside thymidine, the letter “T” in the four-letter DNA code. Structural studies have hinted that natural TALE proteins can’t bind well to DNA without that initial T. Molecular biologists have thus widely assumed that the same “T restriction” rule applies to any artificial TALE protein they might engineer.
“Yet no one has investigated thoroughly whether that initial thymidine is truly required for the variety of TALE designer proteins and enzymes that now exist,” said Brian M. Lamb, a research associate in the Barbas Laboratory.
Recent structural data indicate that there is an interaction between the N-terminal domain (NTD) and a 5′ T of the target sequence. A survey of the recent TALE nuclease (TALEN) literature yielded conflicting data regarding the importance of the first base of the target sequence, the N0 residue.
Additionally, there have been no studies regarding the impact of the N0 base on the activities of TALE recombinases (TALE-Rs). Recently, the Barbas Laboratory quantified the impact of the N0 base in the binding regions of TALE-Rs, TALE transcription factors (TALE-TFs), TALE DNA-binding domains expressed as fusions with maltose binding protein (MBP-TALEs) and TALENs. Each of these TALE platforms has a distinct N- and C-terminal architecture, but all showed the highest activity when the N0 residue was a thymidine. To simplify the rules for constructing effective TALEs in these platforms, and to allow precision genome engineering at arbitrary DNA sequences, the researchers devised a structure-guided activity selection using their recently developed TALE-R system. Novel NTD sequences were identified that provided highly active and selective TALE-R activity on TALE binding sites with a 5′ G, and additional domain sequences were selected that permitted general targeting of any 5′ N0 residue. These domains were imported into TALE-TF, MBP-TALE and TALEN architectures and consistently exhibited greater activity than the wild-type NTD on target sequences with non-T 5′ residues. The novel NTDs are compatible with the Golden Gate TALEN assembly protocol and now make possible the efficient construction of TALE transcription factors, recombinases, nucleases and DNA-binding proteins that recognize any DNA sequence. This allows precise and unconstrained positioning of TALE-based proteins on DNA, without regard to the 5′ T rule that limits most natural TALE proteins.
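To make the practical effect of the 5′ T rule concrete, the short Python sketch below counts how many candidate binding sites in a DNA string are available when the N0 base must be a T, compared with when any N0 base is allowed. It is an illustration only and not part of the published work: the function name `candidate_sites`, the example sequence and the site length are all hypothetical, and the sketch assumes that N0 is the single base immediately 5′ of the bases read by the TALE repeats.

```python
from typing import Iterator, Tuple


def candidate_sites(
    sequence: str, site_length: int, allowed_n0: str = "T"
) -> Iterator[Tuple[int, str, str]]:
    """Yield (start, n0, site) for every window whose preceding base is allowed.

    Hypothetical helper for illustration. `start` is the 0-based index of the
    first repeat-specified base, `n0` is the base immediately 5' of it, and
    `site` is the stretch of `site_length` bases beginning at `start`.
    """
    seq = sequence.upper()
    # Start at 1 so that every window has a preceding (N0) base.
    for start in range(1, len(seq) - site_length + 1):
        n0 = seq[start - 1]
        if n0 in allowed_n0:
            yield start, n0, seq[start:start + site_length]


if __name__ == "__main__":
    example = "GATTACAGGC"  # made-up sequence, not from the study
    restricted = list(candidate_sites(example, 4, allowed_n0="T"))
    unrestricted = list(candidate_sites(example, 4, allowed_n0="ACGT"))
    print(len(restricted), "sites with a 5' T;", len(unrestricted), "with any N0")
```

On this toy sequence, only the windows preceded by a T qualify under the natural rule, whereas every window qualifies once the N0 constraint is lifted. Real TALE target selection involves further considerations that this sketch does not attempt to model.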