Difference between revisions of "CH391L/S14/CAD systems"

From SynBioCyc

Revision as of 12:57, 26 January 2014

Introduction

Computer-Aided Design (CAD) tools are software packages that help users design and engineer new systems. In traditional engineering fields, these programs have long been used to optimize production processes, model chemical reactions, and create new products. Graphical User Interfaces (GUIs) give a human-readable view of the underlying computer languages, which are designed to assemble components into useful products or devices. Many of these programs can also simulate the behavior of an assembled device and automate the assembly toward a specific goal. Synthetic biology is advancing to the point where high-throughput, automated design of synthetic biological devices will be necessary to realize the potential of the discipline.

[Figure: CAD user process]

In general, the use of a CAD program in synthetic biology will involve the following steps:

Step 1: The user draws a biological system.
Step 2: The user performs some analysis.
Step 3: If the analysis is not satisfactory, the user goes back to step 1.


The analysis step can potentially include: mathematical analysis of non-linear systems; stochastic simulations, structural analysis, and other methods from systems biology; prediction of evolutionary trajectories for directed evolution; analysis and optimization of the DNA sequence; and database look-up to find suitable components.
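The draw, analyze, and revise loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the workflow of any particular CAD tool: the `Design` class, the two-stage expression model inside `analyze`, and all rate constants are hypothetical stand-ins for whatever drawing and analysis a real program would perform.

```python
from dataclasses import dataclass


@dataclass
class Design:
    """Hypothetical stand-in for a drawn biological system (step 1)."""
    promoter_strength: float  # transcripts made per minute (illustrative units)
    rbs_strength: float       # proteins made per transcript per minute


def analyze(design, mrna_decay=0.2, protein_decay=0.02):
    """Step 2: steady state of a simple two-stage expression model.

    Assumed model (not taken from any specific tool):
        d(mRNA)/dt    = promoter_strength - mrna_decay * mRNA
        d(protein)/dt = rbs_strength * mRNA - protein_decay * protein
    """
    mrna = design.promoter_strength / mrna_decay
    return design.rbs_strength * mrna / protein_decay


def design_loop(candidates, target, tolerance=0.1):
    """Step 3: move on to the next candidate until the analysis is satisfactory.

    Returns (iteration, design, predicted_protein), or None if no candidate
    comes within `tolerance` (as a fraction of `target`) of the target level.
    """
    for iteration, design in enumerate(candidates, start=1):
        protein = analyze(design)
        if abs(protein - target) <= tolerance * target:
            return iteration, design, protein
    return None


if __name__ == "__main__":
    revisions = [Design(1.0, 0.5), Design(2.0, 1.0), Design(2.0, 2.0)]
    print(design_loop(revisions, target=1000.0))
```

Each candidate stands for one pass through step 1, and the loop stops at the first design whose predicted steady-state protein level falls within the tolerance.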

Synthetic Biology CAD Tools

Vector Editor representation of an annotated plasmid sequence
Screen grab from TinkerCell software. A genetic NOR gate is pictured, with the accompanying basic model summary, plot and parameter input forms

Synthetic biology CAD tools are programs that help create novel biological constructs. At their most basic, these programs are enhanced DNA editors whose user interface makes it easier to manipulate the basic “parts” that make up biological devices. Some of the more advanced programs offer additional functions, including visualization, validation of constructs, and simulation of metabolic networks. In general, CAD programs for synthetic biology should comply with SBOL (Synthetic Biology Open Language) to facilitate use with the Parts Registry and sharing of parts with other researchers.


Basic Design and Alignment Tools

In the majority of CAD programs for biology, the core program is a GUI for editing and annotating DNA sequences. The interface often provides a way to edit the sequences of parts and devices and to annotate regions of the DNA. Most programs have, at the very least, a sequence/part editor that outputs the information according to a standard for exchanging biological parts (e.g., SBOL). Many also contain visualization features that show the parts assembled into a vector or plasmid in a compact way, as in VectorEditor[1] or ApE. Others include design features such as codon optimization (e.g., Gene Designer 2.0[2]). More advanced transcription/translation optimizer software is also available commercially (GeneOptimizer), and takes into account factors such as mRNA secondary structure and GC content when choosing the most productive device design.
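To make codon optimization and GC content concrete, here is a minimal Python sketch. It is not the algorithm used by Gene Designer or GeneOptimizer: it applies the simplest possible strategy (one fixed codon per amino acid), and the `PREFERRED_CODON` table is an illustrative assumption rather than a measured codon-usage table for any organism.

```python
# Illustrative one-codon-per-residue table (an assumption for this sketch).
# Every entry is a valid codon for its amino acid in the standard genetic code.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",  # stop
}


def back_translate(protein):
    """Return a DNA sequence encoding `protein` using the table above."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    dna = back_translate("MKT*")
    print(dna, gc_content(dna))
```

A real optimizer would weigh several synonymous codons per residue and score candidate sequences on properties such as GC content and predicted mRNA structure, instead of using a fixed lookup.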

Many of the basic DNA editors also allow for the design of primers for traditional cloning. In light of more recent advances in large-scale cloning techniques, some newer programs such as Gibthon provide automated design of primers for Gibson cloning and other new cloning strategies.
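As a rough illustration of what automated primer design for overlap-based cloning involves, the Python sketch below builds the two primers that span one junction between adjacent fragments. It is not Gibthon's algorithm. The function name `junction_primers` and the fixed `overlap` and `anneal` lengths are assumptions made for this example; real tools typically choose lengths from melting-temperature calculations.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_primers(upstream, downstream, overlap=20, anneal=20):
    """Design the primer pair spanning the upstream/downstream junction.

    forward primer: amplifies `downstream` and carries a 5' tail copied
                    from the end of `upstream`
    reverse primer: amplifies `upstream` and carries a 5' tail copied
                    from the start of `downstream` (as reverse complement)
    """
    if overlap < 1 or anneal < 1:
        raise ValueError("overlap and anneal must be at least 1")
    if min(len(upstream), len(downstream)) < max(overlap, anneal):
        raise ValueError("fragments are shorter than the requested lengths")
    upstream, downstream = upstream.upper(), downstream.upper()
    forward = upstream[-overlap:] + downstream[:anneal]
    reverse = reverse_complement(upstream[-anneal:] + downstream[:overlap])
    return forward, reverse


if __name__ == "__main__":
    # Short toy fragments so the result is easy to check by eye.
    print(junction_primers("ATGCATGA", "TTCAGGCA", overlap=3, anneal=5))
```

After PCR with these primers, the end of the upstream product and the start of the downstream product share a common stretch of sequence, which is what overlap-based assembly methods rely on.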

BLAST (Basic Local Alignment Search Tool) is a web-based utility that aligns query sequences against reference sequences. This tool is a basic requirement for almost all synthetic biology research, as it is used to verify that the sequencing results for a given part or device match the expected design. BLAST is also the tool of choice for detecting similar, homologous, or identical sequences (including non-contiguous matches) within a user-defined set of genome sequences drawn from publicly available nucleotide and protein databases.
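The verification step, checking a sequencing read against the designed sequence, can be illustrated with a much simpler comparison than BLAST performs. The Python sketch below slides the read along the reference without gaps and reports the placement with the fewest mismatches. It is a deliberately naive stand-in (BLAST uses seeded, gapped local alignment with statistical scoring), and the function name `best_ungapped_match` is hypothetical.

```python
def best_ungapped_match(reference, read):
    """Find the ungapped placement of `read` on `reference` with fewest mismatches.

    Returns (offset, mismatch_positions), where positions are 0-based indices
    into `reference`, or None if the read is longer than the reference.
    Ties are resolved in favor of the leftmost placement.
    """
    reference, read = reference.upper(), read.upper()
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = [
            offset + i
            for i, (expected, observed) in enumerate(zip(window, read))
            if expected != observed
        ]
        if best is None or len(mismatches) < len(best[1]):
            best = (offset, mismatches)
    return best


if __name__ == "__main__":
    design = "AAACCCGGGTTT"
    sequencing_read = "CCGGCT"  # toy read with one base changed
    print(best_ungapped_match(design, sequencing_read))
```

An empty mismatch list means the read agrees with the design over its whole length; any reported positions point to bases worth inspecting in the trace file.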

Other, more advanced alignment programs (such as Chromas or Geneious) align multiple sequences directly from the trace files, which show the signal intensity output by the sequencing software. Geneious is particularly useful for generating a complete and organized view of a genome of interest. Because they cover everything from annotating genome sequences to keeping track of all the work related to a particular genetic engineering project (primer design, genetic modifications, sequence analysis, etc.), these sorts of multi-purpose stand-alone tools are becoming very popular in the synthetic biology community.


Assembly Tools

A complex part (genetic toggle switch) composed of simple parts (promoters, repressors, reporter), which can be assembled and validated in Eugene

Several of the more advanced CAD programs provide features that aid in the assembly of simple biological parts into more complex features and devices. In some cases, the framework compiles simple parts into more complex features and checks the composition of each component for errors. For example, the complex device at right (a genetic toggle switch[3]), which is composed of several simple parts (e.g., promoters), can be error-checked using the Eugene Language[4], which strictly defines synthetic biology devices, part types, parts, and properties, to validate a functional composition. More advanced algorithms automate assembly by checking every permutation of a given group of parts for valid constructs and returning only those designs that are likely to be functional for the desired task. There are also downloadable tools such as Genome Compiler or Gene Composer and web-based tools such as DNAWorks[5] or GeneDesign, which are designed to facilitate the assembly of much larger devices from simple and complex parts.
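The idea of checking every permutation of a group of parts and keeping only the valid constructs can be sketched as follows. This is plain Python, not Eugene syntax, and the single ordering rule in `REQUIRED_ORDER`, the part names, and their type assignments are assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical part collection: part name -> part type.
PART_TYPES = {
    "pLac": "promoter",
    "pTet": "promoter",
    "rbsA": "rbs",
    "gfp": "cds",
    "termA": "terminator",
}

# Assumed validity rule: an expression cassette must read
# promoter -> rbs -> cds -> terminator.
REQUIRED_ORDER = ("promoter", "rbs", "cds", "terminator")


def is_valid(construct, part_types):
    """True if the part types of `construct` match REQUIRED_ORDER exactly."""
    return tuple(part_types[name] for name in construct) == REQUIRED_ORDER


def valid_constructs(part_types):
    """Enumerate all ordered selections of parts and keep the valid ones."""
    size = len(REQUIRED_ORDER)
    return [
        construct
        for construct in permutations(part_types, size)
        if is_valid(construct, part_types)
    ]


if __name__ == "__main__":
    for construct in valid_constructs(PART_TYPES):
        print(" - ".join(construct))
```

Exhaustive enumeration grows factorially with the number of parts, so this brute-force approach only suits small part sets; rule-based languages let the designer state constraints that prune the search.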

Full Featured Tools

There are several larger packages that combine all of these tools. From importing large sets of parts in spreadsheet format (e.g., Clotho) to simulating the metabolite levels in a network containing synthetic devices (e.g., TinkerCell[6]), these integrated packages aim to provide the entire toolbox of CAD capabilities to synthetic biologists. In addition to these full-featured packages, some programs are designed solely for modeling metabolic networks (e.g., SynBioSS[7]).

j5 is a web-based tool that has multiple design features. It features automated assembly of scar-free devices from multiple biological parts. j5 can perform a variety of assembly protocols, including Gibson, Golden Gate, and circular polymerase extension cloning (CPEC). j5 also showcases engineering-related features such as cost optimization, enforcing design specification rules, and automated construction of combinatorial libraries.[8]

SnapGene Viewer is software that allows users to create, browse, edit, and share richly annotated DNA sequence files up to 1 Gb in length. Sequence data may be entered directly, imported from a GenBank record, or opened from an annotated sequence stored in one of many common file formats. It has built-in automatic annotation of common features, such as identification of open reading frames (ORFs) with a single mouse click.
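Automatic identification of open reading frames is straightforward to illustrate. The Python sketch below scans the three forward reading frames for an ATG followed in frame by a stop codon. It is not SnapGene's implementation: it ignores the reverse strand and alternative start codons, and the `min_codons` threshold is an arbitrary assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of forward-strand ORFs.

    Each ORF begins at an ATG and ends just after the first in-frame stop
    codon, so seq[start:end] includes both. `min_codons` counts every codon
    in the ORF, including the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG seen
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end))
                start = None  # look for the next ATG in this frame
    return orfs


if __name__ == "__main__":
    sequence = "CCATGAAATTTTAGGG"
    for start, end in find_orfs(sequence):
        print(start, end, sequence[start:end])
```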

GenoCAD provides a framework that automatically manages the constraints associated with the different standards, which should help the community better leverage ongoing standardization efforts. It uses a context-free grammar (CFG) [9] to model the structure of genetic constructs, making it possible for users to quickly assemble, from a rich library of genetic parts, constructs compliant with any of six BioBrick assembly standards [10]. Because GenoCAD expresses the design strategy for synthetic genetic constructs as a grammatical model, it can be used in two ways: a user can design a synthetic construct by successively selecting design rules that transform the structure of the design, or a user can upload a DNA sequence designed outside GenoCAD to validate its consistency with the grammatical model.
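The validation use of a context-free grammar can be illustrated with a toy example. The grammar below is a hypothetical three-rule grammar written for this sketch, not GenoCAD's actual rule set, and the recognizer is a simple recursive matcher that works because the grammar has no left recursion.

```python
# Toy grammar (an assumption for illustration): nonterminal -> list of productions.
# Any symbol that is not a key of GRAMMAR is treated as a terminal part type.
GRAMMAR = {
    "construct": [["cassette"], ["cassette", "construct"]],
    "cassette": [["promoter", "gene", "terminator"]],
    "gene": [["rbs", "cds"]],
}


def end_positions(symbol, tokens, pos):
    """Return every index at which `symbol` can end when it starts at `pos`."""
    if symbol not in GRAMMAR:  # terminal: must match the token at `pos`
        if pos < len(tokens) and tokens[pos] == symbol:
            return {pos + 1}
        return set()
    results = set()
    for production in GRAMMAR[symbol]:
        positions = {pos}
        for child in production:
            positions = {
                end for p in positions for end in end_positions(child, tokens, p)
            }
        results |= positions
    return results


def is_valid(tokens, start="construct"):
    """True if the whole list of part types can be derived from `start`."""
    return len(tokens) in end_positions(start, tokens, 0)


if __name__ == "__main__":
    cassette = ["promoter", "rbs", "cds", "terminator"]
    print(is_valid(cassette))
    print(is_valid(cassette + cassette))
    print(is_valid(["promoter", "cds", "rbs", "terminator"]))
```

The design use runs the same rules in the other direction: starting from `construct`, the user repeatedly picks a production to expand a nonterminal until only part types remain.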

Database Tools

Several software programs are designed for maintaining records of BioBricks or other synthetic constructs. These programs focus primarily on providing access to collections of available parts. One example is the Joint BioEnergy Institute's JBEI GD-ICE program, which is a web-based tool for creating and maintaining an "Inventory of Composable Elements" for a lab group. The tool is primarily designed for creating private databases within a smaller group of researchers, but JBEI also maintains a public database of parts. Clotho also has built-in capability for maintaining a local database of biological parts within a lab group or institution.

Other important organizations also contribute. Addgene, for example, not only serves as a global, non-profit plasmid repository but also offers a free online cloning vector analysis tool. Its focus is on assembling a high-quality library of published plasmids for use in research and discovery, for both preservation and distribution [11]. Plasmids are linked with research articles, allowing easy access to data related to the materials requested. The BioBricks Foundation is presently partnering with Addgene to distribute plasmids that have been contributed under the BioBrick™ Public Agreement.

Standardizing Representation of Synthetic Biology Designs

TinkerCell representation of parts in a lactose-inducible GFP part

The Synthetic Biology Open Language is an open-source standard for representing designs consisting of both DNA sequence information and higher level annotation of parts with defined roles and behaviors [12]. The core specification of this system has been developed as an RFC [13]. Several different synthetic biology CAD software programs use this format. Representation at this higher level of parts can be visualized and simulated in some of these systems (e.g., TinkerCell).

The Eugene Language[4] is an open-source, human-readable language designed to facilitate automatic creation of new devices from a collection of parts. Eugene includes a standardized format for specifying devices and parts, as well as constraints on how they can be assembled into higher-level devices (e.g., a genetic toggle switch). Eugene also provides functions that automatically generate functional assemblies of complex devices. Eugene does not support visualization of constructs.

iGEM Software Tools Development

The iGEM competition for development of software tools is designed to promote the creation of publicly available CAD programs for synthetic biology. As with the Registry of Standard Biological Parts, the software tools entered into the competition must adhere to certain standards of interoperability and data format in order to facilitate reuse and ease of collaboration among researchers. There are several categories developers can pursue, including specific modular CAD frameworks (e.g., Clotho) as well as sharing data and interfacing with the Parts Registry. iGEM hosts a freely available repository of the open-source software packages from past competitions.

One exciting tool is the MoClo Planner, a multi-touch interface that supports the design of complex and useful biological constructs. It draws information from the MIT Registry of Biological Parts, PubMed, and the iGEM archive. Its design implements Golden Gate Modular Cloning (MoClo) [14], a novel laboratory method that allows the efficient creation of multi-gene constructs from a library of biological parts. Using this method, biological parts are permuted and joined together in a tiered fashion to create new synthetic biology constructs (BioBricks). The MoClo Planner workflow includes: browsing a library of over 2,200 biological parts; selecting biological parts based on their function, genetic sequence, and other biological characteristics; computing possible permutations of parts in predefined arrangements; and designing primers and fusion recognition sites.
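Computing the possible combinations of parts in a predefined arrangement amounts to taking a Cartesian product over the slots of that arrangement. The Python sketch below shows the idea. The part names, the contents of `LIBRARY`, and the four-slot `ARRANGEMENT` are hypothetical, and the sketch does not model fusion sites or primers.

```python
from itertools import product

# Hypothetical selected parts, grouped by the slot each one can fill.
LIBRARY = {
    "promoter": ["pA", "pB"],
    "rbs": ["r1"],
    "cds": ["gfp", "rfp"],
    "terminator": ["t1"],
}

# Assumed predefined arrangement: the order of slots in one transcription unit.
ARRANGEMENT = ["promoter", "rbs", "cds", "terminator"]


def enumerate_constructs(library, arrangement):
    """Return every construct that fills each slot with one part from its group."""
    choices = [library[slot] for slot in arrangement]
    return list(product(*choices))


if __name__ == "__main__":
    for construct in enumerate_constructs(LIBRARY, ARRANGEMENT):
        print(" + ".join(construct))
```

The number of constructs is the product of the group sizes (here 2 x 1 x 2 x 1), which is why interactive selection of a small subset of parts comes before the enumeration step.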

Future Directions

Although there is a vast collection of useful synthetic biology CAD programs, there is a pressing need for improved standardization and modularity. This includes reaching consensus on how individual components or parts are defined, and adopting restrictions that simplify the process of building synthetic networks while making them more robust and interchangeable. One existing standard is the standard assembly [15], which has made DNA assembly simpler. In the future, it is anticipated that standards will also exist for describing the dynamics of a part; for example, a standard promoter part might carry a "strength" value describing its efficiency in recruiting RNA polymerase under some standard environmental condition [16]. Standardization is also important in naming such values, as well as parts, so that they always remain in a computer-readable format such as the Resource Description Framework (RDF) [17] [18].

Our current understanding of how DNA parts come together to make a functional biological device is incomplete. Advances are coming swiftly with the advent of high-throughput technologies, but Computer-Aided Design programs have yet to catch up. Specifically, it is not fully understood how a part changes its function when placed in different devices, so it has proven difficult to create a complete language for combining parts efficiently while maintaining their expected functionality. Whereas we can currently model metabolic networks to study the effect of a single step in the synthesis pathway of a relevant material (e.g., a biofuel), one can envision a future in which software tools can create de novo networks for the synthesis of completely new products (e.g., non-protein/nucleic acid polymers) within the context of a cell. In the coming years, synthetic biology CAD programs will be able to facilitate the rapid advancement of completely new engineered biological devices [19].

References

  1. PMID 22718978 [VectorEditor2012]
    Design, implementation and practice of JBEI-ICE: an open source biological part registry platform and tools.
  2. PMID 16756672 [GeneDesigner2006]
    Gene Designer: a synthetic biology tool for constructing artificial DNA segments.
  3. PMID 10659857 [Togglepaper2000]
    Construction of a genetic toggle switch in Escherichia coli.
  4. PMID 21559524 [Eugene2011]
    Eugene--a domain specific language for specifying and constraining synthetic biological parts, devices, and systems.
  5. PMID 12000848 [DNAWorks2002]
    DNAWorks: an automated method for designing oligonucleotides for PCR-based gene synthesis.
  6. PMID 19874625 [TinkerCell2009]
    TinkerCell: modular CAD tool for synthetic biology.
  7. PMID 20639523 [SynBioSS2010]
    SynBioSS designer: a web-based tool for the automated generation of kinetic models for synthetic biological constructs.
  8. doi:10.1021/sb2000116 [j52011]
    j5 DNA Assembly Design Automation Software.
  9. PMID 17804435 [CFG]
    A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts.
  10. PMID 20167639 [Cai2010]
    GenoCAD for iGEM: a grammatical approach to the design of standard-compliant constructs.
  11. doi:10.1038/505272a Nature 505, 272 (16 January 2014) [Addgene2014]
    Repositories share key research tools.
  12. PMID 21390321 [Galdzicki2011]
    Standard biological parts knowledgebase.
  13. http://dspace.mit.edu/handle/1721.1/66172 [SBOLRFC]
    Synthetic Biology Open Language (SBOL) Version 1.0.0.
  14. PMID 21364738 [MoClo]
    A modular cloning system for standardized assembly of multigene constructs.
  15. PMID 18410688 [Shetty2008]
    Engineering BioBrick vectors from BioBrick parts.
  16. PMID 19298678 [Kelly2009]
    Measuring the activity of BioBrick promoters using an in vivo reference standard.
  17. http://hdl.handle.net/1721.1/45537 [Galdzicki2009]
    Provisional BioBrick Language (PoBoL).
  18. http://openwetware.org/wiki/The_BioBricks_Foundation:Standards/Technical/Exchange [standards]
    Synthetic Biology Open Language (SBOL).
  19. PMID 22266781 [Medema2012]
    Computational tools for the synthetic design of biochemical pathways.
  20. PMID 16481661 [Genedesign2006]
    GeneDesign: rapid, automated design of multikilobase synthetic genes.