
    • A Deep Learning Approach to Antibiotic Discovery 

      Stokes, Jonathan; Yang, Kevin; Swanson, Kyle; Jin, Wengong; Cubillos, Andres Fernando; e.a. (Elsevier BV, 2020-02)
      Due to the rapid emergence of antibiotic-resistant bacteria, there is a growing need to discover new antibiotics. To address this challenge, we trained a deep neural network capable of predicting molecules with antibacterial ...
    • Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape 

      Lewis, Sophia; van Hoff, John Peter; Karun, Vivek; Hashimoto, Tatsunori Benjamin; Sherwood, Richard I.; e.a. (Nature Publishing Group, 2014-01)
      We describe protein interaction quantitation (PIQ), a computational method for modeling the magnitude and shape of genome-wide DNase I hypersensitivity profiles to identify transcription factor (TF) binding sites. Through ...
    • A graph-convolutional neural network model for the prediction of chemical reactivity 

      Coley, Connor W.; Jin, Wengong; Rogers, Luke; Jamison, Timothy F.; Jaakkola, Tommi S.; e.a. (Royal Society of Chemistry (RSC), 2019-01)
      We present a supervised learning approach to predict the products of organic reactions given their reactants, reagents, and solvent(s). The prediction task is factored into two stages ...
    • Hierarchical Dirichlet Process-Based Models For Discovery of Cross-species Mammalian Gene Expression 

      Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. (2007-07-06)
      An important research problem in computational biology is the identification of expression programs, sets of co-activated genes orchestrating physiological processes, and the characterization of the functional breadth of these ...
    • Molding CNNs for text: Non-linear, non-consecutive convolutions 

      Lei, Tao; Barzilay, Regina; Jaakkola, Tommi S. (Association for Computational Linguistics, 2015-09)
      The success of deep learning often derives from well-chosen operational building blocks. In this work, we revise the temporal convolution operation in CNNs to better adapt it to text processing. Instead ...
    • On the Dirichlet Prior and Bayesian Regularization 

      Steck, Harald; Jaakkola, Tommi S. (2002-09-01)
      A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data ...
    • Prediction of Organic Reaction Outcomes Using Machine Learning 

      Coley, Connor W.; Barzilay, Regina; Jaakkola, Tommi S.; Green, William H.; Jensen, Klavs F.; e.a. (American Chemical Society (ACS), 2017-04)
      Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions ...
    • (Semi-)Predictive Discretization During Model Selection 

      Steck, Harald; Jaakkola, Tommi S. (2003-02-25)
      In this paper, we present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which ...
    • Semi-supervised question retrieval with gated convolutions 

      Lei, Tao; Joshi, Hrishikesh S.; Barzilay, Regina; Jaakkola, Tommi S. (North American Chapter of the Association for Computational Linguistics, 2016-06)
      Question answering forums are rapidly growing in size with no effective automated ability to refer to and reuse answers already available for previous posted questions. In this paper, we develop a methodology for finding ...
    • A synergistic DNA logic predicts genome-wide chromatin accessibility 

      Sherwood, Richard I.; Emons, Bart J.M.; Hashimoto, Tatsunori Benjamin; Kang, Daniel D.; Rajagopal, Nisha; e.a. (Cold Spring Harbor Laboratory Press, 2016-08)
      Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence ...
    • Table 1 (Supplemental): Summary of expression programs discovered by GeneProgram from Novartis Tissue Atlas v2 data 

      Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. (2007-06-25)
      Table 1 (Supplemental): Summary of recurrent expression programs (EPs) discovered by GeneProgram. The columns are: (1) the EP identifier (an arbitrarily assigned number), (2) the number of genes in the EP, (3) the number ...
    • Table 2 (Supplemental): Complete data for all 100 expression programs discovered by GeneProgram from the Novartis Gene Atlas v2 

      Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. (2007-06-25)
      Table 2 (Supplemental): Complete data for all 100 recurrent expression programs (EPs) discovered by GeneProgram. Each EP has two identifying rows, a list of meta-genes, and a list of significantly enriched GO categories. ...
    • Ten pairs to tag - Multilingual POS tagging via coarse mapping between embeddings 

      Gaddy, David M.; Zhang, Yuan; Barzilay, Regina; Jaakkola, Tommi S. (Association for Computational Linguistics, 2016-06)
      In the absence of annotations in the target language, multilingual models typically draw on extensive parallel resources. In this paper, we demonstrate that accurate multilingual part-of-speech (POS) tagging can be done ...
    • Tight bounds for the expected risk of linear classifiers and PAC-Bayes finite-sample guarantees 

      Honorio Carrillo, Jean; Jaakkola, Tommi S. (Journal of Machine Learning Research, 2014-04)
      We analyze the expected risk of linear classifiers for a fixed weight vector in the “minimax” setting. That is, we analyze the worst-case risk among all data distributions with a given mean and covariance. We provide a ...