Browsing by Author "Jaakkola, Tommi S."

A Deep Learning Approach to Antibiotic Discovery

Stokes, Jonathan; Yang, Kevin; Swanson, Kyle; Jin, Wengong; Cubillos, Andres Fernando; e.a. (Elsevier BV, 2020-02)

Due to the rapid emergence of antibiotic-resistant bacteria, there is a growing need to discover new antibiotics. To address this challenge, we trained a deep neural network capable of predicting molecules with antibacterial ...

Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape

Lewis, Sophia; van Hoff, John Peter; Karun, Vivek; Hashimoto, Tatsunori Benjamin; Sherwood, Richard I.; e.a. (Nature Publishing Group, 2014-01)

We describe protein interaction quantitation (PIQ), a computational method for modeling the magnitude and shape of genome-wide DNase I hypersensitivity profiles to identify transcription factor (TF) binding sites. Through ...

A graph-convolutional neural network model for the prediction of chemical reactivity

Coley, Connor W.; Jin, Wengong; Rogers, Luke; Jamison, Timothy F.; Jaakkola, Tommi S.; e.a. (Royal Society of Chemistry (RSC), 2019-01)

We present a supervised learning approach to predict the products of organic reactions given their reactants, reagents, and solvent(s). The prediction task is factored into two stages ...

Hierarchical Dirichlet Process-Based Models For Discovery of Cross-species Mammalian Gene Expression

Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. (2007-07-06)

An important research problem in computational biology is the identification of expression programs, sets of co-activated genes orchestrating physiological processes, and the characterization of the functional breadth of these ...

Molding CNNs for text: Non-linear, non-consecutive convolutions

Lei, Tao; Barzilay, Regina; Jaakkola, Tommi S. (Association for Computational Linguistics, 2015-09)

The success of deep learning often derives from well-chosen operational building blocks. In this work, we revise the temporal convolution operation in CNNs to better adapt it to text processing. Instead ...

On the Dirichlet Prior and Bayesian Regularization

Steck, Harald; Jaakkola, Tommi S. (2002-09-01)

A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data ...

Prediction of Organic Reaction Outcomes Using Machine Learning

Coley, Connor W.; Barzilay, Regina; Jaakkola, Tommi S.; Green, William H.; Jensen, Klavs F.; e.a. (American Chemical Society (ACS), 2017-04)

Computer assistance in synthesis design has existed for over 40 years, yet retrosynthesis planning software has struggled to achieve widespread adoption. One critical challenge in developing high-quality pathway suggestions ...

(Semi-)Predictive Discretization During Model Selection

Steck, Harald; Jaakkola, Tommi S. (2003-02-25)

In this paper, we present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which ...

Semi-supervised question retrieval with gated convolutions

Lei, Tao; Joshi, Hrishikesh S.; Barzilay, Regina; Jaakkola, Tommi S. (North American Chapter of the Association for Computational Linguistics, 2016-06)

Question answering forums are rapidly growing in size with no effective automated ability to refer to and reuse answers already available for previous posted questions. In this paper, we develop a methodology for finding ...

A synergistic DNA logic predicts genome-wide chromatin accessibility

Sherwood, Richard I.; Emons, Bart J.M.; Hashimoto, Tatsunori Benjamin; Kang, Daniel D.; Rajagopal, Nisha; e.a. (Cold Spring Harbor Laboratory Press, 2016-08)

Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence ...

Table 1 (Supplemental): Summary of expression programs discovered by GeneProgram from Novartis Tissue Atlas v2 data

Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. (2007-06-25)

Table 1 (Supplemental): Summary of recurrent expression programs (EPs) discovered by GeneProgram. The columns are: (1) the EP identifier (an arbitrarily assigned number), (2) the number of genes in the EP, (3) the number ...

Table 2 (Supplemental): Complete data for all 100 expression programs discovered by GeneProgram from the Novartis Gene Atlas v2

Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K. (2007-06-25)

Table 2 (Supplemental): Complete data for all 100 recurrent expression programs (EPs) discovered by GeneProgram. Each EP has two identifying rows, a list of meta-genes, and a list of significantly enriched GO categories. ...

Ten pairs to tag - Multilingual POS tagging via coarse mapping between embeddings

Gaddy, David M.; Zhang, Yuan; Barzilay, Regina; Jaakkola, Tommi S. (Association for Computational Linguistics, 2016-06)

In the absence of annotations in the target language, multilingual models typically draw on extensive parallel resources. In this paper, we demonstrate that accurate multilingual part-of-speech (POS) tagging can be done ...

Tight bounds for the expected risk of linear classifiers and PAC-Bayes finite-sample guarantees

Honorio Carrillo, Jean; Jaakkola, Tommi S. (Journal of Machine Learning Research, 2014-04)

We analyze the expected risk of linear classifiers for a fixed weight vector in the “minimax” setting. That is, we analyze the worst-case risk among all data distributions with a given mean and covariance. We provide a ...