Show simple item record

dc.contributor.advisorMadden, Samuel R.
dc.contributor.authorM. F. da Trindade, Joana
dc.date.accessioned2024-03-21T19:08:53Z
dc.date.available2024-03-21T19:08:53Z
dc.date.issued2024-02
dc.date.submitted2024-02-21T17:19:05.446Z
dc.identifier.urihttps://hdl.handle.net/1721.1/153832
dc.description.abstractGraphs are a natural way to model real-world entities and relationships between them, ranging from social networks and biological datasets to cloud computing infrastructure data lineage graphs. Queries over these large graphs often involve expensive subgraph traversals and complex analytical computations. Furthermore, real-world graphs often encode relationships that evolve through time, which are modeled as a temporal graph, i.e., one in which edges are associated to time attributes (such as start time and duration). These real-world graphs are often substantially more structured than a generic vertex-and-edge model would suggest, but this insight has remained mostly unexplored by existing graph engines for graph query optimization purposes. In addition, most temporal graph processing systems remain inefficient as they rely on distributed processing even for graphs that fit well within a commodity server’s available storage. To this end, we present two distinct systems tailored for optimizing large-scale real-world graph processing. The first system, Kaskade, targets the challenges of efficient query evaluation by leveraging structural properties of graphs and queries to infer materialized graph views, speeding up query evaluation by rewriting queries in terms of views it deems beneficial based on input graph and query characteristics. The second system, Kairos, introduces selective indexing, a technique that chooses a subset of target vertices to index based on characteristics of the underlying temporal graphs and input queries. This system further employs a highly-specialized parallel data structure aimed at in-memory storage and fast retrieval of temporal edges. Finally, Kairos is built upon Ligra, the de facto benchmark system in shared-memory parallel graph processing, offering similar advantages and a familiar API to application programmers. Both systems offer speedups of up to 50-60x when compared with alternative baselines, and introduce novel classes of query optimization techniques aimed at efficient real-world graph analytics.
dc.publisherMassachusetts Institute of Technology
dc.rightsIn Copyright - Educational Use Permitted
dc.rightsCopyright retained by author(s)
dc.rights.urihttps://rightsstatements.org/page/InC-EDU/1.0/
dc.titleSystems and Techniques for Efficient Real-World Graph Analytics
dc.typeThesis
dc.description.degreePh.D.
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degreeDoctoral
thesis.degree.nameDoctor of Philosophy


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record