Systems and Techniques for Efficient Real-World Graph Analytics

M. F. da Trindade, Joana

dc.contributor.advisor	Madden, Samuel R.
dc.contributor.author	M. F. da Trindade, Joana
dc.date.accessioned	2024-03-21T19:08:53Z
dc.date.available	2024-03-21T19:08:53Z
dc.date.issued	2024-02
dc.date.submitted	2024-02-21T17:19:05.446Z
dc.identifier.uri	https://hdl.handle.net/1721.1/153832
dc.description.abstract	Graphs are a natural way to model real-world entities and relationships between them, ranging from social networks and biological datasets to cloud computing infrastructure data lineage graphs. Queries over these large graphs often involve expensive subgraph traversals and complex analytical computations. Furthermore, real-world graphs often encode relationships that evolve through time, which are modeled as a temporal graph, i.e., one in which edges are associated to time attributes (such as start time and duration). These real-world graphs are often substantially more structured than a generic vertex-and-edge model would suggest, but this insight has remained mostly unexplored by existing graph engines for graph query optimization purposes. In addition, most temporal graph processing systems remain inefficient as they rely on distributed processing even for graphs that fit well within a commodity server’s available storage. To this end, we present two distinct systems tailored for optimizing large-scale real-world graph processing. The first system, Kaskade, targets the challenges of efficient query evaluation by leveraging structural properties of graphs and queries to infer materialized graph views, speeding up query evaluation by rewriting queries in terms of views it deems beneficial based on input graph and query characteristics. The second system, Kairos, introduces selective indexing, a technique that chooses a subset of target vertices to index based on characteristics of the underlying temporal graphs and input queries. This system further employs a highly-specialized parallel data structure aimed at in-memory storage and fast retrieval of temporal edges. Finally, Kairos is built upon Ligra, the de facto benchmark system in shared-memory parallel graph processing, offering similar advantages and a familiar API to application programmers. Both systems offer speedups of up to 50-60x when compared with alternative baselines, and introduce novel classes of query optimization techniques aimed at efficient real-world graph analytics.
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright retained by author(s)
dc.rights.uri	https://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Systems and Techniques for Efficient Real-World Graph Analytics
dc.type	Thesis
dc.description.degree	Ph.D.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Doctoral
thesis.degree.name	Doctor of Philosophy

Files in this item

Name:: matosfonsecadatrindade-jfon-ph ...
Size:: 1.533Mb
Format:: PDF
Description:: Thesis PDF

View/Open

This item appears in the following Collection(s)

Doctoral Theses

Show simple item record