We study theta-joins in general and join predicates with conjunctions and disjunctions of inequalities in particular, focusing on where the answers are returned incrementally in an order dictated by a given ranking function. Our approach achieves strong time and space complexity properties: with denoting the number of tuples in the database, we guarantee for acyclic full join queries with inequality conditions that for value of , the top-ranked answers are returned in time. This is within a polylogarithmic factor of , i.
View Article and Find Full Text PDFIEEE Trans Vis Comput Graph
January 2022
Node-link visualizations are a familiar and powerful tool for displaying the relationships in a network. The readability of these visualizations highly depends on the spatial layout used for the nodes. In this paper, we focus on computing layered layouts, in which nodes are aligned on a set of parallel axes to better expose hierarchical or sequential relationships.
View Article and Find Full Text PDFProceedings VLDB Endowment
May 2020
We study ranked enumeration of join-query results according to very general orders defined by selective dioids. Our main contribution is a framework for ranked enumeration over a class of dynamic programming problems that generalizes seemingly different problems that had been studied in isolation. To this end, we extend classic algorithms that find the -shortest paths in a weighted graph.
View Article and Find Full Text PDFhave been studied intensively in the database community and they are an important means to reduce query cost when only the "best" or "most interesting" results are needed instead of the full output. While some optimality results exist, e.g.
View Article and Find Full Text PDFProc ACM SIGMOD Int Conf Manag Data
June 2020
We consider running-time optimization for band-joins in a distributed system, e.g., the cloud.
View Article and Find Full Text PDFDistrib Parallel Databases
September 2019
We consider data analytics workloads on distributed architectures, in particular clusters of commodity machines. To find a job partitioning that minimizes running time, a cost model, which we more accurately refer to as makespan model, is needed. In attempting to find the simplest possible, but sufficiently accurate, such model, we explore piecewise linear functions of input, output, and computational complexity.
View Article and Find Full Text PDFWe study distributed equi-join computation in the presence of join-attribute skew, which causes load imbalance. Skew can be addressed by more fine-grained partitioning, at the cost of input duplication. For random load assignment, e.
View Article and Find Full Text PDFProc Int World Wide Web Conf
April 2018
Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called "heterogeneous information networks" or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top- matches according to a ranking function over edge and node weights. For users, it is difficult to select value .
View Article and Find Full Text PDFProc 5th Int Workshop Explor Search Databases Web (2018)
June 2018
We recently proposed the notion of , together with the KARPET algorithm, for tree-pattern search in labeled graphs. Any- extends top- by not requiring a pre-specified value of . Instead, an any- algorithm returns as many of the top-ranked results as possible, for a given time budget.
View Article and Find Full Text PDFBig Data Anal Knowl Discov (2017)
August 2017
We consider data analytics workloads on distributed architectures, in particular clusters of commodity machines. To find a job partitioning that minimizes running time, a cost model, which we more accurately refer to as makespan model, is needed. In attempting to find the simplest possible, but sufficiently accurate, such model, we explore piecewise linear functions of input, output, and computational complexity.
View Article and Find Full Text PDFThe distributions of animal populations change and evolve through time. Migratory species exploit different habitats at different times of the year. Biotic and abiotic features that determine where a species lives vary due to natural and anthropogenic factors.
View Article and Find Full Text PDF