Publications by authors named "Yangqiu Song"

Article Synopsis
  • Images can provide important information to help computers understand things better.
  • Current methods for getting this information are limited because they use specific formats or types of relationships.
  • OpenVik is a new tool that finds and generates useful information from images without strict formats, making it more flexible and improving how computers can use visual information.

Visualization recommendation, or automatic visualization generation, can significantly lower the barriers for general users to rapidly create effective data visualizations, especially for users without a background in data visualization. However, existing rule-based approaches require visualization experts to tediously specify visualization rules by hand. Other machine learning-based approaches often work like black boxes, making it difficult to understand why a specific visualization is recommended and limiting their wider adoption.

As a fundamental task, document similarity measurement has a broad impact on document-based classification, clustering, and ranking. Traditional approaches represent documents as bags of words and compute document similarities using measures such as cosine, Jaccard, and Dice. However, entity phrases, rather than single words, can be critical for evaluating document relatedness.
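
The traditional bag-of-words measures named above can be sketched as follows. This is a minimal illustration, not the authors' method: the helper names (`bag_of_words`, `cosine`, `jaccard`, `dice`) and the two toy documents are assumptions made for the example, and tokenization is a plain whitespace split.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenize, and count terms (illustrative tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard coefficient on the sets of distinct terms."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def dice(a, b):
    """Dice coefficient on the sets of distinct terms."""
    set_a, set_b = set(a), set(b)
    total = len(set_a) + len(set_b)
    return 2 * len(set_a & set_b) / total if total else 0.0


# Hypothetical documents: they mention different entities
# ("new york times" vs. "new york city") but share most single words.
doc1 = bag_of_words("new york times reports")
doc2 = bag_of_words("new york city reports")

cos_score = cosine(doc1, doc2)
jac_score = jaccard(doc1, doc2)
dice_score = dice(doc1, doc2)
print(cos_score, jac_score, dice_score)
```

Because all three measures look only at single words, the two toy documents score as fairly similar even though they refer to different entities, which is the limitation the abstract points to.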

One of the key obstacles to making learning protocols realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. We consider a framework that uses world knowledge as indirect supervision. World knowledge is general-purpose knowledge that is not designed for any specific domain.

ImageHive communicates information about an image collection by generating a summary image that preserves the relationships between images and avoids occluding their salient parts. It first uses a constrained graph-layout algorithm to preserve image similarities and keep important parts visible, and then a constrained Voronoi tessellation algorithm to locally refine the layout and tile the image plane.

Understanding how topics evolve in text data is an important and challenging task. Although much work has been devoted to topic analysis, the study of topic evolution has largely been limited to individual topics. In this paper, we introduce TextFlow, a seamless integration of visualization and topic mining techniques, for analyzing various evolution patterns that emerge from multiple topics.

Spectral clustering algorithms have been shown to be more effective in finding clusters than some traditional algorithms, such as k-means. However, spectral clustering suffers from a scalability problem in both memory use and computational time when the size of a data set is large. To perform clustering on large data sets, we investigate two representative ways of approximating the dense similarity matrix.
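
The abstract excerpt does not name the two approximations it investigates. As a hedged illustration only, the sketch below shows one commonly used way to approximate a dense similarity matrix: keeping each point's k most similar neighbors and discarding the rest. Whether this corresponds to either method in the paper is an assumption, and the function name `knn_sparse_similarity`, the Gaussian kernel, and the toy points are all hypothetical.

```python
import math


def knn_sparse_similarity(points, k, sigma=1.0):
    """Return a symmetric sparse similarity structure {i: {j: similarity}}.

    Each point keeps only its k most similar neighbors under a Gaussian
    kernel (an assumed choice). This naive version still computes every
    pairwise distance, so it reduces memory use, not running time.
    """
    n = len(points)
    sparse = {i: {} for i in range(n)}
    for i in range(n):
        sims = []
        for j in range(n):
            if i == j:
                continue
            dist_sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            sims.append((math.exp(-dist_sq / (2 * sigma ** 2)), j))
        sims.sort(reverse=True)
        for sim, j in sims[:k]:
            # Store both directions so the result stays symmetric.
            sparse[i][j] = sim
            sparse[j][i] = sim
    return sparse


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
S = knn_sparse_similarity(points, k=2)

nonzeros = sum(len(row) for row in S.values())
dense_entries = len(points) * (len(points) - 1)
print(nonzeros, "stored entries instead of", dense_entries)
```

In this toy case the stored entries connect only points within the same group, so the sparse structure keeps the cluster information while storing fewer values than the dense matrix.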
