Pattern Recognition, Indexing and Social Media Analysis
This research project proposes to study, design, implement, and test 3D shape matching algorithms based on local descriptors, with a special emphasis on efficiency issues. The project is divided into three main topics: applying the metric space approach to 3D object retrieval, developing a hierarchical matching scheme using local features, and developing similarity search algorithms over 3D objects modeled as point clouds. We will test and validate our proposed algorithms through extensive experimental evaluations using standard 3D collections (e.g., the collections from the SHREC Contests), as well as our own data collections, created specifically for testing efficiency and effectiveness. As a result of this research project, we expect to develop efficient algorithms for 3D shape matching that are able to process and query data collections of tens of thousands of 3D models.
Similarity search in multimedia database systems is becoming increasingly important, due to a rapidly growing amount of available multimedia data like images, audio files, video clips, 3D objects, time series, and text documents. As we see progress in the fields of acquisition, storage, and dissemination of various multimedia formats, the application of effective and efficient database management systems becomes indispensable in order to handle these formats. Application domains for multimedia databases include molecular biology, medicine, geographical information systems, Computer Aided Design/Computer Aided Manufacturing (CAD/CAM), and virtual reality.
Many of these practical applications have in common that the objects of the database are modeled in a metric space, i.e., it is possible to define a non-negative real-valued function on pairs of objects, called a metric, that satisfies the properties of strict positiveness, symmetry, and the triangle inequality. The main motivations for modeling multimedia databases as metric spaces are: the modeling is usually fast and easily parameterizable; there are many metric functions that can be efficiently computed; and metric spaces are easily indexable by metric access methods.
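As a minimal illustration of these properties, the following sketch checks them numerically for the Euclidean distance on a small sample of points. The function names and the sample data are illustrative assumptions, not part of the project, and passing the check on a sample does not prove that a function is a metric.

```python
import itertools
import math

def euclidean(x, y):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def is_metric_on_sample(dist, points, tol=1e-9):
    """Check the metric properties of `dist` on all pairs and triples of `points`."""
    for x, y in itertools.product(points, repeat=2):
        d = dist(x, y)
        if x == y and abs(d) > tol:       # d(x, x) = 0
            return False
        if x != y and d <= 0:             # strict positiveness
            return False
        if abs(d - dist(y, x)) > tol:     # symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z) + tol:   # triangle inequality
            return False
    return True

# Hypothetical sample of distinct 2-D points.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
print(is_metric_on_sample(euclidean, points))
```

The squared Euclidean distance is a common example of a function that fails this check, because it violates the triangle inequality on collinear points.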
A recent proposal to improve the effectiveness (i.e., the quality of the retrieved answer) of similarity search resorts to the use of combinations of metrics. Instead of using a single metric to compare two objects, the search system uses a linear combination of metrics (also known as a multi-metric) to compute the (dis)similarity between two objects. This novel framework for searching in multimedia databases has been shown to provide considerable improvements in the effectiveness of the search. However, there are still many research topics within this framework that must be addressed to reach its full potential.
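A linear combination of metrics can be sketched as follows, assuming each object is described by several feature vectors (one per descriptor) and each descriptor has its own metric. The descriptor layout, the choice of L1 and L2, and the weights are illustrative assumptions; how the weights should be computed is one of the open questions of the framework.

```python
import math

def l1(x, y):
    """Manhattan distance."""
    return sum(abs(a - b) for a, b in zip(x, y))

def l2(x, y):
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def multi_metric(obj_a, obj_b, metrics, weights):
    """Weighted sum of per-descriptor distances.

    obj_a, obj_b: lists with one feature vector per descriptor.
    metrics:      one distance function per descriptor.
    weights:      one non-negative weight per descriptor.
    """
    return sum(w * m(fa, fb)
               for w, m, fa, fb in zip(weights, metrics, obj_a, obj_b))

# Two hypothetical objects, each described by a 2-D and a 3-D feature vector.
obj_a = [(0.0, 0.0), (1.0, 1.0, 1.0)]
obj_b = [(3.0, 4.0), (1.0, 2.0, 3.0)]
print(multi_metric(obj_a, obj_b, metrics=[l2, l1], weights=[0.5, 0.5]))
```

With non-negative weights, the combined function keeps the symmetry and triangle inequality of its components, which is what allows metric access methods to index it.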
This research project proposes to study several improvements to the multi-metric framework. Our goal is to produce several novel algorithms and methods aimed to improve the effectiveness as well as the efficiency of similarity search in multimedia databases based on the multi-metric approach. The main ideas that we plan to pursue are: new methods for computing the combinations of metrics, several new index structures for supporting multi-metric spaces, a novel approach that uses sets of indices for improving the efficiency of the search in multi-metric spaces, and new algorithms for similarity search with improved space cost. Additionally, we plan to study other novel trends for searching in multimedia databases.
To achieve our goals, we plan to develop and implement several novel algorithms and data structures, and then to perform exhaustive experimental evaluations of the proposed methods compared with state-of-the-art techniques, both in terms of their efficiency (CPU or I/O time) and their effectiveness (using tools that come from the information retrieval community). With these extensive evaluations, we expect to assess the real gains provided by the new techniques for searching in multimedia databases.
We divide our proposed ideas into two main areas: those oriented to improving the effectiveness of the search, and those oriented to improving the efficiency of the search.
Improving the effectiveness of multimedia databases is related to:
Improving the efficiency of multimedia databases, in turn, requires in-depth analysis of:
As a result of this research project, we expect to make further advances in the theoretical foundations of multimedia databases, producing several scientific publications to present our results and producing implementations of our developed techniques. The methods that we plan to research are general and can be used with any particular multimedia database. That is, we are not restricting ourselves to proposing solutions for a specific multimedia data type; our results could be applied to any kind of multimedia database.
Unsupervised multimedia content mining aims at discovering, in an unsupervised manner, repeating motifs within multimedia data such as video or speech. It is an emerging field that has received limited attention so far in spite of numerous potential applications. MAXIMUM aims at proposing, studying and evaluating efficient and scalable approaches to unsupervised motif discovery in multimedia sequences, both from a fundamental point of view and an application point of view. We will develop fundamental technology at the frontier of multimedia content analysis, multimedia indexing, database management and bioinformatics technology to propose new scalable approaches to motif discovery and temporal sequence indexing in a multimedia framework. In particular, we will investigate indexing techniques for high dimensional temporal sequences and symbolic representations of multimedia data, two key techniques for efficient motif discovery. Based on these techniques and on previous work, we will study efficient algorithmic architectures for motif discovery at scale in multimedia data, eventually accounting for variability in motifs. Achievements will be instantiated and demonstrated in a number of tasks, from efficient retrieval of multimedia sequences to variability tolerant unsupervised discovery of repeating motifs. The project will bring together Brazilian, Chilean and French labs, all with a strong background in multimedia content analysis, indexing and mining, with the dual goal of reinforcing existing collaborations and of fostering new ones so as to gain leadership in an emerging research topic.
May 2014: Kick-off meeting
The kick-off meeting of the Project MAXIMUM was held by videoconference on May 12th. During this videoconference, we discussed our current research topics related to this project, and decided to hold a Project’s Workshop in October in Brazil.
September 2014: Visit of Prof. Eduardo Valle to DCC
Prof. Eduardo Alves do Valle Jr. (UNICAMP) made a research visit to the Department of Computer Science (DCC), University of Chile. This mission lasted 10 days (Sept. 3rd to Sept. 12th). During his stay, Prof. Valle gave the talk “Advances on Locality-Sensitive Hashing for Large-Scale Indexing on General Metric Spaces” at DCC.
October 2014: Project's Workshop and Visit to Belo Horizonte
Benjamin Bustos and Juan Manuel Barrios travelled to Belo Horizonte, Brazil, to participate in the Project’s Workshop, held at the Federal University of Minas Gerais (UFMG). This mission lasted 10 days (Oct. 6th to Oct. 15th). During this mission, we had talks with Arnaldo de A. Araújo (UFMG), Jefersson Alex dos Santos (UFMG), Eduardo Valle (UNICAMP), Silvio Jamil Guimaraes (PUC Minas), Guillaume Gravier (IRISA), Simon Malinowski (IRISA), and with several graduate students from UFMG. Benjamin Bustos gave the seminar “Anomaly Detection in Streaming Time Series Based on Bounding Boxes” at the Workshop. Also, Juan Manuel Barrios gave the seminar “Search in Catalogs and TV Monitoring” at the Workshop.
Additionally, Benjamin Bustos was part of the Ph.D. Thesis Committee of Carlos A. F. Pimentel Filho (Ph.D. student at UFMG). Dr. Pimentel successfully defended his doctoral thesis on October 8th.
January 2015: Doctoral Proposal Exam of Heider Sanchez
The Doctoral Proposal Exam of Heider Sanchez, Ph.D. student at the University of Chile (advisor: Benjamin Bustos), took place on January 30th. The external member of the Exam Committee was Guillaume Gravier (IRISA), who kindly accepted the invitation from our Ph.D. Graduate Committee. The exam took place via videoconference.
May 2015: Visit to IRISA, Rennes, France
Benjamin Bustos and Juan Manuel Barrios travelled to Rennes, France, to visit the research group led by Guillaume Gravier at IRISA. This mission lasted 6 days (May 17th to May 22nd). During this mission, we had research meetings with Guillaume Gravier (IRISA), Simon Malinowski (IRISA), and Laurent Amsaleg (IRISA). Also during the visit, Benjamin Bustos gave the seminar “Anomaly Detection in Streaming Time Series Based on Bounding Boxes”, and Juan Manuel Barrios gave the seminar “Search in Catalogs and Video Analysis”.
As a result of this mission, we started a collaboration with Simon Malinowski on Time Series Mining, the main research topic of our Ph.D. student Heider Sanchez.
December 2015: Visit to Santiago, Chile, and Project's Workshop
During the week of Dec. 7th to 11th, 2015, we received the visit of Arnaldo de Albuquerque Araujo (UFMG, Brazil), Silvio Jamil Ferzoli Guimaraes (PUC Minas, Brazil) and Simon Malinowski (IRISA, France). During their visit, we held research meetings at DCC, University of Chile and at ORAND. During their visit to ORAND, Juan Manuel Barrios introduced our international guests to the research projects the company is involved in. At DCC, we organized the Project’s Workshop for year 2015 (Dec. 9th and 10th).
List of talks given at the Workshop:
Title: Similarity search in time series using local features
Abstract: Time series are a useful representation for temporal data. Analysis of this kind of data (e.g., classification and clustering) may require performing a similarity search over a collection of time series. In this talk, we present an efficient approach for approximate search in time series, which extracts local feature vectors to generate a concise representation of each time series called a Feature Signature. It then uses a distance measure to compute the similarity value between signatures. Here we use multidimensional distances such as the Signature Quadratic Form Distance (SQFD), a distance measure proposed for image retrieval. We test our approach on a classification task over a standard reference collection, and show that it is up to three orders of magnitude faster than the best existing configuration of DTW on collections with long time series.
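The distance between two feature signatures can be sketched as below. The sketch assumes a signature is a list of (centroid, weight) pairs and uses a Gaussian similarity function between centroids; both the representation and the `alpha` parameter are assumptions made for illustration, and the extraction of local features from the time series is not shown.

```python
import math

def gaussian_similarity(c1, c2, alpha=1.0):
    """Similarity between two centroids: exp(-alpha * squared L2 distance)."""
    d2 = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return math.exp(-alpha * d2)

def sqfd(sig_a, sig_b, similarity=gaussian_similarity):
    """Signature Quadratic Form Distance between two feature signatures.

    Each signature is a list of (centroid, weight) pairs. The weights of the
    second signature are negated and concatenated to those of the first, and
    the quadratic form is evaluated with the centroid similarity matrix.
    """
    centroids = [c for c, _ in sig_a] + [c for c, _ in sig_b]
    weights = [w for _, w in sig_a] + [-w for _, w in sig_b]
    total = 0.0
    for wi, ci in zip(weights, centroids):
        for wj, cj in zip(weights, centroids):
            total += wi * wj * similarity(ci, cj)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))

# Two hypothetical signatures with 2-D centroids and weights summing to 1.
sig_a = [((0.0, 0.0), 0.6), ((1.0, 1.0), 0.4)]
sig_b = [((0.1, 0.0), 0.5), ((2.0, 2.0), 0.5)]
print(sqfd(sig_a, sig_b))
```

Because the two signatures may have different numbers of centroids, this distance can compare time series of different lengths once each has been summarized as a signature.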
Title: Query-centered topic discovery from tags in multimedia search results
Abstract: The widespread adoption of on-line digital content sharing platforms, such as YouTube, Flickr and Pinterest, has made today's Web into the largest multimedia repository ever known. In this scenario, users have to deal with an enormous amount of data, making them rely heavily on multimedia search engines to try to find relevant content. Specialized search engines must manage the existing information overload while dealing with the semantic gap. Moreover, the subjectivity of human interpretation of content additionally requires search engines to provide enough diversity in their results for all needs. In order to deal with the diversity of information needs in multimedia content, several search engines show results grouped under common subtopics. In most cases these groups have been precomputed for popular queries, or according to visual similarity, surrounding text and/or tags. Nevertheless, in some cases these precomputed categories do not correspond to the most important topics found in a set of results, or can be biased towards trendy subtopics. Therefore, creating an on-line, simple, efficient and semantically meaningful method for obtaining relevant topics for any query is very useful for any retrieval task. We address this challenge by proposing a non-parameterized algorithm for automatic query-centered topic discovery, based on community detection over a tag-graph representation of a query's search results. Our method determines semantically relevant topics for a query, while at the same time removing irrelevant tags given by the bias that exists in how each user tags their content. We show that densely connected sets of tags are good semantic topic descriptors and, in the case of multimedia content, also good visual topic descriptors.
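The tag-graph idea can be illustrated with a much simpler stand-in for community detection: build a tag co-occurrence graph from the search results, drop weak edges, and report the connected components as topics. This is not the algorithm of the talk (which is non-parameterized, whereas this sketch needs a `min_weight` threshold); all names and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def tag_graph(results):
    """Count how often each pair of tags co-occurs in the same result."""
    weights = defaultdict(int)
    for tags in results:
        for a, b in combinations(sorted(set(tags)), 2):
            weights[(a, b)] += 1
    return weights

def topics(results, min_weight=2):
    """Connected components of the tag graph after removing weak edges."""
    adjacency = defaultdict(set)
    for (a, b), weight in tag_graph(results).items():
        if weight >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:                      # depth-first traversal
            tag = stack.pop()
            if tag in group:
                continue
            group.add(tag)
            stack.extend(adjacency[tag] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

# Hypothetical tag lists of five results returned for the query "jaguar".
results = [
    ["jaguar", "car", "luxury"],
    ["car", "luxury", "speed"],
    ["jaguar", "cat", "wildlife"],
    ["cat", "wildlife", "zoo"],
    ["jaguar", "selfie"],
]
print(topics(results))
```

Tags that rarely co-occur with others, such as `selfie` here, end up in no group, which loosely mirrors the removal of irrelevant tags described in the abstract.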
Title: Image descriptor extraction using an Android mobile device for a video search system
Abstract: This work relates to the problem of identifying an unlabeled video using a mobile device. The goal is to build an application able to record a video and to compute its visual descriptors in the mobile device. Then, these data are sent to a server that performs the search with a database of previously computed video descriptors. Three types of global descriptors were programmed and compared in terms of their efficacy.
Title: Time series classification using SIFT-like descriptors
Abstract: Time series classification is an application of particular interest with the increase of data to monitor. Classical techniques for time series classification rely on point-to-point distances. Recently, Bag-of-Words approaches have been used in this context. Words are quantized versions of simple features extracted from sliding windows. The SIFT framework has proved efficient for image classification. In this talk, I will talk about the design of a time series classification scheme that builds on the SIFT framework adapted to time series to feed a Bag-of-Words. Experimental results show competitive performance with respect to classical techniques.
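The general Bag-of-Words pipeline over sliding windows can be sketched as follows. For brevity the sketch uses a simple (mean, slope) feature per window instead of the SIFT-like descriptors of the talk, and takes the codebook as given; in practice the codebook would typically be learned by clustering training features. All names and values are illustrative assumptions.

```python
from collections import Counter

def window_features(series, width, step=1):
    """Yield a (mean, slope) feature for each sliding window (width >= 2)."""
    for start in range(0, len(series) - width + 1, step):
        w = series[start:start + width]
        mean = sum(w) / width
        slope = (w[-1] - w[0]) / (width - 1)
        yield (mean, slope)

def nearest_word(feature, codebook):
    """Index of the codeword closest to `feature` (squared L2 distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2
                                 for a, b in zip(feature, codebook[k])))

def bag_of_words(series, codebook, width=4, step=1):
    """Normalized histogram of codeword occurrences over all windows."""
    counts = Counter(nearest_word(f, codebook)
                     for f in window_features(series, width, step))
    total = sum(counts.values())
    return [counts[k] / total for k in range(len(codebook))]

# Hypothetical codebook: a rising, a flat and a falling pattern.
codebook = [(1.5, 1.0), (3.0, 0.0), (1.5, -1.0)]
series = [0, 1, 2, 3, 3, 3, 3, 2, 1, 0]
print(bag_of_words(series, codebook))
```

The resulting fixed-length histogram can be fed to any standard classifier, regardless of the length of the original series.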
Title: Social Signal Burst Analysis for Online Emerging Event Detection
Abstract: Online social network users generate an extremely large volume of data daily on the Web. In particular, the microblogging platform Twitter is characterized by short-text message exchanges at high rates among a network of millions of active users. In this type of scenario, the detection of relevant emerging events in text streams becomes essential for identifying relevant topic trends and breaking news. In this area, offline approaches for event detection constitute a well-established research field. However, research on online approaches for emerging event detection in big streaming data has only recently started to unfold. It is within this research that scalability, efficiency and speed become key issues. To address this problem we propose a generic online algorithm to detect relevant positive frequency changes, or bursts, in sets of social-based signals. This algorithm is designed to report, in close to real-time, bursts in any set of discrete signals over time. By monitoring frequency changes using sliding windows, our method is able to monitor a large number of simultaneous signals. The worst case complexity of the algorithm is O(n log n), where n is the number of messages per window. Our hypothesis is that bursts in certain types of social signals are related to emerging events, and that aggregation of bursts from different signals can allow fast identification of important events. In order to validate our approach we study event detection based on arrival rate of keyword signals and of sentiment signals in streaming text. We compare the performance of our method for the task of keyword burst detection to state-of-the-art online approaches on two datasets, with 91% precision in the best case and 36% in the average case against LDA (commonly used as an offline ground truth), showing a significant improvement over other approaches at much less cost.
In addition, we analyze sentiment signals and observe that bursts in this domain are strongly correlated to keyword bursts and to overall topic popularity in the network. These results indicate that our algorithm quickly detects the moments in which emerging events start to happen. For the future, we are working on using our method to monitor signals representing natural disasters, such as earthquakes.
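Window-based burst detection can be sketched with a simple rule: a signal bursts in a window when its count is well above its counts in the preceding windows. The thresholding rule (mean plus `k` standard deviations), the parameters and the sample stream below are assumptions made for illustration; they are not the algorithm presented in the talk and do not reproduce its complexity bound.

```python
import math
from collections import Counter, defaultdict, deque

def detect_bursts(messages, window, history=3, k=2.0, min_count=3):
    """Return (window_index, signal) pairs whose count jumps above recent history.

    messages: iterable of (timestamp, signal) pairs.
    window:   window length, in the same unit as the timestamps.
    """
    counts = defaultdict(Counter)
    for timestamp, signal in messages:
        counts[int(timestamp // window)][signal] += 1
    if not counts:
        return []

    first, last = min(counts), max(counts)
    past = {}        # signal -> counts in the previous `history` windows
    bursts = []
    for w in range(first, last + 1):
        current = counts.get(w, Counter())
        for signal in sorted(set(current) | set(past)):
            # A signal seen for the first time is assumed silent before.
            recent = past.setdefault(
                signal, deque([0] * history, maxlen=history))
            c = current[signal]
            mean = sum(recent) / history
            std = math.sqrt(sum((h - mean) ** 2 for h in recent) / history)
            warmed_up = w - first >= history   # skip the first windows
            if warmed_up and c >= min_count and c > mean + k * std:
                bursts.append((w, signal))
            recent.append(c)
    return bursts

# Hypothetical stream: "rain" is steady, "quake" spikes in the fifth window.
messages = (
    [(w * 10 + 1, "rain") for w in range(5)]
    + [(w * 10 + 2, "quake") for w in range(4)]
    + [(40 + i, "quake") for i in range(6)]
)
print(detect_bursts(messages, window=10))
```

Keyword signals and sentiment signals can both be fed through the same function, since it only requires a discrete signal label per message.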
Title: IMGpedia - enriching the semantic web with content-based multimedia analysis
Abstract: With the Semantic Web standards, today we can make complex queries over the information in Linked Data. However, the web contains much more information than plain text and its metadata. It also includes videos, images, music and other media. Multimedia data is often not considered when solving a query, although it could be very relevant for the users of the web. In this talk I will present IMGpedia, a new ontology that mixes the RDF triples of DBpedia with content-based descriptors of the images of Wikipedia articles. With this new knowledge base, a unified query system can be built for requesting both the data and the multimedia descriptors, providing a holistic experience when querying the web and aiming to reduce the semantic gap.
Title: Spatio and Temporal Characterization of Chilean News Events in Social Media
Abstract: Online Social Networks play a leading role in news consumption. As a consequence, most newspapers and other media use these platforms to promote their content. However, the geographic bias in the media, in addition to the demographic bias in Online Social Networks, can lead to an inaccurate and incomplete view of the news in a country. Being aware of these two kinds of bias in news published in Online Social Networks is useful to understand the context in which events develop. We selected Chile as a case study to observe these problems. Chile is a country with a high degree of participation in Online Social Networks and suffers from both issues: the media cover mostly news from Santiago, its capital, and most Online Social Network users are located in this city. We built a dataset of Chilean news headlines extracted from Twitter.
We conducted a characterization of the news and of the messages that comment on them. We focus on the geographical and temporal features of news. In this paper we present the results of this analysis in addition to the description of the dataset. Our findings show that, as expected, news and Twitter users are mainly concentrated in Chile’s capital. In addition, users in Chile focus on local news, paying little attention to international events. We observed that a considerable number of users discussing Chilean news are located outside of Chile. We conclude that users in Chile are subject to bias in news media coverage of information, which privileges news from the largest cities.
Title: Galean: A Visual Tool for Geotemporal Analysis of News Events Extracted from Online Social Networks
Abstract: Messages shared on Online Social Networks provide insights into how news events impact people, countries and other entities. As the volume of content posted on these platforms increases at high rates, the task of understanding relevant aspects of this information becomes extremely complex. Following this motivation, we present Galean, a Visual Analytics tool created to assist users in the search, exploration and analysis of up-to-date and historical real-world news events posted on Twitter, a popular social media service. In this talk we will present its architecture, two case studies and the results of an evaluation conducted with users.
Title: Search in catalogs and Video Linking
Abstract: In this talk I'll present the research we do at ORAND in the computer vision area, focusing on two specific problems: finding products in retail stores and the automatic linking of TV videos. I'll also introduce a service we are currently working on for indexing and searching in image databases.
Title: Human action classification using video segmentation as a preprocessing step
Abstract: Human action classification in videos is the process of automatically naming human actions by using the video content. The common strategy to cope with this kind of classification is based on feature extraction computed directly from the video, followed by the action classification. However, the number of features to be used in this classification may be very high due to the presence of noise and redundant information. Thus, to reduce the number of features without decreasing accuracy, we have been considering a hierarchical video segmentation as a preprocessing step. Hierarchical video segmentation provides a region-oriented scale-space, i.e., a set of video segmentations at different detail levels in which the segmentations at finer levels are nested with respect to those at coarser levels. Hierarchical methods have the interesting property of preserving spatial and neighboring information among segmented regions. Here, we transform the hierarchical video segmentation into a graph partitioning problem in which each part corresponds to one region of the video.
Title: Video similarity search by using compact representations
Abstract: The number of applications using unstructured data, such as videos, has increased, and research on multimedia retrieval has attracted great attention. The need to efficiently index and retrieve this kind of data is of great concern, because common search approaches based on keywords are not adequate for large video databases. Similarity search is a content-based approach and it has been successfully used in retrieval systems. Accordingly, a major challenge is to provide an accurate and compact video representation that can achieve good performance with a fast answer in this type of search. In this work, we propose a compact video representation by using Min-Hash and the k-nearest GIST descriptors. Furthermore, we also present the first use of the BossaNova Video Descriptor (BNVD) for video similarity search. Both compact video representations have shown more than 88% mean average precision on video similarity search. The experimental results indicate high efficiency of our proposed representations in the video retrieval task.
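The Min-Hash part of such a compact representation can be sketched as follows, assuming each video has already been reduced to a set of integer visual-word identifiers (how those identifiers are derived from GIST descriptors is not shown and is outside this sketch). The hash family, the number of hash functions and the sample sets are illustrative assumptions.

```python
import random

PRIME = 2_147_483_647  # 2**31 - 1, larger than any word id used below

def make_hash_params(num_hashes, seed=0):
    """Random (a, b) pairs defining hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]

def minhash_signature(word_ids, params):
    """Compact signature: the minimum hash value of the set for each function."""
    return [min((a * w + b) % PRIME for w in word_ids) for a, b in params]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

params = make_hash_params(num_hashes=256)

# Hypothetical visual-word sets of two videos; their true Jaccard index is 1/3.
video_a = set(range(0, 100))
video_b = set(range(50, 150))
sig_a = minhash_signature(video_a, params)
sig_b = minhash_signature(video_b, params)
print(estimated_jaccard(sig_a, sig_b))
```

Each video is then stored as a fixed number of integers, and signatures can be compared without accessing the original descriptor sets.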
As a result of the visit of our international collaborators, we made plans to work on two papers about Time Series Mining with Simon Malinowski, and to work on one paper with Silvio Jamil Ferzoli Guimaraes about sketch-based image retrieval. Also, we discussed the possibility of Heider Sanchez continuing his research activities as a postdoc in the group of Silvio Jamil Ferzoli Guimaraes at PUC Minas, after he finishes his Ph.D. (planned for mid-2016).