When X Meets Y and Beyond

University of Waterloo’s Dr. Tamer Özsu Turns to Graphs to Derive Meaning from Big Data

Tamer Özsu has a passion for pens. Fountain pens in particular. They are scattered across his desk by the dozen and displayed in a running slideshow on his office computer. He has a collection of more than 700 pens at home, and admits that there’s nothing he enjoys more than heading to the basement on a quiet evening to capture his thoughts by putting pen to fine paper.

But Dr. Özsu’s ‘day job’ as a Professor of Computer Science and Associate Dean of Research in the Faculty of Mathematics at the University of Waterloo demands a level of abstract thinking that is about as far from pen and paper as one can get.

Dr. Özsu’s research is focused on the field of data management, following two threads: large-scale data distribution and the management of non-traditional data.

Within the realm of Big Data, Dr. Özsu's recent research centers on graph data. Graph representations are widely used today to capture the structure of large data sets, and graphs have long been an important data type for data researchers. With the growth of social networks such as Facebook and Twitter and of web data, particularly the semantic web, along with the prominence of search engines such as Google and Bing and the emergence of online information ‘storehouses’ such as Wikipedia, interest in managing very large graphs has accelerated in step with the pace of market change, says Dr. Özsu. He and his UW counterparts are now finding the research results in this area to be much in demand from large corporate entities, including Google and SAP, as well as from many startups looking to leverage graph representations to understand a myriad of business challenges, from social listening to monitoring Internet traffic to mapping genetic structures.

“By taking information and presenting it as a graph, all kinds of information are subsequently revealed,” explains Özsu. “For instance, how do you find friends of friends of friends on Facebook today? This is actually a graph query that follows the edges, where each node represents a person and each connection represents a relationship. For instance, a search of Tokyo might reveal photos of friends taken in Tokyo shared with you. Google’s PageRank is also a graphical representation, showing which pages connect to each other. And then, once we have that information, which may represent multiple millions of pages, the question becomes how do you best store that information and query it? And in the case of data generated by social networks, how does that process evolve as the data evolves and changes?”
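To make the "friends of friends of friends" query concrete, here is a minimal sketch in Python. It assumes a tiny in-memory graph stored as a dictionary of adjacency lists and uses a breadth-first search capped at three hops. The names (`within_hops`, `friends`, and the people in the sample data) are hypothetical and chosen only for illustration; this is not how Facebook or any production graph system implements such queries, and at the scale Dr. Özsu describes the data would not fit in a single dictionary.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {person: hop_count} for everyone reachable from `start`
    by following at most `max_hops` edges (the start node is excluded)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        # Do not expand nodes that already sit at the hop limit.
        if distance[person] == max_hops:
            continue
        for friend in graph.get(person, ()):
            if friend not in distance:  # first visit gives the shortest hop count
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]
    return distance


# Hypothetical sample data: each node is a person, each edge a friendship.
friends = {
    "ana": ["ben", "cho"],
    "ben": ["ana", "dev"],
    "cho": ["ana"],
    "dev": ["ben", "eli"],
    "eli": ["dev", "fay"],
    "fay": ["eli"],
}

# Friends (1 hop), friends of friends (2), friends of friends of friends (3).
reachable = within_hops(friends, "ana", 3)
```

In this sample, "fay" is four hops from "ana" and so falls outside the result. The harder questions raised in the quote, how to store and query such a graph when it has millions of nodes and keeps changing, are exactly what this toy version leaves out.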

While Dr. Özsu’s work focuses specifically on data representation, storage and querying, his counterparts within the UW Mathematics Faculty cover many of the associated areas of expertise when it comes to Big Data, including data analysis, data cleaning, efficient data processing, data optimization and machine learning. “We have developed a significant core competency in the field of Big Data here at U Waterloo that is the world’s envy,” says Dr. Özsu. Although he admits that the research today is somewhat siloed, there is an opportunity in the future to build an integrated program that creates a Centre of Excellence in Big Data within the University of Waterloo’s Math Faculty.

“We are entering the second golden age of data management,” says Tamer Özsu. “The advancements we are seeing today haven’t been seen since the 1970s and 80s, which were marked by the introduction of relational database technology. The volume, variety and velocity of data today are unprecedented and are presenting fresh new opportunities for research, for commercial enterprise and for the startup community. The datasets are very interesting and complicated and the problems are real. It’s been a good period.”