Abstract:
Iterative algorithms are pervasive in applications such as search engines, machine learning, and recommendation systems. These applications typically operate on massive datasets, so fast iterative computation over such data is essential. This is particularly important for online queries such as keyword-based search queries. In this talk, we present an overview of the MapReduce framework and propose two models, iMapReduce and pMapReduce, that enable fast iterative computation. By supporting iterative computation and prioritized execution, these models speed up convergence of the iterative process. Both iMapReduce and pMapReduce preserve the MapReduce distributed computing framework and are particularly efficient for online queries such as top-k queries. We implement iMapReduce and pMapReduce on Apache Hadoop and evaluate their performance. Our evaluation results show that pMapReduce can reduce computation time by a factor of 5 to 50 compared with MapReduce. At the end of the talk, I will provide an overview of ongoing projects in my research group.
Speaker Bio:
Lixin Gao is a professor of Electrical and Computer Engineering at the University of Massachusetts at Amherst. She received her Ph.D. degree in computer science from the University of Massachusetts at Amherst in 1996. Her research interests include social networking, Internet routing, network virtualization, and cloud computing. Between May 1999 and January 2000, she was a visiting researcher at AT&T Research Labs and DIMACS. She received an NSF CAREER Award in 1999 and was an Alfred P. Sloan Fellow from 2003 to 2005. She won the best paper award from IEEE INFOCOM 2010 and the Test-of-Time award from ACM SIGMETRICS 2011, received the Chancellor's Award for Outstanding Accomplishment in Research and Creative Activity in 2010, and is a fellow of IEEE.