Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education

2015 Academic Seminars


Data Mining on Cloud Computing Platforms: Challenges and Solutions

Time:
Venue: Third-floor conference room, School of Computer Science, Jiulonghu Campus

Abstract:

  Cloud computing has emerged rapidly as a paradigm of on-demand access to computing, data, and software utilities under a usage-based billing model. Users essentially rent resources and pay for what they use, and everything, including software, platform, and infrastructure, is delivered as a service. Many massive-data applications, including data mining, should be ideal applications for cloud platforms. However, with current cloud programming models, complicated data mining algorithms cannot be implemented easily or executed efficiently on many cloud platforms. In this talk, I will review different massively parallel computing platforms and compare the computing domains and programming models on these platforms, their limitations, and potential solutions, especially for data mining applications. In particular, I will point out the shortcomings and limitations of current cloud computing programming models for typical data mining algorithms and propose possible solutions. The current MapReduce model and its variants have succeeded in data-parallel applications such as database operations and web search; however, they are still not effective for applications with heavy data dependencies, such as data mining and graph applications. We propose several approaches to solving this problem: extending current programming models, automatically translating sequential code into cloud code, building simple APIs and frameworks on top of current cloud models, detecting data and task parallelism, and scheduling that parallelism efficiently. Some preliminary theoretical and experimental results will also be reported in this talk.

Speaker biography:

   Yi Pan is a Distinguished University Professor in the Department of Computer Science and an Interim Associate Dean at Georgia State University, USA. Dr. Pan received his B.Eng. and M.Eng. degrees in computer engineering from Tsinghua University, China, in 1982 and 1984, respectively, and his Ph.D. degree in computer science from the University of Pittsburgh, USA, in 1991. He has been profiled as a distinguished alumnus in both the Tsinghua Alumni Newsletter and the University of Pittsburgh CS Alumni Newsletter. Dr. Pan's research interests include parallel and cloud computing, wireless networks, and bioinformatics. He has published more than 180 journal papers, over 50 of them in various IEEE journals, and over 150 papers in refereed conferences. He has also co-authored or co-edited 40 books. His work has been cited more than 5,000 times. Dr. Pan has served as an editor-in-chief or editorial board member for 15 journals, including 7 IEEE Transactions. He is the recipient of many awards, including the IEEE Transactions Best Paper Award, IBM Faculty Award, JSPS Senior Invitation Fellowship, IEEE BIBE Outstanding Achievement Award, NSF Research Opportunity Award, and AFOSR Summer Faculty Research Fellowship. He has organized many international conferences and delivered over 40 keynote speeches at international conferences around the world.
   
