Computer Systems Technology Colloquium Series presents:
Big Data Challenges and Solutions
Computer Systems Technology
New York City College of Technology
Thursday, April 16, 2015, 12-1 pm
Light refreshments will be served!
Big data is set to offer tremendous insight. But with terabytes and petabytes of data pouring into organizations today, traditional architectures and infrastructures are not up to the challenge. This raises the question: how do you present big data in a way that can be quickly understood and used? These data present tremendous opportunities in data mining, a burgeoning field in computer science that focuses on developing methods to extract knowledge from data. In many real-world problems, data mining algorithms have access to massive amounts of data, but mining all of it is prohibitive due to computational (time and memory) constraints. Much of the current research is concerned with scaling up data mining algorithms (i.e., improving existing algorithms so that they can handle larger datasets). An alternative approach is to scale down the data. Thus, determining the smallest sufficient training set size that achieves the same accuracy as the entire available dataset remains an important research question. Our research focuses on selecting how many instances to sample and present to the data mining algorithm, and on how to improve the quality of the data.
Dr. Ashwin Satyanarayana is an Assistant Professor in the Computer Systems Technology department at CityTech. Prior to joining CityTech, Ashwin was a Research Scientist at Microsoft, where he worked on several Big Data problems, including Query Reformulation on Microsoft’s search engine, Bing. Ashwin’s prior experience also includes serving as a Senior Research Scientist in the area of Location Analytics at Placed Inc. He holds a PhD in Computer Science (Data Mining) from SUNY, with particular emphasis on Data Mining, Machine Learning, and Applied Probability, with applications in Real-World Learning Problems.