Big Data Analytics
In addition to the constantly growing volumes of proprietary transaction, product, inventory, customer, competitor, and industry data collected from enterprise systems, organizations also face overwhelming amounts of data from the Web, social media, mobile sources, and sensor networks that do not fit into traditional databases in terms of volume, velocity, and variety (the three Vs of Big Data). This flood of Big Data poses challenges, but if managed and analyzed properly it also offers opportunities to derive new, actionable knowledge and intelligence in a timely manner. This course explores existing and emerging methods to manage, integrate, analyze, and visualize domain-specific Big Data, and to identify and provide domain-specific solutions.
This course covers the research issues and practical methods of managing and analyzing Big Data to discover insights, patterns, and knowledge nuggets that can support decision makers.
- 2019-01-31 Introduction, defining Big Data, core trade-offs, theoretical statistics, online and streaming learning.
- 2019-02-07 (Ping Ji) High velocity classical data, anomaly detection. Traffic analysis. High speed, low dimension.
- 2019-02-14 Feature selection, complexity reduction. Machine learning techniques: clustering, classification.
- 2019-02-21 Software overview: TensorFlow, Google BigQuery, Hadoop, Spark.
- 2019-02-28 (Denis Khryashchev) Spatial data, Transportation research.
- 2019-03-07 (Ping Ji) Network data, graph analytics. Facebook graph.
- 2019-03-14 Financial data, time series.
- 2019-03-21 Text data. Twitter firehose. Corpus study.
- 2019-03-28 Media data. Flickr and YouTube datasets. The Netflix Challenge.
- 2019-04-04 (Ping Ji) Transparency, Security and Privacy.
- 2019-04-11 Accountability, Fairness, Responsible data analytics.
- 2019-04-18 High complexity: Topological Data Analysis. Manifold learning.
- 2019-05-02 (Ping Ji) Presentations
After taking this course, you will be able to...
- ...identify research challenges or application utilities within big data analytics.
- ...evaluate and select appropriate tools for performing big data analytics.
- ...communicate analysis results.
- ...present existing and/or innovative methods and algorithms for designing analytics solutions.
This course will be evaluated on:
- 40% Written report: either performing an in-depth analysis of a particular data source, or explaining in detail a technique not handled in class.
- 20% 10-minute in-class presentation of report content.
- 20% Take-home lab assignments (primarily through Kaggle competition participation).