Please note that this schedule is tentative and subject to change.
| 4:15 - 6:15 PM | Media Theory and History (Tuesday); Advanced Data Analysis Methods (Wednesday) |
| 6:30 - 8:30 PM | Working with Data (Monday); Data, Culture, and Society (Wednesday); Interactive Data Visualization (Thursday) |
This page will be updated soon with course descriptions and registration codes.
DATA 71200 - Advanced Data Analysis Methods #CRN
Wednesday, 4:15 - 6:15 PM, 3 Credits, Rm. TBA, Prof. Johanna Devaney (Johanna.Devaney@brooklyn.cuny.edu)
This course will provide students with the skills necessary to apply machine learning techniques to data, and to interpret and communicate their results. They will also begin to develop intuitions about when machine learning is an appropriate tool versus other statistical methods. This course will cover both supervised methods (e.g., k-nearest neighbors, naïve Bayes classifiers, decision trees, and support vector machines) and unsupervised methods (e.g., principal component analysis, non-negative matrix factorization, and k-means clustering). The supervised methods will focus primarily on "classic" machine learning techniques where features are designed rather than learned, although we will briefly look at recent deep learning models with neural networks. This is an applied machine learning class, which emphasizes the intuitions and know-how needed to get learning algorithms to work in practice, rather than mathematical derivations.
The course will be taught in Python, primarily using the scikit-learn library. The course's main text will be the O'Reilly book "Introduction to Machine Learning with Python" by Sarah Guido and Andreas C. Müller, along with the book's corresponding Jupyter notebooks. We will also be referencing "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman to examine some of the topics in more depth (this book is available for free from the first author's website: https://web.stanford.edu/~hastie/Papers/ESLII.pdf).
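To give a sense of the applied style described above, here is a minimal sketch of a supervised workflow in scikit-learn. It is only an illustration, not course material: the choice of the built-in iris dataset, the 75/25 split, and k = 3 are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset: 150 iris flowers with 4 hand-designed features each.
# (Dataset choice is an assumption for illustration.)
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to estimate performance on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# k-nearest neighbors: label each test flower by a majority vote
# among its 3 closest training points.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

# Fraction of held-out flowers that were classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping `KNeighborsClassifier` for another scikit-learn estimator, such as a decision tree or a support vector machine, leaves the rest of the workflow unchanged, which is part of what makes the library convenient for comparing methods.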
DATA 73200 - Interactive Data Visualization #CRN
Thursday, 6:30 - 8:30 PM, 3 Credits, Rm. TBA, Prof. Aucher Serr (firstname.lastname@example.org)
Interactive Data Visualization is one of the most important forms of communication today — allowing users to better engage with data, detect patterns, and quickly gain insight into complicated topics. This course will introduce students to the tools, skills, and concepts necessary for making state-of-the-art interactive data visualizations. Using web-based technologies including HTML, CSS, and D3.js, students will learn to create engaging and effective information displays, grounded in the science of visual perception and best practices in visual mapping and accessibility. Throughout the semester, students will work towards creating a portfolio of beautiful and analytically sound data visualizations, while also developing their own iterative design process.
DATA 73500 - Working with Data #CRN
Monday, 6:30 - 8:30 PM, 3 Credits, Rm. TBA, Prof. Timothy Shortell (email@example.com)
This course covers the fundamentals of working with data. Students will be introduced to key disciplines that provide techniques used for working with small, medium, and big data today--classical statistics, contemporary data science, machine learning, and data visualization. They will learn about different data types, what constitutes a valid dataset that can be analyzed quantitatively, and how data should be formatted to create a valid dataset. The course will also explore fundamental theoretical questions that arise when we attempt to represent social or cultural phenomena as data. Particular attention will be focused on working with social network services data, user-generated content, and other types of data about societies and individuals that have emerged recently (such as sensor data) and massive media datasets (images, video, text, sound, code, etc.). The course will explore fundamental database technologies and more recent techniques for working with real-time data flows.
The "data revolution" has transformed the way we understand and interact with the world around us. The availability of large datasets, progress in computer hardware and software, and use of the web to share data and acquire it from numerous sources (including social network services, libraries, museums, city governments, non-profits, etc.) have created many new possibilities in many fields including computer science, social science, humanities, business, economics, and medicine. These developments also led to the emergence of a number of new research fields at the end of the 2000s: social computing, computational social science, digital humanities, cultural analytics, and culturomics. This course introduces students to fundamental concepts and practical techniques and skills needed to work with data.
We'll begin with a broader examination of data and society. Then, we'll take a look at some of the tools used by data analysts and data scientists to produce knowledge in various settings. We'll focus mainly on analysis of text in our coding work; this is the best place to begin to understand the choices we make as researchers and analysts in applied settings.
Students will learn the fundamental concepts and skills of data analysis that are required before they can use more advanced analysis techniques or create data visualizations. While focusing on fundamentals, the course also introduces students to new ideas for data analysis, new types and sources of data, and recently emerged fields that take advantage of these sources, of increasing computing power for data processing, and of new comprehensive open-source programming environments for data analysis. After taking this course, students will be able to:
* use quantitative data to research topics in many fields;
* understand both the benefits and limitations of using quantitative methods in research;
* apply concepts and practical techniques for downloading data from various sources, cleaning data, and managing and structuring datasets using tools such as Google Sheets and the R data analysis programming language;
* work in a language such as R or Python, including importing and exporting data in different formats, managing data, selecting parts of a dataset using various conditions, combining datasets, and creating basic graphs.
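As a rough illustration of the last outcome, the sketch below imports a small dataset, selects rows by a condition, combines two tables, and exports the result. It assumes Python with the pandas library, which the course description does not name, and all values and column names are invented for the example.

```python
import io
import pandas as pd

# Invented sample data standing in for a downloaded CSV file.
csv_text = "id,age,score\n1,23,71\n2,35,88\n3,41,64\n4,29,93\n"

# Importing: read CSV text into a table (a DataFrame).
people = pd.read_csv(io.StringIO(csv_text))

# Selecting part of a dataset using a condition: keep rows with score above 80.
high = people[people["score"] > 80]

# Combining datasets: attach a second (also invented) table by matching on "id".
groups = pd.DataFrame({"id": [1, 2, 3, 4], "group": ["a", "b", "a", "b"]})
combined = high.merge(groups, on="id", how="left")

# Exporting: write the combined table back out as CSV text.
csv_out = combined.to_csv(index=False)
print(csv_out)
```

A basic graph could then be drawn from `combined` with a plotting library; that step is left out here to keep the sketch free of extra dependencies.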
DATA 74000 - Data, Culture, and Society #CRN
Wednesday, 6:30 - 8:30 PM, 3 Credits, Rm. TBA, Prof. Katherine Behar (Katherine.Behar@baruch.cuny.edu)
Big data and computational methods for its analysis are changing scientific and humanities research, financial markets, political campaigning, higher education, and countless other areas, and also affect our everyday lives. Our daily existence is increasingly structured by software systems that process massive amounts of data and generate results such as music and book recommendations, search engine inputs, car routes, airline prices, etc.
In this course, we explore the social, political, and cultural impact of our society's reliance on massive (and often real-time) data analysis. We will discuss the concepts behind data collection, organization, analysis, and publication. We will also discuss the possibilities, limitations, and implications of using big data-centric methods in social science and humanities research, and the work already developed in the fields of computational social science, digital humanities, and cultural analytics. Students will become familiar with the history and basic concepts of the fundamental paradigms developed by modern societies to analyze patterns in data--statistics, visualization, data mining, and machine learning.
Finally, we also want to ask general questions about society and culture in a data-centric society. The arrival of social media and the gradual move of knowledge and media distribution and cultural communication to digital networks in the early 21st century has created a new digital landscape which challenges our existing methods for the study of and assumptions about culture. What new theoretical concepts do we need to deal with the new scale of born-digital culture? What data analysis and visualization techniques developed by industry and sciences are most useful for cultural analysis? How can we use big cultural data to question what we know about culture and generate new questions?
DATA 74200 - Media Theory and History #CRN
Tuesday, 4:15 - 6:15 PM, 3 Credits, Rm. TBA, Prof. Lev Manovich (firstname.lastname@example.org)
DATA 78000 - Geospatial Humanities #CRN
Monday, 4:15 - 6:15 PM, 3 Credits, Rm. TBA, Prof. Jonathan Peters (email@example.com)
Cross-listed with DHUM 73700