DATA 71200 - Advanced Data Analysis Methods #61133
Prof. Johanna Devaney (Johanna.Devaney@brooklyn.cuny.edu)
This course will provide students with the skills necessary to apply machine learning techniques to data, and to interpret and communicate their results. They will also begin to develop intuitions about when machine learning is an appropriate tool versus other statistical methods. This course will cover both supervised methods (e.g., k-nearest neighbors, naïve Bayes classifiers, decision trees, and support vector machines) and unsupervised methods (e.g., principal component analysis, non-negative matrix factorization, and k-means clustering). The supervised methods will focus primarily on "classic" machine learning techniques where features are designed rather than learned, although we will briefly look at recent deep learning models with neural networks. This is an applied machine learning class, which emphasizes the intuitions and know-how needed to get learning algorithms to work in practice, rather than mathematical derivations.
The course will be taught in Python, primarily using the scikit-learn library. The course's main text will be the O'Reilly book "Introduction to Machine Learning with Python" by Sarah Guido and Andreas C. Müller, along with the book's corresponding Jupyter notebooks. We will also reference "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman to examine some of the topics in more depth (this book is available for free from the first author's website: https://web.stanford.edu/~hastie/Papers/ESLII.pdf).
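As a rough illustration of the "classic" supervised methods named in this description, the sketch below implements k-nearest neighbors classification on hand-designed features. It is written with only the Python standard library so that it runs anywhere; it is not taken from the course materials, and the function name `knn_predict` and the toy data are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two hand-designed features per example.
train_points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (3.0, 3.0)))
```

In scikit-learn, the same idea is provided by `KNeighborsClassifier` in `sklearn.neighbors`, used through its `fit` and `predict` methods.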
DATA 73200 - Interactive Data Visualization #61138 (Note: two sections offered--same day and time)
Prof. Aucher Serr (firstname.lastname@example.org)
DATA 73200 - Interactive Data Visualization #64845 (Note: two sections offered--same day and time)
Prof. Ellie Frymire (email@example.com)
Interactive Data Visualization is one of the most important forms of communication today — allowing users to better engage with data, detect patterns, and quickly gain insight into complicated topics. This course will introduce students to the tools, skills, and concepts necessary for making state-of-the-art interactive data visualizations. Using web-based technologies including HTML, CSS, and D3.js, students will learn to create engaging and effective information displays, grounded in the science of visual perception and best practices in visual mapping and accessibility. Throughout the semester, students will work towards creating a portfolio of beautiful and analytically sound data visualizations, while also developing their own iterative design process.
DATA 73500 - Working with Data #66137 (Note: there are two sections on Monday)
Prof. Timothy Shortell (firstname.lastname@example.org)
DATA 73500 - Working with Data #61122 (Note: there are two sections on Monday)
Prof. Timothy Shortell (email@example.com)
This course covers the fundamentals of working with data. Students will be introduced to key disciplines that provide techniques used for working with small, medium, and big data today--classical statistics, contemporary data science, machine learning, and data visualization. They will learn about different data types, what constitutes a valid dataset that can be analyzed quantitatively, and how data should be formatted to create a valid dataset. The course will also explore fundamental theoretical questions that arise when we attempt to represent social or cultural phenomena as data. Particular attention will be focused on working with social network services data, user-generated content, and other types of data about societies and individuals that have emerged recently (such as sensor data) and massive media datasets (images, video, text, sound, code, etc.). The course will explore fundamental database technologies and more recent techniques for working with real-time data flows.
The "data revolution" has transformed the way we understand and interact with the world around us. The availability of large datasets, progress in computer hardware and software, and use of the web to share data and acquire it from numerous sources (including social network services, libraries, museums, city governments, non-profits, etc.) have created many new possibilities in many fields including computer science, social science, humanities, business, economics, and medicine. These developments also led to the emergence of a number of new research fields at the end of the 2000s: social computing, computational social science, digital humanities, cultural analytics, and culturomics. This course introduces students to fundamental concepts and practical techniques and skills needed to work with data.
We'll begin with a broader examination of data and society. Then, we'll take a look at some of the tools used by data analysts and data scientists to produce knowledge in various settings. We'll focus mainly on analysis of text in our coding work; this is the best place to begin to understand the choices we make as researchers and analysts in applied settings.
Students will learn the most fundamental concepts and skills of data analysis, which are required before they can use more advanced analysis techniques or create data visualizations. While focusing on fundamentals, the course also introduces students to new ideas for data analysis, new types and sources of data, and recently emerged fields that are taking advantage of these sources, of increasing computer power for data processing, and of new comprehensive open-source programming environments for data analysis. After taking this course, students will be able to:
* have a general understanding of how to use quantitative data to research topics in many fields;
* understand both the benefits and limitations of using quantitative methods in research;
* apply concepts and practical techniques for downloading data from various sources, cleaning data, and managing and structuring datasets using tools such as Google Sheets and the R data analysis programming language;
* demonstrate working knowledge of a language such as R or Python, including importing and exporting data in different formats, managing data, selecting parts of a dataset using various conditions, combining datasets, and creating basic graphs.
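The data-handling skills listed above (importing, combining, selecting by condition, and exporting) can be sketched in a few lines of Python. This is only an illustrative example using the standard library's `csv` module, not course material; the inline CSV text, column names, and variable names are all hypothetical stand-ins for real downloaded files.

```python
import csv
import io

# Hypothetical data standing in for two downloaded CSV files.
people_csv = "id,name,borough\n1,Ana,Brooklyn\n2,Ben,Queens\n3,Cy,Brooklyn\n"
scores_csv = "id,score\n1,88\n2,71\n3,95\n"

# Import: parse each CSV into a list of dictionaries keyed by column name.
people = list(csv.DictReader(io.StringIO(people_csv)))
scores = list(csv.DictReader(io.StringIO(scores_csv)))

# Combine: join the two datasets on their shared "id" column.
score_by_id = {row["id"]: int(row["score"]) for row in scores}
combined = [{**person, "score": score_by_id[person["id"]]} for person in people]

# Select: keep only rows that satisfy a condition.
brooklyn_high = [
    row for row in combined
    if row["borough"] == "Brooklyn" and row["score"] >= 90
]

# Export: write the selected rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["id", "name", "borough", "score"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(brooklyn_high)
exported = out.getvalue()
print(exported)
```

With real files, `io.StringIO(...)` would be replaced by a file opened with `open(path, newline="")`.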
DATA 74000 - Data, Culture, and Society #61128
Prof. Katherine Behar (Katherine.Behar@baruch.cuny.edu)
Big data and computational methods for its analysis are changing scientific and humanities research, financial markets, political campaigning, higher education, and countless other areas, and also affect our everyday lives. Our daily existence is increasingly structured by software systems that process massive amounts of data and generate results such as music and book recommendations, search engine inputs, car routes, airline prices, etc.
In this course, we explore the social, political, and cultural impact of our society's reliance on massive (and often real-time) data analysis. We will discuss the concepts behind data collection, organization, analysis, and publication. We will also discuss possibilities, limitations, and implications of using big data-centric methods in social science and humanities research, and the already developed work in computational social science, digital humanities, and cultural analytics fields. Students will become familiar with the history and basic concepts of the fundamental paradigms developed by modern societies to analyze patterns in data--statistics, visualization, data mining, and machine learning.
Finally, we also want to ask general questions about society and culture in a data-centric society. The arrival of social media and the gradual move of knowledge and media distribution and cultural communication to digital networks in the early 21st century have created a new digital landscape which challenges our existing methods for the study of culture and our assumptions about it. What new theoretical concepts do we need to deal with the new scale of born-digital culture? What data analysis and visualization techniques developed by industry and sciences are most useful for cultural analysis? How can we use big cultural data to question what we know about culture and generate new questions?
DATA 74200 - Media Theory and History #61130
Prof. Lev Manovich (firstname.lastname@example.org)
This topics course is designed to introduce students to many influential ideas and works by key modern and contemporary thinkers about media and technology. Because historically these ideas were developed in relation to particular technologies and media that came into prominence in different periods, we will also explore aspects of media history including photography, film, radio, television, the Internet, social media, artificial intelligence, big data, and data art. Some of the discussions will use as starting points Manovich's own selected articles and chapters from his books The Language of New Media, Software Takes Command, Instagram and Contemporary Image, and Cultural Analytics (forthcoming). All texts used in the class are freely available online.
DATA 78000 - Geospatial Humanities #61134
Prof. Jonathan Peters (email@example.com)
Cross-listed with DHUM 73700
This course combines an introduction to basic cartographic theory and techniques in humanities contexts with practical experience in the analysis, manipulation, and graphical representation of spatial information in a fun and engaging way. The course examines the storage, processing, compilation, and symbolization of spatial data; basic spatial analysis and spatial statistics; and the visual design principles involved in conveying spatial information. Emphasis is placed on digital mapping technologies, including online and offline computer-based geographic information science tools. Students will develop original maps using various forms of data collection, analysis, and historical resources.
The overarching objective of this course is to familiarize students with GIS and spatial analysis tools and techniques used in professional and scholarly fields. By the conclusion of this course, students will be able to:
* gather and manipulate geospatial data;
* interact with geospatial data stored in a database;
* interact with geospatial data stored in hierarchical data formats;
* explore historical geospatial data resources and understand variations in data reporting based upon time period and location;
* collect geospatial data in the field using GPS technology and map it as needed;
* use cartographic theory to design effective graphical representations of geospatial data;
* use cartographic theory to interpret, analyze, and critique graphical representations of spatial phenomena;
* and create both static and interactive maps containing different representations of geospatial information.
Mastering ArcGIS by Maribeth H. Price – Seventh Edition. ISBN-13: 978-0078095146 $78.25 MSRP
Getting to Know ArcGIS Desktop, Second Edition (for ArcGIS 10), by Tim Ormsby, Eileen J. Napoleon, Robert Burke, and Carolyn Groessl. ISBN-13: 978-1589482609 $25.00 MSRP
Lost City of the Monkey God by Douglas Preston ISBN 9781455569410 – selected chapters as noted
Topics / Academic Papers as noted