MATH VALUES


Dabble in Data with CODAP

By R. N. Uma, NC Central University (NCCU)


Have you ever wished all your students could dabble in data with ease? Has the task of introducing students to spreadsheets (e.g., Microsoft Excel) and programming languages (e.g., R) for data visualization and analyses seemed daunting?

My collaborators and I faced this predicament when implementing our NSF-funded Data Science for Social Justice project¹ in a freshman seminar course taken by students of all majors at our university. When my collaborator, Rebecca Lowe, found CODAP, a web-based data analysis tool, we realized it was the answer to our predicament!

What is CODAP?

CODAP² (https://codap.concord.org), short for Common Online Data Analysis Platform, is an open-source, web-based tool that makes data analysis as easy as dragging and dropping. This makes it accessible to students from middle school onward. For a quick look at CODAP, follow the tutorial at the end of this article.

Our Experience with CODAP at NCCU

All students at NCCU are required to take a freshman seminar course that includes a social justice-based project. Each year about 30-35 sections of this course are offered. For our Data Science for Social Justice project, we have been working with two sections each year: one consists of STEM majors and the other of mostly non-STEM majors. The students in our sections take a data-driven approach to analyzing a social justice issue. We have been working with the same set of instructors each year.

In the first year of the project (2019-2020), we introduced the students to R and provided template code, hoping they could adapt it to their own datasets merely by changing the variable names. That did not go as smoothly as we had hoped. Almost none of our students had prior programming experience, and they found that R had a steep learning curve despite the resources and tutorial sessions we provided.

The following year (2020-2021), we had the students use CODAP instead of R. COVID-19 disrupted our project implementation: classes moved to an online or hybrid format, and the regular interactions with students were lost. Despite this, the feedback from instructors and students was positive. One instructor said, “I feel like CODAP was a lot more user-friendly when it came to the student. So there wasn't that initial barrier of figuring out how to program to get the information. It was more just you drag what you're interested in and see what pops up. So that was a lot easier.”

Moving forward, we will continue using CODAP for this project. Statistics courses, and indeed any course in any discipline that requires students to analyze data, stand to benefit greatly from using CODAP.

Data Sources

Access to data analysis tools without access to data is pointless. So where do we go for meaningful data? Depending on the topic, there are a variety of sources. Standard ones include DATA.GOV (https://www.data.gov/), the US Census Bureau (https://www.census.gov/), and the open data initiatives of individual states, cities, and countries (e.g., NYC Open Data: https://opendata.cityofnewyork.us/). However, the data we download from these sources and from other government and non-profit agencies may not be in a format that all students can readily explore.
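For readers who prepare datasets for students, here is a minimal Python sketch of the kind of tidying a downloaded file might need. It is not part of our project's workflow. It assumes the goal is a plain CSV file with a single header row, a layout that drag-and-drop tools can typically import. The function name `tidy_csv` and the sample rows are made up for illustration.

```python
import csv
import io


def tidy_csv(raw_text: str) -> str:
    """Return a cleaned copy of CSV text (hypothetical helper).

    - trims stray whitespace around headers and cell values
    - drops completely blank rows
    - drops rows whose column count differs from the header row
      (often notes or footers added by the data provider)
    """
    reader = csv.reader(io.StringIO(raw_text))
    rows = [[cell.strip() for cell in row] for row in reader]
    rows = [row for row in rows if any(row)]  # remove blank rows
    if not rows:
        return ""

    header, body = rows[0], rows[1:]
    body = [row for row in body if len(row) == len(header)]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(body)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up sample: padded headers, a blank line, and a footer note.
    sample = " County , Population \n\nA, 10\nSource: example only\nB, 20\n"
    print(tidy_csv(sample))
```

Real downloads often need more than this (merged header rows, footnote symbols inside numbers, inconsistent category labels), so treat the sketch as a starting point and check the result by eye before handing it to students.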

Through our Data Science for Social Justice project, we are creating a repository of datasets related to social justice issues. Our project (https://sites.google.com/view/dssj/projects) and Kaggle (https://www.kaggle.com/datasets) are a couple of sources for cleaned datasets in a neat format.

Let’s CODAP-ble in data!

CODAP Tutorial

CODAP: https://codap.concord.org/

For a more interactive example, read this article written by the creators of CODAP, Tim Erickson (eeps media) and Bill Finzer (Concord Consortium). If you would instead like to try out CODAP on your own dataset, watch my recording of a live-demo tutorial.

R. N. Uma is a Professor of Computer Science in the Department of Mathematics and Physics at NC Central University. Her research interests include data science and applications of scheduling theory. On the education front, her interests are geared towards increasing the participation of underrepresented minorities in STEM, particularly in the Mathematical Sciences.


1 NSF HRD-1912408: Broadening Participation Research Project: Research for Social Justice – Broadening Participation through Data Science. R. N. Uma (NC Central University), Alade Tokuta (NC Central University), Rebecca Lowe (Cynosure Consulting), Adrienne Smith (Cynosure Consulting).

2 Common Online Data Analysis Platform [Computer software]. (2014). Concord, MA: The Concord Consortium.