Kolaczyk et al.’s review (“Statistics Practicum: Placing 'Practice' at the Center of Data Science Education,” this issue) of Boston University’s M.S. in Statistical Practice Program (MSSP), including a thorough review of their model practicum, demonstrates the high level of execution required in centering practice in the MSSP curriculum. As a former faculty director of the University of San Francisco’s (USF) MS in Data Science program, and current executive director charged with leading data science initiatives at the University of Chicago (UChicago), I was prompted by the article to reflect on the significant challenges that exist (both in common with MSSP and unique to our execution) when centering practice in the curriculum. These challenges, from evolving the educational model and measuring student impact to project management and sustainable external partnerships, underscore the difficulty of centering practice in a data science curriculum. The framework provided in the article makes it a critical read for anyone considering a more practice-centered approach to data science education.
What is clear from the article is the extensive effort the BU team invests to run an integrated practicum at a high level each year. As noted by the authors, this effort places MSSP in outlier status alongside North Carolina State University (NCSU) and USF (and arguably a few others like UChicago and our neighbors at Northwestern), which center practice through a high-touch practicum experience that provides meaningful opportunity to engage in data science projects with external partners. What this suggests is that there is an opportunity to develop sustainable, modular, and effective educational models that allow for easier adoption by data science educational programs of all varieties, in all locations (e.g., urban, ex-urban, and rural), to support centering practice.
As such, I’d like to posit a call to action and a road map on how we can collectively move as an educational community toward this goal. Let me begin by capturing an incomplete set of ideal characteristics of practicum/clinic:
1. significant, mutually beneficial, long-term partnerships with entities that have ‘real’ data science projects;
2. sustained, team-based effort (curricular or co-curricular) to address the projects; and
3. student growth from theory to practice, modern skill acquisition, critical problem-solving skills, and effective communication of solutions to the client.
If programs doing the above well are outliers, how can we develop new approaches that address 1–3 and that scale to other master’s and undergraduate programs, community colleges, and even Ph.D. programs?
Kolaczyk et al. make a strong case for the value of centering practice. For the students there is immediate impact on ‘readiness’ both toward their research and career trajectories. Practicum acts as a unifying force that amplifies the theory and methodology learned in other data science coursework. The authors do miss an opportunity to highlight the value of the MSSP practice-centric model as an engine for positive impact, as a whole, on underrepresentation in data science. Several characteristics of the practicum (e.g., project-based and team-based) outlined in the article happen to address one of the core opportunities of data science educators: attracting and retaining a diverse and vibrant pipeline of data science researchers and practitioners (Espey, 2018). It is not often we, as STEM educators and researchers, can take the driver’s seat in the development of a new field like data science. Centering practice in data science education creates a real possibility the field will grow in a manner that actually reflects the population it purports to engage, with diverse scientists asking novel questions from a wide range of viewpoints. The challenge that remains is how to make centering practice and external partnerships more accessible to the wider data science education community.
There are several challenges that vastly increase the activation energy required to center practice through external partnership, thereby preventing more widespread adoption. Here I’ll focus on what I think are the top two.
1. Recruiting and retaining meaningful and mutually beneficial external partners around data science.
2. Adapting old and developing new models of project-based education that sustainably and effectively support student learning in data science.
Both 1 and 2 are challenging but we don’t have to start from scratch.
External data science partners can be a challenge to recruit and retain. Kolaczyk et al. note that an important source of their practicum projects is internally sourced research projects. Internal projects are among the easiest and most reliable sources of valuable projects for new data science programs. That said, internal projects tend to be understandably more academic and less focused on practice. Kolaczyk et al. make the case convincingly in Section 3.1 that external partner projects combined with reproducible research training are the backbone of centering practice. Therefore, I’ll focus mainly on the effort and challenge of sourcing external projects.
There are few examples of successful models that engage at scale with industry partners. USF happens to benefit from being surrounded by a wealth of industries working in data science. NCSU, Northwestern, BU, and UChicago have made significant investments dedicated to the staffing and recruitment of these industry partners.
If a university is not surrounded by the data science industry or has other geographic constraints, I argue a rich and more accessible source of projects is still available. Universities can turn to their local community organizations and government as well as the broader social impact sector for a model of sustainable partnerships. While industry has traditionally had more of an arm’s length distance to higher education, this is less true for local community organizations and local government. Taking advantage of close community relationships and networks helps lower the activation energy needed to start new conversations around partnerships through data science, while also increasing the likelihood of long-term sustainability. One of the most illustrative examples in the article is the project done in support of a public school district in the analysis of public school enrollment. Social impact organizations and the public sector have no shortage of data challenges and opportunities for engaging student talent in team-based, project-based data science work. The work has immediate, tangible, and positive impact on the local community and the larger social impact sector.
Finally, by focusing on clinical work that makes positive change for the local community, data science programs have the opportunity to additionally integrate authentic learning in the process of centering practice. Authentic learning is a form of active learning in which students focus on real-world issues over a long period of time (Singer et al., 2020) that may improve both student retention and performance, among other positive student measures (Beier et al., 2019; Torres et al., 2016). By leading with these best practices, centering practice through clinical work provides a meaningful opportunity to increase inclusivity, recruitment, retention, and outcomes for students.
Programs focusing on social impact through data science have the added advantage of increasing the number of opportunities to attract funding support. Pioneering foundations such as the 11th Hour Project have supported our work to create an educational platform at UChicago that brings data science student teams in partnership with social impact organizations working at the forefront of environmental, human rights, and energy challenges. These partnerships have led to breakthrough solutions that advanced organizational mission and increased data science capacity at these organizations for sustainable change. This team-based, clinical course has provided a critical opportunity to center practice in our students’ data science education. Students also report meaningful growth that aligns with personal and career goals.
More recently, Data.org has awarded UChicago a grant through its inclusive growth and recovery challenge program to link the city of Chicago, Southside community organizations, and UChicago faculty, researchers, and students to directly address the major challenge of broadband access for underserved communities and bridge the digital divide. In both partnerships, we are able to center practice through our clinical model of education with the explicit aim of ensuring we are training the next generation of vibrant, capable, and diverse leaders in data science. We are committed to working in partnership with the greater educational community to move toward the goal of sustainable, modular, and effective practice-centric educational models that allow for easier adoption by data science programs of all varieties, in all locations.
There is clearly more work needed before external partnerships are scalable to the greater higher education community. The typical active data science faculty member, administrator, or program coordinator isn’t particularly experienced in engaging, scoping, and launching data science projects with external partners. As a window into the effort, Kolaczyk et al. note:
These projects typically are lined up by MSSP program faculty one to six months in advance, in partnership with external entities in industry, government, and the nonprofit sector. Scoping of projects generally involves a series of initial meetings to match goals of prospective partners and MSSP, iteration by email on formal project statements, and frequently (sometimes extensive) discussion between university and partner legal representatives to define and agree upon parameters like intellectual property, nondisclosure agreements, data sharing and privacy, and so on.
We need to invest in the development of field-tested, generalizable material. Currently, intake forms, vetted project agreements, data usage agreements, assessment, and placement material on how to set up projects and partnerships exist at few institutions (including BU, as noted above). We need to come together as a community to share our current best practices and aim to design a more transferable playbook to help make partnerships easier to establish for institutions of all sorts, including research universities, liberal arts colleges, Historically Black Colleges and Universities (HBCUs), and community colleges.
As was noted in the article, educational models of clinic and practicum have been around for decades. There is an existing and rich knowledge of best practices on these courses to leverage and modify for the specific needs of data science. Kolaczyk et al. lay out a highly engaged, two-semester model of practicum. Ultimately, a semester-long or multi-semester sequence of clinical courses may not work for many institutions. In these cases we need to develop more modular efforts that contribute toward placing practice at the center of data science education, which can be adjusted to local constraints and institutional needs. These may include integrating an external project in current coursework. One example could be partnering with a social impact organization on a team-based project through an existing machine learning course. It may also make sense to develop and explore summer opportunities modeled on the successful National Science Foundation Research Experiences for Undergraduates (REU) programs that build communities of data science students working on real-world projects together.
Overall, more investment is needed to meet this call to action. Universities will need to make some additional investments in the resources needed to launch and sustain these programs. Given the current economic environment in higher education, incremental new investment will need to go hand in hand with developing financial models (tuition revenue, local community foundation support, National Science Foundation and government support, individual donors, etc.) that can ensure these models continue in a self-sufficient way.
We have the opportunity now to take up serious and sustained trans-institutional collaboration and dialogue through a wide range of modalities—such as shared grants, collaborative projects, and conferences—to share our current knowledge and work hand in hand with diverse institutions as the data science educational ecosystem evolves. When working at the cusp of a new scientific field, one can only hope—and work—to ensure that our students enrolled at all types of higher educational institutions gain access to critical knowledge through centering practice.
David Uminsky has no financial or non-financial disclosures to share for this article.
Beier, M. E., Kim, M. H., Saterbak, A., Leautaud, V., Bishnoi, S., & Gilberto, J. M. (2019). The effect of authentic project‐based learning on attitudes and career aspirations in STEM. Journal of Research in Science Teaching, 56(1), 3–23. https://doi.org/10.1002/tea.21465
Espey, M. (2018). Diversity, effort, and cooperation in team-based learning. The Journal of Economic Education, 49(1), 8–21. https://doi.org/10.1080/00220485.2017.1397571
Singer, A., Montgomery, G., & Schmoll, S. (2020). How to foster the formation of STEM identity: Studying diversity in an authentic learning environment. International Journal of STEM Education, 7(1), Article 57. https://doi.org/10.1186/s40594-020-00254-z
Torres, W. J., Saterbak, A., & Beier, M. E. (2016, June). Long-term impact of an elective, first-year engineering design course. In 2016 ASEE Annual Conference & Exposition. https://doi.org/10.18260/p.25575
©2021 David Uminsky. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.