
Online Education in Times of COVID: Adapting and Deploying a Data Science Program in Mexico

Published on May 02, 2022

Column Editor’s Note: In this Minding the Future column piece, a team from Mexico describes their efforts in moving a data science program for high school and college students online during the COVID-19 pandemic. The piece highlights both the flexibility required of educators and students at the height of the pandemic, and the lengths to which many went to ensure that educational opportunities continued, and in some cases even expanded. Statisticians and data scientists had, and continue to have, a key role in explaining the abundance of data collected during the global pandemic, as well as clarifying what the data could (and could not) tell us.

Keywords: local education, outreach, diversity, STEM education


The COVID-19 pandemic has forced us to redesign and adapt how educational programs work. Educators face unique challenges and are moving to use online technologies to address current academic needs. In this “Minding the Future” column, we write about our learned experiences and insights after launching a new data science virtual program instead of our summer educational program, Clubes de Ciencia Mexico (CdeCMx), which typically runs face-to-face. Here we present how our program worked before COVID-19, the decisions we made to offer an alternative educational experience in the context of a global pandemic, and a description of our new program called The CdeCMx Challenge. We present some recommendations for educators and researchers based on a survey applied to participating students. Our goal is to offer an example of adapting after-school hands-on educational programs to an online format. Here we focus the subject of our program on data science and COVID-19 in the context of Mexican high school and college students. We attempt to take advantage of the topic's relevance and increased virtual connectivity in a globalized world while being constrained by resources within the developing world.

The Clubes de Ciencia Mexico Educational Program

Science, technology, engineering, and math (STEM) outreach programs show positive effects on students’ attitudes toward science (Heinze et al., 1995), and a positive influence on their performance in advanced science courses, as well as on their decision to participate in other science programs and their desire to pursue a career in science (Markowitz, 2004). In particular, at the precollege level, outreach programs are essential tools for increasing awareness about the science and engineering profession (Anderson et al., 2002). Those that blend current STEM topics and real-life applications can trigger important changes in career choices (Kudenko & Gras-Velázquez, 2016).

In 2014, we started our CdeCMx STEM outreach program, which took inspiration from the characteristics of successful outreach initiatives identified by Gandara and Bial (2001) and Schultz and Mueller (2006), and reviewed by Valla and Williams (2012). These characteristics include 1) participation of individuals who guide students over an extended time, often after the program has ended; 2) offering high-quality instruction, more specifically through access to innovative, cutting-edge topics that are outside of the standard curriculum; 3) involving additional support and information to guide students through the college/graduate school application process; 4) sensitivity to the cultural backgrounds of students, with instructors who know about their educational environment; and 5) providing peer-to-peer interactions in which participants offer each other academic, as well as social and emotional, support (Gandara & Bial, 2001).

After a pilot program led by Harvard University graduate students, CdeCMx became a volunteer-run nonprofit organization. Our educational model combines Mexican scientists with international early-career scientists who work together to design and teach intensive, week-long workshops (hereafter, clubes) on diverse topics at the frontier of research and science. CdeCMx engages more than 600 high school and college students each year through free face-to-face, hands-on multidisciplinary workshops. Group activities complement the program with several goals: to build mentoring relationships between students and instructors; to learn about opportunities for research internships, scholarships, and graduate programs (more than 12 students have been admitted to graduate programs in the United States and 10 to summer research programs as of 2020); and to learn about the daily life of a STEM professional and career opportunities.

What Is a Club of CdeCMx?

A club is a STEM workshop designed around learning goals, including applying technical knowledge to novel scientific problems and developing critical thinking to solve real-life problems. Instructors incorporate active learning methods (Freeman et al., 2014) in their lessons, emphasizing hands-on activities and learning soft skills such as teamwork and public speaking. A typical face-to-face club has two instructors (a local and a visiting instructor from abroad) and 20 students. Each club meets for 45 hours on average over a week.

For example, the club titled “The Story in the Data: Discovering the Higgs Boson,” offered in 2016 in Merida, Yucatan, was inspired by the Kaggle Higgs Boson Machine Learning challenge (Adam-Bourdarios et al., 2015) and was led by a Harvard chemistry graduate student and a particle physics graduate student from the host city’s state university. In this club, students built models that took particle event statistics and attempted to classify whether each event corresponded to a Higgs boson or another particle. Students first learned to perform exploratory data analysis on demographics data with scientific Python libraries (Hunter, 2007; Reback et al., 2021; Virtanen et al., 2020) and to build machine learning (ML) models (Pedregosa, 2011) on the Institute for Research on Innovation and Science (IRIS) data set. They then applied the same techniques to the Higgs boson data set. Students also met a scientist from the European Organization for Nuclear Research (CERN) who participated in the discovery of the Higgs boson. In the end, students presented their work and findings in a student symposium.

In our most recent prepandemic edition (2019), we offered 33 clubes in nine cities throughout Mexico for 639 students. Since 2014, we have reached out to more than 5,000 students and taught more than 300 clubes with unique topics. The Clubes de Ciencia model has been implemented in seven other countries (Ferreira et al., 2019) throughout Ibero-America, reaching more than 15,000 students and more than 1,000 instructors thus far. For more information, visit

Decisions and Challenges When Moving to an Online Education Format

As an international science education program (mainly United States and Mexico), our context brought some challenges due to the geographical location of instructors and students, participants’ schedule uncertainties, and limited access to educational and technical resources at students’ homes. When the pandemic arrived in March 2020, holding a face-to-face summer program became impossible, so we adapted our model to a virtual format. While moving to an online format, our priorities were to maintain a high standard for the educational content and foster the core values of Clubes de Ciencia, which include promoting networking experiences between participants and increasing diversity and accessibility for students of any socioeconomic background.

A guiding design principle behind our new online program was to serve all our students, especially those with limited access to online resources and infrastructure. In Mexico, 70% of the population has internet access, but only 44% of households have a computer (Instituto Nacional de Estadística y Geografía, 2020). Generally, family members share a single computer, and the internet connection is unreliable in several parts of the country. To facilitate interactions between instructors and students, we used a messaging platform. In addition, we asked instructors to record sessions and generate content for asynchronous sessions that could be accessible from a wide range of devices, including smartphones. Since we had never operated in this format before, we wanted to receive real-time feedback; therefore, we opted for tools such as voting and polling features during video conferences and in our messaging platform.

We decided to organize a data science-oriented event since we thought it would be easier to coordinate computer-based rather than lab-based experimental activities while still exposing students to state-of-the-art science. We took inspiration from data science competitions like Kaggle, and we defined a general topic and expected goals for all students to foster a sense of unity and collaboration. Previously, each club that we offered was independent. Still, for this online competition, we thought exposing students to different approaches and tools to solve a similar problem could facilitate the interaction among participants. We realized that most participants did not know each other, so we tried to enable collaborations through scientific challenges to make them interact. In this section, we describe in detail the tools used by our team.

An additional challenge was to identify the best time to run the program. Given the schedule changes in school calendars, we allowed students to engage in activities asynchronously and assumed that students would participate only for a couple of hours per session. We discuss a more detailed breakdown of the difference between the in-person (pre-COVID) and online (COVID) format in Table 1.

Table 1. Comparison of in-person science club and online challenge implementation.

Yearly program:
  • In-person club (“Science Club”): Thirty to 50 different clubes in nine cities.
  • Online club (“Challenge”): One broad topic approached through four tracks, with four to six submodules per theme.
  • Rationale for change: Broader topics would promote more shared experiences for students in an online setting.

Scientific topic:
  • In-person club: Decided among instructors for each club.
  • Online club: Organizers defined the broad topic and tracks. Instructors within each track proposed submodules.
  • Rationale for change: Predefined tracks allow us to scope out more Kaggle/data science-like projects that are easier to implement online. Establishing similar research questions across teams also helps to foster collaboration among participants from different groups.

Instructors:
  • In-person club: One foreign and one local instructor per club.
  • Online club: Four to 15 instructors in different geographical locations per track.
  • Rationale for change: Without having to travel, more instructors (with varying time commitments) could participate in the program. Increasing the instructor roster was not financially prohibitive. We kept the balance of foreign and national instructors per track.

Students-instructor ratio:
  • In-person club: 10:1 to 15:1
  • Online club: 8:1 to 12:1
  • Rationale for change: We kept a small student-instructor ratio to guarantee students would get individual attention and ensure accountability. We note that each instructor was assigned a list of students in the online version. Still, students were encouraged to participate in multiple classes/sessions, which would not be possible in the face-to-face version.

Staff:
  • In-person club: One to two volunteers per club, two to three site coordinators.
  • Online club: Fifteen people shared across all themes.
  • Rationale for change: Having participants working around similar topics makes it easier to solve common questions and troubleshoot problems.

Type of practices:
  • In-person club: Driven by the topic chosen by the instructor, which can be experimental, theoretical, or computational.
  • Online club: Mostly computational and team collaboration.
  • Rationale for change: Online activities more easily lend themselves to data-science types of activities. We designed team activities to combat feelings of isolation in an online setting.

Duration:
  • In-person club: 5 days of 8 hr in-person instruction each day.
  • Online club: 6 days of 2 to 4 hr daily sessions, plus 4 days for working on final projects.
  • Rationale for change: We looked to balance increasing the number of days while keeping the participants interested. Extending the program's duration allowed for more flexible scheduling since participants might have other duties. Daily sessions were not necessarily consecutive. We allowed the instructors from each track to agree upon their schedule (with input from the organizers).

Final project:
  • In-person club: One in-person presentation per club (5–8 min), presented in a conference-type session.
  • Online club: GitHub repository with analysis notebooks, summary website, 2–3-min video.
  • Rationale for change: Usually, in the in-person club (20 students), instructors would organize their class to work on a final presentation together. To facilitate coordinating and monitoring students in the online version, we required students to work in smaller teams.

Project evaluation:
  • In-person club: Not applicable.
  • Online club: External evaluation by reviewers on a comprehensive rubric.
  • Rationale for change: We set clearly defined goals to allow students to strategize how they will work as a team. Also, we added the competition component as a strategy to keep students engaged.

Student preparation:
  • In-person club: Online material sent by instructors.
  • Online club: Month-long scientific programming course.
  • Rationale for change: Programming is a more crucial skill for success in data science courses.

Staff preparation:
  • In-person club: Setting up in-person logistics, travel accommodations, classrooms, and buying lab materials.
  • Online club: Setting up tools (Slack, gather ‘Round, Zoom), writing guideline example documents, preparing common tutorials.
  • Rationale for change: Setting clear expectations with examples was more important. We developed various guidelines and user tutorials to compensate for any communication lost through emails. We explored multiple software tools to identify the best communication mediums.

Networking activities:
  • In-person club: Informal discussion panels, sports games, and food gatherings.
  • Online club: Online discussion panels, online games.
  • Rationale for change: We encouraged networking activities and looked for online adaptations of our in-person discussion panels and recreational activities.

Program Implementation

"The CdeCMx 2020 Challenge," as we called our new virtual program, aimed to expose students to the fundamentals of data science and multidisciplinary research in an online format. We guided students through problem solving and data analysis using the COVID-19 pandemic as a research problem. As in our Clubes de Ciencia Program, we designed this program to include the main components of a successful outreach initiative (Valla & Williams, 2012). The learning outcomes of our program were to 1) understand statistical modeling and its applications to solve real-world problems; 2) use computational skills to analyze and manipulate big data sets; 3) acquire introductory knowledge of skills and tools to pursue a career in data science; and 4) develop communication skills to interpret and disseminate data-driven content.

Our organizing team was composed of five coordinators and 15 staff members. The coordinators envisioned the mission and goals, while the staff members helped with the implementation. About 70% of our staff had programming expertise to provide mentorship to students, and they committed full time during the program execution. The remainder of the team provided technical assistance (i.e., scheduling video calls and recording talks) to participants and instructors.

We opened preregistration and asked candidate students to complete a short online Python course to provide them with programming fundamentals (see Figure 1). Given the limited availability of free high-quality online courses in Spanish, we developed one in collaboration with a Mexican community of programmers (Future Lab) and an online educational platform (OmegaUp). During this phase, we offered live Q&A sessions to promote engagement. We accepted into the program students who completed the programming course and assigned them to one of the four tracks (described below). We anticipated that making the programming course a prerequisite would narrow the applicants to a group of disciplined and engaged students.

Figure 1. Timeline of events for our program. The timeline includes all the events necessary to do the CdeCMx Challenge, from its creation to the final ceremony, including activities done by students, instructors, and organizers.

We defined four tracks to study diverse aspects of the COVID-19 pandemic: health and environment, epidemiological models, therapeutics development, and diagnostics development. We invited 37 instructors from universities in the United States and Mexico (Figure 2A) and divided them into these four tracks according to their background. For example, in the epidemiological track, instructors’ backgrounds were in medicine, public health, and engineering with expertise in artificial intelligence. Most of our instructors apply statistical analysis and computational skills to their fields of study (Figure 2B). Then, the instructors spent 2 months developing the teaching material according to our program learning goals. The organizing team met with the instructors regularly to monitor and provide feedback on the content of the courses.

Figure 2. A snapshot of instructors that participated in our 2020 program. A) Statistics on the country of residence, nationality, and gender. B) Academic area of study from instructors. The main disciplines in the different areas were formal sciences (physics 100%), natural sciences (biology 29%), engineering and technology (bioengineering and biotechnology 55%), and medical and health sciences (medicine and public health 66%).

The CdeCMx Challenge lasted 11 days and was divided into two parts (Figure 1). First, the students participated in instructor-led active learning sessions (Freeman et al., 2014): webinars, demos, hands-on computational activities, and roundtable discussions. The webinars were structured to last from 1 to 2 hours and were divided into 15-minute blocks, in which the instructor would first teach a topic, followed by time for questions and discussion. We used online polling tools and breakout rooms (from Zoom or Google Meet) to encourage student interaction. For the practical sessions (similar in duration to the webinars), the instructor would usually walk students through a live demo (this could be a statistical analysis pipeline or an algorithm implementation). Then students were asked to repeat the exercise with simple modifications. We used Google Colab notebooks, which are Jupyter notebooks that run in the cloud and require no setup to use while providing free access to computing resources, including GPUs. We used Slack as the main channel of communication. This platform facilitated focused communication and collaboration through customizable chat rooms (channels). Each track had its own Slack channel where participants could continue class discussions.

Second, in the project development phase, the students were divided into small teams to work on their projects guided by their instructors. We encouraged students to form interdisciplinary teams. We asked instructors to design different projects to account for students' diverse skills and resources (e.g., explanatory blog posts, exploratory data analysis, modeling, and computer simulations). The student teams coordinated with their instructors, and the organizing team provided complementary support. We had additional Q&A sessions and practical tutorials that focused on building skills for making effective presentations and scientific reports and building web pages (using GitHub Pages) for their final projects. The chat rooms allowed the teaching staff to keep track of the students’ activities. For example, we frequently monitored question chats and ran informal polls, and then we tailored office hours and Q&A sessions to address common questions.

Each team submitted a final project in the form of two deliverables: a website that detailed the rationale for their work and a 2-minute video presentation where they could share their experience and lessons learned. We evaluated and awarded prizes to the final projects, taking into account the seven categories listed in Figure 1. Each category evaluates one of the following learning outcomes: 1) summarize and outline an easily reproducible science project; 2) demonstrate the capacity of teamwork and collaboration; 3) develop compelling data visualizations; 4) demonstrate technical merit by developing code that integrates a proper data analysis pipeline, developing an algorithm, or adequately using statistical analysis; 5) demonstrate the ability to communicate scientific concepts using data-driven information; 6) compile information together in a novel and insightful way; and 7) develop a well-rounded presentation. The different categories attempted to allow different skills, capabilities, and personalities to shine.

As an example of a final project, in the epidemiological models track, one team studied how to model the number of cases during the COVID-19 pandemic using Markov chain models and coded a Markov chain in Python. They learned how to retrieve data from public databases and analyze it according to their local context. In the therapeutics development track, one of the final projects focused on network biology and the application of protein-protein interaction networks to develop novel drugs against SARS-CoV-2. Students downloaded real data sets from different open sources and compared various protein networks to identify potential new drug targets; in one exercise, they attempted to replicate findings from a recent scientific publication (Saha et al., 2020).
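To illustrate the kind of exercise described above, the following is a minimal sketch of a discrete-time Markov chain in Python. It is not the students' code: the three states (susceptible, infected, recovered), the daily transition probabilities, and the function names `step` and `simulate` are all hypothetical choices made for illustration, and a chain with constant transition probabilities is a deliberate simplification of real epidemic dynamics.

```python
# Hypothetical three-state chain: Susceptible -> Infected -> Recovered.
# All probabilities below are illustrative assumptions, not project values.
STATES = ["S", "I", "R"]
TRANSITIONS = {
    "S": {"S": 0.95, "I": 0.05, "R": 0.00},
    "I": {"S": 0.00, "I": 0.90, "R": 0.10},
    "R": {"S": 0.00, "I": 0.00, "R": 1.00},
}


def step(distribution):
    """Advance the share of the population in each state by one day."""
    return {
        target: sum(
            distribution[source] * TRANSITIONS[source][target]
            for source in STATES
        )
        for target in STATES
    }


def simulate(initial, days):
    """Return the list of daily state distributions, starting with `initial`."""
    history = [initial]
    for _ in range(days):
        history.append(step(history[-1]))
    return history


history = simulate({"S": 1.0, "I": 0.0, "R": 0.0}, days=30)
for day in (0, 10, 20, 30):
    shares = ", ".join(f"{s}={history[day][s]:.3f}" for s in STATES)
    print(f"day {day}: {shares}")
```

Because each row of the transition table sums to one, the shares across states sum to one on every day. Replacing the assumed probabilities with values estimated from public case data would be the natural next step in such a project.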

Below we show some of the final projects from student teams, highlighting links to their websites and final presentations:

  • Epidemiological Modeling track: “CoviDetectives.” Modeling COVID Cases Using Markov Chains. Website, video presentation.

  • Health and Environment track: “PM2.5 and Covid-19 in San Pedro Garza García.” Studying the relationship between contamination and COVID in a city. Website, video presentation.

  • Health and Environment track: “Variability in Covid-19 Cases in Baja California Not Related to Population Density.” Investigating social factors that can affect the velocity of Covid infections in a state in Mexico. Website, video presentation, GitHub repository.

  • Therapeutics Development track: “Network Biology.” Looking at the discovery of new drugs from a network perspective. Website, video presentation, Google Colab.

  • Diagnostics Development track: “Comparing Diagnostic Methods.” Survey of several covid diagnostic methods. Website.

Insights From Survey Data

This section describes the results obtained from the registration form and the postprogram surveys given to participating students. These forms include multiple choice and open questions (see Appendix A) to learn demographics and expectations and assess students' learning experience. First, we present an overall description of the demographics of participating students (Figure 3) and summarize students' expectations, then we summarize their feedback and experience after participating in the program.

Figure 3. A snapshot of students that participated in our 2020 program. A) Barplot of the area of study of students. B) Barplot of English and programming skills prior to the Clubes de Ciencia Mexico (CdeCMx) Challenge—levels are defined as null (1), basic (2), intermediate (3), and advanced (4). C) Map of the states of Mexico color-coded by the number of participating students in each state.

We received 847 applications, from which 286 applicants completed the prerequisite introductory online programming course, and 280 enrolled and completed the CdeCMx Challenge. The median age of the participants was 20 years, with a 25th percentile at 18 years and a 75th percentile at 21 years; most participants (58%) reported being female. A large fraction (40%) of students reported studying the biosciences, coming from the natural, medical, and health sciences (Figure 3A). We asked the students to rate their programming skills and English level: 70% of students self-identified as having an intermediate or advanced English level, and 79% reported having null or basic programming skills (Figure 3B). We had a broad representation of students across almost all of Mexico's 32 states (Figure 3C). We asked students, "What do you expect to learn at the CdeCMx Challenge?" and classified their answers into seven categories; among them, 32% expected to develop programming skills, 18% expected to learn about the COVID-19 pandemic, and 8% expected to increase their exposure to scientific problems to inform their professional careers (including the possibility of postgraduate degrees and specializations).

We conducted an exit survey to determine students' satisfaction, to identify successful teaching strategies, and to understand potential limitations or disadvantages perceived by students (Figure 4). Only 187 students completed the postprogram survey. We were particularly interested in evaluating our program's difficulty level and whether logistics aspects prevented students from completing or fully participating in the CdeCMx Challenge. We asked students to rate their experience: 86% reported that the CdeCMx Challenge met their learning goals as expected or more; 42% rated the difficulty level of the course to be adequate and 39.8% as advanced; and 85% rated the activities and platforms used to be suitable based on their time availability, computer resources, and internet speed.
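To put these counts in proportion, the short sketch below computes the retention at each stage from the figures reported above (847 applications, 286 course completions, 280 Challenge completions, 187 survey responses). The helper name `pct` and the stage labels are our own; only the four counts come from the text.

```python
# Counts reported in the text; the stage labels are descriptive only.
funnel = [
    ("applications", 847),
    ("completed programming course", 286),
    ("completed the Challenge", 280),
    ("answered exit survey", 187),
]


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Compare each stage with the one immediately before it.
for (prev_name, prev_count), (name, count) in zip(funnel, funnel[1:]):
    print(f"{name}: {count} ({pct(count, prev_count)}% of {prev_name})")
```

Under these counts, roughly a third of applicants completed the prerequisite course, nearly all of those completed the Challenge, and about two thirds of participants answered the exit survey, which is worth keeping in mind when reading the survey percentages.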

Figure 4. Summary of exit survey answers evaluating students’ assessment of the program (187 responses). A) Top advantages and disadvantages of an online model from students' answers. B) Summary of the factors that affected students’ learning experience; other activities that students reported as affecting their learning experience included jobs and housework. C) Preferred teaching methods and strategies highlighted by students; “other tools” refers to Python, Google Colab, GitHub, and so on. D) Students’ suggestions for teachers to improve their online sessions. The questions were open, and we manually classified the answers into multiple categories. Figures exclude the individuals who either did not answer these questions or whose answers were difficult to interpret and classify. We note that a student's answer could fall into one or more categories for each question.

Since the CdeCMx Challenge was an online activity, we asked "What advantages or disadvantages do you find compared to face-to-face training?" (Figure 4A). Students pointed out as advantages having "no geographical boundaries" (20%), "flexible schedule" (11.7%), "more people can participate" (9.1%), and "work from home" (9.1%). Among the disadvantages that resonated the most, 26.3% mentioned "loss of direct social interaction," 13.2% "inefficient communication," and 9% "technical difficulties and internet connection." We then asked, "What factors do you consider that contributed or affected your learning experience?" (Figure 4B). Positive factors like the mentorship and guidance from instructors and staff members highlight the importance of having a highly motivated and committed teaching staff. In contrast, having limited prior experience with programming tools was self-perceived to be limiting for the overall experience in the program. Students also reported that having a good or bad team during the project phase affected their overall experience.

We asked, “What teaching methods instructors used were most helpful?” (Figure 4C). Students highlighted the following: having online complementary learning resources (e.g., blog posts, YouTube videos, scientific papers), practical usage of visual mediums in webinars (e.g., concise presentations, interactive plots), learning Python using Google Colab through interactive sessions, and Q&A sessions. Lastly, we asked for suggestions for instructors to improve their sessions (Figure 4D); most requests concerned sessions that lacked interaction or had poor time management during class, highlighting the importance of including interactive learning strategies.

We asked students to select factors that affected their availability to participate in the Challenge; over 60% chose both school commitments and work, 32% highlighted personal or family situations, and 18% mentioned having limited internet access. We also asked how many hours per day they could invest in the program; 43% selected from 2 to 3 hours, and another 35% selected from 4 to 5 hours per day.

Furthermore, we asked our students to self-assess their confidence in using programming and data science skills in future endeavors before and after the program. Students reported a median increase of one point (on a scale of 0–5) in self-confidence in using programming and data analysis in future endeavors. This should be interpreted with caution, since it could be influenced by the immediate enthusiasm of students; a follow-up survey after 6 months would be more reliable.
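One way to summarize such before-and-after ratings is the median of each student's paired change, sketched below. The ratings are made-up values on the 0–5 scale, not the survey data, and the text does not state whether the reported figure was computed from paired changes or from the difference between the two medians, so the choice here is an assumption.

```python
from statistics import median

# Hypothetical paired self-confidence ratings on the 0-5 scale
# (illustrative values only, not the actual survey responses).
before = [1, 2, 2, 3, 1, 0, 2, 3]
after = [2, 3, 3, 3, 3, 1, 3, 4]

# Change for each student, pairing the two ratings by position.
changes = [a - b for b, a in zip(before, after)]
median_change = median(changes)

print("changes per student:", changes)
print("median change:", median_change)
```

Working with paired changes keeps each student as their own baseline, which matters when starting levels differ as widely as the self-reported programming skills in this cohort.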

Lessons for Potential Implementers

Implementing an online STEM educational program allowed us to explore a new way to interact with students. In this section, we present some lessons we learned during this process and suggestions for future implementations. While several of these may seem to be common sense, and we acknowledge that the survey data described here represents limited results from one program at a particular point in time, our goal is to further the discussion about online teaching and note trends that may need further investigation.

  • Setting up expectations, planning, and adapting multiple mediums is key for a positive learning experience. Our postprogram survey suggests that students were more excited about their classes when they included activities that used multiple mediums. Video recordings, chat rooms (like Slack), and readings ameliorated challenges that may be present during synchronous online activities. Students also highlighted good time management and overall lesson organization as characteristics of effective lessons.

  • Going online may increase access across gender, geography, time, and economic backgrounds. In 5 years of operation of the Clubes de Ciencia program, we have received more than 8,000 student applications, with men representing the majority of them; interestingly, we saw this trend change for the first time. In this online edition, we had a higher percentage of female applicants (64.7%). While this observation warrants further investigation, it may be possible that eliminating the need to physically attend the program outside of the home lowered the barrier for female students to apply and participate in our programs. In line with this idea, students appreciated that they could participate without commuting and without having to pay traveling expenses. In previous years, many students who live outside of the cities where we offer the program had to travel interstate to participate. In this edition, we had a larger representation of students from southern states, which are characterized by limited educational and economic resources relative to the rest of the country.

  • It is uncertain how much high-quality online education can scale. While it is tempting to think that it is easier to expand an online program since it requires fewer economic resources, how many students one can serve without severely diminishing the quality of instruction remains unclear. We believe quality instruction requires a personal touch. Our students highlighted the organizing team, staff, and instructors as the main features that made the program a good educational experience. Our program included one member from our staff team for every 20 students and one instructor for every 20 students. Finding the appropriate ratio of instructor/staff/students will require further investigation.

  • Teaching data science to students with limited access to computers, or with limited programming or statistical skills, remains a challenge. We found that students often associated data science with programming activities, which might be discouraging if they do not have a prior technical background. We think it is beneficial during teaching to put effort into contextualizing data science with activities that require minimal specialized tools and skills. Combining technical learning outcomes, such as programming and advanced statistical analysis, with public speaking and collaboration may lower the barrier for students to get involved in data science programs. In our model, we encouraged instructors to teach how to digest a technical topic critically and communicate it to a variety of audiences. Using the news or analyzing upcoming policy changes can serve as a basis for these activities.
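
To make the scaling question above concrete, the following is a minimal sketch of the staffing arithmetic implied by the ratios we report (one staff member and one instructor for every 20 students). Treating those ratios as fixed targets is an assumption, and the function names and cohort sizes are hypothetical and chosen only for illustration.

```python
import math

# Ratios reported for this edition of the program. Treating them as fixed
# targets for larger cohorts is an assumption of this sketch.
STUDENTS_PER_STAFF = 20
STUDENTS_PER_INSTRUCTOR = 20


def people_needed(n_students, students_per_person):
    """Return the smallest head count that keeps the ratio at or below target."""
    if n_students < 0 or students_per_person <= 0:
        raise ValueError("n_students must be >= 0 and the ratio must be > 0")
    return math.ceil(n_students / students_per_person)


def staffing_plan(n_students):
    """Summarize the staff and instructors needed for a cohort of n_students."""
    return {
        "students": n_students,
        "staff": people_needed(n_students, STUDENTS_PER_STAFF),
        "instructors": people_needed(n_students, STUDENTS_PER_INSTRUCTOR),
    }


if __name__ == "__main__":
    # Hypothetical cohort sizes, not figures from the program.
    for cohort in (100, 250, 1000):
        print(staffing_plan(cohort))
```

Under this assumption the number of people required grows linearly with enrollment, which is one way to see why expanding an online program is not free even when travel and venue costs disappear.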

Concluding Remarks

When moving a face-to-face educational program to an online format, there are many unknowns, from technology choices to educational practices. Nonetheless, this challenge also offers a great opportunity to experiment with new educational models and tools. Before the COVID-19 pandemic, our program relied on instructor travel, face-to-face interactions, and access to labs and computers to deliver educational content. This first online version of our program received positive feedback from the participants (see Figure 5 for testimonials). In surveys and testimonials, participants reported positive STEM socialization and increased self-confidence in data science, which is crucial to increasing STEM representation (Eccles, 2007; Valla & Williams, 2012).

Figure 5. Testimonials from participants. To provide a snapshot of how participants at all levels felt during these activities, we provide some quotes from instructors and students.

Some of the overall advantages of running a virtual program that we identified were 1) more instructors could get involved because of fewer financial constraints and greater flexibility in scheduling; 2) more students could participate, while keeping a small student–instructor ratio; and 3) we reached an audience that might not otherwise have participated, which is especially important for STEM programs that aim to reach underserved minority groups. To meet the educational goals, some of the successful active learning strategies for online workshops were 1) setting common goals across clubes, which helped create collaborations across teams that would otherwise be unlikely to arise organically in an online setting; 2) using collaborative platforms like Google Colab for teaching programming and chat rooms (Slack) as the primary means of communication; and 3) setting clear expectations, writing instructions and tutorials, and providing Q&A sessions, which were crucial for getting the best from students and instructors.

However, multiple challenges remain: 1) finding ways to reinforce students' accountability, particularly when working in teams, since it affects students' overall experience; 2) finding tools and activities that do not rely on high internet speed or high-quality computer devices; 3) improving and refining the means of online social interaction; 4) finding the optimal length of an online program, where students can have more classes without losing focus; and 5) defining the best student/instructor ratio, and to what extent we can increase the program size without compromising its quality.

In the future, we aim to incorporate this online experience to develop a hybrid educational program since it proved to be an excellent opportunity to increase access to STEM education. This may be particularly important for underserved groups.


Acknowledgments

We want to thank the staff, students, and instructors who made the 2020 edition of CdeCMx possible. We also thank Gabriel Fuente (Scripps Institute) and Brendan Deveney (Harvard University), who contributed to the survey design. Lastly, we want to emphasize that the analysis of survey data was performed by A.M.A.R. and G.M.B., both volunteers and former students of our program.

Disclosure Statement

Andrea Monserrat Arredondo-Rodriguez, Gabriel Missael-Barco, Claudia Ivette García-Gil, Alicia del Carmen Hernández-Guzman, Rogelio Antonio Hernández-López, Benjamin Manuel Sánchez-Lengeling, and Carla Márquez-Luna have no financial or non-financial disclosures to share for this article.


References

Adam-Bourdarios, C., Cowan, G., Germain, C., Guyon, I., Kégl, B., & Rousseau, D. (2015). The Higgs boson machine learning challenge. In Proceedings of Machine Learning Research: Vol. 42. Proceedings of the NIPS 2014 Workshop on High-Energy Physics and Machine Learning (pp. 19–55).

Anderson, L. S., & Gilbride, K. A. (2002). Pre-university outreach: Encouraging students to consider engineering careers. Global Journal of Engineering Education, 7(1), 87–93.

Eccles, J. S. (2007). Where are all the women? Gender differences in participation in physical sciences and engineering. In S. J. Ceci & W. M. Williams (Eds.), Why aren’t more women in science? Top researchers debate the evidence (pp. 199–210). American Psychological Association.

Ferreira, L. M. R., Carosso, G. A., Duran, N. M., Bohorquez-Massud, S. V., Vaca-Diez, G., Rivera-Betancourt, L. I., Rodriguez, Y., Ordonez, D. G., Alatriste-Gonzalez, D. K., Vacaflores, A., Gonzalez Auza, L., Schuetz, C., Alvarado-Arnez, L. E., Alexander-Savino, C. V., Gandarilla, O., & Mostajo-Radji, M. A. (2019). Effective participatory science education in a diverse Latin American population. Palgrave Communications, 5(63), 1–18.

Freeman, S., Eddy, S. L., McDonough, M., Smith, M. K., Okoroafor, N., Jordt, H., & Wenderoth, M. P. (2014). Active learning increases student performance in science, engineering, and mathematics. Proceedings of the National Academy of Sciences, 111(23), 8410–8415.

Gándara, P., & Bial, D. (2001). Paving the way to postsecondary education: K-12 intervention programs for underrepresented youth. Report of the National Postsecondary Education Cooperative Working Group on Access to Postsecondary Education. National Center for Education Statistics, Office of Educational Research and Improvement, U.S. Department of Education.

Heinze, K. F., Allen, J. L., & Jacobsen, E. N. (1995). Encouraging tomorrow’s chemists: University outreach program bringing hands-on experiments to local students. Journal of Chemical Education, 72(2), 167.

Hunter, J. D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3), 90–95.

Instituto Nacional de Estadística y Geografía. (2020, 14 May). Estadísticas a Propósito del Día Mundial del Internet.

Kudenko, I., & Gras-Velázquez, À. (2016). The future of European STEM workforce: What secondary school pupils of Europe think about STEM industry and careers. In N. Papadouris, A. Hadjigeorgiou, & C. Constantinou (Eds.), Insights from research in science teaching and learning (pp. 223–236). Contributions from Science Education Research (Vol. 2). Springer, Cham.

Markowitz, D. G. (2004). Evaluation of the long-term impact of a university high school summer science program on students’ interest and perceived abilities in science. Journal of Science Education and Technology, 13(3), 395–407.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.

Reback, J., McKinney, W., jbrockmendel, Van den Bossche, J., Augspurger, T., Cloud, P., gfyoung, Sinhrks, Hoefler, P., Klein, A., Petersen, T., Tratner, J., She, C., Ayd, W., Naveh, S., Darbyshire, J. H. M., Garcia, M., Shadrach, R., Schendel, J., … Battison, P. (2021, January 24). pandas-dev/pandas: Pandas 1.2.1. 2021.

Saha, S., Halder, A. K., Bandyopadhyay, S. S., Chatterjee, P., Nasipuri, M., Bose, D., & Basu, S. (2020). Is fostamatinib a possible drug for COVID-19? – A computational study. OSF Preprints.

Schultz, J. L., & Mueller, D. (2006). Effectiveness of programs to improve postsecondary education enrollment and success of underrepresented youth: A literature review. Wilder Research.

Valla, J. M., & Williams, W. M. (2012). Increasing achievement and higher-education representation of under-represented groups in science, technology, engineering, and mathematics fields: A review of current K-12 intervention programs. Journal of Women and Minorities in Science and Engineering, 18(1), 21–53.

Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Millman, K. J., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., … & SciPy 1.0 Contributors. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17, 261–272.


Appendix A: Students exit survey

  1. Were you able to finish your participation in the CdeCMx Challenge? In other words, did you attend the majority of the sessions and submit your final project?

    a. Yes

    b. No

  2. Generally speaking, was the CdeCMx Challenge what you expected?

    Scale 1 (Less than expected) to 5 (More than I expected)

  3. Why?

  4. Did the CdeCMx Challenge meet your learning expectations?

    Scale 1 (Less than expected) to 5 (More than I expected)

  5. One of CdeCMx Challenge's main objectives was to show students theoretical and practical ways of using science, technology, and programming to understand and solve problems associated with the COVID-19 pandemic and motivate students to take action and propose solutions to those challenges. Do you think we achieved this objective?

    Scale 1 (Not at all) to 5 (Absolutely)

  6. In your opinion, indicate the level of difficulty of the content during the Challenge.

    Scale 1 (Very basic) to 5 (Very advanced)

  7. Do you consider that the activities and platforms used during the Challenge worked well for your time availability, computer access, and internet speed?

    Scale 1 (Disagree) to 5 (Absolutely agree)

  8. Do you have any suggestions to improve the experience?

  9. What activities of the CdeCMx Challenge do you consider that were useful to you?
    Check all that apply.

    a. The Python course and office hours

    b. Webinars by instructors

    c. Q&A sessions with instructors

    d. Instructor-led hands-on activities

    e. Social hours

    f. Working in teams on the final project

    g. Communication with staff and organizers via Slack

    h. Website construction advice (counseling?)

  10. What activities of the CdeCMx Challenge do you think we can improve and how?

  11. The CdeCMx Challenge was an online activity; what advantages or disadvantages do you find compared to face-to-face training?

  12. What factors do you consider that contributed or affected your learning experience during the Challenge?

  13. How much do you think the online option contributed to your participation in this edition?

  14. How do you rate the overall teaching quality of the CdeCMx Challenge instructors?

    Scale 1 (Poor) to 5 (Exceptional)

  15. What were some teaching methods that instructors used that were most helpful to you?

  16. What suggestions do you have for instructors to improve their sessions?

  17. Would you like to be able to continue in contact with one of your instructors after the CdeCMx Challenge?

    a. Yes

    b. No

    c. Maybe

  18. If your answer was yes, by what means?

    a. E-mail

    b. Slack

    c. Video calls

    d. Activities face-to-face

  19. Before you participated in this edition of Clubes de Ciencia, how much exposure have you had to science activities outside of school?

    Scale 0 (Null) to 5 (A lot)

  20. Before you participated in this edition of Clubes de Ciencia, have you had experience taking science classes in English?

    a. Yes

    b. No

  21. During your participation in this edition of Clubes de Ciencia, did you have any classes in English?

    a. Yes

    b. No

  22. After your participation in Clubes de Ciencia, do you feel confident taking classes in English? How so?

    Scale 0 (Not confident at all) to 5 (Very confident)

  23. Since part of our activities was in English, select the option with which you agree the most.

    a. The number of activities in English was adequate

    b. I prefer more activities to be in Spanish

    c. My position is neutral regarding the proportion of classes in English and/or Spanish

    d. I prefer not to answer

  24. Rate your programming knowledge before your participation in this edition of Clubes de Ciencia (it can be any programming language).

    Scale 0 (Null) to 5 (Excellent)

  25. Before you participated in this edition of Clubes de Ciencia, how did you feel about being able to do a job that involved a bit of programming or data analysis?

    Scale 0 (Not confident at all) to 5 (Very confident)

  26. After participating in this edition of Clubes de Ciencia, how do you feel about using programming or data analysis in future projects?

    Scale 0 (Not confident at all) to 5 (Very confident)

  27. Approximately how many hours a day did you have to invest in activities of the CdeCMx Challenge?

    a. Less than 2 hours

    b. Around 2-3 hours

    c. Around 4-5 hours

    d. More than 6 hours

  28. Select all the factors that affect your time availability for activities like the CdeCMx Challenge.

    a. I have to work and study

    b. Other extracurricular projects

    c. School duties

    d. Limited access to a computer/internet

    e. Slow Internet

    f. Personal / family situations

    g. I have no time

    h. Other

  29. What was your experience using Slack as the primary means of communication?

    Scale 1 (Very poor) to 5 (Excellent)

  30. Do you have any recommendations that make it easier for you to adapt to using new tools like Slack?

  31. How likely is it that you would recommend a friend or family member to participate in the CdeCMx Challenge?

    Scale 1 (Not at all likely) to 5 (Extremely likely)

  32. Why?

  33. Would you consider participating again in the CdeCMx Challenge?

    a. Yes

    b. No

    c. Maybe

    d. NA

  34. If the CdeCMx Challenge activities lasted 30 days, instead of 10 days like this time, would you consider participating?

    a. Yes

    b. No

    c. Maybe

    d. NA

  35. Any final comment you want to add

If you could not finish the CdeCMx Challenge:

  1. Select all the factors that you think prevented you from participating.

    a. Lack of time

    b. Lack of access to a computer/internet

    c. Personal / family problems

    d. Health problems

    e. It was not what I expected

    f. It was very difficult for me

  2. Do you have any comments or suggestions for CdeCMx?

Appendix B: Example of a class schedule

Track: Therapeutics. In yellow we highlighted the data science classes.




Aug 05, Day 1

  • 11:30 AM: Opening Ceremony

  • 1 PM: Intros and icebreaker. Instructors' intros, tasks, and meet and greet.

Aug 06, Day 2 (Thursday)

  • 12 PM: Overview of viral cycle (Lecture, Q&A)

  • 2 PM: Overview of diagnostics, antiviral therapeutics, and vaccines (Lecture, Q&A)

  • 5 PM: Intro to Bioinformatics (Lecture, Q&A)

Aug 07, Day 3 (Friday)

  • 12:00 PM–1:30 CST: SARS-CoV-2 key target proteins and therapeutics to target viral proteins (Lecture, Q&A)

  • 2:00–3:30: Biochemistry and biophysical methods to study protein-protein interactions (Lecture, Q&A)

Aug 08, Day 4 (Saturday)

  • 1–2:30 PM: Network biology, interactome (Lecture, Q&A)

  • 3–4:30: Protein structure visualization and antibody discovery (Lecture, Q&A)

Aug 09, Day 5

  • 10–11:30 AM: Vaccine manufacturing (Lecture, Q&A)

  • 11:30–12:00 PM: Preclinical stages of drug development (Lecture, Q&A)

  • 12:30–2 PM: Clinical trials and clinical data modeling (Lecture, Q&A)

Aug 10, Day 6 to Aug 14, Day 10

  • Office hours, times according to instructor availability

  • Work on your challenge

Final project examples:

  1. Explain the study design and statistical analyses to assess the safety and efficacy of a new vaccine.

  2. Given data from lung or another tissue of interest, and using protein interaction networks, propose drugs that could work for this tissue in particular.

©2022 Andrea Monserrat Arredondo-Rodriguez, Gabriel Missael-Barco, Claudia Ivette García-Gil, Alicia del Carmen Hernández-Guzman, Rogelio Antonio Hernández-López, Benjamin Manuel Sánchez-Lengeling, and Carla Márquez-Luna. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.
