Computing infrastructure for teaching data science

posit::conf(2023)
Teaching Data Science Masterclass

Mine Çetinkaya-Rundel

Recap

Your turn!

Log out of Posit Cloud and log back in.

02:00

This morning…

  • Module 1: Teaching data science with the tidyverse and Quarto
  • Module 2: Teaching data science with Git and GitHub
  • Everything we’ve discussed and recommended as good practice requires a ready-to-go computing environment, as opposed to one where students need to install various software before they can make their first data visualization

Why RStudio in the cloud?

  • Reduce friction at first exposure to R
  • Avoid local installation
  • Install R and RStudio on a server and provide access to students:
    • Centralized RStudio server (*)
    • Dockerized RStudio server (*)
    • Posit Cloud

Hello Posit Cloud!

What is Posit Cloud?

We created Posit Cloud to make it easy for professionals, hobbyists, trainers, teachers, and students to do, share, teach, and learn data science using R.

An RStudio project

A Posit Cloud project

Why Posit Cloud?

  • Does not require IT support
  • Features designed for instructors:
    • Classes can be organized in workspaces
    • Members can be assigned different roles: instructor, TA, student
    • Projects can be public or private
    • Projects can be made into assignments
    • Students can make copies of projects created by instructor
    • Instructor can peek into student projects
    • A default project template can ensure same packages in each new project created in the workspace
    • Git works out of the box

Contexts

  • Semester long courses
    • Intro data science / statistics: little to no background in stats, data science, programming
    • Upper level data science / statistics: varied computing background and different computer setups
  • Shorter workshops: Likely no opportunity to communicate pre-workshop instructions, varied computing background and learning goals

Workspaces

  • When you create an account on Posit Cloud you get a workspace of your own

  • You can add a new workspace and control its permissions

  • Projects in either workspace can be public or private

Projects

  • A new project in Posit Cloud

  • is basically a new project in the RStudio IDE

Say yes to projects

If you use RStudio, use projects! Trust me, you won’t regret it. Find out more at https://r4ds.had.co.nz/workflow-projects.html.

Projects from Git

  • A new project from Git Repository in Posit Cloud
  • is creating (cloning) a project from a Git repository in RStudio

Teaching a short workshop

Setup in 280 characters

Single project, instructor view

Demo:

  • Create a new project in your workspace

  • Install all required packages and add all required files

  • Change permissions

  • Share URL – make a shortlink if you like!

Single project, student view

Demo:

  • Go to the URL shared with you

  • Pick up the project where the instructor left off

Sharing an individual project

Good!

  • Students land directly in a project upon login
  • Works well for workshops where all work will be completed in a single project
  • Also great for sharing code in general, e.g. collaboration, reprexes, etc.

Not so good…

  • Students need to remember to make a copy of the project (which means you need to remember to remind them!)
  • Changes you make after a student launches the project won’t propagate to their copy

Your turn: Create a shareable project

You may have already done some of the steps below during my demo!

Your role: Instructor

  • Create a new project in your workspace and give it a name.
  • Create a new Quarto document in the project – just the new document template is fine!
  • Change the access level of the project so others can see it as well.
  • Grab the project URL and share it with your neighbor.


Your role: Student

Access your neighbor’s project as if you’re the student and they’re the instructor.

05:00

Teaching a longer course

Creating a workspace, instructor view

Demo:

  • Create a new workspace and add a description

  • Set permissions for various roles

  • Set up a template, and make it the default

  • Set up a new assignment

  • Invite students via a sharing link

Permission levels

role | permission | course role
admin | manage users; view, edit, and manage all projects | instructor
moderator | view, edit, and manage all projects | TA
contributor | create, edit, and manage their own projects | student
viewer | view projects shared with everyone | auditor

Sharing a workspace

Good!

  • Various permission levels
  • Base projects with desired packages installed
  • Assignments, which remove the need to remind students to make a copy of the project before starting work
  • Ability to peek into students’ projects

Not so good…

  • Students land in the workspace, so you may need to provide instructions for the next steps
  • You can update the base project throughout the course, but the update will only apply to projects created going forward

Cloud miscellanea

Git integration

A default project template can be used, so a project created with New Project from Git Repository also has the right packages installed!

Institution accounts

Dashboards

Student usage / engagement metrics

Parting remarks

Tips

  • Each project is allocated 0.5 GB of RAM by default
    • Test things out before assignments involving large datasets
  • What your students see is not always what you see
    • Create a secondary account and add it to the workspace as a student

Posit Cloud and your course

  • If teaching without Git and GitHub, Posit Cloud offers sufficiently rich functionality to fully support your course

  • If teaching with Git and GitHub, there are some rough edges (that I hope will be ironed out soon):

    • You need to set a PAT (personal access token) for each project. This is not a realistic GitHub workflow, and it requires either keeping as many PATs as projects lying around or using a good password manager where you can store the PAT, then copy it and set it in each project. Thanks to the usethis package, this can be done with 2 commands without going to the Terminal, which is good if the Terminal is not part of the course learning goals.

    • You need to run git config credential.helper store in the Terminal to make sure the PAT is available in that project “forever”, or until the PAT itself expires.
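
A minimal sketch of the Terminal step above, run in a throwaway repository so nothing real is touched. The scratch directory and the variable names (repo, helper) are assumptions for illustration only; in a Posit Cloud project you would run just the git config line from the project's own Terminal.

```shell
# Assumption: a scratch repository stands in for the Posit Cloud project.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"

# The command from the bullet above. Without --global, the setting is
# written to this repository's own config, i.e. it applies per project.
git config credential.helper store

# Read the repository-level setting back to confirm it took effect.
helper="$(git config --local --get credential.helper)"
echo "credential.helper = $helper"
```

Note that the store helper saves credentials unencrypted in a plain-text file (by default ~/.git-credentials) after the first successful authentication, so it is worth pairing it with a PAT that has a limited scope and an expiration date.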

Discussion

What models can you envision for collecting assignments and providing feedback when teaching with Posit Cloud?

03:00

To be or not to be, in the IDE

Work in the IDE

  • So far we’ve demonstrated and discussed computational infrastructure options where students are “in the IDE”, where they can do data science

  • Another approach, with the opportunity for even more immediate hands-on interaction, is running code in the browser, to quickly experience data science