B Learning Objectives

This appendix lists the learning objectives for each chapter and is intended to help instructors who want to use this curriculum.

B.1 Getting Started

  • Identify the few standard files that should be present in every research software project.
  • Explain the typical directory structure used in small and medium-sized data analysis projects.
  • Download the required data.
  • Install the required software.

B.2 The Basics of the Unix Shell

  • Explain how the shell relates to the keyboard, the screen, the operating system, and users’ programs.
  • Explain when and why a command-line interface should be used instead of a graphical user interface.
  • Explain the steps in the shell’s read-evaluate-print loop.
  • Identify the command, options, and filenames in a command-line call.
  • Explain the similarities and differences between files and directories.
  • Translate an absolute path into a relative path and vice versa.
  • Construct absolute and relative paths that identify specific files and directories.
  • Delete, copy, and move files and directories.

B.3 Building Tools with the Unix Shell

  • Redirect a command’s output to a file.
  • Use redirection to process a file instead of keyboard input.
  • Construct pipelines with two or more stages.
  • Explain Unix’s “small pieces, loosely joined” philosophy.
  • Write a loop that applies one or more commands separately to each file in a set of files.
  • Trace the values taken on by a loop variable during execution of the loop.
  • Explain the difference between a variable’s name and its value.
  • Demonstrate how to see recently executed commands.
  • Re-run recently executed commands without retyping them.

B.4 Going Further with the Unix Shell

  • Write a shell script that uses command-line arguments.
  • Create pipelines that include shell scripts as well as built-in commands.
  • Create and use variables in shell scripts with correct quoting.
  • Use grep to select lines from text files that match simple patterns.
  • Use find to find files whose names match simple patterns.
  • Edit the .bashrc file to change default shell variables.
  • Create aliases for commonly used commands.

B.5 Building Command-Line Tools with Python

  • Explain the benefits of writing Python programs that can be executed at the command line.
  • Create a command-line Python program that respects Unix shell conventions for reading input and writing output.
  • Use the argparse library to handle command-line arguments in a program.
  • Explain how to tell if a module is being run directly or being loaded by another program.
  • Write docstrings for programs and functions.
  • Explain the difference between optional arguments and positional arguments.
  • Create a module that contains functions used by multiple programs and import that module.
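
As a rough illustration of several of these objectives together, the following minimal sketch shows a program with docstrings, one positional and one optional argument handled by argparse, and a check for whether the file is being run directly. The program itself and the names scaled_sum, parse_args, values, and --scale are hypothetical, chosen only for this example.

```python
"""Add up numbers given on the command line (an illustrative sketch)."""

import argparse


def scaled_sum(values, scale=1.0):
    """Return the sum of `values` multiplied by `scale`."""
    return scale * sum(values)


def parse_args(argv=None):
    """Parse arguments; argv=None tells argparse to read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description=__doc__)
    # Positional arguments are identified by where they appear.
    parser.add_argument('values', type=float, nargs='*',
                        help='numbers to add')
    # Optional arguments are identified by a flag and usually have a default.
    parser.add_argument('-s', '--scale', type=float, default=1.0,
                        help='multiplier applied to the sum')
    return parser.parse_args(argv)


def main(argv=None):
    """Run the program: parse arguments, compute, and print the result."""
    args = parse_args(argv)
    print(scaled_sum(args.values, args.scale))


if __name__ == '__main__':
    # True only when this file is run directly, not when it is imported.
    main()
```

Because the work is done in functions and main() only runs under the __name__ check, another program can import scaled_sum from this file without triggering any argument parsing.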

B.6 Using Git at the Command Line

  • Explain the advantages and disadvantages of using Git at the command line.
  • Demonstrate how to configure Git on a new computer.
  • Create a local Git repository at the command line.
  • Demonstrate the modify-add-commit cycle for one or more files.
  • Synchronize a local repository with a remote repository.
  • Explain what the HEAD of a repository is and demonstrate how to use it in commands.
  • Identify and use Git commit identifiers.
  • Demonstrate how to compare revisions to files in a repository.
  • Restore old versions of files in a repository.
  • Explain how to use .gitignore to ignore files and identify files that are being ignored.

B.7 Going Further with Git

  • Explain why branches are useful.
  • Demonstrate how to create a branch, make changes on that branch, and merge those changes back into the original branch.
  • Explain what conflicts are and demonstrate how to resolve them.
  • Explain what is meant by a branch-per-feature workflow.
  • Define the terms fork, clone, remote, and pull request.
  • Demonstrate how to fork a repository and submit a pull request to the original repository.

B.8 Working in Teams

  • Explain how a project lead can be a good ally.
  • Explain the purpose of a Code of Conduct and add one to a project.
  • Explain why every project should include a license and add one to a project.
  • Describe different kinds of licenses for software and written material.
  • Explain what an issue tracking system does and what it should be used for.
  • Describe what a well-written issue should contain.
  • Explain how to label issues to manage work.
  • Submit an issue to a project.
  • Describe common approaches to prioritizing tasks.
  • Describe some common-sense rules for running meetings.
  • Explain why every project should include contribution guidelines and add some to a project.
  • Explain how to handle conflict between project participants.

B.9 Automating Analyses with Make

  • Explain what a build manager is and how it aids reproducible research.
  • Name and describe the three parts of a build rule.
  • Write a Makefile that re-runs a multi-stage data analysis.
  • Explain and trace how Make chooses an order in which to execute rules.
  • Explain what phony targets are and define a phony target.
  • Explain what automatic variables are and identify three commonly used automatic variables.
  • Write Make rules that use automatic variables.
  • Explain why and how to write pattern rules in a Makefile.
  • Write Make rules that use patterns.
  • Define variables in a Makefile explicitly and by using functions.
  • Make a self-documenting Makefile.

B.10 Configuring Programs

  • Explain what overlay configuration is.
  • Describe the four levels of configuration typically used by robust software.
  • Create a configuration file using YAML.
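
One way to picture overlay configuration is as a series of dictionaries merged in order, so that later layers override earlier ones. The sketch below is an assumption-laden illustration: plain dictionaries stand in for parsed YAML files, and the layer names and settings (system, user, job, command_line, color, fontsize) are hypothetical.

```python
def overlay(*layers):
    """Merge configuration layers; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical layers, ordered from most general to most specific.
system = {'color': 'black', 'fontsize': 10}
user = {'fontsize': 12}
job = {'color': 'blue'}
command_line = {'fontsize': 14}

config = overlay(system, user, job, command_line)
print(config)
```

In a real program each dictionary would typically be loaded from a file or built from parsed command-line options rather than written inline.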

B.11 Testing Software

  • Explain three different goals for testing software.
  • Add assertions to a program to check that it is operating correctly.
  • Write and run unit tests using pytest.
  • Determine the coverage of those tests and identify untested portions of code.
  • Explain continuous integration and implement it using Travis CI.
  • Describe and contrast test-driven development and checking-driven development.
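
The difference between assertions inside a program and unit tests that exercise it from outside can be sketched as follows. The function normalize and its tests are hypothetical examples; the tests use plain assert statements and follow the test_ naming convention so that pytest would discover them if this code were saved in a file such as test_normalize.py.

```python
def normalize(counts):
    """Convert a list of non-negative counts to fractions that sum to 1."""
    # Assertions check that the program is operating on sensible data.
    assert all(c >= 0 for c in counts), 'counts must be non-negative'
    total = sum(counts)
    assert total > 0, 'at least one count must be positive'
    return [c / total for c in counts]


def test_normalize_preserves_length():
    # A unit test compares actual behavior with expected behavior.
    assert len(normalize([1, 3])) == 2


def test_normalize_values():
    assert normalize([1, 3]) == [0.25, 0.75]
```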

B.12 Handling Errors

  • Explain how to use exceptions to signal and handle errors in programs.
  • Raise exceptions and write try/except blocks to catch them.
  • Explain what is meant by “throw low, catch high”.
  • Describe the most common built-in exception types in Python and how they relate to each other.
  • Explain what makes a useful error message.
  • Create and use a lookup table for common error messages.
  • Explain the advantages of using a logging framework rather than print statements.
  • Describe the five standard logging levels and explain what each should be used for.
  • Create, configure, and use a simple logger.
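
Several of these objectives fit into one small sketch: an error is raised where it is detected ("throw low"), caught where the program knows how to respond ("catch high"), described using a lookup table of messages, and reported through a logger rather than print. The names ERRORS, mean, report, and the logger name 'sketch' are hypothetical and used only for illustration.

```python
import logging

# Hypothetical lookup table that keeps error messages in one place.
ERRORS = {
    'empty': 'No values supplied; cannot compute a mean.',
}

# Python's five standard levels are DEBUG, INFO, WARNING, ERROR, CRITICAL.
logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger('sketch')


def mean(values):
    """Throw low: raise the exception where the problem is detected."""
    if not values:
        raise ValueError(ERRORS['empty'])
    return sum(values) / len(values)


def report(values):
    """Catch high: handle the error where we know what to do about it."""
    try:
        result = mean(values)
    except ValueError as error:
        logger.error('Could not report mean: %s', error)
        return None
    logger.debug('Mean of %d values is %s', len(values), result)
    return result
```

With the level set to WARNING, the error message is shown but the debug message is suppressed; changing the level in basicConfig is enough to see both.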

B.13 Tracking Provenance

  • Explain what a DOI is and how to get one.
  • Explain what an ORCID is and get one.
  • Describe the FAIR Principles and determine whether a dataset conforms to them.
  • Explain where to archive small, medium, and large datasets.
  • Describe good practices for archiving analysis code and determine whether a report conforms to them.
  • Explain the difference between reproducibility and inspectability.

B.14 Creating Packages with Python

  • Create a Python package using setuptools.
  • Create and use a virtual environment to manage Python package installations.
  • Install a Python package using pip.
  • Distribute that package via TestPyPI.
  • Write a README file for a Python package.
  • Use Sphinx to create and preview documentation for a package.
  • Explain where and how to obtain a DOI for a software release.
  • Describe some academic journals that publish software papers.