# J Glossary

abandonware
Software that is no longer being maintained.
absolute error
The absolute value of the difference between the observed and the correct value. Absolute error is usually less useful than relative error.
absolute import
In Python, an import that specifies the full location of the file to be imported.
absolute path
A path that points to the same location in the filesystem regardless of where it is evaluated. An absolute path is the equivalent of latitude and longitude in geography. See also: relative path
actual result (of test)
The value generated by running code in a test. If this matches the expected result, the test passes; if the two are different, the test fails.
agile development
A software development methodology that emphasizes lots of small steps and continuous feedback instead of up-front planning and long-term scheduling. Exploratory programming is often agile.
ally
Someone who actively promotes and supports inclusivity.
append mode
To add data to the end of an existing file instead of overwriting the previous contents of that file. Overwriting is the default, so most programming languages require programs to be explicit about wanting to append instead.
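As a minimal sketch in Python (the file name log.txt and its contents are made up for illustration), opening a file with mode "w" overwrites it, while mode "a" appends:

```python
import os
import tempfile

# Hypothetical file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

with open(path, "w") as writer:   # "w" discards any previous contents
    writer.write("first\n")

with open(path, "a") as writer:   # "a" adds to the end of the existing contents
    writer.write("second\n")

with open(path) as reader:
    contents = reader.read()      # both lines are present
```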
assertion
A Boolean expression that must be true at a certain point in a program. Assertions may be built into the language (e.g., Python’s assert statement) or provided as functions (e.g., R’s stopifnot). They are often used in testing, but are also put in production code to check that it is behaving correctly. In many languages, assertions should not be used to perform data validation, as they may be silently dropped by compilers and interpreters under optimization conditions. Using assertions for data validation can therefore introduce security risks. Unlike many languages, R does not have an assert statement which can be disabled, and so use of a package such as assertr for data validation does not create security holes.
authentic task
A task which contains important elements of things that learners would do in real (non-classroom) situations.
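The distinction between validating data and asserting internal correctness might look like this in Python. This is a sketch only; the function name largest is hypothetical. The explicit check always runs, while the assert statement is skipped when Python is started with the -O option.

```python
def largest(values):
    """Return the largest item in a non-empty list (illustrative helper)."""
    # Data validation: an explicit check that is never optimized away.
    if not values:
        raise ValueError("values must not be empty")

    result = values[0]
    for value in values[1:]:
        if value > result:
            result = value

    # Internal sanity check: removed when Python runs with -O,
    # so it must not be the only guard against bad input.
    assert all(result >= value for value in values), "result is not the maximum"
    return result
```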
auto-completion
A feature that allows the user to finish a word or piece of code quickly by pressing the TAB key to list possible completions from which the user can select.
automatic variable
A variable that is automatically given a value in a build rule. For example, Make automatically assigns the name of a rule’s target to the automatic variable $@. Automatic variables are frequently used when writing pattern rules. See also: Makefile
boilerplate
Standard text that is included in legal contracts, licenses, and so on.
branch-per-feature workflow
A common strategy for managing work with Git and other version control systems in which a separate branch is created for work on each new feature or each bug fix and merged when that work is completed. This isolates changes from one another until they are completed.
bug report
A collection of files, logs, or related information that describes either an unexpected output of some code or program, or an unexpected error or warning. This information is used to help find and fix a bug in the program or code.
bug tracker
A system that tracks and manages reported bugs for a software program, to make it easier to address and fix the bugs.
build manager
A program that keeps track of how files depend on one another and runs commands to update any files that are out-of-date. Build managers were invented to compile only those parts of programs that had changed, but are now often used to implement workflows in which plots depend on results files, which in turn depend on raw data files or configuration files. See also: build rule, Makefile
build recipe
The part of a build rule that describes how to update something that has fallen out-of-date.
build rule
A specification for a build manager that describes how some files depend on others and what to do if those files are out-of-date.
build target
The file(s) that a build rule will update if they are out-of-date compared to their dependencies. See also: Makefile, default target
byte code
A set of instructions designed to be executed efficiently by an interpreter.
call stack
A data structure that stores information about the subroutines that are currently active.
camel case
A style of writing code that involves naming variables and objects with no space, underscore (_), dot (.), or dash (-) characters, with each word (sometimes excluding the first) being capitalized. Examples include CalculateSum and findPattern. See also: kebab case, pothole case
catch (an exception)
To accept responsibility for handling an error or other unexpected event. R prefers “handling a condition” to “catching an exception”. Python, on the other hand, encourages raising and catching exceptions, and in some situations, requires it.
Creative Commons license
A set of licenses that can be applied to published work. Each license is formed by concatenating one or more of -BY (Attribution): users must cite the original source; -SA (ShareAlike): users must share their own work under a similar license; -NC (NonCommercial): work may not be used for commercial purposes without the creator’s permission; -ND (NoDerivatives): no derivative works (e.g., translations) can be created without the creator’s permission. Thus, CC-BY-NC means “users must give attribution and cannot use commercially without permission”. The term CC-0 (zero, not letter ‘O’) is sometimes used to mean “no restrictions”, i.e., the work is in the public domain.
checklist
A list of things to be checked or completed when doing a task.
command-line interface
A user interface that relies solely on text for commands and output, typically running in a shell.
code coverage (in testing)
How much of a library or program is executed when tests run. This is normally reported as a percentage of lines of code: for example, if 40 out of 50 lines in a file are run during testing, those tests have 80% code coverage.
code review
To check a program or a change to a program by inspecting its source code.
cognitive load
The amount of working memory needed to accomplish a set of simultaneous tasks.
command history
An automatically-created list of previously-executed commands. Most read-eval-print loops (REPLs), including the Unix shell, record history and allow users to play back recent commands.
command-line argument
A filename or control flag given to a command-line program when it is run.
command line flag
See command-line argument.
command line option
See command-line argument.
command line switch
See command-line argument.
comment
Text written in a script that is not treated as code to be run, but rather as text that describes what the code is doing. These are usually short notes, often beginning with a # (in many programming languages).
commit
As a verb, the act of saving a set of changes to a database or version control repository. As a noun, the changes saved.
commit message
A comment attached to a commit that explains what was done and why.
commons
Something managed jointly by a community according to rules they themselves have evolved and adopted.
competent practitioner
Someone who can do normal tasks with normal effort under normal circumstances. See also: novice, expert
computational notebook
A combination of a document format that allows users to mix prose and code in a single file, and an application that executes that code interactively and in place. The Jupyter Notebook and R Markdown files are both examples of computational notebooks.
conditional expression
A ternary expression that serves the role of an if/else statement. For example, C and similar languages use the syntax test ? ifTrue : ifFalse to mean “choose the value ifTrue if test is true, or the value ifFalse if it is not”.
confirmation bias
The tendency to analyze information or make decisions in ways that reinforce existing beliefs.
continuation prompt
A prompt that indicates that the command currently being typed is not yet complete, and will not be run until it is.
continuous integration
A software development practice in which changes are automatically merged as soon as they become available.
current working directory
The folder or directory location in which the program operates. Any action taken by the program occurs relative to this directory.
data package
A software package that contains mostly data. Data packages are used to make data simpler to disseminate and easier to use.
default target
The build target that is used when none is specified explicitly.
default value
A value assigned to a function parameter when the caller does not specify a value. Default values are specified as part of the function’s definition.
defensive programming
A set of programming practices that assumes mistakes will happen and either reports or corrects them, such as inserting assertions to report situations that are not ever supposed to occur.
destructuring assignment
Unpacking values from data structures and assigning them to multiple variables in a single statement.
dictionary
A data structure that allows items to be looked up by a key rather than by position, sometimes called an associative array. Dictionaries are often implemented using hash tables.
docstring
Short for “documentation string”, a string appearing at the start of a module, class, or function in Python that automatically becomes that object’s documentation.
documentation generator
A software tool that extracts specially-formatted comments or docstrings from code and generates cross-referenced developer documentation.
Digital Object Identifier
A unique persistent identifier for a book, paper, report, dataset, software release, or other digital artefact. See also: ORCID
down-vote
A vote against something. See also: up-vote
entry point
Where a program or function starts executing, or the first commands in a file that run.
exception
An object that stores information about an error or other unusual event in a program. One part of a program will create and raise an exception to signal that something unexpected has happened; another part will catch it.
expected result (of test)
The value that a piece of software is supposed to produce when tested in a certain way, or the state in which it is supposed to leave the system. See also: actual result (of test)
expert
Someone who can diagnose and handle unusual situations, knows when the usual rules do not apply, and tends to recognize solutions rather than reasoning to them. See also: competent practitioner, novice
explicit relative import
In Python, an import that specifies a path relative to the current location.
exploratory programming
A software development methodology in which requirements emerge or change as the software is being written, often in response to results from early runs.
export a variable
To make a variable defined inside a shell script available outside that script.
external error
An error caused by something outside a program, such as trying to open a file that doesn’t exist.
false beginner
Someone whose previous knowledge allows them to learn (or re-learn) something more quickly. False beginners start at the same point as true beginners (i.e., a pre-test will show the same proficiency) but can move much more quickly.
Frequently Asked Questions
A curated list of questions commonly asked about a subject, along with answers.
feature request
A request to the maintainers or developers of a software program to add a specific functionality (a feature) to that program.
filename extension
The last part of a filename, usually following the ‘.’ symbol. Filename extensions are commonly used to indicate the type of content in the file, though there is no guarantee that this is correct.
filename stem
The part of the filename that does not include the extension. For example, the stem of glossary.yml is glossary.
filesystem
The part of the operating system that manages how files are stored and retrieved. Also used to refer to all of those files and directories or the specific way they are stored (as in “the Unix filesystem”).
filter
As a verb, to choose a set of records (i.e., rows of a table) based on the values they contain. As a noun, a command-line program that reads lines of text from files or standard input, performs some operation on them (such as filtering), and writes to a file or standard output.
fixture
The thing on which a test is run, such as the parameters to the function being tested or the file being processed.
flag variable
A variable that changes state exactly once to show that something has happened that needs to be dealt with later.
folder
Another term for a directory.
forge
A website that integrates version control, issue tracking, and other tools for software development.
full identifier (of a commit)
A unique 160-bit identifier for a commit in a Git repository, usually written as a 40-character hexadecimal string.
Git
A version control tool to record and manage changes to a project.
Git branch
A snapshot of a version of a Git repository. Multiple branches can capture multiple versions of the same repository.
Git clone
To copy (and usually download) a remote Git repository to a local computer, or the copy that is made.
Git conflict
A situation in which incompatible or overlapping changes have been made on different branches that are now being merged.
Git fork
To make a new copy of a Git repository on a server, or the copy that is made. See also: Git clone
Git merge
Merging branches in Git combines the development histories of two branches into one. If changes have been made to the same parts of files on both branches, a conflict will occur, and this must be resolved before the merge can be completed.
Git pull
Downloads and synchronizes changes between a remote repository and a local repository.
Git push
Uploads and synchronizes changes between a local repository and a remote repository.
Git remote
A short name for a remote repository (like a bookmark).
Git stage
To put changes in a “holding area” from which they can be committed.
governance
The process by which an organization manages itself, or the rules used to do so.
GNU Public License
A license that allows people to re-use software as long as they distribute the source of their changes.
graphical user interface
A user interface that relies on windows, menus, pointers, and other graphical elements, as opposed to a command-line interface or voice-driven interface.
hitchhiker
Someone who is part of a project but does not actually do any work on it.
home directory
A directory that contains a user’s files. Each user on a multi-user computer will have their own home directory; a personal computer will often only have one home directory.
impact/effort matrix
A tool for prioritizing work in which every task is placed according to its importance and the effort required to complete it.
implicit relative import
In Python, an import that does not specify a path (and hence may be ambiguous).
impostor syndrome
The false belief that one’s successes are a result of accident or luck rather than ability.
in-place operator
An operator that updates one of its operands. For example, the expression x += 2 uses the in-place operator += to add 2 to the current value of x and assign the result back to x.
inspectability
The degree to which a third party can figure out what was done and why. Work can be reproducible without being inspectable.
integration test
A test that checks whether the parts of a system work properly when put together. See also: unit test
internal error
An error caused by a fault in a program, such as trying to access elements beyond the end of an array.
interruption bingo
A technique for managing interruptions in meetings. Everyone’s name is placed on each row and each column of a grid; each time person A interrupts person B, a mark is added to the appropriate grid cell.
invariant
Something that must be true at all times inside a program or during the lifecycle of an object. Invariants are often expressed using assertions. If an invariant expression is not true, this indicates a problem, and may result in failure or early termination of the program.
issue
A bug report, feature request, or other to-do item associated with a project. Also called a ticket.
label (an issue)
A short textual tag associated with an issue to categorize it. Common labels include bug and feature request.
issue tracking system
Similar to a bug tracking system in that it tracks “issues” made to a repository, usually in the form of feature requests, bug reports, or some other to-do item.
JavaScript Object Notation
A way to represent data by combining basic values like numbers and character strings in lists and key/value structures. Usually abbreviated to JSON. Unlike better-defined standards like XML, it is unencumbered by a syntax for comments or ways to define a schema.
kebab case
A naming convention in which the parts of a name are separated with dashes, as in first-second-third. See also: camel case, pothole case
LaTeX
A typesetting system for document preparation that uses a specialized markup language to define a document structure (e.g., headings), stylise text, insert mathematical equations, and manage citations and cross-references. LaTeX is widely used in academia, in particular for scientific papers and theses in mathematics, physics, engineering, and computer science.
linter
A program that checks for common problems in software, such as violations of indentation rules or variable naming conventions. The name comes from the first tool of its kind, called lint.
list comprehension
In Python, an expression that creates a new list in place. For example, [2*x for x in values] creates a new list whose items are the doubles of those in values.
logging framework
A software library that manages internal reporting for programs.
logging level
A setting that controls how much information is generated by a logging framework. Typical logging levels include DEBUG, WARNING, and ERROR.
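The effect of a logging level can be sketched with Python’s standard logging module. The logger name and messages here are invented for illustration; with the level set to WARNING, the DEBUG message is discarded and the other two are recorded.

```python
import io
import logging

stream = io.StringIO()                           # capture output in memory
logger = logging.getLogger("glossary_example")   # hypothetical logger name
logger.setLevel(logging.WARNING)                 # report WARNING and above
logger.propagate = False                         # keep the root logger out of it

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

logger.debug("reading configuration")    # below WARNING, so suppressed
logger.warning("disk nearly full")       # recorded
logger.error("could not write file")     # recorded

captured = stream.getvalue()
```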
long option
A full-word identifier for a command line argument. While most common flags are a single letter preceded by a dash, such as -v, long options typically use two dashes and a readable name, such as --verbose. See also: short option
loop body
The statement or statements executed by a loop.
magic number
An unnamed numerical constant that appears in a program without explanation.
Makefile
A file containing commands for Make, often actually called Makefile.
Martha’s Rules
A simple set of rules for making decisions in small groups.
maximum likelihood estimation
To choose the parameters for a probability distribution in order to maximize the likelihood of obtaining observed data.
mental model
A simplified representation of the key elements and relationships of some problem domain that is good enough to support problem solving.
milestone
A target that a project is trying to meet, often represented as a set of issues that all have to be resolved by a certain time.
MIT License
A license that allows people to re-use software with almost no restrictions, other than keeping the copyright and license notice with copies of the software.
Nano (editor)
A very simple text editor found on most Unix systems.
non-governmental organization
An organization that is not affiliated with the government, but does the sorts of public service work that governments often do.
novice
Someone who has not yet built a usable mental model of a domain. See also: competent practitioner, expert
object-oriented programming
A style of programming in which functions and data are bound together in objects that only interact with each other through well-defined interfaces.
open license
A license that permits general re-use, such as the MIT License or GPL for software and CC-BY or CC-0 for data, prose, or other creative outputs.
open science
A generic term for making scientific software, data, and publications generally available.
operating system
A program that provides a standard interface to whatever hardware it is running on. Theoretically, any program that only interacts with the operating system should run on any computer that operating system runs on.
oppression
A form of injustice in which one social group is marginalized or deprived while another is privileged.
optional argument
An argument to a function or a command that may be omitted.
ORCID
An Open Researcher and Contributor ID that uniquely and persistently identifies an author of scholarly works. ORCIDs are for people what DOIs are for documents.
orthogonality
The ability to use various features of software in any order or combination. Orthogonal systems tend to be easier to understand, since features can be combined without worrying about unexpected interactions.
overlay configuration
A technique for configuring programs in which several layers of configuration are used, each overriding settings in the ones before.
pager
A program that displays a few lines of text at a time.
parent directory
The directory that contains another directory of interest. Going from a directory to its parent, then its parent, and so on eventually leads to the root directory of the filesystem. See also: subdirectory
patch
A single file containing a set of changes to a set of files, separated by markers that indicate where each individual change should be applied.
path (in filesystem)
A string that specifies a location in a filesystem. In Unix, the directories in a path are joined using /. See also: absolute path, relative path
path coverage
The fraction of possible execution paths in a piece of software that have been executed by tests. Software can have complete code coverage without having complete path coverage.
pattern rule
A generic build rule that describes how to update any file whose name matches a pattern. Pattern rules often use automatic variables to represent the actual filenames.
phony target
A build target that does not correspond to an actual file. Phony targets are often used to store commonly-used commands in a Makefile.
pipe (in the Unix shell)
The | used to make the output of one command the input of the next.
positional argument
An argument to a function that gets its value according to its place in the function’s definition, as opposed to a named argument that is explicitly matched by name.
postcondition
Something that is guaranteed to be true after a piece of software finishes executing. See also: invariant, precondition
pothole case
A naming style that separates the parts of a name with underscores, as in first_second_third. See also: camel case, kebab case
power law
A mathematical relationship in which one quantity changes in proportion to another quantity raised to a constant power.
precondition
Something that must be true before a piece of software runs in order for that software to run correctly. See also: invariant, postcondition
prerequisite
Something that a build target depends on.
privilege
An unearned advantage, typically as a result of belonging to a dominant social class or group.
pseudo-random number generator
A function that can generate pseudo-random numbers. See also: seed
procedural programming
A style of programming in which functions operate on data that is passed into them. The term is used in contrast to other programming styles, such as object-oriented programming and functional programming.
process
An operating system’s representation of a running program. A process typically has some memory, the identity of the user who is running it, and a set of connections to open files.
product manager
The person responsible for defining what features a product should have.
project manager
The person responsible for ensuring that a project moves forward.
prompt
The text printed by a REPL or shell that indicates it is ready to accept another command. The default prompt in the Unix shell is usually $, while in Python it is >>>, and in R it is >. See also: continuation prompt
provenance
A record of where data originally came from and what was done to process it.
pull request
A request to merge a new feature or correction created on a user’s fork of a Git repository into the upstream repository. The developer will be notified of the change, review it, make or suggest changes, and potentially merge it.
raise (an exception)
To signal that something unexpected or unusual has happened in a program by creating an exception and handing it to the error-handling system, which then tries to find a point in the program that will catch it.
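A small Python sketch of raising and catching: the function names parse_age and safe_parse_age are hypothetical, chosen only to show one function signalling a problem and another deciding how to handle it.

```python
def parse_age(text):
    """Convert text to a non-negative integer age (illustrative helper)."""
    value = int(text)   # int() itself raises ValueError for non-numeric text
    if value < 0:
        # Raise: signal that something unexpected has happened.
        raise ValueError(f"age cannot be negative: {value}")
    return value


def safe_parse_age(text, default=None):
    """Return the parsed age, or default if the text is not a valid age."""
    try:
        return parse_age(text)
    except ValueError:
        # Catch: accept responsibility for handling the error.
        return default
```

The caller of safe_parse_age never sees the exception; the caller of parse_age must be prepared to catch it.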
raster image
An image stored as a matrix of pixels.
recursion
Calling a function from within a call to that function, or defining a term using a simpler version of the same term.
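The factorial function is a common illustration of recursion. In this sketch, the function calls itself on a smaller input until it reaches a base case.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return 1                      # base case: stops the recursion
    return n * factorial(n - 1)       # recursive case: a simpler version of the same problem
```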
redirection
To send a request for a web page or web service to a different page or service.
refactoring
Reorganizing software without changing its behavior.
regression testing
Testing software to ensure that things which used to work have not been broken.
regular expression
A pattern for matching text, written as text itself. Regular expressions are sometimes called “regexp”, “regex”, or “RE”, and are powerful tools for working with text.
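For instance, the following sketch uses Python’s re module to find dates written as four digits, two digits, and two digits separated by dashes. The pattern and the helper name find_dates are illustrative only, and the pattern does not check that the month or day is valid.

```python
import re

# Four digits, a dash, two digits, a dash, two digits.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_dates(text):
    """Return every substring of text that looks like YYYY-MM-DD."""
    return DATE_PATTERN.findall(text)
```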
relative error
The absolute value of the difference between the actual and correct value divided by the correct value. For example, if the actual value is 9 and the correct value is 10, the relative error is 0.1. Relative error is usually more useful than absolute error.
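The two kinds of error can be computed as follows. This is a sketch with hypothetical function names; it takes the absolute value of the correct value in the denominator so that the result is never negative, and rejects a correct value of zero because relative error is undefined there.

```python
def absolute_error(actual, correct):
    """Absolute value of the difference between actual and correct."""
    return abs(actual - correct)


def relative_error(actual, correct):
    """Absolute error divided by the magnitude of the correct value."""
    if correct == 0:
        raise ValueError("relative error is undefined when the correct value is zero")
    return abs(actual - correct) / abs(correct)
```

With an actual value of 9 and a correct value of 10, absolute_error gives 1 and relative_error gives 0.1, matching the example in the definition.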
relative path
A path whose destination is interpreted relative to some other location, such as the current working directory. A relative path is the equivalent of giving directions using terms like “straight” and “left”. See also: absolute path
remote login
Starting an interactive session on one computer from another computer, e.g., by using SSH.
remote repository
A repository located on another computer. Tools such as Git are designed to synchronize changes between local and remote repositories in order to share work.
REPL (read-eval-print loop)
An interactive program that reads a command typed in by a user, executes it, prints the result, and then waits patiently for the next command. REPLs are often used to explore new ideas, or for debugging.
repository
A place where a version control system stores the files that make up a project and the metadata that describes their history. See also: Git
reprex
A reproducible example. When asking questions about coding problems online or filing issues on GitHub, you should always include a reprex so others can reproduce your problem and help. The reprex package can help!
reproducible research
The practice of describing and documenting research results in such a way that another researcher or person can re-run the analysis code on the same data to obtain the same result.
reStructured Text
A plaintext markup format used primarily in Python documentation.
revision
See commit.
root directory
The directory that contains everything else, either directly or indirectly. The root directory is written / (a bare forward slash).
rotating file
A set of files used to store recent information. For example, there might be one file with results for each day of the week, so that results from last Tuesday are overwritten this Tuesday.
research software engineer
Someone whose primary responsibility is to build the specialized software that other researchers depend on.
script
Originally, a program written in a language too user-friendly for “real” programmers to take seriously; the term is now synonymous with program.
search path
The list of directories that a program searches to find something. For example, the Unix shell uses the search path stored in the PATH variable when trying to find a program whose name it has been given.
seed
A value used to initialize a pseudo-random number generator.
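As a sketch using Python’s random module (the function name draw is hypothetical), two generators initialized with the same seed produce the same sequence of values:

```python
import random


def draw(seed, count=3):
    """Return count pseudo-random integers from a generator initialized with seed."""
    generator = random.Random(seed)   # a private generator, so global state is untouched
    return [generator.randint(1, 100) for _ in range(count)]


first = draw(42)
second = draw(42)   # same seed, so the same values in the same order
```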
semantic versioning
A standard for identifying software releases. In the version identifier major.minor.patch, major changes when a new version of software is incompatible with old versions, minor changes when new features are added to an existing version, and patch changes when small bugs are fixed.
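One way to work with such identifiers in Python is to parse them into tuples of integers, which then compare in the right order. The function names are hypothetical, and the compatibility rule shown (same major version, and at least the required minor and patch) is an assumption for illustration rather than part of the definition above. Pre-release suffixes are not handled.

```python
def parse_version(text):
    """Turn 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def is_compatible(installed, required):
    """Assumed rule: same major version, and installed is not older than required."""
    have = parse_version(installed)
    need = parse_version(required)
    return have[0] == need[0] and have >= need
```

Comparing tuples rather than strings matters: as strings, "1.10.0" sorts before "1.9.3", but as tuples (1, 10, 0) is correctly greater than (1, 9, 3).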
sense vote
A preliminary vote used to determine whether further discussion is needed in a meeting. See also: Martha’s Rules
shebang
In Unix, a character sequence such as #!/usr/bin/python in the first line of an executable file that tells the shell what program to use to run that file.
shell
A command-line interface that allows a user to interact with the operating system, such as Bash (for Unix and macOS) or PowerShell (for Windows).
shell script
A set of commands for the shell stored in a file so that they can be re-executed. A shell script is effectively a program.
shell variable
A variable set and used in the Unix shell. Commonly-used shell variables include HOME (the user’s home directory) and PATH (their search path).
short circuit test
A logical test that only evaluates as many arguments as it needs to. For example, if A is false, then most languages never evaluate B in the expression A and B.
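This behavior can be observed in Python by recording which operands are actually evaluated. The helper check and the list calls exist only for this illustration.

```python
calls = []


def check(name, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return result


# A is false, so Python never evaluates B.
outcome = check("A", False) and check("B", True)
```

After this runs, calls contains only "A".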
short identifier (of commit)
The first few characters of a full identifier. Short identifiers are easy for people to type and say aloud, and are usually unique within a repository's recent history.
short option
A single-letter identifier for a command line argument. Most common flags are a single letter preceded by a dash, such as -v. See also: long option
snake case
See pothole case.
software distribution
A set of programs that are built, tested, and distributed as a collection so that they can run together.
source distribution
A software distribution that includes the source code, typically so that programs can be recompiled on the target computer when they are installed.
sprint
A short, intense period of work on a project.
ssh daemon
A remote login server that handles SSH connections.
SSH key
A string of random bits stored in a file that is used to identify a user for SSH. Each SSH key has separate public and private parts; the public part can safely be shared, but if the private part becomes known, the key is compromised.
SSH protocol
A formal standard for exchanging encrypted messages between computers and for managing remote logins.
stack frame
A section of the call stack that records details of a single call to a specific function.
standard error
A predefined communication channel for a process, typically used for error messages. See also: standard input, standard output
standard input
A predefined communication channel for a process, typically used to read input from the keyboard or from the previous process in a pipe. See also: standard error, standard output
standard output
A predefined communication channel for a process, typically used to send output to the screen or to the next process in a pipe. See also: standard error, standard input
stop word
Common words that are filtered out of text before processing it, such as “the” and “an”.
subcommand
A command that is part of a larger family of commands. For example, git commit is a subcommand of Git.
subdirectory
A directory that is below another directory. See also: parent directory
sustainable software
Software that its users can afford to keep up to date. Sustainability depends on the quality of the software, the skills of the potential maintainers, and how much the community is willing to invest.
tag (in version control)
A readable label attached to a specific commit so that it can easily be referred to later.
test-driven development
A programming practice in which tests are written before a new feature is added or a bug is fixed in order to clarify the goal.
ternary expression
An expression that has three parts. Conditional expressions are the only ternary expressions in most languages.
test framework
See test runner.
test runner
A program that finds and runs software tests and reports their results.
three stickies
A technique for ensuring that everyone in a meeting gets a chance to speak. Everyone is given three sticky notes (or other tokens). Each time someone speaks, it costs them a sticky; when they are out of stickies they cannot speak until everyone has used at least one, at which point everyone gets all of their stickies back.
ticket
See issue.
ticketing system
See issue tracking system.
timestamp
A digital identifier showing the time at which something was created or accessed. Timestamps should use the ISO 8601 date format for portability.
tolerance
How closely the actual result of a test must agree with the expected result in order for the test to pass. Tolerances are usually expressed in terms of relative error.
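In Python, a relative tolerance can be applied with math.isclose from the standard library. The wrapper name passes and the default tolerance of 1e-6 are assumptions made for this sketch.

```python
import math


def passes(actual, expected, rel_tol=1e-6):
    """Return True if actual agrees with expected to within rel_tol."""
    return math.isclose(actual, expected, rel_tol=rel_tol)


exact_match = (0.1 + 0.2 == 0.3)       # False because of floating-point rounding
close_enough = passes(0.1 + 0.2, 0.3)  # True within the default tolerance
```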
traceback
In Python, an object that records where an exception was raised, what stack frames were on the call stack, and other details.
transitive dependency
If A depends on B and B depends on C, C is a transitive dependency of A.
triage
To go through the issues associated with a project and decide which are currently priorities. Triage is one of the key responsibilities of a project manager.
tuple
A data type that has a fixed number of parts, such as the three color components of a red-green-blue color specification. Tuples are immutable (their values cannot be changed after creation).
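A short Python sketch, with a made-up color value, shows both the fixed parts of a tuple and what happens when code tries to change one of them:

```python
color = (255, 128, 0)        # red, green, blue components

red, green, blue = color     # unpack the three parts into separate variables

try:
    color[0] = 0             # tuples do not support item assignment
    mutated = True
except TypeError:
    mutated = False          # the tuple is unchanged
```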
unit test
A test that exercises one function or feature of a piece of software and produces pass, fail, or error. See also: integration test
up-vote
A vote in favor of something. See also: down-vote
update operator
See in-place operator.
validation
Checking that a piece of software does what its users want, i.e., “are we building the right thing?” See also: verification
variable arguments
In a function, the ability to take any number of arguments. R uses ... to capture the “extra” arguments. Python uses *args and **kwargs to capture unnamed, and named, “extra” arguments, respectively.
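In Python this might look like the following sketch, where describe is a hypothetical function that simply reports what it received: unnamed extras arrive as a tuple and named extras as a dictionary.

```python
def describe(*args, **kwargs):
    """Return the extra unnamed and named arguments that were passed in."""
    return {"positional": args, "named": kwargs}


result = describe(1, 2, sep=",", end="")
```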
verification
Checking that a piece of software works as intended, i.e., “did we build the thing right?” See also: validation
version control system
A system for managing changes made to software during its development. See also: Git
virtual environment
In Python, the virtualenv package allows you to create virtual, disposable, Python software environments containing only the packages and versions of packages you want to use for a particular project or task, and to install new packages into the environment without affecting other virtual environments, or the system-wide default environment.
wildcard
A character expression that can match text, such as the * in *.csv (which matches any filename whose name ends with .csv).