3. System dependencies
In this section, you will learn:
- what a system dependency is
- how system dependencies relate to R and Python packages
- strategies for managing system dependencies
What is a system dependency?
For this course, a system dependency is anything that cannot be installed via R or Python. For example:
# R package, not a system dependency
R -e "install.packages('dplyr')"
# Python package, not a system dependency
pip install pandas
# system dependency
sudo apt-get install gdebi-core
Package Manager only serves R and Python packages. It cannot serve system dependencies. However, some R and Python packages also require system dependencies to work.
There are two approaches to solving this:
- Bundling the system dependency with the R/Python package (preferred approach).
- Installing the system dependency manually on the required machine.
As the administrator of Package Manager, you do not have control over how third-party open-source developers bundle their packages. However, Package Manager does its best to help you.
R
Package Manager can build Linux binaries for most R packages. This is one of the major benefits of using Package Manager instead of CRAN directly, because CRAN does not build binaries for Linux distributions.
The binaries built by Package Manager will include the system dependencies bundled inside of the package. This will reduce the burden of maintaining system dependencies because most R packages will not require any system dependencies to run.
For some packages, it is not possible to build Linux binaries, and additional system dependencies will need to be installed on the computer/server that is installing the package. Read the CRAN Binary Availability section of the Admin Guide for more information.
Run the code snippet below to get a list of all R packages that do not have a binary available.
library(dplyr)
library(glue)

# Example of OS versions.
os_versions <- list(
  ubuntu = list(
    bionic_1804 = "bionic",
    focal_2004 = "focal",
    jammy_2204 = "jammy"
  ),
  redhat = list(
    rhel7 = "centos7",
    rhel8 = "centos8",
    rhel9 = "centos9"
  ),
  opensuse = list(
    opensuse153 = "opensuse153",
    opensuse154 = "opensuse154"
  )
)

# Select an OS version and major/minor R version.
selected_os_version <- os_versions$ubuntu$bionic_1804
r_version <- "4.1"

# Create URLs.
repo_url <- glue("https://packagemanager.posit.com/cran/__linux__/{selected_os_version}/latest")
contrib_url <- glue("{repo_url}/bin/linux/{r_version}-{selected_os_version}/contrib/{r_version}")
options(repos = repo_url)

# Get all available packages (source index of the repository).
all_packages <- available.packages() |>
  as_tibble()

# Get all available packages with a binary.
# filters = list() disables the default filtering so every binary is listed.
binary_packages <- available.packages(contriburl = contrib_url, filters = list()) |>
  as_tibble()

# Find all packages without a binary.
no_binary_packages <- anti_join(
  all_packages,
  binary_packages,
  by = "Package"
)

glimpse(no_binary_packages)
# Rows: 995
# Columns: 17
# $ Package <chr> "ACNE", "ADAPTS", "AEenrich", "ASIP", "ActiveDriverWGS", "AcuityView", "Anaconda",…
# $ Version <chr> "0.8.1", "1.0.22", "1.1.0", "0.4.9", "1.2.0", "0.1", "0.1.5", "0.1.7", "1.0.4", "0…
# $ Priority <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ Depends <chr> "R (>= 3.0.0), aroma.affymetrix (>= 2.14.0)", "R (>= 3.3.0)", "R (>= 3.5.0)", "R (…
# $ Imports <chr> "MASS, R.methodsS3 (>= 1.7.0), R.oo (>= 1.19.0), R.utils (>= 2.1.0), matrixStats (…
# $ LinkingTo <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Rcpp,…
# $ Suggests <chr> "DNAcopy", "R.rsp, DeconRNASeq, WGCNA", "testthat", NA, "knitr, testthat, rmarkdow…
# $ Enhances <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ License <chr> "LGPL (>= 2.1)", "MIT + file LICENSE", "GPL-2", "GPL-3", "GPL-3", "GPL (>= 2)", "G…
# $ License_is_FOSS <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ License_restricts_use <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ OS_type <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ Archs <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ MD5sum <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ NeedsCompilation <chr> "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no", "no"…
# $ File <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
# $ Repository <chr> "https://packagemanager.posit.com/cran/__linux__/focal/latest/src/contrib", "htt…
Administrators can configure R to install the binary version of a package whenever one is available. Posit recommends doing this because:
- Packages will install more quickly.
- System dependencies will be included in the binary file.
This makes it easier for users to get their work done and for administrators to maintain their servers. Refer to the following instructions for how to configure R:
Python
Python packages are most commonly distributed in a binary “wheel” format (read the details in PEP 427). Like the R binary packages described above, Python packages distributed as a “wheel” are quick to install because there is no build step. Packages distributed without a “wheel” may have a build step and take longer to install.
Package Manager does not build Python “wheels” the way it builds R binary packages. This is because PyPI already serves “wheels” that work on Linux distributions.
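Whether a given wheel can be installed without a build step depends on the tags encoded in its filename. The sketch below follows the PEP 427 naming convention to split a wheel filename into its fields and to report whether the wheel is pure Python or tied to a platform. The names parse_wheel_filename and WheelInfo are hypothetical helpers written for this illustration, and the filenames are examples only; none of this is part of Package Manager or pip.

```python
from typing import NamedTuple, Optional


class WheelInfo(NamedTuple):
    """Fields of a wheel filename as laid out in PEP 427 (hypothetical helper)."""

    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str

    @property
    def is_pure_python(self) -> bool:
        # Pure-Python wheels use the ABI tag "none" and the platform tag "any".
        return self.abi_tag == "none" and self.platform_tag == "any"


def parse_wheel_filename(filename: str) -> WheelInfo:
    """Split '{dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl'."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelInfo(dist, version, build, py, abi, plat)


if __name__ == "__main__":
    # Example filenames, chosen only for illustration.
    for name in (
        "pandas-2.2.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "requests-2.31.0-py3-none-any.whl",
    ):
        info = parse_wheel_filename(name)
        kind = "pure Python" if info.is_pure_python else "platform specific"
        print(f"{info.distribution} {info.version}: {kind} ({info.platform_tag})")
```

A platform-specific wheel only installs without a build step when its platform tag matches the machine running pip. Otherwise pip falls back to the source distribution, which may need a compiler and system libraries.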
What to do when a binary package is not available
It is impossible for some R and Python packages to include all system dependencies in a binary. This may be because:
- An R package depends on a Bioconductor package. Package Manager does not currently support binary packages from Bioconductor. Notable packages include Seurat and WGCNA.
- An R package depends on an uncatalogued system dependency. Refer to Missing System Dependencies for more information.
- An R package depends on a newer compiler version or system library than is available by default on the distribution being used.
- A Python package was built and uploaded to PyPI using old build tools.
- A Python or R package relies on a system dependency that cannot be bundled into a binary.
In these cases, you should install the system dependency directly onto the server that end users work on (e.g. the Workbench and/or Connect servers). For example, the R package pdftools can be challenging to install because it relies on the system dependency libpoppler-cpp-dev.
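Before installing such a package, it can help to check whether the shared libraries it needs are already present on the server. The following is a minimal sketch using find_library from Python's standard ctypes.util module. The helper name missing_libraries is hypothetical, and the library name "poppler-cpp" is an assumption about the runtime library that accompanies libpoppler-cpp-dev. The check only looks for the shared library, not the development headers that a source build also needs.

```python
from ctypes.util import find_library
from typing import Callable, Iterable, List, Optional


def missing_libraries(
    names: Iterable[str],
    finder: Callable[[str], Optional[str]] = find_library,
) -> List[str]:
    """Return the library names that the finder could not locate.

    `finder` defaults to ctypes.util.find_library, which returns None
    when a library cannot be found on the current system.
    """
    return [name for name in names if finder(name) is None]


if __name__ == "__main__":
    # "poppler-cpp" is an assumed library name, used for illustration only.
    missing = missing_libraries(["poppler-cpp"])
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All libraries found.")
```

If a library is reported as not found, install the corresponding system package with the distribution's package manager (for example with apt-get, as shown at the start of this section) before installing the R or Python package.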
Exercise
🚀 Launch the exercise environment!
In the exercise environment, you will get experience:
- installing system dependencies
- distributing R and Python binary files to end users