Architecture for RStudio on SageMaker

A Note on Nomenclature

This article uses several terms for the different elements of “RStudio on SageMaker.” For clarity:

RStudio or RStudio IDE refers to the Integrated Development Environment (IDE) for code development.

Posit Workbench is the centralized, multi-user, professionally-licensed application that launches IDEs.

RStudio on SageMaker is the implementation of Posit Workbench within a SageMaker Domain in which RStudio IDE sessions are made available to SageMaker users.

You may see references to RStudio Workbench or RStudio Server Pro in AWS documentation. These are previous names for Posit Workbench, and are being phased out as product and company rebranding updates are made in AWS documentation.

Architecture Overview

Amazon SageMaker provides a managed environment for machine learning in which Amazon handles architecture and infrastructure management. RStudio on SageMaker brings the RStudio IDE to that managed environment. Its architecture differs from that of a Posit Workbench implementation installed on your own, self-managed infrastructure. See Differences from Workbench on Self-Managed Infrastructure below for the specific differences.

The diagram below provides an overview of the SageMaker Domain and the Posit Workbench implementation within it. Note that Posit Workbench within SageMaker launches only RStudio sessions; SageMaker Studio is responsible for launching Jupyter sessions.

flowchart LR

    subgraph sagemakerDomain["AWS SageMaker Domain"]

        direction LR
            
        efs[["\n\nAWS Elastic File System (EFS)\n\n\n"]]
        
        subgraph sagemakerStudio["SageMaker Studio"]
            sagemakerStudioNote("SageMaker Studio architecture\n not expanded in this diagram.\n Jupyter sessions and other SageMaker\n tools run via SageMaker Studio.")
        end
        
        subgraph rstudioOnSagemaker["RStudio on SageMaker"]

            subgraph workbenchEC2["EC2 (Posit Workbench)"]
                workbench("Posit Workbench")
                launcher("Launcher")
                workbench---launcher
            end


            
            subgraph ec2SpecA ["EC2 (e.g. T5 Large)"]
                rstudioIdeSession1("RStudio IDE Session #1")
                rstudioIdeSession2("RStudio IDE Session #2")
            end


            subgraph ec2SpecB ["EC2 (e.g. T3 Medium)"]
                rstudioIdeSession3("RStudio IDE Session #3")
            end
        end

        efs-.-sagemakerStudio
        efs-.-workbenchEC2
        efs-.-ec2SpecA
        efs-.-ec2SpecB
        launcher---rstudioIdeSession1
        launcher---rstudioIdeSession2
        launcher---rstudioIdeSession3

    end

    classDef ec2Class fill:#c6c7cc
    classDef server fill:#FAEEE9,stroke:#ab4d26
    classDef blackbox fill:#C2C2C4,stroke:#213D4F 
    classDef product fill:#447099,stroke:#213D4F,color:#F2F2F2
    classDef session fill:#7494B1,color:#F2F2F2,stroke:#213D4F
    classDef note fill:#C2C2C4,stroke-width:0px
    classDef efs fill:#72994E,stroke:#1F4F4F
    
    class workbenchEC2,ec2SpecA,ec2SpecB server
    class sagemakerStudio,rstudioOnSagemaker blackbox
    class workbench,launcher product
    class rstudioIdeSession1,rstudioIdeSession2,rstudioIdeSession3 session
    class sagemakerStudioNote note
    class efs efs
    
    style sagemakerDomain fill:#f6f6f7,stroke-dasharray: 5 5

SageMaker Architecture Key Components

AWS SageMaker Domain

RStudio is installed within a SageMaker Domain. The Posit Workbench integration document Enable RStudio on AWS SageMaker describes how to enable RStudio.

AWS Elastic File System (EFS)

An EFS volume is automatically created in the SageMaker Domain the first time a user onboards to Amazon SageMaker. Each user has a home directory on the EFS volume, and that home directory persists across sessions. Users can access their home directory from any RStudio or SageMaker Studio session. See Manage Your Amazon EFS Storage Volume in SageMaker Studio for more details.

Posit Workbench and Session EC2s

Posit Workbench runs on a persistent EC2 instance. Workbench uses the Launcher to start RStudio sessions on on-demand EC2 instances. Each session is launched with either a default container image or a custom image that an administrator has attached to the SageMaker Domain. The user selects the instance type. Because the EFS-backed user home directory is accessible to every session, work started on one instance type can be resumed on a different instance type if the user's resource needs change.
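
The relationship between sessions, instance types, and the shared home directory can be illustrated with a small toy model. This is only a conceptual sketch, not how Workbench or SageMaker is implemented: the Session and Domain classes, the launch_session method, the user name, and the instance type names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """A toy RStudio session; not a real Workbench or SageMaker object."""
    user: str
    instance_type: str
    home: dict  # shared reference to the user's EFS-backed home directory


@dataclass
class Domain:
    """Toy stand-in for a SageMaker Domain with one EFS volume."""
    efs: dict = field(default_factory=dict)  # user -> {file name: contents}

    def launch_session(self, user: str, instance_type: str) -> Session:
        # Every session for a user sees the same home directory,
        # whichever instance type it runs on.
        home = self.efs.setdefault(user, {})
        return Session(user, instance_type, home)


domain = Domain()

# Start work on a small instance (instance type names are illustrative).
small = domain.launch_session("alice", "ml.t3.medium")
small.home["analysis.R"] = "fit <- lm(y ~ x, data = df)"

# Resume on a larger instance: the file is already in the home directory.
large = domain.launch_session("alice", "ml.m5.large")
print(large.home["analysis.R"])
```

In this model the home directory belongs to the Domain rather than to any one session, which is why the second session finds the file written by the first.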

Screenshot of Posit Workbench in SageMaker


SageMaker Studio and Session EC2s

An EC2 instance runs the SageMaker Studio software. SageMaker Studio can launch new SageMaker Studio sessions, including Jupyter sessions, on additional EC2 instances.

SageMaker Studio running in SageMaker


Differences from Workbench on Self-Managed Infrastructure

RStudio on SageMaker differs in several notable ways from a Posit Workbench implementation installed on your own, self-managed infrastructure.

Specifically:

  • Amazon manages the infrastructure, configuration files, default container image, version of Workbench, and versions of R available.

  • RStudio on SageMaker launches only RStudio IDE sessions; the other IDEs supported by Workbench (i.e., JupyterLab, Jupyter Notebook, and VS Code) are not enabled.

  • Project Sharing in RStudio is not currently supported by RStudio on SageMaker.

  • Workbench Jobs are not currently supported by RStudio on SageMaker.

  • There is currently no direct means of mounting external file systems to the SageMaker Domain.
