R and RStudio

Published

September 28, 2024

Modified

October 24, 2024

R is a programming language for statistical computing and graphics. It is based on S and is a GNU Project. It has many built-in functions that make computing statistics easy, and it also lets users write and run their own functions.
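As a minimal sketch of both ideas, the example below calls two built-in functions and then defines a small function of its own. The vector `scores` and the function name `standard_error` are illustrative assumptions, not part of the original text.

```r
# Illustrative data (hypothetical values)
scores <- c(4, 8, 15, 16, 23, 42)

# Built-in functions for common statistics
mean(scores)
sd(scores)

# A user-defined function: standard error of the mean
standard_error <- function(x) {
  sd(x) / sqrt(length(x))
}

standard_error(scores)
```

Once defined, `standard_error()` can be called on any numeric vector in the same way as a built-in function.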

Installing R

R can be installed from the R-Project with these general instructions:

  1. Go to R’s website
  2. Select download R in the first paragraph
  3. Choose 0-Cloud to be redirected automatically to the closest mirror.
  4. Select and follow the instructions for the respective operating system

RStudio

RStudio is an integrated development environment (IDE) that allows users to write code in a script and send it to the R console for execution, all in one application. Furthermore, RStudio has a rich set of features that make data science projects easier to carry out.

While RStudio was originally built for programming in R, it has been extended to support Python through the reticulate package.

Installation

You can download and install the open-source (free) version of RStudio here.

Start-up

On start-up, RStudio will look very similar to the image below:

An image of the RStudio IDE. There are 3 panes displayed with the console, environment, and plots showing. Majority of the panes are empty except for the console which contains the R startup message.

You can see that there are 3 parts in RStudio; these are known as panes.

Additionally, we can add a fourth pane to RStudio for writing code in a text file. To do so, choose the white plus sign with a green border followed by a white document on the upper-right hand side:

A zoomed portion of RStudio showing add buttons for new files or RStudio Projects.

This will open a menu of file types that a user can choose to code in:

The drop-down menu when the add new file is chosen. The different types of files are displayed that can be chosen.

The “R Script” button will open a standard R text file with the extension “.R”. This is the text file that most R programmers use to save and execute code. RStudio will then look like this:

An image of the RStudio IDE. There are 4 panes displayed with the script (source), console, environment, and plots showing. Majority of the panes are empty except for the console which contains the R startup message.

Notice that a new pane is created on the top-left that allows you to write R code in a script. This script is also connected to the R console below, so you can send lines of code from the script to the console to be executed. The console itself works as a REPL (read-eval-print loop).

Global Options

This section lists some recommended “Global Options” for users to set in RStudio. To begin, click on Tools > Global Options from the top menu. The following window should open:

The global options of RStudio are displayed. The "R General" options are displayed.

The window allows you to make several changes in RStudio that will make your experience better. Here is a list of items that are recommended for users to change:

  1. R General
    1. Make sure “Restore .RData into workspace at startup:” is unchecked (Highly Recommended; see footnote 1)
    2. Set “Save workspace to .RData on exit:” to “Never” (Highly Recommended)

The global options of RStudio are displayed. The "Code" options are displayed.

  1. Code
    1. “Use native pipe operator |>” is recommended (Optional; see footnote 2)
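To illustrate what the native pipe does, here is a minimal sketch that writes the same computation twice, once as nested calls and once with `|>`. It assumes R 4.1.0 or later, and the object names `nested` and `piped` are chosen only for illustration.

```r
# Assumes R 4.1.0 or later, where the native pipe |> is available.

# Nested calls: read from the inside out
nested <- round(mean(mtcars$mpg), 1)

# The same computation with the native pipe: read from left to right
piped <- mtcars$mpg |> mean() |> round(1)

# Both approaches give the same result
identical(nested, piped)
```

The pipe passes the value on its left as the first argument of the function call on its right, which tends to make longer chains of steps easier to read.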

The global options of RStudio are displayed. The "Appearance" options are displayed with the code colors changed.

  1. Appearance
    1. In the “Editor theme:” box, choose a theme that you prefer to work in (Optional)

The global options of RStudio are displayed. The "Pane Layout" options are displayed to control how the panes are organized in RStudio.

  1. Pane Layout (Optional)
    1. Change the pane layout to have the “Console” on the top-right corner
    2. Add all components (checkmark) to the lower-right corner except for “History” and “Connections”

This will allow you to expand the “Source” (script) pane across the entire left-hand side, so you can view more code at one time.

RStudio will look more like this:

The RStudio IDE is displayed with 4 panes: source, history, console, and environment.

With the expanded script:

The RStudio IDE is displayed with 3 panes: source, console, and environment. The source pane is expanded to cover half the image.

Source, Console and Plots

The source pane allows you to write an R script for analysis. Below, x <- mtcars is written in the script (top-left) and executed in the R console (top-right). Afterwards, the “Environment” tab in the lower-right pane shows x. The “Environment” tab displays which R objects have been created and are available for further analysis.

The RStudio IDE is displayed with 3 panes: source, console, and environment. The source pane is expanded to cover half the image. The source pane contains written code, the console shows executed code, and environment shows objects created.

Since x is a data frame, clicking on x from the “Environment” tab will open a new tab in the Source pane containing the data set:

The RStudio IDE is displayed with 3 panes: source, console, and environment. The source pane is expanded to cover half the image. The source pane displays a data set.

If we create an object that is a vector (y <- 4, as pictured below), the “Environment” tab now shows the new object as a value.

The RStudio IDE is displayed with 3 panes: source, console, and environment. The source pane is expanded to cover half the image. The source pane contains written code, the console shows executed code, and environment shows objects created.
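What the “Environment” tab shows can also be checked from the console. The following is a minimal sketch using the same objects, `x` and `y`, as above; `ls()`, `class()`, `is.vector()`, and `str()` are base R functions.

```r
x <- mtcars   # a data frame that ships with R
y <- 4        # a numeric vector of length 1

ls()          # names of the objects in the current environment
class(x)      # "data.frame"
is.vector(y)  # TRUE: a single number is a vector of length 1
str(y)        # compact summary of the object
```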

If a plot is created (plot(mtcars$mpg)), a plot will be displayed in the “Plots” tab in the lower-right pane.

The RStudio IDE is displayed with 3 panes: source, console, and plot. The source pane is expanded to cover half the image. The source pane contains written code, the console shows executed code, and the plot pane shows a dot plot of mpg from the mtcars data set.

The lower-right pane also contains other useful features, such as access to your computer’s file directory:
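The plotting call from this step can be run as written below. The second call is a sketch that adds axis labels and a title; the label text is chosen for illustration and is not part of the original example.

```r
# Index plot: each mpg value is drawn against its position in the vector
plot(mtcars$mpg)

# The same plot with labels added (label text is illustrative)
plot(
  mtcars$mpg,
  xlab = "Row index",
  ylab = "Miles per gallon",
  main = "mtcars: mpg by row"
)
```

In RStudio, each new plot replaces the previous one in the “Plots” tab, and the arrow buttons in that tab let you move back and forth between them.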

The RStudio IDE is displayed with 3 panes: source, console, and files on computer. The source pane is expanded to cover half the image.

Access to installed packages:

The RStudio IDE is displayed with 3 panes: source, console, and R packages installed on computer. The source pane is expanded to cover half the image.

And access to help documentation:

The RStudio IDE is displayed with 3 panes: source, console, and help documentation on the mean function. The source pane is expanded to cover half the image.
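Help pages can also be opened from the console instead of the “Help” tab. A brief sketch using base R functions, with `mean` as the topic to match the image above and "median" as an arbitrary search keyword:

```r
?mean                  # shorthand for help("mean")
help("mean")           # the same page, written as a function call
example("mean")        # run the examples from the help page
help.search("median")  # search installed documentation by keyword
```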

R Packages

R’s functionality can be extended by installing R packages. An R package can be thought of as extra software that lets you do more with R. To install an R package, use the install.packages("NAME_OF_PACKAGE") function. Once a package is installed, you do not need to install it again. To use an R package, load it with library("NAME_OF_PACKAGE"). You will need to load the package every time you start R. For more information, please watch the video:

install Packages from RStudio, Inc. on Vimeo.
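The install-once, load-every-session pattern can be sketched as follows. The package name "stats" is used only so the example runs without a network connection, since it ships with R; replace it with the package you need. The variable name `pkg` is an illustrative assumption.

```r
# "stats" ships with R, so nothing is downloaded here.
# Replace it with the name of the package you want to use.
pkg <- "stats"

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package; this step is needed in every new R session
library(pkg, character.only = TRUE)
```

The `character.only = TRUE` argument tells `library()` to treat `pkg` as a variable holding the package name rather than as the name itself.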

Footnotes

  1. This will ensure that your environment is always recreated from the code you write and not from anything else. It increases reproducibility.

  2. The native pipe does not require any packages to be installed. Additionally, it executes code slightly faster than %>%.