My second week started with a mission to explore the R software and RStudio. I have never done this before, and I am not strong in statistics, so I must admit the week did not go well: the learning curve was steep for me. As mentioned at the beginning of this class, I am trying to learn R and SmartPLS for data analysis purposes.
I have managed to get through the first stage of SmartPLS, which is understanding the user interface, and I thought the same should be done for R as well. As part of my learning process, I joined an online workshop titled R for Data Visualization. To help us understand the R platform and how it is used, the facilitator shared some helpful resources for using R and ggplot (the Data Visualization with ggplot Cheat Sheet and the R Graph Gallery).
ggplot is the tool used for visualizing data, while gapminder is the package that supplies the dataset being visualized.
- First, I downloaded R from https://www.r-project.org/
- Then I downloaded RStudio from https://posit.co/downloads/ (we were instructed during the workshop to use the Free/Open Source version of RStudio Desktop, not the Pro version)
- Please note that you will be asked to select a CRAN mirror during the download and installation process. You can choose one of these three Canadian CRAN mirrors:
  - https://mirror.rcg.sfu.ca/mirror/CRAN
  - https://muug.ca/mirror/cran/
  - https://mirror.csclub.uwaterloo.ca/CRAN/
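Once R and RStudio are installed, a first session with the two packages from the workshop might look roughly like the sketch below. This is only an illustration, not the workshop's own script: the choice of the Waterloo mirror, the variable name p, and the particular plot are my assumptions. The column names gdpPercap, lifeExp and continent come from the gapminder dataset itself.

```r
# Assumption: use the Waterloo mirror from the list above; any CRAN mirror works
options(repos = c(CRAN = "https://mirror.csclub.uwaterloo.ca/CRAN/"))

# Install the two packages only if they are not already installed
for (pkg in c("ggplot2", "gapminder")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}

# Load the packages for this session
library(ggplot2)
library(gapminder)

# gapminder is tabular data: one row per country per year
head(gapminder)

# Hypothetical first plot: life expectancy against GDP per capita
p <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, colour = continent)) +
  geom_point(alpha = 0.5) +
  scale_x_log10() +
  labs(x = "GDP per capita (log scale)", y = "Life expectancy (years)")

print(p)
```

In RStudio the plot should appear in the Plots pane. The data comes from gapminder, and ggplot only handles the drawing.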
I also had to learn the terminology used on the platform, including the meaning of keywords such as:
- Character data: words and letters (called “strings”)
- Numeric data: whole numbers or decimals, can be positive or negative
- Integer data: whole numbers, can be positive or negative
- Logical data: TRUE or FALSE
- Vector: a sequence of data elements of the same type (e.g., only character or only numeric)
- Boolean operators: and (&), or (|), not (!), which let us combine logical inputs
- Function: an action being performed on an object (or argument). For example, in class(x), class() is the function
- Argument: the object for a function. For example, in class(x), x is the argument
- Optional argument: an object you don’t need to include for the function to work. For example, in round(sqrt(10), digits=2), digits=2 is an optional argument.
- Non-optional (required) argument: an argument the function needs in order to work. For example, in round(sqrt(10), digits=2), sqrt(10) is a non-optional argument.
- Library or package: a suite of specialized functions for different types of data or different projects
- Tibble or data frame: tabular data
- Vectorized operation: an operation, such as adding, subtracting or multiplying, that is applied to two vectors element by element, all at once
- For loop: a way to repeat a block of code
- Conditional: an if-else statement
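To make the terms above more concrete, here is a small sketch in base R that touches most of them. The values and the names heights, widths, areas, tall, rounded, size_label and people are made up for illustration. Only class(), round() and sqrt() come from the definitions in the list.

```r
# Data types: class() is the function, the value in brackets is the argument
class("hello")   # "character"
class(3.14)      # "numeric"
class(5L)        # "integer" (the L suffix asks R for a whole number)
class(TRUE)      # "logical"

# Vectors: sequences of elements of the same type (hypothetical values)
heights <- c(1.5, 1.6, 1.8)
widths <- c(2, 2, 2)

# Vectorized operation: multiplies the two vectors element by element
areas <- heights * widths            # 3.0 3.2 3.6

# Boolean operators combine logical values
tall <- heights > 1.55               # FALSE TRUE TRUE
tall & heights < 1.7                 # FALSE TRUE FALSE
!tall                                # TRUE FALSE FALSE

# Optional argument: digits = 2 can be left out, sqrt(10) cannot
rounded <- round(sqrt(10), digits = 2)   # 3.16
round(sqrt(10))                          # 3

# For loop with a conditional (if-else) inside it
size_label <- character(length(heights))
for (i in seq_along(heights)) {
  if (heights[i] > 1.55) {
    size_label[i] <- "tall"
  } else {
    size_label[i] <- "short"
  }
}

# Data frame: tabular data built from the vectors above
people <- data.frame(height = heights, label = size_label)
print(people)
```

Running this line by line in the RStudio console is a quick way to see what each keyword means in practice.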
After this, I learned how to get help in R as well as the common commands and functions needed to work effectively on the platform. My intention is to see which of the two tools is easier to use. Having explored the user interface of both, I think I'd rather go with SmartPLS. In my next post I will be looking at how to build models using SmartPLS.