Cracking my weekly SmartPLS data analysis exploration

As indicated at the beginning of this project, my aim is to understand how to use the SmartPLS software for sophisticated data analysis such as structural equation modeling (SEM). SEM is commonly employed to examine several statistical relationships concurrently, through both visualization and model validation. As my learning progressed, I came to understand that there are two distinct types of SEM: partial least squares structural equation modeling (PLS-SEM) and covariance-based structural equation modeling (CB-SEM). The choice between them depends on the goal of the study.

According to Dash and Paul (2021), PLS-SEM is used when a study focuses on prediction and theory building, whereas CB-SEM is used when a study focuses on theory testing, hypothesis confirmation, and validating model fit. Since the beginning of my project, I have been exploring the PLS-SEM analysis features of the SmartPLS software. One of the instructional resources that made my learning easy came from Dr James Gaskin, a professor of information systems at Brigham Young University (BYU), USA. He also runs a wiki called Gaskination, which contains a wealth of content that simplifies the various abstract concepts in SEM. I hold his instructional resources in high regard because he walks through a meticulous SEM procedure step by step, offering many insightful concepts along with plenty of advice and considerations to bear in mind at each stage of the analysis.

When exploring the SmartPLS 4 user interface, I realized that the software offers five different analysis models: PLS-SEM, CB-SEM, GSCA (generalized structured component analysis), process, and regression. My previous posts have focused on PLS-SEM analysis. I have just started exploring CB-SEM analysis, even though from what I have read online, AMOS or LISREL is often considered more effective for conducting CB-SEM. The latest SmartPLS 4 software has updated features that can also be used for CB-SEM, as illustrated in the video below.

Just like with the PLS-SEM approach explored in my previous learning reports, it is important to always check the quality of my measurement model through the following:

Reliability Tests:

  • Cronbach’s Alpha (α) > 0.7 (internal consistency)
  • Composite Reliability (CR) > 0.7 (overall construct reliability)

Validity Tests:

  • Convergent validity (how well items load onto their constructs)
    • Average Variance Extracted (AVE) > 0.5
  • Discriminant validity (how distinct one construct is from another)
    • Fornell-Larcker Criterion: the square root of AVE for a construct should be greater than its correlations with other constructs.
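To make sure I understood what these thresholds actually measure, I tried writing the formulas out in plain Python as a sanity check (the item scores and loadings below are made-up numbers for illustration, not values from my survey):

```python
from statistics import pvariance

# --- Cronbach's alpha from (made-up) item scores ----------------------
# Rows = respondents, columns = three items belonging to one construct.
scores = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
k = len(scores[0])                                    # number of items
item_vars = [pvariance(col) for col in zip(*scores)]  # variance of each item
total_var = pvariance([sum(row) for row in scores])   # variance of the sum scores
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# --- Composite reliability (CR) and AVE from standardized loadings ----
loadings = [0.82, 0.76, 0.88]  # made-up outer loadings for the same construct
ave = sum(l ** 2 for l in loadings) / len(loadings)
cr = sum(loadings) ** 2 / (sum(loadings) ** 2 + sum(1 - l ** 2 for l in loadings))

print(f"alpha = {alpha:.3f}, CR = {cr:.3f}, AVE = {ave:.3f}")
# → alpha = 0.918, CR = 0.861, AVE = 0.675
# For the Fornell-Larcker criterion, sqrt(AVE) of this construct would then
# be compared against its correlations with the other constructs.
```

With these made-up numbers, all three statistics clear their thresholds. SmartPLS reports the same quantities automatically, so this was purely to connect the formulas to the cut-offs above.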

Another important aspect of my learning is knowing that CB-SEM requires model fit indices to ensure the proposed model aligns with the data. These include the following:

Absolute Fit Indices

  • Chi-square (χ²): Should be non-significant (p > 0.05).
  • Root Mean Square Error of Approximation (RMSEA) < 0.08 (Good fit).
  • Standardized Root Mean Square Residual (SRMR) < 0.08 (Good fit).

Incremental Fit Indices

  • Comparative Fit Index (CFI) > 0.90
  • Tucker-Lewis Index (TLI) > 0.90

Parsimony Fit Indices

  • Adjusted Goodness-of-Fit Index (AGFI) > 0.80

If model fit is poor, one has to refine the model by removing low-loading indicators (< 0.5) or by checking for high modification indices (which indicate potential cross-loadings).
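To see how some of these indices relate to the chi-square statistic, I sketched the standard formulas for RMSEA, CFI, and TLI in Python (the chi-square values, degrees of freedom, and sample size below are invented for illustration, not results from my model):

```python
import math

# Hypothetical fit statistics: the fitted model's chi-square on df_m degrees
# of freedom, the baseline (independence) model's chi-square on df_0, and
# the sample size N. All numbers are invented for illustration.
N = 200
chi2_m, df_m = 85.0, 60    # fitted model
chi2_0, df_0 = 900.0, 78   # baseline model

# RMSEA: badness of fit per degree of freedom, scaled by sample size.
rmsea = math.sqrt(max(chi2_m - df_m, 0) / (df_m * (N - 1)))

# CFI: improvement of the fitted model over the baseline model.
cfi = 1 - max(chi2_m - df_m, 0) / max(chi2_0 - df_0, chi2_m - df_m, 0)

# TLI: like CFI but penalizes model complexity via the chi-square/df ratios.
tli = ((chi2_0 / df_0) - (chi2_m / df_m)) / ((chi2_0 / df_0) - 1)

print(f"RMSEA = {rmsea:.3f}, CFI = {cfi:.3f}, TLI = {tli:.3f}")
# → RMSEA = 0.046, CFI = 0.970, TLI = 0.960
```

With these invented numbers, RMSEA < 0.08, CFI > 0.90, and TLI > 0.90, so the hypothetical model would pass the cut-offs listed above.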

My learning progression with SmartPLS

My learning with SmartPLS has been an amazing experience. In my last post, I showed how I created my model by importing my data from Google Forms into the SmartPLS user interface and running the PLS-SEM algorithm. I also said that I was going to interpret the values generated by the PLS-SEM algorithm's calculations. In the video below, I explain the various values in relation to how they should be understood and reported. This involves exploring the key terms used in describing the outer (measurement) model and the inner (structural) model of the generated path model. In short, I learned how to interpret the construct reliability and validity, as well as the discriminant validity, of the items and constructs used in creating my model.

Update on my learning project

It has been an awesome experience navigating the SmartPLS platform, and I am happy that I have been able to replicate all that I have read and learned by creating a model on this platform. Though I am still reading and interpreting my results, I am quite satisfied with what I have achieved so far. Here is a recording of how I created a model using the PLS algorithm. I will be exploring bootstrapping in my next video.

Week 4 update on my SmartPLS application

At the beginning of this course, I indicated that I wanted to learn how to use SmartPLS and possibly the R software for data analysis. During the second week, I explored what SmartPLS is about and how to download the software, and I also did some reading. During the third week, I signed up for an online workshop on RStudio, where I learned about its user interface and how to run some commands. I must tell you that the R software is very difficult to navigate, and I doubt I can become very efficient in using it.

However, as I began exploring the SmartPLS user interface, I realized that I needed to familiarize myself with some key ideas and statistical techniques before I could work through the assessment model in SmartPLS. These techniques include exploratory factor analysis (EFA), confirmatory factor analysis (CFA), path analysis, and bootstrapping. So I turned to YouTube, my go-to learning platform.

To further enhance my learning, I came across a SlideShare presentation, Confirmatory Factor Analysis Overview, which also provided rules of thumb for factor loadings in a model when doing CFA or EFA. I also used Practical Introduction to CFA, explored path analysis further using the YouTube video below, and was able to get a soft copy of the 6th edition of Advanced and Multivariate Statistical Methods.

I later realized that I would need a set of data to explore this software effectively. While I am busy reading up on the various techniques, I have also asked a colleague to help me share an online survey I drafted for this purpose.

Just as I said earlier, my goal is to be able to run sophisticated and complex statistical analyses involving structural equation modeling. I want to make a video of my learning, but this can only happen once I start running my own model, which I think I will begin in my next phase, since the participants who had access to my questionnaire through my colleague have started responding. But I will first need to clean up the data and create a codebook in an Excel sheet before transferring the data to the SmartPLS platform. This is what my week 5 and 6 updates will entail, because coding participants' responses is a bit of work.
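Since the codebook step is coming up, I sketched what the recoding will roughly look like in Python (the response labels, numeric codes, and missing-value marker below are placeholders, not my actual codebook):

```python
# A hypothetical codebook: map Likert-scale labels to numeric codes,
# since SmartPLS expects numeric data in the imported CSV file.
codebook = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

# Raw survey responses as exported: one row per participant,
# one column per questionnaire item.
raw = [
    ["Agree", "Strongly agree", "Neutral"],
    ["Disagree", "Agree", ""],  # blank cell = missing response
]

MISSING = -99  # placeholder marker; SmartPLS lets you declare one on import

# Recode every cell; anything not in the codebook becomes the missing marker.
coded = [[codebook.get(cell, MISSING) for cell in row] for row in raw]
print(coded)  # → [[4, 5, 3], [2, 4, -99]]
```

The coded rows can then be written back to a CSV file, with the missing-value marker declared during the SmartPLS import step.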

My learning in the R data analysis platform

My second week started with a mission to explore the R software program and RStudio. Seriously, I have never done this before, nor am I good with statistics, and I must tell you that my week was not so good because the learning kept me on edge. As mentioned at the beginning of this class, I am trying to learn the R and SmartPLS software for data analysis purposes.

Getting started with R Programming | by Pier Paolo Ippolito | Towards Data Science

I have managed to pull through the first stage of SmartPLS, which is understanding the user interface, and I thought the same should be done for the R software as well. As part of my learning process, I joined an online workshop titled R for Data Visualization. To help us better understand the R platform and how it is used, the facilitator shared some helpful resources for using R and ggplot2: the Data Visualization with ggplot2 Cheat Sheet and the R Graph Gallery.

In the workshop, ggplot2 was the package used for visualizing the data, while gapminder was the package the example dataset was drawn from.

To learn how to use R for data analysis, the first things I had to do, based on my experience from the workshop I attended, were:
  • First, download the R software from https://www.r-project.org/
  • Then download RStudio from https://posit.co/downloads/ (we were instructed during the workshop to use the free/open-source version of RStudio Desktop, not the Pro version)
  • Please note that you will be asked to select a CRAN mirror during the download and installation process. You can choose one of the three Canadian CRAN mirrors:
    • https://mirror.rcg.sfu.ca/mirror/CRAN
    • https://muug.ca/mirror/cran/
    • https://mirror.csclub.uwaterloo.ca/CRAN/

I had to learn the terminology used on the platform, including the meaning of keywords like:

  • Character data: words and letters (called “strings”)
  • Numeric data: whole numbers or decimals, can be positive or negative
  • Integer data: whole numbers, can be positive or negative
  • Logical data: TRUE or FALSE
  • Vector: a sequence of data elements of the same type (i.e., only character or only numeric)
  • Boolean operators: and (&), or (|), and not (!), which let us combine inputs
  • Function: an action performed on an object (or argument). For example, in class(x), class() is the function
  • Argument: the object a function acts on. For example, in class(x), x is the argument
  • Optional argument: an argument you don’t need to include for the function to work. For example, in round(sqrt(10), digits=2), digits=2 is an optional argument
  • Non-optional argument: an argument necessary for the function to work. For example, in round(sqrt(10), digits=2), sqrt(10) is a non-optional argument
  • Library or package: a suite of specialized functions for different types of data or different projects
  • Tibble or data frame: tabular data
  • Vectorized operation: an operation, such as adding, subtracting, or multiplying, applied to two vectors in parallel
  • For loop: a way to repeat a block of code
  • Conditional: an if-else statement

After this, I learned how to get help in R, as well as the common commands and functions needed to work effectively on the platform. My intention is to see which of the two tools will be easier to use. Having explored the user interfaces of both, I think I'll go with SmartPLS. In my next post, I will be looking at how to build models using SmartPLS.

Let the learning begin: Phase 1

My primary goal in learning these tools is to gain a thorough understanding of statistical concepts such as regression, correlation, and hypothesis testing. I also want to know how to use SmartPLS to analyze survey or experimental data for structural equation modeling. In the course of this week, I visited the official SmartPLS website: https://www.smartpls.com/courses. This gave me the opportunity to learn about the company and the different types of licenses available. The website also has a lot of resources (recommended readings, other books, tutorials, and videos) to help users understand the tool (https://www.smartpls.com/documentation).

I watched a YouTube video, SmartPLS 4 Tutorial Guide 1: Getting Started. I was then able to download and install the tool on my laptop and navigate the workspace as shown in the video.

This week, I learnt how to create a workspace, name a project, import data from an Excel sheet (CSV file) into the workspace, set the data scale for each variable (ordinal, categorical, or metric), and identify missing values. My journey continues, and in the coming week I will be learning the terminology and theoretical background of Partial Least Squares Structural Equation Modelling (PLS-SEM). I think this will enhance my understanding of how models are built. I already have a set of data I plan to use, though I think I will still have to come up with some kind of question or hypothesis to help me navigate the tool. Here are some of the useful resources for this week’s learning:

Learning Data Analysis skills: R and Smart PLS

I have been thinking a lot about what exactly to do for my project. I had various ideas, ranging from baking, video editing, and jewellery making to website design. These are all ideas I really want to explore, but I also have to consider their relevance to my present job. After some sleepless nights, I have decided to learn complex statistical tools for data analysis.

In addition to my part-time job as an exam invigilator, I also work as a researcher. This requires me to collect, analyze, and interpret data to inform policies, awareness, or views about the topics researched. In most cases, I have focused on the qualitative data and contracted out the quantitative aspect of my work. In recent times, I have realized that statistics is becoming an important analytical skill used for decision-making in most fields, and people with statistical skills are now sought after. So, I think learning some of the tools used for complex statistical analysis will be a relevant skill that is useful to my job and could create more opportunities for me as a data analyst.

I must say that I am quite familiar with some statistical terms like mean, median, mode, and standard deviation, but I think that is all I know about basic statistics. So I feel that learning to use tools like R and SmartPLS will help me gain the ability to analyze complex datasets, uncover relationships between variables, and generate actionable insights. R is renowned for its versatility in data analysis and visualization, while SmartPLS is a powerful tool for Partial Least Squares Structural Equation Modeling (PLS-SEM), which is ideal for exploring latent variables and predictive relationships.

This project is not just about acquiring technical skills but also about learning how to leverage statistical analysis to answer impactful questions and support decision-making in my field of work. I really don’t know how this will go, but I am ready to see what lies ahead as I navigate these tools.
