Posts

Showing posts from March, 2026

Module 10: Friedman R Package Proposal

For this assignment, I created the initial structure for an R package called Friedman. The goal of this project was to learn how R packages are built and how they organize code, documentation, and metadata in a standardized way. Before this assignment, I didn’t fully understand how packages worked behind the scenes, so this helped me see how everything is connected.

Purpose and Scope

The purpose of the Friedman package is to make it easier to reuse code for data analysis and data mining tasks. In many assignments, I find myself writing similar code over and over again, especially for summarizing data or preparing results. This package is designed to group those repeated tasks into simple, reusable functions. The intended users are students and beginner R users who want a basic toolkit to help with analysis. Instead of rewriting code each time, they can use functions from the package to save time and stay organized.

Key Functions

Right now, the package includes one m...
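As a rough illustration of the kind of reusable helper such a package could hold, here is a minimal sketch. The function name `summarize_numeric` and its behavior are hypothetical and are not taken from the Friedman package itself; the roxygen-style comments only show how documentation typically sits next to the code in a package's R/ directory.

```r
# Hypothetical helper for illustration only; not part of the Friedman package.

#' Summarize the numeric columns of a data frame
#'
#' @param df A data frame.
#' @return A data frame with one row per numeric column (name, mean, sd).
summarize_numeric <- function(df) {
  stopifnot(is.data.frame(df))

  # Keep only the numeric columns.
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]

  data.frame(
    variable  = names(numeric_cols),
    mean      = vapply(numeric_cols, mean, numeric(1), na.rm = TRUE),
    sd        = vapply(numeric_cols, sd, numeric(1), na.rm = TRUE),
    row.names = NULL
  )
}

# Example call on a built-in dataset.
result <- summarize_numeric(iris)
print(result)
```

Writing the summary once as a function means later assignments only need the single call at the bottom rather than the repeated column-by-column code.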

Module 9: Comparing Base R, Lattice, and ggplot2

For this assignment, I used the built-in iris dataset in R. I chose this dataset because it includes several numerical variables and a grouping variable, Species, which made it useful for comparing different visualization systems. I created plots using base R, lattice, and ggplot2 to see how the syntax, workflow, and final output differed across the three systems.

Base R Graphics

For the base R portion, I created a scatter plot showing the relationship between sepal length and petal length. Base R was straightforward to use for a basic graph, but it required more manual work, such as adding the legend separately.

Lattice Graphics

For the lattice portion, I used a conditioned scatter plot to separate the data by species. This made it easier to compare patterns across groups because each species appeared in its own panel.

ggplot2 Graphics

For ggplot2, I created a scatter plot with...
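The three approaches can be compared side by side with a sketch like the one below. It is not the exact code from the post: the colors, axis labels, and the choice to plot the same two variables in every system are assumptions, and the ggplot2 part (including mapping color to Species) only runs if that package is installed.

```r
library(lattice)  # recommended package that ships with standard R installs

# Base R: scatter plot colored by species; the legend is added by hand.
species_cols <- c("steelblue", "darkorange", "forestgreen")  # assumed colors
plot(iris$Sepal.Length, iris$Petal.Length,
     col = species_cols[as.integer(iris$Species)], pch = 19,
     xlab = "Sepal length", ylab = "Petal length", main = "Base R")
legend("topleft", legend = levels(iris$Species),
       col = species_cols, pch = 19)

# lattice: the "| Species" term conditions the plot, one panel per species.
lattice_plot <- xyplot(Petal.Length ~ Sepal.Length | Species, data = iris,
                       xlab = "Sepal length", ylab = "Petal length",
                       main = "lattice")
print(lattice_plot)

# ggplot2: guarded so the script still runs where the package is missing.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  gg_plot <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                              color = Species)) +
    geom_point() +
    labs(x = "Sepal length", y = "Petal length", title = "ggplot2")
  print(gg_plot)
}
```

The base R version needs the explicit `legend()` call, while lattice handles grouping through the formula and ggplot2 generates the legend from the color mapping.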

Module 8: plyr mean by Sex + filtering names with “i”

For this assignment, I worked with the Assignment 6 dataset in R. The dataset contains four variables: Name, Age, Sex, and Grade. The goal was to import the dataset, calculate the mean grade by Sex, filter the data based on a condition in the Name column, and export the results to files.

First, I imported the dataset into R using read.table() and confirmed that it loaded correctly. To quickly inspect the data, I used the head() function to display the first few rows of the dataset. This allowed me to verify that the columns and values were read properly.

The preview shows that each row represents a student and includes their name, age, sex, and grade.

Next, I wanted to see how many students were in each category of the Sex variable. I used the table() function to count how many males and females were present in the dataset. ...
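The workflow above can be sketched end to end as follows. The rows are made-up stand-ins for the Assignment 6 file, and the output file names are hypothetical. The sketch assumes the filter means a lowercase "i" anywhere in the name, and it falls back to base R's aggregate() when plyr is not installed.

```r
# Made-up rows standing in for the Assignment 6 file (not the real data).
students <- read.table(text = "
Name Age Sex Grade
Ana 20 Female 90
Brian 22 Male 80
Chloe 21 Female 86
Dmitri 23 Male 84
Elena 19 Female 94
Frank 24 Male 76
", header = TRUE, stringsAsFactors = FALSE)

head(students)        # quick look at the first rows
table(students$Sex)   # number of students in each Sex category

# Mean grade by Sex: plyr::ddply when available, base aggregate() otherwise.
if (requireNamespace("plyr", quietly = TRUE)) {
  mean_by_sex <- plyr::ddply(students, "Sex", plyr::summarise,
                             MeanGrade = mean(Grade))
} else {
  mean_by_sex <- aggregate(Grade ~ Sex, data = students, FUN = mean)
  names(mean_by_sex)[2] <- "MeanGrade"
}

# Keep rows whose Name contains a lowercase "i" (assumed matching rule).
names_with_i <- students[grepl("i", students$Name, fixed = TRUE), ]

# Export both results; the file names here are hypothetical.
out_dir <- tempdir()
mean_file <- file.path(out_dir, "mean_by_sex.txt")
filtered_file <- file.path(out_dir, "names_with_i.csv")
write.table(mean_by_sex, mean_file, sep = ",", row.names = FALSE)
write.csv(names_with_i, filtered_file, row.names = FALSE)
```

Adding `ignore.case = TRUE` in place of `fixed = TRUE` would also match names that start with a capital "I", if that is the intended rule.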

Module 7: R Objects S3 vs. S4

For this assignment, I used the built-in dataset mtcars from R (so I didn’t have to download anything). First I loaded it and checked the first few rows to confirm it worked.

Step 1. Data

mtcars is a data frame (32 rows × 11 columns). Since it’s a normal R dataset, it already comes with a class and lots of functions that work with it.

Step 2. Can a generic function be assigned to this dataset? If not, why?

A generic function is a function that chooses which method to run based on the class of the object you pass in (like print(), summary(), or plot()). For mtcars, generic functions already work because mtcars has class "data.frame" (and also behaves like a list under the hood). For example, summary(mtcars) runs the summary.data.frame() method automatically. If I tried to use a generic function that has no method for a data frame, it wouldn’t know what to do (it would either fall back to a default method or error). That’s basically the “why not” case: the obj...
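The dispatch behavior described in Step 2 can be checked with a short sketch. The generics `describe` and `no_method` are hypothetical names introduced only for this illustration; everything else is base R.

```r
class(mtcars)    # "data.frame"
is.list(mtcars)  # TRUE: a data frame is a list of columns

# summary() is an S3 generic; for a data frame it dispatches to
# summary.data.frame(), which can be looked up explicitly.
summary_method <- getS3method("summary", "data.frame")

# Hypothetical generic with a method for data frames.
describe <- function(x, ...) UseMethod("describe")
describe.data.frame <- function(x, ...) {
  sprintf("data frame with %d rows and %d columns", nrow(x), ncol(x))
}
mtcars_description <- describe(mtcars)
print(mtcars_description)

# Hypothetical generic with no data.frame method and no default method:
# dispatch fails, so the call raises an error that is captured here.
no_method <- function(x, ...) UseMethod("no_method")
dispatch_error <- tryCatch(no_method(mtcars),
                           error = function(e) conditionMessage(e))
print(dispatch_error)
```

Defining a `no_method.default` function would turn the error case into the fallback case, since S3 dispatch tries the default method after the class-specific ones.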