

A scatter plot displays the values of 2 variables for a set of data, and it is a very useful way to visualize data during exploratory data analysis, especially (though not exclusively) when you are interested in the relationship between a predictor variable and a target variable. Sometimes, such data come with categorical labels that have important meanings, and the visualization of the relationship can be enhanced when these labels are attached to the data.

It is common practice to use a legend to label data that belong to a group, as I illustrated in a previous post on bar charts and pie charts. However, what if every datum has a unique label, and there are many data in the scatter plot? A legend would add unnecessary clutter in such situations. Instead, it would be useful to write the label of each datum near its point in the scatter plot. I will show how to do this in R, illustrating the code with a built-in data set called LifeCycleSavings.

A data set containing such labels is LifeCycleSavings, a built-in data set in R. Each row contains economic or demographic data for a particular country. In this case, the country is a unique categorical label for each datum. I will plot aggregate personal savings (sr) as a function of real per-capita disposable income (dpi), and I will label each datum with its associated country. Note that I am not saying anything about a predictive relationship in this context; I am simply trying to explore the data in these 2 dimensions, and I may eventually find clustering to be useful for further analysis, as I alluded to earlier in the introduction. To get a sense of what this data set looks like, print its first 9 rows with head(LifeCycleSavings, 9).

The plot() and text() Functions

First, let’s use the plot() function to plot the points.

# By Eric Cai - The Chemical Statistician
plot(sr~dpi, xlim = c(0, 3500), xlab = 'Real Per-Capita Disposable Income', ylab = 'Aggregate Personal Savings', main = 'Intercountry Life-Cycle Savings Data', data = LifeCycleSavings)

Then, let’s use the text() function to add the text labels to the data. It has to be nested within the with() function, because, unlike plot(), “data” is not a valid option for text().

with(LifeCycleSavings, text(sr~dpi, labels = row.names(LifeCycleSavings), pos = 4))

The value for the “labels” option looks complicated, but it’s just a vector of strings that I extracted from the row names of the LifeCycleSavings data frame using row.names(), which is a very useful function! The “pos” option specifies the position of the text relative to the point: 1 = below, 2 = left, 3 = above, 4 = right. I have chosen to use “4” because I want the text to be to the right of the point.

Exporting the Image as a PNG File

Finally, let’s sandwich the two lines of plotting functions with png() and dev.off() to export the image as a PNG file into my chosen directory.

png('Insert Your Directory Path Here/savings.png')
plot(sr~dpi, xlim = c(0, 3500), xlab = 'Real Per-Capita Disposable Income', ylab = 'Aggregate Personal Savings', main = 'Intercountry Life-Cycle Savings Data', data = LifeCycleSavings)
with(LifeCycleSavings, text(sr~dpi, labels = row.names(LifeCycleSavings), pos = 4))
dev.off()

I could have used the attach() function to set this data set in the search path in R, so that any variable in this data set can be called by simply entering its name. This would have made the plotting code simpler. (Of course, it’s good to undo this with the detach() function once you have finished using the data set.) However, as Nick Horton on R Bloggers points out, this is not a recommended practice. For comparison, with the data set attached, the export code would be:

png('Insert Your Directory Path Here/savings.png')
plot(dpi, sr, xlim = c(0, 3500), xlab = 'Real Per-Capita Disposable Income', ylab = 'Aggregate Personal Savings', main = 'Intercountry Life-Cycle Savings Data')
text(dpi, sr, labels = row.names(LifeCycleSavings), pos = 4)
dev.off()
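Putting the pieces of this post together, here is a minimal end-to-end sketch of labelling every point in a scatter plot of LifeCycleSavings and exporting the result. It is only an illustration under a few assumptions that are not part of the post: the image is written to a temporary file (the names out_file and country_labels are hypothetical), text() is called with plain x and y vectors instead of a formula, and your R installation needs a working png() device.

```r
# Sketch: label each point with its country and export the plot as a PNG.
# Assumption: out_file is a hypothetical temporary path, used here in place
# of a real directory so that the example runs anywhere.
out_file <- tempfile(fileext = ".png")

# The row names of the data frame are the country names, one per datum.
country_labels <- row.names(LifeCycleSavings)

# Open the PNG device; everything drawn until dev.off() goes into the file.
png(out_file)

# Draw the points: savings (sr) against disposable income (dpi).
plot(sr ~ dpi, data = LifeCycleSavings,
     xlim = c(0, 3500),
     xlab = 'Real Per-Capita Disposable Income',
     ylab = 'Aggregate Personal Savings',
     main = 'Intercountry Life-Cycle Savings Data')

# Add one label per point; with() lets us refer to dpi and sr directly.
# pos = 4 places each label to the right of its point.
with(LifeCycleSavings, text(dpi, sr, labels = country_labels, pos = 4))

# Close the device so the file is actually written to disk.
invisible(dev.off())
```

Using with() keeps the column lookups local to this one call, which avoids the side effects of attach() on the search path. If the labels overlap too much, the cex argument of text() can shrink them.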
