In this post, I will show how to combine a series of plots into a GIF file using the animation package in R.
In this example, I will create a GIF of horizontal bar plots showing how the proportions of men and women have changed in various occupation categories in Canada. Although I will show how to generate the results, I will leave the analysis for another post – this post covers only the data and the R code.
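As a rough sketch of the general approach (not the exact code from the post), the idea is to draw one bar plot per year inside a loop and let `saveGIF()` from the animation package stitch the frames together. The years, occupation categories, and proportions below are made up purely for illustration, and `draw_frame()` is a hypothetical helper. `saveGIF()` also relies on ImageMagick or GraphicsMagick being installed on the system.

```r
# Hypothetical data: share of women in three made-up occupation categories.
years <- c(1990, 2000, 2010)
occupations <- c("Category A", "Category B", "Category C")
share_women <- matrix(
  c(0.30, 0.55, 0.70,
    0.35, 0.56, 0.68,
    0.42, 0.58, 0.66),
  nrow = length(years), byrow = TRUE,
  dimnames = list(as.character(years), occupations)
)

# Hypothetical helper: draw one stacked horizontal bar plot for year i.
draw_frame <- function(i) {
  shares <- rbind(Women = share_women[i, ], Men = 1 - share_women[i, ])
  barplot(shares, horiz = TRUE, xlim = c(0, 1), las = 1,
          main = paste("Women and men by occupation,", years[i]),
          xlab = "Proportion", legend.text = TRUE)
}

# Each plot drawn inside the expression becomes one frame of the GIF.
# The guard skips this step if the animation package is not installed.
if (requireNamespace("animation", quietly = TRUE)) {
  animation::saveGIF({
    for (i in seq_along(years)) draw_frame(i)
  }, movie.name = "occupations.gif", interval = 1)
}
```

Here `interval` is the delay between frames in seconds, so a larger value gives a slower animation.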
For my next step in looking at income inequality in Canada, I thought it would be interesting to look at average wages with respect to age and gender. It would have been nice to include level of education and number of years of work experience (rather than just the age of the individual), but I was unable to find a dataset with all of these properties. I did find some interesting data regarding education level and escaping from poverty, but that’s for another post.
Today, I will show the overall results for Canada and the provinces. In the following days, I will publish more detailed results for the individual provinces, showing average wages for the different occupation categories.
In my last post, I was happy to come across my first unexpected result: that the vast majority of the rise in income inequality in Canada from 1976 to 2011 occurred within the five-year span from 1995 to 2000. I was looking for possible explanations for why this occurred when I came across this interesting paper by Emmanuel Saez and Michael Veall. They also remark on the 1995 surge in inequality and offer some explanations – I will summarize the two that I found most relevant and interesting:
In my previous post on this subject, I showed that the distribution of incomes in Canada has changed only slightly since 1976. Although I think the dataset I used is a good start, by no means does it offer a complete perspective on the issue of income inequality. One of its limitations concerns the top bracket of earners. To illustrate with a hypothetical example, suppose that the proportion of earners in the top bracket did not change over time, but that they captured the vast majority of income gains. This trend would not be picked up in the data I used. If, instead, the majority of income gains were captured by any bracket other than the top, we would see one bracket shrinking and another growing. It is only with the top bracket that the data has this blind spot.
As a partial answer to this, I looked at the Gini coefficient, which is a measure of income distribution. How does it work? This is the explanation from the StatsCan website:
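Separately from the StatsCan wording, here is a minimal sketch in R of one common way to compute a Gini coefficient from a vector of individual incomes. The function name `gini()` and the sample incomes are hypothetical, and this simple sample formula is an assumption for illustration, not necessarily the method StatsCan uses.

```r
# Hypothetical helper: Gini coefficient of a vector of non-negative incomes.
# Uses the sorted-rank formula G = sum((2i - n - 1) * x_i) / (n * sum(x)).
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

gini(c(10, 10, 10, 10))  # 0: everyone earns the same
gini(c(0, 0, 0, 100))    # 0.75: one person earns everything
```

A value of 0 means incomes are perfectly equal, and the value rises toward 1 as income becomes concentrated in fewer hands. With this formula, the maximum for n people is (n - 1) / n, which is why the second example gives 0.75 rather than 1.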
R is a fairly intuitive language. It generally does not take long to learn the different variable types, how to import data, the control statements, etc. However, when it comes to making graphics, there is so much syntax involved that it can be a bit overwhelming.
What is ggplot2? It is a package for R that contains tools for producing various graphics. I’ll go further into what it is, and how to use it, in another post. The purpose of this post is to show how using the plot() function in R can get unnecessarily complicated, and that the ggplot2 package is a superior alternative.
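To give a flavour of the difference, here is a small sketch that draws the same two-line chart both ways. The data frame `wages` and its values are invented for illustration only and are not taken from any of the datasets discussed here.

```r
library(ggplot2)

# Hypothetical data: average hourly wage by year and gender.
wages <- data.frame(
  year   = rep(c(2000, 2005, 2010), times = 2),
  gender = rep(c("Men", "Women"), each = 3),
  wage   = c(20, 22, 25, 16, 19, 22)
)

# Base graphics: set up an empty plot, then add each group and the legend by hand.
plot(wage ~ year, data = wages, type = "n",
     xlab = "Year", ylab = "Average hourly wage")
groups <- unique(wages$gender)
for (i in seq_along(groups)) {
  sub <- wages[wages$gender == groups[i], ]
  lines(sub$year, sub$wage, col = i)
}
legend("topleft", legend = groups, col = seq_along(groups), lty = 1)

# ggplot2: map gender to colour and let the package handle grouping and the legend.
p <- ggplot(wages, aes(x = year, y = wage, colour = gender)) +
  geom_line() +
  labs(x = "Year", y = "Average hourly wage", colour = "Gender")
print(p)
```

In the base version the loop, the colours, and the legend all have to be kept in sync manually, whereas the ggplot2 version derives them from the single `colour = gender` mapping.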