Income inequality in Canada: Part 5 – what happened in 1995?

In my previous post on this subject, I showed that the distribution of incomes in Canada has changed only slightly since 1976. Although I think the dataset I used is a good start, it by no means offers a complete perspective on income inequality. One of its limitations concerns the top bracket of earners. To illustrate with a hypothetical example, suppose that the proportion of earners in the top bracket did not change over time, but they captured the vast majority of income gains. This trend would not be picked up in the data I used. If instead the majority of income gains were captured by any bracket other than the top, we would see one bracket shrinking and another growing. It is only with the top bracket that the data has this blind spot.
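
To make the hypothetical concrete, here is a minimal sketch in R. The incomes, the bracket boundaries and the helper names (bracket_share, top_income_share) are all made up for illustration and are not taken from the dataset. The top earner's income triples between the two years, yet the proportion of earners in each bracket stays exactly the same.

```r
# Hypothetical incomes (in thousands of dollars) for two years.
# Only the top earner's income changes: it triples.
year1 <- c(20, 35, 50, 65, 80, 120)
year2 <- c(20, 35, 50, 65, 80, 360)

# Hypothetical brackets; the top bracket is open-ended.
breaks <- c(0, 25, 50, 75, 100, Inf)
labels <- c("<25", "25-50", "50-75", "75-100", "100+")

# Proportion of earners falling in each bracket.
bracket_share <- function(incomes) {
  brackets <- cut(incomes, breaks = breaks, labels = labels, right = FALSE)
  prop.table(table(brackets))
}

share1 <- bracket_share(year1)
share2 <- bracket_share(year2)

# The bracket proportions are the same in both years.
all.equal(as.numeric(share1), as.numeric(share2))

# Share of total income going to the top earner: this does change.
top_income_share <- function(incomes) max(incomes) / sum(incomes)
top_income_share(year1)  # 120 / 370, about 0.32
top_income_share(year2)  # 360 / 610, about 0.59
```

A dataset that only reports the bracket proportions would show no change here, even though the top earner's share of total income has nearly doubled.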

As a partial answer to this, I looked at the Gini coefficient, which is a measure of income inequality. How does it work? This is the explanation from the StatsCan website:

The Gini coefficient is a number between zero and one that measures the relative degree of inequality in the distribution of income. The coefficient would register zero (minimum inequality) for a population in which each person received exactly the same adjusted family income and it would register a coefficient of one (maximum inequality) if one person received all the adjusted family income and the rest received none. Even though a single Gini coefficient value has no simple interpretation, comparisons of the level over time or between populations are very straightforward: the higher the coefficient, the higher the inequality of the distribution, and vice versa.
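
As a rough sketch of how such a number can be computed, the R function below uses one common textbook formula based on sorted incomes. This is an assumption for illustration; it is not necessarily the exact method StatsCan uses, and the function name gini and the sample incomes are made up.

```r
# Gini coefficient of a vector of non-negative incomes, using the
# sorted-values formula:
#   G = sum((2 * i - n - 1) * x_i) / (n * sum(x))
# where x is sorted in increasing order and i runs from 1 to n.
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Everyone receives the same income: minimum inequality.
equal_incomes <- rep(50000, 10)
gini(equal_incomes)   # 0

# One person receives everything: maximum inequality.
one_takes_all <- c(rep(0, 9), 500000)
gini(one_takes_all)   # 0.9, i.e. (n - 1) / n
```

With this formula the maximum for a population of n people is (n - 1) / n, which approaches one as the population grows, in line with the description above.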

So basically, I wanted to see how this Gini coefficient has changed over time.

Data:

You can find the data I used here. The data is for after-tax income, which includes earnings, net investment income, retirement income and government transfers, less income tax. Below is an Excel version of the data I used:

cansim4917010673138913470

Methodology:

I used the R code below to produce the plots:

library(ggplot2)
library(reshape)

#import data, rename headers
pData <- read.csv("C:/Users/Business/Dropbox/Economics Research/Blog/Income inequality/Post 7 - Gini Coeff/cansim4917010673138913470.csv",header=TRUE)
pData <- rename(pData, c("Ref_Date" = "Date", "GEO" = "Geography", "INCOMECONCEPT" = "Income.Concept", "FAMILYTYPE" = "Family.Type"))

#change the order of the factors. we do this so that the faceted plot will start with the national data, and then
#the provinces from east to west
pData$Geography <- factor(pData$Geography, levels = c("Canada", "Atlantic provinces", "Quebec", "Ontario",
"Manitoba", "Saskatchewan", "Alberta", "British Columbia"))

#create a boolean vector for when the observation is at or after 1995 and add it to the pData dataframe.
#we will use this vector to group the linear regressions in one of the plots
pData$ge1995 <- pData$Date >= 1995

#I found the code between the two hash lines at: http://ryouready.wordpress.com/2009/02/17/r-good-practice-adding-footnotes-to-graphics/
#it's a lovely bit of code that creates a custom function to add a footer to the plots
##############################################################################################################
source <- "Source: StatsCan"
author <- "PosNorm"
footnote <- paste(source, format(Sys.time(), "%d %b %Y"),
                  author, sep=" / ")

# default footnote is today's date, cex=.7 (size) and color
# is a kind of grey

makeFootnote <- function(footnoteText=
                         format(Sys.time(), "%d %b %Y"),
                         size= .7, color= grey(.5))
{
   require(grid)
   pushViewport(viewport())
   grid.text(label= footnoteText ,
             x = unit(1,"npc") - unit(2, "mm"),
             y= unit(2, "mm"),
             just=c("right", "bottom"),
             gp=gpar(cex= size, col=color))
   popViewport()
}
##############################################################################################################


#code for the plot of canada and the provinces
pdf("C:/Users/Business/Dropbox/Economics Research/Blog/Income inequality/Post 7 - Gini Coeff/Fig_4.pdf", height = 8.5, width = 15)
p <- ggplot(data = pData, aes(x = Date, y = Value))+
     geom_point(shape=3) +    #this is the first layer of the plot, the scatterplot
     geom_smooth(method="lm", se=TRUE, aes(group=ge1995))+  #in the second layer we add the regressions
     geom_vline(xintercept = 1995, linetype = 2, colour = "red") +  #the third layer is a vertical line at year 1995
     facet_wrap( ~ Geography, ncol=4) + #this creates a separate plot for each geography and limits 4 graphs per row
     theme(axis.title.x = element_blank()) +    #removes x axis title
     theme(axis.title.y = element_blank()) +    #removes y axis title
     ggtitle("Gini coefficient of after-tax income in Canada and provinces from 1976 to 2011")     #add a title to the plot
print(p)     #print explicitly so the plot is also drawn when the script is run with source()
makeFootnote(footnote)     #adds the footnote through the above custom function
dev.off()

#code for the plot of canada only with a rectangle highlighting years 1995-2000
rectData <- data.frame(xmin = 1995, xmax = 2000, ymin = -Inf, ymax = Inf) #create a data frame for the rectangle used to highlight 1995-2000
pdf("C:/Users/Business/Dropbox/Economics Research/Blog/Income inequality/Post 7 - Gini Coeff/Fig_5.pdf")
p <- ggplot(data = subset(pData, Geography=="Canada"), aes(x = Date, y = Value))+
     geom_point(shape=3, color ="blue") +
     geom_rect(data = rectData, aes(xmin=xmin, xmax=xmax, ymin=ymin, ymax=ymax),fill = "orange", alpha = 0.3, inherit.aes=FALSE) +
     theme(axis.title.x = element_blank()) +
     theme(axis.title.y = element_blank()) +
     ggtitle("Gini coefficient of after-tax income in Canada from 1976 to 2011")
print(p)     #print explicitly so the plot is also drawn when the script is run with source()
makeFootnote(footnote)
dev.off()
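
The grouped regression lines in the first plot can also be inspected numerically. The sketch below fits a separate linear trend on each side of a break year and returns the two slopes. The helper name trend_slopes and the toy series are made up for illustration; the function assumes a data frame with numeric Date and Value columns (like pData after the renaming above) that has observations on both sides of the break.

```r
# Fit a separate linear trend before and from a break year onward,
# and return the slope (change in Value per year) of each fit.
# Assumes df has numeric columns Date and Value, with data on both sides.
trend_slopes <- function(df, break_year = 1995) {
  groups <- split(df, df$Date >= break_year)   # list order: FALSE, then TRUE
  slopes <- sapply(groups, function(g) coef(lm(Value ~ Date, data = g))[["Date"]])
  names(slopes) <- c("before", "from_break")
  slopes
}

# Made-up series for illustration only (not StatsCan data):
# flat until 1994, then rising by 0.005 per year.
toy <- data.frame(Date  = 1990:1999,
                  Value = c(rep(0.30, 5), 0.30 + 0.005 * (1:5)))

# The slope is about 0 before the break and about 0.005 from the break on.
slopes <- trend_slopes(toy)
slopes
```

With the real data, something like lapply(split(pData, pData$Geography), trend_slopes) should give the pair of slopes for each panel of the faceted plot, assuming there is one observation per year in each geography.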

Results:

Fig_5

Fig_4

Positive analysis: The Gini coefficient has increased since 1976 in Canada and in each province except Manitoba and Saskatchewan. 1995 appears to mark a change in trend for income inequality in Canada and the provinces. Although the Gini coefficient increased only 8.5% nationwide between 1976 and 2011, 93.5% of that change occurred between 1995 and 2000. British Columbia has the highest Gini coefficient and also the strongest upward trend. The provinces of Alberta, Saskatchewan and Manitoba have the most volatile data series.
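
For anyone who wants to reproduce percentages of this kind, here is a minimal sketch of the arithmetic. The helper name share_of_change and the toy numbers are made up and are not the StatsCan values; the function assumes a data frame with Date and Value columns and exactly one row per year.

```r
# Total change between `start` and `end`, that change as a percentage of
# the starting value, and the share of it that occurred between `from` and `to`.
# Assumes df has columns Date and Value with exactly one row per year.
share_of_change <- function(df, start, end, from, to) {
  value_at <- function(year) df$Value[df$Date == year]
  total_change  <- value_at(end) - value_at(start)
  window_change <- value_at(to) - value_at(from)
  c(total_change     = total_change,
    pct_change       = 100 * total_change / value_at(start),
    window_share_pct = 100 * window_change / total_change)
}

# Made-up numbers for illustration only.
toy <- data.frame(Date  = c(1976, 1995, 2000, 2011),
                  Value = c(0.40, 0.41, 0.44, 0.45))

# Total change 0.05 (12.5% of the 1976 value); 60% of it falls in 1995-2000.
result <- share_of_change(toy, 1976, 2011, 1995, 2000)
result
```

For the national series, the equivalent call would be along the lines of share_of_change(subset(pData, Geography == "Canada"), 1976, 2011, 1995, 2000).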

Normative analysis: I was not surprised to see that Manitoba, Saskatchewan and Alberta have the greatest variance in the data, since many incomes there are tied to agriculture and commodities, which tend to be volatile industries. The strong trend in British Columbia is interesting because you hear stories of how Vancouver is a hotspot for wealthy foreigners and has an extremely expensive real estate market. The thing that really makes me scratch my head is: why the big jump between 1995 and 2000?

The Gini coefficient is a debated measure that has its limitations.  Although it has increased, a 0.031 increase in the coefficient does not significantly change the overall income distribution function.  I will look more into the 1995-2000 surge though.
