Canada’s largest industry: Real estate rental and leasing?

Results:

The graphs below break down Canadian GDP by the major categories of the North American Industry Classification System (NAICS):

[Figure: Percent contribution to Canadian GDP growth, 2000 to 2014]

[Figure: Canadian GDP by NAICS industry]

The first graph shows that real estate has contributed more to Canadian economic growth since 2000 than any other industry. Note that real estate does not include the construction and development of buildings; that activity falls under construction, the second-largest contributor to GDP growth. The second graph shows that real estate surpassed manufacturing in 2008 as Canada’s largest industry. Since 1997, construction has jumped from the 7th-largest industry to the 4th. Manufacturing, now the second-largest industry, is the only industry that has made a negative contribution to GDP growth since 2000.
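To make "contribution to GDP growth" concrete: each industry's change in GDP over the period is divided by the change in total GDP, so an industry that shrank gets a negative share. Here is a minimal sketch in R with made-up numbers (the industry names and values are hypothetical, not StatsCan figures). For simplicity the total here is the sum of the toy industries, whereas the code in the Methodology section divides by the "All industries" row.

```r
# Hypothetical GDP levels (billions) at the start and end of a period
toy <- data.frame(
  industry = c("A", "B", "C"),
  start    = c(100, 200, 50),
  end      = c(160, 190, 100)
)

change <- toy$end - toy$start             # change in GDP for each industry
toy$contribution <- change / sum(change)  # each industry's share of total growth

print(toy)
```

Industry B shrinks over the period, so its contribution is negative, and the three shares sum to 1.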

Data:

The data come from StatsCan (link). An Excel version of the data I used is given below:

cansim1516571592652333504

Methodology:

I used the R code below to generate the plots:

library(reshape)
library(ggplot2)
library(lubridate)
library(animation)
directory <-  "C:/Users/Business/Dropbox/Economics Research/Blog/GDP/" #set directory
pData <- read.csv(paste(directory,"cansim1516571592652333504.csv", sep=""), na.strings="x") #read in file
pData <- rename(pData, c("Ref_Date" = "Date", "GEO" = "Geography", "Value"="GDP")) #rename some of the headings
pData$Date <- as.Date(paste(pData$Date,"/01",sep="")) #convert from factor type to date type.  date type requires a day

#I found the code between the two hash lines at: http://ryouready.wordpress.com/2009/02/17/r-good-practice-adding-footnotes-to-graphics/
#it's a lovely bit of code that creates a custom function to add a footer to the plots
##############################################################################################################
source <- "Source: StatsCan"
author <- "PosNorm"
footnote <- paste(source, format(Sys.time(), "%d %b %Y"),
                  author, sep=" / ")

# default footnote is today's date, cex=.7 (size) and color
# is a kind of grey

makeFootnote <- function(footnoteText=
                         format(Sys.time(), "%d %b %Y"),
                         size= .7, color= grey(.5))
{
   require(grid)
   pushViewport(viewport())
   grid.text(label= footnoteText ,
             x = unit(1,"npc") - unit(2, "mm"),
             y= unit(2, "mm"),
             just=c("right", "bottom"),
             gp=gpar(cex= size, col=color))
   popViewport()
}
##############################################################################################################


pDataEnd <- pData[pData$Date=="2014-08-01",] #filter data for end date
pDataStart <- pData[pData$Date=="2000-01-01",] #filter data for start date

#create a new data frame for the plot
NAICS <- pDataEnd$NAICS #get the NAICS vector 
GDP.Change <- pDataEnd$GDP - pDataStart$GDP #calculate the change in GDP between start and end dates (assumes both subsets list the industries in the same row order)
pData2 <- data.frame(NAICS,GDP.Change) #combine the two new vectors into a new data frame
pData2$GDP.Change.Proportion <- pData2$GDP.Change/pData2$GDP.Change[pData2$NAICS=="All industries"] #calculate the proportion of gdp change for which each industry is responsible

#for the graph, we want the bars ordered by size; this is a three-step process:
pData2 <- pData2[order(pData2$GDP.Change),] #step one, order the dataframe by size of GDP.change, this alone won't influence the plot, we need to change how the factors are mapped
z <- as.character(pData2$NAICS)  #convert the newly ordered vector of factors to a vector of strings
pData2$NAICS <- factor(pData2$NAICS,levels=z) #redefine the factors according to the string vector.

fileName <- paste(directory,"Perc_contribution_to_GDP_growth_2000_2014.pdf", sep="")
pdf(fileName, height=8.5, width=14)
#plot a subset of data: omit "All industries" since that's the total, and eliminate "Management of companies and enterprises" since that's often NA
p <- ggplot(data = subset(pData2, pData2$NAICS!="All industries" & pData2$NAICS!="Management of companies and enterprises")) + 
  geom_bar(aes(x=NAICS, y=GDP.Change.Proportion),stat="identity", fill="orange") + #add bars
  geom_text(aes(label=NAICS,x=NAICS, y=0), color="darkblue",size=5,hjust=0) + #add labels for bars
  theme(axis.ticks = element_blank(), axis.text.y = element_blank()) + #remove ticks and bar labels 
  labs(x=NULL,y=NULL)+  #removes x,y axis labels
  ggtitle("Percent contribution to Canadian GDP growth from 2000/01/01 to 2014/08/31") +  #add a title to the plot
  #theme(plot.margin=unit(c(0,0,0,0),"cm")) +
  coord_flip() #change the coordinates so bars are horizontal
print(p)
makeFootnote(footnote)
dev.off()

#we will create the gif by creating a bar graph for each date.
dateVector <-unique(pData$Date) #first, extract all the dates
dateVector <- sort(dateVector)  #second, make sure the dates are in the correct order

barplotGIF2 <- function(t) { #create a function that will create a barplot for a given date t
  pData2 <- pData[pData$Date==t,] #filter data for date t
  pData2$GDP <- pData2$GDP/1000 #divide by 1000 so the values match the "billions" in the graph title
  #for the graph, we want the bars ordered by size; this is a three-step process:
  pData2 <- pData2[order(pData2$GDP),] #step one, order the dataframe by size of GDP, this alone won't influence the plot, we need to change how the factors are mapped
  z <- as.character(pData2$NAICS)  #convert the newly ordered vector of factors to a vector of strings
  pData2$NAICS <- factor(pData2$NAICS,levels=z) #redefine the factors according to the string vector.

  graphTitle <- paste("Canadian GDP by NAICS Industry:", t, "(billions of constant 2007 dollars)", sep=" ") #generates string for graph title
  p <- ggplot(data=subset(pData2, pData2$NAICS!="All industries" & pData2$NAICS!="Management of companies and enterprises")) +  #create a ggplot object with data for date t
    geom_bar(aes(x=NAICS, y=GDP),stat="identity", fill="orange") + #add bars
    geom_text(aes(label=NAICS,x=NAICS, y=0), color="darkblue",size=5,hjust=0) + #add labels for bars
    theme(axis.title.x = element_blank()) +   #removes x axis title
    theme(axis.title.y = element_blank()) +    #removes y axis title
    theme(axis.ticks = element_blank(), axis.text.y = element_blank()) + #remove ticks and labels
    ggtitle(graphTitle) + #add the graph title
    scale_y_continuous(limits=c(0,225)) +  #specify axis range so scales don't vary graph-to-graph
    coord_flip() #flip the coordinates, this changes the plot from a vertical bar plot to a horizontal bar plot
  print(p)
  makeFootnote(footnote)  #adds the footnote through the above custom function
}

#this is the line that creates the GIF
#lapply takes the dates, and applies them to our barplotGIF2 function and returns a list
#interval = 0.1 sets a 0.1 second delay between plots in the GIF
#movie.name gives the name of the file. NOTE: I found it very difficult to specify the directory path without changing the working directory, so I left it as is
#the R output shows the location of the files, usually stored in a temporary folder.
#ani.width and ani.height specify the dimensions of the individual png files
saveGIF(lapply(dateVector,barplotGIF2), interval = 0.1, movie.name = "test.gif", ani.width = 1280, ani.height =720)
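One fragile spot in the code above is the subtraction pDataEnd$GDP - pDataStart$GDP, which only gives the right answer if both subsets list the industries in the same row order. A possible alternative, sketched below on made-up data, joins the two dates on NAICS with merge() and sets the bar order with reorder() in a single step. The column names follow the renamed pData; the toy values and the names toy, startData, endData and both are assumptions for illustration, not part of the original analysis.

```r
# Toy data in the same shape as pData after renaming (values are made up)
toy <- data.frame(
  Date  = as.Date(c("2000-01-01", "2000-01-01", "2000-01-01",
                    "2014-08-01", "2014-08-01", "2014-08-01")),
  NAICS = c("All industries", "Manufacturing", "Construction",
            "Construction", "All industries", "Manufacturing"), # end rows deliberately shuffled
  GDP   = c(300, 200, 100, 150, 340, 190)
)

startData <- toy[toy$Date == as.Date("2000-01-01"), c("NAICS", "GDP")]
endData   <- toy[toy$Date == as.Date("2014-08-01"), c("NAICS", "GDP")]

# Join on NAICS so each industry is matched to itself regardless of row order
both <- merge(startData, endData, by = "NAICS", suffixes = c(".start", ".end"))
both$GDP.Change <- both$GDP.end - both$GDP.start

# Divide by the change in the "All industries" row, as in the code above
total <- both$GDP.Change[both$NAICS == "All industries"]
both$GDP.Change.Proportion <- both$GDP.Change / total

# reorder() sets the factor levels by GDP.Change, replacing the three-step ordering
both$NAICS <- reorder(factor(both$NAICS), both$GDP.Change)

print(both)
print(levels(both$NAICS))
```

The resulting data frame can be passed to ggplot in the same way as pData2, and the bars will come out ordered by GDP.Change.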



One thought on “Canada’s largest industry: Real estate rental and leasing?”

  1. Just found this again after looking for months and months. Thank you so much, this is so incredibly helpful for teaching my students. Is there any chance you can update the data, or should I just go and get a copy of R?
