Correlations of provincial house prices in Canada 1990-2014

A common opinion that I come across is that there is no such thing as a “Canadian housing market”; instead, the various housing markets across the country are said to behave independently of one another.  I used the CMHC web crawler to gather data on annual median provincial prices of absorbed houses from 1990 to 2014.  I then used the corrgram package in R to measure the degree of linear association between house prices in each pair of provinces, along with the 95% confidence intervals.  The results show a very high correlation across all provinces, suggesting to me that there is in fact a “Canadian housing market”.
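
As a rough illustration of the kind of calculation involved, here is a minimal Python sketch of a Pearson correlation with an approximate 95% confidence interval based on the Fisher z-transform. It is not the corrgram package's code, and the two price series are made-up numbers for two hypothetical provinces, not CMHC data. The function names `pearson_r` and `fisher_ci` are my own choices for this example.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transform."""
    z = math.atanh(r)                  # map r onto an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)        # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Made-up annual median prices (in $1,000s) for two hypothetical provinces.
# These are NOT CMHC figures; they only exercise the functions above.
province_a = [150, 155, 162, 170, 181, 195, 210, 228]
province_b = [120, 123, 130, 134, 146, 155, 169, 180]

r = pearson_r(province_a, province_b)
low, high = fisher_ci(r, len(province_a))
print(f"r = {r:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

With only a handful of annual observations per province, the interval can be fairly wide even when the point estimate is close to 1, which is why it is worth reporting alongside the correlation itself.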

Continue reading

Web crawling with Python: Part 2, Navigation

In the last post, I described how to get set up with Python, Scrapy, Selenium, and Firebug in order to start programming web crawlers.  In this post I will describe how to program a Scrapy web crawler to navigate the CMHC website and locate the data to be retrieved.  In the next post, I will show how to scrape and store the website data.

Start by opening a terminal window and navigating to the directory where you want to store the web crawler. Enter the following command in the terminal:

Continue reading

Web crawling with Python: Part 1, Setup

It has been a while since my last post. I have been working on an app for the past few months, which consumed all the behind-a-screen time I could muster, but now it’s time to get back to things.

In addition to the R graphs that I usually do, I will be writing more about data mining. If you can get your data from StatsCan, then you’re probably good to go, since you can customize a lot of their reports and there are several download formats to choose from. A lot of data is not so easy to obtain, and in the past an analyst would manually copy and paste data into Excel sheets or, if they were lucky, use a web query from their spreadsheet to link to the data. Now there are far superior options available, which not only help retrieve data but also open up a whole new world of perspective.

Continue reading