Predicting Real Estate Market Demand
- Ronald Daley

- Nov 20, 2019
I am an aspiring Real Estate Investor with limited funds to purchase tools that will be useful to me in my investing journey. Like most Investors, I want to be as confident as possible that my investment property is secure and that I am investing in a market that is in high demand. Investing in a market that is in high demand will add an extra layer of security to my investment. As a new investor, my goal is to find an emerging market to begin my investing journey. Finding a market in high demand will serve as valuable input into finding those emerging markets and lowering the risk on my investment.
For this project, I defined an emerging market as a market with high demand. I understand that other factors also make up an emerging market, but for the purposes of this project, demand was my only metric. I will explore the other factors (e.g., supply, population, crime, local economics) in another project. For my scope, I decided to focus on Illinois because it was the market I was most familiar with and I could use my knowledge to check the validity of my model.
My goal for this project was to create a classification model that predicts real estate demand for each identified Illinois market. I wanted to design a model that could help new Investors, like myself, identify great areas to purchase investment properties in Illinois.

Data
For my analysis, I utilized real estate inventory data from Realtor.com, which is used for identifying market trends and provides monthly statistics on for-sale listings. I leveraged Amazon Web Services (AWS) to store 6 years of inventory data in the cloud. The data contained over 1 million observations and included inventory data for most markets in the United States. There were over 30 market features in the dataset. These features included information such as supply, demand, total (active and pending) listing count, and average listing price.
Additionally, each market is identified by zip code (e.g., 60422 is the Flossmoor, Illinois market). To determine demand in a market, I leveraged the market's month over month days on market (DOM). A market with positive demand is a market with a decrease in month over month DOM. A market with negative demand is a market with an increase in month over month DOM.
A bit confusing, I get it. So let me walk you through my assumptions. An increase in month over month days on market means the average time that a property spends on the market has increased. For example, say that in July the average time a property is on the market in 60422 is 10 days. Then in August, the average time a property is on the market in 60422 is 12 days. For the purposes of my project, this means that properties are not selling as quickly and interest in that market is down. Which is a bad thing. But the market's month over month DOM has increased. This may seem like a good thing, but for the purposes of my project it is not. Negative demand is a market with an increase in month over month DOM and positive demand is a market with a decrease in month over month DOM.
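To make the labeling rule above concrete, here is a minimal sketch in plain Python. It is not the code from my project: the function name `label_demand` and the sample DOM values are made up for illustration, and treating an unchanged DOM as "not positive demand" is an assumption, since the rule above only covers increases and decreases.

```python
from typing import List, Optional


def label_demand(dom_by_month: List[float]) -> List[Optional[int]]:
    """Label each month 1 (positive demand) if DOM fell versus the prior month, else 0.

    The first month has no prior month to compare against, so it is labeled None.
    Assumption: an unchanged DOM is labeled 0 (not positive demand).
    """
    labels: List[Optional[int]] = [None]
    for prev, curr in zip(dom_by_month, dom_by_month[1:]):
        labels.append(1 if curr < prev else 0)
    return labels


# Hypothetical average DOM for one zip code in July, August, and September.
dom_60422 = [10, 12, 9]
print(label_demand(dom_60422))  # [None, 0, 1]
```

August is labeled 0 because DOM rose from 10 to 12 days (negative demand), and September is labeled 1 because DOM fell from 12 to 9 days (positive demand).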
Modeling Process
I began my analysis by querying all of the real estate inventory data that I stored on AWS in a PostgreSQL database. Afterwards, I used Python libraries (pandas and NumPy) to wrangle and filter the data down to Illinois only. Because I was cleaning so much complex data, this task took me a bit longer than expected. After playing around in Python for a few days and constantly banging my head against my computer screen out of pure frustration, I was confident my data was clean and I was able to start modeling.
The goal of my model was to use the Illinois real estate market inventory data to predict the monthly demand for identified markets in Illinois. I fitted several classification models to my inventory data. The models included the Dummy Classifier, K-Nearest Neighbors (KNN), Logistic Regression, Gaussian Naive Bayes, Support Vector Machines (SVM), Decision Tree, and Random Forest. The Logistic Regression model performed best, with the highest ROC AUC score. Because the classes in my dependent variable were balanced, I used accuracy to evaluate the Logistic Regression model's performance. The model had an 87% accuracy score. In other words, my model's predicted demand class matched the actual class 87% of the time.
Additionally, I was able to identify the most important features when predicting market demand. The features are average listing price, total listing count, active listing count, new listing count, and price decrease count. According to the model, changes in these features are associated with changes in a market's predicted demand.
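For readers who are new to these two metrics, here is a small sketch of how accuracy and ROC AUC can be computed by hand. This is only an illustration in plain Python, not my project code: the labels and scores below are invented, the 0.5 cutoff is an assumption, and in practice a library such as scikit-learn would do this for you.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Share of predictions that match the actual class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Chance that a random positive example gets a higher score than a
    random negative example (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = positive demand, 0 = negative demand.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]  # predicted probability of class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 cutoff

print(accuracy(y_true, y_pred))  # 0.75
print(roc_auc(y_true, scores))   # 0.9375
```

Accuracy depends on the cutoff used to turn probabilities into classes, while ROC AUC only looks at how well the scores rank positives above negatives. That is why the two numbers can differ for the same model.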
Here's a link to my GitHub page with my Python code and results.
So, Where Should I Invest?

I was able to predict demand for all of the specified markets (by zip code) in Illinois. The results are displayed in Fig. 1. The demand is displayed as a green to red gradient overlay for each market (by zip code). The stronger the green, the higher the demand, and the stronger the red, the lower the demand. This is a great visual for quickly identifying which markets are good, bad, or just okay to invest in as a real estate investor.
I was also able to identify the markets with the highest predicted probability of high demand. Fig. 2 lists my recommendations for 5 neighborhoods in Illinois that are likely to have high demand. As shown by the blue circle in Fig. 2, the markets in and around the city of Chicago are the markets in highest demand.
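Picking the top markets comes down to sorting zip codes by their predicted probability of high demand. Here is a minimal sketch of that step, assuming the predictions are held in a dictionary. The function name `top_markets` is hypothetical, and the zip codes and probabilities are placeholders, not my actual results.

```python
from typing import Dict, List, Tuple


def top_markets(prob_by_zip: Dict[str, float], n: int = 5) -> List[Tuple[str, float]]:
    """Return the n zip codes with the highest predicted probability of high demand."""
    ranked = sorted(prob_by_zip.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Placeholder predictions; the keys stand in for real zip codes.
predictions = {
    "zip_A": 0.62,
    "zip_B": 0.91,
    "zip_C": 0.48,
    "zip_D": 0.87,
    "zip_E": 0.73,
    "zip_F": 0.55,
}

for zip_code, prob in top_markets(predictions, n=3):
    print(zip_code, prob)
```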

If you're an investor with a larger budget, those are areas that are predicted to be in high demand and they are worth further investigation. The average listing price for properties in the blue circle tends to be higher than what most new Investors can afford, so I decided to explore the "path of progress" outside the city of Chicago.
My recommendation is that new Investors who can't afford properties at such a high price point explore the markets that are northwest of Chicago and near the Wisconsin/Illinois border. Those markets have a greater probability of being in high demand than markets that are west or south of Chicago. They are worth investigating further because they are relatively close to the city of Chicago and have lower average listing prices. These markets can create affordable opportunities for new Investors, like myself, to invest in Illinois.
RAD Guy

