Automated Valuation Models for Commercial Real Estate in the Netherlands

  • Author:
  • Year: 2018

Purpose – The purpose of this thesis is to investigate the potential of Automated Valuation Models (AVMs) for estimating the market value of individual Commercial Real Estate properties in the Netherlands.

Design/methodology/approach – Using a unique, complete dataset of 979 office property transactions from 2010 through 2018, obtained from Cushman & Wakefield and enriched with information about building, location, lease and market factors, we study several methodologies that previous literature has shown to offer excellent explainability, reliability and predictability. These are the traditional Hedonic Price Model and the newer tree-based Machine Learning algorithms: Random Forest and (Extreme) Gradient Boosting. In addition, we introduce a new methodology named Comparable Weighted Regression (CWR), which extends the Hedonic Price Model to allow for spatial and temporal dependencies by weighting observations according to their degree of comparability to the subject property. Using a variety of error measures and cross-validation techniques, we investigate which methodology not only provides the lowest Mean Absolute Percentage Error (MAPE) but also minimizes the number of large errors, as these are especially unwanted in practice.
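
To illustrate the weighting idea behind CWR, the following is a minimal sketch and not the thesis's actual specification. It assumes a Gaussian kernel over geographic distance and transaction-date gap as the comparability measure; the function name `cwr_predict`, the bandwidths `h_space` and `h_time`, and the use of log value as the target are all hypothetical choices made for the example.

```python
import numpy as np

def cwr_predict(X, y, coords, times, x0, coord0, time0, h_space=5.0, h_time=2.0):
    """Predict the (log) value of one subject property with a locally weighted hedonic regression.

    X, y      : hedonic features and (log) prices of the comparable transactions
    coords    : transaction locations, e.g. in km; times: transaction dates in years
    x0, coord0, time0 : features, location and valuation date of the subject property
    h_space, h_time   : hypothetical kernel bandwidths (assumptions, not from the thesis)
    """
    # Comparability weights: closer in space and time means more comparable (assumed Gaussian kernel)
    d = np.linalg.norm(coords - coord0, axis=1)
    t = np.abs(times - time0)
    w = np.exp(-0.5 * (d / h_space) ** 2) * np.exp(-0.5 * (t / h_time) ** 2)

    # Weighted least squares: scale each row by sqrt(weight), then solve ordinary least squares
    A = np.column_stack([np.ones(len(X)), X])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)

    # Evaluate the fitted hedonic equation at the subject property
    return float(np.concatenate([[1.0], x0]) @ beta)
```

Because a separate regression is fitted for each subject property, the coefficients can vary over space and time while every individual fit remains an interpretable hedonic equation.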

Findings – The first hypothesis of this thesis addresses the importance of lease-related factors in predicting the market value of Commercial Real Estate properties. We find through Leave-One-Out Cross-Validation that the MAPE of the Baseline Hedonic regression model improves from 45.8 to 22.8 percent when lease factors are included. The second hypothesis investigates whether the prediction accuracy of the Hedonic model improves when we incorporate spatial-temporal dependencies. We find that the MAPE decreases to 19.3 percent, the best among the methodologies, while the number of large errors is minimized. The third and last hypothesis studies whether a well-defined Hedonic regression model can outperform newer Machine Learning algorithms that have gained popularity in recent years in both academia and practice. We find that the tuned (Extreme) Gradient Boosting outperforms the Random Forest algorithm with a MAPE of 21.6 percent, but still performs worse than both the traditional Hedonic model and Comparable Weighted Regression.
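
For readers unfamiliar with the evaluation, a Leave-One-Out MAPE for a hedonic regression can be computed roughly as sketched below. This is an illustrative sketch under stated assumptions: the regression is taken to be ordinary least squares on log prices, and the 50 percent cut-off used to flag a "large" error is hypothetical. Neither detail is given in this abstract.

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

def large_error_share(actual, predicted, threshold=0.5):
    """Share of predictions whose absolute percentage error exceeds `threshold` (hypothetical cut-off)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual) > threshold))

def loocv_mape(X, log_y):
    """Leave-One-Out MAPE of an OLS hedonic model fitted on log prices (assumed target)."""
    n = len(log_y)
    A = np.column_stack([np.ones(n), X])
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # hold out transaction i
        beta, *_ = np.linalg.lstsq(A[keep], log_y[keep], rcond=None)
        preds[i] = A[i] @ beta            # predict the held-out transaction
    # Errors are measured on the price scale, not the log scale
    return mape(np.exp(log_y), np.exp(preds))
```

Each transaction is predicted by a model that never saw it, so the resulting MAPE reflects out-of-sample accuracy rather than in-sample fit.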

Practical implications – As we find strong evidence that AVMs applied to the Commercial Real Estate sector benefit from including lease-related factors in their model specification, such data should be gathered more extensively. Furthermore, data-driven Machine Learning techniques seem to have difficulty finding the underlying patterns in the data because relatively few transactions take place in this sector. As the estimates of these ‘black-box’ techniques are also more difficult to communicate and defend, traditional regression methodologies seem to fit the purpose of this thesis better than Machine Learning techniques. However, with a best MAPE of 19.3 percent against an average error of 10 percent for manual appraisals, the methodology and data still have a long way to go before practical application. AVMs that combine the best of both worlds, such as the CWR, are likely to be the key to success.

Originality/value – The discussion of whether Automated Valuation Models will disrupt the market or let it evolve is more relevant than ever. Surprisingly, literature that investigates the potential of such models for the Commercial Real Estate sector is practically non-existent. This thesis is a first study to compare traditional Hedonic Regression with Machine Learning techniques in this sector. In addition, we propose a new methodological framework, the CWR, that aims to counter some of the issues of traditional regression for the task at hand.