“I know a good comp when I see it! I have good judgment. I know my market. And I’ve been doing this for a long time.” (Oh yeah – I’m also pretty smart.)

What do we learn to get an appraiser license?  We are told that a comparable is competitive to the subject.  Then, a property is competitive if it is similar to the subject.  Then, a property is similar “if it can be compared” to the subject.  Got it?

And be sure to be credible while you are at it.

So, how can this be done better? It turns out there is a way. It is said that data scientists spend 80% of their time selecting and cleaning their data.  There are several data-analytic (algorithmic) ways to categorize data.  Yes, we can rigorously define “what is a comp?”

Many consider that there are three basic econometric methods:  regression, classification, and similarity matching.

  • Regression treats one variable as depending on one or more others, and measures that relationship. (By itself, it does not prove that one causes the other.)
  • Classification puts things into categories, such as “comparable” or “not comparable.”
  • Similarity matching identifies similar individuals based on what is known about them.

As it turns out, the question “What is a comp?” may be answered with each of the three basic tools above. In each case, though, it is a specialized form of the tool, and the three can overlap.

For regression, it is logistic regression that works to put things into groups: “is it a comp, or not a comp?” The algorithm estimates the probability that a property belongs to a group, so it can be used for classification. In the Valuemetrics.info classes, we distinguish between simple regression (one predictor variable) and multiple regression (many predictor variables). Each has critical assumptions and uses. I had one class at SDSU dedicated entirely to what the professor called ‘choice’ regression, where the desired decision was to put something into one group or another.
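To make the “comp, or not a comp” idea concrete, here is a minimal sketch of logistic regression in plain Python. Everything in it is an assumption for illustration: the two features (difference in living area and distance from the subject), the made-up sales, and the hand-assigned labels are hypothetical, and the simple gradient-descent fit stands in for what a statistics package would do.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function: maps any number to (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def comp_probability(x, weights, bias):
    # Estimated probability that a sale with features x is a comp.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(rows, labels, lr=0.1, epochs=10000):
    # Plain batch gradient descent on the average log-loss.
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = comp_probability(x, weights, bias) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical training data (not real sales). Each row is
# (living-area difference from subject in 100s of sq ft, distance in miles).
sales = [(0.5, 0.2), (1.0, 0.5), (1.5, 0.3), (2.0, 1.0),
         (5.0, 2.5), (6.0, 3.0), (4.5, 4.0), (7.0, 1.5)]
# Hypothetical labels: 1 = an analyst judged it a comp, 0 = not a comp.
is_comp = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(sales, is_comp)

near_sale = (0.8, 0.3)   # close in size and location
far_sale = (6.0, 3.0)    # much larger and farther away
print(round(comp_probability(near_sale, weights, bias), 3))
print(round(comp_probability(far_sale, weights, bias), 3))
```

The output is a probability between 0 and 1 for each sale, not a yes/no answer. Where to draw the cutoff is still the analyst's judgment call.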

For classification, there are several methods, including K-nearest neighbors, classification/regression trees, and cluster analysis. I have found that cluster analysis is usable for asset analysts. We first presented cluster analysis at the SGDS2 class in Detroit last year. It is straightforward, and the algorithm is easily reproducible in programs such as R (open source), which have much greater flexibility and power than spreadsheets and canned forms software.
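As a rough illustration of cluster analysis, here is a small k-means sketch. It is written in Python rather than R, and it is not the class material: the sales, the two features, and the choice of two clusters are all hypothetical. The idea is simply that sales landing in the same cluster as the subject are candidate comps.

```python
import math

def kmeans(points, k, iterations=20):
    # Deterministic start: the first point, then the point farthest
    # from the centroids chosen so far (avoids random seeds).
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return labels, centroids

# Hypothetical data: (living area in 100s of sq ft, age in years).
# In practice the features should be put on a common scale first,
# so that no single feature dominates the distance.
subject = (15, 10)
sales = [(14, 10), (15, 12), (16, 9), (15, 11),
         (30, 40), (32, 38), (29, 45), (31, 42)]

labels, centroids = kmeans([subject] + sales, k=2)

# Sales that share the subject's cluster are candidate comps.
candidate_comps = [s for s, lab in zip(sales, labels[1:])
                   if lab == labels[0]]
print(candidate_comps)
```

Because the starting centroids are chosen deterministically, running the same data through the same steps gives the same groups every time, which is the reproducibility point made above.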

For similarity matching, the objective is to find similar cases, like finding properties which would have been directly or indirectly considered by buyers as of the date of value. The goal of similarity scoring is an algorithm that gives a numerical score.
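One generic way to get such a numerical score is a weighted, range-scaled distance turned into a 0 to 100 number. The sketch below assumes that approach; it is not the method taught in the classes, and the feature names, market ranges, weights, and sales are all hypothetical.

```python
def similarity_score(subject, sale, ranges, weights):
    # Score from 0 (very different) to 100 (identical on every feature).
    total_weight = sum(weights.values())
    penalty = 0.0
    for name, weight in weights.items():
        # Scale each difference by the market range so features are comparable,
        # and cap it at 1.0 so one extreme feature cannot exceed its weight.
        diff = abs(subject[name] - sale[name]) / ranges[name]
        penalty += weight * min(diff, 1.0)
    return 100.0 * (1.0 - penalty / total_weight)

# Hypothetical subject, market ranges, and weights (assumptions, not market data).
subject = {"living_area": 1500, "age": 10, "lot_size": 6000}
ranges = {"living_area": 2000, "age": 50, "lot_size": 10000}
weights = {"living_area": 0.5, "age": 0.2, "lot_size": 0.3}

sales = {
    "A": {"living_area": 1550, "age": 12, "lot_size": 6200},
    "B": {"living_area": 2400, "age": 35, "lot_size": 9000},
}

scores = {name: similarity_score(subject, sale, ranges, weights)
          for name, sale in sales.items()}

# Rank sales from most similar to least similar.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The weights are where market knowledge enters: they say which elements of comparison buyers care about most. The score then gives a gradient of similarity rather than a flat “comp or not” answer.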

In the Stats, Graphs, and Data Science 1 class, we provide hands-on methods for valuers and asset analysts to measure similarity in a practical, day-to-day manner.  The method, touted as the DSI (Dell Similarity Index©), reduces the elements of comparison into groups useful for measuring a gradient of similarity. It does not measure risk, but does provide a numerical input into user analytics for risk analysis, portfolio management, and forecasting.

This would be an ideal tool for the GSEs and government agencies to use in upcoming residential dashboard ‘forms.’ It is simple to use, simple to understand, and immediately available.

The objective of Valuemetrics.info is to provide clients with analysts competent to fulfill those needs.