How Many Comps Are Enough?

Quiz:  How many comps are “just right?”

First we must ask:  What is a comparable?

We can take two attitudes:  1) “I know a good comp when I see it!” or 2) “What does our accepted literature say?”  The Appraisal of Real Estate (TARE) says . . .

“The data used for comparison … should come from properties that are similar…” (TARE, p.121)

“The comparable properties selected for analysis should be similar to the subject property in terms of zoning and other characteristics.” (TARE, p.106)

“A good comparable sale is a competitive alternative …” (p.121)

So . . .  What is “competitive?”  The sentence continues:

“… a competitive alternative – i.e., a property that the buyer of the subject would also consider.”

Comparable = Similar = Competitive =  Comparable

Fuzzy and circular, but it is all we have to work with.  For now, let’s say we need a cutoff:  Is it a comp, or not a comp?  Yes or no.  (In the Valuemetrics.Info SGDS1 workshop, we use the much better, five-dimension method!)

So:  How many comparable-competitive-similar-comparables are best?  Is it three because that fits on a standard sheet of paper?  Should it be six, because that fits well sideways on a spreadsheet?  Or should it be the whole city or neighborhood?

More questions:  Are two comps better than one?  Are four comps better than two?   What about 8 instead of 4, or 32 or 63?  How about 12,448 comps?  Is there an optimum amount of data?

Data science says yes . . .  It involves a famous trade-off, called the bias-variance tradeoff.  For appraisers, subjective bias may come from picking just three comps.  There is always some uncertainty in each of the sale prices and some uncertainty in the reported measurements, so the result will always be too high or too low, just from whichever three are picked.  This is analytic bias (not appraiser bias, of course).

Variance also comes from randomness, but this randomness is noise error.  As more data is added, some of it may be irrelevant, and simply not correlated with prices at all.  Adding this ‘excess’ data simply increases the noise and makes the result less certain.  Too little data creates bias.  Too much data creates noise.  Either too much or too little is less than optimum.  This is the technical explanation.

I found this – a five-year-old’s explanation:  A person with high bias is someone who starts to answer before you can even finish asking.  A person with high variance is someone who can think of all sorts of crazy answers.
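For readers who like to see the trade-off in numbers, here is a minimal simulation sketch.  Everything in it is an assumption made up for illustration:  a toy market of 60 sales, living area as the only measure of similarity, a price of $150 per square foot, and random noise in every sale price.  It is not a valuation method and uses no real data.  It simply picks the k most similar sales, averages their prices, and reports how far off that average tends to be for each comp count k.

```python
import random


def simulate(n_sales=60, n_trials=2000, seed=1):
    """Return typical error (RMSE) of a k-comp average, for k = 1..n_sales.

    All numbers are hypothetical: a toy market where price depends only
    on living area plus random noise in each sale price.
    """
    rng = random.Random(seed)
    subject_area = 1200.0      # subject property, square feet (assumed)
    price_per_sqft = 150.0     # assumed "true" market rate
    noise_sd = 25000.0         # assumed random noise in each sale price
    true_value = price_per_sqft * subject_area

    squared_error = [0.0] * n_sales
    for _ in range(n_trials):
        # Build one simulated market of sales.
        sales = []
        for _ in range(n_sales):
            area = rng.uniform(1000.0, 3000.0)
            price = price_per_sqft * area + rng.gauss(0.0, noise_sd)
            sales.append((abs(area - subject_area), price))

        # Most similar sales first (smallest difference in living area).
        sales.sort()

        # Estimate value as the plain average of the k most similar sales.
        running_total = 0.0
        for k, (_, price) in enumerate(sales, start=1):
            running_total += price
            estimate = running_total / k
            squared_error[k - 1] += (estimate - true_value) ** 2

    return [(total / n_trials) ** 0.5 for total in squared_error]


rmse = simulate()
best_k = min(range(len(rmse)), key=rmse.__getitem__) + 1

for k in sorted({1, 3, 6, best_k, 30, 60}):
    print(f"{k:>2} comps: typical error about ${rmse[k - 1]:,.0f}")
print(f"Lowest typical error in this toy market: {best_k} comps")
```

In this toy setup the typical error falls at first as comps are added, then climbs again once the added sales are no longer similar to the subject.  The exact ‘best’ count depends entirely on the made-up numbers above, so read it as a picture of the idea, not as a rule.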

Back to our question:

So what is the ‘sweet spot’ for the number of comparables to use?  We need to define one more term closely:  Information is data made useful.  Data becomes useful information in two ways:  through selection/summarization or estimation/prediction.

In earlier appraisal history, the cost and effort of identifying and confirming more than a handful of comps were a reasonable factor.  “The selection of comparables is directed to some extent by the availability of data.”  (TARE p.121)

Today (in most appraisal situations), we can get all or substantially all the sales in an instant.  What is different today is how we go about verifying our data:  confirming, cleaning, and preparing.  We attend to outliers as much as the ‘similars.’  The ideal data set is the complete CMS, the Competitive Market Segment.  This is the first major goal of evidence-based valuation:  get the right data set, the CMS.