We asked about comp selection in a “big data” world.  Why do we still use three comps on a form, or six comps on a spreadsheet?

What is the right data? 

It has been claimed that three comparables were used because they fit on an 8 ½” by 11” sheet of paper.  For commercial work, especially after the arrival of the electronic spreadsheet and sideways (landscape) printing, six or seven comparables could fit nicely on one sheet.

When I became an appraiser, my trainer liked to collect data “the hard way,” which meant my making calls and investigating sales.  It was not possible to collect all the competitive sales.  And the data was poor.  The best effort went into confirming sales and talking to people connected with the transaction.  The deeper look at a few comparables created better confidence, and a richer understanding of the market.

Appraisers owned the data.  We went to chapter meetings to connect and to exchange comps.  We owned the data.  Then things changed.  First came MLS books, then CompData income property sales, and even some income/rent information – all in print.  Then came technology: something called a ‘modem.’  It went beepity-beep at 60 bps.  It was cuddly and cute and helpful.  Why, in less than 20 minutes, you could pull down six or eight sale listings, confirm them, and turn the file over to the typist.  In a day or so, you got a draft, edited it with a red pencil, and got the final draft typed.

Confirmation provided a richer understanding, and a more confident justification, explanation, and support of the opinion – an “estimate.”  To some extent, the deeper knowledge replaced the wider data set.

So what is the size of the ideal data set?

We have two conflicting ideas:  more information is better, and garbage data is worse.

More information is always better than less information.  Can we simply use more comps?  When do we stop?  The answer is simple.  Any sale competing directly with the subject – exposed to the same market – can be called competitive, similar, and comparable.

It is not three, not six, or any given number.  The ideal data set includes every sale that was on the market at the same time the subject would have been on the market.  If we use fewer, we are discarding information.  If we use more, we may be adding garbage data.
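As a rough illustration of that rule, here is a minimal Python sketch.  It rests on assumptions the discussion above does not spell out: that “on the market at the same time” can be read as a sale’s listing-to-closing period overlapping a hypothetical exposure window for the subject, and that the input list has already been limited to sales that compete directly with the subject.  The `Sale` fields, the function names, and the dates are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sale:
    sale_id: str
    list_date: date   # assumed: first day the property was exposed to the market
    close_date: date  # assumed: day the sale closed


def on_market_with_subject(sale, window_start, window_end):
    """True if the sale's exposure period overlaps the subject's window."""
    return sale.list_date <= window_end and sale.close_date >= window_start


def ideal_data_set(sales, window_start, window_end):
    """Keep every competing sale that overlaps the window, however many there are."""
    return [s for s in sales if on_market_with_subject(s, window_start, window_end)]


# Hypothetical competing sales; none of these dates come from real data.
sales = [
    Sale("A", date(2023, 1, 10), date(2023, 3, 1)),
    Sale("B", date(2023, 4, 15), date(2023, 6, 20)),
    Sale("C", date(2023, 6, 1), date(2023, 8, 5)),
    Sale("D", date(2023, 9, 1), date(2023, 10, 15)),
]

# Assumed exposure window for the subject: April 1 through July 31, 2023.
selected = ideal_data_set(sales, date(2023, 4, 1), date(2023, 7, 31))
print([s.sale_id for s in selected])  # ['B', 'C']
```

The number of comps is not chosen in advance; it falls out of the window.  Here it happens to be two, and in a busier market it could be two hundred.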

Once again, Baby Bear got it “just right.”  Mama Bear used the whole neighborhood, and it was “too much.”  Papa Bear used just three, and it was “too few.”  In future posts, we will consider the huge importance of getting it just right . . . and the optimal data set.  How do we objectively determine the ideal data set?

This is the foundation stone of Evidence Based Valuation.