# My Regression is Better Than Your Regression.

## If regression is just a math formula, how can anyone claim “Ours is better than theirs”?

Who do we believe? The reality is that the standard least-squares formula is just the inner algorithm. There is the input, the magical regression algorithm, and the output.

Four things are useful for an appraiser to consider:

- The analyst’s decision that regression is the right *model* for the problem;
- The analyst’s decision on what sales data is put into the model;
- The analyst’s decision on what predictors (features) go into the data set;
- The analyst’s ability to properly explain the result.

What’s important to me as a user of appraisal software? How that particular software helps me in each of these four parts of my job. If only the magical software produced what was promised. Push the button, and *voilà!* Another trouble-free fee on the way.

If I could just push the button, get the instant answer, then deliver it to the client, my job could be so easy. I could be rich! So what’s the problem? There’s an inner contradiction here.

The problem is that if I can push the magic “analyze” button, so can my clients. So can other appraisers. So can other competitors — like accountants, economists, AVMs, BPOs, and other unlicensed ‘evaluators.’

Pretty soon I push the magical button, and it stops delivering my magical appraiser fee. Or – clients begin to ask for more. Like why did I push *that* button? What data did I put in? What data did I leave out? Why did I fail to consider the lack of sufficient value? Please explain. Please.

**Why is this happening to us?**

We hear how wonderful it is to regress, but no one is happy. Could it be one of the four “things”? For now, let’s look at just the first thing – whether the regression formula is the right algorithm for this problem. Is it the right model, the right solution? The assumption is that we are all talking about the same formula – the *minimized least squares* formula. I am not.

So, there are several different regression formulas. The least squares formula was popularized by a couple of things: First, doing math with paper and pencil or even by hand with a calculator was hard and slow. (Still is!) The formula had to be ‘tractable’ as they said. They admitted that squaring, summing, then un-squaring was rigid and often a poor model. But it was tractable. Next, after the accountant’s spreadsheet became popular, add-ons were added on. Some of them included statistics. The “statistics” regurgitated whatever the programmers learned in Stats 101. They included descriptive statistics and the inferential ‘sample’ statistics. Regression came with the graphs. It was always least squares, because that was all they knew, and the 8086 CPU was not powerful enough to do other regression types.

But today, there are choices of regression types: simple, multiple, multivariate, logistic, quantile, seam, polynomial, stepwise, ridge, lasso, and Bayesian, among others. Each does well for different purposes. Each is a good model solution for a particular data and problem type. It’s the model, not the math, that’s important. The appraiser models the model.
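To make the point concrete, here is a minimal sketch comparing two of these regression types on the same small data set: ordinary least squares and median regression (quantile regression at the 50th percentile). Everything in it is an assumption for illustration: the sales figures are made up, the function names `ols_fit` and `median_fit` are hypothetical, and the median fit uses a brute-force search over pairs of sales that is only practical for tiny data sets.

```python
from itertools import combinations

# Hypothetical sales: living area (sq ft) and price ($ thousands).
# The first six follow price = 100 + 0.2 * area; the last is an outlier.
areas = [1000, 1200, 1400, 1600, 1800, 2000, 2200]
prices = [300, 340, 380, 420, 460, 500, 900]


def ols_fit(xs, ys):
    """Ordinary least squares line: minimizes the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def median_fit(xs, ys):
    """Median regression line: minimizes the sum of absolute residuals.

    For a single predictor, some best line passes through two of the
    data points, so this sketch simply tries every pair.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line cannot be a prediction line
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, slope, intercept)
    return best[1], best[2]


ols_slope, ols_intercept = ols_fit(areas, prices)
med_slope, med_intercept = median_fit(areas, prices)

print(f"Least squares: price = {ols_intercept:.1f} + {ols_slope:.3f} * area")
print(f"Median fit:    price = {med_intercept:.1f} + {med_slope:.3f} * area")
```

On this made-up data, the single unusual sale pulls the least-squares slope up to roughly 0.39 per square foot, while the median fit stays at about 0.20, the rate the other six sales follow. Neither answer is "the" right one. Which line better describes the market is exactly the kind of modeling decision the analyst has to make.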

Appraisers will be useful so long as modeling decisions need to be made.

Appraisers will be useful so long as algorithms and tools need to be selected – like regression types.

Michael Sanders

June 14, 2017 @ 12:17 pm

I’m thinking that choice of regression model might be a good subject for your next seminar . . .

In the meantime, with respect to #1 above (is regression the right model for the problem?), I’d submit that the closer the market resembles the perfectly competitive market envisioned by neoclassical economic theory, the more appropriate it is to use regression modeling (including AVMs). As you move along the continuum towards the perfectly non-competitive market (envision one property with one buyer), the less you can rely on mathematical models and the more you must rely on manual valuation and the (admittedly subjective) judgement of the appraiser. In the real world, real estate markets typically fall somewhere between these two extremes, where the analyst’s decision as to what type of model to use becomes critically important.

George Dell

June 14, 2017 @ 6:15 pm

Thank you Michael,

In my presentation at the joint Ottawa AI AICanada conference, one of the “two biggest regression mistakes” was assuming that the least squares algorithm is the one to use. In valuation work, it seldom is. Computation in the Data Science realm allows other regression forms that were once impossible for lack of computing power. Today we can fit the best regression type to the problem at hand, without having to make compromises.

Actually, as there are usually a few potential buyers, and a few possible competitive properties, the model is probably more like oligopoly on both sides. However, once a buyer and a particular property have begun negotiations, the competitive market model may no longer be in effect. It is now a “game theory” problem, as originally devised by von Neumann and Morgenstern in their classic book *Theory of Games and Economic Behavior*. Richard Ratcliffe, MAI originally termed the result a “transaction zone” wherein equilibrium economics departs, and game theory is the model to be applied.

Absolutely, the choice of model to use is critically important. That is why we need professionals who are competent in the subject matter but also able to use modern data science rather than the traditional methods, which were developed some 80 years ago in a world of sparse and difficult data.