Why Regression Doesn’t Work

Regression is just a math formula.

Regression is not magic. The answer is only as good as the data used. Regression is a tool.

It is helpful to understand the difference between a tool and a model.  To build a house, you need a hammer, the tool.  You also need a nailing schedule, which tells you where to use the hammer.  The nailing schedule is part of a group of models within the blueprint.  The blueprint is the overall model.

In valuation the overall model is the blueprint.  The cost approach is a model.  Deciding to use the cost approach is a modeling decision.  Identifying the market (the data set) is a model.  Finding the important elements of comparison is modeling.  Regression is a very useful tool in some instances.  But it is still only the hammer.  You still have to have the plans, the nailing schedule, and the lumber.

The appraiser’s job is twofold:

1) knowing the mechanics: how to hold the hammer and how to swing it; 2) knowing when to use a hammer, and when to use a screwdriver instead.

With regression the mechanics are simple:  push the ‘regress’ button.  The tool is a fairly simple formula:  it finds the line (or curve) that minimizes the sum of the squared differences between the actual values and the fitted values, using whatever numbers you throw in.  It is a simple computation, which takes a microsecond.  The model is the challenge.  Modeling is what the appraiser does.  Modeling for asset valuation requires answering four basic questions:  1) What data belongs in the data set?  2) What comparison variables matter?  3) How are the variables to be measured and entered into the formula?  4) How do you explain what you learn from the models you chose to use?
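To show how little is inside the ‘regress’ button, here is a minimal sketch of a one-variable least-squares fit in plain Python.  The sales figures, the variable names, and the function names (fit_line, sum_squared_errors) are made up for illustration and are not taken from any real market or software package.

```python
# Hypothetical sales: living area (sq ft) and sale price ($). Illustration only.
areas = [1200, 1500, 1800, 2100, 2400]
prices = [250000, 290000, 335000, 370000, 415000]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Total of the squared gaps between actual and fitted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


intercept, slope = fit_line(areas, prices)
best = sum_squared_errors(areas, prices, intercept, slope)

# Any other line gives a larger total of squared gaps than the fitted one.
nudged = sum_squared_errors(areas, prices, intercept, slope + 5)

print(f"price = {intercept:,.0f} + {slope:,.2f} * area")
print(f"squared error, fitted line: {best:,.0f}")
print(f"squared error, nudged line: {nudged:,.0f}")
```

Nothing in these few lines knows whether the five sales belong together, or whether living area is the right thing to measure.  Those are the modeling decisions.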

As math, regression always works.  It will return an answer for whatever you feed it.  As a valuation tool, regression doesn’t work when:

  1. The modeler selects the wrong data, too much data, or too little data.
  2. The modeler selects the wrong predictor variables (key elements of comparison).
  3. The modeler fails to use the correct data type as input. There are four basic data types.
  4. The modeler is unable to explain the results and the reliability of the results.
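The first failure in the list above can be sketched in a few lines.  Assume five hypothetical sales from one market, plus one sale that does not belong (say, a waterfront property).  All figures and names here are invented for illustration.  The math runs fine both times; only the answer changes.

```python
def fit_slope(sales):
    """Least-squares slope ($ per sq ft) for a list of (area, price) pairs."""
    n = len(sales)
    mean_x = sum(area for area, _ in sales) / n
    mean_y = sum(price for _, price in sales) / n
    sxx = sum((area - mean_x) ** 2 for area, _ in sales)
    sxy = sum((area - mean_x) * (price - mean_y) for area, price in sales)
    return sxy / sxx


# Hypothetical sales from the subject's market: (sq ft, sale price).
market_sales = [
    (1200, 250000),
    (1500, 290000),
    (1800, 335000),
    (2100, 370000),
    (2400, 415000),
]

# One hypothetical sale from a different market, included by mistake.
wrong_sale = (2000, 900000)

slope_clean = fit_slope(market_sales)
slope_mixed = fit_slope(market_sales + [wrong_sale])

print(f"right data set:    ${slope_clean:,.2f} per sq ft")
print(f"one wrong sale in: ${slope_mixed:,.2f} per sq ft")
```

The formula raises no error and gives no warning in the second case.  Deciding that the sixth sale does not belong is the appraiser’s job, not the tool’s.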

An upcoming post will consider the two key questions of data selection:  What is the right data?  How much data is just right?  (Not too little and not too much.)