Inference is defined as a conclusion reached on the basis of evidence and reasoning.
I have often said the biggest problem with statistics is words. (The very word statistics is used with more than one meaning – often in a misleading way.)[1] So what is logical inference, and how is it different from statistical inference?
For our purposes, there are two types of logical inference.
One is when the logic determines the conclusion. The result is certain. Another is when the logic is likely but uncertain.
- Deterministic = certain (If A, then B is absolutely true)
- Probabilistic = uncertain (If A, then B is probably true)
Most things in life are uncertain. Valuation is uncertain. Some things are uncertain, but the probability is so high that we can treat them as factual or true (even if there is some chance they are not).
Legacy appraisal practice applies logical inference. It is probabilistic. We select ‘comparables’ which in our judgment are similar and competitive. (But judgment is not statistics – it is judgment or opinion.)
Probabilistic things have variation. Variation can come from measurement error. Nothing can be measured exactly. But often things are so sure, we can consider them exact. Like if a house has a one-car garage or a two-car garage. (We know there are smaller and larger garages, but for practical purposes a 1 or a 2 is all we need for our analysis.)
Deterministic reasoning is deductive. If A is true, B must be true. (A one-car garage will not hold two.)
Probabilistic reasoning is inductive. (If A is true, then B is probably true.) (A two-car garage will always hold two – after you remove the clutter.)
Descriptive statistics are deterministic. The mean of any data set is exactly the same every time. So are the median, the range, and the standard deviation (σ in Greek: the square root of the average squared deviation from the mean, that is, the sum of the squared deviations divided by the number of values). These are actually parameters, not statistics, as we will see below.
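To make the "exactly the same every time" point concrete, here is a minimal sketch in Python. The prices are invented for illustration, and the helper name `describe` is hypothetical. It computes the four measures named above on a complete data set, twice, and checks that nothing changes.

```python
import statistics

# Hypothetical sale prices (in $1,000s), invented for illustration only.
prices = [300, 310, 320, 330, 340]

def describe(values):
    """Return descriptive measures for a complete data set."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        # pstdev divides by N (the population formula), matching sigma above.
        "sigma": statistics.pstdev(values),
    }

first = describe(prices)
second = describe(prices)

print(first)
print("Same answer both times:", first == second)
```

No randomness is involved, so the second run can only repeat the first. That is what deterministic means here.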
Statistical inference is about depicting a population based on random sampling. It is usually called inferential statistics.
The difference between traditional appraisal and today’s valuation methods is that we now have all the data. We can gain much better accuracy and precision (trueness and sureness) by simply using all the data. (But it has to be all the relevant data.) Garbage data in does not help. It only makes the result more garbagy (and uncertain . . .).
Once you have a sample (a random sample), then you can get descriptive statistics on the sample, and infer to the parameters of your population. Officially, a statistic is an estimate of a population parameter. Whew!
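The parameter-versus-statistic distinction can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population is simulated rather than real transaction data, and the sizes (500 sales, a sample of 30) and the fixed seed are arbitrary choices.

```python
import random
import statistics

# Hypothetical population: every sale price (in $1,000s) in a market segment.
# Values are simulated for illustration, not real transactions.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [round(rng.gauss(320, 25)) for _ in range(500)]

# Parameter: computed from the whole population, so it is exact.
mu = statistics.fmean(population)

# Statistic: computed from a random sample, so it only estimates mu.
sample = rng.sample(population, 30)
x_bar = statistics.fmean(sample)

print(f"population mean (parameter): {mu:.1f}")
print(f"sample mean (statistic):     {x_bar:.1f}")
print(f"estimation error:            {x_bar - mu:+.1f}")
```

Draw a different sample and the statistic moves, while the parameter stays put. When the whole population is in hand, there is nothing left to estimate.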
As it turns out, the most important parts of inferential statistics are:
- Knowing what your population is, that you are trying to understand;
- Getting a truly random sample. (This is harder than it seems.)
The good news is that appraisers do not take random samples. The best way is to simply use the whole population of actual transactions. Second best is to use your excellent judgment and pick good comps. The bad way is to take a random sample, when you have the complete data set. The worst way is to pick comps, pretend they are random, and proceed to use clever inferential statistical tests to prove your model is really good.
“The biggest problem with statistics is words.” George Dell
[1] See one of my journal articles on the misuse of statistics in The Appraisal Journal of the Appraisal Institute. Included are references to the American Statistical Association Statement on p-values. Readable, informative.