Statistics is a necessary part of modernized valuation, and it is very much misunderstood. The main problem with statistics is words!
First, let’s get one word out of the way: a parameter is a summary number for a data set, such as the mean, median, maximum, minimum, range, quantile, or variation (standard deviation). We use parameters because they help the human brain understand groups of data (beyond 6 or 7 data points).
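As a rough illustration of these summary numbers, here is a minimal R sketch. The `prices` vector is made-up data (sale prices in $1,000s), not taken from any real market, and the name `params` is only chosen for this example.

```r
# Hypothetical sale prices in $1,000s (illustrative only, not real data)
prices <- c(310, 325, 342, 355, 360, 371, 389, 402, 415, 460)

# Each entry is one summary number describing the whole data set
params <- c(
  mean   = mean(prices),
  median = median(prices),
  min    = min(prices),
  max    = max(prices),
  range  = max(prices) - min(prices),  # distance between the extremes
  sd     = sd(prices)                  # standard deviation
)

print(round(params, 1))
```

Ten prices are already hard to take in at a glance; the six numbers in `params` are much easier to read.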
The word statistics itself is used (and misused) for at least two main meanings. Statistics:
- Is the study of data;
- Are parameters of a sample, when used to approximate population parameters.
Whenever you see the word statistics, ask yourself which meaning the writer intends. Often the writer doesn’t really know.
The above summary numbers are called descriptive. (Or descriptive statistics.) But 90% of what is taught in statistics classes is inferential statistics, where a sample is used to infer (approximate) descriptives of a population. It is not necessary for appraisers to use inferential statistics.
For appraisal, our ideal population is the set of properties which competed directly with the subject on the date of value. None of the other properties are part of the population. (Just as the apple in my refrigerator is not part of the market for apples! You need to go to the market for that.)
So here they are, in short, plain-word explanations of the descriptive numbers:
A common 5-number summary provides the max, min, median, and 25th and 75th percentiles.
The command in R is summary(anyDataSet). It returns the max, min, median, mean, and the two quartiles (25th percentile and 75th percentile).
For valuation purposes, I prefer to also add the 10th and 90th percentiles. These are easy to understand. Along with R’s summary command, these describe numerically almost any data set we use in valuation: region, city, neighborhood, or CMS (Competitive Market Segment)©.
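A minimal sketch of how this might look in R, assuming a made-up vector of sale prices (the `prices` data and the names `six_numbers` and `tails` are hypothetical, chosen only for this example). It uses base R's `summary()` and `quantile()`; note that `quantile()` has several calculation methods, and this sketch simply uses the default.

```r
# Hypothetical sale prices in $1,000s (illustrative only, not real data)
prices <- c(310, 325, 342, 355, 360, 371, 389, 402, 415, 460)

# Min, 1st quartile, median, mean, 3rd quartile, max
six_numbers <- summary(prices)

# The 10th and 90th percentiles, using quantile()'s default method
tails <- quantile(prices, probs = c(0.10, 0.90))

print(six_numbers)
print(tails)
```

Read together, the two outputs show where the middle of the data sits and where the low and high ends begin.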
The visual which goes along with the 5-number summary is the boxplot (or box-and-whiskers plot), which also usually shows outliers. This is an example:
The middle line is the median (50th %tile). The box range is from the 25th to the 75th %tile. When outliers are not marked, the whiskers extend to the maximum and minimum. When outliers are marked, each whisker stops at the most extreme data point within 1.5 times the range of the box (measured from the edge of the box), and any points beyond the whiskers are shown as dots.
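The whisker and outlier rule can be checked numerically with a short R sketch. The data below are made up, with one deliberately high sale added so that an outlier appears; the names `prices` and `bs` are hypothetical. One assumption to flag: base R's `boxplot.stats()` builds the box from "hinges", which are close to, but not always identical to, the 25th and 75th percentiles returned by `quantile()`.

```r
# Hypothetical sale prices in $1,000s, with one unusually high sale (640)
prices <- c(310, 325, 342, 355, 360, 371, 389, 402, 415, 460, 640)

# coef = 1.5 sets how far the whiskers may reach, in multiples of the box length
bs <- boxplot.stats(prices, coef = 1.5)

# stats: lower whisker, lower hinge, median, upper hinge, upper whisker
print(bs$stats)

# out: the points beyond the whiskers, drawn as dots on the plot
print(bs$out)

# Draw the matching boxplot when working in an interactive session
if (interactive()) {
  boxplot(prices, horizontal = TRUE, range = 1.5)
}
```

Here the upper whisker stops at the highest sale that is still within reach of the box, and the 640 sale is reported separately as an outlier.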
The ‘modern’ appraiser needs to know how to use these. And the above is all you need!
These subjects are all taught in more depth in my entry-level signature course, Stats, Graphs, and Data Science 1.