Why do we put “stats”, “graphs”, and “data science” together?  (Part I of VI.)

In coming blogs, we’ll look at each of these four words separately.  Today (part I) we look at the whole before its parts.  We use this principle of the scientific method:  take a problem apart into smaller parts, study the parts, then put it all back together.  In part VI, we’ll put it back together to see how data science is both an art and a science!

Data science is a unifying concept.  It’s replacing traditional statistics in today’s world of big data.
We may not realize that data science is also replacing traditional data analysis.

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
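To make the four steps concrete, here is a minimal sketch in Python.  The sale records, the field names, and the choice of a median price per square foot as the “model” are all assumptions made up for illustration; they are not a prescribed method.

```python
from statistics import median

# Hypothetical sale records (made-up values, for illustration only).
sales = [
    {"price": 300000, "sqft": 1500},
    {"price": 420000, "sqft": 2000},
    {"price": None, "sqft": 1800},  # a record with a missing price
    {"price": 250000, "sqft": 1000},
]


def inspect(records):
    """Inspect: count the records that have any missing field."""
    return sum(1 for r in records if None in r.values())


def cleanse(records):
    """Cleanse: keep only the records with no missing fields."""
    return [r for r in records if None not in r.values()]


def transform(records):
    """Transform: convert each sale to a price per square foot."""
    return [r["price"] / r["sqft"] for r in records]


def model(values):
    """Model: summarize with the median (a deliberately simple stand-in)."""
    return median(values)


missing = inspect(sales)
price_per_sqft = transform(cleanse(sales))
typical = model(price_per_sqft)
print(missing, price_per_sqft, typical)
```

Each function maps to one verb in the definition above.  A real analysis would do far more at every step; the point is only the order of the steps.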

Traditional data analysis:

  • Uses limited, structured data sources (MLS, public records, commercial sources)
  • Does not scale to larger amounts of data (beyond 4 or 5 comps)
  • Avoids multi-dimensional data complexity (like market data)

Traditional data analysis focuses on the analyst’s conception of limited similar data.  The largest mistakes result from the black swan effect:  unexpected (even catastrophic) results that follow from failing to consider outliers, or failing to ask why something does not fit.
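One way to avoid overlooking what does not fit is to flag unusual values for a closer look instead of quietly dropping them.  The sketch below is one possible approach, not the only one:  it uses the median absolute deviation (MAD) and a cutoff of 3.5 on the modified z-score, a commonly cited rule of thumb.  The numbers and the function name flag_outliers are hypothetical.

```python
from statistics import median


def flag_outliers(values, cutoff=3.5):
    """Return the values whose modified z-score exceeds the cutoff.

    The cutoff of 3.5 is an assumed rule of thumb, not a fixed standard.
    Flagged values are candidates for review, not for automatic removal.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # No spread around the median: this score cannot be computed.
        return []
    return [v for v in values if abs(0.6745 * (v - center) / mad) > cutoff]


# Hypothetical prices per square foot (made-up values).
prices_per_sqft = [200, 205, 210, 198, 202, 340]
print(flag_outliers(prices_per_sqft))
```

Here the sale at 340 is flagged.  The analyst’s job is then to ask why it does not fit, which is where subject matter expertise comes in.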

Data science:

  • Uses multiple data sources (structured, unstructured, observed, transformed, maps)
  • Uses complete, size-optimized data sets (the competitive market segment©)
  • Integrates subject matter expertise

Data science focuses on the market’s conception of the complete relevant data set.  Large, costly mistakes are avoided in two ways:  1) going back further, to the assumptions of a discipline (such as the required definition of market value); and 2) going deeper into the data, to pick up the unintended results of partial data analysis.

Data science emphasizes a deeper principle of analyst integrity:  Do not assume the swan will be white every time.  Seek the black swan.

The traditional appraisal process requires good data judgment: “Trust me, I know a good comp when I see it!”  The valuation data science© tenet is: “Let the data speak of itself.”

As we proceed to look at the parts of this reduction, we will consider:

  • Where do statistics make sense?
  • How do visuals, plots, and graphs fit in?
  • When does data become information?
  • What is the skill and art of data science?

Next week’s blog will consider the next “black swan” economic meltdown.