As data scientists, we owe our clients the truth. The goal of any organization with big data is to translate it into business intelligence. Too many people believe that we can just use correlational analysis because we have big datasets. As the old adage goes, correlation does not equal causation! Correlational algorithms, with pretty pictures of the data illustrating trends and markets, often fail when applied to new data. This is nothing more than picking out shapes in the clouds. Yes, if we correlate the monthly birth rates in Oregon with the rainfall in Kentucky, then guess what: a picture appears in the data! A word of caution is appropriate. Correlational analysis seems to offer some plausible insights, but doesn’t achieve big-data analytics’ goal of providing intelligence for decision-making. We need sound, robust analytic techniques to discover statistical truths.
Bayes’ theorem and its application to Bayesian belief networks (BBN) push us closer to these truths. The theorem describes how two or more events condition one another. It starts with an initial truth statement (a hypothesis, with its prior probability) and an observable event, and then refines this initial truth by computing the probability of the two occurring together, divided by the probability of the observable event. The result is the probability of the hypothesis given that the event was observed. By extending the theorem to BBN, we can add events to the equation to refine this truth even further. With BBN, we reach into untouched subspaces where the business intelligence we seek resides.
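To make the refinement step concrete, here is a minimal Python sketch of the calculation described above: the joint probability of hypothesis and event, divided by the probability of the event. The function name and every number in it are hypothetical, chosen only for illustration. The second update also assumes that the two events are conditionally independent given the hypothesis, which is a simplification; a full BBN encodes such dependencies explicitly in its graph and conditional probability tables.

```python
def posterior(prior: float, p_event_given_h: float, p_event_given_not_h: float) -> float:
    """Return P(H | E): the joint probability P(H and E) divided by P(E)."""
    joint_h = prior * p_event_given_h                # P(H and E)
    joint_not_h = (1 - prior) * p_event_given_not_h  # P(not H and E)
    p_event = joint_h + joint_not_h                  # P(E), by total probability
    return joint_h / p_event


# Hypothetical numbers: an initial belief of 30% in the hypothesis.
prior = 0.30

# First observable event: 80% likely if H is true, 20% likely if it is false.
after_first = posterior(prior, 0.80, 0.20)

# Second event, assumed conditionally independent of the first given H:
# the previous posterior becomes the new prior.
after_second = posterior(after_first, 0.70, 0.40)

print(f"Prior belief:       {prior:.3f}")
print(f"After first event:  {after_first:.3f}")
print(f"After second event: {after_second:.3f}")
```

With these made-up inputs, the belief in the hypothesis rises from 0.30 to about 0.63 after the first event and to 0.75 after the second, which is the sense in which each added event refines the initial truth.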
In presenting challenges, big data presents opportunities. Companies that were early adopters of collecting massive amounts of data found it difficult to extract actionable information from them to gain a business advantage. In response, the IT community built platforms and tools such as Hadoop, NoSQL databases like MongoDB, Azure, BigQuery, and even BayeSniffer.com to solve the problem of translating big data into business intelligence. When we data scientists enter the mix, we must provide services that slice through these data and extract the truth. We must provide the tools for strategic economic decision-makers to solve complex problems. This is exactly what BBN do.
Lastly, as we look to the future of exploring big data, we must remain ethical as data scientists. There is a universe of unexplored opportunity that awaits us, and I encourage our community to share our knowledge so that our discoveries may push our science to a higher level.
Dr. Jeff Grover is founder and Chief Data Scientist of Grover Group, Inc. He specializes in Bayes’ theorem and its application through Bayesian belief networks (BBN) to strategic economic decision-making. He has conducted research with the US Army Research Institute for the Behavioral and Social Sciences and US Army Recruiting Command in the areas of Soldier attrition and special operations and medical recruiting. He has also recently conducted BBN research in the agricultural industry. Jeff received his Doctor of Business Administration in Finance from Nova Southeastern University, a Master of Business Administration in Aviation from Embry-Riddle Aeronautical University, and a Bachelor of Science in Mathematics from Mobile College.
Jeff is the author of Strategic Economic Decision-Making: Using Bayesian Belief Networks to Make Complex Decisions, published with SpringerBriefs. He has operationalized the content of the book into a Bayesian big data algorithm at BayeSniffer.com. He has also published in the Journal of Wealth Management; the Journal of Business and Leadership: Research, Practice, and Teaching; and the Journal of Business Economics Research. He was a guest speaker at the December 2014 MORS Conference in Washington, DC. Dr. Grover is a retired US Army Special Forces officer.