Big Data vs. Pig Data
Yesterday I posted a tongue-in-cheek remark about “pig data” on Facebook [ https://www.facebook.com/photo.php?fbid=10152278337104002 ].
If big data are muddy, then they need to be cleaned up before there is any potential to profit from them. Big data that have nothing but hope to help them “emerge” from the chaos are of little or no more value than the pigs above emerging from the mud.
As John Battelle thinks out loud about “potential information” (which is, I feel, just another way of saying “data yet to be collected”) [ http://battellemedia.com/archives/2014/03/unleashing-potential-information.php ], I hope he devotes some attention to the research methodologies that are fundamental to how “science” works. Very few of the data being collected or shared online are “scientifically” collected; scientific methodology is not very widespread online. In other words: most of the so-called data are useless.
This will probably change over time, but the main reason it might change is not some process of “natural emergence” from chaos; rather, it is that someone is willing to pay for reliable and verifiable research methods to be employed.