By Lucy Drummond
Before we analyze and celebrate the “insights” so far gleaned from big data, we need to look at who is analyzing and collecting the data.
In the 1980s, research on heart attacks and blood pressure was published and treated as essential reading. Only when gender-conscious thinkers approached the material, however, did they realize that the research participants were almost all men. How could the physiological differences between males and females not warrant specific analysis, or at least acknowledgement? Men and women differ in hormone levels, bone density, and reproductive systems.
It turns out that gender does influence one’s heart and blood, and now health recommendations for men and women vary. Even if the variation is slight, the point is that the way the initial research was structured was not inclusive enough to reveal accurate results.
We are about to have a similar phenomenon with big data. Of course, big data is not vital like the human heart. But there are parallels to be drawn about the ways in which measurement tools are devised. If there are mostly men at the drawing board, only certain kinds of information will be recorded. We need a diverse group of people structuring the ways we measure the world.
An August 2015 article in Network World spotlighted thirteen up-and-coming big data analytics companies. Unfortunately, only three of them have a female employee, and none has more than one. Experfy and Interana each have one female co-founder, and DataTorrent is the only other company with a woman on staff. RapidMiner has raised $20 million, but all seven of its employees are men; Snowflake Computing, which has raised more than $65 million, has six employees, all of them men; and at Tamr, which has raised $41 million, all four employees are men.
This is not a statement condemning or blaming men. It is a call to action for those involved in big data. We are asking data scientists, analysts, managers and engineers to pause and think: are there enough viewpoints and backgrounds included in the earliest stages of structuring big data tools? Are we addressing the qualitative in addition to the quantitative? Valuable insights can be gleaned by assessing information in different ways. Only a complete and thorough rubric will yield accurate results.