The HR world is now abuzz with Big Data. But do we have a good feel for what Big Data is and how it will be used? According to bestselling author Bernard Marr, Big Data is “our ability to collect and analyze the vast amounts of data we are now generating in the world.” But just because we now have processing horsepower coupled with vast amounts of information doesn’t mean that we’re going to come up with magical solutions. If you really think about it, in the field of selection and hiring we’ve been working with Big Data for decades, even if the data wasn’t always so big.
Biodata is essentially Big Data. Biodata is any information on an individual, usually a job candidate, that relates to their personal history, background, experience, etc. Examples include where they went to school, the degree they obtained, how many jobs they’ve had, the types of jobs, and even things such as their favorite courses. The goal of biodata is to find a stable set of variables, drawn from information provided by applicants or incumbents, that predicts meaningful outcomes such as turnover, sales quota attainment, injury risk, and a slew of other criteria. I/O psychologists have been doing that since the turn of the century (the 20th, not the 21st). The biggest criticism of purely empirical biodata has long been that it is just “dustbowl empiricism,” meaning that if a variable correlates with the outcome it should be used, whether you know why or not. That’s all fine and good, but the approach is decidedly atheoretical and likely produces a lot of spurious findings that “shrink” or disappear when they are appropriately cross-validated, that is, when they are checked against a fresh sample that was not used to pick the variables. In addition, it can lead to some really crazy findings, many of which would be not only unethical or unstable but also potentially illegal. Most I/O psychologists who use biodata now endorse more of a “rainforest empiricism” perspective, wherein there should be at least some theoretical basis for understanding correlations between two or more variables.
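To make the shrinkage point concrete, here is a minimal simulation sketch. It is not drawn from any real biodata study, and everything in it is an assumption made for illustration: the function names (`pearson`, `simulate`), the sample sizes, and the choice of a simple unit-weighted key. Every "biodata item" is pure random noise with no true relationship to the outcome. The code keeps the items that happen to correlate best in a derivation sample, then checks the resulting composite score against a holdout sample.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(n_people=200, n_items=100, n_keep=10, seed=0):
    """Key pure-noise items on one half of the sample, validate on the other.

    All sizes are hypothetical. Returns (r_derivation, r_holdout).
    """
    rng = random.Random(seed)
    # Items and outcome are independent noise: there is nothing real to find.
    items = [[rng.gauss(0, 1) for _ in range(n_people)] for _ in range(n_items)]
    outcome = [rng.gauss(0, 1) for _ in range(n_people)]

    half = n_people // 2
    dev, hold = slice(0, half), slice(half, n_people)

    # "Dustbowl" step: keep whichever items correlate most strongly with the
    # outcome in the derivation half, with no theory about why.
    rs = [pearson(item[dev], outcome[dev]) for item in items]
    top = sorted(range(n_items), key=lambda j: abs(rs[j]), reverse=True)[:n_keep]

    def composite(sl):
        # Unit-weighted score: add or subtract each kept item by the sign
        # of its derivation-sample correlation.
        return [
            sum((1 if rs[j] > 0 else -1) * items[j][i] for j in top)
            for i in range(sl.start, sl.stop)
        ]

    r_dev = pearson(composite(dev), outcome[dev])
    r_hold = pearson(composite(hold), outcome[hold])
    return r_dev, r_hold


if __name__ == "__main__":
    r_dev, r_hold = simulate()
    print(f"derivation sample r = {r_dev:.2f}")
    print(f"holdout sample r    = {r_hold:.2f}")
```

Because the items are noise by construction, a respectable-looking correlation in the derivation sample comes entirely from having picked the luckiest items, and it should fall back toward zero in the holdout. Real biodata items will usually carry some genuine signal, so real keys shrink rather than vanish, but the mechanism is the same.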