Let us call bourgeois data any research data the collection of which is funded by groups invested in the status quo. Bourgeois data, which is to say nearly all the quantitative data used by social scientists, suffer from a specific type of bias for which most methodologists show very little concern, despite the extraordinary analytical attention they devote to the question of bias in social-scientific inference more generally. Methodologists are likely unconcerned about the possibility of institutional biases in data collection because they have precisely the same institutional investments as the funders of data collection projects. Methodologists will consider “external validity” (whether the data reflects what we think it reflects in the world) and “internal validity” (the logical coherence of inferences), but rarely do they consider what we might call political validity, which we might provisionally define as the scientific legitimacy of the basic worldview, determined by a researcher’s institutional environment, with which scientific inferences are verbally constructed. In other words, we must take seriously the question of whether the institutional environment which sponsors data-collection and research activity more generally is itself non-random and non-trivial. I suspect that for too long this question has been delegated to “critical” theorists in a disingenuous gesture which gives them just enough attention so that positivist social scientists can say their discipline already has people “working on” this issue, but not so much attention that positivists have to make any serious adjustments in how we actually think, write, or speak about the political implications of our research agendas.
Unfortunately, bourgeois data are almost by definition generated by a non-random, non-trivial social process which biases at the very least their terms, if not necessarily the internal or external validity of conclusions drawn from them. Research that begins with a vocabulary drawn from a status quo which is, from a social-scientific perspective, perfectly arbitrary ends with conclusions the meaning of which is equally arbitrary, no matter how internally or externally valid those conclusions might be. More to the point, the non-random, non-trivial process which shapes a status quo into one vocabulary rather than another is, of course, power. This is why critical theory’s demand for reflexivity in social science is not empty posturing, as so many positivist social scientists dismissively believe. Rather, that demand points to a bona fide problem of scientific inference which is not merely humanistic critique from cultural studies departments. To put the issue in the words of positivist social science: the problem is that the words we use to draw conclusions from empirical patterns are never “controlled for.” We need theoretical work which will, in a methodologically rigorous fashion, subject the naive vocabularies of contemporary social science to a verification process where arbitrary power-driven linguistic determinations are “controlled for” and replaced in those theoretical edifices we devise to “house” the empirical patterns we discover.
(Social scientists, it must be admitted, have a child-like trust in the politicians and institutions from which they inherit their vocabularies, perhaps because, like children, they live off the good graces of these parents.)
Most contemporary datasets respond to some research demand that is considered valuable from the perspective of the status quo; they will be motivated by hypotheses the confirmation and rejection of which are consistent with the status quo, for no other reason than that all of the actors involved prefer the status quo to a radical overhaul. One doesn’t even need to be a Marxist to believe this; perfectly mainstream rational-choice theories as well as sociological institutionalism confirm this point easily. We know these simple truths about institutions but can’t be bothered to take them very seriously in our own work, because the commitment to truth, to unbiased inference, would actually demand an explicitly revolutionary politics for which social science does not have the courage today. As a result, quantitative social scientists in particular are like finance columnists who evaluate businesses they own stock in without disclosing this fact to readers.
Fortunately, for those of us who take seriously the revolutionary political agenda implied by the necessity of the scientific method, it is not at all the case that we are simply awash with hopelessly and uselessly prejudiced data. Bourgeois data cannot help but accidentally shed light on aspects of the social system which their funders never had any interest in and would have preferred to omit. And it is these aspects to which we must devote our attention if we are to glean from biased data the stories which are most important for correcting the data’s institutional biases. I will call epiphanic data those datasets which emerge from transforming, subsetting, modeling, and otherwise manipulating bourgeois data in order to glean from them radical, anti-institutional insights. It might be objected that if the institutions which sponsor data collection so contaminate data with political bias, then what allows us to think we can glean non-biased conclusions from them? In that voice which positivists use to insinuate that critical theorists are merely conspiracy theorists with paranoiac worldviews: “Why is it not the case that this data is doomed to reproduce the status quo, if the data-collecting institutions are so politically perverse?” There are a few reasons why even the most bourgeois data can be used for anti-bourgeois, epiphanic social science.
The drive for short-term profit incentivizes institutions and individuals to collect dangerously more data than is ideal for the perpetuation of the status quo, at the risk of revealing contradictions which indict the system as a whole. Those groups funding data collection in one area are often unaware of and, in any event, politically unconcerned with data collected in another area. Of course, they are politically aware of issues such as general conflicts of bureaucratic “turf” but certainly not of the long-term, revolutionary political dimensions under consideration here. Bourgeois data collection is complacent also because the division of academic labor is so developed and intensely enforced that status quo institutions don’t really have to worry about anyone discovering relationships across many interdisciplinary datasets, even though such datasets are widely and freely available. Bourgeois institutions have been emboldened by a severe undersupply of revolutionaries in quantitative social science. This has permitted data-collecting institutions a great deal of complacency on this point. Following a sudden exogenous increase in their numbers, radical social scientists could quickly seize on bourgeois data and extract from them extremely damning indictments of the status quo; from a technical perspective such analyses would be almost childishly simple, but they have never yet been performed because the institutional biases are such that these findings are not rewarded. At the same time, phenomena such as racism, colonialism and patriarchy have, for a very long time, artificially depressed literacy, education, and civil rights for specific groups, suppressing demand for a really true, popular social science which would dispel the most vicious falsities and generate more legitimate equilibria. So long as the less privileged groups could not effectively demand true social science, bourgeois institutions felt safe to collect data to their cold heart’s content.
But today, decolonization, civil rights movements, feminist movements, various democratic uprisings, and the rise of mass education as well as the internet, have all led to a huge increase in the aggregate demand for precisely this kind of social science. Witness the rise of the “data visualizer”, that not-quite-data-scientist internet persona who specializes in fashionable data-intensive infographics for mass distribution on the web. Such a persona is merely an inane example of how millions of people around the world would be increasingly amenable to a popular, emancipatory social science.
In summary, for those who wish to read between the lines of bourgeois data, there always exists epiphanic data in the simple will to read it differently, a parallax data which promises a wide world of revolutionary intellectual exploration. Unfortunately, this world of data is currently available only to those few of us who have been given the unequally distributed privileges required to mine it and a sufficiently youthful spirit to still want to. But a mass defection of radical young social scientists from currently privileged research agendas toward anti-institutional perspectives would open a large range of research which need not be technically sophisticated to quickly produce huge increases in both the overall truth quotient of social science and its overall effectiveness in radical egalitarian political change.
Murphy, Justin. 2013. "A concept of political validity in quantitative social science," http://jmrphy.net/blog/2013/11/03/a-concept-of-political-validity-in-quantitative-social-science/ (April 24, 2017).