This post is a thought experiment that probes the underlying components of “public opinion.” I’ve created eight sample data sets and normalized them to a 1-5 scale; interpreted the scores as a Likert scale (from strongly disagree to strongly agree); assumed that each data set represents a distinct sub-population within a mass population; and examined the disaggregated and aggregated results in a simple spreadsheet model. This pushes the questions raised in an earlier post a bit further: how do the tools of survey research allow us to answer questions about how a mass population thinks and feels about a given set of issues? The central observation there was that public opinion needs to be disaggregated into groups that have greater homogeneity. Here I try to illustrate how that disaggregation can shed greater light on the nature of public opinion.

So, suppose we have a public that consists of eight distinct and mutually exclusive groups, A through H. Suppose we want to evaluate their attitude towards some important issue; let’s say, immigration reform. And suppose we’ve designed a study with a number of Likert-scaled questions that are combined and reaggregated to a 1-5 scale from “strongly oppose” (1) to “strongly support” (5). Finally, suppose the population as a whole shows the distribution represented in the large panel above, “SUM”. The whole population represented in the top panel is simply the sum of the sub-groups A-H.

The hypothetical study is pretty uninformative up to this point. The average value for the population as a whole is 2.94, almost exactly in the middle of the scale. And with a standard deviation of 1.02, there is a substantial spread of opinion around the mean. If the distribution is roughly normal, about 95% of the population falls within two standard deviations of the mean, roughly between 1 and 5, which is the entire range of the scale. So the aggregate data we have for the population as a whole are almost wholly uninformative, as suggested in the earlier post. They tell us that the average voter is neutral on the issue and that the population ranges from extremely opposed to extremely favorable.
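The two-standard-deviation arithmetic is easy to check. Here is a minimal Python sketch that uses only the two summary figures quoted above (2.94 and 1.02); the variable names are my own, and the band is meaningful only on the assumption that the pooled distribution is roughly normal.

```python
# Summary statistics reported for the pooled "SUM" panel.
mean, sd = 2.94, 1.02
SCALE_MIN, SCALE_MAX = 1, 5

# Two-standard-deviation band around the mean (assumes rough normality).
low, high = mean - 2 * sd, mean + 2 * sd

# Fraction of the 1-5 scale covered by the band, after clipping to the scale.
covered = (min(high, SCALE_MAX) - max(low, SCALE_MIN)) / (SCALE_MAX - SCALE_MIN)

print(f"band: {low:.2f} to {high:.2f}")
print(f"share of scale covered: {covered:.1%}")
```

The band runs from about 0.90 to 4.98, so it covers essentially the whole scale, which is why the aggregate figures say so little.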

But now suppose that we are able to break out the data for the eight sub-groups, and we find that their attitudes are described in panels A through H. Here we can sometimes provide more specific answers to the question, what does Group X think about the issue?

Here we find more useful information, in that the groups have very different profiles of response to the issue. Group A is slightly unfavorable to the issue, with a standard deviation of a little more than half a point. Group B is strongly favorable (mean 3.70), but with a significant tail to the left and a standard deviation of a full point. Group D is the most negative, with an average score of 2.15 and a standard deviation of 0.74. Groups F and H are the most interesting, in that they are bimodal, with peaks around “strongly disagree” and “strongly agree” and almost no one indifferent. (The mean value for Group H is 2.93, right in the middle of the scale, but there are no individuals in this range!) Given the bimodal nature of these two groups, we’re encouraged to explore the idea that there may be a distinguishing characteristic within each group that accounts for the divided attitudes. Group C looks pretty much like the population as a whole; and Group G is distributed evenly across the whole spectrum of opinion, with no concentration around any particular position.
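The point about bimodal groups can be made concrete with a small Python sketch. The response counts below are hypothetical numbers I made up for illustration; they are not the data behind panels F or H. They show how a polarized group can have a mean near the midpoint even though nobody actually answers “3”.

```python
import math

# HYPOTHETICAL response counts for a polarized group on the 1-5 scale
# (illustrative only; not the data behind panels F or H).
counts = {1: 45, 2: 8, 3: 0, 4: 7, 5: 40}

n = sum(counts.values())

# Mean and (population) standard deviation computed from the counts.
mean = sum(score * c for score, c in counts.items()) / n
sd = math.sqrt(sum(c * (score - mean) ** 2 for score, c in counts.items()) / n)

# Share of respondents who actually chose the midpoint.
share_at_midpoint = counts[3] / n

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"share answering 3: {share_at_midpoint:.0%}")
```

The mean alone would suggest a neutral group; the large standard deviation and the empty midpoint are what reveal the split.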

So this simple thought experiment seems to validate the conclusion proposed in the prior posting: in a population consisting of a number of heterogeneous groups, it is important to attempt to disaggregate the results of opinion research in a way that allows us to examine the sub-groups separately. The statistics describing the sum of the sub-groups are likely to be uninformative; pooling the data from the eight sub-groups produces a broad, roughly normal distribution of responses. The real insight comes when we are able to differentiate the population into a number of sociologically real sub-groups with their own more distinctive profiles of attitudes and responses. And more importantly, perhaps this experiment illustrates a different way of conceptualizing public opinion: not as a characteristic distributed in a gradient across a population, but rather as a composite characteristic reflecting underlying groups with their own distributions of attitudes and feelings.
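One way to see why pooling washes out the detail is to combine sub-group summaries directly. The sketch below is a minimal Python illustration, not the spreadsheet model itself: the `pool` helper is a hypothetical name of my own, it uses the standard decomposition of pooled variance into within-group and between-group parts, and it assumes equal group sizes of 100, which the post does not state. The means and standard deviations are the ones quoted above for Groups B and D.

```python
import math

def pool(groups):
    """Combine (size, mean, sd) summaries into a pooled (mean, sd).

    Uses population standard deviations: pooled variance is the
    size-weighted within-group variance plus the between-group variance.
    """
    total = sum(n for n, _, _ in groups)
    pooled_mean = sum(n * m for n, m, _ in groups) / total
    within = sum(n * s ** 2 for n, _, s in groups) / total
    between = sum(n * (m - pooled_mean) ** 2 for n, m, _ in groups) / total
    return pooled_mean, math.sqrt(within + between)

# Means and SDs quoted for Groups B and D; the sizes (100 each) are ASSUMED.
group_b = (100, 3.70, 1.00)
group_d = (100, 2.15, 0.74)

pooled_mean, pooled_sd = pool([group_b, group_d])
print(f"pooled mean = {pooled_mean:.3f}, pooled sd = {pooled_sd:.3f}")
```

Even with just these two groups, the pooled mean lands near the middle of the scale and the pooled spread is wider than either group’s own, because the distance between the group means is folded into the variance.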