
Why Measuring Political Bias is So Hard, and How We Can Do It Anyway: The Media Bias Chart Horizontal Axis

Post One of a Four-Part Series

The Media Bias Chart Horizontal Axis

Part 1: Measuring Political Bias – Challenges to Existing Approaches and an Overview of a New Approach

Many commentators on the Media Bias Chart have asked me (or argued with me about) why I placed a particular source in a particular spot on the horizontal axis. Some more astute observers have asked (and argued with me about) the underlying questions of “what do the categories mean?” and “what makes a source more or less politically biased?” In this series of posts I will answer these questions.

In previous posts I have discussed how I analyze and rate the quality of news sources and individual articles for placement on the vertical axis of the Media Bias Chart. Here, I tackle the more controversial dimension: rating sources and articles for partisan bias on the horizontal axis. In my post on Media Bias Chart 3.0, I discussed rating each article on the vertical axis by taking each aspect, including the headline, the graphic(s), the lede, AND each individual sentence, and ranking it. In that post, I proposed that when it comes to sentences, there are at least three different ways to score them for quality: on a Veracity scale, an Expression scale, and a Fairness scale. However, the ranking system I've outlined for vertical quality ratings doesn't address everything that is required to rank partisan bias. Vertical quality ratings don't necessarily correlate with horizontal partisan bias ratings (though they often do, hence the somewhat bell-curved distribution of sources along the chart).

Rating partisan bias requires different measures, and it is more controversial because disagreements about it inflame the passions of those arguing about it. It's also very difficult, for reasons I will discuss in this series. However, I think it's worth trying to 1) create a taxonomy with a defined scope for ranking bias and 2) define a methodology for ranking sources within that taxonomy.

In this series, I will do both things. I've created the taxonomy already (the chart itself), and in these posts I'll explain how I've defined its horizontal dimension. The scope of this horizontal axis has some arbitrary limits and definitions. For example, it is limited to US political issues as they have existed within roughly the last year, and it uses the positions of various elected officials as proxies for the categories. You can feel free to disagree with each of these choices. However, the scope has to start and end somewhere in order to create a systematic, repeatable way of ranking sources within it. I'll discuss how I define each of the horizontal categories (most extreme/hyper-partisan/skews/neutral). Then, I'll discuss a formal, quantitative, and objective-as-possible methodology for systematically rating partisan bias, which has evolved from the informal and somewhat subjective processes I used to rate it in the past. This methodology comprises:

  1. An initial placement of left, right, or neutral for the story topic selection itself
  2. Three measurements of partisanship on quantifiable scales:
     • a "Promotion" scale
     • a "Characterization" scale, and
     • a "Terminology" scale
  3. A systematic process for measuring what is NOT in an article, because omissions themselves can result in partisan bias (a rough sketch of how these pieces fit together follows below).
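To make these moving parts concrete, here is a minimal sketch, in Python, of how an individual article's measurements might be recorded and combined. The scale names come from the list above, but the numeric range (negative for left, positive for right) and the equal weighting are placeholder assumptions of mine, not part of the methodology itself, which I define in the later posts of this series.

```python
from dataclasses import dataclass

# Placeholder convention: scores run from -1.0 (left) to +1.0 (right).
# Later posts define the scales; the numbers here are only illustrative.
Score = float

@dataclass
class ArticleBiasRating:
    topic_placement: Score   # initial left/right/neutral placement for story selection
    promotion: Score         # score on the "Promotion" scale
    characterization: Score  # score on the "Characterization" scale
    terminology: Score       # score on the "Terminology" scale
    omission: Score = 0.0    # bias introduced by what is NOT in the article

    def overall(self) -> Score:
        # Equal weighting is an assumption for the sketch, not the actual formula.
        in_article = (self.promotion + self.characterization + self.terminology) / 3
        return (self.topic_placement + in_article + self.omission) / 3

# Example: an article on a left-leaning topic, mildly left in its in-article scales.
print(ArticleBiasRating(-0.5, -0.3, -0.2, -0.1).overall())
```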

  1. Problems with existing bias rating systems

To the extent that organizations try to measure news media stories and sources, they often do so only by judging or rating partisan bias (rather than quality). Because it is difficult to define standards and metrics by which partisan bias can be measured, such ratings are often made through admittedly subjective assessments by the raters (see here, for example), or are made by polling the public or a subset thereof (see here, for example). High levels of subjectivity can cause the public to be skeptical of ratings results (see, e.g., all the comments on my blog complaining about my bias), and polling subsets of the public can skew results in a number of directions.

Polling the public, or otherwise asking the public to rate the "trustworthiness" or bias of news sources, has proven problematic in a number of ways. For one, people's subjective ratings of the trustworthiness of particular sources tend to correlate very highly with their own political leanings: liberal people will tend to rate MSNBC as highly trustworthy and FOX as not trustworthy, while conservative people will do the opposite, which says very little about the actual trustworthiness of either source. Further, current events have revealed that certain segments of the population are extremely susceptible to influence by low-quality, highly biased, and even fake news, and those segments have proven themselves unable to reliably discern measures of quality and bias, making them unhelpful to poll.

Another way individuals and organizations have attempted to rate partisan bias is through software-enabled text analysis. The idea of text analysis software is appealing to researchers because the sheer volume of text in news sources is enormous. Social media companies, advertisers, and other organizations have recently used such software to perform "sentiment analysis" of content such as social media posts in order to identify how individuals and groups feel about particular topics, in the hope that such information can be used to influence purchasing behavior. Some have endeavored to measure partisan bias this way, by programming software to count certain words that could be categorized as "liberal" or "conservative." A study conducted by researchers at UCLA tried to measure such bias by counting media figures' references to conservative and liberal think tanks. However, such attempts to rate partisan bias have had mixed results, at best, because of the variation in the context in which these words are presented. For example, if a word is used sarcastically, or in a quote by someone on the opposite side of the political spectrum from the side that typically uses that word, then the use of the word is not necessarily indicative of partisan bias. In the UCLA study, references to political think tanks were too infrequent to generate a meaningful sample. I submit that other factors within an article or story are far more indicative of bias.
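To illustrate why word-counting struggles, here is a toy sketch of that kind of analysis. The word lists and the scoring rule are invented for the example; they do not come from the UCLA study or any other published method.

```python
# Toy lexicons; real studies use much larger, curated lists.
LIBERAL_TERMS = {"undocumented immigrants", "climate crisis", "gun safety"}
CONSERVATIVE_TERMS = {"illegal aliens", "death tax", "pro-life"}

def naive_bias_score(text: str) -> float:
    """Counts 'liberal' minus 'conservative' terms, normalized to [-1, 1].

    Positive values lean liberal, negative lean conservative. Note everything
    this ignores: sarcasm, quotation, and who is actually speaking.
    """
    t = text.lower()
    lib = sum(t.count(term) for term in LIBERAL_TERMS)
    con = sum(t.count(term) for term in CONSERVATIVE_TERMS)
    total = lib + con
    return 0.0 if total == 0 else (lib - con) / total

# A quoted conservative phrase inside a critical article still counts as
# "conservative" language -- exactly the context problem described above.
print(naive_bias_score('The senator again railed against the so-called "death tax."'))  # -1.0
```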

I also submit that large-scale, software-generated bias ratings are not useful if the results do not align well with the subjective bias ratings gathered from a group of knowledgeable media observers. That is, if we took a poll of an equal number of knowledgeable left-leaning and right-leaning media observers, we could come to some kind of reasonable average for bias ratings. To the extent the software-generated results disagree, that suggests the software model is wrong. I earlier stated my dissatisfaction with consumer polls as the sole indicator of bias because such polling is consumer-focused rather than content-focused. I think there is a way to develop a content-based approach to ranking bias that aligns with our human perceptions of bias, and that once that is developed, it is possible to automate portions of that content-based approach. That is, we can get computers to help us rate bias, but we have to first create a very thorough bias-rating model.
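As a sketch of what that alignment check could look like, here is one way to compare software-generated scores against the average of a balanced human panel. The numbers, the scale, and the error measure are all assumptions for illustration; nothing here reflects an actual study.

```python
from statistics import mean

def panel_average(left_ratings, right_ratings):
    """Average the ratings of an equal number of left- and right-leaning observers."""
    assert len(left_ratings) == len(right_ratings), "panel should be balanced"
    return mean(left_ratings + right_ratings)

def mean_absolute_error(model_scores, panel_scores):
    """How far, on average, the software's ratings sit from the human benchmark."""
    return mean(abs(m - p) for m, p in zip(model_scores, panel_scores))

# Hypothetical per-article bias scores on an arbitrary left(-)/right(+) scale.
model = [-10.0, 4.0, 22.0]
panel = [panel_average([-12.0, -8.0], [-9.0, -11.0]),   # -10.0
         panel_average([2.0, 3.0], [5.0, 6.0]),         #   4.0
         panel_average([30.0, 28.0], [35.0, 31.0])]     #  31.0
print(mean_absolute_error(model, panel))  # 3.0; a large value would suggest the model is off
```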

  2. Finding a better way to rank bias

When I started doing ratings of partisanship, I, like all others before me, rated them subjectively and instinctively from my point of view. However, knowing that I, like every other human person, have my own bias, I tried to control for my own bias (as referenced in my original methodology post), possibly resulting in overcorrection. I wanted a more measurable and repeatable way to evaluate bias of both entire news sources and individual news stories.

I have created a formal framework for measuring political bias in news sources within the defined taxonomy of the chart. I have started implementing this formal framework when analyzing individual articles and sources for ranking on the chart. This framework is a work in progress, and the sample size upon which I have tested it is not yet large enough to conclude that it is truly accurate and repeatable. However, I am putting it out here for comments and suggestions, and to let you know that I am designing a study for the dual purposes of 1) rating a large data set of articles for political bias and 2) refining the framework itself. Therefore, I will refer to some of these measurements in the present tense and others in the future tense. My overall goal is to create a methodology by which other knowledgeable media observers, including left-leaning and right-leaning ones, can reliably and repeatably rate bias of individual stories and not deviate too far from each other in their ratings.

My existing methodology for ranking an overall source on the chart takes into account certain factors related to the overall source as a first step, but is primarily based on rankings of individual articles within the source. Therefore, I have an “Entire Source” bias rating methodology and an “Individual Article” bias rating methodology.

  1. “Entire Source” Bias Rating Methodology

I discussed ranking partisan bias of overall sources in my original methodology post, which involves accounting for each of the following factors:

  a. Percentage of news media stories falling within each partisanship category (according to the "Individual Story" bias rating methodology detailed below)
  b. Reputation for a partisan point of view among other news sources
  c. Reputation for a partisan point of view among the public
  d. Party affiliation of regular journalists, contributors, and interviewees
  e. Presence of an ideological reference or party affiliation in the title of the source

In my original methodology post, I identified a number of other factors for ranking sources on both the quality and partisanship scales that I am not necessarily including here. These are 1) number of journalists, 2) time in existence, and 3) readership/viewership. This is because I am starting with the assumption that the factors (a-e) listed above are more precise indicators of partisanship that would line up with polling results of journalists and knowledgeable media consumers. In other words, my starting assumption is that if you used factors (a-e) to rate the partisanship of a set of sources, and then also polled significant samples of journalists and consumers, you would get similar results. I believe that over time, some of the factors 1-3 (number of journalists, time in existence, and readership/viewership) may be shown to correlate strongly with partisanship or non-partisanship. For example, I suspect that a high number of journalists may prove to correlate with low partisanship, because it is expensive to have a lot of journalists on staff, and running a profitable news enterprise with a large staff would require broad readership across party lines. I suspect that "time in existence" may not necessarily correlate with partisanship, because several new sources that strive to provide unbiased news have come into existence within just the last few years. I suspect that readership/viewership will not correlate much with partisanship, for the simple reason that as many people seem to like extremely partisan junk as like unbiased news. Implementation of a study based on the factors listed above should verify or disprove these assumptions.

I have "percentage of news media stories falling within each partisanship category" listed as the first factor for ranking sources, and I believe it is the most important metric. Whenever someone disagrees with a particular ranking of an overall source on the chart, they usually cite the perceived partisan bias of a particular story that they believe does not align with my ranking of the overall source. What should be apparent to all thoughtful media observers, though, is that individual articles can themselves be more liberal or conservative than the mean or median partisan bias of their overall source. In order to accurately rank a source, you have to accurately rank the stories in it.
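Here is a minimal sketch of that aggregation: taking a pile of story-level bias categories and turning them into a distribution and an average placement for the source. The category-to-number mapping and the simple averaging are my assumptions for the example, not the full method.

```python
from collections import Counter

# Horizontal categories from the chart, mapped to placeholder numeric anchors
# so that an average can be taken. The spacing of the anchors is an assumption.
CATEGORY_VALUES = {
    "most extreme left": -3, "hyper-partisan left": -2, "skews left": -1,
    "neutral": 0,
    "skews right": 1, "hyper-partisan right": 2, "most extreme right": 3,
}

def source_bias(story_categories):
    """Return the per-category shares and the average placement for a source."""
    counts = Counter(story_categories)
    total = sum(counts.values())
    shares = {cat: n / total for cat, n in counts.items()}
    average = sum(CATEGORY_VALUES[cat] * n for cat, n in counts.items()) / total
    return shares, average

shares, avg = source_bias(["skews left", "neutral", "skews left", "hyper-partisan left"])
print(shares)  # {'skews left': 0.5, 'neutral': 0.25, 'hyper-partisan left': 0.25}
print(avg)     # -1.0 on the placeholder scale
```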

  2. "Individual Story" Bias Rating Methodology

As previously discussed, I propose evaluating partisanship of an individual article by: 1) creating an initial placement of left, right, or neutral based on the topic of the article itself, 2) measuring certain factors that exist within the article and then 3) accounting for context by counting and evaluating factors that exist outside of the article. I’ll discuss this fully in Posts 3 and 4 of this series.

In my next post (#2 in this series) I will discuss the taxonomy of the horizontal dimension. I'll cover many reasons why it is so hard to quantify bias in the first place. Then I'll define what I mean by "partisanship," the very concepts of "liberal," "mainstream/center," and "conservative," and what each of the categories (most extreme/hyper-partisan/skews/neutral or balanced) means within the scope of the chart.

 Until then, thanks for reading and thinking!


The Chart, Version 3.0: What, Exactly, Are We Reading?

Note: this is actually version 3.1 of The Chart. I made some minor changes from version 3.0, explained here: http://www.allgeneralizationsarefalse.com/chart-3-1-minor-updates-based-constructive-feedback/

Summary: What’s new in this chart:

  • I edited the categories on the vertical axis to more accurately describe the contents of the news sources ranked therein (long discussion below).
  • I stuffed as many sources (from both version 1.0 and 2.0, plus some new ones) on here as I could, in response to all the “what about ______ source” questions I got. Now the logos are pretty tiny. If you have a request for a ranking of a particular source, let me know in the comments.
  • I changed the subheading under "Hyper-Partisan" from "questionable journalistic value" to "expressly promotes views." This is because "hyper-partisan" does not always mean that the facts reported in the stories are "questionable." Some analysis sources in these columns do good fact-finding in support of their expressly partisan stances. I didn't want anyone to think those sources were necessarily "bad" just because they are hyper-partisan (though they could be "bad" for other reasons).
  • I added a key that indicates what the circles and ellipses mean. They mean that a source within a particular circle or ellipse can often have stories that fall within that circle/ellipse's range. This is, of course, not true for all sources.
  • Green/Yellow/Orange/Red Key. Within each square: Green is news, yellow is fair interpretations of the news, orange is unfair interpretations of the news, and red is nonsense damaging to public discourse.

Just read this one more thing: It's best to think of the position of a source as a weighted average position of the stories within that source. That is, I rank a source in a particular spot because most of its stories fall in that spot. However, I weight the ranking downward if the source has a significant number of stories (even if they are a minority) that fall in the orange or red areas. For example, if Daily Kos has 75% of its stories fall under yellow (e.g., "analysis" and "opinion, fair"), but 25% fall under orange (selective, unfair, hyper-partisan), it is rated overall in the orange. I rank sources like this because, in my view, orange and red-type content is damaging to the overall media landscape, and if a significant enough number of stories fall in those categories, readers should rely on the source less. This is a subjective judgment on my part, but I think it is defensible.
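For those who like to see rules written down, here is a rough sketch of that downweighting logic. The 25% threshold comes straight from the example above, and the "snap to the worst significant band" rule is my simplification; in practice this remains a judgment call.

```python
# Quality bands from best to worst, matching the green/yellow/orange/red key.
QUALITY_ORDER = ["green", "yellow", "orange", "red"]

def overall_band(story_bands, significant_share=0.25):
    """Place a source in the worst band that holds a 'significant' share of its stories."""
    total = len(story_bands)
    for band in reversed(QUALITY_ORDER):   # walk from worst to best
        if story_bands.count(band) / total >= significant_share:
            return band
    return QUALITY_ORDER[0]

# The example from the paragraph: 75% yellow, 25% orange -> rated orange overall.
sample = ["yellow"] * 75 + ["orange"] * 25
print(overall_band(sample))  # "orange"
```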

OK, you can go now unless you just really love reading about this media analysis stuff. News nerds, proceed for more discussion about ranking the news.

As I discussed in my post entitled “The Chart, Second Edition: What Makes a News Source Good?” the most accurate and helpful way to analyze a news source is to analyze its individual stories, and the most accurate way to analyze an individual story is to analyze its individual sentences. I recently started a blog series where I rank individual stories on this chart and provide a written analysis that scores the article itself on a sentence-by-sentence basis, and separately scores the title, graphics, lede, and other visual elements. See a couple of examples here. Categorizing and ranking the news is hard to do because there are so very many factors. But I’m convinced that the most accurate way to analyze and categorize news is to look as closely at it as possible, and measure everything about it that is measurable. I think we can improve our media landscape by doing this and coming up with novel and accurate ways to rank and score the news, and then teaching others how to do the same. If you like how I analyze articles in my blog series, and have a request for a particular article, let me know in the comments. I’m interested in talking about individual articles, and what makes them good and bad, with you.

As I've been analyzing articles on an element-by-element, sentence-by-sentence basis, it became apparent to me that individual elements and sentences can be ranked or categorized in several ways, and that my chart needed some revisions for accuracy.

So far I have settled on at least three different dimensions, or metrics, upon which an individual sentence can be ranked. These are 1) the Veracity metric, 2) the Expression metric, and 3) the Fairness metric.

The primary way statements are currently evaluated in the news is on the basis of truthfulness, which is arguably the most important ranking metric. Several existing fact-checking sites, such as Politifact and the Washington Post Fact Checker, use a scale to rate the veracity of statements; Politifact has six levels and the Washington Post Fact Checker has four, reflecting that many statements are not entirely true or entirely false. I score each sentence on a similar "Veracity" metric, as follows:

  • True and Complete
  • Mostly True/ True but Incomplete
  • Mixed True and False
  • Mostly False or Misleading
  • False

Since there are many reputable organizations that do this type of fact-checking work according to well-established industry standards (see, e.g., the Poynter International Fact-Checking Network), I do not replicate this work myself but rather rely on these sources for fact checking.

It is valid and important to rate articles and statements for truthfulness. But it is apparent that sentences can vary in quality in other ways. One way, which I discussed in my previous post (The Chart, Second Edition: What Makes a News Source Good?), is on what I call an "Expression" scale of fact-to-opinion. The Expression scale I use goes like this:

  • (Presented as) Fact
  • (Presented as) Fact/Analysis (or persuasively-worded fact)
  • (Presented as) Analysis (well-supported by fact, reasonable)
  • (Presented as) Analysis/Opinion (somewhat supported by fact)
  • (Presented as) Opinion (unsupported by facts, or supported only by highly disputed facts)

In ranking stories and sentences, I believe it is important to distinguish between fact, analysis, and opinion, and to value fact-reporting as more essential to news than either analysis or opinion. Opinion isn’t necessarily bad, but it’s important to distinguish that it is not news, which is why I rank it lower on the chart than analysis or fact reporting.

Note that the ranking here includes whether something is “presented as” fact, analysis, etc. This Expression scale focuses on the syntax and intent of the sentence, but not necessarily the absolute veracity. For example, a sentence could be presented as a fact but may be completely false or completely true. It wouldn’t be accurate to characterize a false statement, presented as fact, as an “opinion.” A sentence presented as opinion is one that provides a strong conclusion, but can’t truly be verified or debunked, because it is a conclusion based on too many individual things. I’ll write more on this metric separately, but for now, I submit that it is an important one because it is a second dimension of ranking that can be applied consistently to any sentence. Also, I submit that a false or misleading statement that is presented as a fact is more damaging to a sentence’s credibility than a false or misleading statement presented as mere opinion.

The need for another metric became apparent when asking the question “what is this sentence for?” of each and every sentence. Sometimes, a sentence that is completely true and presented as fact can strike a reader as biased for some reason. There are several ways in which a sentence can be “biased,” even if true. For example, sentences that are not relevant to the current story, or not timely, or that provide a quote out of context, can strike a reader as unfair because they appear to be inserted merely for the purpose of persuasion. It is true that readers can be persuaded by any kind of fact or opinion, but it seems “fair” to use certain facts and opinions to persuade while unfair to use other kinds.

I submit that the following characteristics of sentences can make them seem unfair:

  • Not relevant to the present story
  • Not timely
  • Ad hominem (personal) attacks
  • Name-calling
  • Other character attacks
  • Quotes inserted to prove the truth of what the speaker is saying
  • Sentences that include persuasive facts but omit facts that would tend to prove the opposite point
  • Emotionally charged adjectives
  • Any fact, analysis, or opinion statement that is based on false, misleading, or highly disputed premises

This is not an exhaustive list of what makes a sentence unfair, and I suspect that the more articles I analyze, the more accurate and comprehensive I can make this list over time. I welcome feedback on what other characteristics make a sentence unfair, and I'll write more on this metric in the future. Admittedly, many of these factors have a subjective component. Some of the standards I used to make a call on whether a sentence was "fair" or "unfair" are the same ones found in the Federal Rules of Evidence (i.e., the ones judges use to rule on objections in court). These rules define complex concepts such as relevance and permissible character evidence, and determine what is fair for a jury to consider in court. I have a sense that a similar set of comprehensive rules could be developed for journalism fairness. For now, these initial identifiers of unfairness helped me spot the presence of unfair sentences in articles. I now use a "Fairness" metric in addition to the Veracity scale and the Expression scale (a rough sketch combining all three sentence-level metrics follows the short list below). This metric only has two measures, and therefore requires a call to be made between:

  • Fair
  • Unfair
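Here is the sketch promised above: a minimal way to record all three judgments for each sentence and compute the share of unfair sentences in an article. The data shapes are mine; the scale values are the ones listed in this post.

```python
from dataclasses import dataclass
from typing import List

# Allowed values echo the scales above; plain strings keep the sketch readable.
VERACITY = ["True and Complete", "Mostly True/True but Incomplete",
            "Mixed True and False", "Mostly False or Misleading", "False"]
EXPRESSION = ["Fact", "Fact/Analysis", "Analysis", "Analysis/Opinion", "Opinion"]

@dataclass
class SentenceScore:
    veracity: str    # one of VERACITY
    expression: str  # one of EXPRESSION
    fair: bool       # the two-value Fairness call

def percent_unfair(sentences: List[SentenceScore]) -> float:
    """Share of sentences judged unfair, used below to profile a whole article."""
    return 100 * sum(1 for s in sentences if not s.fair) / len(sentences)

article = [
    SentenceScore("True and Complete", "Fact", True),
    SentenceScore("Mostly True/True but Incomplete", "Analysis", True),
    SentenceScore("True and Complete", "Fact", False),  # e.g. a true but irrelevant quote
]
print(round(percent_unfair(article), 1))  # 33.3
```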

By identifying a percentage of sentences that were unfair, I was able to gain an additional perspective on what an overall article was doing, which helped me create some more accurate descriptions of types of articles on the vertical quality axis. In my previous chart (second edition), the fact-to-opinion metric was the primary basis for the vertical ranking descriptions, so it looked like this:

Using all three metrics, 1) the Veracity scale, 2) the fact-to-opinion Expression scale, and 3) the Fairness scale, I came up with what I believe are more accurate descriptions of article types, which look like this:

As shown, the top three categories are the same, but the lower-ranked categories are more specifically described than in the previous version. The new categories are "Opinion; Fair Persuasion," "Selective or Incomplete Story; Unfair Persuasion," "Propaganda/Contains Misleading Facts," and "Contains Inaccurate/Fabricated Info." If you look at the news sources that fall into these categories, I think you'll find that these descriptions more accurately describe many of the stories within the sources.

Thanks for reading about my media categorizing endeavors. I believe it is possible (though difficult) to categorize the news, and that doing so accurately is a worthy endeavor. In future posts and chart editions I’ll dive into other metrics I’ve been using and refining, such as those pertaining to partisanship, topic focus (e.g., story selection bias), and news source ownership.

If you would like a blank version for education purposes, here you go:

Third Edition Blank

And here is a lower-resolution version for download on mobile devices: