As an industry, we tend to get anxious, stressed and riled up when a new scientific study about pet food or an opinion-based editorial gets published. For those of us with science backgrounds, we like to take a brief pause, breathe deeply, then read and reread the article (or at least we think we do). We do this to evaluate elements like study design and research gaps, and to determine whether we agree with the summary, among other tricks of the trade. This is the practice of critically reading an article.
Often, we read and immediately react with emotion instead of evaluating if an article will even have a true impact in the field. Not all articles are good science or even based in the realm of science. Just because the article is “scientific” in nature, it does not mean it will make an impact within the peer community for which the article is written. For an article to have an impact, it not only has to be read, it also needs to be cited in other scientific publications.
Review articles are the most often cited scientific work because people trust that the initial writer did their homework and assume they would draw the same conclusion. As we now know from the infamous 2018 JAVMA Commentary/Review, this is not always the case. A recent survey found that 80% of veterinarians consider the American Veterinary Medical Association (AVMA) the “most reliable source of information”; however, its journal, JAVMA, ranks 58th (out of 254) among veterinary journals worldwide. In other words, JAVMA may be a trusted source among the veterinarians who read it but, among its peer group, it does not rank within the top 20% of journals that are impactful and worth citing.
You can read below about how journals are evaluated for impact and the quality of the research they publish. The fact is, if an article gets published in a “good” journal, it does not mean it is good science. Take, for example, a study published in PLOS ONE in May 2020 titled, “Development of plasma and whole blood taurine reference ranges and identification of dietary features associated with taurine deficiency and dilated cardiomyopathy in golden retrievers: A prospective, observational study.”
PLOS ONE seems to be ranked well for impact and quality (see Table 1), so this should be a good study, right? Wrong!
Those of us who have served as reviewers for journal manuscripts would have seen fatal flaws in this study and simply rejected the manuscript. Based on study design, it was destined for failure and could not be corrected or resuscitated. Why? First, scientific studies need to be well controlled. This one included various food treatments, study populations, environments and other things that can confound a study.
The study was not controlled and has not one, but many fatal flaws, including:
1. Treatment groups lack control foods and contain too many confounding variables (i.e., too many moving parts):
a. Traditional diet (TD) criteria – Companies exceeding US$2 billion in global sales for 2018 and making grain-inclusive kibble that did not have legumes or potatoes in the top five ingredients.
b. Non-traditional diet (ND) criteria – Kibble or raw food diet that is grain free, includes legumes or potatoes in the ingredients list or is manufactured by a small pet food company with less than US$1 billion in sales for 2018.
2. Treatment groups are not really treatment groups, and foods were not categorized correctly per their own treatment groups:
a. Blue Buffalo was listed as ND even though it had more than US$1 billion in sales. Its Lamb and Rice formula is grain inclusive and does not have legumes in the top five ingredients. It also includes DL-methionine. Based on these facts, this food does not fit in either the TD or ND category.
b. Some of the TD foods contain soybeans in the top five ingredients and do not contain taurine or DL-methionine. Soybeans are legumes! (I spent five years of graduate school studying soy in dogs.) Despite this, because the companies make more than US$2 billion in sales, the researchers classified these foods as TD rather than ND. However, based on their own treatment groups, these foods do not fit in either category.
c. One of the foods in the TD group is a puppy food, not adult. Again, this does not fit into either category. Additionally, it brings into question the age range of the dogs in the study. Does age impact taurine status?
d. Maybe someone missed the 2015 press release that Merrick (ND group) is owned by Purina (TD group)?
e. Product form is not controlled across both treatment groups – kibble only (TD) vs. kibble, canned, raw, etc. (ND). Therefore, these groups’ outcomes cannot be reasonably compared.
f. Darwin’s Natural Selection Dog Food (13A in ND group) never had a pork formulation. Why is one of the foods listed as pork? One needs to question the data integrity.
g. Prescription foods (OM and EN) are not traditional diets. They are designed for dogs with specific health conditions and cannot be purchased without veterinary authorization. How are they comparable to the others?
3. Food intake was not controlled. The authors said, “Dogs in both groups were typically fed fewer calories when compared to sedentary or active MER calculations.” What seems like a benign statement has a big impact on the study design and results. If the authors were able to determine caloric intake of the dogs, then they would have been able to determine protein intake because the foods they fed would have had both pieces of information. Maybe this is a study about protein intake and taurine status vs. how much sales volume impacts taurine status? Also, what was the impact of treats if they were consumed properly (less than 10% of calories per day)?
4. Failure to recognize blood and sampling issues for taurine analysis. In their prior PLOS ONE study, the authors recognized the following:
“In addition, sample collection and the methodology used for taurine analysis could not be standardized as this was an observational study mimicking clinical practice. For example, taurine analysis was performed at multiple reference laboratories and either plasma, whole blood, serum or a combination of whole blood and plasma taurine concentrations were obtained. Currently, it is recommended that both whole blood and plasma taurine concentrations be obtained together, and that a diagnosis of taurine deficiency be made if either of those values are low.”
Was that fixed for this new study? Better yet, has the veterinary community determined a way to measure taurine more consistently and accurately? Maybe these articles simply taught us we need to measure taurine consistently and that is why we do not see true trends or differences within real treatment groups?
5. Failure to acknowledge all funding among the authors. How do we know if there are conflicts of interest and potential for veterinary bias? All three companies in the study with more than US$2 billion in sales are funding feeding programs for students, research studies and other programs at the authors’ university; however, PLOS ONE does not capture this information.
6. Poor statistical analysis. Based on treatment groups, the statistics performed are inappropriate for this type of study. It would have been interesting if the authors had used principal component analysis. Then they could have seen how individual animals grouped by individual food types, ingredients and even manufacturers.
To start this post, I discussed the impact of journals. In general, scientific authors wanting to submit their articles to a journal should choose one with the highest impact factor (IF) in their area of expertise. IF is used to rank journals of similar content (e.g., medicine or nutrition) based on a calculation of how often a journal’s article is cited in other people’s research. In today’s digital era, this is equivalent to the number of clicks and shares of an article.
The IF for a given year is calculated as the number of citations received in that year by articles the journal published during the two preceding years, divided by the total number of articles the journal published during those same two years. Said simply, it is the average number of times a recent article from that journal is cited. A sample calculation:
Impact factor (2019) = (citations received in 2019 by articles published in 2017 and 2018) / (total articles published in 2017 and 2018)
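For readers who prefer to see the arithmetic, here is a minimal sketch of that calculation in Python. It assumes the usual two-year definition (citations received in 2019 by articles the journal published in 2017 and 2018). The function name and all of the numbers are made up for illustration and do not describe any real journal.

```python
def impact_factor(citations_in_year: int, articles_prior_two_years: int) -> float:
    """Two-year impact factor: citations received this year to articles
    from the two preceding years, divided by the count of those articles."""
    if articles_prior_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_in_year / articles_prior_two_years


# Hypothetical journal (illustrative numbers only).
citations_2019 = 1200   # citations received in 2019 to 2017-2018 articles
published_2017 = 380    # articles published in 2017
published_2018 = 420    # articles published in 2018

if_2019 = impact_factor(citations_2019, published_2017 + published_2018)
print(f"Impact factor (2019) = {if_2019:.2f}")  # 1200 / 800 = 1.50
```

Notice that the result is an average across the whole journal, so it says nothing about how any single article in that journal performed.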
If a journal has a high impact value, it does not mean it always publishes high-quality research. IF is only an indicator of the journal’s overall performance. Historically, a journal would increase its IF by publishing high-quality articles that, before the digital era, were sought, cited and often discussed within a person’s field of expertise (think old-school paper and ink).
Unfortunately, in modern times, journals can easily manipulate the IF by streamlining the submission process, publishing more review articles (people love to cite review articles vs. getting the original data), eliminating submission fees, citing your own work (or peer work), making articles easy to find (via internet searches) and providing open access, just to name a few tactics.
Thus, looking at IF alone will not give you a true sense of a journal’s reputation or the quality of the scientific articles you may be reading. Readers should look at all indices and indicators for a better overall picture of the quality of the journal. For example, what are the H-index and SJR indicator in addition to the IF?
A journal’s H-index is another measure to determine its quality, and even that of an author’s work. The H-index is the largest number H such that H papers have each been cited at least H times. For example, if five papers were each cited at least five times, and there were not six papers cited at least six times each, the H-index would be 5. Many publications and authors like to use the H-index because it is easy to look up in search engines like Google Scholar.
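The H-index is easy to work out by hand, and a short Python sketch makes the rule concrete. This is only an illustration of the definition, not how Google Scholar or any other database computes it. The function name and the citation counts are hypothetical.

```python
def h_index(citation_counts):
    """Return the largest H such that H papers have at least H citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # every later paper has even fewer citations
    return h


# Hypothetical citation counts (illustrative only).
print(h_index([5, 5, 5, 5, 5]))    # 5: five papers cited five times each
print(h_index([10, 8, 5, 4, 3]))   # 4: only four papers have >= 4 citations
```

Sorting first keeps the logic simple: once a paper falls below its rank, no paper further down the list can raise the score.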
Unfortunately, H-indexes can also be easily manipulated because of things like self-citation and colleagues who routinely cite each other’s work. Also, you can have an artificially high H-index if you are a “one hit wonder,” akin to a song like Baha Men’s “Who Let the Dogs Out?”
To avoid potential journal and author inflation of the two prior measurements, other indices should be considered. The SCImago Journal Rank (SJR) and Source Normalized Impact per Paper (SNIP) indicators were designed to take into account a journal’s reputation, where the citations come from and the average citation impact of a publication in the journal, and to correct for different citation practices between fields of expertise (allowing for between-field comparisons).
See Table 1 for the IF, H-Index, SJR and SNIP of some journals that publish scientifically peer-reviewed animal nutrition studies.
We need to be able to critically review papers, even if the “good” journals fail to do their job. In the case of the PLOS ONE study, garbage in resulted in garbage out. When nutritionists and veterinarians see articles like this, there should be open dialogue and discussions about the article. (Perhaps a journal club?) This allows people to talk about the positives and negatives of the article instead of jumping to a firm conclusion.
If we have learned nothing else from COVID-19, we should have learned that scientists aren’t always right, data is open to interpretation and, more importantly, there should be open dialogue, not walls built to protect one’s own self-interests.
You can do something when you spot “bad” science with fatal flaws. I did! Simply email the editor (in this case, firstname.lastname@example.org) with the article link and title, and list all the flaws. Reputable journals will open a case number and begin the review process. Others will simply say, “It is an opinion piece and you can write your own.” For the PLOS ONE article, my case number is 06641324; feel free to reference it if you send an email.
Finally, I would be remiss if I didn’t share what a high impact journal would look like in Table 1.
New England Journal of Medicine’s ratings: 40.1 (IF), 987 (H-Index), 18.29 (SJR) and 13 (SNIP)