Responding to recent critiques of iron fortification in India

e19brendan

Two recent EA Forum posts offered critiques of iron fortification in India, and particularly of the work of Fortify Health given its relationship with EA: "Targeted Treatment of Anemia in Adolescents in India as a Cause Area" and "Cost-effectiveness of iron fortification in India is lower than GiveWell's estimates," both authored by Akash Kulgod.

We heartily welcome the inquiry and critique, and Fortify Health will remain open-minded in pursuit of understanding the true impact of its work and in adjusting course accordingly. I am one of the co-founders of Fortify Health, and although I have since stepped back from day-to-day operations, I offered to address the concerns raised so that the active core team can continue to focus on their implementation and partnership work.

In this post, I intend to provide initial responses to key points made in Kulgod’s articles:

Is iron fortification expected to increase diabetes prevalence in India?
- Short answer: This should not be inferred from available evidence.
Is the proportion of anemia in India attributable to iron deficiency lower than we thought?
- Short answer: At first glance, the more recent estimates of iron deficiency among children that Kulgod cites are on average roughly similar to those incorporated in GiveWell’s 2019 CEA, and state- and age-specific prevalence could be incorporated into future models. GiveWell’s 2019 model largely relies on iron deficiency prevalence rather than iron deficiency anemia prevalence, but I’m uncertain whether, or for which outcomes, using one or both parameters would lead to the most accurate model.
Should hemoglobin cutoff values be changed, and how does this affect the expected impact of fortification efforts?
- Short answer: The distribution of hemoglobin levels among the Indian sample cited does not support the inference that outcomes associated with anemia are equivalent between populations at different threshold hemoglobin levels. The new information presented by the cited study does not weaken the expected impact of fortification.
Is a targeted intervention preferable to widespread fortification?
- Short answer: Targeted screening and treatment should be available as part of comprehensive primary care and possibly school-based programs, but this does not preclude rapidly scaling up fortification efforts. Further exploration of this intervention would be worthwhile.
Is micronutrient fortification ethical?
- Short answer: Extending the benefits of fortification is one part of an ethical imperative for health equity, and absence of fortification does not provide greater autonomy.

I recognize that this post only scratches the surface of complex issues, and it does not provide a comprehensive review of all available arguments and evidence that may be relevant. Its scope is substantially more limited, merely contributing to an evolving conversation started by Kulgod’s posts. I’m eager to get feedback from Kulgod and other readers, and hope we can collaboratively advance our own understanding and the EA community’s understanding of fortification efforts.

I. Is 10mg/day per capita iron fortification expected to raise prevalence of diabetes by 2-14% as apparently suggested by Ghosh et al. (preprint 2021)?

Initial reaction: the cited study’s conclusion does not seem to be supported by its results. The claimed correlation between high fasting blood sugar and discrete elevations in serum ferritin is supported only by a small effect size in a subgroup analysis of the highest wealth quintile, which is sharply discontinuous from the next wealth quintile. Furthermore, their model of the marginal increase in fasting blood sugar by serum ferritin level shows a minimal effect size, even in this subgroup, so I’m skeptical that the odds ratio presented for the binary outcome of high fasting blood sugar is even internally consistent. Inference of causation from any true correlation is limited by compelling confounders not explored in the study. It’s important to clarify that this “raise” is a relative, not absolute, difference in prevalence. My interpretation of this study does not suggest a causal link between iron fortification and hyperglycemia / prediabetes.

The conclusion that the odds of high fasting blood sugar (FBS) / prediabetes rise with serum ferritin (OR of 1.05) is misleading. That OR is found only in the highest wealth quintile and is inconsistent with the remaining wealth quintiles.
There are several reasons to be skeptical of non-null effects, detailed below.
Table 1 shows that the odds ratio for elevated fasting blood sugar (FBS) per 10µg/L increase in serum ferritin (SF) is only barely significant in the richest quintile (CI 1.01-1.08, versus an overall CI of 0.99-1.04). It’s unclear whether these CIs are adjusted for multiple comparisons (i.e. the more analyses you run, the more false positive findings you would expect by chance, so various approaches to correction are typically used).
It is suspicious that the next wealth quintile shows an effect essentially as strong but in the opposite direction (CI 0.92-1.00), which by the paper’s logic could lead you to believe that SF is protective against diabetes in this quintile. In comparing the two findings, whether a CI bound sits at 1.01 or 1.00 is fairly arbitrary and isn’t a good reason to hold the richest quintile data in higher regard than the next quintile data. The two findings are contradictory.
If there is a true correlation in this subgroup, elevated serum ferritin (SF) may be the result of common factors that also independently contribute to risk of hyperglycemia, hypertension, and hypercholesterolemia (confounding).
Furthermore, the “increased prevalence of diabetes” in table 2 is potentially easy to misunderstand, as these figures seem to reflect relative risks or odds ratios, not absolute risk differences (in other words, they are arguing that the risk or odds are 1.02-1.14 times as large as they would otherwise be, not that 2-14% more of the population will develop diabetes).
However, the clinical effect is negligible. The linear additive model (adjusted) on page 10 (and in aggregate in the figure on page 30) reads 0.14mg/dL increase in FBS per 10µg/L increase in SF. It isn’t immediately clear how that corresponds to the odds ratios of table 1 or the percent change in prediabetes by state in table 2. Nevertheless, on page 8, it seems that fortification was assumed to lead to an SF increase of 7.5-11.5 µg/L (roughly 10µg/L). Taken literally, their model suggests a clinically insignificant increase in FBS (e.g. from an average of 90.8mg/dL to ~90.94mg/dL overall; no baseline is available for the richest quintile, but if it is assumed to be the same, from 90.8mg/dL to 91.18mg/dL). The conclusions of the study hinge on the proportion of people who cross the threshold into prediabetes, e.g. from 99.87->100.01 or 100.00->100.14, but health outcomes from this shift would be undetectable. This is illustrated below.

I tried to roughly estimate how shifting a normal distribution of FBS from mean 90.8 to mean 90.94 or 91.18 affects the proportion above the 100mg/dL threshold they set for prediabetes (they mention the overall sample had 13% prevalence, so I used that to derive σ=8.17) using Matt Bognar’s tool.

This showed prediabetes, defined by the study as FBS>100mg/dL, rising from 13% to 13.373% overall (absolute difference 0.373 percentage points, odds ratio 1.03, relative risk 1.03) and to 14.017% in the richest quintile (absolute difference 1.017 percentage points, odds ratio 1.09, relative risk 1.08). My estimates here may differ from theirs given the adjustments in their model and departures from a normal distribution. It should be stressed that Table 2 must be reporting relative rather than absolute differences in prevalence. An “elevation” of 3% means something like 1.03 x baseline prevalence.
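For readers who want to reproduce the rough estimate above without the online tool, here is a minimal sketch. It assumes, as the estimate does, that FBS is normally distributed with a baseline mean of 90.8 mg/dL and that 13% of the sample sits above 100 mg/dL. The function and variable names are purely illustrative, and the output differs slightly from the figures quoted above because σ is not rounded to 8.17 here.

```python
from statistics import NormalDist

THRESHOLD = 100.0      # mg/dL, the study's prediabetes cutoff
BASELINE_MEAN = 90.8   # mg/dL, overall sample mean
BASELINE_PREV = 0.13   # reported share of the sample above the cutoff

# Solve for the standard deviation that puts 13% of a normal
# distribution with mean 90.8 above 100 mg/dL (assumption: normality).
z = NormalDist().inv_cdf(1 - BASELINE_PREV)
SIGMA = (THRESHOLD - BASELINE_MEAN) / z


def prevalence(mean: float) -> float:
    """Share of a Normal(mean, SIGMA) population above the cutoff."""
    return 1 - NormalDist(mean, SIGMA).cdf(THRESHOLD)


def shift_summary(new_mean: float) -> dict:
    """Compare prevalence at new_mean against the baseline mean."""
    p0 = prevalence(BASELINE_MEAN)
    p1 = prevalence(new_mean)
    return {
        "prevalence": p1,
        "absolute_difference": p1 - p0,
        "relative_risk": p1 / p0,
        "odds_ratio": (p1 / (1 - p1)) / (p0 / (1 - p0)),
    }


if __name__ == "__main__":
    print(f"sigma = {SIGMA:.2f}")
    for label, mean in [("overall", 90.94), ("richest quintile", 91.18)]:
        summary = shift_summary(mean)
        print(label, {k: round(v, 4) for k, v in summary.items()})
```

The absolute differences stay close to or below one percentage point, while the relative measures land between roughly 1.03 and 1.09, which is the distinction drawn above for Table 2.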

It’s not clear why their stated OR among “richer” people is 0.96 in table 1 while all changes in prevalence listed for states in table 2 among “richer” people are positive, when they would be expected to be negative, so I’m highly suspicious of the methods used to generate table 2. I suspect peer review prior to publication would require addressing these points and restating the conclusions accordingly.
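On the multiple-comparisons caveat raised for Table 1, here is a rough sketch of why an uncorrected subgroup result deserves caution. It assumes five independent tests (one per wealth quintile) at a 5% significance level. Both the independence and the choice of a Bonferroni correction are illustrative assumptions on my part, not something the preprint reports.

```python
ALPHA = 0.05   # conventional per-test significance level (assumed)
N_TESTS = 5    # one test per wealth quintile (assumed independent)

# Probability that at least one test is a false positive
# when every null hypothesis is actually true.
family_wise_error = 1 - (1 - ALPHA) ** N_TESTS

# Bonferroni correction: per-test threshold that keeps the
# family-wise error rate at or below ALPHA.
bonferroni_alpha = ALPHA / N_TESTS

print(f"chance of at least one false positive: {family_wise_error:.1%}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.3f}")
```

Under these assumptions, more than one in five such analyses would show at least one “significant” quintile by chance alone.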

II. What proportion of anemia in catchment areas is attributable to iron deficiency or benefits from increased dietary iron?

Initial reaction: Better estimates of iron deficiency anemia (IDA) and iron deficiency (ID) could have a significant effect on models for Fortify Health's effectiveness. If the prevalences of IDA or ID are much lower than initially modeled, our benefits may be overestimated. GiveWell’s 2019 model seems to incorporate estimates of dietary iron deficiency roughly similar to what would be inferred from more recent data cited by Kulgod, but new estimates should be incorporated into future models. Further questions arise about when iron deficiency versus iron deficiency anemia parameters should be used in models. Regardless, addressing the root causes of anemia including poverty remains essential.

The Kulkarni et al. (2021) estimates of iron deficiency among children seem to add substantial data to this discussion, but this doesn’t directly point us to the proportion of anemia that could be corrected by iron repletion in children (or adults). The state-specific breakdown in figure 2 could be useful in evaluating programmatic effects.
The most recent GiveWell cost effectiveness analysis (CEA) incorporates estimates of dietary iron deficiency prevalence of 22.5% under 5y and 20.4% 5-14y, which is roughly comparable to Kulkarni et al. (2021) estimates of iron deficiency using serum ferritin adjusted with modified BRINDA (31.5% 1-4y, 15.5% 5-9y, 20.9% 10-19y). The latter's estimates are even higher in some of the states where Fortify Health’s work is concentrated. Direct comparison is difficult due to distinct binning, but an updated CEA could reasonably incorporate Kulkarni et al. (2021) age brackets for modeling effects among children. It is not immediately clear if differences in methods would require additional adjustment to infer impact on modeled outcomes or how it could improve iron deficiency estimates among adults.
It isn't immediately clear when models of Fortify Health's effectiveness should use iron deficiency or iron deficiency anemia prevalence. Given that data surrounding outcomes of interest may include one or both parameters, it may be prudent to employ one parameter or the other or have a reasonable conversion factor relevant to the areas in which we work (i.e. the proportion of anemia that is due to iron deficiency).
Engle-Stone et al. (2017) / BRINDA article unfortunately doesn’t include data from India but does illustrate that even where infection burden is high, iron deficiency still substantially contributes to burden of anemia (pasting their Figure 1 below).
The Onyeneho et al. (2019) OR for iron deficiency among Indian children with and without anemia in table 3 is hard for me to interpret (I don’t understand the asterisk comment) but it seems essentially close to 1, which would imply in this sample iron deficiency and anemia are not well correlated at all. Perhaps this is in part due to the simplification into binary variables in which the prevalence of iron deficiency is very high. In contrast, table 4 shows significant contribution of iron deficiency in logistic regression, which I don’t know how to reconcile with table 3. It is not clear to me what metric is used to define iron deficiency in this study and whether that truly reflects a deficit in iron intake and absorption or some other process leading to iron sequestration (e.g. inflammation/infection).
I’m not sure whether there is good evidence about the impact of fortification with highly bioavailable fortificants (like NaFeEDTA) among people with chronic parasitic infections of the gut. While the etiology of their anemia may be infection that causes occult blood loss in the intestines, this may lead to iron deficiency due to losses if there is not adequate nutritional replenishment. Obviously these people need clean water and deworming treatment, but they may also benefit from iron.
I agree with the conclusion that addressing other constraints related to poverty is essential. I don’t know how to interpret the findings that lower wealth quintiles had less iron deficiency, but it does open the question of whether there is overcorrecting for covariates in the adjusted models.

III. Should an alternative to WHO anemia cutoffs be used as suggested by Sachdev et al. (2021)?

Initial reaction: It is likely that the “healthy” sample used is still at increased risk for sequelae of anemia (whether chronic, such as effects on cognitive development and physical endurance, or in terms of risk profile for future illness) even if they are not demonstrating illnesses that meet the exclusion criteria. Revisiting cutoffs more broadly could have various advantages, but we should use comparable criteria to relate the health impact observed previously and in other places to the areas where Fortify Health works. Other than altitude and known genetic variants (e.g. in sickle cell and thalassemia), there is no compelling genetic reason to believe that lower hemoglobin levels would present a lesser risk to Indians than to North Americans or Europeans. Whether lower hemoglobin could provide certain advantages is a separate question not addressed here.

While I’m sympathetic to concerns about imposing inappropriate standards of “normal,” I am also wary of redefining the threshold for acceptable health on regional norms that could inappropriately accept a lower standard under false claims of biological difference.
For example, in the United States, the standard measure of estimated kidney function (eGFR) was for years adjusted by race under the faulty logic that black people have more muscle mass on average, so higher levels of creatinine (an expected product of muscle breakdown) should fall within the normal range. There are significant consequences of this given that eGFR is used to determine eligibility for transplantation and directs medical therapy. Only recently have US doctors acknowledged that such a system was not accurately capturing poor kidney function and the risks and treatment indications associated with it, systematically introducing a lower standard of care for black patients. Examples like these should make us wary of insisting biological difference between populations, rather than other social structures, defines a “normal parameter.”
Stunting provides another example. Let’s suppose that in India, the average adult male height is 165cm whereas in Finland it is 180.7cm (Wikipedia). We might be tempted to conclude that a genetic difference alone is responsible. But there is also a significant effect of stunting due to nutritional differences. As it turns out, childhood stunting in India has improved dramatically (62.7%->34.7%) over the last 20 years (World Bank) not because genes changed but because society changed. Even a “healthy” subgroup defined to establish a normal height for an Indian may show a distribution that is shifted towards lower values because of stunting, yet the distribution would be shifted towards higher values with nutritional improvement that leads to increased height and decreased risks of illness that may not lead to exclusion from a “healthy” sample.
My main concern with this study is with the definition of “healthy.” Although the study excluded participants with vitamin deficiencies and other conditions, it’s not clear to me that those who remain would not be healthier if they had higher hemoglobin levels. I’d be most interested in health outcomes at various hemoglobin levels, e.g. demonstrating that the risks associated with a hemoglobin of 6g/dL in the US are similar to the risks associated with a hemoglobin of 4g/dL in India. This may be harder to measure around the threshold for the lower limit of normal as the risks are lower. Certainly, people do adapt to chronic anemia to a great extent, so it may be hard for a given person to “feel” much different living at a stable level of 10 vs 11 g/dL, though on average population health outcomes may reflect meaningful health consequences.
The effects of anemia on various health outcomes have often been studied using the existing cutoffs (whether they are ideal or not), so we would expect the measured impacts to apply to people with hemoglobin levels up to the higher threshold used historically rather than only up to a revised threshold.
There is a lot still to learn about the health consequences of chronic anemia. For certain outcomes like maternal mortality, it might make sense to think about higher Hgb as working to provide people a buffer so if they do bleed a lot during childbirth, they have enough reserve not to die - most people will live regardless of whether they have a Hgb of 7 vs 8 g/dL, but on the margin, more people with a Hgb of 7 g/dL are likely to die. Maybe there’s a moderate difference between 10 and 11, even if both people would be “healthy” by Sachdev’s cutoffs, since the negative outcome is only revealed during a later childbirth. Alternatively, it’s possible that further research will reveal there really isn’t a meaningful difference between a Hgb of 11 and 12 g/dL, but this would not be expected to uniquely apply to Indians.
For other outcomes such as exertion tolerance, the linear relationship may continue even above the threshold (e.g. why professional athletes train at altitude or “dope” with EPO), and this could plausibly be true of worker productivity in people whose livelihoods depend on their efficiency and endurance doing manual labor. You could look at whether people at WHO vs. Sachdev cutoffs have any differences in these kinds of health outcomes to conclude whether it’s an appropriate threshold for “healthy” or if the proposed healthy threshold is suboptimal.
In understanding why there could be population differences in hemoglobin levels among healthy people, I suspect differences in meat consumption play a significant role. We should be careful not to pathologize not eating meat - that is not “the problem,” but this difference could explain differences in distributions of hemoglobin between populations. It’s also possible that people who don’t eat meat have more to gain from fortification programs.
I suspect for the ultimate health outcomes sought by fortification programs (not just intermediate outcomes of the hemoglobin level itself), the benefits of fortification have a nonlinear effect on people with different baseline hemoglobin levels with diminishing returns as people are more iron replete (both because they will absorb less iron and because they don’t benefit greatly from having additional iron). You could argue that inflection point happens at a lower hemoglobin level. More likely, it happens at a similar hemoglobin level and that that level could be debated in both populations.

IV. Targeted treatment

Initial reaction: Targeted treatment is likely worth doing and can occur in parallel to population-level interventions such as fortification, which is likely more rapidly scalable.

Aside from implementation difficulties and limited available data, targeted treatment is conceptually very appealing. It's great to know that this model is being implemented and studied.
Individualized treatment would be strongest as part of a comprehensive primary care system that would not be limited to single vertical interventions, as many people targeted may have other primary care needs, and investment in the primary care system could have many positive effects (some more or less easily measured). Fortification could be critiqued on the same grounds, and it likewise should not preclude strengthening of healthcare systems.
Depending on the treatment approach, there may be additional challenges in completing treatment (e.g. if short term GI side effects or limited buy-in make adequate treatment uncommon even when testing and talking phases of "T3" approach are undertaken). I haven't reviewed available data on how effective T3 or similar interventions are.
Costs of this intervention will require further examination, but I appreciate Kulgod's estimates. I haven't looked into relevant evidence of duration of iron repletion, but I imagine most of the people who receive targeted treatment would require repeated interventions unless a fundamental cause of iron deficiency is addressed.
Targeted treatment may be significantly harder to scale up than fortification, but that doesn’t mean it is not worth doing. A portfolio approach in which these interventions are scaled up simultaneously seems ideal, so long as fortification is deemed safe and effective.

V. Is food fortification ethical?

Initial reaction

I think it’s good to be humble about this. Having considered it carefully myself, with a working understanding that the totality of the evidence points towards meaningful benefit and minimal harms (an empirical factor challenged by Kulgod's posts), I feel confident that wheat flour fortification with iron, folic acid, and vitamin B12 is an ethical practice.
A detailed examination of relevant ethical considerations is worth further conversation and I don’t claim my reaction here will suffice.
The most compelling arguments against fortification being ethical might include:
1. Causes significant harm (consequentialist/deontological)
  1. But the totality of evidence points towards significant benefits and limited to no harms.
2. Lacks consent (deontological/autonomy)
  1. But the status quo, with no fortification and limits to educational opportunity (which might otherwise empower people to understand the benefits of fortification and choose it), does not reflect a state of greater autonomy.
  2. Labeling and lack of market saturation technically allow those who actively object to fortification to choose an alternative.
  3. A higher standard should not apply to fortification than applies to other food processes or additives (e.g. lack of informed choice about which species of wheat is grown, what fertilizer is employed, or what preservatives are used in processed foods).
On the other hand, there are strong arguments in favor of fortification:
1. Significant benefit (consequentialist/deontological)
2. Extends standard of care sought for wealthy to all (equity)

I'm grateful for the time and attention that Akash Kulgod has dedicated to reviewing these issues, and hope to continue an open yet serious conversation on the nature of the problem of anemia and iron deficiency in India, the intervention we can employ, and the concerns that arise. An improved understanding of each of these components will surely guide Fortify Health's future work and the work of other organizations striving to improve health.

Akash Kulgod 2y 17

Thank you for providing such an in-depth response to my posts Brendan. I really appreciate the sentiment and tone, constructive dialogue like this is what drew me towards EA and motivates me (and other folks I bet) to write on the forum.

Just about to start the Annapurna Circuit so it'll be a few weeks before I can continue the conversation. I look forward to the unfolding of a spontaneous adversarial collaboration and the advancement of interventions that raise the physical and cognitive floor potential of the people of India. Do hope more voices join and flesh out the many threads at play already.

e19brendan 2y 4

Gladly. I'm also looking forward to learning from others on the forum, and I'm happy to speak 1:1 if you'd like!

Sarah Cheng 2y 1

People interested in global health & development and this post might be interested in applying to the Senior Researcher role at GiveWell, a non-profit focused on helping people do as much good as possible with their donations.

This is a test by the EA Forum Team to gauge interest in job ads relevant to posts - give us feedback here.

bruce 2y 2

I weakly think a dedicated thread to job offers would be better - active jobseekers can more easily go through a thread, and this keeps the noise : signal ratio in comment sections lower.

It also means orgs wanting to hire will have a go-to place, instead of trying to advertise on every relevant post, which would make the comment section a less enjoyable place for me personally if everyone were to do this.

(I think if this is a test for job ads that are put out ONLY by the EA Forum team, that'd be better, but still probably worse than a dedicated thread)

Sarah Cheng 2y 6

Thanks for the feedback! This is one of multiple job-related tests we're running on the forum, to see if we can find something impactful to build. We did try out your suggestion in the form of the Who's hiring? (May-September 2022) thread, and we're still analyzing the results. The difference here, as Lorenzo pointed out, is that we could potentially capture people who are not actively looking for a job but would consider applying if they were made aware of relevant opportunities.

We're testing this via comments because it's a cheap MVP - no coding necessary! :) But if we were to build out a feature it would likely be separate from the comments section, and come with an option to hide it.

Lorenzo Buonanno 2y 4

I disagree, at least in the case of "job posts only by the EA Forum Team, only for particularly impactful positions".
The noise is small, the potential impact is high.
I also don't think people that might be a good fit for these highly competitive particularly impactful roles would necessarily be "job seekers" that are actively looking for a job.

It will make the forum very slightly less enjoyable indeed, but I think it's net positive in terms of counterfactual impact.
