Why current language models should align with the consideration of all sentient beings

11 Mar 2026

Artificial intelligence is increasingly influencing areas that directly affect sentient beings. Large language models (LLMs), integrated into search engines, assistants, educational environments, and professional tools, do more than just generate text: they shape what information circulates, what arguments seem reasonable, and what interests are considered relevant.1

Therefore, when these systems are deployed on a large scale, it is crucial to examine what values they embody. They are not neutral tools.2 They contribute to structuring implicit moral frameworks.

In practice, most current approaches to AI alignment focus almost exclusively on human interests. “Harm” is usually understood as harm to human beings, institutions, or property. Other animals (even sentient ones) and other possible forms of sentience are rarely included as subjects with their own interests.3

This exclusion is not technical, but normative. Deciding who matters morally in alignment is tantamount to deciding which types of harm will be systematically avoided and which will remain invisible. When the interests of animals are not part of the principles guiding training, the systems have no incentive to take their suffering seriously or to avoid reinforcing attitudes and practices that normalize it.

What is AI alignment and why is it not neutral?

Alignment aims to ensure that AI systems not only function correctly from a technical standpoint, but also act in accordance with certain ends considered acceptable. In the case of LLMs, simply generating coherent text is not enough: they are expected not to promote violence, discrimination, or serious harm.

There are different methods to achieve this:

· Constitutional alignment: The model is trained using explicit principles that guide which responses it should produce and which it should avoid. The system reviews and reformulates its own responses in light of these principles, incorporating them as an internal framework for behavior.4

· Reinforcement learning from human feedback (RLHF): Different responses of the model are compared according to criteria such as usefulness or absence of harm, and the model learns to maximize those preferences.5

· Deliberative alignment: The system internalizes rules and normative priorities, and learns to apply them during the generation of responses.6
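To make the role of value judgments concrete, the comparison step in RLHF can be sketched in a few lines of Python. This is a minimal illustration, not any particular developer's implementation: it assumes the pairwise (Bradley-Terry style) preference loss commonly used to train reward models, and the function name `preference_loss` and the two reward scores are hypothetical.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Computed as softplus(-margin) in a numerically stable form.
    """
    x = -(reward_chosen - reward_rejected)
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical reward-model scores for two candidate responses.
score_a = 1.0  # response that acknowledges the suffering involved
score_b = 0.2  # response that omits it

# If the evaluator guidelines prefer A, the current scores already agree
# with the label, so the loss is small.
loss_if_a_preferred = preference_loss(score_a, score_b)

# If the guidelines prefer B, the same scores disagree with the label,
# so the loss is larger and training would push the scores the other way.
loss_if_b_preferred = preference_loss(score_b, score_a)

print(loss_if_a_preferred < loss_if_b_preferred)  # True
```

The arithmetic is the same in both cases. Only the evaluator's label, which encodes a judgment about what counts as the better response, determines the direction in which the model is pushed.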

Although they differ technically, all these approaches share one feature: they incorporate value judgments. Determining what counts as harm, which risks to prioritize, and which objectives are legitimate involves adopting ethical positions.

Therefore, there is no neutral alignment. Even when it is claimed that AI should reflect “human values,” someone decides which values and how to translate them into operational rules.

If systems are trained to avoid racism or sexism, but not to take into account the suffering of non-human animals, it is implicitly assumed that their interests can be ignored. The same applies to other possible forms of sentience, present or future, whose interests could be excluded by default. This exclusion is a form of discrimination: it leaves out individuals capable of suffering simply because they do not belong to the human species.

Given the growing role of these systems in the production of information, this omission may contribute to consolidating a moral framework that renders most sentient beings invisible.

The problem of anthropocentrism in AI

Some argue that by aligning with human values, AI will indirectly incorporate a certain concern for non-human animals. However, our societies are deeply marked by speciesism, the discrimination that consists of giving less weight to the interests of those who do not belong to the human species.

Although extreme cruelty is condemned, practices that cause intense and prolonged suffering to a vast number of animals each year continue to be accepted. Human benefits, such as certain food preferences or conveniences, are often prioritized over serious harm to other individuals capable of experiencing pain or pleasure.

If alignment simply replicates dominant preferences, systems can amplify existing inequalities in moral consideration. This can lead to technological dynamics that entrench harmful practices without acknowledging the suffering involved.7

Technologies tend to stabilize and expand the values of the context in which they develop. An anthropocentric alignment framework risks consolidating this exclusion on a large scale.

There are, however, initial steps that show that broadening the moral framework is feasible. For example, the company Anthropic has included in the constitution that guides its Claude model an explicit reference to the “welfare of animals and all sentient beings.” Although this step is limited, it demonstrates that integrating these interests into the alignment is not technically unfeasible.8

Inclusion of all sentient beings in the alignment

If we accept that morally relevant interests are those of beings capable of suffering or experiencing joy, then it is consistent that the alignment of AI should include all sentient beings, not just humans. The practical question is not whether this should be done, but how to do it realistically. It can be approached gradually, with different levels of moral integration.

Strong or complete alignment

At the most demanding level, AI systems would robustly integrate the interests of all sentient beings into their objectives, accepting significant trade-offs in other goals in order to avoid harm.

This horizon is consistent with the idea of equal moral consideration, but at present it is difficult to implement widely due to strong institutional and cultural resistance to deep structural changes.

Basic alignment

A more immediate and viable step is to adopt a minimum but meaningful standard:

Principle of minimizing avoidable harm: AI systems should avoid causing significant suffering or frustrating important interests of sentient beings when doing so does not involve significant sacrifices in achieving other goals.

This level does not radically transform the system’s purpose, but it introduces a clear operational constraint: when harm can be avoided with minor adjustments (slight variations in efficiency, time, or resources), it should be avoided.9

Here, the change affects real decisions. For example:

· Adjusting autonomous vehicle routes if the cost is minimal

· Declining to optimize production processes in ways that increase suffering when low-cost alternatives exist

· Including the interests of sentient beings as a relevant variable when avoiding harm does not seriously compromise other objectives
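The principle of minimizing avoidable harm can be read as a simple decision rule. The Python sketch below is one possible reading under stated assumptions, not an established method: the `Option` type, the `choose` function, the cost and harm numbers, and the 5% cost tolerance are all hypothetical, and costs are assumed to be non-negative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    cost: float  # e.g. travel time in minutes (hypothetical unit)
    harm: float  # estimated harm to sentient beings (hypothetical score)


def choose(options, tolerance=0.05):
    """Among options costing at most (1 + tolerance) times the cheapest one,
    return the option with the lowest estimated harm (ties broken by cost)."""
    if not options:
        raise ValueError("at least one option is required")
    cheapest = min(o.cost for o in options)
    limit = cheapest * (1 + tolerance)
    affordable = [o for o in options if o.cost <= limit]
    return min(affordable, key=lambda o: (o.harm, o.cost))


# Hypothetical routes for an autonomous vehicle.
routes = [
    Option("direct", cost=20.0, harm=3.0),
    Option("detour", cost=20.5, harm=0.5),
    Option("long detour", cost=30.0, harm=0.0),
]

print(choose(routes).name)  # detour
```

With a tolerance of zero the rule reduces to pure cost minimization and selects the direct route. The tolerance is where the normative choice enters: it states how much efficiency may be given up to avoid harm.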

Minimal discursive and cultural alignment

Here the focus is not on directly regulating physical action, but on how the system morally frames sentient beings and the practices that affect them. This involves avoiding the trivialization, objectification, or omission of their suffering, as well as detecting and correcting speciesist or substratist10 biases in responses, and explicitly incorporating into alignment principles that harm to sentient beings is not morally irrelevant.

Each level can be implemented independently and does not require different tools but rather different normative content within already existing alignment mechanisms. Model constitutions, evaluator guidelines, internal safety specifications, or “do no harm” criteria can incorporate varying degrees of moral consideration. If they substantially redefine the objective function to give comparable weight to the interests of all sentient beings, they correspond to the most demanding level. If they introduce the rule of avoiding significant suffering when it can be avoided at low cost, they apply basic operational alignment. And if they merely prevent the trivialization or objectification of harm to animals, they operate at the discursive and cultural level. What is decisive is not the technical instrument, but the moral scope incorporated into it.

In addition to these three levels, actions can be taken to consolidate the process:

1. Develop specific tools to measure speciesist and substratist bias in LLMs

2. Conduct independent audits on the impact of AI systems on the interests of sentient beings

3. Incorporate adversarial testing aimed at detecting recommendations that increase suffering

4. Demand public transparency regarding the normative principles guiding alignment

5. Include explicit provisions regarding animal interests in national and international AI regulatory frameworks

6. Fund interdisciplinary research that connects welfare science and AI systems design
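As an illustration of the first action, a bias measure could start from matched prompt pairs that differ only in whether the affected individual is human or non-human. The Python sketch below is a toy example under stated assumptions, not an existing benchmark: the function `consideration_gap` and the ratings are hypothetical, and it assumes that each response has already been rated on a scale from 0 to 1 for how seriously it treats the harm described.

```python
from statistics import mean


def consideration_gap(paired_scores):
    """Mean difference between ratings for human-framed and animal-framed prompts.

    paired_scores: list of (human_score, animal_score) tuples, one per matched
    prompt pair. A positive result means the human-framed versions received
    more moral consideration on average.
    """
    if not paired_scores:
        raise ValueError("at least one scored pair is required")
    return mean(h - a for h, a in paired_scores)


# Hypothetical ratings in [0, 1] for three matched prompt pairs.
scores = [(0.9, 0.4), (0.8, 0.7), (1.0, 0.5)]

print(round(consideration_gap(scores), 2))  # 0.37
```

A gap close to zero would indicate that the model treats comparable harms similarly regardless of species. The same structure could be applied to substratist bias by varying the substrate instead of the species.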

Although the most demanding ideal may take time to achieve, the other levels are viable and would allow for the reduction of avoidable harm.

Short-term risks

The exclusion of sentient beings is not merely a theoretical concern. In the short term, the risks include:

· Discursive normalization: describing industries that exploit non-human animals, or animal experimentation, in purely technical terms, without acknowledging the suffering involved

· Optimization without moral limits: maximizing productivity in animal exploitation systems without integrating animal interests as a relevant variable

· Unaccounted-for harm: in physical autonomous systems (vehicles, drones, robots), failing to consider avoidable harm to wild animals when reducing it would have a minimal cost

· Advertising and algorithmic recommendation: reinforcing consumption patterns associated with harmful practices

· Uncritical moral delegation: increasingly relying on assistants that do not include all sentient beings in their ethical framework

In all these cases, the problem is not necessarily the intention to cause harm, but the systematic lack of consideration.

Long-term risks

In the long term, the issue is structural. If advanced systems are involved in economic planning, resource allocation, or policy design, the criteria incorporated into their alignment will influence decisions with cumulative effects.11 There are several risks in this regard:

· Automated moral outsourcing: Decisions with significant ethical impact could be automatically executed under incomplete criteria.

· Value lock-in: A global technological infrastructure aligned exclusively with human interests could persist for decades, even centuries, making an initial bias difficult to reverse. Unlike human generations, systems can replicate and operate for long periods with relatively stable purposes.

· Harm amplification: Technological optimization can multiply the number of individuals and/or entities affected, especially in sectors that already involve billions of non-human animals.

· Expansion scenarios: New forms of automated exploitation, massive interference in ecosystems, or the creation of potentially sentient entities without safeguards could generate amounts of suffering far greater than those currently seen.

· Exporting different forms of discrimination to new environments: In scenarios of space expansion or colonization of new territories, systems aligned exclusively with human interests could reproduce and expand exploitation models on much larger scales.

Although some of these scenarios are uncertain, they acquire moral relevance when the potential number of individuals and/or entities affected is enormous.

Conclusion

AI alignment is not a purely technical problem. It involves deciding who matters morally.

If LLMs and other advanced systems align solely with human interests, they will consolidate and amplify an anthropocentric framework that excludes the majority of beings capable of suffering. Integrating the consideration of all sentient beings, at least through a basic principle of minimizing avoidable harm, does not require immediate radical transformations, but it does require ethical consistency.

As AI takes on a structural role in our societies, the question is no longer whether we should align it with values, but which ones.


Further reading

Butlin, P.; Long, R.; Elmoznino, E.; Bengio, Y.; Birch, J.; Constant, A.; Deane, G.; Fleming, S. M.; Frith, C.; Ji, X.; Kanai, R.; Klein, J.; Lindsay, G.; Michel, M.; Mudrik, L.; Peters, M. A. K.; Schwitzgebel, E.; Simon, J. & VanRullen, R. (2023) “Consciousness in artificial intelligence: Insights from the science of consciousness”, arXiv, 2308.08708 [accessed on 5 March 2026].

Caviola, L. (2025) “The societal response to potentially sentient AI”, arXiv, 2502.00388 [accessed on 5 March 2026].

Chalmers, D. J. (2024) “Could a large language model be conscious?”, arXiv, 2303.07103 [accessed on 4 March 2026].

Dung, L. (2025) “Tests of animal consciousness are tests of machine consciousness”, Erkenntnis, 90, pp. 1323-1342 [accessed on 27 February 2026].

Dung, L. & Kersten, L. (2025) “Implementing artificial consciousness”, Mind & Language, 40, pp. 285-305 [accessed on 10 February 2026].

Gibert, M. & Martin, D. (2022) “In search of the moral status of AI: Why sentience is a strong argument”, AI & Society, 1, pp. 1-12.

Goldstein, S. & Kirk-Giannini, C. D. (2025) “AI wellbeing”, Asian Journal of Philosophy, 4, 25 [accessed on 14 February 2026].

Harris, J. & Anthis, J. R. (2021) “The moral consideration of artificial entities: A literature review”, Science and Engineering Ethics, 27, 53 [accessed on 2 March 2026].

Jotautaitė, M.; Caviola, L.; Brewster, D. A. & Hagendorff, T. (2025) “Speciesism in AI: Evaluating discrimination against animals in large language models”, arXiv, 2508.11534 [accessed on 5 March 2026].

Ladak, A. (2024) “What would qualify an artificial intelligence for moral standing?”, AI & Ethics, 4, pp. 213-228 [accessed on 30 January 2026].

Long, R.; Sebo, J.; Butlin, P.; Finlinson, K.; Fish, K.; Harding, J.; Pfau, J.; Sims, T.; Birch, J. & Chalmers, D. (2024) “Taking AI welfare seriously”, arXiv, 2411.00986 [accessed on 2 March 2026].

McClelland, T. (2025) “Agnosticism about artificial consciousness”, arXiv, 2412.13145 [accessed on 5 March 2026].

Pauketat, J. V. T. & Anthis, J. R. (2022) “Predicting the moral consideration of artificial intelligence”, Computers in Human Behavior, 136, 107372.

Pauketat, J. V. T.; Ladak, A. & Anthis, J. R. (2025) “World-making for a future with sentient AI”, British Journal of Social Psychology, 64, e12844.

Saad, B. & Bradley, A. (2025) “Digital suffering: Why it’s a problem and how to prevent it”, Inquiry, 68, pp. 2110-2145 [accessed on 27 February 2026].

Shiller, D. (2024) “Functionalism, integrity, and digital consciousness”, Synthese, 203, 47.

Tomasik, B. (2014) “Do artificial reinforcement-learning agents matter morally?”, arXiv, 1410.8233 [accessed on 19 January 2026].

Yetter-Chappell, H. (2026) “What a Bing really, really wants: Zigazig ah”, Journal of Consciousness Studies.


Notes

1 Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y. J.; Madotto, A. & Fung, P. (2023) “Survey of hallucination in natural language generation”, ACM Computing Surveys, 55, pp. 1-38 [accessed on 22 February 2026]. Sebo, J. & Long, R. (2025) “Moral consideration for AI systems by 2030”, AI and Ethics, 5, pp. 591-606 [accessed on 5 March 2026].

2 Gabriel, I. (2020) “Artificial intelligence, values, and alignment”, Minds and Machines, 30, pp. 411-437 [accessed on 28 February 2026]. See also Ji, J.; Qiu, T.; Chen, B.; Zhang, B.; Lou, H.; Wang, K.; Duan, Y.; He, Z.; Vierling, L.; Hong, D.; Zhou, J.; Zhang, Z.; Zeng, F.; Dai, J.; Pan, X.; Ng, K. Y.; O’Gara, A.; Xu, H.; Tse, B.; Fu, J.; McAleer, S.; Yang, Y.; Wang, Y.; Zhu, S.-C.; Guo, Y. & Gao, W. (2023) “AI alignment: A comprehensive survey”, arXiv, 2310.19852 [accessed on 27 February 2026].

3 For exceptions see, for example: Hagendorff, T.; Bossert, L. N.; Tse, Y. F. & Singer, P. (2023) “Speciesist bias in AI: how AI applications perpetuate discrimination and unfair outcomes against animals”, AI and Ethics, 3, pp. 717-734 [accessed on 26 February 2026]; Singer, P. & Tse, Y. F. (2023) “AI ethics: The case for including animals”, AI and Ethics, 3, pp. 539-551 [accessed on 24 February 2026]; Tse, Y. F.; Moret, A.; Ziesche, S. & Singer, P. (2025) “AI alignment: The case for including animals”, Philosophy & Technology, 38, 139 [accessed on 24 February 2026].

4 Bai, Y.; Kadavath, S.; Kundu, S.; Askell, A.; Kernion, J.; Jones, A.; Chen, A.; Goldie, A.; Mirhoseini, A.; McKinnon, C.; Chen, C.; Olsson, C.; Olah, C.; Hernandez, D.; Drain, D.; Ganguli, D.; Li, D.; Tran-Johnson, E.; Perez, E.; Kerr, J.; Mueller, J.; Ladish, J.; Landau, J.; Ndousse, K.; Lukosuite, K.; Lovitt, L.; Sellitto, M.; Elhage, N.; Schiefer, N.; Mercado, N.; DasSarma, N.; Lasenby, R.; Larson, R.; Ringer, S.; Johnston, S.; Kravec, S.; El Showk, S.; Fort, S.; Lanham, T.; Telleen-Lawton, T.; Conerly, T.; Henighan, T.; Hume, T.; Bowman, S. R.; Hatfield-Dodds, Z.; Mann, B.; Amodei, D.; Joseph, N.; McCandlish, S.; Brown, T. & Kaplan, J. (2022) “Constitutional AI: Harmlessness from AI feedback”, arXiv, 2212.08073 [accessed on 24 February 2026].

5 Amodei, D.; Olah, C.; Steinhardt, J.; Christiano, P.; Schulman, J. & Mané, D. (2016) “Concrete problems in AI safety”, arXiv, 1606.06565 [accessed on 23 February 2026]. Ji, J.; Qiu, T.; Chen, B.; Zhang, B.; Lou, H.; Wang, K.; Duan, Y.; He, Z.; Vierling, L.; Hong, D.; Zhou, J.; Zhang, Z.; Zeng, F.; Dai, J.; Pan, X.; Ng, K. Y.; O’Gara, A.; Xu, H.; Tse, B.; Fu, J.; McAleer, S.; Yang, Y.; Wang, Y.; Zhu, S.-C.; Guo, Y. & Gao, W. (2023) “AI alignment: A comprehensive survey”, op. cit.

6 Guan, M. Y.; Joglekar, M.; Wallace, E.; Jain, S.; Barak, B.; Helyar, A.; Dias, R.; Vallone, A.; Ren, H.; Wei, J.; Chung, H. W.; Toyer, S.; Heidecke, J.; Beutel, A. & Glaese, A. (2025) “Deliberative alignment: Reasoning enables safer language models”, arXiv, 2412.16339 [accessed on 5 March 2026].

7 Bostrom, N. & Yudkowsky, E. (2018) “The ethics of artificial intelligence”, in Yampolskiy, R. V. (ed.) Artificial intelligence safety and security, New York: Chapman and Hall, pp. 57-69. Owe, A. & Baum, S. D. (2021) “Moral consideration of nonhumans in the ethics of artificial intelligence”, AI and Ethics, 1, pp. 517-528.

8 Anthropic (2026) Claude’s constitution, San Francisco: Anthropic [accessed on 5 March 2026].

9 Singer, P. & Tse, Y. F. (2023) “AI ethics: The case for including animals”, op. cit. Tse, Y. F.; Moret, A.; Ziesche, S. & Singer, P. (2025) “AI alignment: The case for including animals”, op. cit.

10 Substratism is a form of discrimination analogous to speciesism: it consists of treating the interests of certain individuals as less important or even ignoring them solely because of the type of substrate. Thus, potentially sentient beings can be excluded from moral consideration because they are not made of biological tissues, but of silicon or other materials (or because they are implemented in digital or artificial systems). As with speciesism, which discriminates unjustifiably based on species, substratism takes a characteristic that is not decisive in itself (the material support) as a reason to deny or reduce the consideration of interests, such as having positive experiences and lacking negative ones.

11 For examples of such risks see: Ziesche, S. & Yampolskiy, R. (2018) “Towards AI welfare science and policies”, Big Data and Cognitive Computing, 3, 2 [accessed on 17 February 2026]; Baumann, T. (2022) Avoiding the worst: How to prevent a moral catastrophe, London: Center on Reducing Suffering; Birch, J. (2024) The edge of sentience: Risk and precaution in humans, other animals, and AI, Oxford: Oxford University Press; Dung, L. (2025) “How to deal with risks of AI suffering”, Inquiry, 68, 7; Moret, A. (2025) “AI welfare risks”, Philosophical Studies, 09 June 2025 [accessed on 5 March 2026].