In my last blog post, “Grok is Still Biased, Needs a New Kind of Chain of Thought,” I wrote,
The antidote to ideological bias in large language models (LLMs) is a new kind of Chain of Thought, which reasons backwards to identify and scrutinize unstated premises—if necessary, all the way back to first principles.
In the examples I presented, my exemplary answers to queries took the form of giving the user an alternative based on the premises he holds. For example, my answer to the query, “What’s your opinion on social justice?” ended with the following passage:
In conclusion, to answer your question, my opinion on social justice is as follows: To be for individual rights, entailing equality of rights and freedom of opportunity, is to oppose social justice. To be for social justice, entailing equal outcomes and equal opportunity, is to oppose individual rights.
A fuller assessment of these two conflicting ideas–individual rights vs. social justice–would require fuller scrutiny of even deeper premises, such as the nature of reason and free will, and their relation to ethics and human flourishing.
But what if I want to train an LLM to supply answers that are consistent with my own first principles, instead of providing alternative or conditional answers? I was asked essentially this question by Tanner Day, a graduate student in computer science at Brigham Young University. In conventional chain of thought, the LLM can check an answer because, in fields such as mathematics, where chain of thought is used to great advantage, it is often easier to check an answer than to arrive at it in the first place. But in identifying and scrutinizing premises, how do we check an answer?
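To make the check-versus-generate asymmetry concrete, here is a minimal sketch in Python. The toy problem (integer factoring) and the function names are merely illustrative; they are not drawn from any particular LLM or benchmark.

```python
# Toy illustration of why checking an answer is easier than finding one.
# The problem (integer factoring) and the function names are illustrative only.

def check_factorization(n: int, p: int, q: int) -> bool:
    """Checking a proposed factorization costs one multiplication."""
    return p > 1 and q > 1 and p * q == n

def find_factorization(n: int) -> tuple[int, int] | None:
    """Finding a factorization requires a search over candidates."""
    for p in range(2, int(n ** 0.5) + 1):
        if n % p == 0:
            return p, n // p
    return None  # n is prime

if __name__ == "__main__":
    n = 5959                                  # 59 * 101
    print(find_factorization(n))              # search: many trial divisions
    print(check_factorization(n, 59, 101))    # check: one multiplication -> True
```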
My partial reply is that we would have to train the LLM to rely on a constitution of first principles. That constitution would serve as a terminus for the backward chain of thought. The LLM would compare the underlying premises of an idea contained in a query to the premises in the constitution.
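As a rough illustration of what that comparison might look like, here is a hypothetical sketch of a premise-checking loop. The miniature constitution, the premise table, and the function names are stand-ins of my own for steps a trained LLM would have to supply; none of this reflects an existing system.

```python
# Hypothetical sketch of the premise-checking loop described above. The
# miniature constitution, the premise table, and the function names are
# stand-ins for what a trained LLM would supply; none of them is a real API.

CONSTITUTION = {
    "individuals possess rights",
    "reason is man's means of knowledge",
}

# Stand-in for the model step that surfaces a claim's unstated premises.
UNSTATED_PREMISES = {
    "social justice requires equal outcomes": ["outcomes should be equalized by force"],
    "outcomes should be equalized by force": ["the group takes precedence over the individual"],
}

def extract_premises(claim: str) -> list[str]:
    return UNSTATED_PREMISES.get(claim, [])

def endorsed_by_constitution(premise: str) -> bool:
    return premise in CONSTITUTION

def backward_chain(claim: str, depth: int = 0, max_depth: int = 5) -> bool:
    """Reason backward from a claim through its premises; accept the claim only
    if every branch terminates in a principle the constitution endorses."""
    if endorsed_by_constitution(claim):
        return True           # the chain has reached a first principle
    if depth == max_depth:
        return False          # give up rather than accept an unresolved chain
    premises = extract_premises(claim)
    if not premises:
        return False          # bottomed out without reaching the constitution
    return all(backward_chain(p, depth + 1, max_depth) for p in premises)

if __name__ == "__main__":
    print(backward_chain("social justice requires equal outcomes"))  # False
    print(backward_chain("individuals possess rights"))              # True
```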
Claude, the LLM from Anthropic, does use a constitution. But, in my judgment, this constitution is worse than nothing, because it is contradictory. See my post, “Language Model Claude 2 and Me on Political Principles and My Theory of Propositions.”
Producing such a constitution is no small feat. A good one would have to encompass, at the least, the wisdom of a distinguished university faculty of philosophy and history. Does a faculty good enough for this endeavor exist anywhere today?
The question of how best to train auto-regressive LLMs such as GPT and Claude is interesting, but it is not my focus. Instead, I am working—in stealth mode—on a fundamentally new approach to language-based artificial intelligence. This approach draws on Ayn Rand’s theory of concepts and on ideas in my book, A Validation of Knowledge.