The Self-Hacking Manual: Casting Kotoamatsukami on Oneself
Disclaimer: The quit‑smoking example is used solely to illustrate the methodology and does not constitute medical advice.
Because I am not Uchiha Shisui, I cannot cast the authentic Kotoamatsukami, i.e. direct belief manipulation, on myself. However, thanks to the rise of Kagaku Ningu (Scientific Ninja Tools), namely the research of Schwartzstein & Sunderam, “Using Models to Persuade” (AER 2021), and Aina’s tailored‑stories framework, I can at least mimic the effect of Kotoamatsukami.
In this post, I work through an example adapted from Section 4.2 of Aina (2025), Tailored Stories.
Settings
Suppose I am a smoking addict attempting to quit. When I make this decision, I know that in the next period I am very likely to lose willpower and give in by smoking another cigarette. My prior belief that I have strong willpower with respect to abstaining from smoking is low; call it $\mu_0$.
Formally, let the state space be $\Omega=\{\theta_S,\theta_W\}$, where $\theta_S$ means I have strong willpower to abstain and $\theta_W$ means I am weak, with $\mu_0=\Pr(\theta_S)$.
The sender is the $t=0$ self, who decides to quit; the receiver is the $t=1$ self, who must implement the unpleasant task of smoking cessation. There is also a future self at $t=2$ who enjoys the health benefits of quitting without suffering, whereas the $t=1$ self endures the temptation without yet enjoying a healthier body (or mind).
The set of signal realisations is $S=\{G,B\}$. A good signal realisation $G$ could be less coughing, feeling more comfortable during my nightly run, or simply feeling less tempted to smoke. A bad signal realisation $B$ could be severe cravings, worse coughing, and so on, i.e. the opposites of the examples for $G$.
All of these examples are observable, consistent with the notion of a signal realisation.
The true signal‑generation process is a model $m_0$ with likelihoods $\pi_0(s\mid\theta)$ for $s\in\{G,B\}$ and $\theta\in\{\theta_S,\theta_W\}$. Even conditional on $\theta_S$, I might occasionally still be short of breath while running (e.g. because recovery takes time or air quality is poor). Likewise, conditional on $\theta_W$, my lifestyle might improve by chance, leading to unexpectedly smooth recovery, a mere “regression” to the mean. Thus the process remains stochastic.
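To make the stochastic signal process concrete, here is a minimal simulation sketch. The likelihood values 0.7 and 0.25 are placeholder assumptions of my own (reused in the later sketches), not numbers from Aina (2025) or the original example.

```python
import random

# Placeholder true model m_0: probability of a Good signal in each state.
# These likelihood values (0.7 and 0.25) are illustrative assumptions of mine.
P_GOOD_GIVEN_STRONG = 0.70   # pi_0(G | theta_S): even strong willpower has bad days
P_GOOD_GIVEN_WEAK = 0.25     # pi_0(G | theta_W): even weak willpower gets lucky runs

def draw_signal(state: str) -> str:
    """Draw one signal realization ('G' or 'B') from the true process m_0."""
    p_good = P_GOOD_GIVEN_STRONG if state == "S" else P_GOOD_GIVEN_WEAK
    return "G" if random.random() < p_good else "B"

if __name__ == "__main__":
    random.seed(0)
    n = 10_000
    freq_strong = sum(draw_signal("S") == "G" for _ in range(n)) / n
    freq_weak = sum(draw_signal("W") == "G" for _ in range(n)) / n
    print(f"empirical Pr(G | strong) ~ {freq_strong:.3f}")   # close to 0.70
    print(f"empirical Pr(G | weak)   ~ {freq_weak:.3f}")     # close to 0.25
```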
A modelling complication arises: the $t=0$ self knows the true process $m_0$, whereas the $t=1$ self does not (the $t=2$ self is inactive and merely benefits). I gloss over the epistemic justification; call it the privilege of age and a brain deteriorated by years of smoking.
(As you might have already noticed, the $t=1$ me, or a receiver in this framework generally, is NOT fully (Bayesian) rational: the receiver is not only unaware of the true model, but does not even have a prior over possible models; the receiver uses only a heuristic rule to choose which model to adopt.)
Aina refers to $G$ or $B$ as a “signal”; I call each an individual signal realisation. The difference is terminological only.
The action space is $A=\{0,1\}$, where $a=0$ means “smoke” and $a=1$ means “not smoke”.
Although I have said the lucky inactive $t=2$ self enjoys the benefits, these are ex‑post gains worth, say, $b>0$. From an expected‑utility perspective, both the $t=0$ and $t=1$ selves value the discounted benefits and costs. The cost of suffering at $t=1$ is $c>0$. I exhibit present bias (Laibson 1997, eq. (3), pp. 442–443). Specifically, the $t=0$ discount‑weight vector over periods $0,1,2$ is $(1,\,\beta\delta,\,\beta\delta^2)$, and the $t=1$ vector is $(\cdot,\,1,\,\beta\delta)$ (or $(1,\,\beta\delta)$ after dropping the bygone first coordinate), with $0<\beta<1$ and $0<\delta\le 1$.
Interpretation: From the $t=0$ viewpoint, current utility is weighted by $1$, the $t=1$ cost by $\beta\delta$, and the $t=2$ benefit by $\beta\delta^2$; since $\beta$ multiplies both future terms, the trade‑off between the cost and the benefit is governed by $\delta$ alone.
Similarly, from the anticipated $t=1$ perspective, the suffering cost has weight $1$, whereas the future benefit has weight $\beta\delta$; because $\beta<1$, the trade‑off is now tilted against the benefit. Therefore the $t=0$ self expects the $t=1$ self to feel the pain more acutely relative to the benefit and perhaps relapse.
We leave the fortunate $t=2$ self aside; he never decides.
Assume, for concreteness, that the health benefit $b$ materialises only if I truly have strong willpower (state $\theta_S$), since otherwise I relapse later anyway, while the suffering cost $c$ is paid either way. From the $t=0$ me's perspective, taking $a=1$ rather than $a=0$ earns me the increment
$$\Delta U_0(\mu_0)=\beta\delta^2\, b\,\mu_0-\beta\delta\, c,$$
so the $t=0$ me will choose to quit if $\Delta U_0(\mu_0)\ge 0$, and that is why the threshold looks like this:
$$\bar\mu_0=\frac{c}{\delta b}.$$
From the $t=1$ me's perspective, taking $a=1$ rather than $a=0$ earns me the increment
$$\Delta U_1(\mu)=\beta\delta\, b\,\mu-c,$$
where $\mu$ is the $t=1$ belief that $\theta=\theta_S$, so the $t=1$ me will actually implement quitting only if $\Delta U_1(\mu)\ge 0$, and that is why for the $t=1$ me the threshold is
$$\bar\mu_1=\frac{c}{\beta\delta b}.$$
Reading the two expressions, the latter threshold is harder to overcome ($\bar\mu_1>\bar\mu_0$ because $\beta<1$), but it is the one that matters, because it belongs to the poor boy who actually has to implement the plan: the $t=1$ me. (Concrete numbers for $\beta$, $\delta$, $b$ and $c$ pin the two thresholds down; see the sketch below.)
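Here is a minimal numerical sketch of the two thresholds under the payoff assumption above. The parameter values $\beta=0.6$, $\delta=0.95$, $b=10$, $c=3$ and $\mu_0=0.45$ are placeholders of my own, reused in the later sketches, not the numbers from the original example.

```python
# Quasi-hyperbolic thresholds for the t=0 and t=1 selves.
# All parameter values are my own illustrative placeholders.
beta, delta = 0.6, 0.95   # present bias and long-run discount factor
b, c = 10.0, 3.0          # t=2 health benefit and t=1 suffering cost
mu0 = 0.45                # prior belief of strong willpower

# t=0 compares beta*delta^2*b*mu0 against beta*delta*c  ->  threshold c/(delta*b)
thr_t0 = c / (delta * b)
# t=1 compares beta*delta*b*mu   against c              ->  threshold c/(beta*delta*b)
thr_t1 = c / (beta * delta * b)

print(f"t=0 threshold: {thr_t0:.3f}")                  # ~0.316
print(f"t=1 threshold: {thr_t1:.3f}")                  # ~0.526 (harder, since beta < 1)
print(f"prior clears t=0 threshold: {mu0 >= thr_t0}")  # True: the t=0 me decides to quit
print(f"prior clears t=1 threshold: {mu0 >= thr_t1}")  # False: the t=1 me would relapse
```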
Tailored Stories: the Kagaku Ningu version of Kotoamatsukami
This part is the reason I named the post the way I did; if you are not in the mood for the Naruto theme, feel free to focus on the methodology itself. I call this trick “Kotoamatsukami on myself” in the spirit of Aina's tailored models.
At time 0, I prepare two narratives/models, $m_G$ and $m_B$, for the later me-s (me in the plural): $m_G$ is tailored to the good realization $G$ and $m_B$ to the bad realization $B$. Why just “two” models? You can check xxx in Aina if interested.
So, if the $t=1$ me observes the signal realization $G$, the fit of model $m_G$ will be
$$\mathrm{fit}_{m_G}(G)=\mu_0\,\pi_{m_G}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_G}(G\mid\theta_W),$$
while that of model $m_B$ will be
$$\mathrm{fit}_{m_B}(G)=\mu_0\,\pi_{m_B}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_B}(G\mid\theta_W).$$
I omit the analogous algebra of $\mathrm{fit}_{m_G}(B)$ and $\mathrm{fit}_{m_B}(B)$.
Notice that the fit of model $m$ after observing a signal realization $s$ is the marginal probability of that signal realization computed using the likelihood from model $m$, which is generally different from the “objective” probability of that signal realization, because the latter is the marginal under the true likelihood, i.e. under the signal‑realization generating process $m_0$.
If the $t=1$ me adopts model $m\in\{m_G,m_B\}$ after observing signal realization $s$ (which happens when the fit of model $m$ is the higher one, i.e. $\mathrm{fit}_{m}(s)\ge\mathrm{fit}_{m'}(s)$ for the other model $m'$),
then Bayesian updating comes in, and the belief (the posterior after observing the signal realization) becomes
$$\mu(\theta_S\mid s,m)=\frac{\mu_0\,\pi_{m}(s\mid\theta_S)}{\mu_0\,\pi_{m}(s\mid\theta_S)+(1-\mu_0)\,\pi_{m}(s\mid\theta_W)}.$$
The complementary posterior $\mu(\theta_W\mid s,m)=1-\mu(\theta_S\mid s,m)$ is derived accordingly.
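A sketch of the fit and posterior computations just described, continuing with my placeholder prior $\mu_0=0.45$. The likelihoods I assign to $m_G$ and $m_B$ below (0.9 and 0.2, chosen symmetrically) are again illustrative assumptions of mine, not Aina's or the original post's numbers.

```python
# Fit and model-based posterior, as defined above.
# A model is summarized by pi_m(G | theta_S) and pi_m(G | theta_W);
# pi_m(B | theta) = 1 - pi_m(G | theta). All numbers are my own placeholders.
MU0 = 0.45  # prior on strong willpower

MODELS = {
    "m_G": {"G|S": 0.9, "G|W": 0.2},  # tailored to the Good realization
    "m_B": {"G|S": 0.1, "G|W": 0.8},  # tailored to the Bad realization (symmetric to m_G)
}

def likelihood(model, s, state):
    """pi_m(s | theta) for s in {'G','B'} and state in {'S','W'}."""
    p_good = model["G|S"] if state == "S" else model["G|W"]
    return p_good if s == "G" else 1.0 - p_good

def fit(model, s, mu0=MU0):
    """Marginal probability of realization s under the model: its 'fit'."""
    return mu0 * likelihood(model, s, "S") + (1 - mu0) * likelihood(model, s, "W")

def posterior_strong(model, s, mu0=MU0):
    """Posterior on theta_S after observing s, updating with the adopted model."""
    return mu0 * likelihood(model, s, "S") / fit(model, s, mu0)

if __name__ == "__main__":
    for name, m in MODELS.items():
        for s in ("G", "B"):
            print(f"{name}: fit({s}) = {fit(m, s):.3f}, "
                  f"posterior on theta_S given {s} = {posterior_strong(m, s):.3f}")
```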
According to Aina:
if $\mu_0\ge\bar\mu_0$, then the $t=0$ me would like to quit smoking;
(With the choice of parameters here, we know that $\mu_0\ge\bar\mu_0$ holds. So okay, at least the $t=0$ me wants to quit smoking.)
if $\mu(\theta_S\mid G,m)\ge\bar\mu_1$, then the $t=1$ me will implement it after observing a good signal realization;
if $\mu(\theta_S\mid B,m)\ge\bar\mu_1$, then the $t=1$ me will implement it after observing a bad signal realization.
The essence of this tailored‑stories trick is the following: by presenting to the $t=1$ self two alternative models, $m_G$ and $m_B$ (maintaining the assumption that the $t=0$ me knows $m_0$ while the $t=1$ me does not), before any signal is actually realized, I ensure that the $t=1$ me will choose whichever model has the highest fit conditional on the $t=1$ me's own observation of the signal realization, while by then the $t=0$ me is only history and has no idea what the actual signal realization is. This interpretation, that the past me is history with no possibility of time travel, is exactly what gives the $t=0$ me's presentation of models the necessary commitment power: I will no longer exist by then, so it is impossible for me to distort anything.
Now let's substitute the posteriors out using Bayesian updating, but without being definite about which model the $t=1$ me will actually choose, i.e. leave the $m$ in there:
if $\dfrac{\mu_0\,\pi_{m}(G\mid\theta_S)}{\mu_0\,\pi_{m}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m}(G\mid\theta_W)}\ge\bar\mu_1$, then the $t=1$ me will implement it after observing a good signal realization;
if $\dfrac{\mu_0\,\pi_{m}(B\mid\theta_S)}{\mu_0\,\pi_{m}(B\mid\theta_S)+(1-\mu_0)\,\pi_{m}(B\mid\theta_W)}\ge\bar\mu_1$, then the $t=1$ me will implement it after observing a bad signal realization.
Now let's turn to the model‑selection problem of the $t=1$ self.
After observing a good signal realization, the two fits are
$$\mathrm{fit}_{m_G}(G)=\mu_0\,\pi_{m_G}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_G}(G\mid\theta_W)\quad\text{and}\quad\mathrm{fit}_{m_B}(G)=\mu_0\,\pi_{m_B}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_B}(G\mid\theta_W).$$
Since $\mathrm{fit}_{m_G}(G)\ge\mathrm{fit}_{m_B}(G)$ by construction, the $t=1$ me will choose model $m_G$ and hold the posterior $\mu(\theta_S\mid G,m_G)$ after observing a good signal realization and adopting the model with the highest fit, here model $m_G$.
Similarly, the $t=1$ me will choose model $m_B$ and hold the posterior $\mu(\theta_S\mid B,m_B)$ after observing a bad signal realization and adopting model $m_B$ (of course, we can predict this from the symmetric way I assigned values to the two models).
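Putting the adoption rule together, here is a sketch reusing the placeholder prior, models and $t=1$ threshold from the earlier sketches. With these assumed numbers of mine, each tailored model wins the fit competition at “its” realization and both resulting posteriors clear the threshold.

```python
# Model adoption by highest fit, then Bayesian updating with the adopted model.
# Prior, threshold and model likelihoods are my placeholders from the earlier sketches.
MU0 = 0.45
THR_T1 = 3.0 / (0.6 * 0.95 * 10.0)   # c / (beta * delta * b), ~0.526

MODELS = {
    "m_G": {"G|S": 0.9, "G|W": 0.2},
    "m_B": {"G|S": 0.1, "G|W": 0.8},
}

def likelihood(model, s, state):
    p_good = model["G|S"] if state == "S" else model["G|W"]
    return p_good if s == "G" else 1.0 - p_good

def fit(model, s):
    return MU0 * likelihood(model, s, "S") + (1 - MU0) * likelihood(model, s, "W")

def adopt_and_update(s):
    """Adopt the best-fitting model for realization s; return (model name, posterior on theta_S)."""
    name = max(MODELS, key=lambda n: fit(MODELS[n], s))
    m = MODELS[name]
    posterior = MU0 * likelihood(m, s, "S") / fit(m, s)
    return name, posterior

if __name__ == "__main__":
    for s in ("G", "B"):
        name, post = adopt_and_update(s)
        print(f"signal {s}: adopt {name}, posterior on theta_S = {post:.3f}, "
              f"implements quitting: {post >= THR_T1}")
```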
Note that here the posterior belief on $\theta_S$ after observing either signal realization is always higher than the prior $\mu_0$, as if contradicting Bayes plausibility / the martingale property / the splitting lemma that we typically encounter in the Bayesian persuasion literature. In fact, that property is simply inapplicable here.
This is because a different signal realization might prompt the receiver to adopt a different model. In contrast, in Bayesian persuasion it is always the same model (the one chosen by the sender, along with the sender's choice of the information structure) being used, and in terms of the signal‑realization generating process, it is always the true model being used to compute the marginal probability of each realization, i.e. the fit.
Here, a different feasibility criterion is used, a Bayes‑consistency condition based on a harmonic mean, which I'll leave aside for now (see the appendix).
Upon observing different signal realizations, the receiver might (and the $t=1$ me indeed does) adopt different models, which is different from the Bayesian persuasion framework, where a single information structure is used.
Okay: now, no matter which signal realization the $t=1$ me observes, good or bad, the narcissistic kind of posterior that says “I have the strong willpower to quit smoking” is always higher than the criterion $\bar\mu_1$. Great, it works! We can now be sure the $t=1$ me will actually implement the smoking quit.
Notice that if the prior is so low that even the lower threshold $\bar\mu_0$ cannot be overcome, then even the $t=0$ me would never think about quitting smoking, so the Kotoamatsukami would not be initiated; if the prior is so high that the higher threshold $\bar\mu_1$ is already passed by the prior itself, then the Kotoamatsukami would not be needed and the quit‑smoking problem becomes trivial. So it is exactly when the prior lies between the two thresholds, $\bar\mu_0\le\mu_0<\bar\mu_1$, that the problem is interesting: without the Kotoamatsukami the attempt to quit smoking fails, but with it the attempt succeeds.
Appendix
Reverse-engineering $m_G$ and $m_B$
Given the prior $\mu_0$ and the threshold $\bar\mu_1$, what we need is to let the receiver voluntarily choose $m_G$ after observing signal realization $G$ and $m_B$ after observing signal realization $B$.
The first requires that
$$\mathrm{fit}_{m_G}(G)\ \ge\ \mathrm{fit}_{m_B}(G),$$
which is
$$\mu_0\,\pi_{m_G}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_G}(G\mid\theta_W)\ \ge\ \mu_0\,\pi_{m_B}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_B}(G\mid\theta_W),$$
i.e.
$$\mu_0\big[\pi_{m_G}(G\mid\theta_S)-\pi_{m_B}(G\mid\theta_S)\big]\ \ge\ (1-\mu_0)\big[\pi_{m_B}(G\mid\theta_W)-\pi_{m_G}(G\mid\theta_W)\big];$$
similarly, the second requires $\mathrm{fit}_{m_B}(B)\ \ge\ \mathrm{fit}_{m_G}(B)$.
Then Bayesian updating gives
$$\mu(\theta_S\mid G,m_G)=\frac{\mu_0\,\pi_{m_G}(G\mid\theta_S)}{\mu_0\,\pi_{m_G}(G\mid\theta_S)+(1-\mu_0)\,\pi_{m_G}(G\mid\theta_W)}$$
and
$$\mu(\theta_S\mid B,m_B)=\frac{\mu_0\,\pi_{m_B}(B\mid\theta_S)}{\mu_0\,\pi_{m_B}(B\mid\theta_S)+(1-\mu_0)\,\pi_{m_B}(B\mid\theta_W)}.$$
Denote the criterion by $\bar\mu_1$ (e.g. the threshold $c/(\beta\delta b)$ we've seen before); then we need both
$$\mu(\theta_S\mid G,m_G)\ \ge\ \bar\mu_1$$
and
$$\mu(\theta_S\mid B,m_B)\ \ge\ \bar\mu_1.$$
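A sketch that checks these four reverse‑engineering constraints at once, with the same placeholder prior, threshold and likelihoods as before (all numbers mine, not the post's):

```python
# Check the four constraints: two fit (self-selection) constraints and two posterior constraints.
# Prior, threshold and likelihoods are my own placeholders, consistent with the earlier sketches.
MU0 = 0.45
THR_T1 = 3.0 / (0.6 * 0.95 * 10.0)   # c / (beta * delta * b)

M_G = {"G|S": 0.9, "G|W": 0.2}
M_B = {"G|S": 0.1, "G|W": 0.8}

def likelihood(model, s, state):
    p_good = model["G|S"] if state == "S" else model["G|W"]
    return p_good if s == "G" else 1.0 - p_good

def fit(model, s):
    return MU0 * likelihood(model, s, "S") + (1 - MU0) * likelihood(model, s, "W")

def posterior_strong(model, s):
    return MU0 * likelihood(model, s, "S") / fit(model, s)

checks = {
    "m_G wins the fit competition at G": fit(M_G, "G") >= fit(M_B, "G"),
    "m_B wins the fit competition at B": fit(M_B, "B") >= fit(M_G, "B"),
    "posterior after G clears the t=1 threshold": posterior_strong(M_G, "G") >= THR_T1,
    "posterior after B clears the t=1 threshold": posterior_strong(M_B, "B") >= THR_T1,
}

for label, ok in checks.items():
    print(f"{label}: {ok}")
```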
The set of feasible vectors of posteriors
When is a vector of posteriors implementable? In this example, two posteriors are needed, one for the good signal realization and another for the bad realization, and each should be higher than the threshold $\bar\mu_1$ determined by $b$, $c$, $\beta$ and $\delta$. In other words, we need a vector of posteriors with each element sufficiently high. A similar criterion is also prevalent in the Bayesian persuasion literature: for example, a posterior higher than some cutoff might be needed in a binary setting, like the classical prosecutor–juror example, and the martingale property (that the posteriors average back to the prior) then requires the other posterior to be sufficiently low. But as I mentioned before, the martingale property is inapplicable here: because different realizations might prompt the receiver to adopt different models, weighting the posteriors by their respective likelihoods by no means guarantees that they average back to the prior. How, then, should we examine whether a vector of posteriors is feasible or not? Aina's Theorem 1 gives a harmonic‑mean criterion characterizing the set of feasible vectors of posteriors.
My vector of posteriors is $(\mu_G,\mu_B)$, where $\mu_G=\mu(\theta_S\mid G,m_G)$ and $\mu_B=\mu(\theta_S\mid B,m_B)$, and my prior is $\mu_0$. Take $\mu_G$: the movements of this posterior are $\big(\mu_G/\mu_0,\ (1-\mu_G)/(1-\mu_0)\big)$, where the first coordinate relates to state $\theta_S$ and the second to state $\theta_W$. So the maximal movement is $M_G=\max\{\mu_G/\mu_0,\ (1-\mu_G)/(1-\mu_0)\}$, meaning that it is the maximal movement (where the maximum is taken across states) of posterior $\mu_G$. Similarly, we can get an $M_B$. Then we take the harmonic mean of $M_G$ and $M_B$ and verify whether it is smaller than or equal to the number of possible signal realizations (hence the maximal number of different stories that might be adopted), here $2$ (Good or Bad).
It just happens, by my assignment of numbers, that the two posteriors are the same, $\mu_G=\mu_B$. Given my prior $\mu_0$, the maximal movement is the higher of $\mu_G/\mu_0$ and $(1-\mu_G)/(1-\mu_0)$, and the harmonic mean of $M_G$ and $M_B$ is then just that same number, which is smaller than $2$; this ensures that $(\mu_G,\mu_B)$ is feasible.
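A sketch of this feasibility check with my placeholder numbers, reading “movement” as the state‑by‑state ratio of posterior to prior; this is my reading of the criterion described above, not a restatement of Aina's Theorem 1.

```python
# Harmonic-mean feasibility check for a vector of posteriors, as described above.
# Prior and posteriors come from my placeholder numbers in the earlier sketches.
from statistics import harmonic_mean

MU0 = 0.45                                       # prior on theta_S
MU_G = 0.45 * 0.9 / (0.45 * 0.9 + 0.55 * 0.2)    # posterior after G under m_G (~0.786)
MU_B = 0.45 * 0.9 / (0.45 * 0.9 + 0.55 * 0.2)    # posterior after B under m_B (same, by symmetry)
N_REALIZATIONS = 2                               # Good or Bad

def max_movement(posterior, prior):
    """Maximal ratio of posterior to prior across the two states."""
    return max(posterior / prior, (1 - posterior) / (1 - prior))

movements = [max_movement(mu, MU0) for mu in (MU_G, MU_B)]
hm = harmonic_mean(movements)
print(f"maximal movements: {[round(m, 3) for m in movements]}")   # ~[1.748, 1.748]
print(f"harmonic mean: {hm:.3f} <= {N_REALIZATIONS}: {hm <= N_REALIZATIONS}")
```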
What if the $t=1$ me still remembers the true model?
If the $t=0$ me anticipates that the $t=1$ me might still remember the true model, then as long as what the $t=1$ me remembers is just that there is an additional model $m_0$ with those particular likelihoods, but the $t=1$ me cannot be sure that this is THE TRUE MODEL, this Kotoamatsukami trick still works. The $t=0$ me merely has to adjust the two models $m_G$ and $m_B$ to meet an additional constraint: the fit of the model adopted at the target signal realization also has to be higher than the fit of the true model. Here, the fit of the true model is $\mathrm{fit}_{m_0}(G)=\mu_0\,\pi_0(G\mid\theta_S)+(1-\mu_0)\,\pi_0(G\mid\theta_W)$ for the Good signal realization and $\mathrm{fit}_{m_0}(B)$ for the Bad one, so at least $m_B$ has to be adjusted (and perhaps, or perhaps not, also $m_G$) for it to have a higher fit than $\mathrm{fit}_{m_0}(B)$, since $m_B$ is the model targeting the Bad signal realization, i.e. the $t=0$ me wishes the $t=1$ me, after observing the Bad signal realization, to adopt model $m_B$. This adjustment can be achieved, for example, by increasing $\pi_{m_B}(B\mid\theta_S)$ and accordingly reducing $\pi_{m_B}(G\mid\theta_S)$, which raises $\mathrm{fit}_{m_B}(B)$ without lowering the posterior on $\theta_S$.
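Below is a sketch of this additional check and of one possible adjustment, using the placeholder numbers from the earlier sketches; the remembered true‑model likelihoods (0.7 and 0.25) are the same assumptions as in the simulation sketch in the Settings section.

```python
# Extra constraint when the t=1 me also remembers the true model m_0:
# the adopted tailored model must fit the observed realization better than m_0 does.
# All likelihoods and the prior are my own placeholders.
MU0 = 0.45
M_0 = {"G|S": 0.70, "G|W": 0.25}   # remembered true model (placeholder)
M_G = {"G|S": 0.90, "G|W": 0.20}
M_B = {"G|S": 0.10, "G|W": 0.80}   # original tailored model for the Bad realization

def likelihood(model, s, state):
    p_good = model["G|S"] if state == "S" else model["G|W"]
    return p_good if s == "G" else 1.0 - p_good

def fit(model, s):
    return MU0 * likelihood(model, s, "S") + (1 - MU0) * likelihood(model, s, "W")

print("fit_mG(G) beats fit_m0(G):", fit(M_G, "G") >= fit(M_0, "G"))   # True with these numbers
print("fit_mB(B) beats fit_m0(B):", fit(M_B, "B") >= fit(M_0, "B"))   # False: m_B needs adjusting

# One possible adjustment: shift mass in m_B toward B conditional on theta_S,
# i.e. raise pi_mB(B | theta_S) by lowering pi_mB(G | theta_S).
M_B_ADJ = {"G|S": 0.01, "G|W": 0.80}
post_adj = MU0 * likelihood(M_B_ADJ, "B", "S") / fit(M_B_ADJ, "B")
print("adjusted fit_mB(B) beats fit_m0(B):", fit(M_B_ADJ, "B") >= fit(M_0, "B"))
print("adjusted posterior after B still clears the threshold:",
      post_adj >= 3.0 / (0.6 * 0.95 * 10.0))
```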
However, if what the $t=1$ me remembers is not only that there is an additional model, beyond $m_G$ and $m_B$, with those particular likelihoods, but also that this model is THE TRUE MODEL, then additional justifications have to be made (or additional naivety of mine has to be imposed) to make sure that the $t=1$ me will still adopt the model with the highest fit, even if it might not be the true model.
– current version: July 30.