### INTRODUCTION

### METHODS

*i*) decrease gradually because individuals continually pick them up. Since individuals pick up easy-to-acquire resources first, the resource acquisition per unit time ($T$), that is, $h_i(T_i)$, also decreases gradually. Each patch has a different $h_i(T_i)$. In Patch A, with a high $h_i(T_i)$, the rate of resource returns declines slowly (solid line A). In Patch B, with a low $h_i(T_i)$, the rate of resource returns declines rapidly (solid line B).

$T_u$, which is the sum of the time spent in patch $i$ ($T_i$) and the travel time ($t$), is:

$E_n$, during the time spent in patch $i$ and while moving to the next patch, can be expressed as:

_{i} patches; $h_i(T)$ = the amount of resources acquired per unit time, $T$, from patch $i$ ($i = 1, 2, \ldots, k$); $h_i(T_i)$ = resource acquisition per unit time in patch $i$ ($i = 1, 2, \ldots, k$)]

*T*, the slope of the resource return obtained at a specific time in a patch is as follows:

$h_{total}(T)$, that is, if $\partial h_i(T_i)/\partial T_i - h_{total}(T) < 0$, the agent should leave the current patch.
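The leaving rule can be illustrated with a minimal Python sketch. The exponential gain curve $g(t) = R_0 (1 - e^{-rt})$, the function names, and the parameter values are assumptions chosen for illustration; they are not taken from the model itself. Only the inequality $\partial h_i(T_i)/\partial T_i - h_{total}(T) < 0$ comes from the text.

```python
import math

def marginal_rate(t, r0, decay):
    """Marginal return at residence time t for the ASSUMED gain curve
    g(t) = r0 * (1 - exp(-decay * t)); its derivative is r0 * decay * exp(-decay * t)."""
    return r0 * decay * math.exp(-decay * t)

def should_leave(t, r0, decay, h_total):
    """Leaving rule from the text: leave once the marginal rate minus the
    habitat-wide average rate h_total is negative."""
    return marginal_rate(t, r0, decay) - h_total < 0

def leaving_time(r0, decay, h_total):
    """Residence time at which the marginal rate equals h_total.
    Returns 0.0 if the patch never yields more than the habitat average."""
    initial_rate = r0 * decay
    if initial_rate <= h_total:
        return 0.0
    return math.log(initial_rate / h_total) / decay

# Hypothetical patches with the same initial rate (30) but different declines.
slow_patch = leaving_time(r0=300.0, decay=0.1, h_total=10.0)   # Patch A-like
fast_patch = leaving_time(r0=100.0, decay=0.3, h_total=10.0)   # Patch B-like
print(slow_patch > fast_patch)
```

Under these assumptions the slowly declining patch is left later than the rapidly declining one, which matches the qualitative argument about Patch A and Patch B.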

$\partial h_i(T_i)/\partial T_i - h_{total}(T)$ approaches zero, the defence module would be activated. The reduction in the resource acquisition rate is the critical ecological pressure that directly affects the survival and reproduction of individuals. In Figure 2, it is advantageous for individuals in Patch A to stay longer than individuals in Patch B, because the resource acquisition rate decreases more slowly in Patch A than in Patch B. In other words, the defence module should be activated more quickly and robustly in resource-poor patches (Figure 2). This logic is consistent with clinical findings that defence activation disorders occur more often under adverse socioeconomic conditions such as high competition, socioeconomic crisis, high unemployment, and natural disasters [53-57].

$\partial h_i(T_i)/\partial T_i - h_{total}(T)$. The agent in the patch already knows $h_i(T_i)$ and $T_i$. However, information about $h_{total}(T)$, the average resource acquisition rate in a given habitat, cannot be obtained accurately at the individual level.

_{a} be the initial value. Then, the x-axis value $T_a$, where the dashed line A touches the marginal value curve, is the optimal time to remain in the current patch or the cost of remaining in the current patch. However, the cost (or risk) of leaving for a new patch is relatively small for $C_{a1}$ ($T_{a1} < T_a$). In this case, even if the cost of seeking a new patch is added, movement can increase the average return of the agent. Therefore, it is advantageous to move to another patch after $T_{a1}$. By contrast, when the cost (or risk) of moving to a new patch, $C_{a2}$, is relatively higher than $C_a$, it is advantageous to stay longer in the current patch ($T_{a2} > T_a$).

$h_{total}(T)$ is the slope of the dashed line A, if the agent misinterprets it as the dotted line $A_1$, the agent leaves the current patch earlier than the optimal time and moves to a new patch. In this case, the agent underestimates the cost (or risk) of leaving for a new patch. By contrast, if the agent misinterprets it as the dotted line $A_2$, it leaves the current patch later than the optimal time. In this case, the agent overestimates the cost (or risk) of leaving for a new patch.

Let $d$ be the weight of the individual for $h_{total}(T)$. Therefore, the weighted average resource acquisition per unit time, weighted $h_{total}(T)$, can be expressed as:

$$\text{weighted } h_{total}(T) = h_{total}(T) \cdot (1 - (d - 1))$$

$d$ is assumed to follow a normal distribution with a mean of 1, a maximum value of 2, and a minimum value of 0. In other words, an individual in a patch will try to leave the patch when $\partial h_i(T_i)/\partial T_i - h_{total}(T) \cdot (1 - (d - 1)) < 0$. Here, $d > 1$ means that the agent is more pessimistic than the actual condition of the entire habitat, and $d < 1$ means that the agent is more optimistic than the actual condition of the entire habitat. Thus, $d$ is greater than 1 for D-type disorders and less than 1 for d-type disorders.
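The weighted rule can be sketched in Python as follows. The function names, the rejection sampling used to keep $d$ within its stated bounds, and the default spread of 0.15 are illustrative assumptions; only the inequality and the bounds on $d$ are taken from the text.

```python
import random

def weighted_h_total(h_total, d):
    """Habitat average as perceived by an agent with weight d:
    h_total * (1 - (d - 1)), as in the inequality in the text."""
    return h_total * (1.0 - (d - 1.0))

def should_leave_weighted(marginal_rate, h_total, d):
    """True when the marginal rate minus the weighted habitat average is negative."""
    return marginal_rate - weighted_h_total(h_total, d) < 0

def sample_d(rng, sd=0.15):
    """Draw d from a normal distribution with mean 1, redrawing until the
    value lies in [0, 2]. The redraw scheme and sd are ASSUMPTIONS."""
    while True:
        d = rng.gauss(1.0, sd)
        if 0.0 <= d <= 2.0:
            return d

# Same patch state, three hypothetical agents.
for d in (0.5, 1.0, 1.5):
    print(d, should_leave_weighted(marginal_rate=8.0, h_total=10.0, d=d))
```

With a marginal rate of 8 and a true habitat average of 10, the agents with $d = 0.5$ and $d = 1$ leave, while the agent with $d = 1.5$ perceives the habitat average as 5 and stays.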

_{1}). Therefore, the null hypothesis is that a range of dysfunctional behavioural strategies with different levels of defence activation cannot be maintained in any case, or can be maintained regardless of the gradient distribution of resources ($H_0$), within the agent-based evolutionary simulation environments.

_{1}). In the second environment, resources are not distributed uniformly in space but are distributed randomly throughout the environment ($E_2$). In the third environment, resources are not distributed uniformly in space but follow a gradient in accordance with a certain spatial tendency across the entire environment ($E_3$).

_{3}, the null hypothesis can be rejected. Otherwise, the alternative hypothesis cannot be verified in an evolutionary simulation environment.

### ODD (Overview, Design concepts, Details) protocol

### Overview

#### Purpose

*d-value* and low *d-value* remains stable under various patchy environments. Additionally, the factors affecting the proportion of individuals with high or low *d-value* are analysed.

#### Entities, State Variables, and Scales

#### Each patch has two important state variables

_{0}, the R. The circle can obtain information about the R from the patch. The amount of resources is distributed differentially according to Env.Ht. The distribution is uniform vertically and graded horizontally: the $R_0$ of patches is arranged from the lowest to the highest along the x-coordinate. The average amount of resources is set to 150 (at xcor 9). When Env.Ht. is 1, the resource amount of a patch with xcor 0 is 300, and that of a patch with xcor 18 is 0. When Env.Ht. is 0, the resource amount is 150 at both xcor 0 and xcor 18. When Env.Ht. is 2, the resource amount ranges from -150 to 450. The details are given in the sub-models section.

_{0}, E and *d-value*. The Int.No. and the $E_0$ of the circle are set at initialisation. A circle represents each movable agent on the interface screen, and a newborn circle is displayed in yellow for one year. Movement between patches means leaving for a new ecological patch, and the energy required for movement corresponds to Mov.Cost.

#### Process overview and scheduling

*d-value* of each circle on the actual amount of average energy acquisition. If the circle does not move, its TSS increases by 1. When a move is completed, the TSS returns to zero. 3) Movement: The circle moves randomly to one of the neighbouring patches. If there are no empty patches among the 8 neighbours, the circle stays. If the movement succeeds, Mov.Cost is paid from E. 4) Decision of Reproduction: If the age is between 15 and 40 years, the E of the circle is higher than M.E.R. weighted by (1-Rep.Prob.), and there are empty patches among the neighbours, the circle produces a new offspring. The newborn circle begins its life in one of the neighbouring patches. If there are no empty neighbouring patches, the circle waits for the next chance. The *d-value* assigned to the newborn circle is set by multiplying the parental *d-value* by a number drawn at random from a distribution with a mean of 1 and a standard deviation of SD-of-d. Here, Rep.Prob. is determined by the logistic function. Details are provided in the sub-models section.
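The movement step described above can be sketched in Python. This is not the NetLogo implementation; the dictionary layout of a circle, the non-wrapping grid, and the choice to increment TSS when a move is blocked are assumptions made for illustration.

```python
import random

# Offsets of the 8 neighbouring patches.
NEIGHBOUR_OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0)]

def attempt_move(circle, occupied, width, height, mov_cost, rng=random):
    """Move the circle to a random empty neighbouring patch.

    circle   : dict with keys "pos", "energy", "tss" (hypothetical layout)
    occupied : set of (x, y) positions currently holding a circle
    Returns True if the move succeeded, False if the circle stayed.
    ASSUMPTION: the world does not wrap at its edges.
    """
    x, y = circle["pos"]
    empty = [(x + dx, y + dy) for dx, dy in NEIGHBOUR_OFFSETS
             if 0 <= x + dx < width and 0 <= y + dy < height
             and (x + dx, y + dy) not in occupied]
    if not empty:
        circle["tss"] += 1          # the circle stays, so TSS keeps counting
        return False
    target = rng.choice(empty)
    occupied.discard((x, y))
    occupied.add(target)
    circle["pos"] = target
    circle["energy"] -= mov_cost    # Mov.Cost is paid from E only on success
    circle["tss"] = 0               # TSS returns to zero after a completed move
    return True

circle = {"pos": (1, 1), "energy": 50.0, "tss": 4}
occupied = {(1, 1), (0, 1), (2, 1)}
print(attempt_move(circle, occupied, width=19, height=19, mov_cost=7.0), circle)
```

A set of occupied positions keeps the empty-neighbour check simple; a full model would also need the per-patch resource state.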

### Design concepts

#### Basic principle

*d-value*. The purpose of this model is to determine the usefulness of the MVT for explaining the relationship between *d-value* and the distribution of R, Mnt.Cost, Mov.Cost, and M.E.R. within various simulation environments. It also aims to identify whether dysfunctional behavioural patterns associated with defence activation disorders can be maintained as an ESS within simulated evolutionary environments.

#### Emergence

*d-value* and distribution of circles according to the *d-value*.

#### Adaptation

*d-value* of each agent limit their adaptive behaviour, and their behaviours modify their own *d-value* over generations. Breeding and death are determined by the R and the availability of empty surrounding patches. Thus, population density and the amount of resources in the local environment indirectly constrain the circles' behaviours.

#### Objective, Learning, Prediction, Sensing and Interaction

*d-value*. Here, the average amount of resource acquisition per unit time perceived by circles is not accurate because circles cannot sense it. Moreover, circles cannot perceive any other information about the environment, patches, or other circles, not even whether there are empty neighbouring patches around them. Interaction between the patches is not considered in the design, but they can affect other circles' behaviours through crowdedness.

#### Stochasticity

*d-value* of each circle are determined stochastically in an ecological environment. Also, in the initial setup, the placement of the circles is determined randomly. In the real world, circles may have some information about the amount of resources in neighbouring patches from their experience or a social network. Moreover, all individuals differ in movement efficiency, survival, and reproduction. However, for simplicity, the model does not consider asymmetric information levels or differences in physical or psychological capabilities between individuals. The model was designed based on the killjoy explanation, that is, intricate behavioural patterns can be produced by simple mechanisms [62-64].

#### Observation

*d-value* (mean and SD). Some of them show 4 different results per chart according to *d-value* (UA, NA, OA). They can be observed in real time. There are also 4 histograms, for age, E, R, and *d-value*, and a specific number of them are monitored in real time. All of them are presented in the Supplementary Materials (in the online-only Data Supplement).

### Details

#### Initialization

_{0}) of the circle can be adjusted up to 200, but the initial default value is a mean of 50 (0-100). The R.D.R. and M.R.D.R. of each patch can be set between 0 and 1. Mov.Cost can be set from 0 to 100, and its default value is 7. Mnt.Cost can be set from 0 to 100, and its default value is 20. Env.Ht. defaults to 1 and can be adjusted from 0 to 2. C.C. can be set up to 1,500, and its default value is 900. M.E.R. can be set up to 400, and its default value is 130. The *d-value* of the parent mostly determines the *d-value* of the offspring. The model is designed so that a circle gives birth to one to three offspring over its lifespan. This corresponds to two to six births, because parents in this model give birth to offspring that are already 15 years old; indeed, childhood mortality rates in hunter-gatherer communities range from 50 to 60% [67,68]. The innate *d-value* is the parent's *d-value* multiplied by a number drawn at random from a normal distribution with a mean of 1 and a standard deviation of SD-of-d. The default value of SD-of-d is 0.03.
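The inheritance rule can be written as a short Python sketch. The function name and the clamping of the result to the range 0 to 2 are assumptions (the clamp mirrors the bounds stated for $d$ earlier); the multiplicative normal factor and the default SD-of-d of 0.03 come from the text.

```python
import random

def offspring_d_value(parent_d, sd_of_d=0.03, rng=random):
    """Offspring d-value = parental d-value * N(1, SD-of-d).
    ASSUMPTION: the result is clamped to [0, 2]."""
    factor = rng.gauss(1.0, sd_of_d)
    return min(2.0, max(0.0, parent_d * factor))

rng = random.Random(42)
lineage = [1.0]
for _ in range(5):                      # five successive generations
    lineage.append(offspring_d_value(lineage[-1], rng=rng))
print([round(d, 3) for d in lineage])
```

Because the factor is multiplicative, the absolute step size scales with the parental *d-value*, so lineages with a high *d-value* drift in larger absolute steps than lineages with a low one.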

#### Sub-models

### (1) Energy acquisition and maintenance

Let $R_0$ be the initial resource amount of each patch. In each patch, the circle acquires resources as energy. Energy acquisition and Mnt.Cost are set as follows:

Let $R_k$ be the amount of resources remaining in the patch after $k$ repetitions, as follows:

### (2) Gradation of resource distribution

$R_0$: 300 minus the absolute value of X divided by 18, multiplied by 300. If the X value is 0, the resource is 300, and if the X value is 18, the resource is 0 (default). It is calculated as follows:

$$R_0 = 300 - \frac{|X|}{18} \times 300$$
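A Python sketch of the gradient follows. The general form with Env.Ht. as a multiplier is an assumption: it is a linear interpolation chosen to reproduce the endpoint values stated in the text (300 and 0 for Env.Ht. 1, 150 everywhere for Env.Ht. 0, and 450 and -150 for Env.Ht. 2). The function and parameter names are hypothetical.

```python
def initial_resource(xcor, env_ht=1.0, mean=150.0, max_xcor=18):
    """Initial resource R0 of a patch at x-coordinate xcor.

    ASSUMED general form: mean + Env.Ht. * mean * (1 - 2|x| / max_xcor).
    For Env.Ht. = 1 this reduces to 300 - |x| / 18 * 300, the default rule.
    """
    return mean + env_ht * mean * (1.0 - 2.0 * abs(xcor) / max_xcor)

for env_ht in (0.0, 1.0, 2.0):
    print(env_ht, [initial_resource(x, env_ht) for x in (0, 9, 18)])
```

Note that Env.Ht. values above 1 produce negative resource amounts at high xcor, as the stated range of -150 to 450 implies; how the model treats negative values is not specified here.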

### (3) Movement cost

### (4) Movement decision

$Y \sim U([0, 1])$; $R(p_n)$ is the amount of resources in the $n$th patch.

### (5) Reproductive probability

When the population ($X_1$) is 50% of the carrying capacity (C.C.) of the entire habitat, the likelihood of reproduction is $P_1$; when the population ($X_2$) is 100% of the C.C. of the whole habitat, the likelihood of reproduction is $P_2$. Then, intermediate variables A and B can be obtained as follows:

When $P_1$ is 0.7 and $P_2$ is 0.2, the logistic function of the probability of reproduction can be expressed as shown in the following chart (Figure 5).
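One way to obtain a logistic curve through the two stated points is sketched below in Python. The parameterisation $P(X) = 1 / (1 + e^{-(a + bX)})$ and the names `a` and `b` are assumptions; they may not correspond to the intermediate variables A and B of the original model. The values $P_1 = 0.7$, $P_2 = 0.2$ and C.C. = 900 are the defaults given in the text.

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def fit_logistic(x1, p1, x2, p2):
    """Coefficients (a, b) of P(X) = 1 / (1 + exp(-(a + b * X))) that pass
    exactly through (x1, p1) and (x2, p2)."""
    b = (logit(p2) - logit(p1)) / (x2 - x1)
    a = logit(p1) - b * x1
    return a, b

def reproduction_probability(population, a, b):
    """Rep.Prob. at a given population size under the fitted curve."""
    return 1.0 / (1.0 + math.exp(-(a + b * population)))

carrying_capacity = 900
a, b = fit_logistic(0.5 * carrying_capacity, 0.7, 1.0 * carrying_capacity, 0.2)
for population in (0, 450, 900, 1200):
    print(population, round(reproduction_probability(population, a, b), 3))
```

The fitted curve decreases monotonically with population size, so reproduction becomes less likely as the habitat approaches and exceeds its carrying capacity.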

### (6) The circle’s *d-value*

*d-value* of a circle is determined as follows:

*d-value*. Therefore, they are classified as follows (SD is 0.15):

_{0} is set to 50. Thus, if net resource acquisition is 0, circles die soon (after about 3 years). To survive, a circle should be positioned in patches that offer at least 20 units of R every year. SD-of-d is set to 0.03. Therefore, the phenotypic variation ($V_p$) between parent and offspring is a maximum of 0.03. The results of the calibration are presented comprehensively in the Supplementary Materials (in the online-only Data Supplement). The calibrated ranges of the primary parameters used in the simulation model are as follows. When parameters not listed in Table 3 are used in a simulation environment, they are described again.
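The survival arithmetic above (an initial energy of 50 and a yearly Mnt.Cost of 20) can be checked with a small Python sketch. The death condition (energy at or below zero), the yearly update order, and the `max_years` cap are assumptions made for illustration.

```python
def years_until_death(e0=50.0, yearly_gain=0.0, mnt_cost=20.0, max_years=100):
    """Number of years until energy E drops to zero or below.
    Returns None if the circle is still alive after max_years.
    ASSUMPTIONS: death occurs when E <= 0; gain and cost are applied once per year."""
    energy = e0
    for year in range(1, max_years + 1):
        energy += yearly_gain - mnt_cost
        if energy <= 0:
            return year
    return None

print(years_until_death())                   # no acquisition at all
print(years_until_death(yearly_gain=10.0))   # acquisition below Mnt.Cost
print(years_until_death(yearly_gain=20.0))   # acquisition equal to Mnt.Cost
```

With no acquisition the energy falls from 50 to 30, 10, and then below zero in the third year, which agrees with the figure of about 3 years. A yearly gain of 20 units exactly offsets Mnt.Cost, which is why 20 units per year is the survival threshold.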

### RESULTS

*d-value*. UA, NA and OA showed different TSS. The model provided stable ranges of *d-values* over time, and the proportions of UA, NA and OA also differed stably over time. The simulation model showed stable and predictable results for 5 or 10 kyr. Stress tests were conducted for 97.5 kyr, and the stability of the model was confirmed.

*d-values* were distributed differentially according to the amount of resources, and the numbers of UA, NA, and OA were distributed as expected with niche specialisation. The counts of UA, NA, and OA were negatively correlated with each other in the correlation analysis.

*d-value* of the object is extreme. In the basic simulation setup, each circle was randomly assigned one of three *d-values*: 0.5, 1.0, or 1.5. Experiments were therefore conducted to ascertain whether the *d-value* converged to a similar value even if it started at 0.5 or 1.5. The experiments were repeated 32 times in total. If the initial *d-value* remained unchanged or did not converge to a particular level, the assumption of the balancing selection model should be rejected in the simulation environment. However, despite the extreme initial settings, the *d-value* converged to a constant value over time. This agent-based model of defence activation disorder shows the expected outcomes of balancing selection. The relative proportions of UA, NA and OA became stable after about 1 kyr (Figure 7).

*d-value* of each circle was randomly assigned from 0.5, 1.0, and 1.5 at the beginning, the standard deviation decreased gradually with time, and the *d-value* converged to a value close to 1. The mean *d-value* was 0.949, and the standard deviation was 0.156. When the *d-value* of each circle was fixed at 0.5 or at 1.5 at the initial setting, the standard deviation gradually increased over time but stabilised at a similar level, and the *d-value* also converged to a value close to 1. The mean *d-value* was 0.939 and 0.947, and the standard deviation was 0.149 and 0.138, respectively. After about 1.5 kyr, they converged to similar patterns (Figure 8).

*d-value* of the circle converged stably in the simulation model. The initial condition affected the distribution, but its influence disappeared after about 0.5 to 1.5 kyr.

*d-values* over the 97.5 kyr was 0.95374 ± 0.018, and the maximum and minimum values were 1.016 and 0.895, respectively. Based on the above results, it is concluded that the agent-based simulation model of defence activation disorder has adequate reliability and feasibility.

*d-value* of circles according to the resource distribution were collected and analysed. The optimal *d-value* of the individual differed according to the resource distribution. The local population densities also varied depending on the amount of resources. As Mov.Cost increased, circles tended to gather in places with many resources, but the overall distribution was similar. The absolute *d-value* also differed, but the tendency of the difference along the resource gradient was broadly similar (Figure 10).

*d-values* depend on each local environment. In other words, diversity in the level of defence activation through niche specialisation was clearly observed in the simulation environment.

*d-values* would maintain inverse frequency-dependent selection.

### DISCUSSION

*d-value*) for the ego, the world and the future. As far as I know, no study has quantified the traits of dysfunctional behavioural patterns in the context of human behavioural ecology or evolutionary neuro-anthropology.

*d-value*, on ecological currency, i.e., the acquisition of energy, was studied.

*d-value* corresponding to local environmental conditions. The local optimal *d-value* was different from the global optimal *d-value* of the entire habitat. It was also observed that individuals with different *d-values* clustered differently depending on the local environment. In the resource-rich area, the subgroup with a low defence activation level clustered. In contrast, in the resource-poor area, the subgroup with a high defence activation level clustered. The model showed that multiple defence activation levels could be ESSs through the mechanism of balancing selection, specifically niche specialisation, at least in a simulated environment. A schematic diagram is shown in Figure 16.

*d-value*. If all other factors are controlled, different levels of defensive behaviour are less likely to evolve. The Total Niche Width (TNW) of the population can be divided into the Within-Individual Component (WIC) and the Between-Individual Component (BIC) [79]. All individuals in this model were assumed to have the same mental and physical abilities. Thus, Env.Ht. affects the BIC, and *d-value* is the only WIC. Specialisation occurs when WIC is much smaller than TNW, or when BIC is a large proportion of TNW. Therefore, under the evenly patchy environment, specialisation is unlikely to occur because BIC is small. However, under the resource-gradient environment, specialisation can occur quickly if free movement is limited.
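The partition of niche width can be made concrete with a small Python sketch. It uses the variance-based decomposition for a continuous niche axis, in which TNW equals WIC plus BIC when every individual contributes the same number of observations. The data and the function name are hypothetical and are not output from the simulation.

```python
from statistics import mean, pvariance

def niche_components(individuals):
    """Return (TNW, WIC, BIC) for a list of per-individual observation lists.

    TNW: variance of all pooled observations
    WIC: mean of the within-individual variances
    BIC: variance of the individual means
    ASSUMPTION: every individual has the same number of observations,
    so that TNW = WIC + BIC holds exactly.
    """
    pooled = [value for ind in individuals for value in ind]
    tnw = pvariance(pooled)
    wic = mean(pvariance(ind) for ind in individuals)
    bic = pvariance([mean(ind) for ind in individuals])
    return tnw, wic, bic

# Two hypothetical individuals using clearly different parts of the gradient.
tnw, wic, bic = niche_components([[1, 2, 3], [7, 8, 9]])
print(round(tnw, 3), round(wic, 3), round(bic, 3), round(wic / tnw, 3))
```

A small WIC/TNW ratio, as in this example, corresponds to the specialised case described above, in which most of the population's niche width lies between individuals.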

*d-values* will be minimised. This is because the total number of niches is limited. Since the number of niches where their *d-value* is optimal is limited, the growth of the sub-population increases the likelihood of moving to a suboptimal area. Thus, the niche specialisation phenomenon could work as the environmental factor that maintains the proportion of subgroups with different *d-values* in a frequency-dependent manner [30]. However, it does not appear as a predator-prey relationship, as the Lotka-Volterra equation [80] suggests. It is unclear what evolutionary meaning the negative correlation in this study has. A more sophisticated model is needed to distinguish the effects of niche specialisation and frequency-dependent selection.

*d-value* by itself does not induce absolute selection pressures, but a variety of optimal *d-values* may appear over the long term through relatively superior fitness to others in various microenvironments [28]. The multiple-niche polymorphism model could work with multiple ESSs if the geographic or social gradient of the environment does not change often.

*d-value* is determined relative to the distribution gradient of the resources in the simulation habitat. This is due not only to the fixed environmental conditions [36] where each agent is located but also to the frequency-dependent mechanism of local population density [35]. Under these conditions, the so-called “best defence activation level” (i.e., a globally ideal level of mood, anxiety, or fear) corresponding to the average resource value of the entire habitat has no evolutionary significance. Irrational behaviour can be caused by an adaptive decision process when the domain of selection and the domain of testing are mismatched [85,86]. The optimal level of defence activation is relative to various conditions and circumstances, which demonstrates that there is no absolute optimal value.

*d-value* and ecological constraints. In particular, the model was designed using the NetLogo programming platform, which is a robust multi-agent programmable modelling environment.