This is a project I worked on and wrote in my first semester of my PhD studies in 2015. It was intended as a mid-term paper. I’ve left the original text as is and posted it here. There will be typos and high levels of incomplete logic, but surely this will be useful to someone.
In the 1970s a prominent social science researcher by the name of Thomas Schelling surprised his colleagues (and a bunch of other people) by building a computational/mathematical model that showed that only a slight racial, or color-based, preference in a buyer’s geographic decision on a home purchase can create very segregated societies. In other words, communities in cities and rural areas that are heavily dominated by a specific group or demographic can emerge even when the people living in those communities are not necessarily super racist or bigoted.
Before you go hang a huge sign outside your door that celebrates the end of racism or something, please be aware that this model is a simple (toy) model that illustrates a point, and though it forms a good example in the computational social science research area for how simple agent-level behavior can aggregate to form interesting group-level dynamics, it is by no means a complete answer to how and why racism occurs.
Back then, computational power and access to computational tools were very limited, but nonetheless Schelling and his colleagues would build a model and wait for hours, sometimes days while his seemingly mechanical computer would run through the simulation. Today, his model is still being used by computational social scientists, traditional social scientists and a variety of other disciplinarians to explain the effect of these small choice variations on how we choose everything from where we live to what we buy, and how we base those decisions on what similar others do. The model is also pretty well documented and used in learning environments. Just so that we’re using the right terminology, homophily (the tendency to choose based on “sameness”) plays a big role in the segregation mechanism. Homophily has its own line of very interesting research as well, but we can talk about that another time.
So, in order to illustrate how this model works, I decided to rebuild it, extend it and modify it to include new behavior.
The Original Schelling Netlogo Model
The model was built in Netlogo, has simple rules, and produces clearly emergent results. The model instructs each agent to move randomly until it finds a place in space where it is “happy”. Agents (or turtles as they’re called in NetLogo lingo) continue moving and relocating until they find a “happy” place. This continues for some time or until the model converges to a steady state. Happiness is defined as the percentage (or the number) of directly adjacent agents that are the same as you (in the original Netlogo model sameness is defined only by color).
Below is a video of the original Netlogo model run found in the NetLogo native models library. Feel free to run the video a few times just to get a feeling of how it behaves. The model is run a few times in the video with different parameter settings. The parameters are density of agents in the 2-d space and the level of acceptable homophily for each agent. Simple!
The model is really quite elegant in its hypothesis and demonstration: small preferences can lead to very segregated communities. It hypothesizes that there might not be much to it in terms of formal and structured racism. The model lets you play around with the homophily grade to see the relevant difference in segregated communities as compared to agent-level decisions and it’s very clear that agents who may have very reasonable preferences in who they want to live with (in communities) can create some drastic results, in that communities become highly segregated.
Eventually however, the model loses some of its luster because it doesn’t take into account a number of other factors related to social preferences in the process of choosing where an agent might live. So, for many years researchers and modelers from within the agent-based community and from other disciplines have been extending the model again and again. By extending, we mean adding new processes, variables, and mechanics to better understand segregation. It also means trying to eliminate some of the artifacts of the model that do not necessarily make much sense. For instance, agents might be “happy” if no one is living immediately around them, which makes the model converge as long as there is enough empty space between agents. In real terms that doesn’t tell us much, so one way of overcoming that challenge is to give agents vision that lets them see not just their most immediate surroundings but a little further out (in Netlogo terms we want them to see more patches).
Original Model Mechanics
Let’s look at that in terms of basic model design diagrams. The basic Schelling model accounts for homophily, sets agent vision at 0 (or 1 depending on where you like to start counting) – meaning agents only see their immediate surroundings and utilizes random movement in agent relocating decisions. It also uses a basic Moore neighborhood:
Check out my Github Repository for the original model Netlogo code, courtesy of Wilensky and the Netlogo online models library.
The basic model works through a simple process. Ask all agents to move some number of steps in any random direction, then ask agents to check their immediate Moore neighborhood (8 patches) and determine how many other agents are the same color as the original (self). If the number of same-type agents surrounding the original agent is greater than or equal to what the observer (whoever is running the model) sets as the minimum level of acceptable homophily, then the agent becomes “happy” and stops moving. This means that some agents will move every turn, others will not move at all, and still others will move on some turns and stay put on others. At some point (we hope) the model will converge to a stable state and none of the agents will continue moving.
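The process above can be sketched in a few lines of Python. This is a minimal illustration, not the NetLogo code from the models library: the function names are made up for this example, the grid is assumed to wrap at the edges, and unhappy agents jump straight to a random empty cell instead of wandering step by step.

```python
import random

EMPTY = None  # marker for an unoccupied cell


def moore_neighbors(grid, row, col):
    """Return the occupants of the 8 surrounding cells (grid wraps at the edges)."""
    size = len(grid)
    occupants = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the agent's own cell
            cell = grid[(row + dr) % size][(col + dc) % size]
            if cell is not EMPTY:
                occupants.append(cell)
    return occupants


def is_happy(grid, row, col, min_similar_fraction):
    """Happy when the share of same-color neighbors meets the threshold."""
    me = grid[row][col]
    neighbors = moore_neighbors(grid, row, col)
    if not neighbors:
        return True  # an agent with no neighbors counts as happy
    same = sum(1 for other in neighbors if other == me)
    return same / len(neighbors) >= min_similar_fraction


def step(grid, min_similar_fraction, rng):
    """Move every unhappy agent to a random empty cell; return how many moved."""
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    unhappy = [(r, c) for r, c in cells
               if grid[r][c] is not EMPTY
               and not is_happy(grid, r, c, min_similar_fraction)]
    empties = [(r, c) for r, c in cells if grid[r][c] is EMPTY]
    rng.shuffle(unhappy)
    moved = 0
    for r, c in unhappy:
        if not empties:
            break
        i = rng.randrange(len(empties))
        nr, nc = empties[i]
        grid[nr][nc], grid[r][c] = grid[r][c], EMPTY
        empties[i] = (r, c)  # the vacated cell is now available
        moved += 1
    return moved
```

Calling `step` repeatedly until it returns 0 corresponds to the stable state described above. Note the `return True` branch for agents with no neighbors: that is the empty-space artifact mentioned earlier.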
The results, as you can see from the model runs, are simply amazing and super interesting, so wouldn’t it be even more interesting if we could extend the model to incorporate some other super interesting factors? That is precisely what I did (by the way, this has been done hundreds of times by all manner of researchers and for all manner of variables in top journals).
Schelling Segregation Model (Extended) With Increased Vision, Multi-level Selection, and Propinquity
The extension I designed for this project mainly investigates 3 factors working together or, in some cases, against each other. The first is “Vision”, that is, giving each agent increased vision around them. I opted to keep the Moore neighborhood design instead of picking a Von Neumann or any other design so that I can compare visions on a one-to-one ratio. The neighborhood vision is extended to 3, meaning that agents can now see other agents up to 3 patches away, compared to before when they were only able to see other agents at D = 1, their immediate surroundings. The rules for the extended model essentially come down to the following:
- Every agent has 3 attributes, initiated randomly
- Every agent can “see” 3 neighborhoods with d1, d2, and d3 distances
- A happiness point approach is applied to calculate the agent’s similarity/homophily by considering propinquity as they move. The closer an agent is, the more effect they have on your own happiness
- The happiness point system considers both similarities and dissimilarities (positive and negative points) by rewarding homophily and punishing heterogeneity
- Agents’ preferences and vision are set as follows:
  - Color from (d3) distance
  - Color and shape from (d2) distance, and
  - Color, shape, and wealth from (d1) distance
Model Extension Pseudocode
For those who’d like to see the pseudocode to program their own model, the core of the model essentially goes as follows:
//Rules
//if attribute match @ d = 1 H+3, if not H-3
//if attribute match @ d = 2 H+2, if not H-2
//if attribute match @ d = 3 H+1, if not H-1

Check Propinquity/Social Sense
  if you are @ d = 3
    if (color = color) Happiness + 1
    if not Happiness - 1
  if you are @ d = 2
    if (color = color and shape = shape) then Happiness + 4
    if either (color = color OR shape = shape) Happiness + 0 //Don't need to code this line since it has no effect on the point system
    if neither (color = color and shape = shape) Happiness - 4
  if you are @ d = 1
    if (all attributes = all attributes) Happiness + 9
    if (color = color and shape = shape and wealth != wealth) Happiness + 6
    if (color = color and wealth = wealth and shape != shape) Happiness + 6
    if (shape = shape and wealth = wealth and color != color) Happiness + 6
    if (color = color but shape != shape and wealth != wealth) Happiness - 3
    if (color != color and shape != shape but wealth = wealth) Happiness - 3
    if (color != color and shape = shape and wealth != wealth) Happiness - 3
    if (all attributes != all attributes) Happiness - 9
Add up the Happiness points/update variables
Move Turtles
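For readers who prefer something runnable, here is a minimal Python sketch of the per-attribute rule stated in the comment lines at the top of the pseudocode (+3/-3 at d = 1, +2/-2 at d = 2, +1/-1 at d = 3, counted only over the attributes visible at that distance). It is not the NetLogo code from the repository; the `Agent` class and function names are assumptions made for illustration. One caveat: under the strict per-attribute rule, two matches and one mismatch at d = 1 add up to +3, whereas the case list in the pseudocode gives +6 for those cases. The sketch follows the per-attribute rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    # Hypothetical container for the three attributes used in the post.
    color: str
    shape: str
    wealth: int


# Attributes origin can see at each distance, and points per visible attribute.
VISIBLE = {1: ("color", "shape", "wealth"), 2: ("color", "shape"), 3: ("color",)}
POINTS = {1: 3, 2: 2, 3: 1}


def happiness_points(origin, alter, distance):
    """Score one alter as seen by origin: +points per match, -points per mismatch."""
    if distance not in VISIBLE:
        return 0  # beyond d = 3 the alter is not seen at all
    weight = POINTS[distance]
    total = 0
    for name in VISIBLE[distance]:
        if getattr(origin, name) == getattr(alter, name):
            total += weight
        else:
            total -= weight
    return total


def total_happiness(origin, alters_with_distance):
    """Sum the points over every (alter, distance) pair in origin's vision."""
    return sum(happiness_points(origin, alter, d) for alter, d in alters_with_distance)
```

For example, `happiness_points(Agent("red", "circle", 1), Agent("red", "square", 1), 2)` is 0: color matches (+2), shape does not (-2), and wealth is not visible at d = 2.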
Model Extension Mechanics
The attributes themselves could signify any real world attribute, but to make the model easy to understand I picked color of agent, shape and then a non-displayable variable which I call wealth. I play around. The expanded Moore neighborhood adds enormous complexity to the model, but allows vision to really play an interesting role. For example, although agents in the model can now see farther away than in the original model, they can only see 1 out of the 3 attributes that other agents possess at D = 3, only 2 attributes at D = 2, and all 3 attributes at D = 1. This tries to simulate a process which we are all familiar with: we learn more about the neighbors who live closer to us because we see them more often, talk to them more often, and generally interact with them at a much deeper level than neighbors who live all the way at the far edges of our neighborhood.
An example of the expanded Moore neighborhood above. The arrows represent which agents in this particular neighborhood would be acting on each other based on my logic outlined above. The top left and bottom left agents are not influencing each other’s decisions because they fall outside the outlined expanded Moore neighborhood.
The next factor I implemented in my model is what I’m calling multi-level selection integrated with propinquity. What I mean by multi-level selection is the process of selecting a neighboring agent based on multiple levels of similarity or dissimilarity. Since vision is expanded in my model, but agents see fewer of other agents’ attributes the farther away they are, it only makes sense that they can be “happy” or “sad” based only on the factors that they can see. For example, if an agent is at D = 2 from another agent (let’s refer to him as Origin) then Origin can only see 2 attributes out of the 3 that Alter (the other agent) possesses, and thus Origin can only decide whether s/he likes that agent or not based on those 2 attributes, not the full 3. I’m calling that multi-level selection to keep it simple.
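To make the geometry concrete, here is a small Python sketch of how the distance D and the visible attributes could be computed on a grid. It assumes that D in the expanded Moore neighborhood is the Chebyshev distance (the larger of the horizontal and vertical offsets) and ignores world wrapping; the function names are made up for this example and are not taken from the NetLogo code.

```python
def ring_distance(origin, alter):
    """Chebyshev distance: the Moore 'ring' that alter occupies around origin."""
    return max(abs(origin[0] - alter[0]), abs(origin[1] - alter[1]))


def visible_attributes(distance):
    """Attributes Origin can see on Alter at a given ring, per the rules in the text."""
    by_ring = {
        1: ["color", "shape", "wealth"],
        2: ["color", "shape"],
        3: ["color"],
    }
    return by_ring.get(distance, [])  # nothing is visible beyond D = 3


def ring_offsets(distance):
    """All (dx, dy) offsets that lie exactly on the ring at the given distance."""
    span = range(-distance, distance + 1)
    return [(dx, dy) for dx in span for dy in span
            if max(abs(dx), abs(dy)) == distance]
```

Under this assumption the rings at D = 1, 2, and 3 contain 8, 16, and 24 patches, so each agent checks up to 48 patches instead of the original 8.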
Multi-level selection is combined with propinquity in my model. Propinquity is essentially the state of being close to others. We can call it proximity as well. Now traditionally, I would’ve wanted to code propinquity using networks in Netlogo, where alters close to origin have a stronger relationship and thus exert more influence, but Netlogo, being good for computationally non-intensive modeling, wasn’t going to keep up with the computations. It was also a little more difficult to determine what type of network I should use, and to code it well, so I used a shortcut that applies the effects of propinquity without having to code a network in NetLogo. The way I included propinquity relies on a point system and on multi-level selection: the closer you are to Origin, the greater the impact of your similarities and differences. So if alter is at D = 1, attribute similarity/dissimilarity counts for a lot more happiness and/or unhappiness than if alter was at D = 3.
I also model the strength associated with propinquity to be (1) non-linear and (2) approximately power-law or exponentially distributed. I know this is a strong assumption, but a while back I read a number of peer-reviewed articles that approximate relationship strength between network actors as being non-linear and skewed, and thought it would be more interesting and closer to reality to model it that way. Thus, if you look at the pseudocode you’ll find that I award 3/-3 points, 2/-2 points, and 1/-1 point for each homophilous/heterogeneous attribute between origin and alter at D = 1, D = 2, and D = 3 respectively. Combined with the fact that fewer attributes are visible at greater distances, this actually results in a skewed, nonlinear effect. If I wanted it linear I would award the same number of points (both positive and negative) at each level for each attribute consideration.
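One way to see where the nonlinearity comes from is to multiply the points per attribute by the number of attributes visible at each distance. The short Python sketch below does just that; the dictionary and function names are illustrative assumptions, not part of the NetLogo model.

```python
# Points awarded per matching attribute at each distance (from the pseudocode rules).
POINTS_PER_ATTRIBUTE = {1: 3, 2: 2, 3: 1}
# Number of attributes Origin can see on Alter at each distance.
VISIBLE_COUNT = {1: 3, 2: 2, 3: 1}


def max_points(distance):
    """Largest score a single fully matching alter can contribute at this distance."""
    return POINTS_PER_ATTRIBUTE[distance] * VISIBLE_COUNT[distance]


for d in (1, 2, 3):
    print(f"D = {d}: up to +/-{max_points(d)} points per neighbor")
```

The maximum contribution per neighbor is 9, 4, and 1 points at D = 1, 2, and 3: the per-attribute weight falls linearly with distance, but the total influence of a neighbor falls off quadratically.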
The model design changes when compared to the earlier model basically look like this:
Note that I did not modify the basic movement mechanism at all, so this is all still based on random movement. However, in order to make my new system work, the “happiness” calculation mechanism had to be completely redesigned. Therefore, happiness in the original Schelling model is mechanistically different from happiness in my extended model, which relies on an aggregation of happy or sad points rather than a strict one-dimensional view of happiness. When compared to reality I think that fits the structure of home buying decisions more precisely. At minimum, it’s more interesting.
Running the (Extended) Model
Let’s get to the model runs, shall we.
Below is a video of multiple runs under different conditions and a number of parameters. I also did a number of edits to the code on the fly in the video (code is available on Github) to increase or decrease the number of attribute values in order to get the model to converge. You can see pretty quickly that the model rarely converges when the agents’ attributes can take a large number of values, for example if there are too many color choices, shape choices, or wealth choices. The best convergence settings I found from multiple runs were wealth set at 0 or 1 (or any two static values), color limited to at most 3 different choices, and shape limited to 2 different choices. With anything more than that, given the size and mechanics of the model, it rarely ever converged to an emergent, stable state (where everybody was relatively happy).
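A rough back-of-the-envelope calculation hints at why more attribute values make convergence harder. The Python sketch below is an illustration rather than an analysis from the original project: it assumes each attribute is drawn uniformly and independently, uses the +3/-3 per-attribute rule at D = 1, and computes the expected score from one randomly chosen adjacent neighbor. The function name is hypothetical.

```python
from fractions import Fraction


def expected_points_d1(value_counts, points=3):
    """Expected score from one random D = 1 neighbor.

    value_counts lists how many values each attribute can take, e.g.
    [3, 2, 2] for 3 colors, 2 shapes, and 2 wealth levels. Each attribute
    is assumed uniform and independent (an assumption of this sketch).
    """
    total = Fraction(0)
    for n in value_counts:
        p_match = Fraction(1, n)
        # +points with probability p_match, -points otherwise
        total += points * (p_match - (1 - p_match))
    return total
```

With 3 colors, 2 shapes, and 2 wealth levels, `expected_points_d1([3, 2, 2])` is -1, so a randomly mixed neighborhood is already slightly unhappy on average. With 5 colors, 4 shapes, and 3 wealth levels it drops to -43/10, which means agents start much further from happiness and need far more favorable clustering to get there.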
Model Extension Results
The first result is fairly obvious: it’s much harder to converge the model to a steady, emergent state, much harder. It also seemed that the extended Schelling model was a little insensitive to input parameters like how acceptable homophily levels are set for the agents. You saw that in the model runs. Also, model convergence was much more sensitive to the density of agents this time around, unlike in the original model. Specifically, the model converged less often at low densities. This is actually relatively simple to explain: my model valued empty space a lot less than the original because it used a point system that only worked when there were others around you. I suppose I can make empty space more relevant again if I give the agents almost unlimited vision. I can probably do that by assuming that the internet has an effect on where agents choose to settle. It’s a pretty weak assumption, but I’ve seen weaker ones in peer-reviewed papers.
When I presented this model a while ago I did get some good questions about what the system-level point system would likely represent in real life. It’s a good question, since for me the primary reason I used the point system was to avoid having to code a network in NetLogo, but there is some theoretical thought going into it nonetheless. Basically, I think of the system-level point system as being analogous to learned cultural behavior, in essence something like a system-wide knowledge base. In Washington, DC, where I live, for example, everyone “knows” which areas to avoid living in and which areas are “hot”. Whether the masses’ assessment of their neighborhoods is true or not isn’t the question here. The reason I analogize it to something similar to culture is that there are cities where the agents are not randomly moving at all, but would only move within specified areas that are deemed, culturally, to be safe for them. Think of 8 Mile Road in Detroit, Michigan. North of 8 Mile Road the population is heavily White, and south of that road it is heavily Black or people of color in general (Link). The effect is so strong it can, in no way, be based only on homophily. It’s simply too strong to be explained by ONLY agent-level behaviors to begin with, unless you include a system-level variable in there. Either way, the point-level system is not actually taken into account on the agent level. At least not in this particular model.
Ultimately, the end result of the extension simulation is very simple. I found that propinquity, increased vision, and the multi-level homophily selection changed the dynamics of the similarity curve from some second order effect to a third order effect (see figures below).
The best way I can explain that is to consider the effects of propinquity and the fact that at d = 1 from ego/actor the strength of the negative or positive energy points (the system level dynamic) overpowers everything else, but first there needs to be some clustering for the effects to be powerful enough. That’s how you get that double curved line for similarity in the extension model. At first, none of the agents are finding the best places for them to live, but as soon as one cluster begins to form and a few agents are happy, they quickly begin to attract others who are also seeking happiness, and because proximity has a powerful effect on the agents’ happiness or sadness, clusters begin to form rapidly. Finally, as the best configuration of clusters begins to take shape, the similarity and thus happiness levels begin to level out. In other words, the totality of my model comes down to clusters forming.
As additional evidence of this let’s check out the positive and negative energy curves:
The black curve is the positive energy curve, and the red curve is the negative energy curve. This particular curve is a rare occurrence because as I mentioned earlier, most model runs did not converge to a stable or an emergent state. However, if you look at the relative energy levels the negative energy level never really dominates when the model converges. At best it gets close to the positive energy levels only in the first few turns of the model itself. But as soon as the clusters begin to form the positive energy levels begin to take over and are compounded by the proximity (propinquity) factor built into the model.
Some other smaller ramifications of the model included:
- Propinquity and social sense make the model more realistic by creating clumpy/clustered behavior.
- Phase shifts occur when a group of homophilous residents begin to converge on a location, allowing rapid local development of an area.
- More moderate behavior still occurs as individuals choose based on wider criteria and are less susceptible to extremism.
- My model is much more sensitive to density.
Conclusions & Final Thoughts
This was a fun and very exciting project. It took weeks of coding in Netlogo and a few weeks for analysis. I didn’t walk through the verification and validation process here because it would’ve made the article way too long, but a lot of validation was conducted.
I guess the takeaway from this model is the effect of adding more stimulating variables to the original Schelling model, and how much more realistic and interesting it becomes the more is added to it. Of course, this has been done many times using different variations, but even so, with each new mechanism or variable added we seem to get even more new insights. It’s just a testament to how rich this model really is.
I hope you enjoyed reading about this project, and feel free to check out my other projects, and comment/ share your thoughts.