The Phersu Atlas Model
Phersu Atlas is an interactive historical atlas which offers the opportunity to observe the geopolitical changes that have marked the world, day by day, in a time span between 3,499 B.C. and today. Alongside a partial reference bibliography, we thought it appropriate to explain the methodology we used. Phersu Atlas is first and foremost based on the algorithmic derivation of historical data; then, at a later stage, on the verification and correction of that material against a series of supporting scientific texts.
In the initial Data Search Phase, we collected data on the spatial changes that have occurred on all continents from 3,499 B.C. to the present. We wrote a web-scraping algorithm to obtain two basic lists from Wikipedia, based on a set of input categories: a list of all states mentioned there and a list of wars. We then split the work into a series of identically structured tables, organized by region and period, and carefully verified the historiographical information they contained. We used a reference bibliography to check for states that were not already in the lists of wars and states. All the files were eventually combined into a single list, and initial qualitative and logical checks were run on it using VBA code.
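The collection step can be illustrated with a minimal sketch. It is not the project's scraper: it swaps in the public MediaWiki API (`list=categorymembers`) instead of HTML scraping, and the function name `category_members` and the injected `fetch` callable are hypothetical names chosen for this example.

```python
from typing import Callable, Iterator
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def category_members(category: str, fetch: Callable[[str], dict]) -> Iterator[str]:
    """Yield the titles of all pages in a Wikipedia category.

    `fetch` is a hypothetical helper supplied by the caller: it takes a URL
    and returns the decoded JSON response as a dict.
    """
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": "500",
        "format": "json",
    }
    while True:
        data = fetch(f"{API_URL}?{urlencode(params)}")
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        # The API signals further result pages with a "continue" object
        # whose keys must be echoed back in the next request.
        if "continue" not in data:
            break
        params.update(data["continue"])
```

Passing `fetch` in from outside keeps the pagination logic testable without network access; a real run would wrap an HTTP client of one's choice.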
To achieve a realistic, high-quality result in very complex situations, where there were many states, often of dubious independence, we treated such cases separately. This is the case with the Germanic Confederations, the Greek city-states and, in Chinese history, the Zhou Era (12th-8th centuries B.C.), the so-called Spring and Autumn period (722-481 B.C.) and the Warring States period (453-221 B.C.). For the successive Germanic confederations in German history, we extracted a list of all candidate states from a number of specific sources and eliminated a portion of them on the basis of certain logical assumptions, chiefly size. In addition, enclaves and exclaves were treated in a simplified manner, following the level of detail adopted by Gustav Droysen in his maps, which are drawn at a scale of 1:3,400,000. Spatial changes were then documented in a table with the spatial units on the vertical axis and time on the horizontal axis. This made it possible to check for gaps and errors by comparing the table with the reference maps.
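The gap check on a units-by-time table can be sketched as follows. This is only an illustration under stated assumptions: each spatial unit is represented as a list of `(start, end, owner)` intervals, dates are plain integers (for example astronomical years, negative for B.C.), and `find_gaps` is a hypothetical name.

```python
def find_gaps(intervals, span_start, span_end):
    """Return the (start, end) ranges inside [span_start, span_end)
    that no interval covers.

    intervals: iterable of (start, end, owner) with half-open ranges.
    """
    gaps = []
    cursor = span_start  # first instant not yet known to be covered
    for start, end, _owner in sorted(intervals):
        if start > cursor and cursor < span_end:
            gaps.append((cursor, min(start, span_end)))
        cursor = max(cursor, end)
    if cursor < span_end:
        gaps.append((cursor, span_end))
    return gaps


# One row of the table: a single spatial unit and its successive owners.
row = [(-500, -300, "State A"), (-250, -100, "State B")]
print(find_gaps(row, -500, -100))  # [(-300, -250)]
```

Any range reported here would be a period for which the unit has no documented owner and which needs to be checked against the reference maps.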
At this point we consolidated the Input Data and created Supporting Data: we gathered all the material assembled so far and, once we had identified all the map files, georeferenced them. We then hand-drew all regions subject to spatial changes. Where the supporting sources mentioned them with certainty, we also took into account changes in the coastline over time. All drawings thus obtained were processed by applying a modified Mercator cylindrical projection (the variant used by most web mapping libraries and by services such as OpenStreetMap and Google Maps). The individual drawings were saved in GeoJSON format, which has the advantage of being convertible to any other projection in case of future need. The table of spatial changes had numerous columns, including start state, finish state, a description of the event that brought about the spatial change, the area affected by the territorial change (as a polygon in GeoJSON format or a point with precise geographic coordinates), and the war, or front of the war, of which the territorial change is a part. At this point all lists were unified. The polygons of the models described so far, together with their territorial modifications, were in turn added to the unified list. That list and its related files then included about 100,000 spatial modifications, 10,000 states, 10,000 georeferenced maps and as many files in GeoJSON format. Finally, the rows were sorted in chronological order.
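As a rough illustration of the projection step, the sketch below assumes the projection in question is the spherical "Web Mercator" used by OpenStreetMap and Google Maps (EPSG:3857). The function names and the property names in the sample record are hypothetical and only mirror the table columns listed above.

```python
import json
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def to_web_mercator(lon, lat):
    """Project WGS84 longitude/latitude (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def from_web_mercator(x, y):
    """Inverse projection: Web Mercator metres back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# A spatial-change row as a GeoJSON Feature (property names are assumptions).
change = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [12.5, 41.9]},
    "properties": {
        "start_state": "State A",
        "finish_state": "State B",
        "description": "illustrative event",
        "war": None,
    },
}
print(json.dumps(change)[:40])
print(to_web_mercator(*change["geometry"]["coordinates"]))
```

Because the inverse is available in closed form, coordinates stored one way can be converted to the other without loss beyond floating-point rounding, which is the kind of convertibility the paragraph above relies on.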
We wrote an algorithm that computes states based on spatial changes and saves both the changes and the state-by-date combinations (e.g., Italy from time t to time t+1, Italy from time t+1 to time t+2, etc.) in a database. The algorithm processes one row at a time, "transferring" the affected region from the starting state to the target state. It includes a number of advanced functions: a) it tests the logic of spatial changes; b) if the region involved is a point, it calculates the shortest route to the point, so that the affected region becomes the point itself together with its surroundings; c) it has a number of complex functions to handle the priority of sources and resolve doubtful situations; d) the calculation of the individual polygons is fully automated by a loop (the code calculates all delta polygons). At the end of this phase, all spatial changes of states and coastline, along with all versions of states that existed from 3,499 B.C. onward (on a daily basis), were saved in a database. After manual verification, a unique list of wars and states was added to the database. Finally, the states were detailed according to the unique name-parent-cluster logic.
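The row-by-row "transfer" can be sketched in a few lines. This is a deliberately simplified model, not the project's code: territories are sets of abstract cell identifiers instead of polygons, dates are integers, and `apply_changes` is a hypothetical name. The only logic test shown is that a state cannot cede a region it does not hold.

```python
from collections import defaultdict


def apply_changes(initial, changes):
    """Replay spatial changes and return every version of every state.

    initial: dict mapping state name -> set of cell ids it starts with.
    changes: list of (date, source, target, cells), sorted by date.
    Returns a list of (state, valid_from, valid_to, cells) tuples, where
    valid_from/valid_to is None for an open-ended version.
    """
    territory = defaultdict(set, {s: set(c) for s, c in initial.items()})
    since = {state: None for state in initial}
    versions = []
    for date, source, target, cells in changes:
        cells = set(cells)
        # Logic test: the source state must actually hold the region.
        if not cells <= territory[source]:
            raise ValueError(f"{source} does not hold {sorted(cells)} at {date}")
        # Close the current version of both states involved in the change.
        for state in (source, target):
            if territory[state]:
                versions.append((state, since.get(state), date, frozenset(territory[state])))
            since[state] = date
        # Transfer the region from the source to the target.
        territory[source] -= cells
        territory[target] |= cells
    # Versions still valid after the last change stay open-ended.
    for state, owned in territory.items():
        if owned:
            versions.append((state, since.get(state), None, frozenset(owned)))
    return versions
```

With real geometries the set difference and union would become polygon difference and union, which is where the "delta polygons" mentioned above would come from.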
Once the model was completed, a special algorithm was written to color it: it colors neighboring states with different colors, assigns dependent states the same color as the parent state, uses the same color over time not only for the same state but also for the entire cluster (except during periods when the state is controlled by others), and finally, assigns a second color to all the states that are part of the Holy Roman Empire according to their category (imperial, ecclesiastical or independent). Also by algorithmic means, a list of regions with unclear control due to revolts, coups, or revolutions was created. It was decided to make this additional calculation to separately take into account situations where territorial control is doubtful.
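Two of the coloring rules described here, different colors for neighbors and a shared color for dependent states and their parent, can be sketched with a greedy graph coloring. This is an illustrative stand-in rather than the atlas's algorithm: it assumes a single level of dependency, ignores color persistence over time and the Holy Roman Empire categories, and `color_states` is a hypothetical name.

```python
def color_states(neighbors, parent, palette):
    """Assign colors so that bordering states differ, except that a
    dependent state always takes the color of its parent.

    neighbors: dict state -> set of bordering states (symmetric).
    parent:    dict dependent state -> parent state (one level only).
    palette:   ordered list of available colors.
    """
    def root(state):
        return parent.get(state, state)

    # Contract each dependent state into its parent, then build the
    # adjacency between the resulting groups.
    adjacency = {}
    for state, borders in neighbors.items():
        adjacency.setdefault(root(state), set())
        for other in borders:
            if root(other) != root(state):
                adjacency[root(state)].add(root(other))
                adjacency.setdefault(root(other), set()).add(root(state))

    # Greedy coloring, most-constrained groups first.
    group_color = {}
    for group in sorted(adjacency, key=lambda g: (-len(adjacency[g]), g)):
        used = {group_color[n] for n in adjacency[group] if n in group_color}
        free = next((c for c in palette if c not in used), None)
        if free is None:
            raise ValueError("palette too small for this map")
        group_color[group] = free

    return {state: group_color[root(state)] for state in neighbors}
```

A greedy pass does not guarantee the smallest possible number of colors, so the palette should be somewhat larger than the theoretical minimum.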
For the Demographic Calculation, whose main reference data were those of the Maddison Project and the World Bank, several input files were created, covering a total of 295 territories. The population per year for each second-level subdivision was then calculated algorithmically, using as a reference the percentage of population given by the density of the individual subdivision divided by the population of the total territory. The population of each state in the model on Jan. 1 of each year was then calculated by comparing the state's territory at that time with the territorial subdivisions and their population for that year. For each second-level subdivision, the area of its intersection with the state in question was used as the weight that determines what percentage of the subdivision's population is assigned to that state.
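The area-weighting step can be made concrete with a small worked example. The sketch assumes that the intersection areas have already been computed by some GIS tool and that population is spread uniformly within each subdivision; `state_population` is a hypothetical name.

```python
def state_population(subdivisions):
    """Estimate a state's population from overlapping subdivisions.

    subdivisions: iterable of (population, area, overlap_area), where
    overlap_area is the part of the subdivision lying inside the state.
    Each subdivision contributes population * overlap_area / area.
    """
    total = 0.0
    for population, area, overlap in subdivisions:
        if area <= 0 or not 0 <= overlap <= area:
            raise ValueError("overlap must lie between 0 and the subdivision area")
        total += population * overlap / area
    return total


# Half of the first subdivision, all of the second, none of the third.
print(state_population([(1000, 50.0, 25.0), (400, 10.0, 10.0), (900, 30.0, 0.0)]))  # 900.0
```

Here the state receives 500 people from the first subdivision and 400 from the second, for a total of 900.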
In the Database Enrichment Phase, algorithms were used to obtain three results: a) the polygons were converted into less detailed versions that could be used more easily in the software; b) statistics were derived from the data calculated in the previous steps, and geographical polygons for each date of the Holy Roman Empire, the Confederation of the Rhine, the German Confederation, and the North German Confederation were calculated from a membership list; c) these statistics and all data were consolidated into one database.
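The text does not say which method was used to produce the less detailed polygons. One common choice is the Ramer-Douglas-Peucker algorithm, sketched below purely as an example of the technique; `simplify` and its `tolerance` parameter are hypothetical names.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:  # degenerate segment, e.g. a closed ring
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Ramer-Douglas-Peucker: drop vertices that lie within `tolerance`
    of the line joining the vertices that are kept."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


border = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
print(simplify(border, 0.1))  # [(0, 0), (2, 0), (3, 2), (4, 0)]
```

A larger tolerance yields coarser outlines, so several tolerances could be precomputed for different zoom levels.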