# brain-readme.txt
#
# Project: SndLib
#
# Demand matrices from measurements in the BRAIN research network in Berlin

*************
Content:
*************

1. Origin
2. Topology
3. Creation
4. Format
5. Remarks

***********************
1. Origin
***********************

The dynamic demand matrices contained in the archives

  - directed-brain-1min-over-7days.tgz        (xml files)
  - directed-brain-1min-over-7days-native.tgz (native format files)
  - directed-brain-1h-over-375day.tgz         (xml files)
  - directed-brain-1h-over-375day-native.tgz  (native format files)

are calculated from real-life traffic data from the BRAIN research
network in Berlin, see https://www.brain.de/. The traffic was measured
on each link of the network in bits per second.

For the first set of demands, the original data was taken in 1-minute
steps, starting on 08.03.2013 at 14:52 and ending on 15.03.2013 at
08:56. There are 9723 traffic files. For at least one timestamp there
is no demand file, and there may be further gaps; see the "Remarks"
section.

The second set of demands spans 05.03.2012 15:00 until 15.03.2013
08:00. There are 8993 demand files, possibly containing gaps.

The network topology was available, but no capacities, costs, or
coordinates were given.

***********************
2. Topology
***********************

The network contains 9 backbone nodes, which are connected such that
there are multiple independent paths between any two backbone nodes.
Each backbone node is connected to multiple regional ("regio") nodes,
which are the sources and targets of all demands. There are 152 regio
nodes in the network. Each regio node is connected to exactly one
backbone node, and there are no connections between regio nodes.

***********************
3.
Creation
***********************

Demands:

To create point-to-point demands for each time stamp that are valid in
the sense that routing the created demands in the network may lead to
the given traffic values on the links, the following steps were taken:

1. For each node, coordinates spread across Germany were chosen so
   that the produced pictures look nice.

2. Since each backbone node measures all traffic on each of its
   connected links, there are two traffic values for each backbone
   link, one for each endpoint of the connection. Since these values
   need not be equal, their mean is used.

3. For each regio node, the ingoing and outgoing traffic is known,
   since it equals the traffic on the corresponding links to the
   backbone node. Based on this, an ideal demand goal value is
   calculated for each pair of distinct regio nodes. The goal for the
   demand from regio1 to regio2 is

      outgoingTrafficOfRegio1 * ingoingTrafficOfRegio2 * alpha

   The factor alpha is chosen such that the following holds:

      sum of traffic on all ingoing links of the regio nodes
      = sum of all demand goals

4. To calculate the demands, the following LP model was solved to
   optimality: for each regio-to-regio demand, a set of path-based
   flows is calculated. Then the following values were calculated,
   forming the objective of the optimization:

      a: average relative difference between the demand goal and the
         total flow on all paths, over all pairs of regio nodes
      b: average relative difference between the traffic on a
         backbone-to-backbone link and the sum of flows on that link
      c: average relative difference between the traffic on a
         regio-to-backbone link and the sum of flows on that link

   The objective was a + 5*b + 10*c. For 24 out of the 9723 time
   stamps the optimal value lies in the range of 1 to 7; the rest are
   smaller than 1.

5.
Since the resulting regio-to-regio demands are fractional, the
   demand values are rounded to integer values; solving the
   corresponding model with an integrality constraint would have
   taken much more computation time. All resulting demands with value
   zero have been removed. This results in the given directed demands
   (without loops).

Cost:

The costs given in the native format files are based on the model of
Huelsermann et al. from the paper "Cost modeling and evaluation of
capital expenditures in optical multilayer networks" (2008). Using
the values from Table 2, the cost of a 1G or 10G Ethernet link is
composed of the costs of two corresponding ports with 40 km reach
plus the corresponding partial costs of the needed port card and a
basic node. These values are multiplied by 1000 to avoid highly
fractional numbers.

***********************
4. Format
***********************

For the new multiple demand matrix archives we decided NOT to
introduce a new XML schema or data format but to use the existing
SNDlib formats. This means that all the available code
(parsing/writing) can also be used for the multiple matrices.

A single demand matrix in a multiple matrices archive is just a
Network object without a link section, that is, it consists of nodes
and demands between the nodes. It follows that the Network
parser/writer available in the SNDlib API can be used to parse/write
a single demand matrix. The node sections of all single matrices in
the brain archives are of course identical and correspond to the
brain SNDlib network.

In addition to a node and a demand section, a single demand matrix
also has a Meta-Section giving additional information about the
matrix, such as the time stamp, the time horizon, the origin, and the
data unit. The new SNDlib API 1.3 is able to handle this (optional)
Meta-Section.

***********************
5. Remarks
***********************

We do not give any warranty for the correctness of the data.
There might be mistakes already in the original accounting data. We
might also have made mistakes in the creation of the data.

NOTICE: There is one demandMatrix*.xml for every traffic file in the
        original data.
NOTICE: There is no demand file for time 14:56 on 08.03.2013 in the
        1min demand set.
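Appendix (illustration only): the demand-goal computation of step 3 in
the "Creation" section can be sketched as follows. The function and
variable names are hypothetical and not part of the SNDlib API; this
is a minimal sketch of the gravity-style scaling described above.

```python
def demand_goals(out_traffic, in_traffic):
    """Gravity-style demand goals between regio nodes.

    out_traffic / in_traffic: dicts mapping regio node -> measured
    outgoing / ingoing traffic (bits per second), taken from the
    regio-to-backbone links.
    """
    # Raw (unscaled) goal for each ordered pair of distinct regio nodes:
    #   outgoingTrafficOfRegio1 * ingoingTrafficOfRegio2
    raw = {
        (r1, r2): out_traffic[r1] * in_traffic[r2]
        for r1 in out_traffic
        for r2 in in_traffic
        if r1 != r2
    }
    # alpha is chosen so that the sum of all demand goals equals the
    # sum of traffic on all ingoing links of the regio nodes.
    alpha = sum(in_traffic.values()) / sum(raw.values())
    return {pair: value * alpha for pair, value in raw.items()}
```

Note that the resulting goals are fractional; as described in step 5,
the final demand values are obtained by rounding after the LP step.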
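Appendix (illustration only): since the matrices use the existing
SNDlib formats, the native-format files can also be read without the
SNDlib API. The sketch below assumes a DEMANDS section whose lines
follow the layout `<id> ( <source> <target> ) <routing_unit> <value>
<max_path_length>`; verify this against the SNDlib native format
specification and the actual archive files before relying on it.

```python
import re

# Assumed demand line layout (check against the SNDlib native format):
#   <id> ( <source> <target> ) <routing_unit> <value> [<max_path_length>]
DEMAND_RE = re.compile(
    r"^\s*(\S+)\s+\(\s*(\S+)\s+(\S+)\s*\)\s+(\S+)\s+([\d.eE+-]+)"
)

def parse_demands(text):
    """Return {demand_id: (source, target, value)} from a native-format
    demand matrix given as a string."""
    demands = {}
    in_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("DEMANDS ("):
            in_section = True
            continue
        if in_section:
            if stripped == ")":
                break  # end of the DEMANDS section
            m = DEMAND_RE.match(line)
            if m:
                demand_id, src, tgt, _unit, value = m.groups()
                demands[demand_id] = (src, tgt, float(value))
    return demands
```

The Meta-Section mentioned in the "Format" section is simply ignored
by this sketch; the SNDlib API 1.3 parser handles it properly.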