Import of germany.osm (~11G) from http://download.geofabrik.de/osm/.
As the log below shows, the following steps are the relevant, most
time-consuming ones:
* Calculation of the objects-in-area index. The "object in areas" calculation
must be sped up by improving the "point in area" algorithm (see the first
sketch after this list).
* Generation of the ways.dat file. This is time-consuming because the nodes
of each way have to be resolved. Since the data is too big to keep everything
in memory, an on-disk algorithm must be used that repeatedly scans the data
files (see the second sketch after this list). Instead of this the node index
file could be used, but we need some performance tests first.
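For illustration, a minimal sketch of the even-odd ray-casting test that such
"point in area" checks typically reduce to (GeoCoord and PointInArea are
illustrative names, not the library's actual API). Speeding it up usually
means a bounding-box pre-check or a spatial index, so the exact test only
runs on far fewer candidates:

#include <cstddef>
#include <iostream>
#include <vector>

struct GeoCoord {
  double lon;
  double lat;
};

// Returns true if the point lies inside the polygon (even-odd rule).
// Points exactly on an edge may be classified either way.
bool PointInArea(const GeoCoord& p, const std::vector<GeoCoord>& polygon)
{
  bool        inside = false;
  std::size_t n      = polygon.size();

  for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
    const GeoCoord& a = polygon[i];
    const GeoCoord& b = polygon[j];

    // Count the edges crossed by a horizontal ray running right from p.
    if ((a.lat > p.lat) != (b.lat > p.lat)) {
      double xCross = (b.lon - a.lon) * (p.lat - a.lat) / (b.lat - a.lat) + a.lon;
      if (p.lon < xCross) {
        inside = !inside;
      }
    }
  }

  return inside;
}

int main()
{
  std::vector<GeoCoord> square = {{0, 0}, {10, 0}, {10, 10}, {0, 10}};

  std::cout << PointInArea({5, 5}, square) << '\n';   // 1: inside
  std::cout << PointInArea({15, 5}, square) << '\n';  // 0: outside
}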
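And a simplified model of the repeated-scan strategy from the second bullet
(all names are illustrative, and plain vectors stand in for sequential scans
of rawnodes.dat and rawways.dat). Memory stays bounded by the batch size, at
the cost of one full scan of the node data per batch; a node index would
replace those scans with random lookups:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <unordered_map>
#include <vector>

struct RawNode { std::uint64_t id; double lon, lat; };
struct RawWay  { std::uint64_t id; std::vector<std::uint64_t> nodeIds; };

int main()
{
  // In the real importer these would be sequential scans of the raw data
  // files; vectors keep the sketch self-contained.
  std::vector<RawNode> nodeFile = {{1, 7.1, 50.7}, {2, 7.2, 50.8}, {3, 7.3, 50.9}};
  std::vector<RawWay>  wayFile  = {{10, {1, 2}}, {11, {2, 3}}, {12, {3, 1}}};

  const std::size_t batchSize = 2;  // bounds the id->node map held in memory

  for (std::size_t begin = 0; begin < wayFile.size(); begin += batchSize) {
    std::size_t end = std::min(begin + batchSize, wayFile.size());

    // Collect the node ids needed by the current batch of ways.
    std::unordered_map<std::uint64_t, RawNode> needed;
    for (std::size_t w = begin; w < end; ++w) {
      for (std::uint64_t id : wayFile[w].nodeIds) {
        needed.emplace(id, RawNode{});
      }
    }

    // One sequential scan of the node data fills in the coordinates.
    for (const RawNode& n : nodeFile) {
      auto it = needed.find(n.id);
      if (it != needed.end()) {
        it->second = n;
      }
    }

    // Emit the resolved ways (a real importer would write ways.dat here).
    for (std::size_t w = begin; w < end; ++w) {
      std::cout << "way " << wayFile[w].id << ":";
      for (std::uint64_t id : wayFile[w].nodeIds) {
        std::cout << " (" << needed[id].lon << ',' << needed[id].lat << ')';
      }
      std::cout << '\n';
    }
  }
}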
egrep -h "(^\+ Step|^ =>)" <logfile>
+ Step #1 - Preprocess...
=> 128.146 second(s)
+ Step #2 - Generating 'rawnode.idx'...
=> 32.002 second(s)
+ Step #3 - Generating 'rawway.idx'...
=> 8.849 second(s)
+ Step #4 - Generate 'relations.dat'...
=> 67.382 second(s)
+ Step #5 - Generating 'relation.idx'...
=> 1.530 second(s)
+ Step #6 - Generate 'nodes.dat'...
=> 25.975 second(s)
+ Step #7 - Generating 'node.idx'...
=> 1.052 second(s)
+ Step #8 - Generate 'ways.dat'...
=> 752.376 second(s)
+ Step #9 - Generating 'way.idx'...
=> 17.741 second(s)
+ Step #10 - Generate 'area.idx'...
=> 252.107 second(s)
+ Step #11 - Generate 'areanode.idx'...
=> 5.262 second(s)
+ Step #12 - Generate 'region.dat' and 'nameregion.idx'...
=> 774.346 second(s)
+ Step #13 - Generate 'nodeuse.idx'...
=> 229.044 second(s)
+ Step #14 - Generate 'water.idx'...
=> 29.784 second(s)
=> 2325.608 second(s)
The following files have been generated (du --total -h *.idx *.dat):
67M area.idx
8,0M areanode.idx
1,5M nameregion.idx
3,2M node.idx
88M nodeuse.idx
98M rawnode.idx
15M rawway.idx
204K relation.idx
20K water.idx
15M way.idx
4,0K bounding.dat
31M nodes.dat
730M rawnodes.dat
22M rawrels.dat
331M rawways.dat
21M region.dat
110M relations.dat
1,2M wayblack.dat
635M ways.dat
2,2G total
Of these, the following are necessary at application runtime:
67M area.idx
8,0M areanode.idx
1,5M nameregion.idx
3,2M node.idx
88M nodeuse.idx
204K relation.idx
20K water.idx
15M way.idx
4,0K bounding.dat
31M nodes.dat
21M region.dat
110M relations.dat
635M ways.dat
The nodeuse.idx is currently only of interest for the internal routing
solution; however, some code changes are required so that the database class
does not require it (a possible approach is sketched below). The water.idx is
not finished and will likely grow in the future (though not in a way that
changes the overall calculation drastically).
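One hypothetical shape for that code change: make loading of the routing
index an open-time option, so that the database class only touches
nodeuse.idx when routing is actually requested. Database, OpenOptions and
NodeUseIndex are illustrative names, not the actual API:

#include <iostream>
#include <memory>
#include <string>

struct NodeUseIndex {
  bool Open(const std::string& path)
  {
    std::cout << "opening " << path << "/nodeuse.idx\n";
    return true;  // real code would map and parse the file here
  }
};

struct OpenOptions {
  bool withRouting = false;  // default: nodeuse.idx is not required
};

class Database {
public:
  bool Open(const std::string& path, const OpenOptions& options)
  {
    // ...open the always-required indexes and data files here...

    if (options.withRouting) {
      nodeUseIndex = std::make_unique<NodeUseIndex>();
      if (!nodeUseIndex->Open(path)) {
        return false;
      }
    }

    return true;
  }

  bool HasRouting() const { return nodeUseIndex != nullptr; }

private:
  std::unique_ptr<NodeUseIndex> nodeUseIndex;  // absent unless requested
};

int main()
{
  OpenOptions viewer;         // a map viewer: no routing index needed
  OpenOptions router;
  router.withRouting = true;  // a router: loads nodeuse.idx

  Database db;
  db.Open("germany", viewer);
  std::cout << db.HasRouting() << '\n';  // 0

  db.Open("germany", router);
  std::cout << db.HasRouting() << '\n';  // 1
}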
So the overall size required on disk for a Germany map without routing
would be:
du --total -h area.idx areanode.idx bounding.dat nameregion.idx node.idx \
nodes.dat region.dat relation.idx relations.dat water.idx way.idx ways.dat
68M area.idx
8,5M areanode.idx
4,0K bounding.dat
1,5M nameregion.idx
3,2M node.idx
31M nodes.dat
21M region.dat
216K relation.idx
114M relations.dat
20K water.idx
16M way.idx
646M ways.dat
907M total
Memory usage:
Open the Germany map, search for Bonn, search for Dortmund
(top -b | grep <processname>):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11244 tim 20 0 1031m 81m 34m S 0 2.7 0:05.38 lt-TravelJinni
11415 tim 20 0 975m 79m 36m S 0 2.6 0:09.19 lt-OSMScout
Most of the memory usage, however, comes from opening the data files as
memory-mapped files (a sketch of why mapped files inflate VIRT but hardly
RES follows below).
The same run, if no memory-mapped files are used:
9276 tim 20 0 219m 127m 13m S 0 4.2 0:02.97 lt-TravelJinni
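A short POSIX sketch of the effect: mmap only reserves address space, so
VIRT (and SHR) grow by the full file size while RES grows page by page as
data is touched, and those pages stay evictable; read() would instead copy
the data into private heap memory that counts fully against RES. The file
name is only an example:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main()
{
  int fd = open("ways.dat", O_RDONLY);
  if (fd < 0) { perror("open"); return 1; }

  struct stat st;
  if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

  // Map the whole file: VIRT grows by st.st_size immediately, RES does not.
  void* data = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
  if (data == MAP_FAILED) { perror("mmap"); return 1; }

  // Touch one byte: exactly one page becomes resident (and shareable).
  volatile char first = static_cast<const char*>(data)[0];
  (void)first;

  munmap(data, st.st_size);
  close(fd);
  return 0;
}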
This is the result of the internal statistics dump. Memory usage is rather
high (but was much higher in the past). There are no real measurements yet
regarding optimal/minimal cache sizes, so some memory could possibly be
saved simply by reducing the cache sizes.
The problem is not the in-memory caches for nodes, ways/areas and relations
but the various indexes. The currently biggest index is the area node index,
which should be the next target for optimization. After that, most memory
savings can be gained by improving the NumericIndex that is used for
accessing all data files. Caching only the index pages that are actually in
use might be possible here, too (see the sketch after the dump below).
nodes.dat entries: 330, memory 28000
Index node.idx: 113 entries, memory 4469
ways.dat entries: 1000, memory 121600
Index way.idx: 192 entries, memory 1800
relations.dat entries: 884, memory 79392
Index relation.idx: 268 entries, memory 1868
area.idx entries: 1000, memory 316888
areanode.idx entries: 385, memory 27560
AdminRegion size 77942, locations size 0, memory 2182376
WaterIndex size 18795, memory 18795
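A hypothetical sketch of such page caching for the NumericIndex: a small LRU
cache keeps only recently used index pages resident and re-reads evicted
pages from disk on demand. PageId, Page and LoadPage are illustrative
stand-ins, not the actual implementation:

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <list>
#include <unordered_map>
#include <utility>
#include <vector>

using PageId = std::uint32_t;
using Page   = std::vector<std::uint8_t>;

class PageCache {
public:
  explicit PageCache(std::size_t maxPages) : maxPages(maxPages) {}

  // Returns the page, loading it (and evicting the least recently used
  // page) as needed.
  const Page& Get(PageId id)
  {
    auto it = index.find(id);
    if (it != index.end()) {
      // Hit: move the page to the front of the LRU list.
      lru.splice(lru.begin(), lru, it->second);
      return it->second->second;
    }

    if (lru.size() == maxPages) {
      // Evict the least recently used page; it can be re-read later.
      index.erase(lru.back().first);
      lru.pop_back();
    }

    lru.emplace_front(id, LoadPage(id));
    index[id] = lru.begin();
    return lru.front().second;
  }

private:
  // Stand-in for reading one index page from disk.
  static Page LoadPage(PageId id)
  {
    std::cout << "load page " << id << " from disk\n";
    return Page(4096, 0);
  }

  std::size_t maxPages;
  std::list<std::pair<PageId, Page>> lru;  // most recently used first
  std::unordered_map<PageId, std::list<std::pair<PageId, Page>>::iterator> index;
};

int main()
{
  PageCache cache(2);

  cache.Get(1);  // load
  cache.Get(2);  // load
  cache.Get(1);  // hit
  cache.Get(3);  // load, evicts page 2
  cache.Get(2);  // load again
}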