Garbage Collection in .NET
Chris Maunder, The Code Project (www.codeproject.com)
A quick introduction to garbage collection in .NET. This article and sample
demonstrate the concepts of managed classes, weak references, generations
and finalizers.
Background
The inclusion of garbage collection in the .NET runtime removes the need to track and release memory allocations. When programming in the managed environment you allocate memory on the managed heap using the new operator, but instead of deleting or freeing that memory you simply remove all references to it (e.g. by setting your pointer to that memory to NULL) and let the garbage collector (GC) take care of the rest. Note that we are talking only about memory, not resources. If you create a new object on the managed heap and that object allocates resources such as handles or connections, then you must ensure that the object has released its resources before casting it adrift to the mercy of the GC.
The traditional C runtime heap is essentially a linked list that must be traversed each time a memory allocation is made. Once a suitable block of memory is located, the block is split and the memory location returned. In contrast, the managed heap in .NET keeps a single pointer that always points to the next available spot in memory, and this pointer is simply moved past each object as it is allocated. There is no traversal or splitting necessary, which means allocating memory on the managed heap is almost as fast as allocating memory on the stack.
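The "single pointer" allocation scheme described above can be sketched in standard C++. This is only a minimal model, not the runtime's actual allocator: the class name BumpHeap, its fixed capacity and the use of offsets instead of real addresses are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a bump-pointer heap; not the CLR's real allocator.
class BumpHeap {
public:
    explicit BumpHeap(std::size_t capacity) : storage_(capacity), next_(0) {
        assert(capacity > 0);
    }

    // Returns the offset of the new block, or -1 when the heap is full
    // (the point at which a managed runtime would start a collection).
    long allocate(std::size_t size) {
        if (size > storage_.size() - next_) return -1;
        std::size_t offset = next_;
        next_ += size;  // the only bookkeeping: advance one pointer
        return static_cast<long>(offset);
    }

    // Number of bytes handed out so far.
    std::size_t used() const { return next_; }

private:
    std::vector<unsigned char> storage_;  // backing memory for the model
    std::size_t next_;                    // the "next available spot" pointer
};
```

Note that two successive calls to allocate always return adjacent blocks, and that the model never searches or splits free blocks.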
This memory allocation model works perfectly if one assumes you have infinite memory - which invariably you won't. So what happens when you attempt to allocate a block of memory at the end of the managed heap but you find that you have run out of memory? Garbage collection, of course. In a typical application you would normally be allocating and deallocating memory through the course of your application's logic. In a managed application this deallocation will not happen, but rather the memory that is no longer in use will sit there, unreferenced, until you get to the point where there is no more free memory and the GC is forced to reclaim all this unused memory.
Garbage Collection
The process of garbage collection starts with the GC assuming all memory on the managed heap is rubbish. It then lists all your application's global and static memory pointers, local or parameter variables and CPU registers containing references to objects on the heap, and then uses these objects to build a graph of all objects on the heap that are either directly or indirectly referenced by your application. Any object on the managed heap that can somehow be accessed by your application will be marked. There are optimisations in place to remove the possibility of circular memory references causing infinite loops, and to ensure that chains of references are only processed once.
Once the graph is complete the GC has a complete picture of what is, and isn't, garbage. The GC then compacts the heap by moving non-garbage items together and resets its 'next available memory slot' pointer to the top of this new, compacted heap. In doing so the GC is also responsible for updating the values of all pointers into the heap, so that references to objects that have been moved are still valid.
Large objects (larger than 20,000 bytes in the beta runtime described here; released versions of .NET use a threshold of 85,000 bytes) are treated a little differently from smaller objects. Objects of this size are allocated on a separate large object heap, and when garbage collection occurs these objects are not moved around, since moving blocks of memory this large can really start to slow things down.
If, after garbage collection has occurred, there is still insufficient memory for the memory allocation request, an OutOfMemoryException is thrown.
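The mark and compact steps described above can be sketched as a toy model in standard C++. Everything here is an assumption made for illustration: the Obj structure, the use of vector indices as "pointers", and the function names mark and compact. The real collector works on raw memory and is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object model: each heap slot lists the slots it references.
struct Obj {
    int id;
    std::vector<std::size_t> refs;
};

// Mark: walk the graph from the roots. The 'marked' flags ensure each object
// is processed only once, so circular references cannot loop forever.
std::vector<bool> mark(const std::vector<Obj>& heap,
                       const std::vector<std::size_t>& roots) {
    std::vector<bool> marked(heap.size(), false);
    std::vector<std::size_t> pending(roots);
    while (!pending.empty()) {
        std::size_t i = pending.back();
        pending.pop_back();
        assert(i < heap.size());
        if (marked[i]) continue;
        marked[i] = true;
        for (std::size_t r : heap[i].refs) pending.push_back(r);
    }
    return marked;
}

// Compact: slide the survivors together, then rewrite every reference
// (including the roots) so that it points at the survivor's new slot.
// Returns the new "next available slot".
std::size_t compact(std::vector<Obj>& heap, std::vector<std::size_t>& roots) {
    std::vector<bool> marked = mark(heap, roots);
    std::vector<std::size_t> newIndex(heap.size(), 0);
    std::size_t next = 0;
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (marked[i]) newIndex[i] = next++;

    std::vector<Obj> survivors;
    survivors.reserve(next);
    for (std::size_t i = 0; i < heap.size(); ++i) {
        if (!marked[i]) continue;
        Obj moved = heap[i];
        for (std::size_t& r : moved.refs) r = newIndex[r];
        survivors.push_back(moved);
    }
    for (std::size_t& r : roots) r = newIndex[r];
    heap.swap(survivors);
    return next;
}
```

In this model an unreachable object is dropped even if it still references live objects, and two live objects that reference each other are visited once each.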
Generations
While memory allocation on the managed heap is fast, GC itself may take some time. With this in mind several optimisations have been made to improve performance. The GC supports the concept of generations, based on the assumption that the longer an object has been on the heap, the longer it will probably stay there. When an object is allocated on the heap it belongs in generation 0. Each garbage collection that the object survives increases its generation by 1 (currently the highest supported generation is 2). Obviously it's faster to search through, and garbage collect, a subset of all objects on the heap, so the GC has the option of collecting only generation 0, 1 or 2 objects (or whatever combination it chooses until it has sufficient memory). Even while collecting only younger objects the GC can also determine if old objects have references to new objects, to ensure that it doesn't inadvertently ignore in-use objects.
The System::GC class represents the garbage collector, and it provides a number of static methods you can use to control the process directly.
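The promotion rule described above can be illustrated with a small model in standard C++. The Tracked structure, its reachable flag (a stand-in for the result of the mark phase) and the collect function are hypothetical names invented for this sketch; the runtime's own bookkeeping is not exposed in this form.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

const int kMaxGeneration = 2;  // mirrors the highest generation quoted above

// Hypothetical bookkeeping for one heap object.
struct Tracked {
    int id;
    int generation;  // every object starts in generation 0
    bool reachable;  // stand-in for "marked by the GC"
};

// Collects generations 0..upTo: unreachable objects in those generations are
// reclaimed, and the survivors are promoted by one generation (capped at the
// maximum). Older objects are not examined at all. Returns the number of
// objects reclaimed.
std::size_t collect(std::vector<Tracked>& heap, int upTo) {
    assert(upTo >= 0 && upTo <= kMaxGeneration);
    std::size_t before = heap.size();
    heap.erase(std::remove_if(heap.begin(), heap.end(),
                              [upTo](const Tracked& t) {
                                  return t.generation <= upTo && !t.reachable;
                              }),
               heap.end());
    for (Tracked& t : heap)
        if (t.generation <= upTo && t.generation < kMaxGeneration)
            ++t.generation;
    return before - heap.size();
}
```

The sketch also shows the trade-off behind generations: an unreachable generation 2 object is left untouched by a generation 0 collection and is only reclaimed when generation 2 is collected.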
void GC::Collect() invokes the GC for all generations, while void GC::Collect(int Generation) invokes it only up to and including the generation you specify.
The maximum supported generation can be found by querying GC::MaxGeneration, and if you are curious as to what generation your object currently resides in you can call int GC::GetGeneration(Object* obj).
You would normally leave the GC alone to do its own thing, but if you are aware that your application is about to start needing huge chunks of memory quickly, or you know that you have finished using a ton of memory and you are having a little idle time, then it may be worthwhile to give the GC a hint that its services are needed.
If you are ever curious as to the number of bytes currently thought to be allocated, just call
__int64 GC::GetTotalMemory(bool forceFullCollection);
The parameter forceFullCollection determines whether or not the function should wait for a GC to occur before reporting the amount of memory.
Weak References
You may need to allocate an object in memory, but then only need to reference that object occasionally. It would be nice if you could keep track of that object, but release it to the GC if memory started getting a little tight. This can be achieved using weak references. Essentially you allocate an object, create a weak reference to that object and then remove any direct references you have to that object so the GC can claim it if it needs to.
Weak references are created using the WeakReference class, and then accessed by recreating a strong reference to the object by accessing WeakReference::Target. Once you are done with the object you can then again release it as a weak reference.
// Create your managed object
MyObject* myObj = new MyObject();
// Create a weak reference
WeakReference* weak = new WeakReference(myObj);
// Remove your direct (strong) reference
myObj = 0;
...
// Now do all accessing of your object via the weak reference.
// Target returns an Object*, so cast it back to the original type.
myObj = (MyObject*) weak->Target;
if (myObj)
{
    // The object still exists - We can use it!
    ...
    // Release all strong references again. Use weak->Target to grab it again later.
    myObj = 0;
}
else
{
    // The object is shark bait. Time to create another and use it instead :(
}
The other option in checking if a weak reference is still valid is to query the WeakReference::IsAlive property.
So how do you maintain a reference to an object (even a weak reference) and still allow the GC to collect it if the GC is invoked? When a weak reference is made, an entry is added to a weak reference table which holds the location of the object (this is what your object pointer's value will be). This table is not taken into account when the GC walks the application's objects, but after the GC makes a list of all garbage objects, it searches the weak reference table for all entries that point to garbage. These entries are set to NULL and the objects reclaimed, so the next time you try to access the object via the weak reference you will get back NULL.
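The weak reference table described above can be modelled in a few lines of standard C++. This is a sketch under stated assumptions: the WeakTable class, its method names, and the use of plain integers as object locations (with -1 standing in for NULL) are all invented for illustration and do not correspond to a runtime API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

const long kNull = -1;  // stand-in for a NULL object location

// Hypothetical weak reference table. The collector ignores it while marking,
// then clears every entry that points at an object it found to be garbage.
class WeakTable {
public:
    // Registers a weak reference to the object at 'address'; returns a handle.
    std::size_t add(long address) {
        entries_.push_back(address);
        return entries_.size() - 1;
    }

    // Plays the role of WeakReference::Target: the location, or kNull.
    long target(std::size_t handle) const {
        assert(handle < entries_.size());
        return entries_[handle];
    }

    // Called after the collector has built its list of garbage objects.
    void clearDead(const std::set<long>& garbage) {
        for (long& entry : entries_)
            if (entry != kNull && garbage.count(entry) > 0)
                entry = kNull;
    }

private:
    std::vector<long> entries_;
};
```

Because the table plays no part in marking, holding a handle never keeps an object alive; the handle simply starts returning kNull once the object has been collected.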
Memory Allocation
One important point with memory allocation on the managed heap is that when you allocate successive memory blocks you are assured that you will receive successive memory locations. By having related objects physically close to each other you get the advantage of fewer page faults and a higher likelihood that the objects will be in the cache. This is a nice performance booster.
To actually declare and create an object that is accessible by the garbage collector you need to prefix your class declaration with the __gc keyword. For example:
__gc class CMyGCClass
{
    int n;
};
This will declare a class CMyGCClass and mark it as managed, meaning that its instances must be allocated on the managed heap and will be subject to garbage collection. To create the object you allocate it on the managed heap using the new keyword:
CMyGCClass *gc = new CMyGCClass();
To release the object simply set its reference to NULL:
gc = 0;
Finalizers and Dispose, and resource management
A managed C++ class can implement a destructor, but that destructor isn't called automatically. The analogous method to a destructor in a managed object is the Finalize method.
protected: virtual void Object::Finalize();
If implemented, this method may be called by the garbage collector at some undetermined time. You can't rely on it being called at the time you release your reference to the object. This means that resources your object has allocated (e.g. handles or database connections) may be held until the program ends (unless you call GC::Collect(), which is using a rather large hammer to solve a simple problem). Implementing a finalization method also causes the object to be promoted to a higher generation, meaning that it may stay in memory longer than necessary. Finalization is also not guaranteed to occur for objects that are still alive when the application exits (this allows the application to shut down faster). Resources will be reclaimed, but not gracefully. Also, finalizable objects take longer to allocate.
Instead of implementing a Finalize method you are encouraged to implement a Dispose method (in Beta 2 you should inherit from IDisposable)
public: virtual void Dispose();
This method should take care of all resource deallocation and also call GC::SuppressFinalize(this), which instructs the runtime not to call Finalize on your object since you have already done all cleanup. There is a cost associated with having Finalize called, so if you can avoid it, do so.
The pattern recommended is to declare a Dispose method (deriving from IDisposable in Beta 2) and call this when you want to free your object's resources.
CMyClass *mc = new CMyClass();
mc->DoSomething();
mc->Dispose();
mc = 0;
The above code snippet allocates an object, uses the object (possibly allocating resources along the way), then forces the object to clean up in a deterministic way before being released to the whims of the GC. You have complete control over all resource management, and leave the managed heap management to the GC.
Implementing such a class is done as follows:
__gc class CMyClass: public Object, public IDisposable
{
public:
    CMyClass()
    {
        m_bDisposed = false;
    }
    virtual ~CMyClass()
    {
        // This will never be called
    }

public:
    // You are encouraged to not allow a disposed object to be reused
    void MyMethod()
    {
        if (!m_bDisposed)
        {
            // do something
        }
        else
        {
            // throw an exception
        }
    }

public:
    // In case you would like to call 'Close' instead of 'Dispose'
    void Close()
    {
        Dispose();
    }
    // defined in IDisposable in Beta2 - or just declare it yourself
    void Dispose()
    {
        if (!m_bDisposed)
        {
            m_bDisposed = true;
            // Free all resources
            GC::SuppressFinalize(this);
        }
    }

protected:
    virtual void Finalize()
    {
        Dispose();
        Object::Finalize();
    }
    bool m_bDisposed;
};
You are strongly encouraged to disallow the use of an object once it has been disposed. Note in the above code that we have provided both a Finalize method, which will be called automatically, and a Dispose method that must either be called by the user or called via a containing object. In the Dispose method there is the line
GC::SuppressFinalize(this);
This instructs the GC to not call the Finalize method. We have already cleaned up and so don't want to incur the overhead of a call to Finalize.
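The two habits recommended above, making Dispose safe to call more than once and refusing to operate on a disposed object, can be shown in isolation with a small standard C++ sketch. The Connection class and its members are hypothetical, and the counter merely stands in for releasing a real handle; the model deliberately leaves out Finalize and GC::SuppressFinalize, which have no standard C++ equivalent.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource holder, written in standard C++ rather than
// Managed C++, to show the disposed-flag logic on its own.
class Connection {
public:
    Connection() : disposed_(false), releaseCount_(0) {}

    // Refuses to run once the object has been disposed.
    void send() {
        if (disposed_)
            throw std::logic_error("Connection has been disposed");
        // ... use the resource ...
    }

    // Safe to call repeatedly; the resource is released only the first time.
    void dispose() {
        if (disposed_) return;
        disposed_ = true;
        ++releaseCount_;  // stand-in for closing a handle or connection
        assert(releaseCount_ == 1);
    }

    // How many times the underlying resource was actually released.
    int releaseCount() const { return releaseCount_; }

private:
    bool disposed_;
    int releaseCount_;
};
```

Calling dispose twice releases the resource once, and calling send afterwards throws rather than silently touching a released resource.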
Example Code
Below is a simple code snippet that demonstrates some of the main points of garbage collection:
Console::WriteLine("Garbage Collection demonstration\n");
__int64 TotalMem = GC::GetTotalMemory(true);
Console::WriteLine("Currently used {0} bytes of memory", TotalMem.ToString());
// Wanton disregard for all the things we've been taught
Console::WriteLine("Creating a ton of junk");
for (int i = 0; i < 10000; i++)
    new String("Woohooo!");
TotalMem = GC::GetTotalMemory(false);
Console::WriteLine("Have now used {0} bytes of memory", TotalMem.ToString());
GC::Collect();
TotalMem = GC::GetTotalMemory(false);
Console::WriteLine("After GC, used {0} bytes of memory", TotalMem.ToString());
// -------------------------------------------------------------
// Demonstration of Generations...
Console::WriteLine("\nDemonstration of Generations\n");
String* str = new String("This is a string");
Console::WriteLine("We have created the string '{0}'", str);
// How old is it?
int nMaxGen = GC::MaxGeneration;
int nGen = GC::GetGeneration(str);
Console::WriteLine("The object's generation is {0} (max {1})",
                   nGen.ToString(), nMaxGen.ToString());
// Let's make it older and wiser
Console::WriteLine("Garbage Collecting...");
GC::Collect();
nGen = GC::GetGeneration(str);
Console::WriteLine("The object's generation is {0} (max {1})",
                   nGen.ToString(), nMaxGen.ToString());
Console::WriteLine("Garbage Collecting...");
GC::Collect();
nGen = GC::GetGeneration(str);
Console::WriteLine("The object's generation is {0} (max {1})",
                   nGen.ToString(), nMaxGen.ToString());
Console::WriteLine("Garbage Collecting...");
GC::Collect();
nGen = GC::GetGeneration(str);
Console::WriteLine("The object's generation is {0} (max {1})",
                   nGen.ToString(), nMaxGen.ToString());
// -------------------------------------------------------------
// Demonstration of Weak References...
Console::WriteLine("\nDemonstration of Weak References\n");
Console::WriteLine("Creating a weak reference.");
WeakReference* weak = new WeakReference(str);
str = 0;
if (weak->IsAlive)
{
    str = (String*) weak->Target;
    Console::WriteLine("Object is alive. [{0}]", str);
    str = 0;
}
else
    Console::WriteLine("Object is gone");
GC::Collect(0);
if (weak->IsAlive)
{
    str = (String*) weak->Target;
    Console::WriteLine("Object survived GC of generation 0. [{0}]", str);
    str = 0;
}
else
    Console::WriteLine("Object is gone (collected with Gen 0)");
GC::Collect(1);
if (weak->IsAlive)
{
    str = (String*) weak->Target;
    Console::WriteLine("Object survived GC of generation 1. [{0}]", str);
    str = 0;
}
else
    Console::WriteLine("Object is gone (collected with Gen 1)");
GC::Collect(2);
if (weak->IsAlive)
{
    str = (String*) weak->Target;
    Console::WriteLine("Object survived GC of generation 2. [{0}]", str);
    str = 0;
}
else
    Console::WriteLine("Object is gone (collected with Gen 2)");