August 28 August 28 August 28 August 28 August 28
The scipy.signal.resample function works fantastically for downsampling a 480-sample weekdays signal to 120 samples for clustering.
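A minimal sketch of that resampling step, assuming one weekdays signal as a 1-D array (the data here is a random placeholder):
import numpy as np
from scipy.signal import resample
weekday_signal = np.random.rand(480)         # placeholder for one 480-sample weekdays signal
downsampled = resample(weekday_signal, 120)  # FFT-based 4x downsample for clustering
print(downsampled.shape)                     # (120,)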
Clustering centers have been found; need to optimize this clustering and design the metric for evaluating the actual generation of signals. Want to email Chad, as Tao Sun was not helpful; that paper isn't written clearly enough to copy their methods, and the code is also in R.
August 24 August 24 August 24 August 24 August 24
The k-means algorithm has some randomness to it, so I'll choose the best of 100 runs for every k from 4 to 200.
Again, DBSCAN underperforms; will check the variance in a single k's DBI before deciding if DBSCAN is entirely out of the question.
Tested the DBI 50 times on the same clusters; there is NO randomness here. Set k-means to run 80 times for each k from 5 to 300; each run took 5 minutes.
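A hedged sketch of that best-of-N selection, assuming scikit-learn and a placeholder feature matrix (the actual data and feature choice aren't shown here); lower Davies-Bouldin index is better:
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score
X = np.random.rand(500, 120)  # placeholder: downsampled weekday signals
def best_kmeans(X, k, n_runs=100):
    # keep the lowest-DBI run out of n_runs random initializations
    best_labels, best_dbi = None, np.inf
    for seed in range(n_runs):
        labels = KMeans(n_clusters=k, n_init=1, random_state=seed).fit_predict(X)
        dbi = davies_bouldin_score(X, labels)
        if dbi < best_dbi:
            best_labels, best_dbi = labels, dbi
    return best_labels, best_dbi
scores = {k: best_kmeans(X, k)[1] for k in range(4, 201)}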
August 23 August 23 August 23 August 23 August 23
Now to look at the clustering; again we use the paper's clustering features [see ClusterSuccess] (a sketch of computing these follows the list):
[there may be issues in the sleep load duration, as it used the two-day arrays]
- Near-peak load: the 97.5th (no, 95th) percentile of daily load
- Near-base load: the 2.5th (no, 10th) percentile of daily load
- High-load duration: duration for which load is closer to the near-peak load than to the near-base load
- Rise time: duration for load to go from base load to the start of the high-load period
- Fall time: duration for load to go from the end of the high-load period to the base load
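A sketch of computing these five features for one day of load, assuming a 1-D numpy array; the sampling interval dt_hours is an assumption, the percentiles follow the corrected numbers above (95 / 10), and "closer to near-peak" is read literally:
import numpy as np
def load_shape_features(day, dt_hours=1.0):
    # near-peak / near-base: high and low percentiles of the daily load
    near_peak = np.percentile(day, 95)
    near_base = np.percentile(day, 10)
    # high-load period: timesteps closer to near-peak than to near-base
    high = np.abs(day - near_peak) < np.abs(day - near_base)
    idx = np.where(high)[0]
    high_duration = high.sum() * dt_hours
    start, end = (idx[0], idx[-1]) if idx.size else (0, 0)
    low = np.where(~high)[0]
    # rise/fall: gap between the high-load period and the nearest base-side point
    rise = (start - low[low < start].max()) * dt_hours if (low < start).any() else 0.0
    fall = (low[low > end].min() - end) * dt_hours if (low > end).any() else 0.0
    return near_peak, near_base, high_duration, rise, fall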
August 5 August 5 August 5 August 5 August 5 August 5
Doubled the training set for the discriminator as well as the epochs; performance is not better on the original testing. Now setting off a large architecture search (6 gen, 8 disc) with the double-discriminator (DD) GAN training scheme. Best architectures:
['DDConv12 16 24 32_c4 c4 c4 f d48 d18 d9.png',
'DDConv12 16 24 32_c4 c4 f d48 d18 d9.png',
'DDConv16 16 16_c4 c4 f d24 d24 d24 d12.png']
August 3 August 3 August 3 August 3 August 3
Convolutional model resolved; much smoother outputs.
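The exact conv architecture isn't recorded in this entry, so the following is only an illustrative Keras sketch of a Conv1D-style generator of the kind discussed in these notes; all layer sizes are assumptions:
from keras.models import Sequential
from keras.layers import Dense, Reshape, UpSampling1D, Conv1D, Flatten
latent_dim = 16
gen = Sequential()
gen.add(Dense(6 * 12, activation='relu', input_dim=latent_dim))
gen.add(Reshape((6, 12)))
gen.add(UpSampling1D(size=2))  # 6 -> 12 timesteps
gen.add(Conv1D(16, kernel_size=4, padding='same', activation='relu'))
gen.add(UpSampling1D(size=2))  # 12 -> 24 timesteps
gen.add(Conv1D(1, kernel_size=4, padding='same', activation='linear'))
gen.add(Flatten())             # one 24-sample daily signal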
July 15 July 15 July 15 July 15 July 15 July 15 July 15
Lost all the July 1st work. Redoing the successful GANs:
- remove weekends, normalize
Alright, the sample GAN has been edited; now some learning is happening.
# Keras imports shared by the DISC and GEN fragments in this entry
from keras.models import Sequential
from keras.layers import Dense
DISC =
model = Sequential()
model.add(Dense(24*2, activation='relu', kernel_initializer='he_uniform', input_dim=n_inputs))
model.add(Dense(18, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(18, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(9, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))  # real/fake probability
GEN =
model = Sequential()
model.add(Dense(16, activation='tanh', kernel_initializer='he_uniform', input_dim=latent_dim))
model.add(Dense(16, activation='tanh', kernel_initializer='he_uniform'))
model.add(Dense(24, activation='tanh', kernel_initializer='he_uniform'))
model.add(Dense(24, activation='tanh', kernel_initializer='he_uniform'))
model.add(Dense(n_outputs, activation='linear'))  # generated signal out
Very noisy signals are generated, but a very rough day-and-night pattern can be discerned.
Now fixed the plots; trying:
DISC=
model = Sequential()
model.add(Dense(24*2, activation='relu', kernel_initializer='he_uniform', input_dim=n_inputs))
model.add(Dense(18, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(9, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
GEN=
model = Sequential()
model.add(Dense(16, activation='tanh', kernel_initializer='he_uniform', input_dim=latent_dim))
model.add(Dense(16, activation='tanh', kernel_initializer='he_uniform'))
model.add(Dense(24, activation='tanh', kernel_initializer='he_uniform'))
model.add(Dense(n_outputs, activation='linear'))
and
DISC=
model = Sequential()
model.add(Dense(24*2, activation='relu', kernel_initializer='he_uniform', input_dim=n_inputs))
model.add(Dense(18, activation='relu', kernel_initializer='he_uniform'))
model.add(Dense(1, activation='sigmoid'))
GEN=
model = Sequential()
model.add(Dense(16, activation='tanh', kernel_initializer='he_uniform', input_dim=latent_dim))
model.add(Dense(24, activation='tanh', kernel_initializer='he_uniform'))
model.add(Dense(n_outputs, activation='linear'))
In conclusion, the deeper and shallower models weren't great; the medium model returned some of the best results so far.
LAST GOOD MODEL SAVED UNDER 'testSaveAll'
July 2 July 2 July 2 July 2 July 2 July 2 July 2 July 2
Testing with double the training data and twice the epochs for the discriminator in each evaluation as compared to the generator; results may still be noisy.
Next steps MUST include:
- mechanism for storing loss functions
- mechanism for storing models
- something nice to show Professor El Gamal
Additionally, potential paths forward MAY include:
- lowpass filter of the noisy generated data
- conv layers in the gen and disc architectures
- additional input to the discriminator with the Fourier transform of the signal and/or its 1st and 2nd differences
- predicting multiple days
- predicting from a whole cluster OR multiple buildings
- using true random data in the latent space; right now we use normalized sequences of data, where we only produce one, which means the model might not be optimal (see the sketch below)
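For that last bullet, a minimal sketch of the standard alternative: i.i.d. Gaussian latent points instead of normalized data sequences (the latent_dim value and generator variable are assumptions here):
import numpy as np
def generate_latent_points(latent_dim, n_samples):
    # fresh random noise for every generator call
    return np.random.randn(n_samples, latent_dim)
z = generate_latent_points(16, 64)
# fake_days = generator.predict(z)  # generator as defined in the July 15 entry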
July 1 July 1 July 1 July 1 July 1 July 1
Can't use a regular latent space and feed random noise; must give it data and ask it to predict the day that follows.
Recall we have 2018 data, particularly the June/July/August data. June 1st 2018 was a Friday, so we'll remove Sat/Sun for a weekday data set; btw, Jan 1 2018 was a Monday.
Will attempt to predict one day from three days (windowing sketch below).
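A sketch of that pairing, assuming hourly weekday data (24 samples/day is an assumption): slide over the series and emit (3-day history, next-day target) pairs:
import numpy as np
def make_pairs(series, samples_per_day=24, history_days=3):
    h = history_days * samples_per_day
    X, y = [], []
    for start in range(0, len(series) - h - samples_per_day + 1, samples_per_day):
        X.append(series[start:start + h])                         # three-day input window
        y.append(series[start + h:start + h + samples_per_day])   # the day to predict
    return np.array(X), np.array(y)
series = np.random.rand(60 * 24)  # placeholder: 60 weekdays of hourly load
X, y = make_pairs(series)         # X: (57, 72), y: (57, 24)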
THERE IS LEARNING, 1:04 AM July 2nd!!!!
Now to tune the batch size:
nbatch = 512, 31s for 500 epochs
nbatch = 256, 31s for 500 epochs
nbatch = 256, 29s for 500 epochs
nbatch = 64, 27s for 500 epochs
^^NEEDS REVIEW^^
SUCCESSFUL learning on the basic architecture, though at 20k epochs the results are noisy and discriminator accuracy is 64% and 86% on the real and fake data respectively.
For now the disc trains as much as the generator; tomorrow, the disc will train on twice the data and get two looks at it while the generator does the standard batch (sketch below).
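A hedged sketch of tomorrow's scheme, not the original code: per generator update, the discriminator sees a double-size batch twice, while the generator trains through the frozen discriminator on the standard batch (sample_real is an assumed helper; generate_latent_points is as sketched earlier):
import numpy as np
def train_step(generator, discriminator, gan, latent_dim, n_batch=256):
    for _ in range(2):  # two looks at twice the data for the discriminator
        X_real = sample_real(n_batch)                    # assumed data-sampling helper
        z = generate_latent_points(latent_dim, n_batch)
        X_fake = generator.predict(z)
        discriminator.train_on_batch(X_real, np.ones((n_batch, 1)))
        discriminator.train_on_batch(X_fake, np.zeros((n_batch, 1)))
    # standard batch for the generator, labels flipped to "real"
    z = generate_latent_points(latent_dim, n_batch // 2)
    gan.train_on_batch(z, np.ones((n_batch // 2, 1)))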
June 15 June 15 June 15 June 15 June 15 June 15
Normalized k-means on (tscore, tmax, max), as DBSCAN took an L and the non-normalized version had sub-10 clusters.
With 4 clusters:
0 has 96 elements
1 has 119 elements
2 has 38 elements
3 has 15 elements
Same for (tmax, max); tons of samples get the noise label -1.
With 3 clusters:
0 has 22 elements
1 has 202 elements
2 has 44 elements
https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-a-1-dimensional-function-from-scratch-in-keras/
is the GAN tutorial I first looked at
Feb 28 Feb 28 Feb 28 Feb 28 Feb 28 Feb 28 Feb 28 Feb 28
Made a method to reliably download, store, and retrieve regions from the OEDI.
Made a method for creating a top-month set of the data: the 30 first days of the month with top energy demand.
Initiated DBSCAN clustering on the Tracy residential data.
Made a method for plotting in 3 dimensions, allowing visualization of the generated clusters (sketch below).
Clustering in the (median, stdev, max) space has some success; it likely separates out the noisy samples from the mainstream.
Clustering on the raw data doesn't work yet, even with eps = .005.
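A sketch of the two clustering attempts plus the 3-D view, with placeholder data standing in for the OEDI profiles; only eps = .005 on the raw data is from these notes, the feature-space eps is a guess:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
raw = np.random.rand(300, 96)  # placeholder: daily load profiles
features = np.column_stack(
    [np.median(raw, axis=1), raw.std(axis=1), raw.max(axis=1)])
labels_raw = DBSCAN(eps=0.005).fit_predict(raw)       # raw space: mostly -1 (noise)
labels_feat = DBSCAN(eps=0.05).fit_predict(features)  # eps here is a guess
# 3-D view of the (median, stdev, max) clusters
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(features[:, 0], features[:, 1], features[:, 2], c=labels_feat)
ax.set_xlabel('median'); ax.set_ylabel('stdev'); ax.set_zlabel('max')
plt.show()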