issues.txt
1. HMM Implementation Issues
0. add in weights later
1. add in batch_size later
2. add checks on load later
3. add five-fold cross-validation later.
4. there seems to be a difference between fitting with from_samples and fitting an existing model generically (see the sketch after item 8).
5. Also, json.load/json.dump (which take file objects) are different from json.loads/json.dumps (which take strings).
1. More importantly, the key ordering of the json file changes between dumps, but the results of the model do not (see the sketch below).
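A minimal sketch of the load/dump distinction and of why key ordering is harmless; the params dict is a made-up stand-in for the serialized model:

    import json

    params = {"states": ["H", "L"], "transitions": [[0.9, 0.1], [0.2, 0.8]]}

    # load/dump take file objects; loads/dumps take strings.
    with open("model.json", "w") as f:
        json.dump(params, f)
    with open("model.json") as f:
        restored = json.load(f)

    text = json.dumps(params, sort_keys=True)  # sort_keys pins the key order
    # dict equality ignores key order, which is why the model is unaffected.
    assert json.loads(text) == restored == params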
6. testing is not complete, but good enough for now. maybe use fit instead of from_samples just in case.
7. the log probabilities become -inf (probability 0) for symbols the model hasn't seen before; should we add a pseudocount so no count is 0? (sketch below)
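A minimal sketch of the pseudocount idea, independent of any HMM library:

    import numpy as np

    counts = np.array([12.0, 0.0, 3.0, 0.0])     # raw emission counts; two symbols unseen
    with np.errstate(divide="ignore"):
        print(np.log(counts / counts.sum()))      # unseen symbols -> log(0) = -inf

    smoothed = (counts + 1) / (counts + 1).sum()  # add-one (Laplace) pseudocount
    print(np.log(smoothed))                       # every symbol now has a finite log prob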
8. try from_samples if fit is wrong (both paths shown in the sketch below).
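A sketch of the two training paths, assuming the pomegranate 0.x HMM API (the n_jobs flag in the trials below suggests that library); treat the exact signatures as assumptions and check them against the installed version:

    from pomegranate import DiscreteDistribution, HiddenMarkovModel

    sequences = [list("ACDEF"), list("ACDFE"), list("CADEF")]  # toy data

    # Path 1: learn structure and parameters directly from the samples.
    model_a = HiddenMarkovModel.from_samples(
        DiscreteDistribution, n_components=3, X=sequences, n_jobs=1)

    # Path 2: initialize first, then refine with fit (Baum-Welch by default).
    model_b = HiddenMarkovModel.from_samples(
        DiscreteDistribution, n_components=3, X=sequences, max_iterations=1)
    model_b.fit(sequences, n_jobs=1)

    # Comparing log probabilities is one way to see the difference.
    print(model_a.log_probability(list("ACDEF")))
    print(model_b.log_probability(list("ACDEF")))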
9. add the one-hot encoding of amino acid sequences later (sketch below).
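A minimal one-hot sketch for the standard 20-letter amino acid alphabet (the helper name is hypothetical):

    import numpy as np

    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids
    INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

    def one_hot(seq):
        """Encode a sequence as a (len(seq), 20) one-hot matrix."""
        out = np.zeros((len(seq), len(AMINO_ACIDS)))
        out[np.arange(len(seq)), [INDEX[aa] for aa in seq]] = 1.0
        return out

    print(one_hot("ACDW").shape)  # (4, 20)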
10. add a superclass shared by all 3 models (HMM, RNN, VAE) later (one possible shape sketched below).
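One possible shape for that superclass; all names here are hypothetical:

    from abc import ABC, abstractmethod

    class SequenceModel(ABC):
        """Shared interface for the HMM, RNN, and VAE wrappers."""

        @abstractmethod
        def fit(self, sequences, **kwargs): ...

        @abstractmethod
        def sample(self, n, **kwargs): ...

        @abstractmethod
        def log_probability(self, sequence): ...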
11. try gradient descent on the HMM parameters.
12. HMM trials:
hmm_sample_1: -n 1, 5 seconds, 0.65 mismatches; seemed to show little improvement. clearly -n 1 means 1 job.
hmm_sample_2: -n 24 does not mean run 24 times; same results, seemed to fit really well..., 8 seconds.
hmm_sample_3: -c 24 did not run.
hmm_sample_4: --n_jobs 24, 95.33 seconds, 0.08, so it took longer per run... training 0.93 mismatches... 222.02 mismatches.
2. RNN Implementation Issues
1. we can add batch-size sampling and evaluation later.
2. add more logging plus a step-size/learning-rate scheduler (sketch below).
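A minimal sketch of a learning-rate scheduler in PyTorch (the model is a placeholder):

    import torch

    model = torch.nn.LSTM(input_size=20, hidden_size=100, num_layers=1)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Halve the learning rate every 10 epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

    for epoch in range(30):
        # ... forward pass, loss.backward(), optimizer.step() would go here ...
        scheduler.step()
        print(epoch, scheduler.get_last_lr())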
3. a little concerning for bg samples, where the log probability doesn't match.
4. RNN trials:
rnn_test_sample_1: generic python rnn_experiments runs in 70 seconds on edison and generates samples correctly.
rnn_test_sample_2: 8 LSTM layers; takes 254 seconds, or 4x as long, and does not converge to the correct distribution.
rnn_test_sample_4: batch size 40; runs in 35 seconds with 224 mismatches, and the loss is a little higher, but it runs twice as fast.
rnn_test_sample_5: hidden size 800 (4x); runs in 1066 seconds, or 15x slower, so roughly an n^2 slowdown. surprisingly/unsurprisingly, increasing the hidden size gives the best generated sequences.
rnn_test_sample_6: small model, hidden size 100, 1 layer; loss a little higher, ran in 23 seconds.
5. need to check that evaluation runs under torch.no_grad() (sketch below).
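A minimal sketch of the no_grad pattern for evaluation; the same fix applies to the VAE note in section 3:

    import torch

    model = torch.nn.Linear(20, 20)  # placeholder for the real RNN
    data = torch.randn(8, 20)

    model.eval()
    with torch.no_grad():  # no graph is built, so evaluation is faster and lighter
        out = model(data)
    assert not out.requires_grad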
3. VAE Implementation Issues
1. https://colab.research.google.com/drive/1Wl78KHPzQ2Q253Rob5W1o0bd8nu9DZel#scrollTo=zaCO7S0-_KNn&forceEdit=true&offline=true&sandboxMode=true
2. it is simple and does not offer much ability to change the architecture.
3. check that the losses are what you think they are (the standard decomposition is sketched below).
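A minimal sketch of the standard VAE loss decomposition in PyTorch; note that the reduction ("sum" vs "mean") changes the reported scale, which is a common source of confusion when comparing implementations:

    import torch
    import torch.nn.functional as F

    def vae_loss(recon_x, x, mu, logvar):
        """ELBO pieces: reconstruction term plus KL divergence."""
        recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
        kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kld, recon, kld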
4. can add more checks in evaluate to monitor training.
5. the VAE also has sloppy coding style: evaluation is not wrapped in torch.no_grad() (same fix as the RNN sketch above).
6. hmm, the VAE gives different results depending on how it is coded: around 2 off the TF implementation in test loss and a little off in the KLD loss, though that may just be a difference in parameters.
7. okay, finally got it to converge.
8. add in weights
9. make it more OOP by using kwargs (sketch below).
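One way kwargs can make the constructors more uniform; all names here are hypothetical:

    class VAE:
        DEFAULTS = {"hidden_size": 400, "latent_size": 20, "lr": 1e-3}

        def __init__(self, **kwargs):
            unknown = set(kwargs) - set(self.DEFAULTS)
            if unknown:  # fail loudly instead of silently ignoring a typo
                raise TypeError(f"unknown options: {unknown}")
            self.config = {**self.DEFAULTS, **kwargs}

    model = VAE(hidden_size=800)
    print(model.config["hidden_size"])  # 800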