--------------------------------------------------------------------
In the folder, /lunarc/nobackup/projects/snic2020-6-41/salma-files/ImageProject/
I have the following folders:
1) Courses (the material of the courses I have taken, e.g., FMAN45, DNA sequencing, ...)
2) Game (contains Sonja's annotations and the main and updated versions of the game)
In Sonja's annotation folder, I have some scripts for extracting labels from the sql files, removing the names, converting them to training databases, ...
3) selected_images_folder (contains the raw, 8-bit png, and 16-bit png files of the 89099 selected images (a very precious dataset))
4) master_notebooks_python_sripts (all the master project files and folders, including unsupervised analysis and notebooks)
5) objects (contains the objects from Jon's script for all channels in png form)
--------------------------------------------------------------------
June 22, 2020
I am working on the final version of the game.
1. I am going to add many examples to the practice and help parts.
2. I separated Sonja's annotated images and added them to the help pages of the game.
3. I will add a counter for the number of images that people have annotated (a small sketch of the idea follows below).
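A minimal sketch of such a counter, assuming the sqlite schema mentioned later in this logbook (an annotate_table with an id and a first_label column); the database path is hypothetical:

import sqlite3

# Hypothetical database path for one of the game's sqlite databases.
conn = sqlite3.connect("annotations.db")

# Count images that already carry a label (first_label filled in).
annotated = conn.execute(
    "SELECT COUNT(*) FROM annotate_table WHERE first_label IS NOT NULL"
).fetchone()[0]

# Total number of images offered by the game.
total = conn.execute("SELECT COUNT(*) FROM annotate_table").fetchone()[0]

print(f"{annotated} of {total} images annotated")
conn.close()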
--------------------------------------------------------------------
June 23-26 working on the game and individual study plan.
--------------------------------------------------------------------
June 26
1. Help part done.
2. Counter done.
3. Images and labels are bigger now and easier to annotate.
Game folders are now two folders called:
Web_game_26_June_CompleteHelp (contains 500 images for each channel)
Web_game_26_June_10000each (contains 10000 images for each channel)
4. Uploaded new versions for the master students.
--------------------------------------------------------------------
June 29
1. Completing the individual study plan (finished).
2. Working on the students' feedback about the game (Annie's feedback).
--------------------------------------------------------------------
June 30
1. Working on how to show explanatory tooltip text on the labels just by putting the mouse over them.
2. Solved the problem by adding a title attribute to the label checkboxes.
--------------------------------------------------------------------
July 1
I took a day off.
--------------------------------------------------------------------
July 2
NLP meeting
1. Finished the help part and the tooltip text over the labels.
--------------------------------------------------------------------
July 3
1. Finished the training part of the game based on Sonja's previous annotations.
2. The game is finished with sqlite databases.
--------------------------------------------------------------------
July 6
1. EUGLOH summer school started.
2. I received an email from the LUNARC people about the GPU problem, as follows:
1) you cannot use "interactive" in a batch job
2) there are 2 partitions (queues) with gpus
-p gpu
-p gpuk20
the first one is quite loaded so you must be prepared to wait.
if you use the lu partition (-p lu) you will never get a gpu.
3. Working on solving the GPU problem on LUNARC.
--------------------------------------------------------------------
July 7
Second day of the EUGLOH school.
Here are links to good lectures on:
1) Biomedical image processing
https://www.youtube.com/watch?v=SGKej5ZovVI
2) Patient iPSC-derived brain cells as a precision model for stratifying cellular phenotypes and
developing therapies
https://www.youtube.com/watch?v=hgBSPd8xxwY
I uploaded the new game for Sonja; I need accurate labels.
I corrected the commands for using LUNARC's GPUs and updated the students.
--------------------------------------------------------------------
July 8
Unsupervised Machine Learning for Gene Expression Analysis - Part 1 (Pedro Gabriel Dias Ferreira)
https://www.youtube.com/watch?v=MY88Jz4f8lU
Unsupervised Machine Learning for Gene Expression Analysis - Part 2 (Pedro Gabriel Dias Ferreira)
https://www.youtube.com/watch?v=B0109uFoT_I
Meeting with Sonja
Schedule until 1st of August
1) Game and documentation
2) Writing a manuscript about the game (only an outline of the paper)
3) Find a journal where to submit (Bioinformatics? ... dataset of images? ...)
4) Create an annotation agreement table (50 images)
5) Solve the GPU problem on LUNARC and Kebnekaise
6) Take the credits for the EUGLOH summer school
------------------------------------------------------------------------
July 9
Summer school lectures on Economy and epidemiological aspects of COVID-19
A good lecture from Anders Widell, a virologist from Lund University:
SARS-Cov-2 And COVID-19 (Joakim Esbjörnsson, Anders Widell)
https://www.youtube.com/watch?v=LUOInNx4q_Q
---------------------------------------------------------------------------
July 10
The summer school finished.
The test is done.
A good lecture on Molecular biology and
immunology of the SARS CoV-2 infection
link: https://www.youtube.com/watch?v=_wjLK4_csOs
---------------------------------------------------------------------------
July 13
Shared 89088 8-bit png images on snic2020-6-41 with students
Finished the GPU tutorial and shared it with the students (it does not work on LUNARC).
Started working with Kebnekaise (login through terminal, ThinLinc, ...):
through thinlinc: server: kebnekaise-tl.hpc2n.umu.se
through terminal : domain: ssh yourusername@abisko.hpc2n.umu.se
or ssh yourusername@kebnekaise.hpc2n.umu.se
Solved Mariam's problems with the game.
---------------------------------------------------------------------------
July 14
Trying to solve the GPU and torch.cuda problem on LUNARC and Kebnekaise (seems unsolvable :( ).
---------------------------------------------------------------------------
July 15, 16
I tried to run three different scripts on LUNARC and connect to the GPUs.
1) The first one was NER_by_Flair_NCBI.py
################################################
I tried the following job first. However, it is still pending after 48 hours:
#SBATCH -A lu2020-2-10
#SBATCH -p gpu
#SBATCH --gres=gpu:2
#SBATCH -n 1
#SBATCH --mail-user=sa5202ka-s@student.lu.se
#SBATCH --mail-type=END
#SBATCH -J Flair_model_on_NCBI_disease
#SBATCH -t 40:00:00
#SBATCH -o NCBI_disease.out
#SBATCH -e NCBI_disease.err
#SBATCH --mem-per-cpu=11000
python3 ../notebooks/python-scripts/NER_by_Flair_NCBI.py > NCBI_log.txt
****** The good news is that after it started, it used the GPUs and took only 00:59:22
to run, while it took 13 hours on CPU.
In this case the device was shown as
Device: cuda:0
It means that this if-statement finally became true:
if torch.cuda.is_available():
################################################
Then I tried the following:
#SBATCH -A lu2020-2-10
#SBATCH -p gpuk20
#SBATCH -n 1
#SBATCH --mail-user=sa5202ka-s@student.lu.se
#SBATCH --mail-type=END
#SBATCH -J Flair_model_on_NCBI_disease
#SBATCH -t 40:00:00
#SBATCH -o NCBI_disease.out
#SBATCH -e NCBI_disease.err
#SBATCH --mem-per-cpu=11000
python3 ../notebooks/python-scripts/NER_by_Flair_NCBI.py > NCBI_log.txt
#################################################
The run started but again it was in CPU mode for the following part of the code (it did not run on GPU!!):
if torch.cuda.is_available():
    device = torch.device('cuda:0')
    print('gpu')
else:
    device = torch.device('cpu')
    print('cpu')
#################################################
I tried interactive mode by the following command in terminal:
interactive -A LU 2020-2-10 -p gpu --gres=gpu:2 -t 1:00:00
It is still pending after 48 hours.
and the following one also didn't work, although it started immediately for one hour.
interactive -A LU2020-2-10 -p gpuk20 -t 1:00:00
###############################################################################
The second and third scripts were the following, and I got different errors for each of them:
2) /snic2020-6-41/salma-files/NLPProject/Flair/jobs/gpu-test.py
It is for testing numba, which is a JIT compiler, but I was not successful.
In the Jupyter notebook snic2020-06-41/salma-files/NLPProject/Flair/Regex_cuda_test.ipynb I have some notes on numba and jit ...
3) /snic2020-6-41/salma-files/ImageProject/Courses/FMAN45/L14_files/torch_mnist_cuda.py
I could run this on Marcus's system.
The code has the following part to copy the data on gpu:
# Load network and send to GPU
c = ConvNet()
print(summary(c, torch.zeros((1,1,28,28))))
c.cuda()
***** I received the following error while I was trying to run it on CPU on LUNARC:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
And I am waiting for the GPU results on LUNARC.
While I was trying to run NER_by_Flair_NCBI.py on Marcus's system I got the following error:
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.22' not found
(required by /mnt/fastdisk/BioNLP/anaconda3/lib/python3.7/site-packages/scipy/fft/_pocketfft/pypocketfft.cpython-37m-x86_64-linux-gnu.so)
By typing the following command I got the available GLIBCXX versions, which were:
strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX
GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBCXX_3.4.20
GLIBCXX_3.4.21
GLIBCXX_DEBUG_MESSAGE_LENGTH
I should contact Marcus about updating the version since I don't have admin privileges on his system.
--------------------------------------------------------------------
July 17, July 20
A thorough beginner guide is at:
https://www.hpc2n.umu.se/documentation/guides/beginner-guide
Working with the Kebnekaise GPUs.
A sample Kebnekaise job script is:
###################################################################
#!/bin/bash
# Put in actual SNIC number
#SBATCH -A snic2020-9-99
#SBATCH -n 1
#SBATCH -c 1
#SBATCH -J torch_mnist
#SBATCH --time=00:15:00
###SBATCH -p largemem
#For OpenFOAM version 6
#ml purge > /dev/null 2>&1 # Ignore warnings from purge
#ml icc/2018.1.163-GCC-6.4.0-2.28 impi/2018.1.163
#ml ifort/2018.1.163-GCC-6.4.0-2.28 impi/2018.1.163
#ml OpenFOAM/6
# to change the default platforms directory of OpenFOAM
#source /pfs/nobackup/home/m/morteza/etc/settings.sh
# run the program
#decomposePar -force >& log.decomposePar
#srun -n 32 pelletReactingFoam -parallel >& log.pelletReactingFoam
#reconstructPar -newTimes >& log.reconstructPar
python ../torch_mnist.py
####################################################################
--------------------------------------------------------------------
July 21
I am working on a tutorial for using Kebnekaise and Abisko:
logging in, saving files, submitting a job, and using GPUs.
It is stored as /snic2020-06-41/salma_files/Tutorials/kebnekaise_abisko_short_tutorial.ipynb
I also completed other tutorials and stored them in the same directory, /snic2020-06-41/salma_files/Tutorials/
--------------------------------------------------------------------
July 22
Submitting a job on CPU and also GPU works on Kebnekaise and Abisko now.
A copy of the tutorial was sent to Malou.
For running our image processing scripts on Kebnekaise we need the storage project,
since we have only 25GB of space on /pfs/nobackup/. However, we have access to the
large memory partition on Kebnekaise, which provides up to 3072000MB of memory for a job.
""If your job requires more than 126000MB / node on Kebnekaise, there is a limited number of nodes with 3072000MB memory, which you may be allowed to use (you apply for it as a separate resource when you make your project proposal in SUPR). They are accessed by selecting the largemem partition of the cluster. You do this by setting: -p largemem.""
--------------------------------------------------------------------
July 23, 24
I made a run on the GPU nodes of Kebnekaise. I had some errors yesterday for my scripts,
as follows:
AssertionError:
The NVIDIA driver on your system is too old (found version 10010).
Please update your GPU driver by downloading and installing a new
version from the URL: http://www.nvidia.com/Download/index.aspx
Alternatively, go to: https://pytorch.org to install
a PyTorch version that has been compiled with your version
of the CUDA driver.
I tried to change the version of PyTorch (torch and torchvision) and then I got another error:
/bin/bash: /hpc2n/eb/software/lmod/lmod/init/bash: Transport endpoint is not connected
python: error while loading shared libraries: libpython3.6m.so.1.0: cannot open shared object file: No such file or directory
I emailed the problem to the support people and got this answer:
you should load the appropriate Python in your submit file using the commands below,
after the SBATCH commands and before actually using Python; you can find the
available versions of Python using "ml spider python/".
see the following for more information
https://www.hpc2n.umu.se/documentation/environment/lmod
ml purge 2>/dev/null >/dev/null
ml GCCcore/8.3.0
ml Python/3.7.4
-----------------------------------------------------------------------
July 27 (off day)
-----------------------------------------------------------------------
July 28
I downgraded the versions of torch and torchvision to be compatible with the NVIDIA driver version.
My previous versions were torch==1.5.1 and torchvision==0.6.0.
I had to reinstall them with lower versions (output of "pip freeze"):
torch==1.3.0
torchvision==0.4.0
I also had to load the "cuDNN" and "CUDA" modules, and for loading these modules I had to load their dependencies,
which I can find out with "ml spider module_name".
Finally, I could run my "torch_mnist_cuda.py" script as a job on Kebnekaise on a K80 node without error in the newEnv environment.
The list of modules I loaded was (command: "ml"):
Currently Loaded Modules:
1) systemdefault (S) 7) GCC/8.3.0 13) libffi/3.2.1
2) snicenvironment (S) 8) ncurses/6.1 14) bzip2/1.0.8
3) iccifort/2019.5.281 9) libreadline/8.0 15) SQLite/3.29.0
4) GCCcore/8.3.0 10) Tcl/8.6.9 16) Python/3.7.4
5) zlib/1.2.11 11) XZ/5.2.4 17) CUDA/10.1.243
6) binutils/2.32 12) GMP/6.1.2 18) cuDNN/7.6.4.38
For training the Flair model on Kebnekaise I submitted the job. However, I got this error:
PermissionError: [Errno 13] Permission denied: '/home/s/salmak/.flair/embeddings/pubmed-2015-fw-lm.pt'
The steps:
ml GCCcore/8.3.0
ml Python/3.7.4
source /pfs/nobackup/$HOME/NLPenv/bin/activate
By running it in the terminal I got the memory error that the quota was exceeded.
------------------------------------------------------------------------
July 29, 30
I annotated images again to compare with Sonja's, Jon's, and Mariam's annotations.
I assigned a number to each label, and when an image got the same label from several annotators, the numbers were summed up (a sketch of the idea follows below).
I assigned a column to each label and show the results in a table in
snic2020-6-41/salma-files/ImageProject/Game/Annotations/Annotation_comparison.ipynb
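A minimal sketch of that comparison logic, assuming hypothetical image names, label names, and per-annotator choices (the real table is built in the Annotation_comparison notebook above):

import pandas as pd

# Hypothetical annotations: per annotator, a mapping image -> set of chosen labels.
annotations = {
    "salma":  {"img1": {"label_a"}, "img2": {"label_b", "label_c"}},
    "sonja":  {"img1": {"label_a"}, "img2": {"label_b"}},
    "jon":    {"img1": {"label_d"}, "img2": {"label_b"}},
    "mariam": {"img1": {"label_a"}, "img2": {"label_c"}},
}
labels = ["label_a", "label_b", "label_c", "label_d"]
images = ["img1", "img2"]

# One column per label; each cell counts how many annotators chose that label for
# that image, so full agreement shows up as the total number of annotators.
table = pd.DataFrame(0, index=images, columns=labels)
for per_image in annotations.values():
    for img, chosen in per_image.items():
        for label in chosen:
            table.loc[img, label] += 1

print(table)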
------------------------------------------------------------------------
July 31
Iran shared images with us on LU Box.
All images are transferred to LUNARC and Kebnekaise for some analysis.
I searched a little bit for a target journal for the game manuscript.
I think the Bioinformatics journal is good.
https://academic.oup.com/bioinformatics/pages/instructions_for_authors
Application Notes (up to 2 pages; this is approx. 1,300 words or 1,000 words plus one figure): Applications Notes are short descriptions of novel software or new algorithm implementations, databases and network services (web servers, and interfaces). Software or data must be freely available to non-commercial users. Availability and Implementation must be clearly stated in the article. Authors must also ensure that the software is available for a full two years following publication. Web services must not require mandatory registration by the user. Additional supplementary data can be published online-only by the journal. This supplementary material should be referred to in the abstract of the Application Note. If describing software, the software should run under nearly all conditions on a wide range of machines. Web servers should not be browser specific. Application Notes must not describe trivial utilities, nor involve significant investment of time for the user to install. The name of the application should be included in the title.
--------------------------------------------------------------------------
Aug 03-07
Working on Iran's images.
1) Read them and convert them to numpy arrays categorized into two classes: Control and LPS.
2) Cut the images into (224, 224) tiles to feed into a dense network (a tiling sketch follows below).
3) The totals were 140 control images and 70 LPS images.
4) Now we have 2520 control tiles and 1820 LPS tiles. The sizes of the LPS images are different.
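A minimal sketch of the tiling step, assuming square non-overlapping crops and hypothetical file paths:

import numpy as np
from PIL import Image

TILE = 224  # tile side length expected by the network

def tile_image(path):
    """Cut one image into non-overlapping (224, 224) tiles, dropping edge remainders."""
    img = np.asarray(Image.open(path))
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - TILE + 1, TILE):
        for x in range(0, w - TILE + 1, TILE):
            tiles.append(img[y:y + TILE, x:x + TILE])
    return np.stack(tiles)

# Hypothetical usage on one control image and one LPS image:
# control_tiles = tile_image("control/slide_01.png")
# lps_tiles = tile_image("lps/slide_01.png")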
-------------------------------------------------------------------------
Aug 10
Trained a DenseNet with the images and tested on new data.
I can find the DenseNet paper from the following link:
All the files are stored in the snic2020-6-41/salma-files/ImageProject/Collabrations directory.
There are still some issues with the result.
-------------------------------------------------------------------------
Aug 11
Continued the analysis of Iran's images.
Completed the Annotation_comparison notebook and uploaded it to AitsLab/Microscopy_image_analysis_folder.
Uploaded the Tutorials directory to AitsLab/Infrastructure.
------------------------------------------------------------------------
Aug 12-13
Debugging the analysis.
Trained the network over and over again;
still bad test results.
------------------------------------------------------------------------
Sep 06
I trained a VGG16 model + two different sets of layers on top of that. All the analysis is in the snic2020-6-41/salma-files/ImageProject/Collabrations/ directory.
There are four Jupyter notebooks that are summarized in one for sharing with the Darcy group and will be presented to them on Sep 09.
The main code is in python-script/3_Sep_VGG_data_4 and python-script/3_Sep_VGG_batch_normalization_data_4, which hold the final results on the data_4 set.
There, 80 percent of the original images are separated for the training dataset and 10 and 10 percent for validation and test. Then those images are cropped into smaller (224, 224) tiles.
I am trying to use Grad-CAM to take the gradients of the output loss with respect to the last conv layer, to see which part of the image the network bases its decision on (a sketch follows at the end of this entry).
However, due to a TensorFlow version problem I got multiple errors.
I had to create a new conda environment as follows:
conda create -n tfgpu tensorflow python=3.6.8
conda install tensorflow-gpu==1.13.1
to test my code again. I am working on it to solve the errors.
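A minimal Grad-CAM sketch written for TF2/Keras (an assumption on my side; the TF 1.13 environment above would need adaptation); the layer name is the standard VGG16 one and is hypothetical for my model:

import numpy as np
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    """Heatmap of where the network looks: gradients of the class score
    with respect to the last convolutional feature maps (Grad-CAM)."""
    # Model that maps the input to (last conv feature maps, predictions).
    grad_model = tf.keras.models.Model(
        model.inputs, [model.get_layer(last_conv_layer_name).output, model.output]
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))
        score = preds[:, class_index]
    # Gradient of the class score w.r.t. each feature map, averaged spatially.
    grads = tape.gradient(score, conv_out)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))
    # Weighted sum of the feature maps, then ReLU and normalization.
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    cam /= (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()

# Hypothetical usage with a VGG16-based classifier:
# heatmap = grad_cam(model, tile, last_conv_layer_name="block5_conv3")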
------------------------------------------------------------------------
Sep 14
I am working on the LPS/Ctrl images. I am trying to work with the original images: extract some images,
separate them with an 80/10/10 ratio, add blurred images with kernel sizes 3 and 5 to the main data, and change the brightness of the images randomly (an augmentation sketch follows below).
Then train the new network (VGG16) on them and check the result.
We have the new storage project on Kebnekaise.
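A minimal sketch of those two augmentations, assuming OpenCV and 8-bit images; the brightness range is a hypothetical choice:

import cv2
import numpy as np

def blur(img, k):
    """Gaussian blur with a k x k kernel (k = 3 or 5 as noted above)."""
    return cv2.GaussianBlur(img, (k, k), 0)

def random_brightness(img, max_shift=30):
    """Shift pixel intensities by a random amount; max_shift is a hypothetical value."""
    shift = np.random.randint(-max_shift, max_shift + 1)
    return np.clip(img.astype(np.int16) + shift, 0, 255).astype(np.uint8)

# Hypothetical usage on one tile (a uint8 numpy array):
# augmented = [blur(tile, 3), blur(tile, 5), random_brightness(tile)]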
-----------------------------------------------------------------------
Sep 16-17
I used PyTorch for training the VGG16 classifiers on Darcy's project and ran them on Marcus's system.
The results were not good enough. Working on that...
----------------------------------------------------------------------
Sep 18-30
vacation
----------------------------------------------------------------------
October 1- Nov 09
parental leave
----------------------------------------------------------------------
Nov 07
On Nov 07 I had a problem with the LUNARC system.
My Pocket Pass token expired on October 26th.
I had to follow the instructions at
https://lunarc-documentation.readthedocs.io/en/latest/authenticator_howto/#checking-the-validity-of-your-token
to register and activate a new one.
I started listening to the Stanford NLP course lectures.
First lecture on YouTube:
https://www.youtube.com/watch?v=8rXD5-xhemo
----------------------------------------------------------------------
Nov 10
Stanford NLP course
----------------------------------------------------------------------
Nov 13
The first lecture is summarized in /snic2020-06-41/salma-files/NLPProject/CS224N/lecture1/cs224n.ipynb
The theoretical optimization problem + the gensim package.
GloVe embeddings + word2vec.
A small cell death lecture is summarized in /snic2020-06-41/salma-files/Biology/Cell_death/ directory as apoptosis.txt
A lecture on NLP (analysing the text of stand-up comedians' transcripts) with
full steps is summarized in /snic2020-06-41/salma-files/
----------------------------------------------------------------------
Dec 1
Came back from parental leave.
work 25% (mornings)
A short talk with Sonja: what to do next?
what is going on:
1) Augustin and Ludwig are working on classification of histology screens
2) Peter_Alexander are working on BioBERT NLP relation extraction
what should I do:
Game:
Compare results (new results)
Share with Rafsan
Transfer everything to git (change notebooks to .py files)
write "make-files" for scripts
control version with git
.json hyperparameter for each model
write readme.txt for each directory
send email to carl for system biology course
deep learning_ journal club in 2021
a facebook page hubAI
deep learning course (get material from Sonja)
Read Augustin and ludwiq's notebook in onenote
----------------------------------------------
Dec 02
Started working at 18:30 (for around 2 hours).
Updated the annotation databases to skip the first 100 images.
I used "update annotate_table set first_label ='skip,salma' where id in (select id from annotate_table limit 100 );" command.
Copied the new game for Sonja and Rafsan
---------------------------------------------
Dec 03
start 9:30
Tried to fix the game for Rafsan; there is still an error and it is about the conda environment.
Histology meeting (the guys trained a 3-class classifier).
They shared the Grad-CAM code and the package versions:
Tensorflow version 2.3.0
keras version 2.4.3
---------------------------------------------
Dec 08
Fixed the Grad-CAM (results are not good).
Fixed Rafsan's game.
---------------------------------------------
Dec 14
start 10:30
Checked Rafsan's and Sonja's annotations (not done yet).
---------------------------------------------
Dec 21
Things to do:
1) Binary game
2) Check scores of Iran's images
3) Grad-CAM completion and upload
4) Game draft and binary draft
5) Ask for Malou's code for new cutouts
----------------------------------------------
Dec 22-Jan 21
Working on Iran's dataset.
Trained VGG16 and ResNet50 regression models for the average scores.
Trained VGG16 and ResNet50 regression models for the individual scores.
Working on Kebnekaise.
---------------------------------------------
Jan 22
Received Iran's new dataset for the ECMO, MV, and ECMO+LPS treatments.
Want to test the previous models on these images.
Meeting with Darcy and Iran on the 22nd of Jan.
The image processing course from Michigan University is still going on!
Paper from the Thomas group for the weekend.
--------------------------------------------
5th of Feb
Annual meeting with Sonja
What we discussed:
Agile, virtual board: doing, to be done, three of us!
Make contacts with industry
Build a network, journal club
After a seminar, discuss what we have learned
every two weeks,
one in a month,
Technical groups
PhD course
put you in contact with others
practice on papers
Let's do PyTorch
Grad-CAM
docker
NLP course
May (NLP course)
Teaching course...
Spark
8 papers (small papers)
1. good routines work together
2. publish papers
3. histology paper
4. Mariam projects
5. This year Sonja writes
6. read lots of papers
7. grant writing
8. Co-supervisor
9. This paper
10. Malou's annotations
11. Swedish NLP (NER for Swedish symptoms)
12. Swedish spaCy and BERT (Flair)
-----------------------------------------------
Feb 9, 10
Working on training with new parameters.
Changed the learning rate: does not work.
Added BN (batch normalization): works, but not well.
Added CLR (see the scheduler sketch below): /proj/nobackup/aits_storage/salma-files/NLPenv/bin/python -m pip install CLR
or /proj/nobackup/aits_storage/salma-files/NLPenv/bin/python -m pip install --upgrade pip first
I have to spell out the PATH since the environment is originally installed in /pfs/nobackup/home/s/salmak/NLPenv/lib/python3.7/site-packages
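Assuming CLR here refers to cyclical learning rates, a minimal sketch using PyTorch's built-in scheduler (rather than the pip CLR package above); the model, optimizer, and rate bounds are hypothetical:

import torch

# Hypothetical model and optimizer; CyclicLR expects a momentum-based optimizer
# when cycle_momentum is left at its default (True).
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# The learning rate cycles between base_lr and max_lr every 2 * step_size_up batches.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-5, max_lr=1e-3, step_size_up=200, mode="triangular"
)

for batch in range(1000):    # stand-in for the real training loop
    optimizer.step()         # would follow loss.backward() in real code
    scheduler.step()         # advance the cyclical schedule each batch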
-----------------------------------------------
Feb 12
Read the neurodegenerative thesis and commented on the DE analysis part.
----------------------------------------------
Feb 13, 14
Reviewed the first lecture of the CS224N course (math and script part).
----------------------------------------------
Feb 15
Working on the PyTorch version of the regression model.
Reviewed the first lecture of the deep learning in computer vision course.
Learning PyTorch!!! (started from this link: https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html)
The steps are summarized in the salma-files/Courses/Pytorch/Tutorial.ipynb notebook.
-----------------------------------------------
Feb 15-Feb 28
Finalizing the histology project's results.
----------------------------------------------
March first week: knowledge in collaboration course
----------------------------------------------
March Second Week: Statistics I course
----------------------------------------------
March third week: Off days
----------------------------------------------
March fourth week: Qualitative research course
----------------------------------------------
April first week: Working on graphical visualization of scores on the images for histology project
----------------------------------------------
April second week: Off days
---------------------------------------------
April third and fourth week : Statistics II course
---------------------------------------------
April 25-May 3
Trained EfficientNetB0 on the total score (not better than VGG16).
*************Future plan: train EfficientNetB4 and B7.
--------------------------------------------
May 4-May 10 Research ethics Course
-------------------------------------------
May 6 journal club:
SPICE paper
************Future plan: run it over 890000 images
-------------------------------------------
May 12
Meeting with Johanna
*************Write a script for parsing the pdf journals
in R or Python.
Visualize the relation graph in R and Cytoscape:
install.packages("BiocManager")
library(BiocManager)
BiocManager::install(version = "3.12")
BiocManager::install("paxtoolsr")
library(paxtoolsr)
BiocManager::install("rJava")
library(rJava)
help.search("paxtoolsr")
install.packages("igraph")
library(igraph)
results <- readSif("tab_example.sif")
g <- loadSifInIgraph(results)
g
plot(g)
-----------------------------------------------
May 13
Johanna added us to the github repository
**************Add the Flair model to the NLP pipeline
----------------------------------------------
May 14-23
Work 50%
Done this week:
*Debugged Mariam's Grad-CAM code
*Wrote the script for parsing the pdf file in R (only the first phase, which extracts the patient info + aktuellt + huvud... info)
*Trained B4 and B7 on the three-group dataset and also the five-group dataset (for B7 I changed the batch size to 8)
Waiting for results:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
13215312 single Regressi salmak R 6:44 1 b-cn0123
13215298 single Regressi salmak R 8:24 1 b-cn0123
13215230 single Regressi salmak R 12:18 1 b-cn0343
13215206 single Regressi salmak R 13:39 1 b-cn0343
13215202 single Regressi salmak R 17:17 1 b-cn0847
Going to do:
Start Coursera NLP course
----------------------------------------------
May 24-29
plan:
Finish histology runs and complete the manuscript
Finish the NLP course
----------------------------------------------
June 1-25
working 50 %
Working on histology project
Found a bug in the 5-fold dataset:
slide 28 exists in both MV and Control and was copied by mistake both to the fold where MV was validation and to the fold where Control was validation (a leakage-check sketch follows below).
Corrected that.
Filling in the information in the manuscript.
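A minimal sketch of the kind of check that catches this, assuming hypothetical fold lists of (slide_id, class) pairs:

from collections import defaultdict

# Hypothetical fold assignments: fold index -> list of (slide_id, class) pairs.
folds = {
    0: [("slide_27", "Control"), ("slide_28", "MV")],
    1: [("slide_28", "Control"), ("slide_30", "MV")],   # slide 28 leaks into fold 1 too
}

seen = defaultdict(set)  # slide_id -> set of folds it appears in
for fold_id, slides in folds.items():
    for slide_id, _ in slides:
        seen[slide_id].add(fold_id)

for slide_id, fold_ids in seen.items():
    if len(fold_ids) > 1:
        print(f"{slide_id} appears in folds {sorted(fold_ids)} -- possible leakage")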
---------------------------------------------
June 29
*******Making a new environment:
pip install opencv-python-headless
--------------------------------------------
Parental leave in July (almost 100%).
-------------------------------------------
9th of July Group meeting
Results:
Salma:
1. Histology
2. Biobert project
Goals:
Train the large version on the HUNER corpora
Save the model in .pb format for further predictions
Use clusters (Alvis)
Make it compatible with TF 2
Finish the manuscript
Sonja:
Mariam’s project
Genes, symbols, identifiers:
to resolve them to a single one.
Conversion tool: UniProt has one.
Gene IDs could be chosen, but UniProt has a manually reviewed part and
another part that is automatic (the unreviewed part).
Some don’t match (still updating)
Sonja: update Mariam's code (pandas instead of for loops)
Theresa is doing master project in August
--------------------------------------------
Aug 03
I realized I was not running my code on GPUs. There was this error:
"Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.1.."
I did: ml CUDAcore/11.3.1
And it is solved. I also loaded (output of "ml"):
2) cuDNN/8.2.1.32-CUDA-11.3.1 3) CUDA/10.1.243 4) CUDAcore/11.3.1
Now the code is running on GPU :)
------------------------------------------
Nov 01 2021
Started working after two months of full leave.
1. This week I will work on the histology manuscript.
2. I will also try to run the BioBERT PyTorch models on the gold standard
(only for gene and protein)
3. Add my binary part to Iran's manuscript
-----------------------------------------
Nov 02, 03, and 04: VAB (Noura was at home)
I added all BioBERT results to the NER_results Excel sheet.
Checking the data (HunFlair data and tokenization).
I have one dataset from the Adam_Ola GitHub page, which is
https://github.com/Aitslab/BioNLP/tree/master/Adam_Ola/ner_inputs/HunFlair_NER_gene/gene_all_combined/train_dev.tsv
And one from Marcus Klang as HUNER_DATASET.zip on
ner_inputs directory
----------------------------------------
Nov 08
-All the dates of the folders were checked with the stat command.
-All the reported results were done on the Adam_Ola dataset; added all to the Excel sheet.
**Starting to add Flair results.
From now on, all finished tasks will be shown with -
From now on, all ongoing tasks will be shown with **
-Updated logbook added to the /salmaviolet/Microscopy_image_analysis_folder/ GitHub repo.
Following steps were done:
-git clone https://github.com/salmaviolet/Microscopy_image_analysis_folder.git
-git add logbook.txt
-git commit -m 'Update Nov 08'
then in Settings -> Developer settings -> generate token -> copy the token
-git push -u origin master or -git push
username: salmaviolet
paste: paste token
Tip: for moving the cursor to the end of the file in the vim editor:
ESC
then
Shift + G
---------------------------------------
Nov 16
*Flair embeddings are not working
*The link is now
https://nlp.informatik.hu-berlin.de/resources/embeddings/flair/
-I had to download the embedding files as *.pt and pass the path to the FlairEmbeddings('.pt path') function (see the one-liner below).
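A minimal sketch of that call, assuming the pubmed forward language model mentioned earlier in this logbook has been downloaded locally (the path is hypothetical):

from flair.embeddings import FlairEmbeddings

# Point Flair at the locally downloaded language-model file instead of the old URL.
forward_lm = FlairEmbeddings("/path/to/pubmed-2015-fw-lm.pt")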
---------------------------------------
Nov 17 -- Parental leave
---------------------------------------
Nov 18
- Requested a Berzelius account and signed the agreement.
- username: x_salka
For using GPUs in scripts,
add e.g. --gpus 4
and also run this
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
*****This did not work
It is also possible to use gpu in front end interactive mode
interactive --gpus=1
*** This also worked
*****This did work:
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
-------------------------------------
Nov 19
work 50%
Noura was sick
Finished the Flair evaluation on the gold standard:
train the model,
save the checkpoints,
load the checkpoints and resume the training (a generic checkpoint sketch follows below),
add all the results to the Excel sheet.
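A minimal generic PyTorch-style save/resume sketch (not Flair's own trainer checkpoint API, which is what I actually used); the names are hypothetical:

import torch

def save_checkpoint(model, optimizer, epoch, path="checkpoint.pt"):
    """Save enough state to resume training later."""
    torch.save(
        {"epoch": epoch,
         "model_state": model.state_dict(),
         "optimizer_state": optimizer.state_dict()},
        path,
    )

def resume(model, optimizer, path="checkpoint.pt"):
    """Load the saved state and return the epoch to continue from."""
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1

# Hypothetical usage:
# save_checkpoint(tagger, opt, epoch=5)
# start_epoch = resume(tagger, opt)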
-------------------------------------
Nov 22- Noura was sick
-------------------------------------
Nov 23- come back to office
Read Iran's manuscript.
Answered all comments.
-------------------------------------
Nov 24-25
I was sick
-------------------------------------
Nov 26
Working on manuscript
-------------------------------------
Nov 29- Dec 03
Oral communication course
-------------------------------------
Dec 06
A little work on the new Kaggle competition data.
Being in contact with Malou.
She trained a U-Net on the data for 15 epochs.
The results show an F1-score of 0.16 (the best score is around 0.339 now).
Writing the manuscript.
-------------------------------------
Dec 07
-Still on Kaggle data
* I will write the binary part of the histology manuscript today
--------------------------------------
Dec 08
For creating Malou's environment,
first I need to:
conda update --all
conda env export --no-builds > env.yml
conda env create -f env.yml
- Also need to change the name of the env in env.yml.
-------------------------------------
Dec 09
Got the GPU to work;
still some errors.
tensorflow = '1.14.0'
keras = 2.2.4
cudatoolkit = 10.0.130
cudnn = first 7.3, then 7.6.5 (got the GPU to work but the model is still not running)
---------------------------------------
Dec 10
GPU works for Malou's code on Berzelius.
Still some bugs in the evaluation (have to be fixed).
Meeting with Sonja and Rafsan.
---------------------------------------
Dec 13
Working on a new model from https://bitbucket.org/t_scherr/cell-segmentation-and-tracking/src/master/
It did not work.
Looking at a simple algorithm called watershed from the cv2 package (a sketch follows at the end of this entry),
or I will run the U-Net again on the predicted masks tomorrow.
-Working on the GitHub repo Rafsan shared with me.
-Adding Flair to the pipeline.
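A minimal sketch of the classic OpenCV watershed recipe for splitting touching objects, assuming an 8-bit grayscale input; the file name is hypothetical:

import cv2
import numpy as np

def watershed_split(gray):
    """Split touching objects in a grayscale image via markers + cv2.watershed."""
    # Binary mask of the foreground (Otsu threshold), cleaned with a small opening.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = np.ones((3, 3), np.uint8)
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel, iterations=2)

    # Sure background (dilated mask) and sure foreground (peaks of the distance transform).
    sure_bg = cv2.dilate(opened, kernel, iterations=3)
    dist = cv2.distanceTransform(opened, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)
    sure_fg = sure_fg.astype(np.uint8)
    unknown = cv2.subtract(sure_bg, sure_fg)

    # Label the sure-foreground regions; the unknown band stays 0 for watershed to decide.
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1
    markers[unknown == 255] = 0

    # Watershed works on a 3-channel image; boundary pixels come back as -1.
    markers = cv2.watershed(cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR), markers)
    return markers

# Hypothetical usage on a predicted mask or raw tile loaded as grayscale:
# markers = watershed_split(cv2.imread("tile.png", cv2.IMREAD_GRAYSCALE))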
----------------------------------------
Dec 14
Put aside the segmentation project for now.
I did not get Malou's result anyway.*************************************
Meeting with Iran
-----------------------------------------
Dec 15-Jan 14
Working on the binary paper (histology).
Reran everything.
Added all files to OneDrive/manuscripts/histo1.
Working on the manuscript.
-----------------------------------------
Jan 16
Started adding Flair to the pipeline.
-----------------------------------------
Jan 17-27
Correcting Iran's plot
*** Adding three-class classifier instead of binary to the first manuscript
Adding plots to the figure file
Adding Flair to the pipeline
Presented the journal club paper
-----------------------------------------
Jan 29
Finished the 3-class classifier runs.
Finished adding the Flair model to the pipeline.
-----------------------------------------
Feb 03
Running the 3-class classifier on Alvis.
It did not work on my own laptop;
no other clusters available, only Alvis.
******************************************************
I realized I was not running my code on GPUs. There was this error:
"Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.1.."
I did: ml CUDAcore/11.3.1
And it is solved. I also loaded (output of "ml"):
2) cuDNN/8.2.1.32-CUDA-11.3.1 3) CUDA/10.1.243 4) CUDAcore/11.3.1
Now the code is running on GPU :)
****************************************************
-----------------------------------------
Feb 04, 2022
193855 alvis Ctrl_lps salmak R 0:03 1 alvis1-16
193854 alvis Ctrl_lps salmak R 1:29 1 alvis1-15
193853 alvis Ctrl_lps salmak R 2:51 1 alvis1-14
193852 alvis Ctrl_lps salmak R 5:33 1 alvis1-13
Four runs on Alvis for the 3-class classifier:
EfficientNetB4 and VGG16 for 3-fold and 5-fold cross-validation.
------------------------------------------
March 17
I was finalizing the histology papers' results during the past few weeks:
Binary classification of MV+LPS and control slides.
Three-class classification of all slides.
And regression models.
The notebooks and Excel results of the total score are now added to OneDrive.
Individual scores and visualization are not finalized (I will do that after I come back on the 14th).
During the next weeks I will write the histology papers (first Iran's paper and second my own).
Today I checked Sonja's new images on Swestore; only the names, and listed them.
Berzelius now has two-step authentication with a TOTP app on the phone.
Something I should do is the LUBI seminar (20th of April).
Ellite focus group stuff.
The Flair model and all embeddings that I downloaded from the ftp server are on Berzelius, shared with Rafsan.
I did a small unsupervised clustering on the histology images. I took the features from the classifiers and regression models and fed them into unsupervised clustering algorithms. The results show separate clusters in some cases. Results are on OneDrive. I want to train a 5-class classifier and do unsupervised clustering on those features for the Ellite focus group (a small clustering sketch follows below).
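A minimal sketch of that feature-clustering step, assuming the features have already been extracted into an array; the feature dimensions, algorithm choice, and cluster count are hypothetical:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Stand-in for the real features taken from a trained classifier or regression model:
# one row per image, columns are penultimate-layer activations.
features = np.random.rand(200, 512)

# Standardize, then cluster into a hypothetical number of groups.
scaled = StandardScaler().fit_transform(features)
cluster_ids = KMeans(n_clusters=5, random_state=0).fit_predict(scaled)

print(np.bincount(cluster_ids))  # how many images fall into each cluster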
***After one month
1) you should update your Mahara,