Chaos-Game-Representation_BioSeq

Represents biological sequences using the Chaos Game Representation (CGR) and uses the resulting matrices to measure similarity between sequences.
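For readers new to CGR, here is a minimal sketch of how a DNA sequence could be turned into a normalised matrix. It is not taken from this repository: the function name `cgr_matrix`, the corner assignment in `CORNERS`, and the grid resolution `size` are all illustrative assumptions, and other conventions exist.

```python
import numpy as np

# Assumed corner layout of the unit square (hypothetical; conventions vary).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_matrix(seq, size=8):
    """Return a size x size CGR matrix with values scaled to [0, 1]."""
    counts = np.zeros((size, size), dtype=float)
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip symbols without a corner, e.g. N
        cx, cy = CORNERS[base]
        # Chaos game step: move halfway towards the corner of this base.
        x, y = (x + cx) / 2, (y + cy) / 2
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        counts[row, col] += 1
    peak = counts.max()
    # Scale by the largest count so every entry lies between 0 and 1.
    return counts / peak if peak > 0 else counts


matrix = cgr_matrix("ACGTTGCAACGT", size=4)
print(matrix.shape)
```

Each cell counts how many chaos game points fell into it, so a larger `size` gives a finer picture but needs longer sequences to fill it.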

Medium article: https://rrohith2001.medium.com/chaos-game-representation-of-genetic-sequences-e0e6bdcfaf6c


Data

Each Excel file contains all the sequences of the respective family.


GUI

https://share.streamlit.io/rohith-2/chaos-game-representation_bioseq/stream.py



Gene-Similarity

The CGR matrix is a 2D matrix indexed by (x, y). Each entry is a normalised value between 0 and 1 that gives the colour intensity at (x, y).
The first two rows are considered for similarity measurement:

cgr_vec = Empty Vector()
for i <- cgr matrix of SEQ_1 # i iterates row wise
  a = max(i)
  new_row = i/a # Element-Wise Division
  cgr_vec = cgr_vec + new_row
  
cgr_vec_2 = Empty Vector()
for i <- cgr matrix of SEQ_2 # i iterates row wise
  a = max(i)
  new_row = i/a # Element-Wise Division
  cgr_vec_2 = cgr_vec_2 + new_row

Correlation(cgr_vec,cgr_vec_2)

cgr_vec and cgr_vec_2 are vectors whose similarity can be measured with Spearman's correlation.
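The pseudocode above could be written in Python roughly as follows. This is a sketch under stated assumptions, not the repository's code: the helper names `row_normalised_vector`, `average_ranks` and `spearman` are hypothetical, `+` in the pseudocode is read as concatenation, and all-zero rows are left as zeros to avoid division by zero. Spearman's correlation is computed here as the Pearson correlation of tie-averaged ranks, using NumPy only.

```python
import numpy as np


def row_normalised_vector(cgr):
    """Divide each row by its maximum and concatenate the rows into one vector."""
    cgr = np.asarray(cgr, dtype=float)
    row_max = cgr.max(axis=1, keepdims=True)
    # Assumption: a row whose maximum is 0 is kept as zeros instead of dividing by 0.
    safe_max = np.where(row_max == 0, 1.0, row_max)
    return (cgr / safe_max).ravel()  # row-major order, i.e. row after row


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)      # last rank occupied by each distinct value
    starts = ends - counts + 1    # first rank occupied by each distinct value
    return ((starts + ends) / 2.0)[inverse]


def spearman(vec_1, vec_2):
    """Spearman's correlation: Pearson correlation of the ranked vectors."""
    return np.corrcoef(average_ranks(vec_1), average_ranks(vec_2))[0, 1]


# Toy matrices standing in for the CGR matrices of SEQ_1 and SEQ_2.
cgr_1 = np.array([[0.2, 0.4, 0.1], [1.0, 0.5, 0.25]])
cgr_2 = np.array([[0.3, 0.6, 0.1], [0.8, 0.2, 0.4]])

cgr_vec = row_normalised_vector(cgr_1)
cgr_vec_2 = row_normalised_vector(cgr_2)
print(spearman(cgr_vec, cgr_vec_2))
```

To restrict the comparison to the first two rows, as the text states, pass a slice such as `cgr_1[:2]`. `scipy.stats.spearmanr` computes the same statistic if SciPy is available.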


Running the GUI

The GUI is publicly accessible via the link above. To run it locally:

git clone https://github.com/Rohith-2/Chaos-Game-Representation_BioSeq.git
cd Chaos-Game-Representation_BioSeq
pip install -r requirements.txt
cd GUI/
streamlit run gui_v2.py