diff --git a/Music recommendation feature.ipynb b/Music recommendation feature.ipynb
new file mode 100644
index 00000000..ae88dc6c
--- /dev/null
+++ b/Music recommendation feature.ipynb
@@ -0,0 +1 @@
+{"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.10.13","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"},"kaggle":{"accelerator":"none","dataSources":[{"sourceId":8726191,"sourceType":"datasetVersion","datasetId":5236926}],"isInternetEnabled":false,"language":"python","sourceType":"notebook","isGpuEnabled":false}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Content Based Recommendation Systems\n\nA recommendation system (or recommender system) is a class of machine learning system that uses data to help predict, narrow down, and find what people are looking for among an exponentially growing number of options.\n\nRecommendation systems fall into three categories:\n\n* Collaborative Filtering\n* Content Based RS\n* Hybrid Models\n\nIn this notebook we are going to discuss Content Based RS.\n\n## Content Based Recommendation Systems\n\n* Content-based filtering methods are based on a description of the item and a profile of the user's preferences. These methods are best suited to situations where there is known data on an item (name, location, description, etc.), but not on the user. Content-based recommenders treat recommendation as a user-specific classification problem and learn a classifier for the user's likes and dislikes based on an item's features.\n* Models such as TF-IDF and Word2Vec are used to capture similarity between items.\n* A key strength is that a newly added item can be recommended right away, because only its own features are needed. 
\n* A key issue with content-based filtering is whether the system can learn user preferences from users' actions regarding one content source and use them across other content types. When the system is limited to recommending content of the same type as the user is already using, the value from the recommendation system is significantly less than when other content types from other services can be recommended.\n* To overcome this, most content-based recommender systems now use some form of hybrid system.\n* Content-based recommender systems can also include opinion-based recommender systems.","metadata":{"_uuid":"211893b3-b34f-45c2-8bba-ae96b890ff26","_cell_guid":"64ffe843-0f51-4bf5-9d01-3479b120064d","trusted":true}},{"cell_type":"markdown","source":"## What is TF-IDF?\n\nTF-IDF stands for Term Frequency-Inverse Document Frequency. It measures how relevant a word is to a document within a collection (corpus) of documents. This relevance increases proportionally to the number of times the word appears in the document, but is offset by how frequently the word appears across the corpus (dataset).\n\nTF-IDF is a statistical weight that expresses how important a word is to a document. It is used in many domains (sentiment analysis, recommender systems, stop-word filtering, etc.). The method has two parts. First, we will look at Term Frequency (TF).\n\n### Term Frequency\n\nThe term frequency is the number of times a given word t occurs in document d. A word becomes more relevant to a document the more often it appears there, which is intuitive. Since the ordering of terms is not significant, we can describe the text with a vector, as in the bag-of-words model. For each distinct term in the document, there is an entry whose value is the term frequency.\nThe weight of a term that occurs in a document is simply proportional to the term frequency.\n\n### Inverse Document Frequency\n\nInverse document frequency measures how informative a word is across the corpus. 
The key aim of a search is to locate the records that best fit the query. Since TF treats all terms as equally significant, term frequencies alone are not enough to measure the weight of a term in a document. First, find the document frequency of a term t by counting the number of documents containing the term:\n\n$$\\mathrm{df}(t) = \\text{number of documents that contain } t$$\n\nA common definition of the inverse document frequency, with N the total number of documents, is then:\n\n$$\\mathrm{idf}(t) = \\log \\frac{N}{\\mathrm{df}(t)}$$\n\n**The TF-IDF weight is the product of the TF value and the IDF value (TF * IDF).**\n\nThe model in this notebook uses the simpler term-count representation from scikit-learn's `CountVectorizer` rather than full TF-IDF weights. Similarity between items is then measured with cosine similarity.","metadata":{"_uuid":"ef5594de-1f51-4f29-a84f-19136909b11e","_cell_guid":"87fdf135-277f-4fce-af94-b69dcb9ff45e","trusted":true}},{"cell_type":"markdown","source":"","metadata":{"_uuid":"2f2f516a-e4cc-4223-94d7-9e373614f9ef","_cell_guid":"2db1677c-31ac-4739-bb17-934203a8c590","trusted":true}},{"cell_type":"code","source":"# This Python 3 environment comes with many helpful analytics libraries installed\n# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python\n# For example, here's several helpful packages to load\n\nimport numpy as np # linear algebra\nimport pandas as pd # data processing, CSV file I/O (e.g. 
pd.read_csv)\n\n# For Text\n\nimport matplotlib.pyplot as plt\nimport seaborn as sb\n\nfrom sklearn.metrics.pairwise import cosine_similarity\nfrom sklearn.feature_extraction.text import CountVectorizer\nfrom sklearn.manifold import TSNE\n\nimport warnings\nwarnings.filterwarnings('ignore')\n\n\n# Capture similarity \nfrom sklearn.metrics.pairwise import linear_kernel\n\nimport os\nfor dirname, _, filenames in os.walk('/kaggle/input'):\n    for filename in filenames:\n        print(os.path.join(dirname, filename))\n\n# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using \"Save & Run All\" \n# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session","metadata":{"_uuid":"969f5946-9174-4282-a6ad-d3735e6d38ae","_cell_guid":"c91fd311-1267-464c-9f00-6d899c98ac9a","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:35.099082Z","iopub.execute_input":"2024-07-18T06:43:35.099880Z","iopub.status.idle":"2024-07-18T06:43:37.958493Z","shell.execute_reply.started":"2024-07-18T06:43:35.099838Z","shell.execute_reply":"2024-07-18T06:43:37.957125Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"Let's get to know our 
dataset.","metadata":{"_uuid":"f815a9d3-982d-4783-8ff2-a97788b32b76","_cell_guid":"d0583ba6-7426-4f13-b439-4f1a1a4f063c","trusted":true}},{"cell_type":"code","source":"\n\ndata=pd.read_csv(\"/kaggle/input/musicaldata/musicaldata.csv\")\ndata.head(4000)","metadata":{"_uuid":"7835eb94-3a34-4253-a5d9-5ed64b45283d","_cell_guid":"0ff8ee63-f975-46ed-8346-450fef2d00f1","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:37.960867Z","iopub.execute_input":"2024-07-18T06:43:37.961541Z","iopub.status.idle":"2024-07-18T06:43:38.037706Z","shell.execute_reply.started":"2024-07-18T06:43:37.961497Z","shell.execute_reply":"2024-07-18T06:43:38.036554Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"> The recommendations are based on the ` genre` and ` mother tongue` text columns together with the numeric emotion, mood, and listener columns. Note that most column names in this dataset begin with a leading space.","metadata":{"_uuid":"f47c0286-051f-44b7-8f3b-dff2636c57a2","_cell_guid":"5e4f0150-982c-40c2-ae42-df0e5b50ca14","trusted":true}},{"cell_type":"markdown","source":"Drop NaN values from these columns so that a proper matrix of similarity values can be built from the selected 
strings.","metadata":{"_uuid":"9e742cf3-6450-474e-b320-42275dea7b1b","_cell_guid":"4d9511c6-3b3c-4a66-b1f8-b77c1eb16472","trusted":true}},{"cell_type":"code","source":"data.shape","metadata":{"_uuid":"86c52013-2fe9-4582-826e-83e63d83a71c","_cell_guid":"546f44f4-1e9f-4316-b67a-4147d1cc61ac","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:38.039377Z","iopub.execute_input":"2024-07-18T06:43:38.040377Z","iopub.status.idle":"2024-07-18T06:43:38.048032Z","shell.execute_reply.started":"2024-07-18T06:43:38.040335Z","shell.execute_reply":"2024-07-18T06:43:38.046782Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data.info()","metadata":{"_uuid":"c20641aa-17af-4e43-ab22-671baaf3af06","_cell_guid":"5ccfef9a-1c31-4211-977d-71895d623c2e","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:38.049427Z","iopub.execute_input":"2024-07-18T06:43:38.050029Z","iopub.status.idle":"2024-07-18T06:43:38.081228Z","shell.execute_reply.started":"2024-07-18T06:43:38.049999Z","shell.execute_reply":"2024-07-18T06:43:38.079843Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data.isnull().sum()","metadata":{"_uuid":"488e4a21-2e98-4b64-b4b7-1f3a8774d197","_cell_guid":"8782b655-413f-407e-b6b0-95f80f0c86c4","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:38.085155Z","iopub.execute_input":"2024-07-18T06:43:38.085578Z","iopub.status.idle":"2024-07-18T06:43:38.097471Z","shell.execute_reply.started":"2024-07-18T06:43:38.085546Z","shell.execute_reply":"2024-07-18T06:43:38.096046Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data.dropna(inplace = 
True)\ndata.isnull().sum().plot.bar()\nplt.show()\n","metadata":{"execution":{"iopub.status.busy":"2024-07-18T06:43:38.098926Z","iopub.execute_input":"2024-07-18T06:43:38.099358Z","iopub.status.idle":"2024-07-18T06:43:38.469804Z","shell.execute_reply.started":"2024-07-18T06:43:38.099322Z","shell.execute_reply":"2024-07-18T06:43:38.468707Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data = data.drop(['track id'], axis = 1)\n\n\ndata","metadata":{"_uuid":"705daa1b-1a41-4765-a5c4-ca06243f8e73","_cell_guid":"9587614f-971d-4d90-a243-7059d10e7a53","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:38.471285Z","iopub.execute_input":"2024-07-18T06:43:38.472181Z","iopub.status.idle":"2024-07-18T06:43:38.503136Z","shell.execute_reply.started":"2024-07-18T06:43:38.472139Z","shell.execute_reply":"2024-07-18T06:43:38.501725Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data = data.sort_values(by=[' mood'], ascending=False)\ndata\n","metadata":{"_uuid":"7d10088e-9fb4-4e32-8a52-f0355a04e139","_cell_guid":"54bd07ed-5919-4c34-b8c2-537fc8bb4c63","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T06:43:38.504752Z","iopub.execute_input":"2024-07-18T06:43:38.505194Z","iopub.status.idle":"2024-07-18T06:43:38.531330Z","shell.execute_reply.started":"2024-07-18T06:43:38.505155Z","shell.execute_reply":"2024-07-18T06:43:38.530232Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"%%capture\nsong_vectorizer = CountVectorizer(lowercase=False)\nsong_vectorizer.fit(data[' 
genre'])\n","metadata":{"_uuid":"f4e99d39-6244-4a54-bd59-da0c0b4eb263","_cell_guid":"4cc6a4dd-69da-4dd2-9e9f-3db4ad1bf9ec","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T07:20:14.961494Z","iopub.execute_input":"2024-07-18T07:20:14.962320Z","iopub.status.idle":"2024-07-18T07:20:15.016863Z","shell.execute_reply.started":"2024-07-18T07:20:14.962281Z","shell.execute_reply":"2024-07-18T07:20:15.015952Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"27044823-561d-4684-9c1e-b58330477a09","_cell_guid":"b02809b0-533d-4f2a-a814-bcb371d4b623","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data = data.sort_values(by=[' mood'], ascending=False).head(8407)\n","metadata":{"_uuid":"9b9c0d05-34c7-4fbe-b3e8-3da5311f79aa","_cell_guid":"0895c987-d693-4ec5-98f4-52964893929f","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T07:20:18.625596Z","iopub.execute_input":"2024-07-18T07:20:18.626296Z","iopub.status.idle":"2024-07-18T07:20:18.633823Z","shell.execute_reply.started":"2024-07-18T07:20:18.626259Z","shell.execute_reply":"2024-07-18T07:20:18.632754Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"def get_similarities(song_genre, datas):\n    # Getting vector for the input song.\n    # Note: song_vectorizer was fitted on ' genre', so only tokens that also occur there get non-zero counts here.\n    text_array1 = song_vectorizer.transform(datas[datas[' genre']==song_genre][' mother tongue']).toarray()\n    num_array1 = datas[datas[' genre']==song_genre].select_dtypes(include=np.number).to_numpy()\n\n    # We will store similarity for each row of the dataset.\n    sim = []\n    for idx, row in datas.iterrows():\n        genre = row[' genre']\n\n        # Getting vector for current song.\n        text_array2 = song_vectorizer.transform(datas[datas[' genre']==genre][' mother tongue']).toarray()\n        num_array2 = datas[datas[' genre']==genre].select_dtypes(include=np.number).to_numpy()\n\n        # Calculating similarities for text as well as numeric features\n        text_sim = cosine_similarity(text_array1, text_array2)[0][0]\n        num_sim = cosine_similarity(num_array1, num_array2)[0][0]\n        sim.append(text_sim + num_sim)\n\n    return sim\n","metadata":{"_uuid":"7f64a3c5-6644-4500-b768-9588a1a1796e","_cell_guid":"f6ba9b5c-1cd6-4329-9716-768fc66b6185","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T07:21:22.321219Z","iopub.execute_input":"2024-07-18T07:21:22.322530Z","iopub.status.idle":"2024-07-18T07:21:22.332061Z","shell.execute_reply.started":"2024-07-18T07:21:22.322483Z","shell.execute_reply":"2024-07-18T07:21:22.330729Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"def recommend_songs(song_genre, datas=data):\n    # Base case: the genre is not in the dataset, so print a random sample instead.\n    if datas[datas[' genre'] == song_genre].shape[0] == 0:\n        print('This song is not so popular')\n\n        for song in datas.sample(n=5)[' genre'].values:\n            print(song)\n        return\n\n    datas['similarity_factor'] = get_similarities(song_genre, datas)\n\n    datas.sort_values(by=['similarity_factor', ' mood'],\n                      ascending = [False, False],\n                      inplace=True)\n\n    # First song will be the input song itself as the similarity will be highest.\n    display(datas[[' genre', ' age', ' mother 
tongue']][2:7])\n","metadata":{"_uuid":"f42b9e20-bfde-468c-a061-baede0e7acc9","_cell_guid":"16949b27-8cac-4a55-84f8-c0109e8141f5","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T07:32:45.646510Z","iopub.execute_input":"2024-07-18T07:32:45.646931Z","iopub.status.idle":"2024-07-18T07:32:45.656010Z","shell.execute_reply.started":"2024-07-18T07:32:45.646887Z","shell.execute_reply":"2024-07-18T07:32:45.654684Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"recommend_songs('classical')\n","metadata":{"_uuid":"123ddf07-1178-4198-82ac-3e4b0077b682","_cell_guid":"5c7396f0-7d3a-48f5-8c60-9dc8a74d671f","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-07-18T07:50:20.350621Z","iopub.execute_input":"2024-07-18T07:50:20.351890Z","iopub.status.idle":"2024-07-18T07:59:03.032540Z","shell.execute_reply.started":"2024-07-18T07:50:20.351829Z","shell.execute_reply":"2024-07-18T07:59:03.031273Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"8fc6e3b5-9204-4b61-81f1-1ad298187b0b","_cell_guid":"0d8ef6be-f82b-4727-bf71-ae5e3b93b065","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"db38054b-03da-4b74-90c8-da0952e22e97","_cell_guid":"00755ea1-9ccb-48c3-b5bf-ed8502dba41a","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"ebeabf14-181a-4931-a489-3692eaa20b94","_cell_guid":"2c060445-e4a7-47f3-bc6e-77b60ca5f21d","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"fd7cc031-07e1-470a-b04e-e4914737aa34","_cell_guid":"f34ee388-4084-468f-bce4-f85059694cda","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"\n","metadata":{"_uuid":"8fe27341-f205-43b1-bbb0-1aeac550fa46","_cell_guid":"e6249211-42c8-4197-ab2d-cedf58afcd17","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"In the output above, the model returned the recommendations I wanted, based on the mood of songs across different genres.\n\n# Conclusion\n\n* In this notebook, I tried to explain content-based recommendation systems.\n* Content-based RS models are powerful for recommending new items.\n* In general, TF-IDF and Word2Vec models are used when designing a content-based RS.\n* TF-IDF is a method that weights how frequent a word is in a document against how common it is across the corpus.\n* A recommendation model can be built by applying cosine distance to TF-IDF weights.\n\n","metadata":{"_uuid":"a5fbdfc2-4e81-4d76-985e-c10441e7fe32","_cell_guid":"8d31b41f-78cd-4b03-b692-8644e3d9a1be","trusted":true}}]}
diff --git a/notebook10298254f6.ipynb b/notebook10298254f6.ipynb
new file mode 100644
index 00000000..6e2e6500
--- /dev/null
+++ b/notebook10298254f6.ipynb
@@ -0,0 +1 @@
+{"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.10.14","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"},"kaggle":{"accelerator":"none","dataSources":[{"sourceId":8726191,"sourceType":"datasetVersion","datasetId":5236926}],"isInternetEnabled":false,"language":"python","sourceType":"notebook","isGpuEnabled":false}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"# Content Based Recommendation Systems\n\nA recommendation system (or recommender system) is a class of machine learning system that uses data to help predict, narrow 
down, and find what people are looking for among an exponentially growing number of options.\n\nRecommendation systems fall into three categories:\n\n* Collaborative Filtering\n* Content Based RS\n* Hybrid Models\n\nIn this notebook we are going to discuss Content Based RS.\n\n## Content Based Recommendation Systems\n\n* Content-based filtering methods are based on a description of the item and a profile of the user's preferences. These methods are best suited to situations where there is known data on an item (name, location, description, etc.), but not on the user. Content-based recommenders treat recommendation as a user-specific classification problem and learn a classifier for the user's likes and dislikes based on an item's features.\n* Models such as TF-IDF and Word2Vec are used to capture similarity between items.\n* A key strength is that a newly added item can be recommended right away, because only its own features are needed.\n* A key issue with content-based filtering is whether the system can learn user preferences from users' actions regarding one content source and use them across other content types. When the system is limited to recommending content of the same type as the user is already using, the value from the recommendation system is significantly less than when other content types from other services can be recommended.\n* To overcome this, most content-based recommender systems now use some form of hybrid system.\n* Content-based recommender systems can also include opinion-based recommender systems.","metadata":{"_uuid":"cca15c56-4f26-4e9a-b4a8-2eca1c5ab108","_cell_guid":"d594d373-a86f-489c-87d1-45302b1b86a3","trusted":true}},{"cell_type":"markdown","source":"## What is TF-IDF?\n\nTF-IDF stands for Term Frequency-Inverse Document Frequency. It measures how relevant a word is to a document within a collection (corpus) of documents. 
This relevance increases proportionally to the number of times the word appears in the document, but is offset by how frequently the word appears across the corpus (dataset).\n\nTF-IDF is a statistical weight that expresses how important a word is to a document. It is used in many domains (sentiment analysis, recommender systems, stop-word filtering, etc.). The method has two parts. First, we will look at Term Frequency (TF).\n\n### Term Frequency\n\nThe term frequency is the number of times a given word t occurs in document d. A word becomes more relevant to a document the more often it appears there, which is intuitive. Since the ordering of terms is not significant, we can describe the text with a vector, as in the bag-of-words model. For each distinct term in the document, there is an entry whose value is the term frequency.\nThe weight of a term that occurs in a document is simply proportional to the term frequency.\n\n### Inverse Document Frequency\n\nInverse document frequency measures how informative a word is across the corpus. The key aim of a search is to locate the records that best fit the query. Since TF treats all terms as equally significant, term frequencies alone are not enough to measure the weight of a term in a document. First, find the document frequency of a term t by counting the number of documents containing the term:\n\n$$\\mathrm{df}(t) = \\text{number of documents that contain } t$$\n\nA common definition of the inverse document frequency, with N the total number of documents, is then:\n\n$$\\mathrm{idf}(t) = \\log \\frac{N}{\\mathrm{df}(t)}$$\n\n**The TF-IDF weight is the product of the TF value and the IDF value (TF * IDF).**\n\nThe model in this notebook uses the simpler term-count representation from scikit-learn's `CountVectorizer` rather than full TF-IDF weights. 
Similarity between items is then measured with cosine similarity.","metadata":{"_uuid":"29f885c7-5339-4d8d-b99d-0585784e6cc0","_cell_guid":"7d0903c4-9308-4059-9076-d544ee9f8224","trusted":true}},{"cell_type":"markdown","source":"","metadata":{"_uuid":"f3c53cfc-5bc3-4ff8-91df-f20953eac67a","_cell_guid":"2b2b995e-6d6c-4452-9948-0bbb90228ada","trusted":true}},{"cell_type":"code","source":"# This Python 3 environment comes with many helpful analytics libraries installed\n# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python\n# For example, here's several helpful packages to load\n\nimport numpy as np # linear algebra\nimport pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n\n# For Text\n\nimport matplotlib.pyplot as plt\nimport seaborn as sb\n\nfrom sklearn.metrics.pairwise import cosine_similarity\nfrom sklearn.feature_extraction.text import CountVectorizer\nfrom sklearn.manifold import TSNE\n\nimport warnings\nwarnings.filterwarnings('ignore')\n\n\n# Capture similarity \nfrom sklearn.metrics.pairwise import linear_kernel\n\nimport os\nfor dirname, _, filenames in os.walk('/kaggle/input'):\n    for filename in filenames:\n        print(os.path.join(dirname, filename))\n\n# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using \"Save & Run All\" \n# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current 
session","metadata":{"_uuid":"fdaf36cf-4d9e-4ce2-9360-8417b6e445a1","_cell_guid":"c7427ab3-c6a6-433e-aa09-8469f15ff2fa","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:06.194729Z","iopub.execute_input":"2024-10-13T07:33:06.195208Z","iopub.status.idle":"2024-10-13T07:33:07.965888Z","shell.execute_reply.started":"2024-10-13T07:33:06.195163Z","shell.execute_reply":"2024-10-13T07:33:07.964662Z"},"trusted":true},"execution_count":2,"outputs":[{"name":"stdout","text":"/kaggle/input/musicaldata/musicaldata.csv\n","output_type":"stream"}]},{"cell_type":"markdown","source":"Let's get to know our dataset.","metadata":{"_uuid":"3c7eb90b-9f18-4613-ad7f-e6ce5f313c94","_cell_guid":"d039e612-5229-47f1-9b5c-0f3ced80c3f8","trusted":true}},{"cell_type":"code","source":"\n\ndata=pd.read_csv(\"/kaggle/input/musicaldata/musicaldata.csv\")\ndata.head(4000)","metadata":{"_uuid":"c0659919-7412-4f17-9e3c-62ea3bc44e4a","_cell_guid":"880bd5b0-e0cf-4df8-9da4-d40a3718e190","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:07.968352Z","iopub.execute_input":"2024-10-13T07:33:07.969507Z","iopub.status.idle":"2024-10-13T07:33:08.037031Z","shell.execute_reply.started":"2024-10-13T07:33:07.969427Z","shell.execute_reply":"2024-10-13T07:33:08.035890Z"},"trusted":true},"execution_count":3,"outputs":[{"execution_count":3,"output_type":"execute_result","data":{"text/plain":" track id genre amazement solemnity tenderness nostalgia \\\n0 1 classical 0 1 0 0 \n1 1 classical 0 0 0 1 \n2 1 classical 0 0 0 1 \n3 1 classical 0 0 0 0 \n4 1 classical 0 0 0 1 \n... ... ... ... ... ... ... \n3995 149 rock 0 0 1 0 \n3996 149 rock 0 0 0 1 \n3997 150 rock 0 0 0 0 \n3998 150 rock 0 1 0 0 \n3999 150 rock 0 1 0 1 \n\n calmness power joyful_activation tension sadness mood \\\n0 0 0 1 1 0 3 \n1 0 0 0 0 0 3 \n2 0 0 0 0 1 3 \n3 1 0 0 0 0 3 \n4 1 0 0 0 0 4 \n... ... ... ... ... ... ... 
\n3995 1 0 0 0 1 4 \n3996 1 0 0 0 1 3 \n3997 0 0 0 1 0 3 \n3998 1 0 0 0 0 3 \n3999 0 1 0 0 0 2 \n\n liked disliked age gender mother tongue \n0 1 0 21 1 English \n1 0 1 41 1 Dutch \n2 0 0 24 1 English \n3 0 0 32 0 Spanish \n4 0 1 21 0 English \n... ... ... ... ... ... \n3995 1 0 25 1 Dutch \n3996 1 0 24 0 French \n3997 0 1 38 0 English \n3998 0 0 23 0 Dutch \n3999 0 0 38 0 Dutch \n\n[4000 rows x 17 columns]","text/html":"
<div>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr><th></th><th>track id</th><th>genre</th><th>amazement</th><th>solemnity</th><th>tenderness</th><th>nostalgia</th><th>calmness</th><th>power</th><th>joyful_activation</th><th>tension</th><th>sadness</th><th>mood</th><th>liked</th><th>disliked</th><th>age</th><th>gender</th><th>mother tongue</th></tr>\n  </thead>\n  <tbody>\n    <tr><th>0</th><td>1</td><td>classical</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>3</td><td>1</td><td>0</td><td>21</td><td>1</td><td>English</td></tr>\n    <tr><th>1</th><td>1</td><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3</td><td>0</td><td>1</td><td>41</td><td>1</td><td>Dutch</td></tr>\n    <tr><th>2</th><td>1</td><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>3</td><td>0</td><td>0</td><td>24</td><td>1</td><td>English</td></tr>\n    <tr><th>3</th><td>1</td><td>classical</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3</td><td>0</td><td>0</td><td>32</td><td>0</td><td>Spanish</td></tr>\n    <tr><th>4</th><td>1</td><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>4</td><td>0</td><td>1</td><td>21</td><td>0</td><td>English</td></tr>\n    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n    <tr><th>3995</th><td>149</td><td>rock</td><td>0</td><td>0</td><td>1</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>1</td><td>4</td><td>1</td><td>0</td><td>25</td><td>1</td><td>Dutch</td></tr>\n    <tr><th>3996</th><td>149</td><td>rock</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td><td>0</td><td>1</td><td>3</td><td>1</td><td>0</td><td>24</td><td>0</td><td>French</td></tr>\n    <tr><th>3997</th><td>150</td><td>rock</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>3</td><td>0</td><td>1</td><td>38</td><td>0</td><td>English</td></tr>\n    <tr><th>3998</th><td>150</td><td>rock</td><td>0</td><td>1</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3</td><td>0</td><td>0</td><td>23</td><td>0</td><td>Dutch</td></tr>\n    <tr><th>3999</th><td>150</td><td>rock</td><td>0</td><td>1</td><td>0</td><td>1</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>2</td><td>0</td><td>0</td><td>38</td><td>0</td><td>Dutch</td></tr>\n  </tbody>\n</table>\n<p>4000 rows × 17 columns</p>\n</div>"},"metadata":{}}]},{"cell_type":"markdown","source":"> The recommendations are based on the ` genre` and ` mother tongue` text columns together with the numeric emotion, mood, and listener columns. Note that most column names in this dataset begin with a leading space.","metadata":{"_uuid":"9c48687d-089d-4973-a961-1f0f3fd64222","_cell_guid":"bcbed148-1afc-4856-897c-2785218373a6","trusted":true}},{"cell_type":"markdown","source":"Drop NaN values from these columns so that a proper matrix of similarity values can be built from the selected strings.","metadata":{"_uuid":"ee78398d-7cba-4648-82c2-623599c65af1","_cell_guid":"e14fb5c6-5ecc-4d93-a689-051207177a1b","trusted":true}},{"cell_type":"code","source":"data.shape","metadata":{"_uuid":"1189e1aa-8de4-4b1d-9e73-9efc27b0a042","_cell_guid":"f9eb8186-2ed0-4582-9a22-b42262396c99","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.038298Z","iopub.execute_input":"2024-10-13T07:33:08.038675Z","iopub.status.idle":"2024-10-13T07:33:08.046585Z","shell.execute_reply.started":"2024-10-13T07:33:08.038636Z","shell.execute_reply":"2024-10-13T07:33:08.045399Z"},"trusted":true},"execution_count":4,"outputs":[{"execution_count":4,"output_type":"execute_result","data":{"text/plain":"(8407, 17)"},"metadata":{}}]},{"cell_type":"code","source":"data.info()","metadata":{"_uuid":"57deefcf-30eb-49c3-8d31-0ed102606343","_cell_guid":"f976d69f-6068-495e-8c26-877647ac4072","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.048210Z","iopub.execute_input":"2024-10-13T07:33:08.048680Z","iopub.status.idle":"2024-10-13T07:33:08.080505Z","shell.execute_reply.started":"2024-10-13T07:33:08.048626Z","shell.execute_reply":"2024-10-13T07:33:08.079336Z"},"trusted":true},"execution_count":5,"outputs":[{"name":"stdout","text":"<class 'pandas.core.frame.DataFrame'>\nRangeIndex: 8407 entries, 0 to 8406\nData columns (total 17 columns):\n # Column Non-Null Count Dtype \n--- ------ -------------- ----- \n 0 track id 
8407 non-null int64 \n 1 genre 8407 non-null object\n 2 amazement 8407 non-null int64 \n 3 solemnity 8407 non-null int64 \n 4 tenderness 8407 non-null int64 \n 5 nostalgia 8407 non-null int64 \n 6 calmness 8407 non-null int64 \n 7 power 8407 non-null int64 \n 8 joyful_activation 8407 non-null int64 \n 9 tension 8407 non-null int64 \n 10 sadness 8407 non-null int64 \n 11 mood 8407 non-null int64 \n 12 liked 8407 non-null int64 \n 13 disliked 8407 non-null int64 \n 14 age 8407 non-null int64 \n 15 gender 8407 non-null int64 \n 16 mother tongue 8407 non-null object\ndtypes: int64(15), object(2)\nmemory usage: 1.1+ MB\n","output_type":"stream"}]},{"cell_type":"code","source":"data.isnull().sum()","metadata":{"_uuid":"8f43e982-b57f-44d4-8d27-1278f4d02e33","_cell_guid":"b58e0982-ab1a-46b4-92b5-5e160bcf3de5","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.083661Z","iopub.execute_input":"2024-10-13T07:33:08.084062Z","iopub.status.idle":"2024-10-13T07:33:08.095794Z","shell.execute_reply.started":"2024-10-13T07:33:08.084022Z","shell.execute_reply":"2024-10-13T07:33:08.094615Z"},"trusted":true},"execution_count":6,"outputs":[{"execution_count":6,"output_type":"execute_result","data":{"text/plain":"track id 0\n genre 0\n amazement 0\n solemnity 0\n tenderness 0\n nostalgia 0\n calmness 0\n power 0\n joyful_activation 0\n tension 0\n sadness 0\n mood 0\n liked 0\n disliked 0\n age 0\n gender 0\n mother tongue 0\ndtype: int64"},"metadata":{}}]},{"cell_type":"code","source":"data.dropna(inplace = 
True)\ndata.isnull().sum().plot.bar()\nplt.show()","metadata":{"_uuid":"1323b0f8-6392-4630-9228-cdd495832606","_cell_guid":"43838d57-110b-4cc7-b9c7-af19f24439a3","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.097340Z","iopub.execute_input":"2024-10-13T07:33:08.097826Z","iopub.status.idle":"2024-10-13T07:33:08.476431Z","shell.execute_reply.started":"2024-10-13T07:33:08.097775Z","shell.execute_reply":"2024-10-13T07:33:08.475143Z"},"trusted":true},"execution_count":7,"outputs":[{"output_type":"display_data","data":{"text/plain":"
","image/png":""},"metadata":{}}]},{"cell_type":"code","source":"data = data.drop(['track id'], axis = 1)\n\n\ndata","metadata":{"_uuid":"7544d9ad-5943-4c3e-b0a9-d182076eea2b","_cell_guid":"e548f824-acef-4dba-8309-c81ba42283ca","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.478236Z","iopub.execute_input":"2024-10-13T07:33:08.478837Z","iopub.status.idle":"2024-10-13T07:33:08.503828Z","shell.execute_reply.started":"2024-10-13T07:33:08.478782Z","shell.execute_reply":"2024-10-13T07:33:08.502655Z"},"trusted":true},"execution_count":8,"outputs":[{"execution_count":8,"output_type":"execute_result","data":{"text/plain":" genre amazement solemnity tenderness nostalgia calmness power \\\n0 classical 0 1 0 0 0 0 \n1 classical 0 0 0 1 0 0 \n2 classical 0 0 0 1 0 0 \n3 classical 0 0 0 0 1 0 \n4 classical 0 0 0 1 1 0 \n... ... ... ... ... ... ... ... \n8402 pop 1 1 0 0 0 0 \n8403 pop 0 0 0 1 0 0 \n8404 pop 0 0 0 0 0 0 \n8405 pop 1 0 0 0 0 0 \n8406 pop 1 0 0 0 0 0 \n\n joyful_activation tension sadness mood liked disliked age gender \\\n0 1 1 0 3 1 0 21 1 \n1 0 0 0 3 0 1 41 1 \n2 0 0 1 3 0 0 24 1 \n3 0 0 0 3 0 0 32 0 \n4 0 0 0 4 0 1 21 0 \n... ... ... ... ... ... ... ... ... \n8402 1 0 0 3 0 0 26 1 \n8403 1 0 1 3 0 1 29 0 \n8404 0 1 0 4 0 1 34 1 \n8405 1 1 0 5 0 0 39 1 \n8406 0 1 0 4 0 1 18 1 \n\n mother tongue \n0 English \n1 Dutch \n2 English \n3 Spanish \n4 English \n... ... \n8402 Russian \n8403 Russian \n8404 Polish \n8405 French \n8406 Russian \n\n[8407 rows x 16 columns]","text/html":"
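<div>\n<table>\n<thead>\n<tr><th></th><th>genre</th><th>amazement</th><th>solemnity</th><th>tenderness</th><th>nostalgia</th><th>calmness</th><th>power</th><th>joyful_activation</th><th>tension</th><th>sadness</th><th>mood</th><th>liked</th><th>disliked</th><th>age</th><th>gender</th><th>mother tongue</th></tr>\n</thead>\n<tbody>\n<tr><th>0</th><td>classical</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>3</td><td>1</td><td>0</td><td>21</td><td>1</td><td>English</td></tr>\n<tr><th>1</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3</td><td>0</td><td>1</td><td>41</td><td>1</td><td>Dutch</td></tr>\n<tr><th>2</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>3</td><td>0</td><td>0</td><td>24</td><td>1</td><td>English</td></tr>\n<tr><th>3</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>3</td><td>0</td><td>0</td><td>32</td><td>0</td><td>Spanish</td></tr>\n<tr><th>4</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>4</td><td>0</td><td>1</td><td>21</td><td>0</td><td>English</td></tr>\n<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n<tr><th>8402</th><td>pop</td><td>1</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>3</td><td>0</td><td>0</td><td>26</td><td>1</td><td>Russian</td></tr>\n<tr><th>8403</th><td>pop</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>1</td><td>0</td><td>1</td><td>3</td><td>0</td><td>1</td><td>29</td><td>0</td><td>Russian</td></tr>\n<tr><th>8404</th><td>pop</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>4</td><td>0</td><td>1</td><td>34</td><td>1</td><td>Polish</td></tr>\n<tr><th>8405</th><td>pop</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>5</td><td>0</td><td>0</td><td>39</td><td>1</td><td>French</td></tr>\n<tr><th>8406</th><td>pop</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>4</td><td>0</td><td>1</td><td>18</td><td>1</td><td>Russian</td></tr>\n</tbody>\n</table>\n<p>8407 rows × 16 columns</p>\n</div>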
"},"metadata":{}}]},{"cell_type":"code","source":"","metadata":{"_uuid":"1b44f336-d75d-4645-b483-44e5f887b560","_cell_guid":"b46575de-40e8-4968-b3a0-6e201c7a5169","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data = data.sort_values(by=[' mood'], ascending=False)\ndata","metadata":{"_uuid":"124f2541-78e1-4fa5-8a5e-77d695687d0f","_cell_guid":"8bee36e8-a88f-4dc5-bfad-9d8d7d09b749","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.505368Z","iopub.execute_input":"2024-10-13T07:33:08.505850Z","iopub.status.idle":"2024-10-13T07:33:08.531549Z","shell.execute_reply.started":"2024-10-13T07:33:08.505808Z","shell.execute_reply":"2024-10-13T07:33:08.530527Z"},"trusted":true},"execution_count":9,"outputs":[{"execution_count":9,"output_type":"execute_result","data":{"text/plain":" genre amazement solemnity tenderness nostalgia calmness \\\n5883 electronic 0 0 0 0 1 \n6562 pop 0 0 0 0 0 \n4458 rock 0 0 1 0 0 \n1178 classical 0 0 0 0 0 \n6543 electronic 0 0 0 0 0 \n... ... ... ... ... ... ... \n5334 electronic 0 0 0 0 0 \n5332 electronic 0 0 1 0 1 \n2031 classical 0 0 0 1 1 \n2345 classical 0 0 0 1 0 \n1295 classical 0 0 1 0 1 \n\n power joyful_activation tension sadness mood liked disliked age \\\n5883 0 0 0 0 5 1 0 53 \n6562 0 0 1 0 5 0 1 25 \n4458 0 0 0 0 5 0 1 38 \n1178 0 0 0 1 5 0 0 33 \n6543 1 1 0 0 5 1 0 54 \n... ... ... ... ... ... ... ... ... \n5334 0 1 0 0 1 1 0 20 \n5332 0 0 0 0 1 0 0 25 \n2031 0 0 0 0 1 0 0 25 \n2345 0 1 0 0 1 1 0 33 \n1295 0 0 0 0 1 0 0 30 \n\n gender mother tongue \n5883 0 English \n6562 1 English \n4458 0 Dutch \n1178 0 Russian \n6543 0 Estonian \n... ... ... \n5334 1 French \n5332 0 English \n2031 1 Russian \n2345 0 Russian \n1295 1 English \n\n[8407 rows x 16 columns]","text/html":"
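<div>\n<table>\n<thead>\n<tr><th></th><th>genre</th><th>amazement</th><th>solemnity</th><th>tenderness</th><th>nostalgia</th><th>calmness</th><th>power</th><th>joyful_activation</th><th>tension</th><th>sadness</th><th>mood</th><th>liked</th><th>disliked</th><th>age</th><th>gender</th><th>mother tongue</th></tr>\n</thead>\n<tbody>\n<tr><th>5883</th><td>electronic</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>5</td><td>1</td><td>0</td><td>53</td><td>0</td><td>English</td></tr>\n<tr><th>6562</th><td>pop</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>5</td><td>0</td><td>1</td><td>25</td><td>1</td><td>English</td></tr>\n<tr><th>4458</th><td>rock</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>5</td><td>0</td><td>1</td><td>38</td><td>0</td><td>Dutch</td></tr>\n<tr><th>1178</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>5</td><td>0</td><td>0</td><td>33</td><td>0</td><td>Russian</td></tr>\n<tr><th>6543</th><td>electronic</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td><td>5</td><td>1</td><td>0</td><td>54</td><td>0</td><td>Estonian</td></tr>\n<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n<tr><th>5334</th><td>electronic</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>20</td><td>1</td><td>French</td></tr>\n<tr><th>5332</th><td>electronic</td><td>0</td><td>0</td><td>1</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>25</td><td>0</td><td>English</td></tr>\n<tr><th>2031</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>25</td><td>1</td><td>Russian</td></tr>\n<tr><th>2345</th><td>classical</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>1</td><td>1</td><td>0</td><td>33</td><td>0</td><td>Russian</td></tr>\n<tr><th>1295</th><td>classical</td><td>0</td><td>0</td><td>1</td><td>0</td><td>1</td><td>0</td><td>0</td><td>0</td><td>0</td><td>1</td><td>0</td><td>0</td><td>30</td><td>1</td><td>English</td></tr>\n</tbody>\n</table>\n<p>8407 rows × 16 columns</p>\n</div>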
"},"metadata":{}}]},{"cell_type":"code","source":"%%capture\nsong_vectorizer = CountVectorizer(lowercase=False)\nsong_vectorizer.fit(data[' genre'])","metadata":{"_uuid":"7bbefcab-7638-47dd-b592-8e5571e572fd","_cell_guid":"54454efe-e88b-45a0-9d25-f80c5a0035ba","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.533041Z","iopub.execute_input":"2024-10-13T07:33:08.533504Z","iopub.status.idle":"2024-10-13T07:33:08.594800Z","shell.execute_reply.started":"2024-10-13T07:33:08.533433Z","shell.execute_reply":"2024-10-13T07:33:08.593722Z"},"trusted":true},"execution_count":10,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"92fb6efd-e669-4ec6-b8d5-9159816d3ccd","_cell_guid":"8bf19c95-4be4-441f-b10f-1e5753deeace","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"data = data.sort_values(by=[' mood'], ascending=False).head(8407)","metadata":{"_uuid":"37cb1e22-e2f2-4a5c-a1ab-2e1cbe3312c8","_cell_guid":"f8287be2-15db-4368-9cdd-a9380689fd06","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.596500Z","iopub.execute_input":"2024-10-13T07:33:08.596889Z","iopub.status.idle":"2024-10-13T07:33:08.603445Z","shell.execute_reply.started":"2024-10-13T07:33:08.596850Z","shell.execute_reply":"2024-10-13T07:33:08.602557Z"},"trusted":true},"execution_count":11,"outputs":[]},{"cell_type":"code","source":"def get_similarities(song_genre, datas):\n\n# Getting vector for the input song.\n text_array1 = song_vectorizer.transform(datas[datas[' genre']==song_genre][' mother tongue']).toarray()\n num_array1 = datas[datas[' genre']==song_genre].select_dtypes(include=np.number).to_numpy()\n\n# We will store similarity for each row of the dataset.\n sim = []\n for idx, row in data.iterrows():\n\t genre = row[' genre']\n\t\n\t# Getting vector for current song.\n\t text_array2 = 
song_vectorizer.transform(datas[datas[' genre']==genre][' mother tongue']).toarray()\n\t num_array2 = datas[datas[' genre']==genre].select_dtypes(include=np.number).to_numpy()\n\n\t# Calculating similarities for text as well as numeric features\n\t text_sim = cosine_similarity(text_array1, text_array2)[0][0]\n\t num_sim = cosine_similarity(num_array1, num_array2)[0][0]\n\t sim.append(text_sim + num_sim)\n\t\n return sim","metadata":{"_uuid":"2446d4e8-3d65-4c9f-9814-5117a85fbc4c","_cell_guid":"6ce791c5-fd3a-439c-b747-6861fc32c5c6","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.604749Z","iopub.execute_input":"2024-10-13T07:33:08.605122Z","iopub.status.idle":"2024-10-13T07:33:08.616061Z","shell.execute_reply.started":"2024-10-13T07:33:08.605065Z","shell.execute_reply":"2024-10-13T07:33:08.615025Z"},"trusted":true},"execution_count":12,"outputs":[]},{"cell_type":"code","source":"def recommend_songs(song_genre, datas=data):\n # Base case\n if data[data[' genre'] == song_genre].shape[0] == 0:\n print('This song is not so popular')\n \n for song in datas.sample(n=5)[' genre'].values:\n print(song)\n return\n \n datas['similarity_factor'] = get_similarities(song_genre, datas)\n \n datas.sort_values(by=['similarity_factor', ' mood'],\n ascending = [False, False],\n inplace=True)\n \n # First song will be the input song itself as the similarity will be highest.\n display(datas[[' genre', ' age', ' mother 
tongue']][2:7])","metadata":{"_uuid":"f9b1672e-b8d3-41f2-bda3-d18fc6306d93","_cell_guid":"18501619-640c-418d-bde4-a98dfe9325f9","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.617517Z","iopub.execute_input":"2024-10-13T07:33:08.617957Z","iopub.status.idle":"2024-10-13T07:33:08.628758Z","shell.execute_reply.started":"2024-10-13T07:33:08.617908Z","shell.execute_reply":"2024-10-13T07:33:08.627732Z"},"trusted":true},"execution_count":13,"outputs":[]},{"cell_type":"code","source":"recommend_songs('classical')","metadata":{"_uuid":"444cba3e-4bcc-4e0c-9f8d-920a04e37178","_cell_guid":"ce074402-e969-4fe1-85bd-c5908c6bff61","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:33:08.630468Z","iopub.execute_input":"2024-10-13T07:33:08.631030Z","iopub.status.idle":"2024-10-13T07:41:43.863235Z","shell.execute_reply.started":"2024-10-13T07:33:08.630969Z","shell.execute_reply":"2024-10-13T07:41:43.862031Z"},"trusted":true},"execution_count":14,"outputs":[{"output_type":"display_data","data":{"text/plain":" genre age mother tongue\n246 classical 45 Korean\n528 classical 46 English\n85 classical 34 Portuguese\n580 classical 46 English\n587 classical 55 Dutch","text/html":"
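<div>\n<table>\n<thead>\n<tr><th></th><th>genre</th><th>age</th><th>mother tongue</th></tr>\n</thead>\n<tbody>\n<tr><th>246</th><td>classical</td><td>45</td><td>Korean</td></tr>\n<tr><th>528</th><td>classical</td><td>46</td><td>English</td></tr>\n<tr><th>85</th><td>classical</td><td>34</td><td>Portuguese</td></tr>\n<tr><th>580</th><td>classical</td><td>46</td><td>English</td></tr>\n<tr><th>587</th><td>classical</td><td>55</td><td>Dutch</td></tr>\n</tbody>\n</table>\n</div>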
"},"metadata":{}}]},{"cell_type":"code","source":"\nimport streamlit as st\nst.header(\"music genre based age Recommendation System\" )","metadata":{"_uuid":"be98c715-a3eb-413d-9a68-b758d11db6f1","_cell_guid":"dc6a052e-9e86-4803-b4c3-1b3cc2feb275","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T08:01:06.254773Z","iopub.execute_input":"2024-10-13T08:01:06.256183Z","iopub.status.idle":"2024-10-13T08:02:24.385650Z","shell.execute_reply.started":"2024-10-13T08:01:06.256130Z","shell.execute_reply":"2024-10-13T08:02:24.383554Z"},"trusted":true},"execution_count":21,"outputs":[{"name":"stdout","text":"conda 24.9.0\n\u001b[33mWARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /simple/streamlit/\u001b[0m\u001b[33m\n\u001b[0m\u001b[33mWARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /simple/streamlit/\u001b[0m\u001b[33m\n\u001b[0m\u001b[33mWARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError(': Failed to establish a new connection: [Errno -3] Temporary failure in name resolution')': /simple/streamlit/\u001b[0m\u001b[33m\n\u001b[0m^C\n","output_type":"stream"},{"traceback":["\u001b[0;31m---------------------------------------------------------------------------\u001b[0m","\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)","Cell \u001b[0;32mIn[21], line 4\u001b[0m\n\u001b[1;32m 1\u001b[0m get_ipython()\u001b[38;5;241m.\u001b[39msystem(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mconda -V\u001b[39m\u001b[38;5;124m'\u001b[39m)\n\u001b[1;32m 2\u001b[0m 
get_ipython()\u001b[38;5;241m.\u001b[39msystem(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mpip install -q streamlit\u001b[39m\u001b[38;5;124m'\u001b[39m)\n\u001b[0;32m----> 4\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mstreamlit\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mst\u001b[39;00m\n\u001b[1;32m 5\u001b[0m st\u001b[38;5;241m.\u001b[39mheader(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmusic genre based age Recommendation System\u001b[39m\u001b[38;5;124m\"\u001b[39m )\n","\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'streamlit'"],"ename":"ModuleNotFoundError","evalue":"No module named 'streamlit'","output_type":"error"}]},{"cell_type":"code","source":"# The column name carries a leading space, as in the cells above.\ngenre = data[' genre'].values\ngenre","metadata":{"_uuid":"1b266fc4-5690-46d4-89d9-e7bee55b51ce","_cell_guid":"28f4c7e0-a5ad-4619-a36e-fa7e80c2ea72","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:41:44.405653Z","iopub.status.idle":"2024-10-13T07:41:44.406272Z","shell.execute_reply.started":"2024-10-13T07:41:44.405930Z","shell.execute_reply":"2024-10-13T07:41:44.405959Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# song_genre was undefined in the original cell; a select box over the known genres is one assumed way to supply it.\nsong_genre = st.selectbox('Select a genre', sorted(set(genre)))\n\nif st.button('Show Recommendation'):\n    recommended_age = recommend_songs(song_genre, datas=data)\n    recommended_age","metadata":{"_uuid":"b16edf29-3b36-4b08-8edf-0bbf284c9b29","_cell_guid":"9264a102-2179-40ec-8765-d8359ed601f1","collapsed":false,"jupyter":{"outputs_hidden":false},"execution":{"iopub.status.busy":"2024-10-13T07:41:44.408245Z","iopub.status.idle":"2024-10-13T07:41:44.408817Z","shell.execute_reply.started":"2024-10-13T07:41:44.408520Z","shell.execute_reply":"2024-10-13T07:41:44.408550Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"e1161ad4-d84b-413c-87b8-35e9f91686a5","_cell_guid":"7808e3c8-4331-43d6-a974-a6946b44cadd","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"","metadata":{"_uuid":"f66e3d74-1fd8-4268-a34a-4dda1596f4c7","_cell_guid":"72985dcb-49dd-42ba-bc4b-c4d722190ba8","collapsed":false,"jupyter":{"outputs_hidden":false},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"In the output above, the model returned recommendations ranked by similarity and then by mood. For the input genre `classical`, the rows shown were all classical entries.\n\n# Conclusion\n\n* In this notebook, I explained content-based recommendation systems.\n* Content-based RS models are powerful for recommending new items.\n* TF-IDF and Word2Vec models are commonly used when designing a content-based RS.\n* TF-IDF weights a word by its frequency in a document, offset by how common the word is across the corpus.\n* A recommendation model can be built by applying cosine similarity to TF-IDF weights.","metadata":{"_uuid":"44c83c1a-3c0f-453d-b4ff-7fc5680f4c4a","_cell_guid":"03c84478-9f90-49ea-8c39-80e5b748160d","trusted":true}}]} \ No newline at end of file