My final project for instructor Engin Deniz Alpman's "Veri Bilimi" (Data Science) course. (Patika.dev)

mervesenacnr/VeriBilimi

📊📋Data Science🧮🗂

⚙ What is data and Data Science? 🤔

Data is everything we can perceive or describe.
For example, the population of Turkey is data. The population of Germany, the population of the world, or simply dogs, cats, houses and schools are all data.
Subcategories of data:

  1. Numeric Data
  2. Categorical Data
When we look at Numeric Data closely we'll see:
  • Continuous (Interval)
  • Discrete (Ratio)
When we look at Categorical Data closely we'll see:
  • Binary
  • Multiclass
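To make the categories above more concrete, here is a minimal sketch in plain Python. The helper name `describe_column` and its rules (whole numbers count as discrete, any non-number counts as a category label, exactly two distinct labels count as binary) are my own simplifying assumptions for illustration, not an official definition.

```python
def describe_column(values):
    """Return a rough label for a list of values.

    Hypothetical helper: the rules are a simplification for illustration.
    """
    # bool is a subclass of int in Python, so it is excluded explicitly
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        # Only whole numbers -> discrete, otherwise -> continuous
        if all(float(v).is_integer() for v in values):
            return "numeric / discrete"
        return "numeric / continuous"

    # Everything that is not a number is treated as a category label
    if len(set(values)) == 2:
        return "categorical / binary"
    return "categorical / multiclass"


print(describe_column([1.72, 1.65, 1.80]))        # heights in meters
print(describe_column([3, 0, 2, 5]))              # number of siblings
print(describe_column(["yes", "no", "yes"]))      # poisonous or not
print(describe_column(["red", "blue", "green"]))  # flower color
```

Real datasets are messier than this (for example, numeric codes can stand for categories), so treat it only as a first guess.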
Although I have simplified the meaning of the term "data", "Data Science" is actually a broad concept that encompasses mathematics and statistics, custom programming, advanced analytics, artificial intelligence (AI) and machine learning. Data science is a multidisciplinary field that uses scientific methods, processes, algorithms and systems to extract information and insights from structured and unstructured data.
Data Science is grouped under 3 main headings:
  • Data Analyst
  • Statistician
  • Machine Learning
If you want to learn more about "What is Data Science", here is a link for you to read. 👉https://www.ibm.com/topics/data-science

    ⚙ What is Machine Learning? 🤔

    Machine Learning is a way of communicating what we want to the computer: instead of writing explicit rules, we give it data to learn from. Deep Learning is a sub-branch of Machine Learning, and Machine Learning is a sub-branch of Data Science.
    Machine Learning has 2 areas:

  • Applied Machine Learning
  • Machine Learning Research
    If you want to learn more about "What is Machine Learning", here is a link for you to read. 👉https://www.ibm.com/topics/machine-learning

    ⚙ Let's learn some Data Science terms! 🦾🤖

  • Supervised: Supervised learning trains a model on labeled data so that it can produce the right output for new inputs. 👉https://www.ibm.com/topics/supervised-learning
  • Unsupervised: Unsupervised learning groups unlabeled data. 👉https://www.ibm.com/topics/unsupervised-learning
  • Regression: The predicted value is a continuous variable. 👉https://www.investopedia.com/terms/r/regression.asp
  • Classification: The predicted value belongs to one of a fixed set of categories. 👉https://www.techtarget.com/searchdatamanagement/definition/data-classification
  • Also you need to know that 😏:
  • Regression ≌ Classification.
  • There is no absolute 0 reference in "interval", but there is in "ratio".
  • There is no continuous variable.
  • Prediction: We have a lot of data and we try to correctly guess the answer to a question from it. For example, given data such as the height, leaves and color of a flower, we can estimate whether the flower is poisonous.
  • Mapping: f(x1, x2, x3) = ŷ ↔ y. The function takes x1, x2, x3 as inputs and returns "ŷ" as its output. "ŷ" is the model's prediction and "y" is the truth. Our main goal in mapping is to minimize the error between them: error = e(ŷ, y)
  • [Figure: regression analysis diagram]
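The mapping idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: I use a single input x instead of x1, x2, x3, a straight line as f, and mean squared error as e(ŷ, y). The names `fit_line` and `mean_squared_error` and the sample numbers are made up for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ≈ a*x + b for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx             # slope
    b = mean_y - a * mean_x   # intercept
    return a, b


def mean_squared_error(y_hat, y):
    """One possible choice for e(ŷ, y): the average squared difference."""
    return sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(y)


# Made-up sample data: x is the input, y is the truth
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

a, b = fit_line(xs, ys)             # "training": choose f so the error is small
y_hat = [a * x + b for x in xs]     # ŷ = f(x), the model's prediction
print(mean_squared_error(y_hat, ys))
```

Fitting the line is what minimizes the error on the known data; how small the error is on new data is a separate question.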

    Even if e = 0 on known data, there is no guarantee that e = 0 on unknown data. The main purpose is to minimize the errors that will arise on unseen data, while the machine is trained only on the "train" data. So, how can we do this?
    The answer is: I train the model on the "train" set only and tune the model's hyperparameters on the "validation" set. But since I update the hyperparameters according to how well the model performs on validation, my model starts to overfit the validation set, albeit indirectly. So I also need data that it has never seen, the "test" set, to evaluate it.
    In short 🤐, I split the data into 3:

  • Train
  • Validation
  • Test
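A three-way split like this could look as follows in plain Python. It is a minimal sketch, not a fixed recipe: the function name `train_val_test_split`, the 60/20/20 ratios and the seed are my own assumptions.

```python
import random


def train_val_test_split(rows, val_ratio=0.2, test_ratio=0.2, seed=42):
    """Shuffle the rows and cut them into train, validation and test lists."""
    shuffled = list(rows)                   # copy, the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed -> reproducible split

    n_test = int(len(shuffled) * test_ratio)
    n_val = int(len(shuffled) * val_ratio)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


rows = list(range(100))   # stand-in for 100 labeled examples
train, val, test = train_val_test_split(rows)
print(len(train), len(val), len(test))   # 60 20 20
```

The model is fitted on `train`, hyperparameters are compared on `val`, and `test` is used only once at the very end.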
    ⚙ What is Bias in Statistics? 🤔

    Bias is when a model's errors are systematic, that is, it consistently favors or disfavors certain outcomes. Models carry the ideas of the people who created them. That's why every model is only as objective as its designer. (See "overestimate" and "underestimate".) 👉https://www.statisticshowto.com/what-is-bias/
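    One simple way to see overestimation and underestimation is to average the signed difference between prediction and truth. The sketch below is only an illustration: the name `mean_error` and all the numbers are made up.

```python
def mean_error(y_hat, y):
    """Average of (prediction - truth); the sign shows the direction of the bias."""
    return sum(p - t for p, t in zip(y_hat, y)) / len(y)


y_true = [10.0, 12.0, 14.0, 16.0]
too_high = [11.0, 13.5, 15.0, 17.5]   # every prediction is above the truth
too_low = [9.0, 11.0, 13.5, 15.0]     # every prediction is below the truth

print(mean_error(too_high, y_true))   # positive -> the model overestimates
print(mean_error(too_low, y_true))    # negative -> the model underestimates
```

    A value close to zero means the errors roughly cancel out; it does not mean that the individual predictions are correct.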

    Author of this article ✍: Merve Sena Çınar
    Follow me on LinkedIn 💁‍♀️ https://www.linkedin.com/in/mervesenacinar/
