# Week 7 Notes: Bayesian Modeling

- bayesian models are sometimes counterintuitive

Bayesian Probability:
- based on basic rule of conditional probability
- bayes' rule
- P(A|B) = P(B|A) * P(A) / P(B)
- example: medical test
- true positives: 98%
- false positives: 8%
- 1% of the population has the disease, and 8.9% of all people test positive
- if someone tests positive, what is the probability they actually have the disease?
- write out the equation in bayes' rule
- A = has the disease
- B = tested positive
- P(A|B) = P(B|A) * P(A) / P(B) = 98% * 1% / 8.9% = 11%
- even after testing positive a person only has an 11% chance of having the disease
- why?
- so many more people don't have the disease = many more false positives than true positives (see the quick check below)
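A quick numerical check of the calculation above (all numbers are from the notes; the sanity-check comment just confirms the 8.9% positive rate is consistent with the stated rates):

```python
# Bayes' rule check for the medical test example
p_pos_given_disease = 0.98   # true positive rate
p_disease = 0.01             # prevalence: 1% of the population has the disease
p_pos = 0.089                # overall positive-test rate
# sanity check: 0.98*0.01 + 0.08*0.99 = 0.089, consistent with the rates above

p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive) = {p_disease_given_pos:.3f}")   # ~0.110
```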


Empirical Bayes Modeling:
- overall distribution of something is known or estimated
- only a little data is available for the problem
- ex. predicting NCAA basketball outcomes
- difference X in points scored by home team and road team
- approximately normal: X ~ N(m + h, sigma^2)
- h = home court advantage
- m = true difference in the teams' strength (unknown)
- sigma^2 = variance
- bayes rule allows us to figure out the unknown m
- first model the difference between teams' strengths m ~ N(0, tau^2)
- then look at observed data:
- x = observed point difference in game
- m = real difference between the two teams, generally m != x
- bayes' rule: look for the probability of having a true points difference given the observation x
- P(M = m | X = x) = P(X= x | M = m)*P(M=m) / P(X = x)
- probability of m given x!!
- if team a beats team b by x points we could find the distribution of how much one team is better
- we could also integrate that distribution (zero to infinity) to show the probability that a team is actually better!
- P(home team better | X = x) = integral from 0 to infinity of P(M = m | X = x) dm (closed form below)
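For this normal-normal setup the Bayes' rule calculation has a standard closed form (normal-normal conjugacy); a sketch, where Phi is the standard normal CDF and mu_post, sigma_post are the posterior mean and standard deviation:

```latex
M \mid X = x \;\sim\; N\!\left(\frac{\tau^2 (x - h)}{\tau^2 + \sigma^2},\;\frac{\tau^2 \sigma^2}{\tau^2 + \sigma^2}\right)
\qquad\Rightarrow\qquad
P(\text{home team better} \mid X = x) \;=\; 1 - \Phi\!\left(-\frac{\mu_{\text{post}}}{\sigma_{\text{post}}}\right)
```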

What are we actually saying?
- home team won by 20 points
- estimated home court advantage h = 4 points
- standard deviation in team strength difference tau = 6 points
- standard error from random variance sigma = 11 points
- of the 20 point victory:
- about 4 points was home court advantage
- about 12.5 points due to random variation
- only 3.5 points due to the difference between teams
- this seems counterintuitive: a 20-point win but only a ~3.5-point true difference?
- there is a lot more variance due to randomness than to team strength
- a 20-point win is more likely to be inflated by randomness
- bayes' rule shrinks the estimate toward the prior mean of 0 (see the sketch below)
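A minimal sketch checking those numbers with the conjugate update above (x, h, tau, sigma are the values from the notes; the notes' 12.5 / 3.5 split is a rounding of the ~12.3 / ~3.7 computed here):

```python
from math import sqrt
from statistics import NormalDist

x, h, tau, sigma = 20.0, 4.0, 6.0, 11.0   # values from the notes

shrink = tau**2 / (tau**2 + sigma**2)      # weight kept on the observed margin (x - h)
m_post = shrink * (x - h)                  # posterior mean of the true strength difference
sd_post = sqrt(tau**2 * sigma**2 / (tau**2 + sigma**2))

print(f"home court advantage:     {h:.1f} points")
print(f"true difference (est.):   {m_post:.1f} points")          # ~3.7
print(f"attributed to randomness: {x - h - m_post:.1f} points")  # ~12.3

# integrating the posterior over m > 0: probability the home team is actually better
p_better = 1 - NormalDist(m_post, sd_post).cdf(0.0)
print(f"P(home team better | x=20) = {p_better:.2f}")            # ~0.76
```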

Summary:
- take a single observation
- combine with broader set of observations
- then make a deduction or prediction
- bayesian models work especially well when only a little data is available


P(A) = prior distribution
P(A|B) = posterior distribution
# Week 7 Notes: Neural Networks and Deep Learning

- used to react to patterns that we don't even understand
- CAPTCHA type of questions
- idea of deep learning is to train a system to react correctly without specifying what it is reacting to
- powerful in image recognition and speech recognition, NLP


Neural Networks:
- neural networks are modeled after the way neurons work in our brains
- Artificial Neural Network
- three layers of neurons:
- input layer, hidden layer, output layer
- input > hidden > output
- each input accepts a single piece of information
- each neuron: gets inputs from the previous layer > computes a function of the weighted inputs > passes its output to the next layer
- there may be several hidden layers
- the final output is a combination of all the weighted hidden-layer results
- the output layer chooses the 'best' answer based on the results from all the hidden layers
- then the error of the output is fed back through the entire system and the weights are adjusted (backpropagation)
- a simple way to do this is gradient descent
- if the network learns well, with enough data the weights will be adjusted so that the network generates correct outputs from the inputs
- require a lot of data to train
- hard to choose and tune the learning algorithm: re-weighting too fast or too slow can be problematic (see the sketch below)
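A minimal sketch of the input > hidden > output flow with gradient-descent re-weighting, on the toy XOR problem (the architecture, learning rate, and iteration count are illustrative assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # input -> hidden weights
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden -> output weights
lr = 0.5                                        # re-weighting speed (assumed)

for step in range(10000):
    # forward pass: input > hidden > output
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: feed the output error back and re-weight (gradient descent)
    d_out = (out - y) * out * (1 - out)   # squared-error gradient at the output
    d_h = (d_out @ W2.T) * h * (1 - h)    # gradient pushed back to the hidden layer
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2))   # typically close to [[0], [1], [1], [0]] after training
```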


Deep Learning:
- idea of neural networks adapted for more layers
- same approach as neural networks = input > many "deep" hidden layers > output > feed errors back and re-weight
- powerful in NLP, speech, image recognition
# Week 7 Notes: Competitive Models

- competitive decision making
- previous models = 'us against the data'
- descriptive models = get understanding of reality
- predictive models = find hidden relationships and predict the future
- prescriptive models = find the best thing to do assuming the system does not react
- what if the system reacts intelligently?
- we need to use analytics to consider all sides of the system

Examples:
- pricing examples
- using past purchase data and competitor data to price products
- once a price is set, competitors may change their prices, giving different results than the model predicted
- government = corporate tax policies
- companies need to decide how to store and spend their money based on the government's tax policies
- employee incentives to change behavior

- need to consider not just your own situation but the competitive situation
- these situations need competitive decision making = game theory
- game theory includes both competitive and cooperative game theory

Timing:
- simultaneous game = decisions made at the same time
- can't change once made
- strategy > counter-strategy > counter-counter-strategy > ... = best strategy emerges after many iterations
- sequential game = decisions made in series

Types of Strategies:
- pure strategy = just one choice, every time
- mixed strategy = randomize decisions according to probabilities
- example = rock paper scissors (see the sketch below)
- a pure strategy will eventually lose once opponents learn it
- a mixed strategy will work best
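A small sketch of why: a pure strategy is exploitable once the opponent learns it, while the uniform mixed strategy breaks even on average against any counter-strategy (the simulation setup is illustrative):

```python
import random

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    """+1 win, 0 tie, -1 loss from my point of view."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

moves = list(BEATS)
# pure 'rock' vs an opponent who has learned to counter with 'paper': loses every round
print(sum(payoff("rock", "paper") for _ in range(1000)))                # -1000
# uniform mixed strategy vs that same counter-strategy: near 0 on average
print(sum(payoff(random.choice(moves), "paper") for _ in range(1000)))  # ~0
```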

Information Levels:
- perfect information: know all the information for everyone's situation
- imperfect information: some have more information than others - competitive advantage - not symmetric across competitors

Zero Sum and Non-Zero Sum:
- zero sum: whatever one side gets the other side loses and vice versa
- bet $1 on game: get dollar vs. lose dollar
- non-zero sum: total benefit might be higher or lower
- example = economics

Summary:
- competitive decision making = game theory
- how do we determine the best strategy? = optimization models
- we want to find the optimal strategy!



# Week 7 Notes: Game Theory Models

- basic demo of game theory models
- how does it work?
- what analysis is involved?

Game Theory Example:
- two gas stations
- set price: $2 or $2.50
- same price = 50 / 50 demand
- otherwise = all demand will go to lower priced one
- what's the best price to choose?
- if they talk to each other and both set $2.50 = half the demand each at the higher profit margin
- the cost matters in determining which price should be chosen!
- stable equilibrium = no incentive to change
- prisoner's dilemma = incentive to agree to higher price and then back out and charge the lower price
- can choose any price points they want
- both can keep lowering prices until price is about equal to the cost
- "race to the bottom" = simple model will cause each to lower prices until they meet the margin
- competition drives down price for consumers
- game theory says there is an incentive to charge a slightly lower price than the competition (see the sketch below)
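A minimal sketch of the two-station game as a payoff matrix, checking each price pair for a stable equilibrium (the $1.25 unit cost and 100-unit demand are assumed numbers, not from the notes):

```python
from itertools import product

PRICES = [2.00, 2.50]
COST = 1.25      # assumed marginal cost per unit
DEMAND = 100     # assumed total market demand

def profits(p1, p2):
    """Equal prices split demand 50/50; otherwise all demand goes to the cheaper station."""
    if p1 == p2:
        return (DEMAND / 2 * (p1 - COST), DEMAND / 2 * (p2 - COST))
    if p1 < p2:
        return (DEMAND * (p1 - COST), 0.0)
    return (0.0, DEMAND * (p2 - COST))

# stable equilibrium = neither station gains by changing its own price alone
for p1, p2 in product(PRICES, PRICES):
    u1, u2 = profits(p1, p2)
    stable1 = all(profits(q, p2)[0] <= u1 for q in PRICES)
    stable2 = all(profits(p1, q)[1] <= u2 for q in PRICES)
    tag = "  <- stable equilibrium" if stable1 and stable2 else ""
    print(f"p1=${p1:.2f} p2=${p2:.2f}: profits ({u1:.1f}, {u2:.1f}){tag}")

# with these assumed numbers only ($2.00, $2.00) is stable: agreeing on $2.50 pays
# more, but each station has an incentive to undercut = the prisoner's dilemma above
```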
# Week 7 Notes: Communities in Graphs

- analysis of large interconnected networks
- automated ways of finding highly interconnected subpopulations
- social media 'influencers'
- disease outbreak
- model to automatically find 'communities'

- community = a set of circles that is highly connected within itself
- graph = a collection of circles and the lines connecting them
- circles = nodes / vertices
- lines = arcs / edges
- clique = a set of nodes that all have edges between each other

- we don't need a full (complete) clique
- goal is to decompose the graph into communities
- we do this using the Louvain Algorithm
- the goal of the Louvain Algorithm is to maximize the modularity of a graph

Louvain Algorithm:
- aij = weight on the arc between nodes i and j
- if there is no arc between i and j then aij = 0
- wi = total weight of arcs connected to i
- W = total weight of all the arcs in the graph
- Modularity Q = (1 / 2W) * sum over all pairs (i, j) in the same community of (aij - wi*wj / 2W)
- modularity = a measure of how well the graph separates into communities that are densely connected internally but sparsely connected to each other (see the sketch below)
- Step 0: each node is its own community
- Step 1: make the biggest available modularity increase by moving a node from its current community to an adjacent community
- Step 2: repeat this process until there are no more increases in modularity
- Step 3: treat each community as a super node and repeat from Step 1 using the super nodes
- louvain is a heuristic:
- not guaranteed to find the absolute best partition of the graph into communities
- gives very good solutions very quickly
- best for finding communities inside a large network
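A minimal sketch of the modularity formula above on a toy graph (the 6-node graph and community labels are made-up illustration data, not from the notes):

```python
import numpy as np

# adjacency weights a_ij for a toy 6-node graph: two triangles joined by one edge
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

community = np.array([0, 0, 0, 1, 1, 1])   # assumed community label per node

w = A.sum(axis=1)     # wi = total weight of arcs connected to node i
W = A.sum() / 2.0     # W = total weight of all arcs in the graph

same = community[:, None] == community[None, :]   # are i and j in the same community?
Q = ((A - np.outer(w, w) / (2 * W)) * same).sum() / (2 * W)
print(f"modularity Q = {Q:.3f}")   # ~0.357 for this split; Louvain tries to maximize Q
```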
