SDS 385, Fall 2016

I am Giorgio Paulon, a first-year PhD student in Statistics at UT Austin. This is my personal page for the Big Data class with Prof. James Scott.

Exercise 1

In the first homework, we first addressed the problem of linear regression in the context of weighted least squares. We tested three different matrix factorizations that can deal with large datasets, and afterwards we implemented methods for dealing with large sparse matrices. Finally, we benchmarked the different procedures against each other.
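As an illustration of the factorization approach, here is a minimal sketch that solves the weighted least squares normal equations with a Cholesky factorization. The choice of Cholesky, the function name `wls_cholesky`, and the simulated data are assumptions made for this example; they are not necessarily among the three factorizations compared in the homework.

```python
import numpy as np


def wls_cholesky(X, y, w):
    """Solve (X^T W X) beta = X^T W y, where W = diag(w), via Cholesky."""
    Xw = X * w[:, None]          # W X, without forming the n x n diagonal matrix
    A = X.T @ Xw                 # X^T W X (positive definite if X has full column rank and w > 0)
    b = Xw.T @ y                 # X^T W y
    L = np.linalg.cholesky(A)    # A = L L^T, with L lower triangular
    # Two solves with the triangular factors. np.linalg.solve is a general
    # solver and does not exploit the triangular structure; a dedicated
    # triangular solver would be faster on large problems.
    z = np.linalg.solve(L, b)
    return np.linalg.solve(L.T, z)


# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
w = rng.uniform(0.5, 2.0, size=n)
y = X @ np.arange(1.0, p + 1) + rng.normal(size=n) / np.sqrt(w)

beta_hat = wls_cholesky(X, y, w)
```

Multiplying the rows of `X` by `w` avoids storing the diagonal weight matrix, which matters when the number of observations is large.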

Secondly, we tackled the problem of logistic regression. We implemented two approaches to minimizing the negative log-likelihood of the data: gradient descent and the Newton-Raphson method.
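The two approaches can be sketched as follows for binary responses. This is a sketch under stated assumptions rather than the homework code: the function names, the fixed step size, the iteration counts, and the simulated data are all hypothetical choices for this example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def neg_loglik(beta, X, y):
    """Negative log-likelihood: sum_i log(1 + exp(x_i' beta)) - y_i x_i' beta."""
    z = X @ beta
    return np.sum(np.logaddexp(0.0, z) - y * z)


def gradient(beta, X, y):
    """Gradient of the negative log-likelihood: X^T (p - y)."""
    return X.T @ (sigmoid(X @ beta) - y)


def gradient_descent(X, y, step=0.001, n_iter=5000):
    # The fixed step size is an assumption; it has to be small enough for this data.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta -= step * gradient(beta, X, y)
    return beta


def newton_raphson(X, y, n_iter=25):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        # Hessian X^T diag(p (1 - p)) X, built without forming the diagonal matrix
        H = X.T @ (X * (p * (1.0 - p))[:, None])
        beta -= np.linalg.solve(H, gradient(beta, X, y))
    return beta


# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = (rng.uniform(size=500) < sigmoid(X @ beta_true)).astype(float)

beta_gd = gradient_descent(X, y)
beta_nr = newton_raphson(X, y)
```

Newton-Raphson uses curvature information and typically needs far fewer iterations, at the price of solving a linear system at every step.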

Exercise 2

In the second homework, we implement stochastic gradient descent (SGD), an algorithm that replaces the exact computation of the gradient with a faster routine based on an unbiased estimate of the gradient.
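To see why a single observation gives an unbiased estimate, the sketch below compares the full logistic regression gradient with the expectation of the single-observation gradient scaled by n, when the index is drawn uniformly at random. The function names and the simulated data are assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def full_gradient(beta, X, y):
    """Gradient of the negative log-likelihood over all observations."""
    return X.T @ (sigmoid(X @ beta) - y)


def single_gradient(beta, X, y, i):
    """Contribution of observation i alone; costs O(p) instead of O(n p)."""
    return X[i] * (sigmoid(X[i] @ beta) - y[i])


# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(2)
n, p = 100, 4
X = rng.normal(size=(n, p))
y = rng.integers(0, 2, size=n).astype(float)
beta = rng.normal(size=p)

# With i uniform on {0, ..., n-1}, E[n * g_i] = sum_i g_i.
# Averaging n * g_i over every index computes that expectation exactly.
expected_estimate = np.mean(
    [n * single_gradient(beta, X, y, i) for i in range(n)], axis=0
)
```

Here `expected_estimate` matches `full_gradient(beta, X, y)` up to floating-point error, while each individual draw only touches one row of `X`.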

SGD for logistic regression has been successfully implemented. Although it is faster than gradient descent and Newton's method, this algorithm raises crucial issues, such as the choice of the step size. In a more sophisticated version, the step size varies with the iteration number. This is very important in order to ensure convergence.
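One common way to let the step size vary is a Robbins-Monro style polynomial decay. The sketch below uses only the standard library; the schedule, its constants `c`, `t0` and `alpha`, the function names, and the simulated data are assumptions for illustration, not necessarily the rule used in the homework.

```python
import math
import random


def step_size(t, c=0.5, t0=1.0, alpha=0.75):
    """Decaying step c * (t + t0) ** (-alpha), with alpha in (0.5, 1]."""
    return c * (t + t0) ** (-alpha)


def sigmoid(z):
    # Written to avoid overflow for large negative z
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def neg_loglik(beta, X, y):
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, xi))
        # log(1 + exp(z)) computed stably, minus y * z
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    return total


def sgd_logistic(X, y, n_iter=20000, seed=0):
    rng = random.Random(seed)
    beta = [0.0] * len(X[0])
    for t in range(n_iter):
        i = rng.randrange(len(X))                  # pick one observation at random
        resid = sigmoid(sum(b * x for b, x in zip(beta, X[i]))) - y[i]
        gamma = step_size(t)                       # step shrinks as t grows
        beta = [b - gamma * resid * x for b, x in zip(beta, X[i])]
    return beta


# Simulated data (hypothetical, for illustration only)
data_rng = random.Random(3)
beta_true = [1.0, -2.0]
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(300)]
y = [
    1.0 if data_rng.random() < sigmoid(sum(b * x for b, x in zip(beta_true, xi))) else 0.0
    for xi in X
]

beta_sgd = sgd_logistic(X, y)
```

With a constant step the iterates keep fluctuating around the minimizer; a decaying step damps this noise, although too fast a decay can stall progress before the minimizer is reached.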
