bigdata-i523/hid213

---
owner:
    hid: 213
    name: Liu, Yuchen
    url: https://github.com/bigdata-i523/hid213
paper1:
    abstract: >
        Speech recognition is becoming increasingly important, and
        many technology companies are trying to use Big Data to
        develop more efficient and accurate speech recognition
        algorithms. Deep learning is now the foundation of speech
        recognition. Deep learning algorithms such as RNNs and CNNs
        often need to be supported by large amounts of data, that is,
        Big Data. Before Big Data and deep learning, the word error
        rate was 24 percent. Recently, IBM published a paper where the
        word error rate was below 5.5 percent. In August, Microsoft's
        speech recognition system reached a 5.1 percent error rate.
    author:
        - Yuchen Liu
    hid:
        - 213
    status: Oct 06 2017 100%
    title: Big Data and Speech Recognition
    url: https://github.com/bigdata-i523/hid213/paper1/paper1.pdf
    chapter: Media
paper2:
    review: Nov 6 2017
    abstract: >
        Face recognition is a technology focused on identity retrieval
        and verification. It extracts face information from given
        static or dynamic images and matches it against a database of
        faces with known identities. Because of interference from
        illumination, expression, occlusion and orientation, the
        accuracy of face recognition is relatively low compared with
        other recognition technologies, such as palm print and
        fingerprint recognition. However, its acquisition method is
        the most convenient: face information can be acquired and
        identified without the cooperation of the subjects, even
        without their awareness. Face recognition has therefore been a
        hot research topic in artificial intelligence for more than 40
        years and has gradually matured. Many technology companies are
        trying to use Big Data to develop more efficient and accurate
        face recognition algorithms. The technology has been used in
        fields such as anti-terrorism, security and access control,
        and in recent years it has also been applied to fields such as
        education and finance promotion.
    author:
        - Yuchen Liu
    hid:
        - 213
    title: Big Data and Face Identification
    status: Nov 06 2017 100%
    url: https://github.com/bigdata-i523/hid213/paper2/paper2.pdf
    chapter: Security
project:
    review: Dec 4 2017
    abstract: >
        Digit recognition is becoming increasingly important in many
        different areas, such as zip code recognition, banking
        receipts and balance sheets. Many technology companies are
        trying to use Big Data to develop more efficient and accurate
        digit recognition algorithms. This project uses the Digit
        Recognizer data set from Kaggle.com. There are more than 42000
        samples in the data set. Each sample contains 784 features
        that hold the pixel information of a $28 \times 28$ image.
        Each pixel has a value between 0 and 255. We use a binary
        classification technique for data cleaning and PCA for feature
        extraction. For the classification model, we choose five of
        the most commonly used classification algorithms: Decision
        Tree (DT), Naive Bayes (NB), Logistic Regression (LR), Random
        Forest (RF) and Support Vector Machine (SVM). In our results,
        the SVM classifier on PCA data produces the highest accuracy,
        0.9813, and takes 127 seconds. The Naive Bayes classifier on
        PCA data needs the least time to finish the classification
        task: it takes less than one second and reaches an accuracy of
        0.8651.
    duplicate: True
    author:
        - Han, Wenxuan
        - Liu, Yuchen
        - Lu, Junjie
    hid:
        - 209
        - 213
        - 214
    title: Comparison between different classification algorithms in Digit Recognizer
    status: Dec 04 2017 100%
    type: project
    url: https://github.com/bigdata-i523/hid213/Project/report.pdf
    chapter: Media