🌟 Hit the star button to save this repo to your profile
The information in this GitHub repository is part of the materials for the subject High Performance Data Processing (SECP3133). This folder contains general big data information as well as big data case studies using Malaysian datasets. The case studies were created by a Bachelor of Computer Science (Data Engineering) student at Universiti Teknologi Malaysia.
- Essential Preparations for a Successful Start in High-Performance Data Processing Class
- Course Information
- Student Information
- Exercise
No | Module | Description | File |
---|---|---|---|
1. | Lab 1 | Understanding Your Data | |
2. | Lab 2 | EDA Big Data | |
3. | Lab 3 | Feature Engineering | |
No. | Section | Content |
---|---|---|
1 | Introduction to Big Data Management | A. What is Big Data? B. The Importance of Managing Big Data C. Why Big Data Management is Important? D. The History of Big Data |
2 | Understanding Big Data | A. Defining Big Data B. Characteristics of Big Data C. Sources of Big Data D. Challenges in Dealing with Big Data E. The Workflow of Big Data Management |
3 | The Role of Big Data Management | A. Managing Big Data B. Benefits of Effective Big Data Management C. Risks of Ignoring Big Data Management D. Industries Benefiting from Big Data Management |
4 | Data Collection and Storage | A. Data Collection Methods 1. Traditional Data Sources 2. Emerging Data Sources B. Data Storage and Warehousing C. Data Security and Privacy Concerns |
5 | Data Processing and Analysis | A. Data Preprocessing B. Data Analysis Tools and Techniques C. Real-time Data Processing D. Machine Learning in Big Data Analysis |
- Awesome Public Datasets
- Portal Data Terbuka Malaysia
- Department of Statistics Malaysia
- data.world
- Dataportal.asia
- knoema
- The World Bank
- Dataset Search - Google
- UCI Machine Learning Repository
- Kaggle datasets
- Awesome-public-datasets
- Datahub.io
- Earthdata
- CERN Open Data Portal
Please create an issue for any improvements, suggestions, or errors in the content.
You can also contact me on LinkedIn with any other queries or feedback.