IMDb Sci-Fi Movies Scrapy Project

Welcome! This project scrapes Sci-Fi movie data from IMDb, starting from the IMDb Sci-Fi genre listing page. IMDb's page structure can be intricate, but the spider in this project handles it.

🚀 Quick Start

Install necessary packages

Install the required packages, which are listed in the requirements.txt file:
```
pip install -r requirements.txt
```

Run the Spider

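Change to the project directory (the one containing scrapy.cfg) and run the imdb_spider spider. The -O flag writes the results to output.csv and overwrites the file if it already exists.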
```
scrapy crawl imdb_spider -O output.csv
```

📦 Output

The scraped movie data is written to output.csv in CSV format. Open the file in spreadsheet software such as Microsoft Excel or Google Sheets to review and analyze the movie details.

📋 Data Columns

The scraper extracts the following columns:

TITLE
RELEASED YEAR
RATING
GENRES
LENGTH
CERTIFICATION
DIRECTORS
WRITERS
STARS
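
As a quick sanity check on the output file, the column list above can be compared against the CSV header. The following is a minimal sketch using only the Python standard library. It assumes the CSV header names match the column names listed above exactly; the actual headers depend on the field names the spider emits, and `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names chosen for this illustration.

```python
import csv
import io

# Assumption: header names copied from the column list in this README.
# The real CSV headers depend on the field names the spider emits.
EXPECTED_COLUMNS = (
    "TITLE",
    "RELEASED YEAR",
    "RATING",
    "GENRES",
    "LENGTH",
    "CERTIFICATION",
    "DIRECTORS",
    "WRITERS",
    "STARS",
)


def missing_columns(csv_file):
    """Return the expected columns that are absent from the CSV header row."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in EXPECTED_COLUMNS if name not in header]


if __name__ == "__main__":
    # In-memory sample so the sketch runs without a real output.csv.
    sample = io.StringIO("TITLE,RATING\nExample Movie,7.5\n")
    print(missing_columns(sample))
```

To check a real export, open the file with `open("output.csv", newline="", encoding="utf-8")` and pass the file object to `missing_columns`. An empty list means every expected column is present.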

🔍 Challenges Faced

Complex HTML Structure: IMDb's page layout is intricate, so the spider was designed to extract each field precisely.
Rate Limiting: IMDb sometimes limits the number of requests, which leads to temporary blocks. The spider mitigates this by changing user-agents and introducing delays between requests.
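
The delay and user-agent strategies mentioned above map onto standard Scrapy settings. The fragment below is a sketch of what such a configuration could look like in the project's settings.py. The setting names are real Scrapy settings, but every value here is an assumption for illustration, not the configuration this project actually uses.

```python
# Hypothetical settings.py fragment; all values are illustrative assumptions.

# Wait roughly this many seconds between requests to the same site.
DOWNLOAD_DELAY = 2

# Vary each wait between 0.5x and 1.5x of DOWNLOAD_DELAY.
RANDOMIZE_DOWNLOAD_DELAY = True

# Keep concurrency per domain low to reduce the chance of being blocked.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy adapt the delay to the server's observed response times.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2
AUTOTHROTTLE_MAX_DELAY = 30

# Replace Scrapy's default user-agent string (placeholder value).
USER_AGENT = "Mozilla/5.0 (compatible; example-research-bot)"
```

Setting a single `USER_AGENT` only replaces the default string. Rotating between several user-agents would need a custom downloader middleware, which is not shown here.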

📚 Resources

Scrapy Documentation: For further information, or if you run into problems, refer to the official Scrapy documentation at https://docs.scrapy.org.

🙌 Contributions

Dive in and contribute! Whether you're opening an issue, suggesting enhancements, or fixing potential bugs, your input is greatly appreciated!

📃 License

This scraper is open-source and available for all. Feel free to utilize, alter, and distribute as you see fit. Happy scraping!

Pr45H4nt/IMDBscraper
