Crawling Javlibrary actors’ works data with Scrapy

desonglll/JavlibraryScrapy

🤗New Version Coming!🤗

A tool that scrapes works data from Javlibrary and displays it on a custom website.

This project includes a backend and a frontend.

The previous version of JavlibraryScrapy is available on the releases page: https://github.com/desonglll/JavlibraryScrapy/releases

Repositories

GitHub

https://github.com/desonglll/JavlibraryScrapy

Gitee

https://gitee.com/desonglll/scrapy-jav

💤 Requirements

Before

Windows/macOS/Linux

git

MySQL 8

A proxy for accessing Javlibrary.
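As a quick sanity check before installing anything else, the command-line requirements listed above can be verified from Python. This is a minimal sketch, not part of the project: the helper name `check_tools` is hypothetical, and it assumes the tools are installed under the command names `git` and `mysql`. It only checks that the commands are on `PATH`; it does not verify the MySQL version or the proxy.

```python
import shutil

# Command names assumed for the requirements listed above (assumption:
# the MySQL client is installed as "mysql").
REQUIRED_TOOLS = ["git", "mysql"]


def check_tools(tools):
    """Map each command name to True if it is found on PATH, else False."""
    return {tool: shutil.which(tool) is not None for tool in tools}


def report(tools=REQUIRED_TOOLS):
    """Print one line per tool showing whether it was found."""
    for tool, found in check_tools(tools).items():
        print(f"{tool}: {'found' if found else 'MISSING'}")
```

Calling `report()` prints one status line per tool, which makes a missing requirement obvious before the later steps fail.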

🐍 Install Conda

Quick command line install

These quick command line instructions will get you set up quickly with the latest Miniconda installer. For graphical installer (.exe and .pkg) and hash checking instructions, see Installing Miniconda.

For macOS

These four commands quickly and quietly install the latest M1 macOS version of the installer and then clean up after themselves. To install a different version or architecture of Miniconda for macOS, change the name of the .sh installer in the curl command.

mkdir -p ~/miniconda3
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh -o ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm -rf ~/miniconda3/miniconda.sh

After installing, initialize your newly-installed Miniconda. The following commands initialize for bash and zsh shells:

~/miniconda3/bin/conda init bash
~/miniconda3/bin/conda init zsh

Create a new Python environment using conda

conda create -n scrapyJAV python=3.11

Then activate the environment

conda activate scrapyJAV

Clone

GitHub

git clone https://github.com/desonglll/JavlibraryScrapy.git scrapyjav-project

Gitee

git clone https://gitee.com/desonglll/scrapy-jav.git scrapyjav-project

Install poetry

From the scrapyjav-project directory:

pip install poetry
cd scrapyJAV
poetry install

🚀 Running

Configuration

Edit scrapyJAV/config.yaml to set both the database configuration and the argument configuration.

📝 Edit the configuration arguments to suit your needs.

  • Enter the ID of the actor you want to scrape.
  • Example: in https://www.javlibrary.com/cn/vl_star.php?list&mode=&s=ae5q6&page=1, ae5q6 is the ID of 楓カレン.
  • id_references contains some actress IDs in JSON format for reference.
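The actor ID can also be pulled out of a listing URL programmatically. The sketch below is illustrative only and not part of the project: the function name `actor_id_from_url` is hypothetical, and it assumes the ID is always carried in the `s` query parameter, as in the example URL above.

```python
from urllib.parse import parse_qs, urlparse


def actor_id_from_url(url: str) -> str:
    """Return the value of the `s` query parameter of a listing URL.

    Assumption: the actor ID is the `s` parameter, as in the README example.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get("s")
    if not values:
        raise ValueError(f"no 's' parameter found in {url!r}")
    return values[0]


example = "https://www.javlibrary.com/cn/vl_star.php?list&mode=&s=ae5q6&page=1"
print(actor_id_from_url(example))  # prints ae5q6
```

The returned string is the value to enter as the actor ID in scrapyJAV/config.yaml.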

Initialize Database

These steps assume you are in the scrapyjav-project directory.

There are two ways to create the database.

Running the .sql file

cd scrapyJAV
mysql -u root -p

Then, at the MySQL prompt:

create database scrapyjav character set utf8;
exit

Then load the database structure from the .sql file with the following command (-p prompts for the root password, matching the login above):

mysql -Dscrapyjav -u root -p < database_structure.sql

Using built-in methods

Run the following command to initialize the database:

poetry run scrapyjav -d init

Start scrapy

This step assumes you are in the scrapyjav-project/scrapyJAV directory.

Run the following command:

poetry run scrapyjav -c start
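If you prefer to drive the database initialization and the crawl from one script, the two commands from this README can be chained. This is a rough sketch rather than anything shipped with the project: the helpers `scrapyjav_command` and `run_steps` are hypothetical names, and the script assumes it is run from the scrapyJAV directory with `poetry` on `PATH`. It defaults to a dry run that only prints the commands.

```python
import subprocess


def scrapyjav_command(flag, value):
    """Build the argument list for `poetry run scrapyjav <flag> <value>`."""
    return ["poetry", "run", "scrapyjav", flag, value]


# The two invocations documented in this README.
INIT_DB = scrapyjav_command("-d", "init")
START_CRAWL = scrapyjav_command("-c", "start")


def run_steps(steps, dry_run=True):
    """Run each command in order and stop at the first failure.

    With dry_run=True (the default) the commands are only printed.
    Returns the commands as strings, in the order they were handled.
    """
    handled = []
    for step in steps:
        line = " ".join(step)
        print(line)
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(step, check=True)
        handled.append(line)
    return handled
```

Call `run_steps([INIT_DB, START_CRAWL], dry_run=False)` to actually execute both steps; skip `INIT_DB` if the database already exists.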

Delete database

poetry run scrapyjav -d delete

Backend Server

Go into the backend folder, and see README.md there.

Frontend Server

Go into the frontend folder, and see README.md there.
