####VKStalk - vk.com scraper
######v5.0.0 BETA
Python console application that scrapes a VK user's public information. While running, it displays in the console:
- User online/offline (if offline, it shows last seen time)
- Whether the user is on a mobile client
- User status (or the current music track, if available)
- User data updates (any changes to user data, e.g. profile photo, number of wall posts)
#####Sample console output
=======| VKStalk ver. 5.0.0 BETA |=======
Launched on 08-November-2015 at 22:15
User ID: 45156687
User Name: Alexey Dvorak
Logs written: 0
==============| LATEST LOG |==============
>>> Checked on 2015-11-08 at 22:15:30 <<<
Date: 08-11-2015. Time: 19:09:20
Alexey Dvorak -- last seen yesterday at 03:49 [Mobile]
Status: it was all about cookies
==========================================
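The activity line in the sample output can be read as a function of a few fields. Below is a minimal sketch, assuming the activity fields listed later in this README (`is_online`, `is_mobile`, `last_visit`) are available as plain values. `format_activity` is a hypothetical helper written for illustration, not the app's actual code.

```python
def format_activity(name, is_online, is_mobile, last_visit=None):
    """Build a one-line activity summary (hypothetical helper)."""
    if is_online:
        line = "{} -- online".format(name)
    elif last_visit:
        # last_visit is assumed to be a ready-made string,
        # e.g. "yesterday at 03:49"
        line = "{} -- last seen {}".format(name, last_visit)
    else:
        line = "{} -- offline".format(name)
    if is_mobile:
        # Mark sessions that came from a mobile client
        line += " [Mobile]"
    return line


print(format_activity("Alexey Dvorak", False, True, "yesterday at 03:49"))
```

With the values from the sample log, this prints the same `last seen ... [Mobile]` line shown above.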
#####Setup
- Clone or download this repo
- Start a virtualenv within the root directory
- Activate the virtualenv
- Install requirements: `pip install -r requirements/base.txt`
- In `src/config`, make a copy of `sample_secrets.py` and rename it to `secrets.py`
- In `src/config/secrets.py`, fill in your database information. (By default it uses postgres, but you can try any other database.)
- In `src/config/settings.py`, set your timezone `CLIENT_TZ`
- Start the app: `python main.py USER_VK_ID`
#####Notes
- You can go ahead and play with the settings in `src/config/settings.py`.
  - Of more interest could be: `DATA_FETCH_INTERVAL`, `MAX_CONNECTION_ATTEMPTS`, `CONNECTION_TIMEOUT`, `CONSOLE_LOG_TEMPLATE`
- To see the accepted CLI arguments, run `python main.py -h`
- To get a summary for a user, run `python main.py USER_VK_ID --summary`. By default it generates the summary for the past week, writes it to a file in `PROJECT_ROOT/summaries`, and also prints it to the console. This behaviour can be changed using CLI arguments.
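To illustrate how a positional user ID and a `--summary` flag could be handled, here is a minimal sketch using `argparse`. It is not the app's real parser: only `USER_VK_ID`, `--summary`, and `-h` come from this README, while the `--days` option and the `summary_window` helper are hypothetical names added for the example. Run `python main.py -h` to see the actual options.

```python
import argparse
from datetime import datetime, timedelta


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("user_id", metavar="USER_VK_ID",
                        help="ID of the VK user to track")
    parser.add_argument("--summary", action="store_true",
                        help="generate a summary instead of tracking")
    # Hypothetical option: the real flag for changing the period may differ.
    parser.add_argument("--days", type=int, default=7,
                        help="length of the summary period in days")
    return parser


def summary_window(days, now=None):
    """Return (start, end) datetimes covering the last `days` days."""
    now = now or datetime.now()
    return now - timedelta(days=days), now


args = build_parser().parse_args(["45156687", "--summary"])
start, end = summary_window(args.days, now=datetime(2015, 11, 8, 22, 15))
print(args.user_id, args.summary, start.date(), end.date())
```

Defaulting the period to 7 days mirrors the "past week" behaviour described above. Passing `now` explicitly keeps the window calculation easy to test.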
#####Currently it parses the following information
- User data
  - name
  - birthday
  - photo
  - hometown
  - site
  - skype
  - phone
  - university
  - studied_at
  - wallposts
  - photos
  - videos
  - followers
  - communities
  - noteworthy_pages
  - current_city
  - info_1
  - info_2
  - info_3
- Activity
  - is_online
  - is_mobile
  - status
  - last_visit_lt_an_hour_ago
  - last_visit
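One way to detect the "user data updates" mentioned at the top of this README is to compare two snapshots of the fields above. The sketch below assumes each snapshot is a plain dict keyed by field name. `diff_snapshots` and the sample values are hypothetical and only illustrate the idea; the app's own storage and comparison logic may differ.

```python
def diff_snapshots(previous, current):
    """Return {field: (old, new)} for every field whose value changed."""
    changes = {}
    for field in sorted(set(previous) | set(current)):
        old, new = previous.get(field), current.get(field)
        if old != new:
            changes[field] = (old, new)
    return changes


# Illustrative values only, not real scraped data
before = {"name": "Alexey Dvorak", "wallposts": 120, "photo": "a.jpg"}
after = {"name": "Alexey Dvorak", "wallposts": 121, "photo": "b.jpg"}

for field, (old, new) in diff_snapshots(before, after).items():
    print("{}: {} -> {}".format(field, old, new))
```

Fields missing from one snapshot are compared against `None`, so newly filled or cleared fields are also reported as changes.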