Commit f1ba2ec

2.0.0.2
Site settings have been expanded, some functions and dependencies have been changed. Removed unused elements in UserDataBase, added additional xml fields, added error executor. Created a basic download function. Added Instagram saved posts and 429 bypass. Added channel statistics. Added site redgifs. Updated sites algorithms. Other improvements. Updated downloader algorithm.
1 parent 7da1ccf commit f1ba2ec

56 files changed (+4907 -1144 lines changed)

CONTRIBUTING.md

+13 -2
@@ -1,6 +1,6 @@
 # Contributor's Guide
 
-I welcome pull requests! Follow these steps to contribute:
+I welcome requests! Follow these steps to contribute:
 
 1. Find an [issue](https://github.com/AAndyProgram/SCrawler/issues) that needs assistance.
 2. Let me know you are working on it by posting a comment on the issue.
@@ -21,11 +21,22 @@ I welcome pull requests! Follow these steps to contribute:
 2. If you don't find anything, create a new issue with your request. I usually reply as soon as possible (within the next few hours).
 - If I'm interested in a site you want to add, it may be added in future releases.
 - If the site has an API that does not require authorization, it may be added in the coming releases.
-- You can make it faster by posting a link to the API. I don't use OAuth authentication in my application, so if it's not too hard to make a new parsing algorithm without OAuth authorization, I can start developing it in the coming days. Otherwise, I need time to figure out how to do it.
+- You can make it faster by posting a link to the API. **I don't use OAuth authentication** in my application, so if it's not too hard to make a new parsing algorithm **without OAuth** authorization, I can start developing it in the coming days. Otherwise, I need time to figure out how to do it.
 - If the site does not have an API that does not require authorization, this may take some time.
+- If you will be posting request urls **without OAuth** authentication, I might consider adding your site if I have time.
 - If I'm **not** interested in the site you want to add, you can pay to have it added by making a donation of approximately $10. **But before that, you still need to create an issue. If I'm not interested, you can offer me a deal to develop it for money. I'll check the site you want to add, check the availability of the API and tell you how much time I need to develop it and the price. If you agree, I will do it.** [![ko-fi](https://www.ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/andyprogram)
 
 
 # Sites I will never develop
 
 - Facebook
+
+# Sites requested by users
+
+- TikTok
+- API for receiving data without authorization was not found. Therefore, I don't have time to start developing this site parsing algorithm. If anyone knows of requests that may collect data without OAuth authentication, please let me know.
+
+# Contact me
+
+[Element messenger](https://element.io/): @andyprogram:matrix.org
+https://matrix.to/#/@andyprogram:matrix.org

Changelog.md

+22
@@ -1,3 +1,25 @@
+# 2.0.0.2
+
+**This is the last release that supports program settings of version 1.0.0.4 and lower. Compatibility of program settings with version 1.0.0.4 and lower will be removed in future releases. It is strongly recommended that you upgrade to this release before future releases. Otherwise, you will have to configure the program settings again. If your program version is 1.0.1.0 or higher, you should not pay attention to this message.**
+
+- Added
+- Tray icon
+- Close program to tray
+- Close confirmation dialog
+- **Separated thread for downloading Instagram profiles**
+- **Wait timers to bypass Instagram error "Too Many Requests" (429)**
+- **Downloading saved Instagram posts** *(requires a second InstaHash)*
+- Downloading saved posts (from Reddit and Instagram) form
+- Tray notification when download is complete (Instagram notification separate from other)
+- Downloading not downloaded Instagram posts when a 429 error is encountered and/or the user stops downloading
+- Separate progress bar for downloading Instagram profiles
+- Clear information about downloaded profiles of the current session in the "Download info form"
+- Increased the number of Instagram posts (from 12 to 50) received per request
+- Channels' statistics
+- **RedGisf profiles support**
+- Fixed
+- The program was showing incorrect information about the total numbers of images and videos downloaded when a Reddit user was created from a channel
+
 # 2.0.0.1
 
 - Added

README.md

+18 -5
@@ -1,6 +1,6 @@
 # Social networks crawler
 
-Program for downloading photo and video from Reddit, Twitter and Instagram
+A program to download photo and video from Reddit, Twitter, Instagram, [etc](#supported-sites).
 
 Do you like this program? Consider adding to my coffee fund by making a donation to show your support. :)
 
@@ -11,23 +11,30 @@ Do you like this program? Consider adding to my coffee fund by making a donation
 - Reddit images;
 - Reddit galleries of images;
 - Redgifs hosted videos (https://www.redgifs.com/);
-- Reddit hosted videos (downloading Reddit hosted video is going through ffmpeg);
+- Reddit hosted videos (downloading Reddit hosted video is going through ffmpeg (**ffmpeg only works with the x64 program**));
 - Twitter images and videos;
 - Instagram images and videos.
 - Parse [channel and view data](https://github.com/AAndyProgram/SCrawler/wiki/Channels).
-- Download [saved Reddit posts](https://github.com/AAndyProgram/SCrawler/wiki/Home#saved-posts).
+- Download [saved Reddit and Instagram posts](https://github.com/AAndyProgram/SCrawler/wiki/Home#saved-posts).
 - Add users from parsed channel.
 - Labeling users.
 - Filter exists users by label or group.
 - Selection of media types you want to download (images only, videos only, both)
 
+# Supported sites
+
+- Reddit
+- Twitter
+- Instagram
+- RedGifs
+
 # How does it works:
 
 ## Reddit
 
 The program parses all user posts, obtain MD5 images hash and compares them with existing ones to remove duplicates. Then the media will be downloaded.
 
-## Twitter and Instagram
+## Other sites
 
 The program parses all user posts and compares file names with existing ones to remove duplicates. Then the media will be downloaded.
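
Note on the two duplicate checks described in this README hunk (MD5 hashes for Reddit, file names for other sites): the idea can be sketched in a few lines of Python. SCrawler itself is written in VB.NET, and the names `md5_of`, `dedupe_by_hash`, and `dedupe_by_name` are hypothetical. This is only an illustration of the approach under those assumptions, not the program's actual code.

```python
import hashlib


def md5_of(data: bytes) -> str:
    """Return the hex MD5 digest of raw media bytes."""
    return hashlib.md5(data).hexdigest()


def dedupe_by_hash(candidates, known_hashes):
    """Keep (name, data) items whose MD5 digest has not been seen yet."""
    seen = set(known_hashes)
    fresh = []
    for name, data in candidates:
        digest = md5_of(data)
        if digest not in seen:
            seen.add(digest)
            fresh.append((name, data))
    return fresh


def dedupe_by_name(urls, known_names):
    """Keep URLs whose file name (last path segment) is not already known."""
    seen = set(known_names)
    fresh = []
    for url in urls:
        # Drop any query string, then take the last path segment.
        name = url.split("?", 1)[0].rsplit("/", 1)[-1]
        if name not in seen:
            seen.add(name)
            fresh.append(url)
    return fresh
```

The hash check catches the same image posted under different names, while the name check is cheaper because nothing has to be hashed before deciding whether to download.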
 
@@ -42,7 +49,7 @@ Read [here](https://github.com/AAndyProgram/SCrawler/blob/main/CONTRIBUTING.md#h
 - Windows 7, 8, 9, 10, 11 with NET Framework 4.6.1 or higher
 - Authorization cookies and tokens for Twitter (if you want to download data from Twitter)
 - Authorization cookies Instagram (if you want to download data from Instagram)
-- ffmpeg library for downloading videos hosted on Reddit (you can download it from the [official repo](https://github.com/GyanD/codexffmpeg/releases/tag/2021-01-12-git-ca21cb1e36) or [from my first release](https://github.com/AAndyProgram/SCrawler/releases/download/1.0.0.0/ffmpeg.zip))
+- ffmpeg library for downloading videos hosted on Reddit (you can download it from the [official repo](https://github.com/GyanD/codexffmpeg/releases/tag/2021-01-12-git-ca21cb1e36) or [from my first release](https://github.com/AAndyProgram/SCrawler/releases/download/1.0.0.0/ffmpeg.zip)). **ffmpeg only works with the x64 version of the program.**
 - **Don't put program in the ```Program Files``` system folder (this is portable program and program settings are stored in the program folder)**
 - **Just unzip the program archive to any folder, copy the file ```ffmpeg.exe``` into it and enjoy. :)**
 
@@ -69,6 +76,7 @@ You can add users by patterns:
 - https://twitter.com/SomeUserName
 - https://reddit.com/user/SomeUserName
 - https://reddit.com/r/SomeSubredditName
+- https://www.redgifs.com/users/SomeUserName
 - u/SomeUserName
 - r/SomeSubredditName
 - SomeUserName (in this case, you need to select the user's site)
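
Note on the patterns visible in this hunk: one way to recognize them is an ordered list of regular expressions, sketched here in Python. The rule table, the `classify` function, and the assumption that bare `u/` and `r/` prefixes mean Reddit are illustrative guesses. The hunk starts mid-list, so earlier patterns are not covered, and SCrawler's own VB.NET parser may work differently.

```python
import re

# Ordered (regex, site, kind) rules for the patterns shown in the hunk.
RULES = [
    (re.compile(r"^https?://(?:www\.)?twitter\.com/([^/?#]+)/?$"), "Twitter", "user"),
    (re.compile(r"^https?://(?:www\.)?reddit\.com/user/([^/?#]+)/?$"), "Reddit", "user"),
    (re.compile(r"^https?://(?:www\.)?reddit\.com/r/([^/?#]+)/?$"), "Reddit", "subreddit"),
    (re.compile(r"^https?://(?:www\.)?redgifs\.com/users/([^/?#]+)/?$"), "RedGifs", "user"),
    (re.compile(r"^u/([^/?#]+)$"), "Reddit", "user"),
    (re.compile(r"^r/([^/?#]+)$"), "Reddit", "subreddit"),
]


def classify(text):
    """Return (site, kind, name); site is None when the user must pick it."""
    text = text.strip()
    for pattern, site, kind in RULES:
        match = pattern.match(text)
        if match:
            return site, kind, match.group(1)
    # A bare name carries no site information.
    return None, "user", text
```

For example, `classify("https://www.redgifs.com/users/SomeUserName")` yields the RedGifs site with the user name, while a bare `SomeUserName` comes back with no site, matching the README's remark that the site then has to be selected by hand.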
@@ -83,3 +91,8 @@ Read more about adding users and subreddits [here](https://github.com/AAndyProgr
 Create a shortcut for the program. Open shortcut properties. In the ```Shortcut``` tab, in the ```Target``` field, just add the letter ```v``` at the end across the space.
 
 Example: ```D:\Programs\SCrawler\SCrawler.exe v```
+
+# Contact me
+
+[Element messenger](https://element.io/): @andyprogram:matrix.org
+https://matrix.to/#/@andyprogram:matrix.org

SCrawler.sln

-5
@@ -11,13 +11,8 @@ Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "Solution Items", "Solution
     ProjectSection(SolutionItems) = preProject
         .gitignore = .gitignore
         Changelog.md = Changelog.md
-        Info\InstaAlgo.txt = Info\InstaAlgo.txt
-        Info\InstagramInfo.txt = Info\InstagramInfo.txt
         README.md = README.md
-        Info\RedditUrlsInfo.txt = Info\RedditUrlsInfo.txt
         ToDo.txt = ToDo.txt
-        Info\TwitterNewAlgo.txt = Info\TwitterNewAlgo.txt
-        Info\TwitterUrlsInfo.txt = Info\TwitterUrlsInfo.txt
     EndProjectSection
 EndProject
 Global

SCrawler/API/Base/SiteSettings.vb

+38 -9
@@ -25,7 +25,9 @@ Namespace API.Base
                 _Path.Value = NewFile
             End Set
         End Property
+#Region "Instagram"
         Friend ReadOnly Property InstaHash As XMLValue(Of String)
+        Friend ReadOnly Property InstaHash_SP As XMLValue(Of String)
         Friend ReadOnly Property InstaHashUpdateRequired As XMLValue(Of Boolean)
         Friend ReadOnly Property InstagramDownloadingErrorDate As XMLValue(Of Date)
         Friend Property InstagramLastApplyingValue As Integer? = Nothing
@@ -40,7 +42,18 @@ Namespace API.Base
                 End With
             End Get
         End Property
-        Friend Property InstagramTooManyRequestsReadyForCatch As Boolean = True
+        Friend ReadOnly Property InstagramLastDownloadDate As XMLValue(Of Date)
+        Friend ReadOnly Property InstagramLastRequestsCount As XMLValue(Of Integer)
+        Private InstagramTooManyRequestsReadyForCatch As Boolean = True
+        Friend Function GetInstaWaitDate() As Date
+            With InstagramDownloadingErrorDate
+                If .ValueF.Exists Then
+                    Return .ValueF.Value.AddMinutes(If(InstagramLastApplyingValue, 10))
+                Else
+                    Return Now
+                End If
+            End With
+        End Function
         Friend Sub InstagramTooManyRequests(ByVal Catched As Boolean)
             With InstagramDownloadingErrorDate
                 If Catched Then
@@ -55,9 +68,14 @@ Namespace API.Base
                 Else
                     .ValueF = Nothing
                     InstagramLastApplyingValue = Nothing
+                    InstagramTooManyRequestsReadyForCatch = True
                 End If
             End With
         End Sub
+        Friend ReadOnly Property RequestsWaitTimer As XMLValue(Of Integer)
+        Friend ReadOnly Property RequestsWaitTimerTaskCount As XMLValue(Of Integer)
+        Friend ReadOnly Property SleepTimerOnPostsLimit As XMLValue(Of Integer)
+#End Region
         Friend ReadOnly Property Temporary As XMLValue(Of Boolean)
         Friend ReadOnly Property DownloadImages As XMLValue(Of Boolean)
         Friend ReadOnly Property DownloadVideos As XMLValue(Of Boolean)
@@ -98,6 +116,7 @@ Namespace API.Base
                         Responser.CookiesDomain = "reddit.com"
                         Responser.Decoders.Add(SymbolsConverter.Converters.Unicode)
                     Case Sites.Instagram : Responser.CookiesDomain = "instagram.com"
+                    Case Sites.RedGifs : Responser.CookiesDomain = "redgifs.com"
                 End Select
                 Responser.SaveSettings()
             End If
@@ -126,20 +145,30 @@ Namespace API.Base
                 GetUserMediaOnly = New XMLValue(Of Boolean)
             End If
 
+            CreateProp(InstaHashUpdateRequired, Sites.Instagram, "InstaHashUpdateRequired", True, _XML, n)
+            CreateProp(InstaHash, Sites.Instagram, "InstaHash", String.Empty, _XML, n)
+            If Site = Sites.Instagram AndAlso (InstaHash.IsEmptyString Or InstaHashUpdateRequired) AndAlso Responser.Cookies.ListExists Then GatherInstaHash()
+            CreateProp(InstaHash_SP, Sites.Instagram, "InstaHashSavedPosts", String.Empty, _XML, n)
+            CreateProp(InstagramLastDownloadDate, Sites.Instagram, "LastDownloadDate", Now.AddDays(-1), _XML, n)
+            CreateProp(InstagramLastRequestsCount, Sites.Instagram, "LastRequestsCount", 0, _XML, n)
+            CreateProp(RequestsWaitTimer, Sites.Instagram, "RequestsWaitTimer", 1000, _XML, n)
+            CreateProp(RequestsWaitTimerTaskCount, Sites.Instagram, "RequestsWaitTimerTaskCount", 1, _XML, n)
+            CreateProp(SleepTimerOnPostsLimit, Sites.Instagram, "SleepTimerOnPostsLimit", 60000, _XML, n)
             If Site = Sites.Instagram Then
-                InstaHash = New XMLValue(Of String)("InstaHash", String.Empty, _XML, n)
-                InstaHashUpdateRequired = New XMLValue(Of Boolean)("InstaHashUpdateRequired", True, _XML, n)
-                If (InstaHash.IsEmptyString Or InstaHashUpdateRequired) And Responser.Cookies.ListExists Then GatherInstaHash()
                 InstagramDownloadingErrorDate = New XMLValue(Of Date) With {.ToStringFunction = Function(ss, vv) AConvert(Of String)(vv, Nothing)}
                 InstagramDownloadingErrorDate.SetExtended("InstagramDownloadingErrorDate", Now.AddYears(-10), _XML, n)
             Else
-                InstaHash = New XMLValue(Of String)
-                InstaHashUpdateRequired = New XMLValue(Of Boolean)
+                InstagramDownloadingErrorDate = New XMLValue(Of Date)
             End If
-            If Site = Sites.Reddit Then
-                SavedPostsUserName = New XMLValue(Of String)("SavedPostsUserName", String.Empty, _XML, n)
+
+            SavedPostsUserName = New XMLValue(Of String)("SavedPostsUserName", String.Empty, _XML, n)
+        End Sub
+        Private Sub CreateProp(Of T)(ByRef p As XMLValue(Of T), ByVal s As Sites,
+                                     ByVal p_Name As String, ByVal p_Value As T, ByRef x As XmlFile, ByVal n() As String)
+            If Site = s Then
+                p = New XMLValue(Of T)(p_Name, p_Value, x, n)
             Else
-                SavedPostsUserName = New XMLValue(Of String)
+                p = New XMLValue(Of T)
             End If
         End Sub
         Friend Sub Update()
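
Note on `GetInstaWaitDate` from the hunks above: its logic reads as "stored 429 error date plus the last applied delay in minutes (10 if none), otherwise now". A minimal Python sketch of that rule follows. It assumes `.ValueF.Exists` means an error date is stored, and `seconds_to_wait` is a hypothetical helper that does not appear in the diff.

```python
from datetime import datetime, timedelta
from typing import Optional


def get_insta_wait_date(error_date: Optional[datetime],
                        last_applying_minutes: Optional[int],
                        now: Optional[datetime] = None) -> datetime:
    """Earliest moment at which Instagram requests may resume."""
    if now is None:
        now = datetime.now()
    if error_date is not None:
        # Mirrors If(InstagramLastApplyingValue, 10): fall back to 10 minutes.
        minutes = 10 if last_applying_minutes is None else last_applying_minutes
        return error_date + timedelta(minutes=minutes)
    # No stored error: nothing to wait for.
    return now


def seconds_to_wait(wait_date: datetime, now: datetime) -> float:
    """Non-negative number of seconds left until wait_date (hypothetical helper)."""
    return max(0.0, (wait_date - now).total_seconds())
```

A caller could compare the returned date with the current time and sleep for `seconds_to_wait(...)` before the next request. How SCrawler actually consumes the value is not visible in these hunks.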

SCrawler/API/Base/Structures.vb

+16
@@ -85,5 +85,21 @@ Namespace API.Base
                 Return v
             End Function
         End Structure
+        Friend Structure Sizes : Implements IComparable(Of Sizes)
+            Friend Value As Integer
+            Friend Data As String
+            Friend ReadOnly HasError As Boolean
+            Friend Sub New(ByVal _Value As String, ByVal _Data As String)
+                Try
+                    Value = _Value
+                    Data = _Data
+                Catch ex As Exception
+                    HasError = True
+                End Try
+            End Sub
+            Friend Function CompareTo(ByVal Other As Sizes) As Integer Implements IComparable(Of Sizes).CompareTo
+                Return Value.CompareTo(Other.Value) * -1
+            End Function
+        End Structure
     End Module
 End Namespace
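
Note on the `Sizes` structure added above: multiplying `CompareTo` by -1 inverts the natural order, so sorting a list of `Sizes` puts the largest `Value` first. The Python sketch below reproduces that behaviour. Treating `value` as a numeric size and `data` as the matching media URL is an assumption, and `parse_size` and `largest_first` are hypothetical names.

```python
from functools import cmp_to_key
from typing import NamedTuple, Optional


class Size(NamedTuple):
    value: int  # assumed: a numeric size such as width or height
    data: str   # assumed: the media URL for that size


def parse_size(value: str, data: str) -> Optional[Size]:
    """Build a Size from strings; return None when value is not an integer."""
    try:
        return Size(int(value), data)
    except ValueError:
        return None


def compare(a: Size, b: Size) -> int:
    """Inverted comparison, like Value.CompareTo(Other.Value) * -1."""
    natural = (a.value > b.value) - (a.value < b.value)
    return -natural


def largest_first(sizes):
    """Sort so that the biggest value comes first."""
    return sorted(sizes, key=cmp_to_key(compare))
```

With this ordering the first element after sorting is the highest-resolution candidate, which is presumably why the comparison is inverted in the commit.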
