`Download_Mode` for `File`, `S3_File` and `Enso_File` #12017
Changes from 12 commits
@@ -0,0 +1,27 @@
import project.Data.Time.Date_Time.Date_Time
import project.Data.Time.Duration.Duration
import project.Nothing.Nothing
import project.System.File.Generic.Writable_File.Writable_File
from project.Data.Boolean import Boolean, False, True

type Download_Mode
    ## Download the file if it does not already exist on disk.
    If_Not_Exists

    ## Download the file if it is older than the specified age.
    If_Older_Than age:Duration

    ## Always download.
    Always

    ## PRIVATE
       Determine if a file should be downloaded, based on the file type,
       download mode, and file age.
    should_download self (file:Writable_File) -> Boolean =
        case self of
            Download_Mode.If_Not_Exists ->
                file.file.exists.not
            Download_Mode.If_Older_Than age ->
                file.file.exists.not || file.file.creation_time < (Date_Time.now - age)
            Download_Mode.Always ->
                True

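For readers less familiar with Enso, the decision made by `should_download` can be modelled roughly as follows. This is only a Python sketch of the logic in the diff, not part of the PR: the class names, the `creation_time`/`now` parameters, and the use of `None` to mean "file does not exist" are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class IfNotExists:
    """Download only when the destination file is missing."""


@dataclass(frozen=True)
class IfOlderThan:
    """Download when the file is missing or older than `age`."""
    age: timedelta


@dataclass(frozen=True)
class Always:
    """Download unconditionally."""


def should_download(mode, creation_time: Optional[datetime], now: datetime) -> bool:
    """Hypothetical model: `creation_time` is None when the file does not exist."""
    exists = creation_time is not None
    if isinstance(mode, IfNotExists):
        return not exists
    if isinstance(mode, IfOlderThan):
        # Same strict comparison as the Enso code: creation_time < now - age.
        return (not exists) or creation_time < now - mode.age
    if isinstance(mode, Always):
        return True
    raise TypeError(f"unknown download mode: {mode!r}")
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the sketch deterministic; the Enso code in the diff calls `Date_Time.now` directly.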
@@ -0,0 +1,58 @@
from Standard.Base import all

from Standard.Test import all
import Standard.Test.Test_Environment

from enso_dev.Base_Tests.Network.Http.Http_Test_Setup import base_url_with_slash, pending_has_url

polyglot java import java.lang.Thread

with_test_file f ~action =
    f.delete_if_exists
    Panic.with_finalizer (f.delete_if_exists) <|
        action f

## file_maker should take a path component and return a full file path, for example:
   "if_not_exist"
   =>
   (enso_project.data / "transient" / "if_not_exist.txt")
add_specs prefix suite_builder file_maker =
    suite_builder.group prefix+"Download Mode" pending=pending_has_url group_builder->
        url_n_bytes n = base_url_with_slash+'test_download?length='+n.to_text

        group_builder.specify prefix+"Will always download a file for mode Always" <|
            with_test_file (file_maker "always") file->
                file.exists . should_be_false
                Data.download (url_n_bytes 10) mode=..Always file
                first_contents = file.read
                Data.download (url_n_bytes 11) mode=..Always file
                second_contents = file.read
                first_contents . should_not_equal second_contents

        group_builder.specify prefix+"Will download a file if it does not exist for default mode If_Not_Exists" <|
            with_test_file (file_maker "if_not_exist") file->
                file.exists . should_be_false
                Data.download (url_n_bytes 10) file
                first_contents = file.read
                Data.download (url_n_bytes 11) file
                second_contents = file.read
                first_contents . should_equal second_contents

        group_builder.specify prefix+"Will download a file if it is older than a specified duration for mode If_Older_Than" <|
            with_test_file (file_maker "if_older_than") file->
                sleep_duration_secs = 3

                file.exists . should_be_false

                Data.download (url_n_bytes 10) file
                first_contents = file.read

                Data.download (url_n_bytes 11) (mode=..If_Older_Than (Duration.new seconds=sleep_duration_secs)) file
                second_contents = file.read
                first_contents . should_equal second_contents

                Thread.sleep (sleep_duration_secs * 1000)

I'm not sure if this changes anything, but my intuition suggests making the sleep duration a little bit larger than the duration in `If_Older_Than`, e.g. use `If_Older_Than` 3 seconds but sleep for 3.5 seconds. If you use the exact same amount, I imagine some small fluctuation in how the times are computed could make the test fail; if you add a little bit of offset, it feels like it should decrease the chances of this randomly failing. Although I do admit this is purely based on intuition.

Added a 0.5 -- and also, this test has retries since it will inevitably fail.

                Data.download (url_n_bytes 12) (mode=..If_Older_Than (Duration.new seconds=sleep_duration_secs)) file
                third_contents = file.read
                first_contents . should_not_equal third_contents

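The reviewer's concern about sleeping for exactly the `If_Older_Than` duration can be illustrated with a small Python sketch. It assumes only the strict `<` comparison shown in the `should_download` diff; the helper name `is_stale` and the fixed timestamps are hypothetical, chosen to keep the example deterministic.

```python
from datetime import datetime, timedelta


def is_stale(creation_time: datetime, now: datetime, max_age: timedelta) -> bool:
    # Mirrors the strict comparison in the diff: creation_time < now - age.
    return creation_time < now - max_age


created = datetime(2025, 1, 1, 12, 0, 0)
max_age = timedelta(seconds=3)

# Checking exactly 3 seconds after creation: the file is not yet "older than" 3 seconds.
at_threshold = is_stale(created, created + timedelta(seconds=3), max_age)

# Checking 3.5 seconds after creation: the margin puts the file clearly past the threshold.
with_margin = is_stale(created, created + timedelta(seconds=3.5), max_age)

print(at_threshold, with_margin)  # False True
```

In a real test run the elapsed time is rarely exactly equal to the threshold, but a margin keeps the outcome from depending on small differences in how the two timestamps are obtained.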
API suggestion: what if we rename `mode` into `replace_existing`? IMO it will be much clearer for the user.

If I do:

tbh I'd still expect the second expression to download. That is because I am now downloading a different file to that destination. So while the destination exists, as a user I would expect it to be overwritten, because I've changed the URL. For example, I was working with a report from June, relying on the cache to only download it once, but now I want to start working with reports from July. I change the URL and expect the file to get redownloaded even if it exists, because I expect the data to be new. I even reset caches and am confused why I'm still seeing June data in the file.

I understand that this is not what this was designed for, but I think the above is a likely user scenario.

Now, if we rename the parameter to `replace_existing`, the code reads as:

Now it is obvious to me that the second statement will do nothing if the first one succeeded, because the file is redownloaded only if it didn't exist in the first place (regardless of the URL). And now those semantics (what is currently implemented) are completely clear when reading the calls.

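The June/July scenario from the comment above can be modelled in a few lines of Python. This is a sketch of the `If_Not_Exists` semantics as described in the thread, not the Enso implementation: the `download` and `fake_fetch` functions and the example URLs are hypothetical stand-ins.

```python
import tempfile
from pathlib import Path


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP request; returns content derived from the URL.
    return f"contents of {url}"


def download(url: str, destination: Path, fetch=fake_fetch) -> Path:
    """Model of If_Not_Exists: only the destination is checked, never the URL."""
    if not destination.exists():
        destination.write_text(fetch(url))
    return destination


with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp) / "report.csv"
    download("https://example.com/june.csv", dest)
    # Different URL, same destination: nothing is fetched because the file exists.
    download("https://example.com/july.csv", dest)
    result = dest.read_text()

print(result)  # contents of https://example.com/june.csv
```

Under these semantics the second call leaves the June data in place, which is the behavior the reviewer argues a name like `replace_existing` makes easier to anticipate.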
And regardless of the parameter name, we need to update the method documentation to include it and ideally describe what the expected semantics are.

Given that we rely on the 'refresh' button to clear caches in many cases, we should probably add a note that this `download` method works only based on file existence/age, so the refresh button does not affect it.

As a user I might expect the refresh button to ensure the file is redownloaded (whether it should work this way is up for discussion; I think the current semantics are OK), but with the current semantics the refresh button just does nothing for `download`. I think it would be good if the documentation mentioned that, so that users know what to expect and can see that this is expected and not a bug if they get confused seeing that the refresh button does nothing.

Renamed to `replace_existing`, and added documentation.

I am not sure how the refresh button interacts with the `Always` option, and I cannot run the front end to find out, so I did not mention that.

With `Always` we always redownload the file, right? How could the refresh button interfere with that?

What I mean is, I assume that refresh will also cause a download.