This repository has been archived by the owner on Jan 2, 2021. It is now read-only.

Seeds option #33

Open
wants to merge 1 commit into
base: master


dribnet
Contributor

@dribnet dribnet commented Nov 1, 2016

This is a more radical follow-up to #12. This adds an option where all seed images come from disk instead of being computed on the fly. The assumption is that the seeds and training set have the same directory structure and filenames, though a filter was added to allow for some missing files.
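As an illustration of the directory assumption described above, here is a minimal sketch of how training images could be paired with seeds of the same relative path, skipping any image whose seed is missing. The function name `pair_images_with_seeds` and its parameters are hypothetical and are not taken from the pull request's code.

```python
import os

def pair_images_with_seeds(train_dir, seed_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted (image_path, seed_path) pairs for files present in both trees.

    Hypothetical helper: assumes seed_dir mirrors the layout of train_dir.
    """
    pairs = []
    for root, _dirs, files in os.walk(train_dir):
        for name in files:
            if not name.lower().endswith(extensions):
                continue  # ignore non-image files
            image_path = os.path.join(root, name)
            relative = os.path.relpath(image_path, train_dir)
            seed_path = os.path.join(seed_dir, relative)
            # The "filter": keep only images that have a matching seed on disk.
            if os.path.isfile(seed_path):
                pairs.append((image_path, seed_path))
    pairs.sort()
    return pairs
```

Sorting the pairs gives a stable order that does not depend on how the filesystem happens to list the files.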

The rationale is that this allows the same codebase to be used for more types of reconstruction. For example, the seeds could be a set of images compressed in JPEG format, which would teach the network to also correct for compression artefacts. This opens up some powerful and more general-purpose ways of using the same enhance code.

I believe I have made this option backwards compatible by making the previous behaviour the fallback when the seeds option is not given explicitly. However, this might warrant more careful review as this alters the shuffling and data-augmentation routines to carefully match the operations on the images and seeds. That said, I've been using this code for a day or two and it seems to be training well and I've seen no discrepancies in the diagnostic valid images.
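To illustrate why the shuffling and augmentation have to be matched, here is a minimal sketch, not the code in this pull request. It assumes images are simple lists of pixel rows, and the names `shuffle_together` and `augment_pair` are hypothetical. The point is that one random decision is applied to both the image and its seed.

```python
import random

def shuffle_together(images, seeds, rng):
    """Shuffle both lists with the same permutation so pairs stay aligned."""
    order = list(range(len(images)))
    rng.shuffle(order)
    return [images[i] for i in order], [seeds[i] for i in order]

def augment_pair(image, seed, rng):
    """Apply the same random horizontal flip to an image and its seed.

    Assumption: each image is a list of rows, each row a list of pixel values.
    """
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]
        seed = [row[::-1] for row in seed]
    return image, seed
```

If the two lists were shuffled or flipped independently, the network would be trained on mismatched input and target pairs.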

@alexjc
Owner

alexjc commented Nov 1, 2016

Great idea! I can see what you intend this for, it's useful. You're right this will need more review, so I want to merge the 0.2 branch with some training improvements first.

@dribnet
Contributor Author

dribnet commented Nov 6, 2016

(this has been updated to be compatible with the now merged 0.2 branch)

@dribnet
Contributor Author

dribnet commented Nov 8, 2016

I am willing to make this compatible with 0.3, but am going to wait until there is some feedback indicating this is a desired feature in the main release.

@alexjc
Owner

alexjc commented Nov 8, 2016

Currently my focus is on making a new release that works reliably for all image types (with a variety of user options for models), but I've been thinking a lot about how to integrate your changes.

I'm wondering if having a plugin approach would help. So you could specify --plugin=smilevector and it would load smilevector.py and run the code there to load or pre-process images. This kind of customization would be very useful for me on another project, but I don't necessarily want all features in the base file (e.g. calling sort with a 1M dataset doesn't work ;-)

What do you think?

@dribnet
Contributor Author

dribnet commented Nov 11, 2016

OK, without pulling my punches here are my thoughts:

  • the hidden power of this library is that it can potentially learn to map between any two image datasets, though it is streamlined for the case of super-resolution and (now) correcting jpeg artifacts. --seeds exposes this full power of the tool to power users.
  • the built-in options you've added are nice (eg: --train-jpeg, --train-blur), but I can easily replicate most of them with the --seeds option and a three-line ImageMagick script. Also, they have hard-coded distribution and parameter assumptions built in that are not adjustable.
  • sorting 1M strings is pretty trivial. I run --seeds routinely on 200k datasets.
  • as written, --seeds is compatible with most other options you've included (--train-jpeg, --train-noise, etc). It simply replaces the first step of using a resized version of the original as the basis of the seed.
  • the plugin option is interesting, but I think using the filesystem is simpler. It is also much more efficient because the seeds themselves are implicitly cached on disk and reusable.

That said, I certainly appreciate that this is your library and I am admittedly somewhat of a fringe user at this point. So I certainly understand if you want to for now focus on the primary use cases, and I don't mind splicing in this functionality from time to time when I need it. Just let me know if / when you'd like to have a version for testing and I'm happy to update this pull request.
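The point above about seeds being cached on disk and reusable can be sketched as follows. This is an illustration under stated assumptions, not part of the pull request. `build_seed_cache` is a hypothetical name, `seed_dir` is assumed to lie outside `train_dir`, and `transform` stands in for whatever produces the degraded image (for example an ImageMagick call or an image library). It works on raw bytes here only to keep the sketch free of dependencies.

```python
import os

def build_seed_cache(train_dir, seed_dir, transform):
    """Write transform(original_bytes) to a mirrored path under seed_dir.

    Seeds that already exist are skipped, so a second run does no work.
    Returns the number of seeds written. Assumes seed_dir is outside train_dir.
    """
    written = 0
    for root, _dirs, files in os.walk(train_dir):
        for name in sorted(files):
            source = os.path.join(root, name)
            target = os.path.join(seed_dir, os.path.relpath(source, train_dir))
            if os.path.exists(target):
                continue  # already cached from an earlier run
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(source, "rb") as src:
                data = src.read()
            with open(target, "wb") as dst:
                dst.write(transform(data))
            written += 1
    return written
```

Because the seeds are generated once and stored, the cost of the transformation is paid a single time rather than on every training epoch. The trade-off is that each image has one fixed seed, with no fresh randomness per epoch.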

@alexjc
Owner

alexjc commented Nov 22, 2016

I added plugin support in [cb55929]. It requires Python 3.5. You create a file in plugins/ and provide three functions: one to list the files, one to load the original image, and one to produce the seed (either load it or modify the original). Try the example with --train-plugin=simple and put files in data/*.jpg.

I think this will allow most of the custom things you need, while still allowing the seeds to be procedurally generated, since some transformations are best done at runtime (like noise/dropout).
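For illustration, a plugin along these lines might look roughly like the sketch below. The function names, the `seeds` directory, and the use of raw bytes instead of decoded images are all assumptions made to keep the example self-contained. The actual interface is defined by the commit referenced above and may differ.

```python
import glob
import os

# Hypothetical plugin module, e.g. a file placed in plugins/.
# The three function names are assumptions, not the project's actual interface.

DATA_PATTERN = "data/*.jpg"  # where the originals live (from the comment above)
SEED_DIR = "seeds"           # assumed directory holding pre-made seeds

def list_files(pattern=DATA_PATTERN):
    """List the training files to iterate over."""
    return sorted(glob.glob(pattern))

def load_original(filename):
    """Load the target image (raw bytes here, to avoid an imaging dependency)."""
    with open(filename, "rb") as handle:
        return handle.read()

def load_seed(filename, original, seed_dir=SEED_DIR):
    """Load a matching seed from disk if one exists, else derive it from the original."""
    candidate = os.path.join(seed_dir, os.path.basename(filename))
    if os.path.isfile(candidate):
        with open(candidate, "rb") as handle:
            return handle.read()
    # Placeholder for a procedural transformation (noise, blur, ...) done at runtime.
    return original
```

A plugin shaped like this could cover both cases discussed in the thread: seeds read from disk when they exist, and seeds generated at runtime otherwise.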


2 participants