Alpha 1 #2 (Open)

wants to merge 42 commits into master

Conversation

haydenjackson01 (Contributor)

The render service is prepped and ready for local development, which means others should be able to run it locally and test their own additions. This is a huge PR and is best viewed on your own machine; take your time and leave lots of comments, but most attention should be on the /renderservice folder.

Current known issues:

  • Images are not scaled correctly in output PDFs.
  • More documentation is needed for top-level elements, e.g. the dev folder.
  • The code isn't the nicest due to all the constraints and needs work.

How do I test this?
Aside from normal methods such as unit tests, I have added a script in the dev folder that acts as a client on the queue service, allowing developers to create tasks and retrieve output.

Some examples of usage would be:

  • python queue_client.py add, which adds a task.
  • python queue_client.py list, which lists up to 100 tasks.
  • python queue_client.py document, which saves out any documents found in the task queue.

Since all queue data is stored in the Redis server, the command redis-cli will log in and allow the user to interact with the data using commands such as keys * to list all keys, get <key> to view the contents of a string key, and zrange <key> 0 -1 to list all the items in a sorted-set key. More commands can be found at https://redis.io/commands.
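If you prefer scripting the inspection, the redis-py client exposes the same commands; a quick sketch (the key names here are made up for illustration):

import redis

r = redis.Redis(host="localhost", port=6379)
print(r.keys("*"))                            # list all keys
print(r.get("example-string-key"))            # contents of a string key
print(r.zrange("example-sorted-set", 0, -1))  # all items in a sorted-set key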

Since the render service runs many programs at once, the Docker logs alone may not be very informative (i.e. docker-compose logs render or ./render). Other, more informative logs are available by opening a shell (./render dev shell or docker-compose exec render bash) and using cat and tail to view their contents. The render daemon logs are found in /renderservice/logs/, while the web-server access log is found at /renderservice/access.log.

- Added configuration files for travis, flake8, etc.
- Added script for handling docker interactions.
- Queue service now correctly increments retry_count after leasing.
- Render service now correctly logs errors during generation.
- Fixed typos in variable names.

codecov bot commented Sep 18, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@9ad16c1).
The diff coverage is 91.03%.


@@            Coverage Diff            @@
##             master       #2   +/-   ##
=========================================
  Coverage          ?   91.04%           
=========================================
  Files             ?       52           
  Lines             ?     2500           
  Branches          ?      225           
=========================================
  Hits              ?     2276           
  Misses            ?      197           
  Partials          ?       27
Impacted Files Coverage Δ
renderservice/render/webserver/__init__.py 100% <ø> (ø)
renderservice/render/daemon/RenderDaemon.py 0% <0%> (ø)
renderservice/render/daemon/__main__.py 0% <0%> (ø)
renderservice/render/webserver/wsgi.py 0% <0%> (ø)
...erservice/render/tests/daemon/test_file_manager.py 100% <100%> (ø)
...ervice/render/tests/resources/test_modulo_clock.py 100% <100%> (ø)
...er/tests/resources/test_barcode_checksum_poster.py 100% <100%> (ø)
renderservice/render/resources/job_badges.py 100% <100%> (ø)
...rservice/render/tests/resources/test_job_badges.py 100% <100%> (ø)
renderservice/render/resources/arrows.py 100% <100%> (ø)
... and 42 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9ad16c1...07371c6.

The first major step in releasing an open source version of CS Unplugged.
While some existing content from the classic version of CS Unplugged has yet
to be adapted into the new format and system, we are releasing this version as
a sneak peek for teachers.


copy/paste leftovers :P

haydenjackson01 (Author):

Mmmm copy pasta.

the directory is a mounted bucket.

Since files with the same filepath could belong to different
directories, only the first one found is used, based on the order


files with the same filepath could belong to different directories?

haydenjackson01 (Author):

The render system has two locations for static files: a local copy (for resources we don't distribute, and for speed) and the cloud bucket system for all other static files. Since the bucket is mounted as another folder in the file-system, this class enables the use of both these locations as a single entity.

self.directories = tuple(args)
self.save_directory = kwargs.get("save_directory", None)

@cachedmethod(lambda cls: type(cls).__cache)


what does this magic do?

haydenjackson01 (Author):

This caches the result of the method based on the input arguments. Useful when retrieving things on an external server that are used frequently.
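A minimal sketch of the pattern, assuming the decorator is cachetools.cachedmethod with a cache shared at class level (the class and method names here are illustrative):

from cachetools import LRUCache, cachedmethod

class FileLoader:
    __cache = LRUCache(maxsize=256)  # shared by all instances

    @cachedmethod(lambda self: type(self).__cache)
    def get_filepath(self, filename):
        # Pretend this is an expensive lookup (e.g. hitting a mounted
        # bucket); it runs once per distinct filename, and later calls
        # are served from the cache.
        return "/static/" + filename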

"""Get the full filepath for a given resource.

Finds the first file that matches the filepath (this means
if a folder matches first it is not returned.)


this seems terrible?

haydenjackson01 (Author):

I hope the previous comments clarified why this is necessary. If not: since there are multiple folders being treated as a single filesystem, this is necessary because there is a chance that there may be similarly named files in the same apparent location. The class allows ordering of the filesystems to determine which to favour in these situations.
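In other words, lookup is a first-match search over an ordered tuple of root directories. A minimal sketch of the idea (class and method names are illustrative, not the real API):

import os

class LayeredFileSystem:
    """Treat several directories as one, in priority order."""

    def __init__(self, *directories):
        self.directories = tuple(directories)

    def find(self, filepath):
        # Earlier directories win when the same relative path
        # exists in more than one of them.
        for directory in self.directories:
            candidate = os.path.join(directory, filepath)
            if os.path.isfile(candidate):
                return candidate
        raise FileNotFoundError(filepath)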

Tasks will be run to receive a result, saved if necessary
then deleted. Tasks that return a None result or are
interrupted by an exception will not be deleted and have
their lease reduced for another daemon. Tasks that have


what does this mean, "have their lease reduced for another daemon" ?

haydenjackson01 (Author):

A task is leased for a specified time from the queue (and remains in the queue) by a daemon. Here, this means that a daemon is telling the queue that it is finished with its lease early (but the task should not be removed) so that the queue may lease it out to another daemon. This happens when something goes wrong, like going over the allowed time. When the task is next leased it may be given more time based on how many times it has been leased before.
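Sketched out, with an entirely hypothetical queue API, the lifecycle looks like this:

def process_next_task(queue, lease_seconds=300):
    task = queue.lease(lease_seconds)  # task stays in the queue
    try:
        result = run_task(task)        # hypothetical work function
    except Exception:
        result = None
    if result is None:
        # Something went wrong: hand the lease back early so the
        # queue can lease the task out to another daemon, possibly
        # with a longer lease based on its retry count.
        queue.reduce_lease(task, 0)
    else:
        save_result(result)            # hypothetical persistence
        queue.delete(task)             # only finished tasks are removed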

interrupted by an exception will not be deleted and have
their lease reduced for another daemon. Tasks that have
surpassed their retry limit will have a failure result
saved if necessary otherwise they will be deleted.


what is deemed as "necessary" and what happens if it is? i.e. do they stay on the queue?

haydenjackson01 (Author):

This has changed: they will all have a failure result saved and be deleted.
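So under the updated behaviour, the retry-limit branch reduces to something like this (again with hypothetical names):

def handle_retry_limit(self, task):
    # Past the retry limit: always record a failure result and
    # remove the task from the queue, rather than leasing it again.
    result = make_failure_result(task)  # hypothetical helper
    self.save_result(result)
    self.queue.delete(task)
    return result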

else:
result = self.handle_retry_limit(task_descriptor)

# Save out documents


"out documents" ?

haydenjackson01 (Author):

Fixed. I was probably trying to capture the idea that they are not saved on the current machine.

def generate_resource(self, task, resource_generator):
"""Retrieve page(s) for one copy of resource from resource generator.

Images are resized to size.


what?

haydenjackson01 (Author):

This is what it was on the previous system. Changed now to "resized to fit page".


Args:
task: The specification of file to generate as a dictionary.
resource_image_generator: The file generation module.


incorrect arg

"""Load and request a daemon to perform an action.

Do not trust code or the process to be running after this method
as certain actions such as 'start' cause the main thread to be


lol what, start causes the main thread to stop?!

haydenjackson01 (Author):

Yup. To start a daemon in UNIX you fork a new process from the current one, then let the parent exit. You are left with an orphan process, which then forks another process (the eventual daemon) and kills itself. Your daemon, now doubly orphaned, is adopted by the system's init process. Look it up when you need it.
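For reference, the classic UNIX double fork looks roughly like this:

import os
import sys

def daemonize():
    """Classic UNIX double fork (sketch)."""
    if os.fork() > 0:
        sys.exit(0)  # parent exits; its child is now an orphan
    os.setsid()      # the orphan starts its own session
    if os.fork() > 0:
        sys.exit(0)  # the orphan exits too; its child (the daemon)
                     # can never reacquire a controlling terminal
    # The remaining (grandchild) process is adopted by init (PID 1)
    # and continues as the daemon.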

return details


def get_recommended_number_of_daemons():


so, this function figures out how many CPUs you can have running on a particular machine?

haydenjackson01 (Author):

Yes, that is correct.

with open('/proc/self/status') as f:
    m = re.search(r'(?m)^Cpus_allowed:\s*(.*)$', f.read())
    if m:
        res = bin(int(m.group(1).replace(',', ''), 16)).count('1')


having trouble processing how this works

haydenjackson01 (Author):

I've edited it to make it cleaner, but basically the CPU affinity is stored as a number, and that number isn't actually a count but a bitmap of the cores you are allowed to run on. So the number of ones in the bitmap is how many cores are available.
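A worked example: a Cpus_allowed value of ff is the bitmap 0b11111111, i.e. eight usable cores.

# Cpus_allowed is a hex bitmask of permitted cores, with commas
# separating 32-bit words on large machines.
mask = int("ff".replace(",", ""), 16)  # -> 255
print(bin(mask))                       # 0b11111111
print(bin(mask).count("1"))            # 8, the number of usable cores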
