Skip to content

scheduler for Ruby (at, in, cron and every jobs)

License

Notifications You must be signed in to change notification settings

liuganggang/rufus-scheduler

 
 

Repository files navigation

rufus-scheduler

Build Status Gem Version

This gem translation by LGG

![(http://mall.zycg.cn/assets/logo.png)] (http://mall.zycg.cn/assets/logo.png)

Ruby语言的定时任务 (at, cron, in and every jobs).

Note: 或者你正在看 README of rufus-scheduler 2.x?

Quickstart:

require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

scheduler.in '3s' do
  puts 'Hello... Rufus'
end

scheduler.join
  # let the current thread join the scheduler thread
  # 让当前的进程合并scheduler进程

支持多种形式:

require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

# ...

scheduler.in '10d' do
  # do something in 10 days
end

scheduler.at '2030/12/12 23:30:00' do
  # do something at a given point in time
end

scheduler.every '3h' do
  # do something every 3 hours
end

scheduler.cron '5 0 * * *' do
  # do something every day, five minutes after midnight
  # (see "man 5 crontab" in your terminal)
end

# ...

non-features

Rufus-scheduler (out of the box)是一个在线进行, 内存使用的定时器.

它不能坚持你的定时计划. 如果带有定时任务的进程丢失, 定时任务也会丢失.

类似相关的 gems

  • whenever - let cron call back your Ruby code, trusted and reliable cron drives your schedule
  • clockwork - rufus-scheduler inspired gem

(please note: rufus-scheduler is not a cron replacement)

note about the 3.0 line

It's a complete rewrite of rufus-scheduler.

There is no EventMachine-based scheduler anymore.

Notables changes:

  • 如之前说的, 没有更多的计时时间基础的定时器
  • scheduler.every('100') { 将会定时每100秒(以前,他是 0.1s). 这相当于Ruby中的sleep(100)
  • The scheduler 再也不会获取全部的异常任务,仅仅指的是StandardError
  • The error_handler is #on_error (instead of #on_exception), by default it now prints the details of the error to $stderr (used to be $stdout)
  • Rufus::Scheduler::TimeOutError 改名为 Rufus::Scheduler::TimeoutError
  • "interval" 循环任务介绍. 只要有 "every" 带头的,如 "every 10 minutes, do this", 它就会执行一次,然后等10分钟后,再执行一次,周而复始.
  • :lockfile => true/filename 原理是保护多个正在执行的定时任务(scheduler)
  • "discard_past" 默认是开启的. 如果 scheduler (its host) 睡眠1个小时 并且有一个 every '10m' 的任务在执行, 它将会尝试唤醒1次,而不是6次 (discard_past was false by default in rufus-scheduler 2.x). 从现在开始 3.0 不用去特意引用 :discard_past => false .
  • Introduction of Scheduler #on_pre_trigger and #on_post_trigger callback points

getting help

So you need help. People can help you, but first help them help you, and don't waste their time. Provide a complete description of the issue. If it works on A but not on B and others have to ask you: "so what is different between A and B" you are wasting everyone's time.

Go read how to report bugs effectively, twice.

Update: help_help.md might help help you.

on StackOverflow

Use this form. It'll create a question flagged "rufus-scheduler" and "ruby".

Here are the questions tagged rufus-scheduler.

on IRC

I sometimes haunt #ruote on freenode.net. The channel is not dedicated to rufus-scheduler, so if you ask a question, first mention it's about rufus-scheduler.

Please note that I prefer helping over Stack Overflow because it's more searchable than the ruote IRC archive.

issues

Yes, issues can be reported in rufus-scheduler issues, I'd actually prefer bugs in there. If there is nothing wrong with rufus-scheduler, a Stack Overflow question is better.

faq

scheduling

Rufus-scheduler 支持5种 job. in, at, every, interval and cron jobs.

大多数 rufus-scheduler examples 展示 block形式的任务, but it's also OK to schedule handler instances or handler classes.

in, at, every, interval, cron

In and at 任务 执行一次.

require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

scheduler.in '10d' do
  puts "10 days reminder for review X!"
end

scheduler.at '2014/12/24 2000' do
  puts "merry xmas!"
end

In jobs 是制定一个时间间隔, 他们模拟过去的时间消逝. At jobs 在一个时间点, 他们模拟到达一个时间点 (将来最还选择是用 In).

Every, interval and cron jobs 模拟反复进行.

require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

scheduler.every '3h' do
  puts "change the oil filter!"
end

scheduler.interval '2h' do
  puts "thinking..."
  puts sleep(rand * 1000)
  puts "thought."
end

scheduler.cron '00 09 * * *' do
  puts "it's 9am! good morning!"
end

Every jobs 尽力去跟随他们定时任务的频率.

Interval jobs, 触发事件,并执行,等待执行完后再间隔时间执行. (every jobs 的时间是取决于时间间隔, interval jobs 时间取决于执行的完成和下次执行的开始).

Cron jobs are based on the venerable cron utility (man 5 crontab). They trigger following a pattern given in (almost) the same language cron uses.

#schedule_x vs #x

schedule_in, schedule_at, schedule_cron,等等 会返回一个 schedule 实例.

in, at, cron 会返回新任务实例的id (a String).

job_id =
  scheduler.in '10d' do
    # ...
  end
job = scheduler.job(job_id)

# 对比

job =
  scheduler.schedule_in '10d' do
    # ...
  end

# also

job =
  scheduler.in '10d', :job => true do
    # ...
  end

#schedule and #repeat

有时会有不详细的.

The #schedule 方法定时一个 at, in or cron job. 他依据后面的input 来判定. 它返回一个实例.

scheduler.schedule '10d' do; end.class
  # => Rufus::Scheduler::InJob

scheduler.schedule '2013/12/12 12:30' do; end.class
  # => Rufus::Scheduler::AtJob

scheduler.schedule '* * * * *' do; end.class
  # => Rufus::Scheduler::CronJob

The #repeat 方法返回一个 EveryJob or a CronJob.

scheduler.repeat '10d' do; end.class
  # => Rufus::Scheduler::EveryJob

scheduler.repeat '* * * * *' do; end.class
  # => Rufus::Scheduler::CronJob

(是的,不包含 IntervalJob的).

schedule blocks arguments (job, time)

一个 schedule block 可以带0,1,2个参数.

第一个是 "job", 他月Job 的实例有关. 他可以用到例如取消Job的功能.

scheduler.every '10m' do |job|

  status = determine_pie_status

  if status == 'burnt' || status == 'cooked'
    stop_oven
    takeout_pie
    job.unschedule
  end
end

第二个是 "time", 是任务即将触发的时间(不是当前时间).

Note that time is the time when the job got cleared for triggering. If there are mutexes involved, now = mutex_wait_time + time...

"every" jobs and changing the next_time in-flight

It's OK to change the next_time of an every job in-flight:

scheduler.every '10m' do |job|

  # ...

  status = determine_pie_status

  job.next_time = Time.now + 30 * 60 if status == 'burnt'
    #
    # if burnt, wait 30 minutes for the oven to cool a bit
end

It should work as well with cron jobs, not so with interval jobs whose next_time is computed after their block ends its current run.

scheduling handler instances

It's OK to pass any object, as long as it respond to #call(), when scheduling:

class Handler
  def self.call(job, time)
    p "- Handler called for #{job.id} at #{time}"
  end
end

scheduler.in '10d', Handler

# or

class OtherHandler
  def initialize(name)
    @name = name
  end
  def call(job, time)
    p "* #{time} - Handler #{name.inspect} called for #{job.id}"
  end
end

oh = OtherHandler.new('Doe')

scheduler.every '10m', oh
scheduler.in '3d5m', oh

The call method must accept 2 (job, time), 1 (job) or 0 arguments.

Note that time is the time when the job got cleared for triggering. If there are mutexes involved, now = mutex_wait_time + time...

scheduling handler classes

One can pass a handler class to rufus-scheduler when scheduling. Rufus will instantiate it and that instance will be available via job#handler.

class MyHandler
  attr_reader :count
  def initialize
    @count = 0
  end
  def call(job)
    @count += 1
    puts ". #{self.class} called at #{Time.now} (#{@count})"
  end
end

job = scheduler.schedule_every '35m', MyHandler

job.handler
  # => #<MyHandler:0x000000021034f0>
job.handler.count
  # => 0

If you want to keep that "block feeling":

job_id =
  scheduler.every '10m', Class.new do
    def call(job)
      puts ". hello #{self.inspect} at #{Time.now}"
    end
  end

pause and resume the scheduler

The scheduler can be paused via the #pause and #resume methods. One can determine if the scheduler is currently paused by calling #paused?.

While paused, the scheduler still accepts schedules, but no schedule will get triggered as long as #resume isn't called.

job options

:blocking => true

By default, jobs are triggered in their own, new thread. When :blocking => true, the job is triggered in the scheduler thread (a new thread is not created). Yes, while the job triggers, the scheduler is not scheduling.

:overlap => false

Since, by default, jobs are triggered in their own new thread, job instances might overlap. For example, a job that takes 10 minutes and is scheduled every 7 minutes will have overlaps.

To prevent overlap, one can set :overlap => false. Such a job will not trigger if one of its instance is already running.

:mutex => mutex_instance / mutex_name / array of mutexes

When a job with a mutex triggers, the job's block is executed with the mutex around it, preventing other jobs with the same mutex to enter (it makes the other jobs wait until it exits the mutex).

This is different from :overlap => false, which is, first, limited to instances of the same job, and, second, doesn't make the incoming job instance block/wait but give up.

:mutex accepts a mutex instance or a mutex name (String). It also accept an array of mutex names / mutex instances. It allows for complex relations between jobs.

Array of mutexes: original idea and implementation by Rainux Luo

Warning: creating lots of different mutexes is OK. Rufus-scheduler will place them in its Scheduler#mutexes hash... And they won't get garbage collected.

:timeout => duration or point in time

It's OK to specify a timeout when scheduling some work. After the time specified, it gets interrupted via a Rufus::Scheduler::TimeoutError.

scheduler.in '10d', :timeout => '1d' do
  begin
    # ... do something
  rescue Rufus::Scheduler::TimeoutError
    # ... that something got interrupted after 1 day
  end
end

The :timeout option accepts either a duration (like "1d" or "2w3d") or a point in time (like "2013/12/12 12:00").

:first_at, :first_in, :first, :first_time

This option is for repeat jobs (cron / every) only.

It's used to specify the first time after which the repeat job should trigger for the first time.

In the case of an "every" job, this will be the first time (modulo the scheduler frequency) the job triggers. For a "cron" job, it's the time after which the first schedule will trigger.

scheduler.every '2d', :first_at => Time.now + 10 * 3600 do
  # ... every two days, but start in 10 hours
end

scheduler.every '2d', :first_in => '10h' do
  # ... every two days, but start in 10 hours
end

scheduler.cron '00 14 * * *', :first_in => '3d' do
  # ... every day at 14h00, but start after 3 * 24 hours
end

:first, :first_at and :first_in all accept a point in time or a duration (number or time string). Use the symbol you think make your schedule more readable.

Note: it's OK to change the first_at (a Time instance) directly:

job.first_at = Time.now + 10
job.first_at = Rufus::Scheduler.parse('2029-12-12')

The first argument (in all its flavours) accepts a :now or :immediately value. That schedules the first occurence for immediate triggering. Consider:

require 'rufus-scheduler'

s = Rufus::Scheduler.new

n = Time.now; p [ :scheduled_at, n, n.to_f ]

s.every '3s', :first => :now do
  n = Time.now; p [ :in, n, n.to_f ]
end

s.join

that'll output something like:

[:scheduled_at, 2014-01-22 22:21:21 +0900, 1390396881.344438]
[:in, 2014-01-22 22:21:21 +0900, 1390396881.6453865]
[:in, 2014-01-22 22:21:24 +0900, 1390396884.648807]
[:in, 2014-01-22 22:21:27 +0900, 1390396887.651686]
[:in, 2014-01-22 22:21:30 +0900, 1390396890.6571937]
...

:last_at, :last_in, :last

This option is for repeat jobs (cron / every) only.

It indicates the point in time after which the job should unschedule itself.

scheduler.cron '5 23 * * *', :last_in => '10d' do
  # ... do something every evening at 23:05 for 10 days
end

scheduler.every '10m', :last_at => Time.now + 10 * 3600 do
  # ... do something every 10 minutes for 10 hours
end

scheduler.every '10m', :last_in => 10 * 3600 do
  # ... do something every 10 minutes for 10 hours
end

:last, :last_at and :last_in all accept a point in time or a duration (number or time string). Use the symbol you think make your schedule more readable.

Note: it's OK to change the last_at (nil or a Time instance) directly:

job.last_at = nil
  # remove the "last" bound

job.last_at = Rufus::Scheduler.parse('2029-12-12')
  # set the last bound

:times => nb of times (before auto-unscheduling)

One can tell how many times a repeat job (CronJob or EveryJob) is to execute before unscheduling by itself.

scheduler.every '2d', :times => 10 do
  # ... do something every two days, but not more than 10 times
end

scheduler.cron '0 23 * * *', :times => 31 do
  # ... do something every day at 23:00 but do it no more than 31 times
end

It's OK to assign nil to :times to make sure the repeat job is not limited. It's useful when the :times is determined at scheduling time.

scheduler.cron '0 23 * * *', :times => nolimit ? nil : 10 do
  # ...
end

The value set by :times is accessible in the job. It can be modified anytime.

job =
  scheduler.cron '0 23 * * *' do
    # ...
  end

# later on...

job.times = 10
  # 10 days and it will be over

Job methods

When calling a schedule method, the id (String) of the job is returned. Longer schedule methods return Job instances directly. Calling the shorter schedule methods with the :job => true also return Job instances instead of Job ids (Strings).

  require 'rufus-scheduler'

  scheduler = Rufus::Scheduler.new

  job_id =
    scheduler.in '10d' do
      # ...
    end

  job =
    scheduler.schedule_in '1w' do
      # ...
    end

  job =
    scheduler.in '1w', :job => true do
      # ...
    end

Those Job instances have a few interesting methods / properties:

id, job_id

Returns the job id.

job = scheduler.schedule_in('10d') do; end
job.id
  # => "in_1374072446.8923042_0.0_0"

scheduler

Returns the scheduler instance itself.

opts

Returns the options passed at the Job creation.

job = scheduler.schedule_in('10d', :tag => 'hello') do; end
job.opts
  # => { :tag => 'hello' }

original

Returns the original schedule.

job = scheduler.schedule_in('10d', :tag => 'hello') do; end
job.original
  # => '10d'

callable, handler

callable() returns the scheduled block (or the call method of the callable object passed in lieu of a block)

handler() returns nil if a block was scheduled and the instance scheduled else.

# when passing a block

job =
  scheduler.schedule_in('10d') do
    # ...
  end

job.handler
  # => nil
job.callable
  # => #<Proc:0x00000001dc6f58@/home/jmettraux/whatever.rb:115>

and

# when passing something else than a block

class MyHandler
  attr_reader :counter
  def initialize
    @counter = 0
  end
  def call(job, time)
    @counter = @counter + 1
  end
end

job = scheduler.schedule_in('10d', MyHandler.new)

job.handler
  # => #<Method: MyHandler#call>
job.callable
  # => #<MyHandler:0x0000000163ae88 @counter=0>

scheduled_at

Returns the Time instance when the job got created.

job = scheduler.schedule_in('10d', :tag => 'hello') do; end
job.scheduled_at
  # => 2013-07-17 23:48:54 +0900

last_time

Returns the last time the job triggered (is usually nil for AtJob and InJob). k

job = scheduler.schedule_every('1d') do; end
# ...
job.scheduled_at
  # => 2013-07-17 23:48:54 +0900

last_work_time, mean_work_time

The job keeps track of how long its work was in the last_work_time attribute. For a one time job (in, at) it's probably not very useful.

The attribute mean_work_time contains a computed mean work time. It's recomputed after every run (if it's a repeat job).

unschedule

Unschedule the job, preventing it from firing again and removing it from the schedule. This doesn't prevent a running thread for this job to run until its end.

threads

Returns the list of threads currently "hosting" runs of this Job instance.

kill

Interrupts all the work threads currently running for this job instance. They discard their work and are free for their next run (of whatever job).

Note: this doesn't unschedule the Job instance.

Note: if the job is pooled for another run, a free work thread will probably pick up that next run and the job will appear as running again. You'd have to unschedule and kill to make sure the job doesn't run again.

running?

Returns true if there is at least one running Thread hosting a run of this Job instance.

scheduled?

Returns true if the job is scheduled (is due to trigger). For repeat jobs it should return true until the job gets unscheduled. "at" and "in" jobs will respond with false as soon as they start running (execution triggered).

pause, resume, paused?, paused_at

These four methods are only available to CronJob, EveryJob and IntervalJob instances. One can pause or resume such a job thanks to them.

job =
  scheduler.schedule_every('10s') do
    # ...
  end

job.pause
  # => 2013-07-20 01:22:22 +0900
job.paused?
  # => true
job.paused_at
  # => 2013-07-20 01:22:22 +0900

job.resume
  # => nil

tags

Returns the list of tags attached to this Job instance.

By default, returns an empty array.

job = scheduler.schedule_in('10d') do; end
job.tags
  # => []

job = scheduler.schedule_in('10d', :tag => 'hello') do; end
job.tags
  # => [ 'hello' ]

[]=, [], key? and keys

Threads have thread-local variables. Rufus-scheduler jobs have job-local variables.

job =
  @scheduler.schedule_every '1s' do |job|
    job[:timestamp] = Time.now.to_f
    job[:counter] ||= 0
    job[:counter] += 1
  end

sleep 3.6

job[:counter]
  # => 3

job.key?(:timestamp)
  # => true
job.keys
  # => [ :timestamp, :counter ]

Job-local variables are thread-safe.

call

Job instances have a #call method. It simply calls the scheduled block or callable immediately.

job =
  @scheduler.schedule_every '10m' do |job|
    # ...
  end

job.call

Warning: the Scheduler#on_error handler is not involved. Error handling is the responsibility of the caller.

If the call has to be rescued by the error handler of the scheduler, call(true) might help:

require 'rufus-scheduler'

s = Rufus::Scheduler.new

def s.on_error(job, err)
  p [ 'error in scheduled job', job.class, job.original, err.message ]
rescue
  p $!
end

job =
  s.schedule_in('1d') do
    fail 'again'
  end

job.call(true)
  #
  # true lets the error_handler deal with error in the job call

AtJob and InJob methods

time

Returns when the job will trigger (hopefully).

next_time

An alias to time.

EveryJob, IntervalJob and CronJob methods

next_time

Returns the next time the job will trigger (hopefully).

count

Returns how many times the job fired.

EveryJob methods

frequency

It returns the scheduling frequency. For a job scheduled "every 20s", it's 20.

It's used to determine if the job frequency is higher than the scheduler frequency (it raises an ArgumentError if that is the case).

IntervalJob methods

interval

Returns the interval scheduled between each execution of the job.

Every jobs use a time duration between each start of their execution, while interval jobs use a time duration between the end of an execution and the start of the next.

CronJob methods

frequency

It returns the shortest interval of time between two potential occurences of the job.

For instance:

Rufus::Scheduler.parse('* * * * *').frequency         # ==> 60
Rufus::Scheduler.parse('* * * * * *').frequency       # ==> 1

Rufus::Scheduler.parse('5 23 * * *').frequency        # ==> 24 * 3600
Rufus::Scheduler.parse('5 * * * *').frequency         # ==> 3600
Rufus::Scheduler.parse('10,20,30 * * * *').frequency  # ==> 600

Rufus::Scheduler.parse('10,20,30 * * * * *').frequency  # ==> 10

It's used to determine if the job frequency is higher than the scheduler frequency (it raises an ArgumentError if that is the case).

brute_frequency

Cron jobs also have a #brute_frequency method that looks a one year of intervals to determine the shortest delta for the cron. This method can take between 20 to 50 seconds for cron lines that go the second level. #frequency above, when encountering second level cron lines will take a shortcut to answer as quickly as possible with a usable value.

looking up jobs

Scheduler#job(job_id)

The scheduler #job(job_id) method can be used to lookup Job instances.

  require 'rufus-scheduler'

  scheduler = Rufus::Scheduler.new

  job_id =
    scheduler.in '10d' do
      # ...
    end

  # later on...

  job = scheduler.job(job_id)

Scheduler #jobs #at_jobs #in_jobs #every_jobs #interval_jobs and #cron_jobs

Are methods for looking up lists of scheduled Job instances.

Here is an example:

  #
  # let's unschedule all the at jobs

  scheduler.at_jobs.each(&:unschedule)

Scheduler#jobs(:tag / :tags => x)

When scheduling a job, one can specify one or more tags attached to the job. These can be used to lookup the job later on.

  scheduler.in '10d', :tag => 'main_process' do
    # ...
  end
  scheduler.in '10d', :tags => [ 'main_process', 'side_dish' ] do
    # ...
  end

  # ...

  jobs = scheduler.jobs(:tag => 'main_process')
    # find all the jobs with the 'main_process' tag

  jobs = scheduler.jobs(:tags => [ 'main_process', 'side_dish' ]
    # find all the jobs with the 'main_process' AND 'side_dish' tags

Scheduler#running_jobs

Returns the list of Job instance that have currently running instances.

Whereas other "_jobs" method scan the scheduled job list, this method scans the thread list to find the job. It thus comprises jobs that are running but are not scheduled anymore (that happens for at and in jobs).

misc Scheduler methods

Scheduler#unschedule(job_or_job_id)

Unschedule a job given directly or by its id.

Scheduler#shutdown

Shuts down the scheduler, ceases any scheduler/triggering activity.

Scheduler#shutdown(:wait)

Shuts down the scheduler, waits (blocks) until all the jobs cease running.

Scheduler#shutdown(:kill)

Kills all the job (threads) and then shuts the scheduler down. Radical.

Scheduler#down?

Returns true if the scheduler has been shut down.

Scheduler#started_at

Returns the Time instance at which the scheduler got started.

Scheduler #uptime / #uptime_s

Returns since the count of seconds for which the scheduler has been running.

#uptime_s returns this count in a String easier to grasp for humans, like "3d12m45s123".

Scheduler#join

Let's the current thread join the scheduling thread in rufus-scheduler. The thread comes back when the scheduler gets shut down.

Scheduler#threads

Returns all the threads associated with the scheduler, including the scheduler thread itself.

Scheduler#work_threads(query=:all/:active/:vacant)

Lists the work threads associated with the scheduler. The query option defaults to :all.

  • :all : all the work threads
  • :active : all the work threads currently running a Job
  • :vacant : all the work threads currently not running a Job

Note that the main schedule thread will be returned if it is currently running a Job (ie one of those :blocking => true jobs).

Scheduler#scheduled?(job_or_job_id)

Returns true if the arg is a currently scheduled job (see Job#scheduled?).

Scheduler#occurrences(time0, time1)

Returns a hash { job => [ t0, t1, ... ] } mapping jobs to their potential trigger time within the [ time0, time1 ] span.

Please note that, for interval jobs, the #mean_work_time is used, so the result is only a prediction.

Scheduler#timeline(time0, time1)

Like #occurrences but returns a list [ [ t0, job0 ], [ t1, job1 ], ... ] of time + job pairs.

dealing with job errors

The easy, job-granular way of dealing with errors is to rescue and deal with them immediately. The two next sections show examples. Skip them for explanations on how to deal with errors at the scheduler level.

block jobs

As said, jobs could take care of their errors themselves.

scheduler.every '10m' do
  begin
    # do something that might fail...
  rescue => e
    $stderr.puts '-' * 80
    $stderr.puts e.message
    $stderr.puts e.stacktrace
    $stderr.puts '-' * 80
  end
end

callable jobs

Jobs are not only shrunk to blocks, here is how the above would look like with a dedicated class.

scheduler.every '10m', Class.new do
  def call(job)
    # do something that might fail...
  rescue => e
    $stderr.puts '-' * 80
    $stderr.puts e.message
    $stderr.puts e.stacktrace
    $stderr.puts '-' * 80
  end
end

TODO: talk about callable#on_error (if implemented)

(see scheduling handler instances and scheduling handler classes for more about those "callable jobs")

Rufus::Scheduler#stderr=

By default, rufus-scheduler intercepts all errors (that inherit from StandardError) and dumps abundent details to $stderr.

If, for example, you'd like to divert that flow to another file (descriptor). You can reassign $stderr for the current Ruby process

$stderr = File.open('/var/log/myapplication.log', 'ab')

or, you can limit that reassignement to the scheduler itself

scheduler.stderr = File.open('/var/log/myapplication.log', 'ab')

Rufus::Scheduler#on_error(job, error)

We've just seen that, by default, rufus-scheduler dumps error information to $stderr. If one needs to completely change what happens in case of error, it's OK to overwrite #on_error

def scheduler.on_error(job, error)

  Logger.warn("intercepted error in #{job.id}: #{error.message}")
end

Rufus::Scheduler #on_pre_trigger and #on_post_trigger callbacks

One can bind callbacks before and after jobs trigger:

s = Rufus::Scheduler.new

def s.on_pre_trigger(job, trigger_time)
  puts "triggering job #{job.id}..."
end

def s.on_post_trigger(job, trigger_time)
  puts "triggered job #{job.id}."
end

s.every '1s' do
  # ...
end

The trigger_time is the time at which the job triggers. It might be a bit before Time.now.

Warning: these two callbacks are executed in the scheduler thread, not in the work threads (the threads were the job execution really happens).

Rufus::Scheduler#on_pre_trigger as a guard

Returning false in on_pre_trigger will prevent the job from triggering. Returning anything else (nil, -1, true, ...) will let the job trigger.

Note: your business logic should go in the scheduled block itself (or the scheduled instance). Don't put business logic in on_pre_trigger. Return false for admin reasons (backend down, etc) not for business reasons that are tied to the job itself.

def s.on_pre_trigger(job, trigger_time)

  return false if Backend.down?

  puts "triggering job #{job.id}..."
end

Rufus::Scheduler.new options

:frequency

By default, rufus-scheduler sleeps 0.300 second between every step. At each step it checks for jobs to trigger and so on.

The :frequency option lets you change that 0.300 second to something else.

scheduler = Rufus::Scheduler.new(:frequency => 5)

It's OK to use a time string to specify the frequency.

scheduler = Rufus::Scheduler.new(:frequency => '2h10m')
  # this scheduler will sleep 2 hours and 10 minutes between every "step"

Use with care.

:lockfile => "mylockfile.txt"

This feature only works on OSes that support the flock (man 2 flock) call.

Starting the scheduler with :lockfile => ".rufus-scheduler.lock" will make the scheduler attempt to create and lock the file .rufus-scheduler.lock in the current working directory. If that fails, the scheduler will not start.

The idea is to guarantee only one scheduler (in a group of scheduler sharing the same lockfile) is running.

This is useful in environments where the Ruby process holding the scheduler gets started multiple times.

If the lockfile mechanism here is not sufficient, you can plug your custom mechanism. It's explained in advanced lock schemes below.

:max_work_threads

In rufus-scheduler 2.x, by default, each job triggering received its own, brand new, thread of execution. In rufus-scheduler 3.x, execution happens in a pooled work thread. The max work thread count (the pool size) defaults to 28.

One can set this maximum value when starting the scheduler.

scheduler = Rufus::Scheduler.new(:max_work_threads => 77)

It's OK to increase the :max_work_threads of a running scheduler.

scheduler.max_work_threads += 10

Rufus::Scheduler.singleton

Do not want to store a reference to your rufus-scheduler instance? Then Rufus::Scheduler.singleton can help, it returns a singleon instance of the scheduler, initialized the first time this class method is called.

Rufus::Scheduler.singleton.every '10s' { puts "hello, world!" }

It's OK to pass initialization arguments (like :frequency or :max_work_threads) but they will only be taken into account the first time .singleton is called.

Rufus::Scheduler.singleton(:max_work_threads => 77)
Rufus::Scheduler.singleton(:max_work_threads => 277) # no effect

The .s is a shortcut for .singleton.

Rufus::Scheduler.s.every '10s' { puts "hello, world!" }

advanced lock schemes

As seen above, rufus-scheduler proposes the :lockfile system out of the box. If in a group of schedulers only one is supposed to run, the lockfile mecha prevents schedulers that have not set/created the lockfile from running.

There are situation where this is not sufficient.

By overriding #lock and #unlock, one can customize how his schedulers lock.

This example was provided by Eric Lindvall:

class ZookeptScheduler < Rufus::Scheduler

  def initialize(zookeeper, opts={})
    @zk = zookeeper
    super(opts)
  end

  def lock
    @zk_locker = @zk.exclusive_locker('scheduler')
    @zk_locker.lock # returns true if the lock was acquired, false else
  end

  def unlock
    @zk_locker.unlock
  end

  def confirm_lock
    return false if down?
    @zk_locker.assert!
  rescue ZK::Exceptions::LockAssertionFailedError => e
    # we've lost the lock, shutdown (and return false to at least prevent
    # this job from triggering
    shutdown
    false
  end
end

This uses a zookeeper to make sure only one scheduler in a group of distributed schedulers runs.

The methods #lock and #unlock are overriden and #confirm_lock is provided, to make sure that the lock is still valid.

The #confirm_lock method is called right before a job triggers (if it is provided). The more generic callback #on_pre_trigger is called right after #confirm_lock.

parsing cronlines and time strings

Rufus::Scheduler provides a class method .parse to parse time durations and cron strings. It's what it's using when receiving schedules. One can use it diectly (no need to instantiate a Scheduler).

require 'rufus-scheduler'

Rufus::Scheduler.parse('1w2d')
  # => 777600.0
Rufus::Scheduler.parse('1.0w1.0d')
  # => 777600.0

Rufus::Scheduler.parse('Sun Nov 18 16:01:00 2012').strftime('%c')
  # => 'Sun Nov 18 16:01:00 2012'

Rufus::Scheduler.parse('Sun Nov 18 16:01:00 2012 Europe/Berlin').strftime('%c %z')
  # => 'Sun Nov 18 15:01:00 2012 +0000'

Rufus::Scheduler.parse(0.1)
  # => 0.1

Rufus::Scheduler.parse('* * * * *')
  # => #<Rufus::Scheduler::CronLine:0x00000002be5198
  #        @original="* * * * *", @timezone=nil,
  #        @seconds=[0], @minutes=nil, @hours=nil, @days=nil, @months=nil,
  #        @weekdays=nil, @monthdays=nil>

It returns a number when the output is a duration and a CronLine instance when the input is a cron string.

It will raise an ArgumentError if it can't parse the input.

Beyond .parse, there are also .parse_cron and .parse_duration, for finer granularity.

There is an interesting helper method named .to_duration_hash:

require 'rufus-scheduler'

Rufus::Scheduler.to_duration_hash(60)
  # => { :m => 1 }
Rufus::Scheduler.to_duration_hash(62.127)
  # => { :m => 1, :s => 2, :ms => 127 }

Rufus::Scheduler.to_duration_hash(62.127, :drop_seconds => true)
  # => { :m => 1 }

cronline notations specific to rufus-scheduler

first Monday, last Sunday et al

To schedule something at noon every first Monday of the month:

scheduler.cron('00 12 * * mon#1') do
  # ...
end

To schedule something at noon the last Sunday of every month:

scheduler.cron('00 12 * * sun#-1') do
  # ...
end
#
# OR
#
scheduler.cron('00 12 * * sun#L') do
  # ...
end

Such cronlines can be tested with scripts like:

require 'rufus-scheduler'

Time.now
  # => 2013-10-26 07:07:08 +0900
Rufus::Scheduler.parse('* * * * mon#1').next_time
  # => 2013-11-04 00:00:00 +0900

L (last day of month)

L can be used in the "day" slot:

In this example, the cronline is supposed to trigger every last day of the month at noon:

require 'rufus-scheduler'
Time.now
  # => 2013-10-26 07:22:09 +0900
Rufus::Scheduler.parse('00 12 L * *').next_time
  # => 2013-10-31 12:00:00 +0900

a note about timezones

Cron schedules and at schedules support the specification of a timezone.

scheduler.cron '0 22 * * 1-5 America/Chicago' do
  # the job...
end

scheduler.at '2013-12-12 14:00 Pacific/Samoa' do
  puts "it's tea time!"
end

# or even

Rufus::Scheduler.parse("2013-12-12 14:00 Pacific/Saipan")
  # => 2013-12-12 04:00:00 UTC

Behind the scenes, rufus-scheduler uses tzinfo to deal with timezones.

Here is a list of timezones known to my Debian GNU/Linux 7. It was generated with this script:

require 'tzinfo'
TZInfo::Timezone.all.each { |tz| puts tz.name }

Unknown timezones, typos, will be rejected by tzinfo thus rufus-scheduler.

On its own tzinfo derives the timezones from the system's information. On some system it needs some help, one can install the 'tzinfo-data' gem to provide the missing information.

so Rails?

Yes, I know, all of the above is boring and you're only looking for a snippet to paste in your Ruby-on-Rails application to schedule...

Here is an example initializer:

#
# config/initializers/scheduler.rb

require 'rufus-scheduler'

# Let's use the rufus-scheduler singleton
#
s = Rufus::Scheduler.singleton


# Stupid recurrent task...
#
s.every '1m' do

  Rails.logger.info "hello, it's #{Time.now}"
end

And now you tell me that this is good, but you want to schedule stuff from your controller.

Maybe:

class ScheController < ApplicationController

  # GET /sche/
  #
  def index

    job_id =
      Rufus::Scheduler.singleton.in '5s' do
        Rails.logger.info "time flies, it's now #{Time.now}"
      end

    render :text => "scheduled job #{job_id}"
  end
end

The rufus-scheduler singleton is instantiated in the config/initializers/scheduler.rb file, it's then available throughout the webapp via Rufus::Scheduler.singleton.

Warning: this works well with single-process Ruby servers like Webrick and Thin. Using rufus-scheduler with Passenger or Unicorn requires a bit more knowledge and tuning, gently provided by a bit of googling and reading, see Faq above.

support

see getting help above.

license

MIT, see LICENSE.txt

About

scheduler for Ruby (at, in, cron and every jobs)

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Ruby 100.0%