Directory Monitoring

A common task in batch processing systems is to watch for new files and queue jobs to process them. DirmonJob is a job designed for this task.

DirmonJob runs every 5 minutes by default and looks for new files that match its configured entries, each of which is a DirmonEntry. These entries can be managed programmatically or via the Rocket Job Web Interface, the web management interface for Rocket Job.

Example, creating a DirmonEntry

entry = RocketJob::DirmonEntry.create!(
  pattern:           '/path_to_monitor/*',
  job_class_name:    'MyFileProcessJob',
  archive_directory: '/exports/archive'
)

When a Dirmon entry is created it is initially disabled, and it must be enabled before DirmonJob will start processing it:

entry.enable!

Active dirmon entries can also be disabled:

entry.disable!
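
To make the roles of `pattern`, `archive_directory` and the enabled state more concrete, here is a minimal plain-Ruby sketch of what a single scan pass could look like. It is not Rocket Job's implementation: `SketchEntry` and `scan_once` are hypothetical names, and moving each file into the archive directory before queuing is an assumption made for illustration.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for a DirmonEntry; not the Rocket Job class.
SketchEntry = Struct.new(:pattern, :job_class_name, :archive_directory, :enabled,
                         keyword_init: true)

# One scan pass: for every enabled entry, move each matching file into the
# archive directory and record which job would be queued for it.
def scan_once(entries)
  queued = []
  entries.select(&:enabled).each do |entry|
    Dir.glob(entry.pattern).sort.each do |path|
      next unless File.file?(path) # skip sub-directories

      FileUtils.mkdir_p(entry.archive_directory)
      archived = File.join(entry.archive_directory, File.basename(path))
      FileUtils.mv(path, archived)
      queued << { job_class_name: entry.job_class_name, file: archived }
    end
  end
  queued
end

# Usage: two files appear in a temporary "inbox" directory.
Dir.mktmpdir do |root|
  inbox = File.join(root, "inbox")
  FileUtils.mkdir_p(inbox)
  File.write(File.join(inbox, "a.csv"), "1,2,3")
  File.write(File.join(inbox, "b.csv"), "4,5,6")

  entry = SketchEntry.new(
    pattern:           File.join(inbox, "*"),
    job_class_name:    "MyFileProcessJob",
    archive_directory: File.join(root, "archive"),
    enabled:           true
  )
  scan_once([entry]).each { |job| puts "#{job[:job_class_name]} <- #{job[:file]}" }
end
```

Because each file is moved out of the monitored path in this sketch, a second pass over the same directory finds nothing to queue, which is one simple way to avoid processing a file twice.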

The attributes of DirmonEntry:

{ priority: 45 }

Starting the directory monitor

The directory monitor job only needs to be started once per installation by running the following code:

RocketJob::Jobs::DirmonJob.create!

DirmonJob is a scheduled job that is set to run every 5 minutes by default. Once created, its cron_schedule can be changed at any time via the Rocket Job Web Interface (RJMC).

For example, to override the cron schedule when creating DirmonJob so that it runs every minute:

RocketJob::Jobs::DirmonJob.create!(cron_schedule: "*/1 * * * * UTC")
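
The minute field `*/N` in a cron schedule means "every minute that is a multiple of N". The following plain-Ruby sketch only illustrates that rule for the minute field, assuming every other field is `*` and the time zone is UTC. `next_run` is a hypothetical helper, not part of Rocket Job, which parses cron expressions itself.

```ruby
# Next time strictly after `from` whose minute is a multiple of `step`,
# mirroring a cron minute field of "*/step" with all other fields "*".
def next_run(from, step)
  utc = from.getutc
  # Truncate to the start of the current minute, then move to the next one.
  t = Time.utc(utc.year, utc.month, utc.day, utc.hour, utc.min) + 60
  t += 60 until (t.min % step).zero?
  t
end

now = Time.utc(2024, 1, 1, 10, 2, 30)
puts next_run(now, 5) # prints 2024-01-01 10:05:00 UTC
puts next_run(now, 1) # prints 2024-01-01 10:03:00 UTC
```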

The default priority for DirmonJob is 40. To increase its priority, give it a lower number:

RocketJob::Jobs::DirmonJob.create!(
  cron_schedule: "*/5 * * * * UTC",
  priority:      25
)

Once DirmonJob has been started, its priority and check interval can be changed at any time as follows:

RocketJob::Jobs::DirmonJob.first.update_attributes(
  cron_schedule: "*/5 * * * * UTC",
  priority:      20
)

High Availability

DirmonJob automatically schedules a new instance of itself to run in the future after it completes each scan. If the run is successful, the current job instance destroys itself.

In this way it avoids having a single Directory Monitor process that constantly sits there monitoring folders for changes. More importantly, it avoids the single point of failure that is typical of earlier directory monitoring solutions. Every time DirmonJob runs and scans the paths for new files, it could be running on a different worker. If any worker is removed or shut down, DirmonJob does not stop, since it will just run on another worker instance.

There can only be one DirmonJob instance queued or running at a time.

If an exception occurs while running DirmonJob, a failed job instance will remain in the job list for problem determination. The failed job cannot be restarted and should be destroyed when no longer needed.
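
The lifecycle described in this section can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not Rocket Job code: `SketchJobList` is a hypothetical class, jobs are plain hashes, times are plain numbers of seconds, and scheduling a successor after a failed run as well as after a successful one is an assumption of the sketch.

```ruby
# Hypothetical in-memory job list; Rocket Job persists jobs in its own data store.
class SketchJobList
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  # Only one instance may be queued or running at a time.
  def enqueue(run_at)
    if @jobs.any? { |j| %i[queued running].include?(j[:state]) }
      raise ArgumentError, "only one instance may be queued or running"
    end

    job = { run_at: run_at, state: :queued }
    @jobs << job
    job
  end

  # Any worker may call this; the block stands in for one directory scan.
  def run_next(interval:)
    job = @jobs.find { |j| j[:state] == :queued }
    return nil unless job

    job[:state] = :running
    begin
      yield job
      job[:state] = :completed
      @jobs.delete(job)      # success: the finished instance removes itself
    rescue StandardError => e
      job[:state] = :failed  # failure: kept in the list for problem determination
      job[:error] = e.message
    end
    enqueue(job[:run_at] + interval) # assumption: the successor is always scheduled
    job
  end
end

list = SketchJobList.new
list.enqueue(0)
list.run_next(interval: 300) { |_job| }                # a successful scan
list.run_next(interval: 300) { raise "disk offline" }  # a failing scan
list.jobs.each { |j| puts "#{j[:state]} run_at=#{j[:run_at]}" }
```

After these two runs the list holds the failed instance plus one freshly queued instance, so a failure leaves evidence behind without stopping future scans.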