An integral part of a trouble-free SIS integration is the ability to automate and monitor the flow of data that the integration presents to Learn. Two new capabilities facilitate this process:

  1. Results from integration POST URLs contain a data set process UID, shown as a reference code. For example:

    Success: Feed File Uploaded. Use the reference code afc3d6e84df84f51944a06cccee8f59a to track these records in the logs.

  2. A new Data Set Status URL has been added. When contacted with the data set process UID, the URL returns an XML block containing specifics regarding the data set status. Note that this URL may be called well after the POST operation for the original data file is complete.

    https:// ... /webapps/bb-data-integration-flatfile-[YOUR ID]/endpoint/dataSetStatus/afc3d6e84df84f51944a06cccee8f59a

    which returns:

    <dataSetStatus>
        <completedCount>5</completedCount>
        <dataIntegrationId type="blackboard.platform.dataintegration.DataIntegration">_123_1</dataIntegrationId>
        <dataSetUid>afc3d6e84df84f51944a06cccee8f59a</dataSetUid>
        <errorCount>0</errorCount>
        <lastEntryDate>2013-03-20T10:45:48-05:00</lastEntryDate>
        <queuedCount>0</queuedCount>
        <startDate>2013-03-20T10:45:48-05:00</startDate>
        <warningCount>0</warningCount>
    </dataSetStatus>

    Learn more about database identifiers in SaaS deployments

These additions facilitate scripted monitoring of automated Snapshot Flat File integrations.
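
As a minimal sketch of how a script might use these two capabilities, the following BASH functions pull the reference code out of the POST response text and read individual values from the status XML. The function names are hypothetical and are not taken from the reference implementation. The sketch assumes the reference code is always a 32-character lowercase hexadecimal string and that the status XML is as simple as the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not taken from the reference implementation.

# extract_ref_code <response text>
# Print the first 32-character hex string found in a POST response.
extract_ref_code() {
    printf '%s\n' "$1" | grep -oE '[0-9a-f]{32}' | head -n 1
}

# xml_value <tag> <xml>
# Print the text content of the first simple element with that tag.
xml_value() {
    printf '%s\n' "$2" | sed -n "s:.*<$1[^>]*>\([^<]*\)</$1>.*:\1:p" | head -n 1
}

# Sample data copied from the examples above. In a real script these
# strings would come from the POST and Data Set Status HTTP calls.
response='Success: Feed File Uploaded. Use the reference code afc3d6e84df84f51944a06cccee8f59a to track these records in the logs.'
status_xml='<dataSetStatus>
    <completedCount>5</completedCount>
    <errorCount>0</errorCount>
    <queuedCount>0</queuedCount>
</dataSetStatus>'

ref_code=$(extract_ref_code "$response")
echo "reference code: $ref_code"
echo "completed: $(xml_value completedCount "$status_xml")"
echo "errors: $(xml_value errorCount "$status_xml")"
```

A line-oriented sed expression is enough for a flat block like this one. A script that must handle arbitrary XML would be better served by a real XML parser.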

This topic provides an overview of how to automate and monitor a Snapshot Flat File integration. The examples provided are written for a UNIX or Linux platform using CRON for automation and the UNIX shell scripting language BASH, but the concepts may be applied to any language that is capable of POST/GET operations, parsing strings and dates, performing database queries, and sending email. Thus Perl, Java, PHP, and Ruby are suitable for development, as are other shell or batch languages.

The goal is to provide a functional reference implementation for developing a monitored and automated SIS integration using Snapshot Flat File. It contains two components: automation and monitoring. The reference implementation is available via the link at the bottom of this page.

Inclusion of logged error messages via the presented reference implementation is currently only compatible with self-hosted systems. This capability is outlined in the script and requires removing comments from the code for use in self-hosted environments. Managed Hosting clients will continue to rely on the Learn SIS Framework logging interface to access error messages. A future improvement to this document will provide a means of 'in-script' access to these log messages.

Use case: Automating SIS flat file processing

The following use case serves as the basis for the reference implementation and as a guide for your own development.

Summary

Automate the processing of SIS-generated Snapshot Flat File data and email reports to specified administrators. The solution should periodically check for new data files, determine the object and data source of each, and present the data to Learn. Processed files should be placed in a separate directory for archival purposes. The solution should also provide the ability to process data files manually. In all cases, status emails reporting success or failure should be sent to administrators. Each email should contain the available data regarding the process and any error messages.

Actors

SIS, Operating System scheduler, Learn

Preconditions

In a specified directory, SIS generates and stores Snapshot Flat File formatted text files for Learn objects such as users, courses, enrollments, and staff assignments.

Description

  1. The SIS provides Snapshot Flat Files to a script-specified data directory
  2. The operating system scheduler (CRON) starts the sis_snpshtFF_auto script
  3. The sis_snpshtFF_auto script checks the data directory for the presence of any files
  4. The sis_snpshtFF_auto script determines the Learn object type of each file
  5. The sis_snpshtFF_auto script calls the sis_snpshtFF_manual script for each file, following the object hierarchy: users, courses, memberships
  6. The sis_snpshtFF_manual script uses POST to send the data to Learn and determines the completion state
  7. If there are errors, and if it is configured to do so, the sis_snpshtFF_manual script queries the integration logs for error messages
  8. The sis_snpshtFF_manual script constructs an email containing status information and emails it to configured email addresses
  9. Steps 5-8 are repeated for each data file
  10. After it finishes processing all of the data files, the sis_snpshtFF_auto script sends a status email to configured email addresses
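
The status email in step 8 could be assembled along the following lines. This is only a sketch: the function name, the report layout, and the rule that any error count above zero marks the run as a failure are assumptions, not details of the reference implementation.

```shell
#!/usr/bin/env bash
# Hypothetical report builder; names and layout are illustrative only.

# build_report <data file> <reference code> <completed> <errors> <warnings>
# Print a plain-text status report suitable for an email body.
build_report() {
    # Assumption: any logged error marks the run as a failure.
    if [ "$4" -eq 0 ]; then
        result="SUCCESS"
    else
        result="FAILURE"
    fi
    printf 'Snapshot Flat File processing: %s\n' "$result"
    printf 'Data file:      %s\n' "$1"
    printf 'Reference code: %s\n' "$2"
    printf 'Completed:      %s\n' "$3"
    printf 'Errors:         %s\n' "$4"
    printf 'Warnings:       %s\n' "$5"
}

report=$(build_report users.txt afc3d6e84df84f51944a06cccee8f59a 5 0 0)
printf '%s\n' "$report"

# To send the report, a script might pipe the body to the system mailer.
# The address below is a placeholder:
#   printf '%s\n' "$report" | mail -s "SIS snapshot: users.txt" admin@example.edu
```

Keeping the report builder separate from the mail command makes it possible to check the report text from the command line before any email is sent.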

Postconditions

The data provided by the SIS-generated flat files is added to Learn sequentially by data object, and the original files are archived with the processing timestamp added to the original filename. Script-configured administrators receive a status email for each processed file and an overall status email covering the full processing task.
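
The archival postcondition might be met with a helper like the one below. It is a sketch under stated assumptions: the function name, the timestamp format, and the choice to append the timestamp as a suffix are all hypothetical, and the demonstration runs in a temporary directory rather than a real data directory.

```shell
#!/usr/bin/env bash
# Hypothetical archive helper; the timestamp format is an assumption.

# archive_file <file> <archive dir>
# Move the file into the archive directory, appending a timestamp to
# its name, and print the new path.
archive_file() {
    src=$1
    dest_dir=$2
    stamp=$(date +%Y%m%d%H%M%S)
    mkdir -p "$dest_dir" || return 1
    dest="$dest_dir/$(basename "$src").$stamp"
    mv "$src" "$dest" && printf '%s\n' "$dest"
}

# Demonstration in a throwaway directory that is removed on exit.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
printf 'external_person_key|user_id\n' > "$work/users.txt"

archived=$(archive_file "$work/users.txt" "$work/archive")
echo "archived to: $archived"
```

Moving the file out of the data directory, rather than copying it, keeps a later scheduled run from processing the same file twice.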

The solution

Setting aside the generation of the data files by the SIS, which is outside of this document's scope, there are three components to the automation problem:

  • The "When": determining when to run the processing of the provided data,
  • The "What": which data objects and data sources the provided data covers, and
  • The "How": processing and monitoring of that data

Using the above use case, we can build a set of configurable scripts that can determine if a data file exists (should this task do anything), what object type that data applies to, call a processing script with the appropriate parameters to meet the integration goals, process the data, and on completion, archive the data file.

The following sections address these three components. Putting all three in place provides a monitorable, automated process for moving SIS data into Learn via the Snapshot Flat File integration type.

The BASH script is heavily commented, so it is not reproduced here. Instead, this topic focuses on the overall flow and processing details.

The "How": Using sis_snpshtFF_auto.sh and sis_snpshtFF_manual scripts

Based on the above use case, the BASH script sis_snpshtFF_auto.sh performs the following operations:

  • Checks if there are files in the specified directory
  • Determines the object type and operation based on the header information in the file
  • Orders the files so that they are processed in the correct sequence: users, then courses, then memberships
  • Calls a sub-script (sis_snpshtFF_manual.sh) for processing, monitoring, and admin email notification of the processing status
  • Archives the data file on completion of processing
  • Processes the next data file if it exists
  • Finally, mails a report indicating the cumulative result of the call to the automation script.

This provides the following general flow shown in Figure 1:

 

Figure 1: General flow of automating Snapshot Flat File processing as demonstrated in the provided reference solution.

Figure 2, below, breaks down the scripted portion of the process further. It shows the two scripts: sis_snpshtFF_auto.sh on the left and sis_snpshtFF_manual.sh on the right.

Figure 2. Detailed workflow of scripted processes

The operations shown in Figure 2 are also included as comments in each script of the reference implementation.

The "What": sis_snpshtFF_auto.sh

At a high level, the sis_snpshtFF_auto script loads the files found in the configured data directory, analyzes each file's header to determine the object type referenced, and adds the file to the appropriate list for later processing. The header analysis determines the type of object the file refers to and thus its place in the snapshot processing queue. This sorting allows for a single drop point for the SIS-generated flat files.
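
The header analysis and ordering could look roughly like this. The sketch assumes pipe-delimited headers in which user files carry an external_person_key field, course files carry an external_course_key field, and membership files carry both. Confirm those field names against your own feed files. The function name is hypothetical, and the sketch does not distinguish staff assignments from enrollments.

```shell
#!/usr/bin/env bash
# Hypothetical classifier; the header field names are assumptions.

# classify_file <file>
# Print users, courses, memberships, or unknown based on the header line.
classify_file() {
    header=$(head -n 1 "$1" | tr '[:upper:]' '[:lower:]')
    case "$header" in
        *external_course_key*external_person_key*|*external_person_key*external_course_key*)
            echo memberships ;;
        *external_person_key*) echo users ;;
        *external_course_key*) echo courses ;;
        *) echo unknown ;;
    esac
}

# Demonstration with three sample files in a single drop directory.
data_dir=$(mktemp -d)
trap 'rm -rf "$data_dir"' EXIT
printf 'external_course_key|external_person_key|role\n' > "$data_dir/a.txt"
printf 'external_course_key|course_id|course_name\n' > "$data_dir/b.txt"
printf 'external_person_key|user_id|firstname|lastname\n' > "$data_dir/c.txt"

# Walk the object types in processing order and collect matching files.
ordered=""
for type in users courses memberships; do
    for f in "$data_dir"/*; do
        [ -f "$f" ] || continue
        if [ "$(classify_file "$f")" = "$type" ]; then
            ordered="$ordered$type:$(basename "$f") "
        fi
    done
done
echo "$ordered"
```

Because the order comes from the outer loop over object types rather than from the filenames, the SIS can name its files however it likes.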

After all files have been analyzed and grouped into lists by object type, the lists are processed in the logical order of users, courses, memberships. Each file is handed off, along with the appropriate arguments for its object type, to the sis_snpshtFF_manual script for processing. The sis_snpshtFF_manual script may also be run from the command line. It takes the incoming arguments and uses the appropriate URL to POST the data file to Learn. When the POST is complete, the script enters a monitoring loop, then builds a report and emails it to the configured list of recipients. Control then returns to the sis_snpshtFF_auto script, which sends the next file for processing. This process is repeated until all files have been processed, after which sis_snpshtFF_auto emails a final report to the list of configured recipients.
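
A monitoring loop of this kind might be sketched as follows. The function names are hypothetical, and fetch_status is a stand-in that returns canned XML so the example runs without a server. A real script would replace it with an HTTP GET against the Data Set Status URL. The sketch also assumes that a queuedCount of 0 means processing has finished, which should be verified against your own system.

```shell
#!/usr/bin/env bash
# Hypothetical polling loop; fetch_status is a stub, not a real API call.

calls_file=$(mktemp)
trap 'rm -f "$calls_file"' EXIT

# fetch_status <reference code>
# Stub: reports one queued record on the first call, none afterwards.
fetch_status() {
    echo x >> "$calls_file"
    if [ $(( $(wc -l < "$calls_file") )) -lt 2 ]; then
        echo '<dataSetStatus><queuedCount>1</queuedCount><errorCount>0</errorCount></dataSetStatus>'
    else
        echo '<dataSetStatus><queuedCount>0</queuedCount><errorCount>0</errorCount></dataSetStatus>'
    fi
}

# xml_value <tag> <xml>: print the text content of a simple element.
xml_value() {
    printf '%s\n' "$2" | sed -n "s:.*<$1[^>]*>\([^<]*\)</$1>.*:\1:p" | head -n 1
}

# wait_for_completion <reference code> <max polls> <seconds between polls>
# Print the final status XML, or return 1 if records are still queued.
wait_for_completion() {
    polls=0
    while [ "$polls" -lt "$2" ]; do
        status=$(fetch_status "$1")
        if [ "$(xml_value queuedCount "$status")" = "0" ]; then
            printf '%s\n' "$status"
            return 0
        fi
        polls=$((polls + 1))
        sleep "$3"
    done
    return 1
}

final=$(wait_for_completion afc3d6e84df84f51944a06cccee8f59a 5 0) \
    && echo "done, errors: $(xml_value errorCount "$final")"
```

Capping the number of polls keeps a stalled data set from blocking the files queued behind it. The caller can treat a non-zero return as a timeout and report it in the status email.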

The "When": Using CRON to schedule snapshots

The purpose of automation is to run the script set without human intervention. UNIX provides this capability with CRON, a built-in scheduling application. The CRON system process periodically checks the system crontab, a system file that contains a list of commands and settings for when they should run. Each entry is assessed to determine whether it should run now or later, as indicated in the crontab entry.

Consider the frequency of cron jobs and the processing time of each operation when setting crontab entries, because REFRESH operations may take longer than STORE operations. REFRESH and STORE data may be handled through separate crontab entries and separate data source directories in the script arguments if you use the script provided below.
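
If a long-running operation could still be in progress when the next scheduled run starts, one common safeguard is a lock that lets only one run proceed at a time. The sketch below is an optional addition and is not known to be part of the reference implementation. The function names and lock path are hypothetical. It relies on mkdir failing when the directory already exists.

```shell
#!/usr/bin/env bash
# Hypothetical run lock; not known to be in the reference implementation.

# acquire_lock <dir>: succeed only if the lock directory did not exist.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock <dir>: remove the lock directory.
release_lock() {
    rmdir "$1"
}

# Demonstration in a throwaway directory that is removed on exit.
base=$(mktemp -d)
trap 'rm -rf "$base"' EXIT
lock="$base/sis_snpshtFF_auto.lock"

if acquire_lock "$lock"; then first=acquired; else first=busy; fi
# A second attempt while the lock is held stands in for an overlapping run.
if acquire_lock "$lock"; then second=acquired; else second=busy; fi
release_lock "$lock"

echo "first attempt: $first, second attempt: $second"
```

In a scheduled script, the run that finds the lock busy would typically exit quietly and leave the data files for the next scheduled run.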

CRON expressions for crontab Entries

Format: CRON uses a very specific format for scheduling data. Each entry is a space-separated list of five required time fields followed by the command to run:

Field   Description     Allowed value
MIN     Minute field    0 to 59
HOUR    Hour field      0 to 23
DOM     Day of month    1 to 31
MON     Month field     1 to 12
DOW     Day of week     0 to 6 (0 is Sunday)
CMD     Command         Any command to be executed

In practice, an entry in this format may be as simple or as complex as you choose.

Examples:

An example of a simple crontab entry would be to run a task at the top of every hour:

0 * * * * /usr/local/blackboard/apps/snapshot/scripts/sis_snpshtFF_auto.sh

  • 0 - 0th Minute
  • * - Every Hour
  • * - Every Day
  • * - Every Month
  • * - Every day of the week

or once at midnight every day:

0 0 * * * /usr/local/blackboard/apps/snapshot/scripts/sis_snpshtFF_auto.sh

  • 0 - 0th Minute
  • 0 - 0th Hour (midnight)
  • * - Every Day
  • * - Every Month
  • * - Every day of the week

Run a task twice a day during the work week:

0 11,16 * * 1-5 /usr/local/blackboard/apps/snapshot/scripts/sis_snpshtFF_auto.sh

  • 0 - 0th Minute (Top of the hour)
  • 11,16 - 11 AM and 4 PM
  • * - Every day
  • * - Every month
  • 1-5 - Monday through Friday

or every two hours Monday through Friday:

0 */2 * * Mon-Fri /usr/local/blackboard/apps/snapshot/scripts/sis_snpshtFF_auto.sh

  • 0 - 0th Minute (Top of the hour)
  • */2 - Every even hour (0, 2, 4, 6, and so on through 22), that is, every other hour
  • * - Every day
  • * - Every month
  • Mon-Fri - Monday through Friday

Learn more about CRON

You may also view the man page for your system via the command line using the command $ man 5 crontab

Adding a crontab

Using the above cron settings, we can add a crontab entry for scheduling when the Snapshot task will run.

  1. To edit the root crontab file, type the following command as the root user at the UNIX/Linux shell prompt:

    $ crontab -e

    Note that -e opens the crontab in an editor, typically vi.

  2. To run the Flat File automated processing script at midnight every day, type i to enter edit mode, then add the following line to the list of tasks:

    0 0 * * * /usr/local/blackboard/apps/snapshot/scripts/sis_snpshtFF_auto.sh

  3. Press escape to exit edit mode.
  4. Type :wq to save your edit and quit the editor.
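
Where an interactive editor is inconvenient, for example during scripted server setup, the same entry can be added non-interactively. The following is a sketch only. The helper name is hypothetical, and it works on crontab text passed through standard input so that it can be tried without touching the real crontab.

```shell
#!/usr/bin/env bash
# Hypothetical helper for adding a crontab entry without an editor.

# add_cron_entry <entry>
# Read a crontab on stdin and print it with the entry appended, unless
# an identical line is already present.
add_cron_entry() {
    current=$(cat)
    if printf '%s\n' "$current" | grep -qxF -- "$1"; then
        printf '%s\n' "$current"
    elif [ -n "$current" ]; then
        printf '%s\n%s\n' "$current" "$1"
    else
        printf '%s\n' "$1"
    fi
}

entry='0 0 * * * /usr/local/blackboard/apps/snapshot/scripts/sis_snpshtFF_auto.sh'
existing='# m h dom mon dow command'

updated=$(printf '%s\n' "$existing" | add_cron_entry "$entry")
printf '%s\n' "$updated"

# To apply this to the real crontab, a script might run:
#   crontab -l 2>/dev/null | add_cron_entry "$entry" | crontab -
```

Because the helper skips an entry that is already present, running the setup script more than once does not schedule the job twice.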

References

The following people contributed ideas, input, and suggestions for the BASH reference implementation:

Kelt Dockins contributed his BASH-based implementation for Snapshot Flat File content analysis (no longer available online).

Ross Brown and Jerald Todd caught some early issues exposed in the first version of the reference implementation.

Files

This downloadable SIS_SnpshtFF_Bash_Scripts archive (zip) contains functional and commented code demonstrating the concepts presented in this topic.

SIS Snapshot Flat File Bash Scripts Archive

Learn more

SIS Framework Overview

Data Source Key Overview