Scheduling in PDI
Once you have finished designing your PDI jobs and transformations, you can arrange to run them at set intervals through the DI Server, or through your own scheduling mechanism (such as cron on Linux, or the Task Scheduler or the at command on Windows). The two approaches work differently: scheduling through the DI Server is done in the Spoon graphical interface, whereas scripting with your own scheduler or executor is done by calling the pan or kitchen commands.
This section explains how to script and schedule PDI content.
You can schedule your jobs through:
1. Data Integration (DI) Server
2. Manual scripting through pan or kitchen commands
1. DI Server
This method uses the Spoon graphical interface and is only available with an Enterprise repository.
After you design your job, the steps are as follows:
1. Open a job or transformation, then go to the Action menu and select Schedule.
2. Enter your configurations in the Schedule a Transformation dialog box.
2. Manual scripting through pan or kitchen commands
Command-Line Scripting Through Pan and Kitchen
Pan is the PDI command line tool for executing transformations.
Kitchen is the PDI command line tool for executing jobs.
You can use PDI's command line tools to execute PDI content from outside of Spoon. Typically you would use these tools in the context of creating a script or a cron job to run the job or transformation based on some condition outside of the realm of Pentaho software.
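As a rough sketch of such a script, the shell snippet below assembles a Kitchen command line for a local KJB file and prints it, so that it can be checked before it is handed to a scheduler. The install directory, job path, and log path are hypothetical placeholders, and the function name kitchen_cmd is made up for this illustration; only the file, level, and logfile switches come from the option tables in this section.

```shell
#!/bin/sh
# Hypothetical install location; change it to wherever PDI is unpacked.
PDI_HOME="${PDI_HOME:-/opt/pentaho/data-integration}"

# Print the Kitchen command line for a local KJB file.
#   $1 = path to the .kjb file
#   $2 = log file to write to
kitchen_cmd() {
    printf '%s/kitchen.sh -file=%s -level=Basic -logfile=%s\n' \
        "$PDI_HOME" "$1" "$2"
}

# Show the command that a scheduler entry would run (paths are examples).
kitchen_cmd /home/pdi/jobs/nightly.kjb /var/log/pdi/nightly.log

# A matching crontab entry (every day at 02:00) could look like this:
# 0 2 * * * /opt/pentaho/data-integration/kitchen.sh -file=/home/pdi/jobs/nightly.kjb -level=Basic -logfile=/var/log/pdi/nightly.log
```

Printing the command first keeps the sketch runnable on a machine without PDI installed; in a real cron job the printed line would be executed directly.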
Pan
pan.sh -option=value arg1 arg2
pan.bat /option:value arg1 arg2
Switch | Purpose
rep | Enterprise or database repository name, if you are using one
user | Repository username
pass | Repository password
trans | The name of the transformation (as it appears in the repository) to launch
dir | The repository directory that contains the transformation, including the leading slash
file | If you are calling a local KTR file, this is the filename, including the path if it is not in the local directory
level | The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing)
logfile | A local filename to write log output to
listdir | Lists the directories in the specified repository
listtrans | Lists the transformations in the specified repository directory
listrep | Lists the available repositories
exprep | Exports all repository objects to one XML file
norep | Prevents Pan from logging into a repository. If you have set the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables, then this option will enable you to prevent Pan from logging into the specified repository, assuming you would like to execute a local KTR file instead.
safemode | Runs in safe mode, which enables extra checking
version | Shows the version, revision, and build date
param | Set a named parameter in a name=value format. For example: -param:FOO=bar
listparam | List information about the defined named parameters in the specified transformation.
maxloglines | The maximum number of log lines that are kept internally by PDI. Set to 0 to keep all rows (default)
maxlogtimeout | The maximum age (in minutes) of a log line while being kept internally by PDI. Set to 0 to keep all rows indefinitely (default)
sh pan.sh -rep=initech_pdi_repo -user=pgibbons -pass=lumburghsux -trans=TPS_reports_2011
pan.bat /rep:initech_pdi_repo /user:pgibbons /pass:lumburghsux /trans:TPS_reports_2011
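The examples above run a transformation from a repository. For a local KTR file with named parameters, the argument list might be put together as in the following sketch. The helper name pan_args, the file path, and the parameter names REGION and YEAR are all hypothetical; the sketch only uses the file, level, and param switches in the pan.sh form shown in this section.

```shell
#!/bin/sh
# Print Pan arguments for a local KTR file, one per line.
#   $1    = path to the .ktr file
#   $2... = zero or more NAME=value named parameters
pan_args() {
    ktr_file=$1
    shift
    printf '%s\n' "-file=$ktr_file" "-level=Detailed"
    for pair in "$@"; do
        # Each named parameter becomes its own -param:NAME=value switch.
        printf '%s\n' "-param:$pair"
    done
}

# Hypothetical file and parameter names, for illustration only.
pan_args /home/pdi/trans/load_sales.ktr REGION=emea YEAR=2011
```

As long as no value contains whitespace, the printed arguments could be passed straight to Pan, for example with sh pan.sh $(pan_args /home/pdi/trans/load_sales.ktr REGION=emea).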
Kitchen Syntax
Kitchen runs jobs, either from a PDI repository (database or enterprise) or from a local file. The syntax for the batch file and the shell script is shown below. All Kitchen options are the same for both.
kitchen.sh -option=value arg1 arg2
kitchen.bat /option:value arg1 arg2
Switch | Purpose
rep | Enterprise or database repository name, if you are using one
user | Repository username
pass | Repository password
job | The name of the job (as it appears in the repository) to launch
dir | The repository directory that contains the job, including the leading slash
file | If you are calling a local KJB file, this is the filename, including the path if it is not in the local directory
level | The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing)
logfile | A local filename to write log output to
listdir | Lists the directories in the specified repository
listjob | Lists the jobs in the specified repository directory
listrep | Lists the available repositories
export | Exports all linked resources of the specified job. The argument is the name of a ZIP file.
norep | Prevents Kitchen from logging into a repository. If you have set the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables, then this option will enable you to prevent Kitchen from logging into the specified repository, assuming you would like to execute a local KJB file instead.
version | Shows the version, revision, and build date
param | Set a named parameter in a name=value format. For example: -param:FOO=bar
listparam | List information about the defined named parameters in the specified job.
maxloglines | The maximum number of log lines that are kept internally by PDI. Set to 0 to keep all rows (default)
maxlogtimeout | The maximum age (in minutes) of a log line while being kept internally by PDI. Set to 0 to keep all rows indefinitely (default)
sh kitchen.sh -rep=initech_pdi_repo -user=pgibbons -pass=lumburghsux -job=TPS_reports_2011
kitchen.bat /rep:initech_pdi_repo /user:pgibbons /pass:lumburghsux /job:TPS_reports_2011
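When a command like the ones above is run from cron or another scheduler, the calling script usually needs to know whether the job succeeded. The sketch below wraps any command and reports the result from its exit status. It assumes only the general shell convention that zero means success; the specific status codes Kitchen returns are not covered in this section, and the function name run_job is hypothetical.

```shell
#!/bin/sh
# Run a command and report the outcome based on its exit status.
run_job() {
    if "$@"; then
        echo "OK: $* finished"
    else
        status=$?
        echo "FAILED: $* exited with status $status" >&2
        return "$status"
    fi
}

# 'true' stands in for the real call so the sketch runs without PDI.
# A real invocation might look like:
#   run_job sh kitchen.sh -file=/home/pdi/jobs/nightly.kjb
run_job true
```

Because the function passes the failing status on to its caller, a scheduler or a surrounding script can still react to the failure, for example by sending a notification.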