AD Controller (adctrl) is an AD utility used to monitor and control the execution of workers.
-
Running AD Controller
Step 1 : Log in as the Applications Tier user and run the environment file. The environment file is located in the APPL_TOP directory.
$ cd /prod/ebs/apps/apps_st/appl
Or
$ cd $APPL_TOP
$ . ./APPSebs_example.env
Step 2 : Run the AD Controller command.
$ adctrl
Note: You will be prompted for the APPL_TOP location, the APPLSYS username, and the APPS password. After you provide this information, the AD Controller menu appears as shown below.
AD Controller Menu
---------------------------------------------------
1. Show worker status
2. Tell worker to restart a failed job
3. Tell worker to quit
4. Tell manager that a worker failed its job
5. Tell manager that a worker acknowledges quit
6. Restart a worker on the current machine
7. Exit
Enter your choice [1] :
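Before launching adctrl as in Step 2, it can help to confirm that the environment file was actually sourced. The following is a minimal sketch, not part of the AD utilities: check_ad_env is a hypothetical helper name, and it assumes only that sourcing the environment file sets APPL_TOP and puts adctrl on the PATH.

```shell
# check_ad_env: hypothetical pre-flight check before starting adctrl.
# Returns 0 when the environment looks ready, 1 otherwise.
check_ad_env() {
    # APPL_TOP is expected to be set by the environment file from Step 1.
    if [ -z "${APPL_TOP:-}" ]; then
        echo "APPL_TOP is not set; source the environment file first" >&2
        return 1
    fi
    if [ ! -d "$APPL_TOP" ]; then
        echo "APPL_TOP does not point to a directory: $APPL_TOP" >&2
        return 1
    fi
    # Assumption: adctrl is on the PATH once the environment file is sourced.
    if ! command -v adctrl >/dev/null 2>&1; then
        echo "adctrl was not found on the PATH" >&2
        return 1
    fi
    return 0
}

if check_ad_env; then
    echo "Environment looks ready; run adctrl"
fi
```

If the check fails, re-run Step 1 in the same shell session, since the environment file only affects the shell that sourced it.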
-
Checking the status of the workers
After adctrl starts, choose the first option, “Show worker status”.
Please Note: If no session is currently using the workers, the following message appears:
Error: The FND_INSTALL_PROCESSES table does not exist.
This table is used for communication with the
worker processes, and if it does not exist, it
means that the workers are not running, because
the ad utility has not started them yet.
You should check the file
adctrl.log
for errors.
This is because the FND_INSTALL_PROCESSES table is created when the AD parallel jobs start (not when the AD utility itself starts) and is dropped when the task completes.
-
The meaning of each worker status
STATUS | DESCRIPTION |
Waiting | The worker is idle. |
Assigned | The manager assigned a job to the worker, but the worker has not started it yet. |
Running | The worker is running a job. |
Failed | The job failed due to an error. |
Fixed, Restart | The error has been fixed and the worker has been told to restart the failed job. |
Restarted | The worker is retrying the failed job. After the error has been fixed, the status goes to “Fixed, Restart” and then to “Restarted” (it does not change to “Running”). |
Completed | The job completed and the manager has not yet assigned another job to that worker. |
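As a quick reference, the statuses can be reduced to whether the operator has to step in. The sketch below is illustrative only: status_action is a hypothetical helper, and the action for “Failed” is taken from the fix procedure described in this document (fix the error, then use option 2). Treating every other status as needing no action is an assumption.

```shell
# status_action: hypothetical helper mapping a worker status to the
# operator action suggested by this document.
status_action() {
    case "$1" in
        Failed)
            echo "fix the error, then use adctrl option 2" ;;
        Waiting|Assigned|Running|"Fixed, Restart"|Restarted|Completed)
            # Assumption: these statuses need no operator intervention.
            echo "no action needed" ;;
        *)
            echo "unknown status: $1" >&2
            return 1 ;;
    esac
}

status_action "Failed"
status_action "Fixed, Restart"
```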
-
Database Processing Phases concept
When a database patch or operation runs, its tasks are divided into phases by function. Oracle does this when the patch is created. Suppose a patch creates 2 tables and 2 sequences. In that case the patch driver contains 2 phases, one for table creation and one for sequence creation. Because the two sequences can be created at the same time, this is done in parallel by using more than one worker.
Here are some Database Processing Phases:
seq = create sequence
tab = create tables and synonyms, grant privileges on tables
pls = create package specification
plb = create package body
vw = create views
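The idea of phases can be sketched with plain shell background jobs: jobs inside one phase run in parallel, and the next phase starts only when the current one has finished. This is a toy model and not how the AD utilities are implemented. run_phase and the job names are hypothetical, and the order of the two phases here is only illustrative.

```shell
# run_phase PHASE JOB...: start every job of one phase in the background,
# then wait until all of them have finished.
run_phase() {
    phase=$1
    shift
    for job in "$@"; do
        # Stand-in for real work; each job runs as its own background process.
        echo "$phase: creating $job" &
    done
    # Block until every background job of this phase has finished.
    wait
}

# Phases run one after another; jobs inside a phase run in parallel.
run_phase tab table_1 table_2
run_phase seq sequence_1 sequence_2
```

The order of lines within a phase can vary from run to run, but no seq line is printed before both tab jobs have finished.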
-
Fixing a “Failed” worker
If the job fails the 1st time
The job is deferred to the end of the phase and another job is assigned to that worker.
If the job fails the 2nd time
– If the run time of the job was < 10 min => the job is deferred to the end of the phase and another job is assigned to that worker.
– If the run time of the job was >= 10 min => the job status will be “Failed”.
If the job fails the 3rd time
The job status will be “Failed”.
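The deferral rules above can be written down as a small decision function. This is only a restatement of the rules as listed here, not code taken from the AD utilities. job_outcome is a hypothetical name that takes the number of failures so far and the run time of the job in whole minutes.

```shell
# job_outcome FAILURE_COUNT RUNTIME_MINUTES
# Prints "deferred" or "failed" according to the rules listed above.
job_outcome() {
    failures=$1
    runtime_min=$2
    if [ "$failures" -le 1 ]; then
        # First failure: always deferred to the end of the phase.
        echo deferred
    elif [ "$failures" -eq 2 ] && [ "$runtime_min" -lt 10 ]; then
        # Second failure of a short job (< 10 min): deferred again.
        echo deferred
    else
        # Second failure of a long job, or any third failure.
        echo failed
    fi
}

job_outcome 1 30   # prints: deferred
job_outcome 2 5    # prints: deferred
job_outcome 2 10   # prints: failed
job_outcome 3 1    # prints: failed
```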
To review the worker log information, check the file
$APPL_TOP/admin/<SID>/log/adworkNNN.log
Example: adwork001.log is the log file for worker number 1.
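The naming pattern above, with the worker number zero-padded to three digits, can be built using printf. A minimal sketch: worker_log_path is a hypothetical helper, and the SID value PROD is a made-up example.

```shell
# worker_log_path APPL_TOP SID WORKER_NUMBER
# Builds the log path for one worker; %03d pads the number to three digits.
# Pass the worker number without leading zeros (1, not 001).
worker_log_path() {
    printf '%s/admin/%s/log/adwork%03d.log\n' "$1" "$2" "$3"
}

# PROD is a hypothetical SID used only for illustration.
worker_log_path /prod/ebs/apps/apps_st/appl PROD 1
# prints: /prod/ebs/apps/apps_st/appl/admin/PROD/log/adwork001.log
```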
After fixing the error, start AD Controller (if it is not already running) and use:
Option 2 : “Tell worker to restart a failed job”.
-
Restarting a Failed Patch Process
During a patch process (or an adadmin process), if a job fails and cannot be restarted, the patch must be restarted.
Here are the steps for doing this:
Option 3. Tell worker to quit (for all workers) [this manually shuts down the workers]
Option 4. Tell manager that a worker failed its job
Option 5. Tell manager that a worker acknowledges quit [the manager will stop, and AutoPatch will stop]
Then restart the patch.
PLEASE NOTE: When the patch restarts, all the information in the database about this session must be accurate.
-
Determining whether a process is hanging
- Check whether new information is still being written to the worker log file.
- Check whether the worker process is consuming CPU by issuing the command below.
$ ps -eo pcpu,pid,user,args | grep workerid
- Check whether any child processes are consuming CPU by issuing the following command:
$ ps -eo pcpu,pid,ppid,user,args | grep <Parent Process> | grep -v grep
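The first check, whether the log file is still being written to, can be automated by looking at the file's modification time. The following is a rough sketch under stated assumptions: log_is_active is a hypothetical helper, the 10-minute threshold is arbitrary, and it relies on the -mmin test of find, which GNU and BSD find provide but POSIX does not require.

```shell
# log_is_active FILE MINUTES: succeed if FILE was modified within the last
# MINUTES minutes. Relies on find's -mmin test (GNU and BSD find).
log_is_active() {
    [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Hypothetical log file name and threshold, for illustration only.
log_file=adwork001.log
if log_is_active "$log_file" 10; then
    echo "$log_file was written to in the last 10 minutes"
else
    echo "$log_file is missing or has had no recent writes"
fi
```

A quiet log does not prove a hang on its own, since a long-running job may simply have nothing to report, so combine this with the CPU checks above.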
-
Restarting a Hanging Worker Process
- Kill, at the OS level, the processes associated with the hanging worker process.
$ kill -9 <Process Number>
- Fix the problem.
- Restart the worker (or the job).
-
Restart an AD utility after a Node Crash
- Start AD Controller
- Choose Option : “4. Tell manager that a worker failed its job”
- Choose Option : “2. Tell worker to restart a failed job”
- Restart the AD utility that was running when the node crashed.
-
Shutting down the Manager
- Start AD Controller
- Choose Option: “3. Tell worker to quit”
- Verify that no worker processes are running.
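The last step can be checked from the shell by counting processes by command name. A hedged sketch: count_procs is a hypothetical helper, and the worker process name adworker is an assumption, so confirm the actual name on your system with ps first.

```shell
# count_procs NAME: print how many running processes have command name NAME.
count_procs() {
    # "comm=" prints only the command name, without a header line.
    ps -e -o comm= | awk -v name="$1" '
        { sub(/.*\//, "", $1) }      # drop any leading directory path
        $1 == name { n++ }
        END { print n + 0 }'
}

# Assumption: worker processes show up under the command name "adworker".
if [ "$(count_procs adworker)" -eq 0 ]; then
    echo "No worker processes are running"
else
    echo "Worker processes are still running"
fi
```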