
ODA Space Utilization Issue

Oracle Database Appliance Space Utilization Issue

 

One of the basic issues with an ODA (Oracle Database Appliance) system is space utilization in the /opt mount point.

I recently faced this issue on one node of my Oracle Database Appliance, version 19.17.0.0.0. A thorough check of the /opt mount point showed that the file “dcs-agent.log” was logging the same messages repeatedly, producing around 1.5G of log data daily on just that one node.

File Location :-

/opt/oracle/dcs/log/dcs-agent.log
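A quick way to confirm which files are filling a mount point is to rank them by disk usage. The helper below is only a sketch: the function name `largest_files` is made up for illustration, and it relies on nothing beyond the standard `find`, `du`, `sort` and `head` utilities.

```shell
# largest_files DIR [N]: print the N largest regular files under DIR,
# biggest first, as "<size in KiB><TAB><path>". N defaults to 10.
# The body runs in a subshell, so its variables do not leak to the caller.
largest_files() (
    dir=$1
    count=${2:-10}
    # -xdev keeps find on the same filesystem as DIR; du -k reports KiB.
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
)
```

For example, `largest_files /opt 10` lists the ten largest files under /opt, and `df -h /opt` shows how full the mount point is overall.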

 

Error:-

The following is found in the log:

2024-01-02 07:48:02,162 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request XXX70 is still inside being-processed-queue

2024-01-02 07:48:02,162 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:02,162 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1614_XXX70_

2024-01-02 07:48:02,208 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Request XXX71 is still inside being-processed-queue

2024-01-02 07:48:02,208 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:02,209 INFO [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1667_XXX71_

2024-01-02 07:48:03,712 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011201 is still inside in-queue

2024-01-02 07:48:03,712 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,713 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2422_11201_

2024-01-02 07:48:03,713 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011176 is still inside in-queue

2024-01-02 07:48:03,713 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,713 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1735_11176_

2024-01-02 07:48:03,718 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011181 is still inside in-queue

2024-01-02 07:48:03,718 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,718 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1916_11181_

2024-01-02 07:48:03,720 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Request 0000011177 is still inside in-queue

2024-01-02 07:48:03,720 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,720 INFO [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1818_11177_

2024-01-02 07:48:03,721 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011199 is still inside in-queue

2024-01-02 07:48:03,721 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,722 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2289_11199_

2024-01-02 07:48:03,723 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Request 0000011189 is still inside in-queue

2024-01-02 07:48:03,723 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,723 INFO [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2125_11189_

2024-01-02 07:48:03,728 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011220 is still inside in-queue

2024-01-02 07:48:03,728 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,728 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:3933_11220_

2024-01-02 07:48:03,731 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011185 is still inside in-queue

2024-01-02 07:48:03,731 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,731 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2018_11185_

2024-01-02 07:48:03,733 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011179 is still inside in-queue

2024-01-02 07:48:03,733 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

2024-01-02 07:48:03,734 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1765_11179_

2024-01-02 07:48:03,734 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011194 is still inside in-queue

2024-01-02 07:48:03,734 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.

Analysis:-

Analysis of the log showed that the problem was a job that had not completed and was still running. On further checking, this turned out to be a known bug in this version, in which the connection between the ODA clients is lost and the job keeps running.
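To see which message dominates a log like the one above, the lines can be grouped after removing the parts that change on every line. The following is a rough sketch, not an Oracle tool: `summarize_log` is a hypothetical name, and it assumes the line layout shown in the excerpt (date, time, level, bracketed thread name, logger, message).

```shell
# summarize_log FILE [N]: print the N most frequent messages in FILE
# with their counts. N defaults to 5.
summarize_log() (
    file=$1
    top=${2:-5}
    # 1st expression: drop the date and time fields.
    # 2nd expression: drop bracketed fields such as the thread name and "[]".
    # 3rd expression: replace digit runs with N so that lines differing
    #                 only in request IDs are counted together.
    sed -E 's/^[^ ]+ [^ ]+ //; s/\[[^]]*\] ?//g; s/[0-9]+/N/g' "$file" \
        | sort | uniq -c | sort -rn | head -n "$top"
)
```

Running `summarize_log /opt/oracle/dcs/log/dcs-agent.log` on an affected node would be expected to show the three DCSZQueue messages far ahead of everything else.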

 

Solution:-

To resolve the error, perform the following steps.

  1. Run all four of the following commands on node1 and then on node2 to stop the clients.

Note:-

               Run all four commands on node 1 first, then run all four commands on node 2.

 

  1. # /opt/zookeeper/bin/zkServer.sh stop
  2. # systemctl stop initdcsagent
  3. # systemctl stop initdcscontroller
  4. # systemctl stop initdcsadmin

 

  2. Now run all of the following commands on node1 and then on node2 to start the clients and confirm that the agent responds.

Note:-

               Run all the commands on node 1 first, then run them on node 2.

 

  1. # /opt/zookeeper/bin/zkServer.sh start
  2. # systemctl start initdcsagent
  3. # systemctl start initdcscontroller
  4. # systemctl start initdcsadmin
  5. # odacli ping-agent
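The per-node sequence can also be collected into a small script so that the order is always the same. This is a sketch under assumptions rather than a supported Oracle procedure: the function names and the `DRY_RUN` variable are invented for illustration, and it assumes the services are brought back with the `start` counterparts of the stop commands. It must be run as root on node 1 first, then on node 2.

```shell
# run_step CMD...: run a command, or only print it when DRY_RUN=1.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop ZooKeeper and the three DCS services; stop at the first failure.
stop_dcs() {
    run_step /opt/zookeeper/bin/zkServer.sh stop &&
    run_step systemctl stop initdcsagent &&
    run_step systemctl stop initdcscontroller &&
    run_step systemctl stop initdcsadmin
}

# Assumed start sequence (same order as the stop commands), followed by
# an agent check.
start_dcs() {
    run_step /opt/zookeeper/bin/zkServer.sh start &&
    run_step systemctl start initdcsagent &&
    run_step systemctl start initdcscontroller &&
    run_step systemctl start initdcsadmin &&
    run_step odacli ping-agent
}

# Full restart of the stack on the local node.
restart_dcs() {
    stop_dcs && start_dcs
}
```

Calling `DRY_RUN=1 restart_dcs` prints the nine commands without touching any service, which is a safe way to review the sequence before running `restart_dcs` for real.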

 

Once the action plan is completed, check the log file again. It should have stopped generating the error, and the file should no longer be growing.
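One simple way to check that the file has stopped growing is to compare its size at two points in time. The helper below is a minimal sketch; `log_growth` is a hypothetical name and the default 60-second interval is an arbitrary choice.

```shell
# log_growth FILE [SECONDS]: print how many bytes FILE grew by during
# the interval. SECONDS defaults to 60.
log_growth() (
    file=$1
    interval=${2:-60}
    before=$(wc -c < "$file") || exit 1
    sleep "$interval"
    after=$(wc -c < "$file") || exit 1
    echo $((after - before))
)
```

For example, `log_growth /opt/oracle/dcs/log/dcs-agent.log 60` reports the number of bytes written in one minute. A value that stays small over a few runs suggests the repeated messages have stopped.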

    About Abdul Khalique Siddique

