Oracle Database Appliance Space Utilization Issue
A common issue on ODA (Oracle Database Appliance) systems is space utilization on the /opt mount point.
I recently faced this issue on one node of my Oracle Database Appliance running version 19.17.0.0.0. A thorough check of the /opt mount point showed that the file “dcs-agent.log” was logging the same messages repeatedly, and a file of around 1.5G was being created daily on just that one node.
File Location :-
/opt/oracle/dcs/log/dcs-agent.log
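A check like the one described above can be sketched in shell. This is a minimal sketch, not the exact procedure used here: the function name largest_files is hypothetical, and it relies only on the standard find, du, sort and head utilities.

```shell
# Hypothetical helper: list the N largest files under a directory.
# Arguments: directory (default /opt), number of results (default 10).
largest_files() {
    dir="${1:-/opt}"
    count="${2:-10}"
    # -xdev keeps find on a single filesystem; du -k prints sizes in KiB,
    # so a numeric reverse sort puts the biggest files first.
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null \
        | sort -rn \
        | head -n "$count"
}
```

For example, `largest_files /opt/oracle/dcs/log 5` would list the five largest files in the DCS log directory, and `df -h /opt` shows the overall usage of the mount point.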
Error:-
The following messages were found in the log:
2024-01-02 07:48:02,162 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request XXX70 is still inside being-processed-queue
2024-01-02 07:48:02,162 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:02,162 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1614_XXX70_
2024-01-02 07:48:02,208 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Request XXX71 is still inside being-processed-queue
2024-01-02 07:48:02,208 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:02,209 INFO [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1667_XXX71_
2024-01-02 07:48:03,712 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011201 is still inside in-queue
2024-01-02 07:48:03,712 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,713 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2422_11201_
2024-01-02 07:48:03,713 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011176 is still inside in-queue
2024-01-02 07:48:03,713 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,713 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1735_11176_
2024-01-02 07:48:03,718 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011181 is still inside in-queue
2024-01-02 07:48:03,718 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,718 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1916_11181_
2024-01-02 07:48:03,720 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Request 0000011177 is still inside in-queue
2024-01-02 07:48:03,720 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,720 INFO [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1818_11177_
2024-01-02 07:48:03,721 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011199 is still inside in-queue
2024-01-02 07:48:03,721 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,722 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2289_11199_
2024-01-02 07:48:03,723 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Request 0000011189 is still inside in-queue
2024-01-02 07:48:03,723 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,723 INFO [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2125_11189_
2024-01-02 07:48:03,728 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011220 is still inside in-queue
2024-01-02 07:48:03,728 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,728 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:3933_11220_
2024-01-02 07:48:03,731 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011185 is still inside in-queue
2024-01-02 07:48:03,731 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,731 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2018_11185_
2024-01-02 07:48:03,733 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011179 is still inside in-queue
2024-01-02 07:48:03,733 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,734 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1765_11179_
2024-01-02 07:48:03,734 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011194 is still inside in-queue
2024-01-02 07:48:03,734 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
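To see how much of the file consists of these repeating messages, the entries can be grouped by message type. The sketch below assumes the line layout shown in the excerpt above; the function name summarize_dcs_log is hypothetical. It strips everything up to "DCSZQueue: ", replaces each run of digits with N so that different request numbers collapse into one message type, and counts the results.

```shell
# Hypothetical helper: count DCSZQueue messages by type.
# Argument: log file (defaults to the path given in this article).
summarize_dcs_log() {
    log="${1:-/opt/oracle/dcs/log/dcs-agent.log}"
    grep 'DCSZQueue:' "$log" \
        | sed -e 's/.*DCSZQueue: //' -e 's/[0-9][0-9]*/N/g' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

A few message types with very high counts would suggest that the growth comes from the retry loop rather than from normal agent activity.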
Analysis:-
Analysis of the logs showed that a job had not completed and was still running. Further checking showed this to be a known bug in this version, in which the connection between the ODA clients is lost and the job keeps running.
Solution:-
To resolve the error, perform the following steps.
- Run all four of the following commands on node1 and then on node2 to stop the clients.
Note:-
Run all four commands on node 1 first, then run all four commands on node 2.
- # /opt/zookeeper/bin/zkServer.sh stop
- # systemctl stop initdcsagent
- # systemctl stop initdcscontroller
- # systemctl stop initdcsadmin
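- Now run all four of the following commands on node1 and then on node2 to start the clients.
Note:-
Run all four commands on node 1 first, then run all four commands on node 2.
- # /opt/zookeeper/bin/zkServer.sh start
- # systemctl start initdcsagent
- # systemctl start initdcscontroller
- # systemctl start initdcsadmin
- Then check the status of each service and confirm that the agent responds.
- # /opt/zookeeper/bin/zkServer.sh status
- # systemctl status initdcsagent
- # systemctl status initdcscontroller
- # systemctl status initdcsadmin
- # odacli ping-agent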
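The three systemd status checks above can also be run in one pass. The following is only a sketch: the function name check_dcs_services and the DCS_CHECKER variable are hypothetical, and the default check uses `systemctl is-active --quiet`, which returns success only when a unit is active. It does not replace the ZooKeeper status check or `odacli ping-agent`.

```shell
# Hypothetical helper: report whether the three DCS services are active.
# DCS_CHECKER can override the check command (an assumption, used for dry runs).
check_dcs_services() {
    checker="${DCS_CHECKER:-systemctl is-active --quiet}"
    failed=0
    for svc in initdcsagent initdcscontroller initdcsadmin; do
        # $checker is left unquoted on purpose so it splits into command + options
        if $checker "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            failed=1
        fi
    done
    return "$failed"
}
```

Run it as root on node 1 first and then on node 2; a non-zero return code means at least one service is not active.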
Once the action plan is completed, check the log file again; it should have stopped generating the error, and the file size should no longer be increasing.
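One way to confirm that the file is no longer growing is to compare its size at two points in time. This is a minimal sketch; the function name log_growth and the 60-second default interval are assumptions, not part of the original action plan.

```shell
# Hypothetical helper: print how many bytes a file grew during an interval.
# Arguments: file path, interval in seconds (default 60).
log_growth() {
    file="$1"
    interval="${2:-60}"
    before=$(wc -c < "$file")
    sleep "$interval"
    after=$(wc -c < "$file")
    echo $((after - before))
}
```

For example, `log_growth /opt/oracle/dcs/log/dcs-agent.log 60` prints the number of bytes written in one minute. Some growth is normal while the agent is working, so the figure to compare against is the earlier rate of roughly 1.5G per day.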