Oracle Database Appliance Space Utilization Issue
One of the basic issues on an ODA (Oracle Database Appliance) system is space utilization in the /opt mount point.
I recently faced this issue on one node of my Oracle Database Appliance, version 19.17.0.0.0. A thorough check of the /opt mount point showed that the file “dcs-agent.log” was writing the same log messages repeatedly, producing a log file of around 1.5G every day on just that one node.
File Location:-
/opt/oracle/dcs/log/dcs-agent.log
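A check like the one described above can be sketched as a small shell function that lists the largest files under a directory. This is only an illustration: the function name largest_files, the 500M default threshold, and the limit of 20 results are assumptions, not part of any ODA tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list regular files above a size threshold, biggest first.
# $1 = directory to scan (default /opt)
# $2 = find-style size threshold (default +500M)
largest_files() {
    local dir="${1:-/opt}"
    local min_size="${2:-+500M}"
    # -xdev keeps find on the file system of the starting directory.
    # du -k prints the size in KB, which sort -rn then orders numerically.
    find "$dir" -xdev -type f -size "$min_size" -exec du -k {} + 2>/dev/null \
        | sort -rn | head -n 20
}

# Example call (run as root on the affected node):
#   largest_files /opt +500M
```

Running df -h /opt before and after gives a quick view of how much space the mount point itself has left.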
Error:-
The following messages are found in the log:
2024-01-02 07:48:02,162 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request XXX70 is still inside being-processed-queue
2024-01-02 07:48:02,162 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:02,162 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1614_XXX70_
2024-01-02 07:48:02,208 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Request XXX71 is still inside being-processed-queue
2024-01-02 07:48:02,208 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:02,209 INFO [JobReportRecorder_TaskZJsonRpcExt_1156 : JobId=416a7f72-78c3-4c7f-bf60-133e4df77809] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1667_XXX71_
2024-01-02 07:48:03,712 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011201 is still inside in-queue
2024-01-02 07:48:03,712 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,713 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2422_11201_
2024-01-02 07:48:03,713 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011176 is still inside in-queue
2024-01-02 07:48:03,713 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,713 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1735_11176_
2024-01-02 07:48:03,718 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011181 is still inside in-queue
2024-01-02 07:48:03,718 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,718 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1916_11181_
2024-01-02 07:48:03,720 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Request 0000011177 is still inside in-queue
2024-01-02 07:48:03,720 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,720 INFO [JobReportRecorder_TaskZJsonRpcExt_1196 : JobId=ddf01a02-6918-4c45-b5c0-b755aaf6d525] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1818_11177_
2024-01-02 07:48:03,721 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011199 is still inside in-queue
2024-01-02 07:48:03,721 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,722 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2289_11199_
2024-01-02 07:48:03,723 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Request 0000011189 is still inside in-queue
2024-01-02 07:48:03,723 DEBUG [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,723 INFO [JobReportRecorder_TaskZJsonRpcExt_1276 : JobId=f297cee9-769a-4bd8-8d6f-017a693617c5] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2125_11189_
2024-01-02 07:48:03,728 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011220 is still inside in-queue
2024-01-02 07:48:03,728 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,728 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:3933_11220_
2024-01-02 07:48:03,731 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011185 is still inside in-queue
2024-01-02 07:48:03,731 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,731 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:2018_11185_
2024-01-02 07:48:03,733 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011179 is still inside in-queue
2024-01-02 07:48:03,733 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
2024-01-02 07:48:03,734 INFO [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Node not yet received from /nodes/node_1/cmd-out-q with prefix:1765_11179_
2024-01-02 07:48:03,734 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request 0000011194 is still inside in-queue
2024-01-02 07:48:03,734 DEBUG [DescribeSystemComponent with Node Information Node] [] c.o.d.a.z.DCSZQueue: Request is found on remote node.
Analysis:-
Analysis of the logs showed that the issue was caused by a job that had not completed and was still in the running state. Further checking showed that this is a known bug in this version, in which the connection between the ODA clients is lost and the job keeps running.
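One way to confirm that the log is dominated by a few repeating messages is to collapse the changing numbers and count the duplicates. The sketch below assumes the log format shown above; the function name summarize_dcs_log and the top-10 limit are hypothetical choices for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the most frequent DCSZQueue messages in a log.
# $1 = log file (default /opt/oracle/dcs/log/dcs-agent.log)
summarize_dcs_log() {
    local log="${1:-/opt/oracle/dcs/log/dcs-agent.log}"
    # Keep only DCSZQueue lines, strip everything up to the logger name,
    # and replace every run of digits with N so that request IDs and
    # prefixes collapse into a single message before counting.
    grep 'DCSZQueue: ' "$log" \
        | sed -E 's/^.*DCSZQueue: //; s/[0-9]+/N/g' \
        | sort | uniq -c | sort -rn | head -n 10
}

# Example call:
#   summarize_dcs_log /opt/oracle/dcs/log/dcs-agent.log
```

If a handful of messages account for nearly every line, the growth is coming from a polling loop rather than from normal activity. The odacli list-jobs command can then be used to look for jobs that are still in a running state.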
Solution:-
To resolve the error, perform the following steps.
- Run all four of the following commands on node1 and then on node2 to stop the clients.
Note:-
Run all four commands on node 1 first, then run all four commands on node 2.
- # /opt/zookeeper/bin/zkServer.sh stop
- # systemctl stop initdcsagent
- # systemctl stop initdcscontroller
- # systemctl stop initdcsadmin
- Now run the following commands on node1 and then on node2 to start the clients. The final odacli ping-agent command checks that the agent responds.
Note:-
Run all the commands on node 1 first, then run them on node 2.
- # /opt/zookeeper/bin/zkServer.sh start
- # systemctl start initdcsagent
- # systemctl start initdcscontroller
- # systemctl start initdcsadmin
- # odacli ping-agent
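The per-node command sequences above can be wrapped in a small script so that the same order is used on each node. This is a sketch under assumptions, not an Oracle-supplied tool: the names dcs_services, run and DRY_RUN are hypothetical, and by default the script only prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the ZooKeeper and DCS service commands.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 = stop | start | status, applied to ZooKeeper and the three DCS services
# in the same order as the manual steps.
dcs_services() {
    local action="$1" svc
    case "$action" in
        stop|start|status) ;;
        *) echo "usage: dcs_services stop|start|status" >&2; return 2 ;;
    esac
    run /opt/zookeeper/bin/zkServer.sh "$action"
    for svc in initdcsagent initdcscontroller initdcsadmin; do
        run systemctl "$action" "$svc"
    done
}

# Example (as root, node 1 first and then node 2):
#   DRY_RUN=0 dcs_services stop
```

Keeping the dry-run default makes it easy to review the exact commands before anything is stopped on a production node.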
Once the action plan is completed, check the log file again. It should have stopped generating the error, and the file should no longer be growing.
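To check the growth with a number rather than by eye, the file size can be sampled twice and compared. The following is a minimal sketch; the function name log_growth_bytes and the 60-second default interval are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many bytes a file grew during an interval.
# $1 = log file (default /opt/oracle/dcs/log/dcs-agent.log)
# $2 = seconds to wait between the two samples (default 60)
log_growth_bytes() {
    local log="${1:-/opt/oracle/dcs/log/dcs-agent.log}"
    local interval="${2:-60}"
    local before after
    before=$(wc -c < "$log") || return 1
    sleep "$interval"
    after=$(wc -c < "$log") || return 1
    # A negative value means the file was rotated or truncated in between.
    echo $(( after - before ))
}

# Example call:
#   log_growth_bytes /opt/oracle/dcs/log/dcs-agent.log 60
```

Before the fix, a log growing by about 1.5G per day works out to roughly 1 MB per minute, so a value close to zero over a minute or two suggests the repeated messages have stopped.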