The user noted that readings had stopped collecting on a specific date. The ODC service was checked and found to have stalled; however, restarting the ODC service did not correct the issue for all sites. Some sites were collecting ODC information and some were not.
The error logs were investigated, and there were no errors in them indicating that the server was having problems. When we compared the sites under On Premises ODC > Pending Requests, there were recent pending reading requests for all sites. For the sites not receiving readings, the recent requests all had an On-Premises status of "waiting to send", with a small subset of "sent to on-premises" dated from the day the readings stopped updating in Asset Reliability.
Next, the ODC data sources were compared and found to be as expected. It was originally thought that the password might have been changed, but it had not.
Next, we met with the user's PI Administrator to check the status of the APMOnPremODCAgent application installed on the PI server. The service was started. When reviewing the OnPremODCAgent log on the PI server, there was a stand-out set of errors. The first part of the error set indicated a failed communication to an IP address, via time-out or active connection refusal:
2024-08-19 07:02:30.3116Z - ERROR: GetData request for "https://localhost:44314/api/PendingODCRequests/?bMarkRequestAsSent=true" timed out:
2024-08-19 07:02:30.3116Z - ERROR: An error occurred while sending the request.
2024-08-19 07:02:30.3116Z - ERROR: Unable to connect to the remote server
2024-08-19 07:02:30.3116Z - ERROR: No connection could be made because the target machine actively refused it XXX.X.X.X:44314
The IP address was then pinged successfully, indicating that the host was reachable. A ping does not test a specific TCP port, however, so it could not by itself rule out a problem on port 44314.
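A successful ping shows only that the host answers ICMP. Whether a particular TCP port accepts connections can be checked separately. The following is a minimal Python sketch of such a check, assuming Python is available on the machine running the agent. The function name port_is_open is hypothetical and is not part of any Bentley tooling.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For example, port_is_open("localhost", 44314) returning False on the PI server would be consistent with the "actively refused" errors in the log, since nothing would be listening on that port locally.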
The next major error still indicated a Web Exception with the same web address:
2024-08-19 07:02:30.3116Z - ERROR: GetData request for "https://localhost:44314/api/PendingODCRequests/?bMarkRequestAsSent=true" produced exception: System.AggregateException: One or more errors occurred. ---> System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it XXX.X.X.X:44314
at System.Net.Sockets.Socket.InternalEndConnect(IAsyncResult asyncResult)
at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)
--- End of inner exception stack trace ---
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at System.Net.Http.HttpClientHandler.GetResponseCallback(IAsyncResult ar)
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
at Bentley.OnPremODCAgent.RestWrapper.GetData[T](String requestUri)
---> (Inner Exception #0) System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it XXX.X.X.X:44314
at System.Net.Sockets.Socket.InternalEndConnect(IAsyncResult asyncResult)
at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)
--- End of inner exception stack trace ---
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at System.Net.Http.HttpClientHandler.GetResponseCallback(IAsyncResult ar)
--- End of inner exception stack trace ---<---
At this point, it was noticed that the web address used an unusual port, as most APM services, including the ODC service, use a different default port. We then checked the OnPremODCAgent.config file, which showed that the address was set to the default "localhost:44314" but that there was an API key, indicating that the service was hosted by Bentley.
In a hosted environment, the ODC service runs from the hosted Server Manager, which identified the source of the problem: the APMOnPremODCAgent had been reinstalled on the PI Server, and the reinstall had overwritten the "APMServerManagerAddress" with the default value of "https://Localhost:44314". The user then stopped the APMOnPremODCAgent Windows service, reran the APMOnPremODCAgent installer, updated the address with their Bentley-hosted APM Server Manager address, tested the connections, saved, and restarted the Windows service.
Note: when updating the Server Manager address, the API key should already be filled in; do not overwrite it. If it is blank, you will need to use the API key that was originally provided by Bentley.
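In a hosted environment, a request URL that points at the local machine is the tell-tale sign of this misconfiguration. The following is a minimal Python sketch that scans any text (for example, lines copied from the OnPremODCAgent log or the contents of OnPremODCAgent.config) for URLs whose host is a loopback name. The function name, the URL pattern, and the set of loopback names are assumptions made for illustration. The sketch does not rely on the actual layout of the config file.

```python
import re
from urllib.parse import urlsplit

# Host names treated as "this machine"; an assumption for this sketch.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}

# Matches http/https URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def find_loopback_urls(text: str) -> list[str]:
    """Return each distinct URL in text whose host is a loopback name."""
    found = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0)
        try:
            # hostname is lower-cased and stripped of port and brackets.
            host = urlsplit(url).hostname or ""
        except ValueError:
            # Malformed URL (for example, an unbalanced IPv6 bracket); skip it.
            continue
        if host in LOOPBACK_HOSTS and url not in found:
            found.append(url)
    return found
```

Running find_loopback_urls over the agent log from this case would report the "https://localhost:44314/api/PendingODCRequests/..." request, which is a prompt to check the Server Manager address in the agent configuration.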
When we then checked APM ODC On Prem > Pending requests for the site in question, the pending requests had all been received and processed. We then verified that there were ODC indicators on the site with readings dated to the time of the change, indicating that the ODC service for the site had been fully restored.