java.sql.SQLRecoverableException: IO Error: Socket read timed out

java.sql.SQLRecoverableException: IO Error: Socket read timed out during adop/adpatch


While Application Team try to applying the R12.2 upgrade driver, They  faced the issue of WFXLoad.class failing in adworker log but showing up as running on adctrl
        Control





Worker  Code      Context            Filename                    Status
------  --------  -----------------  --------------------------  --------------
     1  Run       AutoPatch R120 pl  WFXLoad.class               Running     
     2  Run       AutoPatch R120 pl  WFXLoad.class               Running     
     3  Run       AutoPatch R120 pl  WFXLoad.class               Running     
     4  Run       AutoPatch R120 pl  WFXLoad.class               Running     
     5  Run       AutoPatch R120 pl  WFXLoad.class               Running     
     6  Run       AutoPatch R120 pl                              Wait       
     7  Run       AutoPatch R120 pl  WFXLoad.class               Running
     
 

log shows:




Exception in thread "main" java.sql.SQLRecoverableException: IO Error: Socket read timed out
        at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:482)
        at oracle.jdbc.driver.PhysicalConnection.(PhysicalConnection.java:678)
        at oracle.jdbc.driver.T4CConnection.(T4CConnection.java:238)
        at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:34)
        at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:567)
        at java.sql.DriverManager.getConnection(DriverManager.java:571)
        at java.sql.DriverManager.getConnection(DriverManager.java:215)
        at oracle.apps.ad.worker.AdJavaWorker.getAppsConnection(AdJavaWorker.java:1041)
        at oracle.apps.ad.worker.AdJavaWorker.main(AdJavaWorker.java:276)
Caused by: oracle.net.ns.NetException: Socket read timed out
        at oracle.net.ns.Packet.receive(Packet.java:341)
        at oracle.net.ns.NSProtocol.connect(NSProtocol.java:308)
        at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1222)
        at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:330)
        ... 8 more





This was happening again and again. The DBAs were suspecting network issue, cluster issue, server issue and all the usual suspects.  In Database alert log we saw these errors coming every few seconds:
Fatal NI connect error 12537, connecting to:







 (LOCAL=NO)
  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Time: 11-NOV-2014 21:58:21
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12537

TNS-12537: TNS:connection closed
    ns secondary err code: 12560
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
opiodr aborting process unknown ospid (26388) as a result of ORA-609


We tried changing the parameters in sqlnet.ora and listener.ora as instructed in the article:
Troubleshooting Guide for

ORA-12537 / TNS-12537 TNS:Connection Closed (Doc ID 555609.1)
Sqlnet.ora: SQLNET.INBOUND_CONNECT_TIMEOUT=180
Listener.ora: INBOUND_CONNECT_TIMEOUT_listener_name=120


However, the errors continued.  To rule out any issues in network, I also restarted the network service

on Linux:
service network restart


One thing which I noticed was the extra amount of time that the connect was taking 4 seconds:


21:22:36 SQL> conn apps/apps
Connected.
21:22:40 SQL>



Checked from remote app tier and it was same 4 seconds.
Stopped listener and checked on DB server that uses bequeath protocol:


23:23:55 SQL> conn / as sysdba
Connected.
23:23:59 SQL> conn / as sysdba
Connected.



Again it took 4 seconds.
A few days back, I had seen that connect time had increased after turning setting the DB initialization parameter pre_page_sga to true in a different instance.  On a hunch, I checked this and indeed pre_page_sga was set to true.  I set this back to false:


alter system set pre_page_sga=false scope=spfile;
shutdown immediate;
exit
sqlplus /nolog
conn / as sysdba
startup
SQL> set time on





23:39:46 SQL> conn / as sysdba
Connected.
23:39:48 SQL>



The connections were happening instantly(2sec).  So I went ahead and resumed the patch after setting:

update fnd_install_processes
set control_code='W', status='W';



commit;


Application Team  restarted the patch and all the workers completed successfully.  And the patch was running significantly faster.  So I did a search on support.oracle.com to substantiate my solution with official Oracle documentation.  I found the following articles:


Slow Connection or ORA-12170 During Connect when PRE_PAGE_SGA init.ora Parameter is Set (Doc ID 289585.1)

Health Check Alert: Consider setting PRE_PAGE_SGA to FALSE (Doc ID 957525.1)



The first article (289585.1) says:

PRE_PAGE_SGA can increase the process startup duration, because every process that starts must access every page in the SGA. This approach can be useful with some applications, but not with all applications. Overhead can be significant if your system frequently creates and destroys processes by, for example, continually logging on and logging off. The advantage that PRE_PAGE_SGA can afford depends on page size.



The second article (957525.1) says:

Having the PRE_PAGE_SGA initialization parameter set to TRUE can significantly increase the time required to establish database connections.



 "Overhead can be significant if your system frequently creates and destroys processes by,
example, continually logging on and logging off.".  That is exactly what happens when you do adpatch or adop.


Keep this in mind, whenever you do adpatch or adop, make sure that pre_page_sga is set to false.  It is possible that you may get the error "java.sql.SQLRecoverableException: IO Error: Socket read timed out" if you don't.  Also the patch will run significantly slower if pre_page_sga is set to true.  So set it to false and avoid these issues.


adpatch or adop :  https://blogs.oracle.com/ebstech/top-9-frequently-asked-questions-about-online-patching