[Server-cvs] engine/core server_engine.cpp, 1.14.2.4.14.2, 1.14.2.4.14.3

srao at helixcommunity.org srao at helixcommunity.org
Wed Jan 17 23:23:06 PST 2007


Update of /cvsroot/server/engine/core
In directory cvs01.internal.helixcommunity.org:/tmp/cvs-serv30906

Modified Files:
      Tag: SERVER_11_1
	server_engine.cpp 
Log Message:

Synopsis
========
After running for 3 days, the server stops responding to client requests.


Branches: SERVER_11_1_RN 
Suggested Reviewer: Atin, anyone


Description
===========
After running for 3 days, the server stops responding to client requests.
From the code, we observed that this scenario can happen if select() returns -1 in the main loop.
We confirmed this by modifying the select() call to return -1 in the main loop. select() returns -1 if, among other errors, one of the provided descriptors is invalid.
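
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system (where the error is EBADF; the Windows build in this change checks WSAENOTSOCK instead). The function name SelectErrnoForClosedFd is hypothetical and is not part of the server code.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns the errno reported by select() when the
// read set contains a descriptor that has already been closed, 0 if
// select() did not fail, or -1 if the test socket could not be created.
int SelectErrnoForClosedFd()
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
    {
        return -1;
    }
    assert(fd < FD_SETSIZE); // FD_SET is only defined below FD_SETSIZE
    close(fd);               // fd is now a stale descriptor

    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    struct timeval tv = {0, 0}; // zero timeout: poll and return at once
    int n = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    return (n < 0) ? errno : 0;
}
```

As long as the stale descriptor stays in the set, every subsequent select() call fails the same way, which is consistent with a main loop that never services its other descriptors.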

Fix
===

Implemented a new function, CallbackContainer::HandleBadFds(), to delete bad descriptors. This function is called whenever select() returns -1 in the main loop.
It traverses the descriptor list and calls getsockname() for each descriptor. Whenever getsockname() returns -1, it calls Callbacks::Remove() to remove the corresponding descriptor. The existing Callbacks::Remove() function deleted a descriptor only if that descriptor was present in the list m_map. However, the bad descriptor may be missing from the list because of corruption, so Callbacks::Remove() was modified to remove the bad descriptor even when it is not present in m_map.
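
The probing pass described above can be sketched as follows. This is an illustration only, assuming POSIX sockets and a plain std::set<int> as the descriptor list; the name RemoveBadFds is hypothetical, and the sketch does not reproduce CallbackContainer, Callbacks::Remove(), or m_map.

```cpp
#include <cassert>
#include <set>
#include <sys/socket.h>
#include <unistd.h>
#include <vector>

// Hypothetical sketch: probe every descriptor with getsockname() and drop
// the ones for which the call fails. Returns the descriptors removed.
std::vector<int> RemoveBadFds(std::set<int>& fds)
{
    std::vector<int> removed;
    for (auto it = fds.begin(); it != fds.end();)
    {
        struct sockaddr_storage addr;
        socklen_t len = sizeof(addr);
        if (getsockname(*it, reinterpret_cast<struct sockaddr*>(&addr), &len) < 0)
        {
            removed.push_back(*it);
            it = fds.erase(it); // erase() returns the next valid iterator
        }
        else
        {
            ++it;
        }
    }
    return removed;
}

// Usage example: one live socket and one closed socket; only the closed
// one should be removed. Returns the number of descriptors removed.
int DemoRemoveBadFds()
{
    int good = socket(AF_UNIX, SOCK_STREAM, 0);
    int bad = socket(AF_UNIX, SOCK_STREAM, 0);
    assert(good >= 0 && bad >= 0);
    close(bad);

    std::set<int> fds = {good, bad};
    std::vector<int> removed = RemoveBadFds(fds);
    assert(fds.count(good) == 1);
    close(good);
    return static_cast<int>(removed.size());
}
```

Erasing through the iterator keeps the traversal valid while entries are removed, which is the main thing to get right in a pass like this.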

The error statement inside HandleBadFds() prints an error message only if getsockname() fails with WSAENOTSOCK. For that reason we also print an error message in the main loop, which is reported irrespective of the error code. Added an error counter; the main-loop message is printed only while the error counter is < 1000.
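
The error counter described above acts as a simple log throttle. A minimal sketch of the idea, assuming a hypothetical ErrorThrottle type that is not part of this change, is shown below. Unlike the plain UINT32 counter in the diff, this sketch saturates instead of wrapping, so reporting cannot resume after an overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical throttle: counts every error, but allows a report only
// while the counter is below the limit (the change uses a limit of 1000).
class ErrorThrottle
{
public:
    explicit ErrorThrottle(uint32_t limit) : m_limit(limit), m_count(0)
    {
        assert(limit > 0);
    }

    // Call once per error; returns true if this error should be logged.
    bool ShouldReport()
    {
        if (m_count < UINT32_MAX)
        {
            ++m_count; // saturate rather than wrap around
        }
        return m_count < m_limit;
    }

private:
    uint32_t m_limit;
    uint32_t m_count;
};

// Usage example: returns how many of `calls` consecutive errors are logged.
uint32_t CountReported(uint32_t limit, uint32_t calls)
{
    ErrorThrottle throttle(limit);
    uint32_t reported = 0;
    for (uint32_t i = 0; i < calls; ++i)
    {
        if (throttle.ShouldReport())
        {
            ++reported;
        }
    }
    return reported;
}
```

Because the counter is incremented before the comparison, a limit of 1000 lets through 999 messages, which matches the order of operations in the diff.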

Files Affected
==============
/server/engine/core/server_engine.cpp
/server/engine/core/pub/platform/win/callback_container.h
/pub/platform/win/servcallback.h


Testing Performed
=================
As this is an uptime issue, I modified the code so that select() returns -1, and verified that the bad descriptor gets deleted properly and that the main loop continues to accept new connections.



Build verified: win32-i386-vc7


QA Hints
===============
QA to do regression testing.




Index: server_engine.cpp
===================================================================
RCS file: /cvsroot/server/engine/core/server_engine.cpp,v
retrieving revision 1.14.2.4.14.2
retrieving revision 1.14.2.4.14.3
diff -u -d -r1.14.2.4.14.2 -r1.14.2.4.14.3
--- server_engine.cpp	15 Sep 2006 20:58:20 -0000	1.14.2.4.14.2
+++ server_engine.cpp	18 Jan 2007 07:23:03 -0000	1.14.2.4.14.3
@@ -87,6 +87,7 @@
 
 #include "server_context.h"
 #include "globals.h"
+#include "safestring.h"
 
 #ifdef PAULM_SOCKTIMING
 #include "sockettimer.h"
@@ -492,6 +493,8 @@
     Timeval     last_left_select_time;
     Timeval     last_in_select_time;
     volatile unsigned int guard2 = 0xcc110088;
+    UINT32 ulErrCode = 0;
+    UINT32 ulErrCounter = 0;
 
 #ifndef _WIN32
     /*
@@ -686,6 +689,15 @@
 
         n = callbacks.Select((struct timeval*)timeoutp);
 
+        if (n < 0)
+        {
+#ifdef _WIN32
+            ulErrCode = WSAGetLastError();
+#else
+            ulErrCode = errno;
+#endif
+        }
+
         m_ulMainloopIterations++;
 
         GETTIMEOFDAY(now);
@@ -703,6 +715,15 @@
         m_pSCurrentElem = schedule.get_execute_list(now);
         if (n < 0)
         {
+            ulErrCounter++;
+            if (ulErrCounter < 1000)
+            {
+                char buf[256] = "\0";
+                SafeSprintf(buf, 256, "select error in mainloop = %u , timeout: %ld.%06ld, procnum = %d\n",
+                        ulErrCode, timeout.tv_sec, timeout.tv_usec, proc->procnum());
+                proc->pc->error_handler->Report(HXLOG_ERR, HXR_FAIL, (ULONG32)HXR_FAIL, buf, 0);
+            }
+
             m_bMoreReaderOrWriter = FALSE;
             m_bMoreTSReaderOrWriter = FALSE;
 
@@ -714,6 +735,12 @@
                 callbacks.HandleBadFds(proc->pc->error_handler);
             }
 #endif
+#ifdef _WIN32
+            if (ulErrCode == WSAENOTSOCK)
+            {
+                callbacks.HandleBadFds(proc->pc->error_handler);
+            }
+#endif            
         }
         else if (n == 0)
         {



