[Common-dev] Re: [Client-dev] CR: sock option for enabling read buffer reallocation to save memory
Eric Hyche ehyche at real.com
Comments in-line...
At 06:39 PM 8/4/2004 -0700, Liam Murray wrote:
>(NOTE: this updates the previous CR that I sent and that no one has
>responded to yet.)
>
>Summary:
>
>Adds socket option that lets socket user control re-allocation of read
>buffer in order to reduce memory consumption.
>
>Details:
>
>With UDP reads we allocate a default UDP read buffer of 64K. After we read
>the buffer and are able to determine the actual size of the datagram we
>update the size of the buffer. This does not free unused memory from the
>buffer. Therefore each datagram held in the application buffers is using
>64K even if the actual datagram is very small. This can add up to a lot of
>memory usage. (Similar occurs with tcp reads in cases where the net buffer
>has less than the maximum segment size.)
>
>One solution to avoid the memory overhead is to copy the buffer to a
>precisely sized buffer once the true size is known. Another solution is to
>use ioctlsocket with FIONREAD to determine the pending datagram size. This
>was discussed earlier and there are some uncertainties to doing this. For
>now the ioctlsocket approach is not considered.
>
>This change adds adds support for a socket option (HX_SOCKOPT_RCVBUF_FLAG)
>that enables the user to specify one of the following flags.
>
>typedef enum
>{
> HX_SOCK_RCVBUF_SIZE_DEFAULT,
> HX_SOCK_RCVBUF_SIZE_COPY,
>} HXSockRcvBufFlag;
>
>If HX_SOCK_RCVBUF_SIZE_DEFAULT (the default) is set the behavior remains
>the same. If HX_SOCK_RCVBUF_SIZE_COPY is set, each read buffer allocated
>by the IHXSocket implementation is reallocated (copied) to a precisely
>sized buffer after the actual size is known.
>
>Some clients are particularly sensitive to memory usage and the cost of an
>extra copy is worth the memory we can save by doing the copy.
>
>There is some room for optimization. For example, a temporary buffer (or
>vector of buffers) sized to the max read size (max UDP or max segment) can
>be allocated once and reused for each read as long as we always allocate a
>new buffer and copy to it.
>
>Branch: HEAD
>
>Builds/functionality verified: Windows client
>
>
>Liam
>
>
>Index: hxnet.h
>===================================================================
>RCS file: /cvsroot/common/include/hxnet.h,v
>retrieving revision 1.22
>diff -u -w -r1.22 hxnet.h
>--- hxnet.h 30 Jul 2004 20:53:56 -0000 1.22
>+++ hxnet.h 5 Aug 2004 01:13:01 -0000
>@@ -439,6 +439,12 @@
> HX_SOCKBUF_TIMESTAMPED // IHXTimeStampedBuffer
> } HXSockBufType;
>
>+typedef enum
>+{
>+ HX_SOCK_RCVBUF_SIZE_DEFAULT, // avoids copy
>+ HX_SOCK_RCVBUF_SIZE_COPY, // conserves memory; results in a copy
>+} HXSockRcvBufFlag;
>+
> /* Socket options for IHXSocket::GetOption() and IHXSocket::SetOption() */
> /* XXXTDM: Separate these like the real getsockopt/setsockopt? */
> typedef enum {
>@@ -497,6 +503,7 @@
> HX_SOCKOPT_BUFFER_TYPE,
> HX_SOCKOPT_APP_SNDBUF,
> HX_SOCKOPT_APP_RCVBUF,
>+ HX_SOCKOPT_RCVBUF_FLAG,
>
> HX_SOCKOPT_LAST
> } HXSockOpt;
>
>
>Index: hxsockutil.cpp
>===================================================================
>RCS file: /cvsroot/common/netio/hxsockutil.cpp,v
>retrieving revision 1.12
>diff -u -w -r1.12 hxsockutil.cpp
>--- hxsockutil.cpp 30 Jul 2004 19:58:18 -0000 1.12
>+++ hxsockutil.cpp 5 Aug 2004 01:09:28 -0000
>@@ -42,6 +42,7 @@
> #include "safestring.h"
> #include "debug.h"
> #include "hxassert.h"
>+#include "netbyte.h"
> #include "hxheap.h"
> #include "hxsockutil.h"
>
>@@ -254,7 +255,9 @@
>
> }
>
>-HX_RESULT HXSockUtil::SetAddr(IHXSockAddr* pAddr /*modified*/, const
>char* pAddrString, UINT16 port)
>+HX_RESULT HXSockUtil::SetAddr(IHXSockAddr* pAddr /*modified*/,
>+ const char* pAddrString,
>+ UINT16 port)
> {
> HX_ASSERT(pAddr);
> HX_ASSERT(pAddrString);
>@@ -275,15 +278,15 @@
>
> HX_RESULT HXSockUtil::SetAddr(IHXSockAddr* pAddr /*modified*/,
> short family,
>- const void* pAddrData /*net-order
>octets/hextets*/,
>- UINT16 port)
>+ const void* pAddrData /*net order
>octets/hextets*/,
>+ UINT16 port /*net order*/)
> {
Since IHXSockAddr::SetPort() now takes a host-order port, shouldn't
we just change HXSockUtil::SetAddr() to also take a host-order
port? Seems strange for the app to have to produce a net-order
port just to have it changed back to a host-order port....
> HX_ASSERT(pAddrData);
>
> char szAddr[HX_ADDRSTRLEN]; // big enough for ipv6 and ipv4
> hx_inet_ntop(family, pAddrData, szAddr, HX_ADDRSTRLEN);
>
>- return SetAddr(pAddr, szAddr, port);
>+ return SetAddr(pAddr, szAddr, WToHost(port));
> }
>
>
>
>Index: platform/posix/sockimp.cpp
>===================================================================
>RCS file: /cvsroot/common/netio/platform/posix/sockimp.cpp,v
>retrieving revision 1.37
>diff -u -w -r1.37 sockimp.cpp
>--- platform/posix/sockimp.cpp 30 Jul 2004 20:32:57 -0000 1.37
>+++ platform/posix/sockimp.cpp 5 Aug 2004 01:09:28 -0000
>@@ -1005,6 +1005,7 @@
> m_type(HX_SOCK_TYPE_NONE),
> m_proto(HX_SOCK_PROTO_NONE),
> m_bufType(HX_SOCKBUF_DEFAULT),
>+ m_rcvBufFlag(HX_SOCK_RCVBUF_SIZE_DEFAULT),
> m_sock(HX_SOCK_NONE),
> #if defined(MISSING_DUALSOCKET)
> m_pSock4(NULL),
>@@ -1034,7 +1035,6 @@
> }
> HX_ASSERT(m_punkContext != NULL);
> HX_ASSERT(m_pCCF != NULL);
>- HX_ASSERT(m_pScheduler != NULL);
> }
>
> CHXSocket::CHXSocket(CHXNetServices* pNetSvc, IUnknown* punkContext,
>@@ -1077,7 +1077,6 @@
> }
> HX_ASSERT(m_punkContext != NULL);
> HX_ASSERT(m_pCCF != NULL);
>- HX_ASSERT(m_pScheduler != NULL);
> }
>
> CHXSocket::~CHXSocket(void)
>@@ -1141,28 +1140,32 @@
> break;
> case HX_SOCK_STATE_CONNECTING:
> HX_ASSERT(uEvent & HX_SOCK_EVENT_WRITE);
>- m_sock.state = HX_SOCK_STATE_NORMAL; //XXXTDM: What if connect fails?
>-
>- if (m_uSelectedEventMask & HX_SOCK_EVENT_CONNECT)
>+ if (uEvent & HX_SOCK_EVENT_WRITE)
> {
> UINT32 err = 0;
>- HX_RESULT status;
> hx_getsockopt(&m_sock, HX_SOCKOPT_SOCKERR, &err);
>- status = ErrorToStatus(err);
>+ HX_RESULT status = ErrorToStatus(err);
>+
>+ // NB: we shouldn't send a close event unless the connection
>is successful
>+ m_sock.state = HX_SOCK_STATE_NORMAL;
>+
> if (m_uSelectedEventMask & HX_SOCK_EVENT_CONNECT)
> {
> m_pResponse->EventPending(HX_SOCK_EVENT_CONNECT, status);
> }
>- }
>- //XXXTDM: Pass up write event on failed connect?
>- if (m_uSelectedEventMask & HX_SOCK_EVENT_WRITE)
>- {
>+
> HX_ASSERT(m_bIsResponseWriteEventPending);
> m_bIsResponseWriteEventPending = FALSE;
>+
>+ // only send the write event if the connection was
>successfully established
>+ if (HXR_OK == status && m_uSelectedEventMask &
>HX_SOCK_EVENT_WRITE)
>+ {
> m_pResponse->EventPending(HX_SOCK_EVENT_WRITE, HXR_OK);
> }
>+ }
> break;
> case HX_SOCK_STATE_LISTENING:
>+ HX_ASSERT(uEvent & HX_SOCK_EVENT_READ);
> if (uEvent & HX_SOCK_EVENT_READ)
> {
> m_uPendingEventMask = HX_SOCK_EVENT_ACCEPT;
>@@ -1743,6 +1746,7 @@
> #endif
> if (name == HX_SOCKOPT_IDLETIMEOUT)
> {
>+ HX_ASSERT(m_pScheduler);
> if (m_pScheduler == NULL)
> {
> return HXR_FAIL;
>@@ -1759,12 +1763,17 @@
> }
> return HXR_OK;
> }
>- if (name == HX_SOCKOPT_BUFFER_TYPE)
>+ else if (name == HX_SOCKOPT_BUFFER_TYPE)
> {
> m_bufType = (HXSockBufType)val;
> return HXR_OK;
> }
>- if (name == HX_SOCKOPT_APP_SNDBUF)
>+ else if (name == HX_SOCKOPT_RCVBUF_FLAG)
>+ {
>+ m_rcvBufFlag = (HXSockRcvBufFlag)val;
>+ return HXR_OK;
>+ }
>+ else if (name == HX_SOCKOPT_APP_SNDBUF)
> {
> if (m_pSendBuffer)
> {
>@@ -1773,6 +1782,7 @@
> return HXR_OK;
> }
>
>+
> if (hx_setsockopt(&m_sock, name, val) != 0)
> {
> return ErrorToStatus(hx_lastsockerr());
>@@ -2003,6 +2013,7 @@
> return HXR_UNEXPECTED;
> }
>
>+
> HX_RESULT
> CHXSocket::GetNativeAddr(IHXSockAddr* pAddr, sockaddr_in6* psa6,
> sockaddr** ppsa, size_t* psalen)
>@@ -2025,6 +2036,48 @@
> }
>
> HX_RESULT
>+CHXSocket::SetReadBufferSize(IHXBuffer** ppBuf /*modified*/, UINT32 cbRead)
>+{
>+ HX_ASSERT(ppBuf && *ppBuf);
>+ HX_ASSERT(cbRead != 0);
>+
>+ HX_RESULT hxr = HXR_OK;
>+
>+ //XXXLCM If we get rid of the following line and 'copy' is specified
>we can allocate
>+ // one temp buffer into which we read and avoid some extra
>allocations.
>+ if (cbRead != (*ppBuf)->GetSize())
>+ {
>+ HX_ASSERT(cbRead < (*ppBuf)->GetSize());
>+ if (HX_SOCK_RCVBUF_SIZE_COPY == m_rcvBufFlag)
>+ {
>+ // allocate a new buffer so buffer does not waste unused
>memory space
>+ IHXBuffer* pResizedBuf = 0;
>+ hxr = CreateReadBuffer(&pResizedBuf);
>+ if (HXR_OK == hxr)
>+ {
>+ hxr = pResizedBuf->Set((*ppBuf)->GetBuffer(), cbRead);
>+ if (HXR_OK == hxr)
>+ {
>+ HX_RELEASE(*ppBuf);
>+ *ppBuf = pResizedBuf;
>+ (*ppBuf)->AddRef();
>+ }
>+ HX_RELEASE(pResizedBuf);
>+ }
>+ }
>+ else
>+ {
>+ HX_ASSERT(HX_SOCK_RCVBUF_SIZE_DEFAULT == m_rcvBufFlag);
>+ // just re-specify the size and avoid a copy
>+ hxr = (*ppBuf)->SetSize(cbRead);
>+ }
>+ }
>+
>+ return hxr;
>+}
>+
>+
>+HX_RESULT
> CHXSocket::CreateReadBuffer(IHXBuffer** ppBuf)
> {
> HX_RESULT hxr = HXR_OK;
>@@ -2116,7 +2169,7 @@
> n = hx_readfrom(&m_sock, (*ppBuf)->GetBuffer(), m_mss, psa,
> salen, bPeek);
> if (n > 0)
> {
>- (*ppBuf)->SetSize(n);
>+ hxr = SetReadBufferSize(ppBuf, n);
> }
> else if (n == 0)
> {
>@@ -2247,7 +2300,11 @@
> }
> else if (vec[n].get_len() < ppBufVec[n]->GetSize())
> {
>- ppBufVec[n]->SetSize(vec[n].get_len());
>+ hxr = SetReadBufferSize(&ppBufVec[n], vec[n].get_len());
>+ if (FAILED(hxr))
>+ {
>+ return hxr;
>+ }
> }
> }
> return HXR_OK;
>@@ -2767,7 +2824,7 @@
> Close();
> }
>
>-void
>+HX_RESULT
> CHXNetServices::Init(IUnknown* punkContext)
> {
> m_punkContext = punkContext;
>@@ -2775,10 +2832,12 @@
>
> //XXXLCM this loads then unloads the winsock dll (undesireable)
> HXNetDrvLoader loader;
>- if(HXR_OK == loader.EnsureLoaded())
>+ HX_RESULT hr = loader.EnsureLoaded();
>+ if (HXR_OK == hr)
> {
> m_bIN6 = hx_netdrv_familyavail(HX_SOCK_FAMILY_IN6);
> }
>+ return hr;
> }
>
> void
>Index: platform/win/hxsocketselector.cpp
>===================================================================
>RCS file: /cvsroot/common/netio/platform/win/hxsocketselector.cpp,v
>retrieving revision 1.6
>diff -u -w -r1.6 hxsocketselector.cpp
>--- platform/win/hxsocketselector.cpp 30 Jul 2004 22:57:10 -0000 1.6
>+++ platform/win/hxsocketselector.cpp 5 Aug 2004 01:09:28 -0000
>@@ -229,12 +229,6 @@
> wsaMask |= TranlateMaskValue(eventMask,
> HX_SOCK_EVENT_WRITE, FD_WRITE);
> wsaMask |= TranlateMaskValue(eventMask,
> HX_SOCK_EVENT_CLOSE, FD_CLOSE);
>
>- if(!m_wsaEventSemantics)
>- {
>- // we only will report HX_SOCK_EVENT_READ and HX_SOCK_EVENT_WRITE
>as on UNIX
>- wsaMask &= ~FD_CONNECT;
>- }
>-
> // register to receive async socket event notifications
> int res = WSAAsyncSelect(fd, m_pMsgSink->GetSinkHandle(),
> HXMSG_SOCKETSELECT, wsaMask);
> if( SOCKET_ERROR != res )
>@@ -412,7 +406,7 @@
>
> }
>
>-UINT32 HXSocketSelector::TranslateWSAAsyncSelectEvent(UINT16 wsaEvent)
>+UINT32 TranslateWSAAsyncSelectEvent(bool wsaEventSemantics, UINT16
>wsaEvent, UINT16 err)
> {
> UINT32 hxEvent = HX_SOCK_EVENT_NONE;
> switch (wsaEvent)
>@@ -446,11 +440,15 @@
> // applications you do not need to handle this; you just wait
> // for the write event.
> //
>- HX_ASSERT(m_wsaEventSemantics);
>- if (m_wsaEventSemantics)
>+ if (wsaEventSemantics)
> {
> hxEvent = HX_SOCK_EVENT_CONNECT;
> }
>+ else if(err != 0)
>+ {
>+ // pass write event along with error code
>+ hxEvent = HX_SOCK_EVENT_WRITE;
>+ }
> break;
>
> case FD_CLOSE:
>@@ -458,7 +456,7 @@
> // Indicates that stream socket has transitioned to close-pending
> // state (either gracefully or abortively) from connected state
> //
>- if(m_wsaEventSemantics)
>+ if(wsaEventSemantics)
> {
> hxEvent = HX_SOCK_EVENT_CLOSE;
> }
>@@ -478,7 +476,7 @@
> //
> // Helix synthesizes HX_SOCK_EVENT_ACCEPT based on
> HX_SOCK_EVENT_READ
> //
>- if(m_wsaEventSemantics)
>+ if(wsaEventSemantics)
> {
> hxEvent = HX_SOCK_EVENT_ACCEPT;
> }
>@@ -525,13 +523,16 @@
>
> if(pHandler)
> {
>- UINT32 hxEvent = TranslateWSAAsyncSelectEvent(event);
>+ UINT32 hxEvent =
>TranslateWSAAsyncSelectEvent(m_wsaEventSemantics, event, err);
> HX_RESULT hxResult = TranslateWSAAsyncSelectError(err);
>
> DPRINTF(D_SOCKETSELECT, ("HXSocketSelector::OnAsyncSelect():
> %s:%s translated as %s(= %u) for handler [%p]\n",
> GetWSAEventString(event), GetWSAErrorString(err),
> GetSockEventString(hxEvent), hxEvent, pHandler));
>
>+ if (HX_SOCK_EVENT_NONE != hxEvent)
>+ {
> pHandler->OnSelectEvent(hxEvent, hxResult);
>+ }
> }
> else
> {
>Index: platform/win/hxwinsocklib.cpp
>===================================================================
>RCS file: /cvsroot/common/netio/platform/win/hxwinsocklib.cpp,v
>retrieving revision 1.6
>diff -u -w -r1.6 hxwinsocklib.cpp
>--- platform/win/hxwinsocklib.cpp 27 Jul 2004 01:22:10 -0000 1.6
>+++ platform/win/hxwinsocklib.cpp 5 Aug 2004 01:09:28 -0000
>@@ -44,7 +44,7 @@
>
> #define D_WINSOCK D_INFO //XXXLCM
>
>-
>+#include "hxheap.h"
> #ifdef _DEBUG
> #undef HX_THIS_FILE
> static const char HX_THIS_FILE[] = __FILE__;
>Index: pub/hxsockutil.h
>===================================================================
>RCS file: /cvsroot/common/netio/pub/hxsockutil.h,v
>retrieving revision 1.9
>diff -u -w -r1.9 hxsockutil.h
>--- pub/hxsockutil.h 30 Jul 2004 19:58:45 -0000 1.9
>+++ pub/hxsockutil.h 5 Aug 2004 01:09:28 -0000
>@@ -12,7 +12,9 @@
>
> struct HXSockUtil // namespace
> {
>- static HX_RESULT SetAddr(IHXSockAddr* pAddr /*modified*/, const char*
>pAddrString, UINT16 port);
>+ static HX_RESULT SetAddr(IHXSockAddr* pAddr /*modified*/,
>+ const char* pAddrString,
>+ UINT16 port = 0);
>
> static HX_RESULT SetAddr(IHXSockAddr* pAddr /*modified*/,
> short family,
>Index: pub/platform/posix/sockimp.h
>===================================================================
>RCS file: /cvsroot/common/netio/pub/platform/posix/sockimp.h,v
>retrieving revision 1.23
>diff -u -w -r1.23 sockimp.h
>--- pub/platform/posix/sockimp.h 30 Jul 2004 00:32:32 -0000 1.23
>+++ pub/platform/posix/sockimp.h 5 Aug 2004 01:09:28 -0000
>@@ -302,6 +302,7 @@
> sockaddr** ppsa, size_t* psalen);
> HX_RESULT ErrorToStatus(int err);
> HX_RESULT CreateReadBuffer(IHXBuffer** ppBuf);
>+ HX_RESULT SetReadBufferSize(IHXBuffer** ppBuf /*modified*/, UINT32
>cbRead);
> HX_RESULT DoRead(IHXBuffer** ppBuf, IHXSockAddr** ppAddr = NULL,
> BOOL bPeek = FALSE);
> HX_RESULT DoWrite(IHXBuffer* pBuf,
>@@ -336,6 +337,7 @@
> HXSockType m_type;
> HXSockProtocol m_proto;
> HXSockBufType m_bufType;
>+ HXSockRcvBufFlag m_rcvBufFlag;
> HX_SOCK m_sock;
> #if defined(MISSING_DUALSOCKET)
> CHXSocket* m_pSock4;
>@@ -441,7 +443,7 @@
> virtual ~CHXNetServices(void);
>
> public:
>- virtual void Init(IUnknown* punkContext);
>+ virtual HX_RESULT Init(IUnknown* punkContext);
> virtual void Close(void);
>
> // This is used by CHXSocket::Accept().
>Index: pub/platform/win/hxsocketselector.h
>===================================================================
>RCS file: /cvsroot/common/netio/pub/platform/win/hxsocketselector.h,v
>retrieving revision 1.5
>diff -u -w -r1.5 hxsocketselector.h
>--- pub/platform/win/hxsocketselector.h 30 Jul 2004 22:57:30 -0000 1.5
>+++ pub/platform/win/hxsocketselector.h 5 Aug 2004 01:09:28 -0000
>@@ -81,7 +81,6 @@
> HXSocketSelector();
> virtual ~HXSocketSelector();
> static HX_RESULT CreateSelector(HXSocketSelector*& pSelector);
>- UINT32 TranslateWSAAsyncSelectEvent(UINT16 wsaEvent);
>
> // HXRefCounted
> void FinalRelease();
>
Rest looks good.
Eric
>At 11:44 AM 8/4/2004, Liam Murray wrote:
>
>>Summary:
>>
>>Fixes case where connect failure is not passed on via
>>HX_SOCK_EVENT_CONNECT. I ran into this because there was a byte-ordering
>>issue with a port that occurred when a resolved address was translated to
>>an IHXSockAddr in the client resolver.
>>
>>Details:
>>
>>1) Byte-ordering fix for HXSockUtil::SetAddr().
>>
>>2) Fix in sockimp to better handle connect failure. When a connect fails
>>a write event is passed to OnEvent(). If an error is associated with the
>>event the connect event is forwarded to IHXSocketResponse with the error
>>code and no write event is subsequently sent. The first write event is
>>only sent if the connection is successful. (This is how WSA event
>>semantics works. Please verify this makes sense for us to follow.) Also
>>note that a close event should not be sent unless a connection was
>>successfully established for a socket. (Again, this corresponds with WSA
>>semantics.)
>>
>>3) When the windows client gets an FD_CONNECT along with a failure code
>>from WSAAsyncSelect, it is passed on as a write event to
>>CHXSocket::OnEvent(). Otherwise the connect is passed on upon receiving
>>the first write event.
>>
>>4) Removed
>>
>>HX_ASSERT(m_pScheduler != NULL);
>>
>>from constructors and put assert where scheduler is actually used in call
>>to SetSockOpt(). This prevents asserts in unit tests that use a mini
>>context which doesn't expose the scheduler, but still allows us to assert
>>when the scheduler is actually used.
>>
>>5) Changed CHXSocket::Init() so it returns an HX_RESULT. This is
>>particularly helpful for the client derivations because they
>>initialize/allocate resources during initialization.
>>
>>Branch: HEAD
>>
>>Builds, platforms verified: Windows client
>>
>>Liam
>>
>>
>>
>>
>>_______________________________________________
>>Common-dev mailing list
>>Common-dev at lists.helixcommunity.org
>>http://lists.helixcommunity.org/mailman/listinfo/common-dev
>
>
>_______________________________________________
>Client-dev mailing list
>Client-dev at lists.helixcommunity.org
>http://lists.helixcommunity.org/mailman/listinfo/client-dev
======================================
M. Eric Hyche (ehyche at real.com)
Core Technologies
RealNetworks, Inc.