Subject: Re: Packet drop on full socket problem

From: Daniel Stenberg
Date: Wed, 6 Oct 2010 23:13:38 +0200 (CEST)

On Fri, 17 Sep 2010, Thomas Rauscher wrote:

(First, I'm sorry it's taken me this long to respond...)

> I think I've found a problem that occurs when writing to the send socket
> returns -1 (EAGAIN). Additional preconditions to trigger the problem are
> writing in larger chunks than the advertized window size, e.g. 128k writes
> vs. 12k window size.

I don't quite understand your problem. I'll add my questions and thoughts
inline below.

> The remote side is a dropbear SSH server which seems to use 12k window size
> increments. This means that packets need to be split very often. If
> additionally the socket buffer gets full, the saved packet is never sent.
> A workaround is to use smaller writes (1k), but this only hides the problem.
> Details:
> 1) The application calls _libssh2_channel_write(..., 128*1024);

First, _libssh2_channel_write() internally ignores everything beyond the first
32768 bytes of the buffer. It will only try to send those first 32768 bytes in
each invocation.

The function will/should then make sure that it doesn't try to send any more
data than the remote has a window for. In this case, it should further
decrease the amount of data this function will attempt to send.
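
As a rough sketch of the two limits just described (this is not the actual
libssh2 code; the helper name clamp_write_len and the constant MAX_WRITE_CHUNK
are made up for illustration, and it assumes the remote window is simply known
as a byte count):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: the 32768-byte per-call cap mentioned above. */
#define MAX_WRITE_CHUNK 32768u

/* Hypothetical helper: how many bytes a single call would try to send,
 * given what the caller passed in and what the remote window allows. */
static size_t clamp_write_len(size_t buflen, size_t remote_window)
{
    size_t len = buflen;

    if (len > MAX_WRITE_CHUNK)
        len = MAX_WRITE_CHUNK;   /* ignore everything beyond the cap */
    if (len > remote_window)
        len = remote_window;     /* never exceed the remote's window */
    return len;
}
```

With a 128K buffer and a 12K window, this sketch would attempt 12288 bytes in
the first call, not 131072 and not 32768.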

> * In _libssh2_transport_write()
> _libssh2_send returns -1 (EAGAIN) and the current packet is saved to
> p->odata, p->olen ...

You mean that it returns EAGAIN immediately or after having sent the first 12K
of data? I assume you mean that it first sends some data and then when it
loops it gets EAGAIN back.

> * _libssh2_transport_write() returns LIBSSH2_ERROR_EAGAIN to
> _libssh2_channel_write() which executes
> if(wrote) {
> _libssh2_transport_drain(session);
> goto _channel_write_done;
> }

... as it would only execute that if 'wrote' actually wasn't zero.

> _libssh2_transport_drain() frees p->outbuf and sets it to NULL.
> * _libssh2_transport_write then returns "wrote" (12k) to the application.

Right, as it did in fact successfully send away 12K.

> 2) Application calls _libssh2_channel_write(..., 128*1024) again.

Right, but that buffer should now be pointing 12K further into the data as 12K
was in fact sent in the previous call.
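
To illustrate what I mean by the application advancing its buffer, here is a
minimal sketch of a caller loop around a partial-write API. Nothing in it is
libssh2 API: write_fn, write_all, SKETCH_EAGAIN, fake_sink and fake_write are
all hypothetical names, and the fake transport only exists so the loop can be
tried out without a network.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Placeholder "try again" code; the value is NOT taken from libssh2. */
#define SKETCH_EAGAIN (-1000)

/* Hypothetical write callback: returns bytes accepted, SKETCH_EAGAIN
 * when nothing could be sent right now, or another negative on error. */
typedef ssize_t (*write_fn)(void *ctx, const char *buf, size_t len);

/* Push one buffer through a partial-write API. */
static ssize_t write_all(write_fn do_write, void *ctx,
                         const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        ssize_t rc = do_write(ctx, buf + sent, len - sent);

        if (rc == SKETCH_EAGAIN)
            continue;            /* real code would wait on the socket here */
        if (rc < 0)
            return rc;           /* hard error, give up */
        sent += (size_t)rc;      /* advance past what was actually sent */
    }
    return (ssize_t)sent;
}

/* Hypothetical fake transport: accepts at most `window` bytes per call
 * and reports SKETCH_EAGAIN on every other call. */
struct fake_sink {
    size_t window;
    size_t total;
    int    calls;
};

static ssize_t fake_write(void *ctx, const char *buf, size_t len)
{
    struct fake_sink *s = ctx;

    (void)buf;
    if (s->calls++ % 2 == 1)
        return SKETCH_EAGAIN;
    if (len > s->window)
        len = s->window;
    s->total += len;
    return (ssize_t)len;
}
```

The point is only the `buf + sent` / `len - sent` pair: after a call reports
12K written, the next call must start 12K further in, never at the original
pointer again.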

> _libssh2_transport_write() now calls send_existing() first which
> immediately returns because p->outbuf is NULL.
> if (!p->outbuf) {
> *ret = 0;
> }

Right, there's nothing saved there. What do you think it should have saved?

> * This results in not sending the saved packet, but sending the next packet.
> The SSH server then bails out and terminates the connection (saying "bad
> packet size").

... as you can see I didn't follow how it ended up like this! I'll get myself
a dropbear install and see if I can repeat this. Is uploading data with a 128K
buffer enough to trigger it? Like with the sftp_write_nonblock.c example?

Received on 2010-10-06