2010/8/11 Željko Marjanović <savethem4ever_at_gmail.com>:
> Is it possible to determine the character encoding the SSH/SFTP server is
> using? I have read the protocol
> specs for SFTP v3 and there is no mention of it, but in v4 default encoding
> is UTF-8. Is it safe to assume
> and use UTF-8 for default encoding?
Short answer: yes, if you are connecting to machines running modern Unices.
The reason the v3 spec didn't mandate UTF-8 for filenames is probably
that some servers can't guarantee it. On Linux, for instance, you
can give a file a name in any encoding you choose; the filesystem
just stores a sequence of bytes [1][2]. When `ls` displays the
contents of a directory, it decides how to decode the filenames based
on the user's locale, typically set through the LANG environment
variable. For instance, on my Ubuntu machine, this is en_GB.UTF-8 so
all filename data is interpreted as
UTF-8. If, by chance, an Arabic filename were encoded in MacArabic
encoding, it would be garbled in the listing.
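
To make the garbling concrete, here is a rough Python sketch (not code from libssh2 or Swish). The filename is made up, and cp1256 (Windows Arabic) is used only as a stand-in for a legacy encoding such as MacArabic; the point is just that the same name produces different bytes on disk, and only one set of them decodes cleanly as UTF-8.

```python
# Hypothetical Arabic filename, written with escapes to keep the source ASCII.
name = "\u0645\u0644\u0641.txt"

# The same name stored under two encodings gives two different byte strings.
# cp1256 is an assumption standing in for any legacy non-UTF-8 encoding.
raw_legacy = name.encode("cp1256")
raw_utf8 = name.encode("utf-8")

# A UTF-8 locale decodes every filename as UTF-8, whatever produced the bytes.
ok = raw_utf8.decode("utf-8")

# The legacy bytes are not valid UTF-8; with errors="replace" each bad byte
# becomes U+FFFD, which is roughly the garbling seen in a listing.
garbled = raw_legacy.decode("utf-8", errors="replace")

print(ok == name)           # the UTF-8 bytes round-trip
print("\ufffd" in garbled)  # the legacy bytes come out damaged
```
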
This explains the problems encountered with a local `ls` but, of
course, a remote listing over SFTP faces all the same issues; the
filenames sent to the client can be a mix of UTF-8 and non-UTF-8. I
have no idea how SFTP v4 expects servers to guarantee they supply
UTF-8 when the server doesn't even know the encoding of its own
filenames!
In practice, however, modern Unices default to UTF-8 so it would be
unusual to encounter a filename with a different encoding. My project
assumes all filenames are UTF-8. A more correct solution would be to
default to UTF-8 but provide the user with an option to specify a
custom encoding.
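
A minimal sketch of that "default to UTF-8, allow an override" idea, again in Python rather than in any real client's code. The function names and the DEFAULT_ENCODING constant are hypothetical. Using the `surrogateescape` error handler is my own assumption about a sensible design: it keeps undecodable bytes recoverable, so the client can still send the exact original bytes back to the server when opening the file.

```python
DEFAULT_ENCODING = "utf-8"  # hypothetical default; the user may override it


def decode_filename(raw: bytes, encoding: str = DEFAULT_ENCODING) -> str:
    """Turn filename bytes from the server into text for display.

    Bytes that are invalid in `encoding` are mapped to lone surrogates
    instead of raising, so nothing is lost.
    """
    return raw.decode(encoding, errors="surrogateescape")


def encode_filename(name: str, encoding: str = DEFAULT_ENCODING) -> bytes:
    """Turn a decoded filename back into the bytes to send to the server."""
    return name.encode(encoding, errors="surrogateescape")


# Bytes for a made-up filename in a legacy encoding (cp1256 as a stand-in).
raw = "\u0645\u0644\u0641.txt".encode("cp1256")

# With the user-specified encoding the name decodes correctly.
print(decode_filename(raw, "cp1256") == "\u0645\u0644\u0641.txt")

# With the UTF-8 default the name looks wrong, but the bytes still round-trip.
print(encode_filename(decode_filename(raw)) == raw)
```

Note that a string containing escaped surrogates cannot be printed or written out as strict UTF-8, so a real client would still need a display-safe form for such names.
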
[1] http://serverfault.com/questions/82821/how-to-tell-the-language-encoding-of-a-filename-on-linux
[2] http://www.linux.com/archive/feed/58689
HTH
Alex
--
Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)
_______________________________________________
libssh2-devel http://cool.haxx.se/cgi-bin/mailman/listinfo/libssh2-devel
Received on 2010-08-12