2015-10-26

I encountered an issue with the Mule ESB FTP transport: when polling, the thread running the client hangs indefinitely without throwing an error, which stops FTP polling completely. Mule uses the Apache Commons Net FTPClient.

Looking further into the code, I think it is caused by the socket timeout (SO_TIMEOUT) of the FTPClient not being set, which sometimes causes an infinite hang when reading lines from the FTPClient's socket.

We can clearly see the problem in these stack traces, retrieved with jstack when the problem occurred. The __getReply() function seems to be the most direct link to the problem.

This one hangs on the connect() call when creating a new FTPClient:

And the other hangs on the pasv() call when using listFiles():

I think the problem is caused by the use of the default FTPClient constructor (extending SocketClient) in Mule's default FtpConnectionFactory.

Note that the setConnectTimeout() value seems to be used only when calling socket.connect(), and is ignored by other operations on the same socket:
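The scope of a connect timeout can be checked with plain java.net sockets, independently of Mule and Commons Net. The following is only a minimal sketch: the class name `ConnectTimeoutScope`, the method `readTimeoutAfterConnect` and the 5000 ms value are made up for illustration, and a local ServerSocket stands in for the FTP server.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ConnectTimeoutScope {

    /**
     * Connects to a local listener using a connect timeout, then returns the
     * read timeout (SO_TIMEOUT) the socket ends up with. Hypothetical helper.
     */
    public static int readTimeoutAfterConnect(int connectTimeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            // The timeout passed here only bounds the connection attempt itself.
            socket.connect(
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                    connectTimeoutMs);
            // 0 means "block forever" on subsequent reads.
            return socket.getSoTimeout();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("SO_TIMEOUT after connect: " + readTimeoutAfterConnect(5000));
    }
}
```

The connect timeout does not carry over to SO_TIMEOUT, so any later read on that socket can still block without limit unless setSoTimeout() is called separately.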

It uses the FTPClient() constructor, which itself uses SocketClient with a timeout of 0, applied when the socket is created.

And then we call connect(), which calls _connectAction_().

In SocketClient:

In FTP, a new Reader is instantiated with our everlasting socket:

Then, when calling the __getReply() function, we use this Reader-with-everlasting-socket:

Sorry for the long post, but I think this required a proper explanation. A solution may be to call setSoTimeout() just after connect(), to define a socket timeout.
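To illustrate what setting a socket timeout right after connecting changes, here is a small sketch using only java.net, not the actual Commons Net or Mule code. The class `ReadTimeoutSketch`, the method `readTimesOut` and the timeout values are hypothetical; a local ServerSocket that never writes anything plays the role of an FTP server that stops replying.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class ReadTimeoutSketch {

    /**
     * Reads a "reply line" from a server that never answers.
     * Returns true if the read fails with a timeout instead of blocking.
     * Do not pass 0: that means "no timeout" and the call would hang.
     */
    public static boolean readTimesOut(int soTimeoutMs) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            socket.connect(
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                    5000);
            // Set the read timeout once the socket is connected.
            socket.setSoTimeout(soTimeoutMs);
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            try {
                reader.readLine(); // the server is silent, so this can only time out
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the thread gets an exception instead of hanging
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a non-zero SO_TIMEOUT the blocked read surfaces as a SocketTimeoutException, which the polling code could then handle or log, instead of leaving the thread stuck.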

Having a default timeout does not seem to be an acceptable solution, as users may have different needs and no default would suit every case. https://issues.apache.org/jira/browse/NET-35

Finally, this raises two questions:

It seems like a bug to me, as it completely stops FTP polling without reporting an error. What do you think?

What could be an easy way to avoid such a situation? Calling setSoTimeout() with a custom FtpConnectionFactory? Am I missing a configuration or parameter somewhere?

Thanks in advance.

EDIT: I am using Mule CE Standalone 3.5.0, which seems to use Apache Commons Net 2.0. Looking at the code, Mule CE Standalone 3.7 with Commons Net 2.2 does not seem to behave differently. Here are the source files involved:

https://github.com/mulesoft/mule/blob/mule-3.5.x/transports/ftp/src/main/java/org/mule/transport/ftp/FtpConnectionFactory.java

http://grepcode.com/file/repo1.maven.org/maven2/commons-net/commons-net/2.0/org/apache/commons/net/SocketClient.java

http://grepcode.com/file/repo1.maven.org/maven2/commons-net/commons-net/2.0/org/apache/commons/net/ftp/FTP.java

http://grepcode.com/file/repo1.maven.org/maven2/commons-net/commons-net/2.0/org/apache/commons/net/ftp/FTPClient.java
