Since FileSystem.get is not thread safe, I use FileSystem.newInstance instead. However, calling newInstance every time I need to connect to HDFS may not be a good idea, so I created a FileSystem connection pool.
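A minimal sketch of such a pool, to make the idea concrete. The class name `SimplePool` and its methods `borrow`, `giveBack` and `idleCount` are hypothetical, and the type parameter `T` stands in for `FileSystem` so that the sketch does not depend on Hadoop:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Fixed-size pool sketch; T stands in for org.apache.hadoop.fs.FileSystem. */
final class SimplePool<T> {
    // Instances that are currently not borrowed by any thread.
    private final BlockingQueue<T> idle;

    /** Eagerly creates `size` instances using the given factory. */
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Blocks until an instance is free, then hands it to the caller. */
    T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the failure unchecked.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Puts a previously borrowed instance back into the pool. */
    void giveBack(T instance) {
        idle.offer(instance);
    }

    /** Number of instances currently available. */
    int idleCount() {
        return idle.size();
    }
}
```

With Hadoop on the classpath, the factory would presumably be a lambda that calls `FileSystem.newInstance(conf)` and wraps its `IOException`, and each borrowed instance would be returned with `giveBack` in a `finally` block rather than closed.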
My first question: is this a good approach?
I ask because I checked the Hive source, and it does not use this approach. It just uses the HDFS API directly and never calls newInstance. Why is that? How does it create a new FileSystem connection?
It also never calls FileSystem.close().
How does it guarantee that the FileSystem gets closed?