How to ensure that Hadoop FileSystem connections are managed when I use a pool

Since FileSystem.get is not thread safe, I use FileSystem.newInstance instead. However, calling newInstance every time I need to connect to HDFS may not be a good idea, so I created a FileSystem connection pool.
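A minimal sketch of what such a pool could look like, to make the question concrete. It is written against java.io.Closeable so that it runs without Hadoop on the classpath; `ResourcePool`, `maxIdle`, and the method names are hypothetical and do not come from Hadoop or Hive.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/**
 * Hypothetical pool of closeable resources (illustration only).
 * It keeps at most maxIdle idle instances; extra instances are closed on release.
 */
final class ResourcePool<T extends Closeable> implements Closeable {
    private final BlockingQueue<T> idle;
    private final Supplier<T> factory;

    ResourcePool(int maxIdle, Supplier<T> factory) {
        this.idle = new ArrayBlockingQueue<>(maxIdle);
        this.factory = factory;
    }

    /** Reuse an idle instance if one exists, otherwise create a new one. */
    T borrow() {
        T pooled = idle.poll();
        return pooled != null ? pooled : factory.get();
    }

    /** Hand an instance back; close it if the idle queue is already full. */
    void release(T resource) {
        if (!idle.offer(resource)) {
            closeUnchecked(resource);
        }
    }

    /** Close every idle instance. Borrowed instances stay the caller's responsibility. */
    @Override
    public void close() {
        T resource;
        while ((resource = idle.poll()) != null) {
            closeUnchecked(resource);
        }
    }

    private static void closeUnchecked(Closeable c) {
        try {
            c.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With Hadoop, the factory would presumably wrap FileSystem.newInstance(uri, conf) and rethrow its IOException as an unchecked exception, since FileSystem implements Closeable. Callers would pair every borrow() with a release() in a finally block, and the pool itself would be closed once at shutdown.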

My first question: is this a good approach?

I ask because I checked the Hive source, and it does not use this approach. It just uses the HDFS API directly and never even calls newInstance. Why is that? How does it create a new FileSystem connection?

It also never calls FileSystem.close().

How does it guarantee that the FileSystem gets closed?
