Go doesn't release memory after http.Get

I load web pages using a simple worker pool, reading URLs dynamically from a file. But this small program slowly allocates as much memory as my server has, until the OOM killer stops it. It looks like resp.Body.Close() does not free the memory for the body text (memory usage grows roughly as number of loaded pages × page size). How do I get Go to free the memory allocated for the body HTML text?

package main

import (
	"bufio"
	"fmt"
	"io/ioutil"
	"net/http"
	"os"
	"strings"
	"sync"
)

func worker(linkChan chan string, wg *sync.WaitGroup) {
	defer wg.Done()
	for url := range linkChan {
		// Getting body text
		resp, err := http.Get(url)
		if err != nil {
			fmt.Printf("Fail url: %s\n", url)
			continue
		}
		body, err := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Printf("Fail url: %s\n", url)
			continue
		}

		// Test page body
		has_rem_code := strings.Contains(string(body), "googleadservices.com/pagead/conversion.js")
		fmt.Printf("Done url: %s\t%t\n", url, has_rem_code)
	}
}

func main() {
	// Creating worker pool
	lCh := make(chan string, 30)
	wg := new(sync.WaitGroup)
	for i := 0; i < 30; i++ {
		wg.Add(1)
		go worker(lCh, wg)
	}

	// Opening file with urls
	file, err := os.Open("./tmp/new.csv")
	if err != nil {
		panic(err)
	}
	defer file.Close()
	reader := bufio.NewReader(file)

	// Processing urls
	for href, _, err := reader.ReadLine(); err == nil; href, _, err = reader.ReadLine() {
		lCh <- string(href)
	}

	close(lCh)
	wg.Wait()
}

Here are some results from the pprof tool:

     flat  flat%   sum%      cum   cum%
  34.63MB 29.39% 29.39%  34.63MB 29.39%  bufio.NewReaderSize
     30MB 25.46% 54.84%     30MB 25.46%  net/http.(*Transport).getIdleConnCh
  23.09MB 19.59% 74.44%  23.09MB 19.59%  bufio.NewWriter
  11.63MB  9.87% 84.30%  11.63MB  9.87%  net/http.(*Transport).putIdleConn
   6.50MB  5.52% 89.82%   6.50MB  5.52%  main.main
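(For reference, one way to collect a heap profile like this is to expose the net/http/pprof endpoints while the crawler runs. This is only a minimal sketch and is not part of the original program:)

    package main

    import (
    	"log"
    	"net/http"
    	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
    )

    func main() {
    	// Serve the profiling endpoints on a side port; the worker pool from
    	// the question would be started here as well.
    	log.Println(http.ListenAndServe("localhost:6060", nil))
    }

The heap profile can then be inspected with: go tool pprof http://localhost:6060/debug/pprof/heap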

This question seems related to an earlier issue, but that one was fixed two years ago.

1 answer

Found the answer in this thread on golang-nuts. http.Transport caches connections for later reuse in case another request goes to the same host, which in my case (hundreds of thousands of different hosts) leads to memory bloat. Disabling KeepAlives completely solves the problem.

Working code:

func worker(linkChan chan string, wg *sync.WaitGroup) {
	defer wg.Done()

	var transport http.RoundTripper = &http.Transport{
		DisableKeepAlives: true,
	}
	c := &http.Client{Transport: transport}

	for url := range linkChan {
		// Getting body text
		resp, err := c.Get(url)
		if err != nil {
			fmt.Printf("Fail url: %s\n", url)
			continue
		}
		body, err := ioutil.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			fmt.Printf("Fail url: %s\n", url)
			continue
		}

		// Test page body
		has_rem_code := strings.Contains(string(body), "googleadservices.com/pagead/conversion.js")
		fmt.Printf("Done url: %s\t%t\n", url, has_rem_code)
	}
}
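As a side note (not part of the original answer, and newSharedClient is just an illustrative helper): if connection reuse is still wanted, the idle-connection pool can be bounded instead of disabling keep-alives entirely. The MaxIdleConns and IdleConnTimeout fields were only added in Go 1.7, so this sketch assumes a reasonably recent Go version:

    package main

    import (
    	"net/http"
    	"time"
    )

    // newSharedClient builds a single client to be shared by every worker,
    // bounding the idle-connection pool instead of disabling keep-alives.
    func newSharedClient() *http.Client {
    	return &http.Client{
    		Transport: &http.Transport{
    			MaxIdleConns:        100,              // cap on cached connections across all hosts
    			MaxIdleConnsPerHost: 1,                // each host is fetched only once here
    			IdleConnTimeout:     30 * time.Second, // release connections that sit unused
    		},
    	}
    }

    func main() {
    	c := newSharedClient()
    	_ = c // workers would receive this client instead of building a transport each
    }

Sharing one client across all workers also avoids building 30 separate transports, and Transport.CloseIdleConnections() can be called to drop cached connections explicitly if needed.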
