Right, you can't just put go on the recursive call: nothing waits for the spawned goroutines, so the outer Crawl returns before they finish. Use an auxiliary function
as the goroutine to do the actual fetch. Here's my solution:
package main

import "fmt"

// The cache of fetched URLs to prevent grabbing the same URL twice.
// Maps urls to bodies: fetchedCache["http://golang.org"] = "The Go Programming Language"

type FetchedCache map[string]string

// Return from a fetch. Since we're doing this using goroutines, results
// will be transmitted over a channel, not by return, and the channel will
// contain pointers to FetchReturns.

type FetchReturn struct {
     url  string   // the url used for the fetch
     body string   // the body of the fetch
     urls []string // any urls in the fetch
     err  error    // error, if any
}

// goroutine to fetch an url. This simply calls fetcher.Fetch, takes the
// results and stuffs them (along with the url used to fetch) into a
// FetchReturn, then puts the FetchReturn onto result_channel.
// Note that each fetch_url sticks exactly one pointer onto the shared
// channel.
func fetch_url(url string, fetcher Fetcher, result_channel chan *FetchReturn) {
     fetchReturn := new(FetchReturn)
     fetchReturn.url = url
     fetchReturn.body, fetchReturn.urls, fetchReturn.err = fetcher.Fetch(url)
     result_channel <- fetchReturn
}
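For reference, the Fetcher interface assumed here is the one supplied by the Tour exercise prototype; the stub implementation below is hypothetical, just to show the shape of the interface:

```go
package main

import "fmt"

// Fetcher is the interface from the exercise prototype; the
// fetcher.Fetch call in fetch_url is a call to this method.
type Fetcher interface {
	// Fetch returns the body of URL and
	// a slice of URLs found on that page.
	Fetch(url string) (body string, urls []string, err error)
}

// stubFetcher is a hypothetical one-page implementation, only to
// illustrate the interface.
type stubFetcher struct{}

func (stubFetcher) Fetch(url string) (string, []string, error) {
	if url == "https://golang.org/" {
		return "The Go Programming Language", []string{"https://golang.org/pkg/"}, nil
	}
	return "", nil, fmt.Errorf("not found: %s", url)
}

func main() {
	body, urls, err := stubFetcher{}.Fetch("https://golang.org/")
	fmt.Println(body, urls, err)
}
```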

// Crawl is the front end to fetch_url. Modified from the
// prototype in the problem, since it is no longer recursive
// and doesn't repeat URLs; thus, depth is no longer tracked.
// All this does is just issue a fetch request (go fetch_url) for
// each url it hasn't fetched, pulls the results off the results
// channel and issues new fetch requests for each url it hasn't seen.
// Since each fetch_url thread will stick exactly one FetchReturn
// on result_channel, we keep track of the outstanding threads by
// incrementing a counter for each thread issued, and decrementing
// for each FetchReturn received. When the counter goes to 0, we are
// done and return.

func Crawl(url string, fetcher Fetcher) {

     // Create a cache for the URLs we've already fetched, and
     // a channel for the returns

     fetchedCache := make(FetchedCache)
     return_channel := make(chan *FetchReturn)

     // Issue a fetch request for the incoming URL, marking it in the
     // cache so it won't be requested again

     fetchedCache[url] = ""
     go fetch_url(url, fetcher, return_channel)

     // Main loop. At each iteration, pull a FetchReturn off the
     // channel and note that the number of outstanding threads is
     // reduced by one. If the result was an error, print the
     // error and continue. Otherwise, note we've fetched the
     // result.url in the fetchedCache, so we don't get it again,
     // print the result. Then go through the urls in the return,
     // and for each we haven't seen issue a new fetch request and
     // bump up the number of outstanding threads by 1. When the
     // number of outstanding threads is 0, no more results are coming
     // in and we can return.

     for num_threads := 1; num_threads > 0; {
         result := <-return_channel
         num_threads--
         if result.err != nil {
             fmt.Println(result.err)
             continue
         }
         fetchedCache[result.url] = result.body
         fmt.Printf("found: %s %q: %v\n", result.url, result.body, result.urls)
         for _, u := range result.urls {
             _, has_url := fetchedCache[u]

             if !has_url {
                 // Mark u now, so it isn't requested a second time if
                 // it shows up on another page before this fetch lands.
                 fetchedCache[u] = ""
                 go fetch_url(u, fetcher, return_channel)
                 num_threads++
             }
         }
     }
}
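To see the counter pattern end to end, here is a self-contained sketch: the same loop refactored into a function that returns the cache so the result can be inspected, driven by canned data in the spirit of the exercise's test fetcher. The names crawlAll, fakeFetcher, and fetchReturn are mine, not from the exercise:

```go
package main

import "fmt"

// Fetcher is the interface from the exercise prototype.
type Fetcher interface {
	Fetch(url string) (body string, urls []string, err error)
}

// fakeFetcher serves canned pages, a stand-in for real HTTP fetches.
type fakeResult struct {
	body string
	urls []string
}

type fakeFetcher map[string]*fakeResult

func (f fakeFetcher) Fetch(url string) (string, []string, error) {
	if res, ok := f[url]; ok {
		return res.body, res.urls, nil
	}
	return "", nil, fmt.Errorf("not found: %s", url)
}

var fetcher = fakeFetcher{
	"https://golang.org/": &fakeResult{
		"The Go Programming Language",
		[]string{"https://golang.org/pkg/", "https://golang.org/cmd/"},
	},
	"https://golang.org/pkg/": &fakeResult{
		"Packages",
		[]string{"https://golang.org/", "https://golang.org/pkg/fmt/"},
	},
	"https://golang.org/pkg/fmt/": &fakeResult{
		"Package fmt",
		[]string{"https://golang.org/"},
	},
}

type fetchReturn struct {
	url  string
	body string
	urls []string
	err  error
}

// crawlAll applies the counter-tracking loop from Crawl above:
// one result arrives per issued fetch, so pending counts the
// outstanding goroutines and the loop ends when it reaches zero.
func crawlAll(start string, f Fetcher) map[string]string {
	cache := map[string]string{start: ""}
	results := make(chan *fetchReturn)
	fetch := func(url string) {
		r := &fetchReturn{url: url}
		r.body, r.urls, r.err = f.Fetch(url)
		results <- r
	}
	go fetch(start)
	// continue runs the post statement, so pending-- fires on
	// every iteration, including the error path.
	for pending := 1; pending > 0; pending-- {
		r := <-results
		if r.err != nil {
			continue
		}
		cache[r.url] = r.body
		for _, u := range r.urls {
			if _, seen := cache[u]; !seen {
				cache[u] = "" // mark before issuing, so each url is fetched once
				go fetch(u)
				pending++
			}
		}
	}
	return cache
}

func main() {
	for url, body := range crawlAll("https://golang.org/", fetcher) {
		fmt.Printf("found: %s %q\n", url, body)
	}
}
```

Note that cache is only touched from the crawlAll goroutine; the fetch goroutines communicate solely over the channel, so no mutex is needed.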
On Thursday, October 6, 2011 4:43:32 PM UTC-7, dlin wrote:


func Crawl(url string, depth int, fetcher Fetcher) {
    for _, u := range urls {
        // Crawl(u, depth-1, fetcher)
        go Crawl(u, depth-1, fetcher) // I just think use 'go' keyword here, but can NOT work
You received this message because you are subscribed to the Google Groups "golang-nuts" group.

Discussion Overview
group: golang-nuts
posted: Jun 3, '13 at 10:33p
active: Jun 3, '13 at 10:33p
1 user in discussion: Rick (1 post)


