# 访问并读取页面数据

在下边这个程序中，数组中的url都将被访问：会发送一个简单的`http.Head()`请求查看返回值；它的声明如下：`func Head(url string) (r *Response, err error)`

返回状态码会被打印出来。

示例 15.7 [poll\_url.go](https://github.com/yangchuansheng/the-way-to-go_ZH_CN/tree/f30ab7d8c58f85840a0afb548024b93642b518d5/eBook/examples/chapter_15/poll_url.go)：

```go
package main

import (
    "fmt"
    "net/http"
)

var urls = []string{
    "http://www.google.com/",
    "http://golang.org/",
    "http://blog.golang.org/",
}

func main() {
    // Execute an HTTP HEAD request for all url's
    // and returns the HTTP status string or an error string.
    for _, url := range urls {
        resp, err := http.Head(url)
        if err != nil {
            fmt.Println("Error:", url, err)
        }
        fmt.Println(url, ": ", resp.Status)
    }
}
```

输出为：

```
http://www.google.com/ : 302 Found
http://golang.org/ : 200 OK
http://blog.golang.org/ : 200 OK
```

***译者注*** 由于国内的网络环境现状，很有可能见到如下超时错误提示：

```
Error: http://www.google.com/ Head http://www.google.com/: dial tcp 216.58.221.100:80: connectex: A connection attempt failed because the connected pa
rty did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
```

在下边的程序中我们使用`http.Get()`获取网页内容； `Get`的返回值`res`中的`Body`属性包含了网页内容，然后我们用`ioutil.ReadAll`把它读出来：

示例 15.8 [http\_fetch.go](https://github.com/yangchuansheng/the-way-to-go_ZH_CN/tree/f30ab7d8c58f85840a0afb548024b93642b518d5/eBook/examples/chapter_15/http_fetch.go)：

```go
package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
)

func main() {
    res, err := http.Get("http://www.google.com")
    checkError(err)
    data, err := ioutil.ReadAll(res.Body)
    checkError(err)
    fmt.Printf("Got: %q", string(data))
}

func checkError(err error) {
    if err != nil {
        log.Fatalf("Get : %v", err)
    }
}
```

当访问不存在的网站时，这里有一个`CheckError`输出错误的例子：

```
2011/09/30 11:24:15 Get: Get http://www.google.bex: dial tcp www.google.bex:80:GetHostByName: No such host is known.
```

***译者注*** 和上一个例子相似，你可以把google.com更换为一个国内可以顺畅访问的网址进行测试

在下边的程序中，我们获取一个twitter用户的状态，通过`xml`包将这个状态解析成为一个结构：

示例 15.9 [twitter\_status.go](https://github.com/yangchuansheng/the-way-to-go_ZH_CN/tree/f30ab7d8c58f85840a0afb548024b93642b518d5/eBook/examples/chapter_15/twitter_status.go)

```go
package main

import (
    "encoding/xml"
    "fmt"
    "net/http"
)

/*这个结构会保存解析后的返回数据。
他们会形成有层级的XML，可以忽略一些无用的数据*/
type Status struct {
    Text string
}

type User struct {
    XMLName xml.Name
    Status  Status
}

func main() {
    // 发起请求查询推特Goodland用户的状态
    response, _ := http.Get("http://twitter.com/users/Googland.xml")
    // 初始化XML返回值的结构
    user := User{xml.Name{"", "user"}, Status{""}}
    // 将XML解析为我们的结构
    xml.Unmarshal(response.Body, &user)
    fmt.Printf("status: %s", user.Status.Text)
}
```

输出：

```
status: Robot cars invade California, on orders from Google: Google has been testing self-driving cars ... http://bit.ly/cbtpUN http://retwt.me/97p<exit code="0" msg="process exited normally"/>
```

**译者注** 和上边的示例相似，你可能无法获取到xml数据，另外由于go版本的更新，`xml.Unmarshal`函数的第一个参数需是\[]byte类型，而无法传入`Body`。

我们会在[章节15.4](https://ryanyang.gitbook.io/the-way-to-go-zh-cn/di-san-bu-fen-go-gao-ji-bian-cheng/15.0/15.4)中用到`http`包中的其他重要的函数：

* `http.Redirect(w ResponseWriter, r *Request, url string, code int)`：这个函数会让浏览器重定向到url（是请求的url的相对路径）以及状态码。
* `http.NotFound(w ResponseWriter, r *Request)`：这个函数将返回网页没有找到，HTTP 404错误。
* `http.Error(w ResponseWriter, error string, code int)`：这个函数返回特定的错误信息和HTTP代码。
* 另`http.Request`对象的一个重要属性`req`：`req.Method`，这是一个包含`GET`或`POST`字符串，用来描述网页是以何种方式被请求的。

go为所有的HTTP状态码定义了常量，比如：

```
http.StatusContinue        = 100
http.StatusOK            = 200
http.StatusFound        = 302
http.StatusBadRequest        = 400
http.StatusUnauthorized        = 401
http.StatusForbidden        = 403
http.StatusNotFound        = 404
http.StatusInternalServerError    = 500
```

你可以使用`w.header().Set("Content-Type", "../..")`设置头信息

比如在网页应用发送html字符串的时候，在输出之前执行`w.Header().Set("Content-Type", "text/html")`。

练习 15.4：扩展 http\_fetch.go 使之可以从控制台读取url，使用[章节12.1](https://ryanyang.gitbook.io/the-way-to-go-zh-cn/di-san-bu-fen-go-gao-ji-bian-cheng/12.0/12.1)学到的接收控制台输入的方法 （[http\_fetch2.go](https://github.com/yangchuansheng/the-way-to-go_ZH_CN/tree/f30ab7d8c58f85840a0afb548024b93642b518d5/eBook/examples/chapter_15/http_fetch2.go)）

练习 15.5：获取json格式的推特状态，就像示例 15.9（twitter\_status\_json.go）

## 链接

* [目录](https://github.com/yangchuansheng/the-way-to-go_ZH_CN/tree/f30ab7d8c58f85840a0afb548024b93642b518d5/eBook/directory.md)
* 上一章：[一个简单的网页服务器](https://ryanyang.gitbook.io/the-way-to-go-zh-cn/di-san-bu-fen-go-gao-ji-bian-cheng/15.0/15.2)
* 下一节：[写一个简单的网页应用](https://ryanyang.gitbook.io/the-way-to-go-zh-cn/di-san-bu-fen-go-gao-ji-bian-cheng/15.0/15.4)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://ryanyang.gitbook.io/the-way-to-go-zh-cn/di-san-bu-fen-go-gao-ji-bian-cheng/15.0/15.3.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
