
How can I track the progress of a multipart upload to S3 using Go?

沧海一幻觉 2021-12-06 19:53:20

I'm trying to use the PutPart method provided by Mitchell Hashimoto's fork of goamz. Unfortunately, every time I get a part back and check its size, it seems to think it is the size of the entire file rather than just one chunk.

For example, when uploading a 15 MB file, I expect to see:

Uploading...
Processing 1 part of 3 and uploaded 5242880.0 bytes.
Processing 2 part of 3 and uploaded 5242880.0 bytes.
Processing 3 part of 3 and uploaded 5242880.0 bytes.

Instead, I see:

Uploading...
Processing 1 part of 3 and uploaded 15728640 bytes.
Processing 2 part of 3 and uploaded 15728640 bytes.
Processing 3 part of 3 and uploaded 15728640 bytes.

Is this due to a problem with file.Read(partBuffer)? Any help would be much appreciated. I'm using Go 1.5.1 on a Mac.

package main

import (
    "bufio"
    "fmt"
    "math"
    "net/http"
    "os"

    "github.com/mitchellh/goamz/aws"
    "github.com/mitchellh/goamz/s3"
)

func check(err error) {
    if err != nil {
        panic(err)
    }
}

func main() {
    fmt.Println("Test")
    auth, err := aws.GetAuth("XXXXX", "XXXXXXXXXX")
    check(err)
    client := s3.New(auth, aws.USWest2)
    b := s3.Bucket{
        S3:   client,
        Name: "some-bucket",
    }

    fileToBeUploaded := "testfile"
    file, err := os.Open(fileToBeUploaded)
    check(err)
    defer file.Close()

    fileInfo, _ := file.Stat()
    fileSize := fileInfo.Size()
    bytes := make([]byte, fileSize)

    // read into buffer
    buffer := bufio.NewReader(file)
    _, err = buffer.Read(bytes)
    check(err)
    filetype := http.DetectContentType(bytes)

    // set up for multipart upload
    multi, err := b.InitMulti("/"+fileToBeUploaded, filetype, s3.ACL("bucket-owner-read"))
    check(err)

    const fileChunk = 5242880 // 5MB
    totalPartsNum := uint64(math.Ceil(float64(fileSize) / float64(fileChunk)))
    parts := []s3.Part{}

    fmt.Println("Uploading...")
    for i := uint64(1); i < totalPartsNum; i++ {
        partSize := int(math.Min(fileChunk, float64(fileSize-int64(i*fileChunk))))
        partBuffer := make([]byte, partSize)
        _, err := file.Read(partBuffer)
        check(err)

        part, err := multi.PutPart(int(i), file) // write to S3 bucket part by part
        check(err)


3 Answers

繁花不似锦


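For this to work, a few points in your code have to change in how the file part is passed to multi.PutPart(n, strings.NewReader("")). The code below works.

Remember that PutPart sends one part of the multipart upload and reads all of the content from r. Every part except the last must be at least 5 MB in size. This is described in the goamz documentation.

The points I changed to make it work:

Here I create our HeaderPart from all the bytes of the file:

HeaderPart := strings.NewReader(string(bytes))

With io.ReadFull(HeaderPart, partBuffer), I fill the entire buffer created by make([]byte, partSize). Each call continues from where the previous one stopped, so every buffer holds a different part of the file.

When we call multi.PutPart(int(i)+1, strings.NewReader(string(partBuffer))), we have to add 1 because part numbers do not start at 0. Instead of passing the target file, we pass only the content of the current part, using strings.NewReader.

Here is your code, which now works correctly.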

package main

import (
    "bufio"
    "fmt"
    "io"
    "math"
    "net/http"
    "os"
    "strings"

    "launchpad.net/goamz/aws"
    "launchpad.net/goamz/s3"
)

// check panics on any non-nil error.
func check(err error) {
    if err != nil {
        panic(err)
    }
}

func main() {
    fmt.Println("Test")
    auth := aws.Auth{
        AccessKey: "xxxxxxxxxxx", // change this to yours
        SecretKey: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    }
    client := s3.New(auth, aws.USWest2)
    b := s3.Bucket{
        S3:   client,
        Name: "some-bucket",
    }

    fileToBeUploaded := "testfile"
    file, err := os.Open(fileToBeUploaded)
    check(err)
    defer file.Close()

    fileInfo, err := file.Stat()
    check(err)
    fileSize := fileInfo.Size()
    bytes := make([]byte, fileSize)

    // read the whole file into the buffer; io.ReadFull keeps reading
    // until the buffer is full, unlike a single Read call
    buffer := bufio.NewReader(file)
    _, err = io.ReadFull(buffer, bytes)
    check(err)
    filetype := http.DetectContentType(bytes)

    // set up for multipart upload
    multi, err := b.InitMulti("/"+fileToBeUploaded, filetype, s3.ACL("bucket-owner-read"))
    check(err)

    const fileChunk = 5242880 // 5MB
    totalPartsNum := uint64(math.Ceil(float64(fileSize) / float64(fileChunk)))
    parts := []s3.Part{}

    fmt.Println("Uploading...")

    // reader over the in-memory copy of the file; each ReadFull below
    // consumes the next chunk from it
    HeaderPart := strings.NewReader(string(bytes))

    for i := uint64(0); i < totalPartsNum; i++ {
        // the last part may be smaller than fileChunk
        partSize := int(math.Min(fileChunk, float64(fileSize-int64(i*fileChunk))))
        partBuffer := make([]byte, partSize)

        n, errx := io.ReadFull(HeaderPart, partBuffer)
        check(errx)

        // part numbers start at 1; pass only this chunk, not the whole file
        part, err := multi.PutPart(int(i)+1, strings.NewReader(string(partBuffer))) // write to S3 bucket part by part
        check(err)

        fmt.Printf("Processing %d part of %d and uploaded %d bytes.\n", int(i)+1, int(totalPartsNum), n)
        parts = append(parts, part)
    }

    err = multi.Complete(parts)
    check(err)

    fmt.Println("\n\nPutPart upload completed")
}
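The part-size arithmetic in the loop above goes through float64 and math.Min. As a rough sketch of the same calculation with integer arithmetic only, the helper below returns the size of each part for a given file size and chunk size. The name partSizes is hypothetical and is not part of goamz; the block needs no S3 access, so it can be run on its own to check the numbers.

```go
package main

import "fmt"

// partSizes is a hypothetical helper (not part of goamz). It splits
// fileSize bytes into chunks of at most chunkSize bytes and returns the
// size of each chunk in order. Only the last chunk may be smaller.
func partSizes(fileSize, chunkSize int64) []int64 {
	if fileSize <= 0 || chunkSize <= 0 {
		return nil
	}
	// Ceiling division without floating point.
	n := (fileSize + chunkSize - 1) / chunkSize
	sizes := make([]int64, 0, n)
	for offset := int64(0); offset < fileSize; offset += chunkSize {
		size := chunkSize
		if remaining := fileSize - offset; remaining < size {
			size = remaining
		}
		sizes = append(sizes, size)
	}
	return sizes
}

func main() {
	const chunk = 5242880 // 5 MB, the chunk size used in the thread
	for i, size := range partSizes(15728640, chunk) {
		fmt.Printf("part %d: %d bytes\n", i+1, size)
	}
}
```

For the 15 MB file from the question this yields three parts of 5242880 bytes each, which matches the output the asker expected.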

2021-12-06

Qyouu


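The data you read into partBuffer is never used. You pass file to multi.PutPart, which reads the entire contents of file, seeking back to the beginning as necessary and throwing away all the work you have done.

The smallest change to your code is to pass bytes.NewReader(partBuffer) to PutPart instead of file. bytes.Reader implements the io.ReadSeeker interface that PutPart requires, and it reports its size as the size of partBuffer.

An alternative is to use the io.SectionReader type. Instead of reading the data into a buffer yourself, you create a series of SectionReaders over file with the sizes and offsets you want and pass them to PutPart; they forward the reads to the underlying file. This should work just as well and considerably reduces the code you have to write (and error-check). It also avoids needlessly buffering a whole chunk of data in RAM.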

2021-12-06

噜噜哒


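The problem here may be caused by not reading the file completely. Read can be a little subtle:

Read reads up to len(p) bytes into p. It returns the number of bytes read (0 <= n <= len(p)) and any error encountered. Even if Read returns n < len(p), it may use all of p as scratch space during the call. If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

So you should probably use io.ReadFull or (better) io.CopyN.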

That said, I think you should try switching to the official AWS Go package. It has a handy uploader that takes care of all of this for you:

package main

import (
    "log"
    "os"

    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    bucketName := "test-bucket"
    keyName := "test-key"

    file, err := os.Open("example")
    if err != nil {
        log.Fatalln(err)
    }
    defer file.Close()

    // session.New is deprecated; NewSession reports configuration
    // errors, and session.Must panics if one occurs.
    sess := session.Must(session.NewSession())
    uploader := s3manager.NewUploader(sess)

    // Perform an upload.
    result, err := uploader.Upload(&s3manager.UploadInput{
        Bucket: &bucketName,
        Key:    &keyName,
        Body:   file,
    })
    if err != nil {
        log.Fatalln(err)
    }
    log.Println(result)
}

You can find more documentation on godoc.org.

2021-12-06
