3 Answers
Contributor with 2041 experience points · more than 4 likes
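This is a fairly easy task for PowerShell, complicated only by the fact that the standard Get-Content cmdlet does not handle very large files well. What I would suggest is to use the .NET StreamReader class to read the file line by line in your PowerShell script, and to use the Add-Content cmdlet to write each line to a file whose name carries an ever-increasing index. Something like this: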
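$upperBound = 50MB   # PowerShell expands the MB suffix to 52428800 bytes
$ext = "log"
$rootName = "log_"

# Read the source line by line so the whole file is never held in memory
$reader = New-Object System.IO.StreamReader("C:\Exceptions.log")
$count = 1
$fileName = "{0}{1}.{2}" -f ($rootName, $count, $ext)
try {
    while ($null -ne ($line = $reader.ReadLine()))
    {
        Add-Content -Path $fileName -Value $line
        # Once the current part reaches the size limit, switch to the next index
        if ((Get-ChildItem -Path $fileName).Length -ge $upperBound)
        {
            ++$count
            $fileName = "{0}{1}.{2}" -f ($rootName, $count, $ext)
        }
    }
}
finally {
    # Release the file handle even if a write fails
    $reader.Close()
}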
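The script above reopens the output file and queries its size on disk once per line, which can be slow on very large inputs. Below is a minimal sketch of the same size-based split that keeps one StreamWriter open per part and tracks the bytes written in memory. The function name `Split-FileBySize` and its parameters are hypothetical, and the sketch assumes UTF-8 text; it is an illustration, not the answer author's method.

```powershell
# Sketch only: Split-FileBySize and its parameter names are hypothetical.
# Assumes the input is UTF-8 text; output parts are written as UTF-8 without a BOM.
function Split-FileBySize {
    param(
        [Parameter(Mandatory=$true)] [string] $Path,
        [Parameter(Mandatory=$true)] [string] $OutputPrefix,  # e.g. "C:\logs\log_"
        [string] $Extension = "log",
        [long]   $UpperBound = 50MB
    )

    $encoding = New-Object System.Text.UTF8Encoding($false)
    $newLineBytes = $encoding.GetByteCount([Environment]::NewLine)
    $reader = New-Object System.IO.StreamReader($Path, $encoding)
    $writer = $null
    $count = 0
    $files = @()
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($null -eq $writer) {
                # Start the next part lazily, so no empty trailing file is created
                $count++
                $fileName = "{0}{1}.{2}" -f $OutputPrefix, $count, $Extension
                $writer = New-Object System.IO.StreamWriter($fileName, $false, $encoding)
                $written = 0L
                $files += $fileName
            }
            $writer.WriteLine($line)
            # Count the encoded bytes of the line plus its line terminator
            $written += $encoding.GetByteCount($line) + $newLineBytes
            if ($written -ge $UpperBound) {
                $writer.Dispose()
                $writer = $null
            }
        }
    }
    finally {
        if ($null -ne $writer) { $writer.Dispose() }
        $reader.Dispose()
    }
    $files   # paths of the parts that were written
}
```

A call might look like `Split-FileBySize -Path "C:\Exceptions.log" -OutputPrefix "C:\log_" -UpperBound 50MB`. Full paths are the safer choice here, because .NET classes resolve relative paths against the process working directory, which can differ from the current PowerShell location.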
Contributor with 1812 experience points · more than 5 likes
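The same idea as the other answers here, but using StreamReader / StreamWriter to split on line boundaries (line by line, rather than trying to read the whole file into memory at once). This is the fastest way I know of to split a large file.
Note: I do very little error checking, so I cannot guarantee that it will run smoothly in your situation. It did in mine (a 1.7 GB TXT file of 4 million lines was split into files of 100,000 lines each in 95 seconds).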
# split test
$sw = New-Object System.Diagnostics.Stopwatch
$sw.Start()
$filename = "C:\Users\Vincent\Desktop\test.txt"
$rootName = "C:\Users\Vincent\Desktop\result"
$ext = "txt"            # no leading dot; the format string below adds it
$linesperFile = 100000  # 100k
$filecount = 1
$reader = $null
$writer = $null
try {
    $reader = [io.file]::OpenText($filename)
    try {
        "Creating file number $filecount"
        $writer = [io.file]::CreateText("{0}{1}.{2}" -f ($rootName, $filecount.ToString("000"), $ext))
        $filecount++
        $linecount = 0
        while ($reader.EndOfStream -ne $true) {
            "Reading $linesperFile"
            # Copy lines until the current part is full or the input ends
            while (($linecount -lt $linesperFile) -and ($reader.EndOfStream -ne $true)) {
                $writer.WriteLine($reader.ReadLine())
                $linecount++
            }
            # More input remains: close the full part and start the next one
            if ($reader.EndOfStream -ne $true) {
                "Closing file"
                $writer.Dispose()
                "Creating file number $filecount"
                $writer = [io.file]::CreateText("{0}{1}.{2}" -f ($rootName, $filecount.ToString("000"), $ext))
                $filecount++
                $linecount = 0
            }
        }
    } finally {
        # Guard against a failure before the first writer was created
        if ($null -ne $writer) { $writer.Dispose() }
    }
} finally {
    if ($null -ne $reader) { $reader.Dispose() }
}
$sw.Stop()
Write-Host "Split complete in $($sw.Elapsed.TotalSeconds) seconds"
Output from splitting the 1.7 GB file:
...
Creating file number 45
Reading 100000
Closing file
Creating file number 46
Reading 100000
Closing file
Creating file number 47
Reading 100000
Closing file
Creating file number 48
Reading 100000
Split complete in 95.6308289 seconds
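Since the script does little error checking, one cheap sanity check after a run is to compare the number of lines in the source file with the total across all parts. The following is a rough sketch of such a check; `Get-LineCount` and `Test-SplitLineCount` are hypothetical helper names, and the paths in the trailing comment simply mirror the ones used in the answer.

```powershell
# Sketch only: Get-LineCount and Test-SplitLineCount are hypothetical helpers.
function Get-LineCount {
    param([Parameter(Mandatory=$true)] [string] $Path)
    $reader = [System.IO.File]::OpenText($Path)
    try {
        $n = 0L
        # Stream through the file so large inputs are never loaded whole
        while ($null -ne $reader.ReadLine()) { $n++ }
        $n
    }
    finally {
        $reader.Dispose()
    }
}

function Test-SplitLineCount {
    param(
        [Parameter(Mandatory=$true)] [string] $Source,
        [Parameter(Mandatory=$true)] [string] $PartsFolder,
        [string] $Filter = "result*.txt"
    )
    $sourceLines = Get-LineCount -Path $Source
    $partLines = 0L
    foreach ($part in (Get-ChildItem -Path $PartsFolder -Filter $Filter)) {
        $partLines += Get-LineCount -Path $part.FullName
    }
    # True when the parts together hold exactly as many lines as the source
    $sourceLines -eq $partLines
}

# Example call, using the paths from the answer:
# Test-SplitLineCount -Source "C:\Users\Vincent\Desktop\test.txt" -PartsFolder "C:\Users\Vincent\Desktop"
```

Matching line counts do not prove the content is identical, but a mismatch is a quick signal that a part is missing or truncated. The filter should match only the generated parts, so the source file itself must not fit the pattern.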