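I have used wkhtmltopdf in the past.
My advice is to stop using it right away: wkhtmltopdf is built on a very old browser engine, so you will probably run into problems sooner or later. On top of that, wkhtmltopdf does not work reliably (and its performance is poor).
There is a much better option.
That option is to drive Chrome through the DevTools remote debugging protocol:
https://chromedevtools.github.io/devtools-protocol/
Basically, you run Chrome like this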
chrome.exe --remote-debugging-port=9222
optionally with
$"--user-data-dir=\"{directoryInfo.FullName}\"";
and
"--headless --disable-gpu";
This is how I start the Chrome process on the server (C# code)
public IChromeProcess Create(int port, bool headless)
{
    // Give every Chrome instance its own throw-away profile directory.
    string path = System.IO.Path.GetRandomFileName();
    System.IO.DirectoryInfo directoryInfo = System.IO.Directory.CreateDirectory(
        System.IO.Path.Combine(System.IO.Path.GetTempPath(), path)
    );

    string remoteDebuggingArg = $"--remote-debugging-port={port}";
    string userDirectoryArg = $"--user-data-dir=\"{directoryInfo.FullName}\"";
    const string headlessArg = "--headless --disable-gpu";

    // https://peter.sh/experiments/chromium-command-line-switches/
    System.Collections.Generic.List<string> chromeProcessArgs =
        new System.Collections.Generic.List<string>
        {
            remoteDebuggingArg,
            userDirectoryArg,
            // Indicates that the browser is in "browse without sign-in" (Guest session) mode.
            // Should completely disable extensions, sync and bookmarks.
            "--bwsi",
            "--no-first-run"
        };

    // Optional (disabled): route traffic through a proxy.
    // The protocol can be socks5, http or https, for example:
    // chromeProcessArgs.Add("--proxy-server=\"https://68.183.233.181:3128\"");

    if (headless)
        chromeProcessArgs.Add(headlessArg);

    // Chrome refuses to start as root unless the sandbox is disabled.
    if (IsRoot)
        chromeProcessArgs.Add("--no-sandbox");

    string args = string.Join(" ", chromeProcessArgs);
    System.Diagnostics.ProcessStartInfo processStartInfo = new System.Diagnostics.ProcessStartInfo(ChromePath, args);
    System.Diagnostics.Process chromeProcess = System.Diagnostics.Process.Start(processStartInfo);

    string remoteDebuggingUrl = "http://localhost:" + port;
    return new LocalChromeProcess(new System.Uri(remoteDebuggingUrl), () => DirectoryCleaner.Delete(directoryInfo), chromeProcess);
}
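Process.Start returns before Chrome has opened its debugging port, so a connection attempt made immediately afterwards can fail. The following is a minimal sketch of one way to wait for the port, not part of the library: the names ChromeReadiness and WaitForDevToolsAsync are made up for illustration, and it assumes the /json/version HTTP endpoint that Chrome serves on the remote debugging port.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ChromeReadiness
{
    // Hypothetical helper: polls http://localhost:{port}/json/version
    // until it answers with a success status or the timeout elapses.
    public static async Task<bool> WaitForDevToolsAsync(int port, TimeSpan timeout)
    {
        using var client = new HttpClient { Timeout = TimeSpan.FromSeconds(2) };
        DateTime deadline = DateTime.UtcNow + timeout;

        while (DateTime.UtcNow < deadline)
        {
            try
            {
                using HttpResponseMessage response =
                    await client.GetAsync($"http://localhost:{port}/json/version");
                if (response.IsSuccessStatusCode)
                    return true;
            }
            catch (HttpRequestException)
            {
                // Port not open yet: fall through and retry.
            }
            catch (TaskCanceledException)
            {
                // The single request timed out: fall through and retry.
            }

            await Task.Delay(200);
        }

        return false;
    }

    public static async Task Main()
    {
        // 9222 matches the --remote-debugging-port value used above.
        bool ready = await WaitForDevToolsAsync(9222, TimeSpan.FromSeconds(10));
        Console.WriteLine(ready ? "DevTools endpoint is up" : "Chrome did not answer in time");
    }
}
```

Calling something like this right after Create(...) and before opening the WebSocket session avoids racing the browser start-up.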
I use this C# library to talk to DevTools (via WebSockets):
https://github.com/MasterDevs/ChromeDevTools
If you use NodeJS on the server, you can use this one:
https://github.com/cyrus-and/chrome-remote-interface
Or, for TypeScript:
https://github.com/TracerBench/chrome-debugging-client
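Whichever library you pick, what travels over the WebSocket is a plain JSON message with an id, a method name and a params object. As a rough illustration only (the CdpMessage class and its Build method are hypothetical and not taken from any of these libraries), such a message could be assembled like this with System.Text.Json:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class CdpMessage
{
    // Hypothetical helper: wraps a DevTools protocol method and its
    // parameters in the id/method/params envelope the protocol expects.
    public static string Build(int id, string method, Dictionary<string, object> parameters)
    {
        var message = new Dictionary<string, object>
        {
            ["id"] = id,            // echoed back by Chrome in the matching response
            ["method"] = method,    // "<Domain>.<command>", e.g. "Page.printToPDF"
            ["params"] = parameters
        };
        return JsonSerializer.Serialize(message);
    }

    public static void Main()
    {
        string json = Build(1, "Page.printToPDF", new Dictionary<string, object>
        {
            ["landscape"] = false,
            ["printBackground"] = true
        });
        Console.WriteLine(json);
    }
}
```

The command classes used below (PrintToPDFCommand, CaptureScreenshotCommand and so on) are typed wrappers around messages of this shape.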
To generate a PDF, you need to issue the PrintToPDF command:
' PaperWidth and PaperHeight are expected in inches, so convert from metric units.
Dim cm2inch As UnitConversion_t = Function(ByVal centimeters As Double) centimeters * 0.393701
Dim mm2inch As UnitConversion_t = Function(ByVal millimeters As Double) millimeters * 0.0393701

Dim printCommand2 As PrintToPDFCommand = New PrintToPDFCommand() With {
    .Scale = 1,
    .MarginTop = 0,
    .MarginLeft = 0,
    .MarginRight = 0,
    .MarginBottom = 0,
    .PrintBackground = True,
    .Landscape = False,
    .PaperWidth = mm2inch(conversionData.PageWidth),
    .PaperHeight = mm2inch(conversionData.PageHeight)
}
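To make the unit conversion concrete, here is a small stand-alone sketch for an A4 page (210 mm x 297 mm). The PaperSize class and MmToInch method are illustrative names only, and the sketch divides by 25.4 (the exact number of millimeters per inch) instead of multiplying by the rounded factor used above.

```csharp
using System;
using System.Globalization;

public static class PaperSize
{
    // One inch is exactly 25.4 mm.
    private const double MillimetersPerInch = 25.4;

    public static double MmToInch(double millimeters)
    {
        return millimeters / MillimetersPerInch;
    }

    public static void Main()
    {
        double width = MmToInch(210.0);   // A4 width in inches
        double height = MmToInch(297.0);  // A4 height in inches

        // Invariant culture keeps the decimal point independent of the server locale.
        Console.WriteLine(
            width.ToString("F2", CultureInfo.InvariantCulture) + " x " +
            height.ToString("F2", CultureInfo.InvariantCulture));
        // prints "8.27 x 11.69"
    }
}
```

These are the values that would end up in PaperWidth and PaperHeight for an A4 document.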
To create a raster image, you need to issue the CaptureScreenshot command:
Dim screenshot As MasterDevs.ChromeDevTools.CommandResponse(Of CaptureScreenshotCommandResponse) = Await chromeSession.SendAsync(New CaptureScreenshotCommand With {
    .Format = "png"
})
System.Diagnostics.Debug.WriteLine("Screenshot taken.")
conversionData.PngData = System.Convert.FromBase64String(screenshot.Result.Data)
Note that for the screenshot to work, you need to set the width and height via the SetDeviceMetricsOverride command:
Await chromeSession.SendAsync(New SetDeviceMetricsOverrideCommand With {
    .Width = conversionData.ViewPortWidth,
    .Height = conversionData.ViewPortHeight,
    .Scale = 1
})
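You may have to put overflow:hidden on the html element or on some child elements, so that you don't capture the scrollbars in the screenshot ;)
By the way, if you need a specific version of Chrome for Windows (Chromium, because old Chrome versions are not available for security reasons), you can get it from the Chocolatey repository: https://chocolatey.org/packages/chromium/#versionhistory
Here is my full test code for reference (minus a few classes)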
Imports MasterDevs.ChromeDevTools
Imports MasterDevs.ChromeDevTools.Protocol.Chrome.Browser
Imports MasterDevs.ChromeDevTools.Protocol.Chrome.Page
Imports MasterDevs.ChromeDevTools.Protocol.Chrome.Target

Namespace Portal_Convert.CdpConverter

    Public Class ChromiumBasedConverter

        Private Delegate Function UnitConversion_t(ByVal value As Double) As Double

        ' Kills every running Chrome process that was started with --headless.
        Public Shared Sub KillHeadlessChromes(ByVal writer As System.IO.TextWriter)
            Dim allProcesses As System.Diagnostics.Process() = System.Diagnostics.Process.GetProcesses()
            Dim exeName As String = "\chrome.exe"

            If System.Environment.OSVersion.Platform = System.PlatformID.Unix Then
                exeName = "/chrome"
            End If

            For i As Integer = 0 To allProcesses.Length - 1
                Dim proc As System.Diagnostics.Process = allProcesses(i)
                Dim commandLine As String = ProcessUtils.GetCommandLine(proc)
                If String.IsNullOrEmpty(commandLine) Then Continue For

                commandLine = commandLine.ToLowerInvariant()
                If commandLine.IndexOf(exeName, System.StringComparison.InvariantCultureIgnoreCase) = -1 Then Continue For

                If commandLine.IndexOf("--headless", System.StringComparison.InvariantCultureIgnoreCase) <> -1 Then
                    writer.WriteLine($"Killing process {proc.Id} with command line ""{commandLine}""")
                    ProcessUtils.KillProcessAndChildren(proc.Id)
                End If
            Next

            writer.WriteLine("Finished killing headless chromes")
        End Sub

        Public Shared Sub KillHeadlessChromes()
            KillHeadlessChromes(System.Console.Out)
        End Sub

        ' Helper that emulates assignment-as-expression: While (line = ReadLine()) IsNot Nothing
        Private Shared Function __Assign(Of T)(ByRef target As T, value As T) As T
            target = value
            Return value
        End Function

        ' Same as KillHeadlessChromes, but returns the log lines instead of writing them to the console.
        Public Shared Function KillHeadlessChromesWeb() As System.Collections.Generic.List(Of String)
            Dim ls As System.Collections.Generic.List(Of String) = New System.Collections.Generic.List(Of String)()
            Dim sb As System.Text.StringBuilder = New System.Text.StringBuilder()

            Using sw As System.IO.StringWriter = New System.IO.StringWriter(sb)
                KillHeadlessChromes(sw)
            End Using

            Using tr As System.IO.TextReader = New System.IO.StringReader(sb.ToString())
                Dim thisLine As String = Nothing

                While (__Assign(thisLine, tr.ReadLine())) IsNot Nothing
                    ls.Add(thisLine)
                End While
            End Using

            sb.Length = 0
            sb = Nothing
            Return ls
        End Function

        Private Shared Async Function InternalConnect(ByVal ci As ConnectionInfo, ByVal remoteDebuggingUri As String) As System.Threading.Tasks.Task
            ci.ChromeProcess = New RemoteChromeProcess(remoteDebuggingUri)
            ci.SessionInfo = Await ci.ChromeProcess.StartNewSession()
        End Function

        ' Connects to an already running Chrome; if the connection fails, starts a headless Chrome and retries once.
        Private Shared Async Function ConnectToChrome(ByVal chromePath As String, ByVal remoteDebuggingUri As String) As System.Threading.Tasks.Task(Of ConnectionInfo)
            Dim ci As ConnectionInfo = New ConnectionInfo()

            Try
                Await InternalConnect(ci, remoteDebuggingUri)
            Catch ex As System.Exception
                If ex.InnerException IsNot Nothing AndAlso Object.ReferenceEquals(ex.InnerException.[GetType](), GetType(System.Net.WebException)) Then
                    If (CType(ex.InnerException, System.Net.WebException)).Status = System.Net.WebExceptionStatus.ConnectFailure Then
                        Dim chromeProcessFactory As MasterDevs.ChromeDevTools.IChromeProcessFactory = New MasterDevs.ChromeDevTools.ChromeProcessFactory(New FastStubbornDirectoryCleaner(), chromePath)
                        Dim persistentChromeProcess As MasterDevs.ChromeDevTools.IChromeProcess = chromeProcessFactory.Create(9222, True)

                        ' await cannot be used inside catch ...
                        ' Await InternalConnect(ci, remoteDebuggingUri)
                        InternalConnect(ci, remoteDebuggingUri).Wait()
                        Return ci
                    End If
                End If

                System.Console.WriteLine(chromePath)
                System.Console.WriteLine(ex.Message)
                System.Console.WriteLine(ex.StackTrace)

                If ex.InnerException IsNot Nothing Then
                    System.Console.WriteLine(ex.InnerException.Message)
                    System.Console.WriteLine(ex.InnerException.StackTrace)
                End If

                System.Console.WriteLine(ex.[GetType]().FullName)
                Throw
            End Try

            Return ci
        End Function

        Private Shared Async Function ClosePage(ByVal chromeSession As MasterDevs.ChromeDevTools.IChromeSession, ByVal frameId As String, ByVal headLess As Boolean) As System.Threading.Tasks.Task
            Dim closeTargetTask As System.Threading.Tasks.Task(Of MasterDevs.ChromeDevTools.CommandResponse(Of CloseTargetCommandResponse)) = chromeSession.SendAsync(New CloseTargetCommand() With {
                .TargetId = frameId
            })

            ' await will block forever if headless
            If Not headLess Then
                Dim closeTargetResponse As MasterDevs.ChromeDevTools.CommandResponse(Of CloseTargetCommandResponse) = Await closeTargetTask
                System.Console.WriteLine(closeTargetResponse)
            Else
                System.Console.WriteLine(closeTargetTask)
            End If
        End Function

        Public Shared Async Function ConvertDataAsync(ByVal conversionData As ConversionData) As System.Threading.Tasks.Task
            Dim chromeSessionFactory As MasterDevs.ChromeDevTools.IChromeSessionFactory = New MasterDevs.ChromeDevTools.ChromeSessionFactory()

            Using connectionInfo As ConnectionInfo = Await ConnectToChrome(conversionData.ChromePath, conversionData.RemoteDebuggingUri)
                Dim chromeSession As MasterDevs.ChromeDevTools.IChromeSession = chromeSessionFactory.Create(connectionInfo.SessionInfo.WebSocketDebuggerUrl)

                ' The viewport size must be set, otherwise the screenshot will not work.
                Await chromeSession.SendAsync(New SetDeviceMetricsOverrideCommand With {
                    .Width = conversionData.ViewPortWidth,
                    .Height = conversionData.ViewPortHeight,
                    .Scale = 1
                })

                ' Open a blank page, then replace its content with the HTML to convert.
                Dim navigateResponse As MasterDevs.ChromeDevTools.CommandResponse(Of NavigateCommandResponse) = Await chromeSession.SendAsync(New NavigateCommand With {
                    .Url = "about:blank"
                })
                System.Console.WriteLine("NavigateResponse: " & navigateResponse.Id)

                Dim setContentResponse As MasterDevs.ChromeDevTools.CommandResponse(Of SetDocumentContentCommandResponse) = Await chromeSession.SendAsync(New SetDocumentContentCommand() With {
                    .FrameId = navigateResponse.Result.FrameId,
                    .Html = conversionData.Html
                })

                ' PaperWidth and PaperHeight are expected in inches.
                Dim cm2inch As UnitConversion_t = Function(ByVal centimeters As Double) centimeters * 0.393701
                Dim mm2inch As UnitConversion_t = Function(ByVal millimeters As Double) millimeters * 0.0393701

                Dim printCommand2 As PrintToPDFCommand = New PrintToPDFCommand() With {
                    .Scale = 1,
                    .MarginTop = 0,
                    .MarginLeft = 0,
                    .MarginRight = 0,
                    .MarginBottom = 0,
                    .PrintBackground = True,
                    .Landscape = False,
                    .PaperWidth = mm2inch(conversionData.PageWidth),
                    .PaperHeight = mm2inch(conversionData.PageHeight)
                }
                '.PaperWidth = cm2inch(conversionData.PageWidth),
                '.PaperHeight = cm2inch(conversionData.PageHeight)

                If conversionData.ChromiumActions.HasFlag(ChromiumActions_t.GetVersion) Then
                    Try
                        System.Diagnostics.Debug.WriteLine("Getting browser-version")
                        Dim version As MasterDevs.ChromeDevTools.CommandResponse(Of GetVersionCommandResponse) = Await chromeSession.SendAsync(New GetVersionCommand())
                        System.Diagnostics.Debug.WriteLine("Got browser-version")
                        conversionData.Version = version.Result
                    Catch ex As System.Exception
                        conversionData.Exception = ex
                        System.Diagnostics.Debug.WriteLine(ex.Message)
                    End Try
                End If

                If conversionData.ChromiumActions.HasFlag(ChromiumActions_t.ConvertToImage) Then
                    Try
                        System.Diagnostics.Debug.WriteLine("Taking screenshot")
                        Dim screenshot As MasterDevs.ChromeDevTools.CommandResponse(Of CaptureScreenshotCommandResponse) = Await chromeSession.SendAsync(New CaptureScreenshotCommand With {
                            .Format = "png"
                        })
                        System.Diagnostics.Debug.WriteLine("Screenshot taken.")
                        conversionData.PngData = System.Convert.FromBase64String(screenshot.Result.Data)
                    Catch ex As System.Exception
                        conversionData.Exception = ex
                        System.Diagnostics.Debug.WriteLine(ex.Message)
                    End Try
                End If

                If conversionData.ChromiumActions.HasFlag(ChromiumActions_t.ConvertToPdf) Then
                    Try
                        System.Diagnostics.Debug.WriteLine("Printing PDF")
                        Dim pdf As MasterDevs.ChromeDevTools.CommandResponse(Of PrintToPDFCommandResponse) = Await chromeSession.SendAsync(printCommand2)
                        System.Diagnostics.Debug.WriteLine("PDF printed.")
                        conversionData.PdfData = System.Convert.FromBase64String(pdf.Result.Data)
                    Catch ex As System.Exception
                        conversionData.Exception = ex
                        System.Diagnostics.Debug.WriteLine(ex.Message)
                    End Try
                End If

                System.Console.WriteLine("Closing page")
                Await ClosePage(chromeSession, navigateResponse.Result.FrameId, True)
                System.Console.WriteLine("Page closed")
            End Using ' connectionInfo
        End Function ' ConvertDataAsync

        ' Synchronous wrapper around ConvertDataAsync.
        Public Shared Sub ConvertData(ByVal conversionData As ConversionData)
            ConvertDataAsync(conversionData).Wait()
        End Sub

    End Class

End Namespace
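Note that if you are working in C#, you are better off with this library:
https://github.com/BaristaLabs/chrome-dev-tools-runtime
It has fewer external dependencies and targets .NET Core. I only used the other one because I had to port the code to an old framework version...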