java如何網頁截屏？selenium來搞定

景

需求一直有，今年比較多，如題，工作中遇到網頁截圖這樣的需求，本著效果好，功能全又穩定的意圖，去網上搜索相關技術，像HTML2Image、cssbox、selenium等，還有很多其他的技術，這篇文章主要說說我測試使用并能滿足需求的cssbox,selenium。

Cssbox

CSSBox是一個用純Java編寫的（X）HTML/CSS渲染引擎。它的主要目的是提供關于呈現的頁面內容和布局的完整和進一步可處理的信息。但是，它也可以用于瀏覽Java Swing應用程序中呈現的文檔。核心CSSBox庫還可以用于獲得所呈現的文檔的位圖或矢量（SVG）圖像。使用SwingBox包，CSSBox可以用作Java Swing應用程序中的交互式Web瀏覽器組件。

官網地址：http://cssbox.sourceforge.net/

使用

1引入maven依賴

<!--網站轉換為圖片cssbox-->
<dependency>
<groupId>net.sf.cssbox</groupId>
<artifactId>cssbox</artifactId>
<version>5.0.0</version>
</dependency>

2使用

@Test
public void cssboxTest(){
    try {
        ImageRenderer render=new ImageRenderer();
        //網絡鏈接的html
        String url="https://www.zhangbj.com/p/524.html";
        //文件保存路徑
        String path="C:\\Users\\Administrator\\Desktop"+File.separator+"html.png";
        FileOutputStream out=new FileOutputStream(new File(FilenameUtils.normalize(path)));
        //開始截屏
        render.renderURL(url, out);
    } catch (Exception e) {
    e.printStackTrace();
    }
}

3結果

樣式可能出現問題，中文有時候亂碼

Selenium

1引入依賴

<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
<version>3.141.59</version>
</dependency>

2相關準備

selenium+chromedriver谷歌驅動+chrome瀏覽器

1.注意谷歌驅動的版本要和谷歌瀏覽器的版本一樣或者版本最相近

2.注意chromedriver谷歌驅動需要放在jdk安裝目錄下，具體路徑為xxx/bin/chromedriver.exe，在linux和window中操作一樣，這樣切換系統是就無需改代碼。

3.需要安裝谷歌瀏覽器

谷歌驅動下載地址：https://registry.npmmirror.com/binary.html?path=chromedriver/

3使用

@Slf4j
public class Html2ImageUtil {
/**
* 將HTML轉為圖片，并保存至指定位置
* @param url 頁面地址
* @param targetPath 保存地址（包含圖片名，如 /images/test.png）
* @return
*/
public static String htmlToImage(String url, String targetPath) {
  if (StringUtils.isEmpty(url) || StringUtils.isEmpty(targetPath)) {
  throw new RuntimeException("截圖失敗！缺少必填項");
  }
  // 休眠時長
  Integer sleepTime=3 * 1000;
  // 無頭模式
  System.setProperty("java.awt.headless", "true");
  //獲取谷歌配置信息
  ChromeOptions chromeOptions=getChromeOptions();
  // 配置信息中有默認窗口大小，也可以單獨設置窗口大小
  chromeOptions.addArguments("--window-size=1920,6000");
  //創建webdriver 谷歌驅動
  WebDriver driver=new ChromeDriver(chromeOptions);
  //也可以通過如下方式設置窗口大小
  // Dimension dimension=new Dimension(1000, 30);
  // driver.manage().window().setSize(dimension);
  try {
    //加載頁面
    driver.get(url);
    //等待加載頁面
    Thread.sleep(sleepTime);
    //截屏
    File srcFile=((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
    //保存到指定位置
    FileUtils.copyFile(srcFile, new File(FilenameUtils.normalize(targetPath)));
  } catch (InterruptedException | IOException e) {
  e.printStackTrace();
  throw new RuntimeException(e.getMessage());
  } finally {
  driver.quit();
  }
  log.info("截圖成功！");
  return targetPath;
}
/**
* 獲取chrome配置信息
* 注意 chromedriver谷歌驅動需要放在jdk安裝目錄下，具體路徑為xxx/bin/chromedriver.exe ，在linux和window中操作一樣
* @return
*/
public static ChromeOptions getChromeOptions() {
    ChromeOptions options=new ChromeOptions();
    //獲取當前操作系統
    String os=System.getProperty("os.name");
    //獲取jdk安裝目錄，需要提前將谷歌驅動放進jdk的bin目錄下，在linux和window中操作一樣
    String sysPath=System.getProperty("java.home").replace("jre", "bin");
    String chromeDriver=sysPath + File.separator+"chromedriver.exe";
    options.addArguments("disable-infobars");
    //設置為 headless 模式，不需要真實啟動瀏覽器
    options.setHeadless(true);
    //options.addArguments("--headless");
    options.addArguments("--dns-prefetch-disable");
    options.addArguments("--no-referrers");
    options.addArguments("--disable-gpu");
    options.addArguments("--disable-audio");
    options.addArguments("--no-sandbox");
    options.addArguments("--ignore-certificate-errors");
    options.addArguments("--allow-insecure-localhost");
    options.addArguments("--window-size=1920,6000"); // 窗口默認大小
    String userAgent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36";
    userAgent="user-agent=" + userAgent;
    options.addArguments(userAgent);
    // 設置chrome二進制文件
    options.setPageLoadStrategy(PageLoadStrategy.EAGER);
    // 設置驅動
    System.setProperty("webdriver.chrome.driver", chromeDriver);
    log.debug("結束獲取chrome配置信息");
    return options;
}

測試

public static void main(String[] args) {
		htmlToImage("https://www.cnblogs.com/tester-ggf/p/12602211.html","C:\\Users\\Administrator\\Desktop\\aaa.png");
}

效果十分完美

總結

最完美的方案就是selenium+chromedriver谷歌驅動+chrome瀏覽器，無需多說，用吧。

您的贊和關注是對我創作的最大肯定謝謝大家！

前我曾寫過如何將canvas圖形轉換成圖片和下載canvas圖像的方法，這些都是在為這個插件做技術準備。

技術路線很清晰，將網頁的某個區域的內容生成圖像，保持到canvas里，然后將canvas內容轉換成圖片，保存到本地，最后上傳到微博。

我在網上搜尋到html2canvas這個能將指定網頁元素內容生成canvas圖像的javascript工具。這個js工具的用法很簡單，你只需要將它的js文件引入到頁面里，然后調用html2canvas()函數：

 html2canvas(document.body, {
     onrendered: function(canvas) {
         /* canvas is the actual canvas element,
            to append it to the page call for example
            document.body.appendChild( canvas );
         */
     }
 });

這個html2canvas()函數有個參數，上面的例子里傳入的參數是document.body，這會截取整個頁面的圖像。如果你想只截取一個區域，比如對某個p或某個table截圖，你就將這個p或某個table當做參數傳進去。

我最終并沒有選用html2canvas這個js工具，因為在我的實驗過程中發現它有幾個問題。

首先，跨域問題。我舉個例子說明這個問題，比如我的網頁網址是http://www.webhek.com/about/，而我在這個頁面上有個張圖片，這個圖片并不是來自www.webhek.com域，而是來自CDN圖片服務器www.webhek-cdn.com/images/about.jpg，那么，這張圖片就和這個網頁不是同域，那么html2canvas就無法對這種圖片進行截圖，如果你的網站的所有圖片都放在單獨的圖片服務器上，那么用html2canvas對整個網頁進行截圖是就會發現所有圖片的地方都是空白。

這個問題也有補救的方法，就是用代理：

 <!DOCTYPE html>
 <html>
     <head>
         <meta charset="utf-8">
         <title>html2canvas php proxy</title>
         <script src="html2canvas.js"></script>
         <script>
         //<![CDATA[
         (function() {
             window.onload=function(){
                 html2canvas(document.body, {
                     "logging": true, //Enable log (use Web Console for get Errors and Warnings)
                     "proxy":"html2canvasproxy.php",
                     "onrendered": function(canvas) {
                         var img=new Image();
                         img.onload=function() {
                             img.onload=null;
                             document.body.appendChild(img);
                         };
                         img.onerror=function() {
                             img.onerror=null;
                             if(window.console.log) {
                                 window.console.log("Not loaded image from canvas.toDataURL");
                             } else {
                                 alert("Not loaded image from canvas.toDataURL");
                             }
                         };
                         img.src=canvas.toDataURL("image/png");
                     }
                 });
             };
         })();
         //]]>
         </script>
     </head>
     <body>
         <p>
             <img alt="google maps static" src="http://maps.googleapis.com/maps/api/staticmap?center=40.714728,-73.998672&zoom=12&size=800x600&maptype=roadmap&sensor=false">
         </p>
     </body>
 </html>

這個方法只能用在你自己的服務器里，如果是對別人的網頁截圖，還是不行。

試驗的過程中還發現用html2canvas截屏出來的圖像有時會出現文字重疊的現象。我估計是因為html2canvas在解析頁面內容、處理css時不是很完美的原因。

最后，我在火狐瀏覽器的官方網站上找到了drawWindow()這個方法，這個方法和上面提到html2canvas不同之處在于，它不分析頁面元素，它只針對區域，也就是說，它接受的參數是四個數字標志的區域，不論這個區域中什么地方，有沒有頁面內容。

 void drawWindow(
   in nsIDOMWindow window,
   in float x, 
   in float y,
   in float w,
   in float h,
   in DOMString bgColor,
   in unsigned long flags [optional]
 );

這個原生的JavaScript方法看起來非常的完美，正是我需要的，但這個方法不能使用在普通網頁中，因為火狐官方發現這個方法會引起有安全漏洞，在這個bug修復之前，只有具有“Chrome privileges”的代碼才能使用這個drawWindow()函數。

雖然有很大的限制，但周折一下還是可以用的，在我開發的火狐addon插件中，main.js就是具有“Chrome privileges”的代碼。我在網上發現了一段火狐插件SDK里自帶代碼樣例：

 var window=require('window/utils').getMostRecentBrowserWindow();
 var tab=require('tabs/utils').getActiveTab(window);
 var thumbnail=window.document.createElementNS("http://www.w3.org/1999/xhtml", "canvas");
 thumbnail.mozOpaque=true;
 window=tab.linkedBrowser.contentWindow;
 thumbnail.width=Math.ceil(window.screen.availWidth / 5.75);
 var aspectRatio=0.5625; // 16:9
 thumbnail.height=Math.round(thumbnail.width * aspectRatio);
 var ctx=thumbnail.getContext("2d");
 var snippetWidth=window.innerWidth * .6;
 var scale=thumbnail.width / snippetWidth;
 ctx.scale(scale, scale);
 ctx.drawWindow(window, window.scrollX, window.scrollY, snippetWidth, snippetWidth * aspectRatio, "rgb(255,255,255)");
 // thumbnail now represents a thumbnail of the tab

這段代碼寫的非常清楚，只需要依據它做稍微的修改就能適應自己的需求。

這里小編是一個有著10年工作經驗的前端高級工程師，關于web前端有許多的技術干貨，包括但不限于各大廠的最新面試題系列、前端項目、最新前端路線等。需要的伙伴可以私信我

發送【前端資料】

就可以獲取領取地址，免費送給大家。對于學習web前端有任何問題（學習方法，學習效率，如何就業）都可以問我。希望你也能憑自己的努力，成為下一個優秀的程序員

景

最近再做一個需求，需要對網頁生成預覽圖，如下圖

但是網頁千千萬，總不能一個個打開，截圖吧；于是想著能不能使用代碼來實現網頁的截圖。其實要實現這個功能，無非就是要么實現一個仿真瀏覽器，要么調用系統瀏覽器，再進行截圖操作。

代碼實現

1、啟用線程Thread

 void startPrintScreen(ScreenShotParam requestParam) { Thread thread=new Thread(new ParameterizedThreadStart(do_PrintScreen)); thread.SetApartmentState(ApartmentState.STA); thread.Start(requestParam); if (requestParam.Wait) { thread.Join(); FileInfo result=new FileInfo(requestParam.SavePath); long minSize=1 * 1024;// 太小可能是空白圖，重抓 int maxRepeat=2;  while ((!result.Exists || result.Length <=minSize) && maxRepeat > 0) { thread=new Thread(new ParameterizedThreadStart(do_PrintScreen)); thread.SetApartmentState(ApartmentState.STA); thread.Start(requestParam); thread.Join(); maxRepeat--; } } }

2、模擬瀏覽器WebBrowser

 void do_PrintScreen(object param) { try { ScreenShotParam screenShotParam=(ScreenShotParam)param; string requestUrl=screenShotParam.Url; string savePath=screenShotParam.SavePath; WebBrowser wb=new WebBrowser(); wb.ScrollBarsEnabled=false; wb.ScriptErrorsSuppressed=true; wb.Navigate(requestUrl); logger.Debug("wb.Navigate"); DateTime startTime=DateTime.Now; TimeSpan waitTime=new TimeSpan(0, 0, 0, 10, 0);// 10 second while (wb.ReadyState !=WebBrowserReadyState.Complete) { Application.DoEvents(); if (DateTime.Now - startTime > waitTime) { wb.Dispose(); logger.Debug("wb.Dispose() timeout"); return; } }
 wb.Width=screenShotParam.Left + screenShotParam.Width + screenShotParam.Left; // wb.Document.Body.ScrollRectangle.Width (避掉左右側的邊線); wb.Height=screenShotParam.Top + screenShotParam.Height; // wb.Document.Body.ScrollRectangle.Height; wb.ScrollBarsEnabled=false; wb.Document.Body.Style="overflow:hidden";//hide scroll bar var doc=(wb.Document.DomDocument) as mshtml.IHTMLDocument2; var style=doc.createStyleSheet("", 0); style.cssText=@"img { border-style: none; }";
 Bitmap bitmap=new Bitmap(wb.Width, wb.Height); wb.DrawToBitmap(bitmap, new Rectangle(0, 0, wb.Width, wb.Height)); wb.Dispose(); logger.Debug("wb.Dispose()");
 bitmap=CutImage(bitmap, new Rectangle(screenShotParam.Left, screenShotParam.Top, screenShotParam.Width, screenShotParam.Height)); bool needResize=screenShotParam.Width > screenShotParam.ResizeMaxWidth || screenShotParam.Height > screenShotParam.ResizeMaxWidth; if (needResize) { double greaterLength=bitmap.Width > bitmap.Height ? bitmap.Width : bitmap.Height; double ratio=screenShotParam.ResizeMaxWidth / greaterLength; bitmap=Resize(bitmap, ratio); }
 bitmap.Save(savePath, System.Drawing.Imaging.ImageFormat.Gif); bitmap.Dispose(); logger.Debug("bitmap.Dispose();"); logger.Debug("finish");
 } catch (Exception ex) { logger.Info($"exception: {ex.Message}"); } }

3、截圖操作

 private static Bitmap CutImage(Bitmap source, Rectangle section) { // An empty bitmap which will hold the cropped image Bitmap bmp=new Bitmap(section.Width, section.Height); //using (Bitmap bmp=new Bitmap(section.Width, section.Height)) { Graphics g=Graphics.FromImage(bmp);
 // Draw the given area (section) of the source image // at location 0,0 on the empty bitmap (bmp) g.DrawImage(source, 0, 0, section, GraphicsUnit.Pixel);
 return bmp; } }
 private static Bitmap Resize(Bitmap originImage, Double times) { int width=Convert.ToInt32(originImage.Width * times); int height=Convert.ToInt32(originImage.Height * times);
 return ResizeProcess(originImage, originImage.Width, originImage.Height, width, height); }

完整代碼

 public static string ScreenShotAndSaveAmazonS3(string account, string locale, Guid rule_ID, Guid template_ID) {  //新的Template var url=string.Format("https://xxxx/public/previewtemplate?showTemplateName=0&locale={0}&inputTemplateId={1}&inputThemeId=&Account={2}", locale, template_ID, account ); 
 var tempPath=Tools.GetAppSetting("TempPath");
 //路徑準備 var userPath=AmazonS3.GetS3UploadDirectory(account, locale, AmazonS3.S3SubFolder.Template); var fileName=string.Format("{0}.gif", template_ID); var fullFilePath=Path.Combine(userPath.LocalDirectoryPath, fileName); logger.Debug("userPath: {0}, fileName: {1}, fullFilePath: {2}, url:{3}", userPath, fileName, fullFilePath, url);  //開始截圖，並暫存在本機 var screen=new Screen(); screen.ScreenShot(url, fullFilePath);
 //將截圖，儲存到 Amazon S3 //var previewImageUrl=AmazonS3.UploadFile(fullFilePath, userPath.RemotePath + fileName);
 return string.Empty; }

using System;using System.Collections.Generic;using System.Drawing;using System.IO;using System.Linq;using System.Text;using System.Threading;using System.Threading.Tasks;using System.Windows.Forms;
namespace PrintScreen.Common{ public class Screen { protected static NLog.Logger logger=NLog.LogManager.GetCurrentClassLogger();
 public void ScreenShot(string url, string path , int width=400, int height=300 , int left=50, int top=50 , int resizeMaxWidth=200, int wait=1) { if (!string.IsOrEmpty(url) && !string.IsOrEmpty(path)) { ScreenShotParam requestParam=new ScreenShotParam { Url=url, SavePath=path, Width=width, Height=height, Left=left, Top=top, ResizeMaxWidth=resizeMaxWidth, Wait=wait !=0 }; startPrintScreen(requestParam); } }
 void startPrintScreen(ScreenShotParam requestParam) { Thread thread=new Thread(new ParameterizedThreadStart(do_PrintScreen)); thread.SetApartmentState(ApartmentState.STA); thread.Start(requestParam); if (requestParam.Wait) { thread.Join(); FileInfo result=new FileInfo(requestParam.SavePath); long minSize=1 * 1024;// 太小可能是空白圖，重抓 int maxRepeat=2;  while ((!result.Exists || result.Length <=minSize) && maxRepeat > 0) { thread=new Thread(new ParameterizedThreadStart(do_PrintScreen)); thread.SetApartmentState(ApartmentState.STA); thread.Start(requestParam); thread.Join(); maxRepeat--; } } }
 void do_PrintScreen(object param) { try { ScreenShotParam screenShotParam=(ScreenShotParam)param; string requestUrl=screenShotParam.Url; string savePath=screenShotParam.SavePath; WebBrowser wb=new WebBrowser(); wb.ScrollBarsEnabled=false; wb.ScriptErrorsSuppressed=true; wb.Navigate(requestUrl); logger.Debug("wb.Navigate"); DateTime startTime=DateTime.Now; TimeSpan waitTime=new TimeSpan(0, 0, 0, 10, 0);// 10 second while (wb.ReadyState !=WebBrowserReadyState.Complete) { Application.DoEvents(); if (DateTime.Now - startTime > waitTime) { wb.Dispose(); logger.Debug("wb.Dispose() timeout"); return; } }
 wb.Width=screenShotParam.Left + screenShotParam.Width + screenShotParam.Left; // wb.Document.Body.ScrollRectangle.Width (避掉左右側的邊線); wb.Height=screenShotParam.Top + screenShotParam.Height; // wb.Document.Body.ScrollRectangle.Height; wb.ScrollBarsEnabled=false; wb.Document.Body.Style="overflow:hidden";//hide scroll bar var doc=(wb.Document.DomDocument) as mshtml.IHTMLDocument2; var style=doc.createStyleSheet("", 0); style.cssText=@"img { border-style: none; }";
 Bitmap bitmap=new Bitmap(wb.Width, wb.Height); wb.DrawToBitmap(bitmap, new Rectangle(0, 0, wb.Width, wb.Height)); wb.Dispose(); logger.Debug("wb.Dispose()");
 bitmap=CutImage(bitmap, new Rectangle(screenShotParam.Left, screenShotParam.Top, screenShotParam.Width, screenShotParam.Height)); bool needResize=screenShotParam.Width > screenShotParam.ResizeMaxWidth || screenShotParam.Height > screenShotParam.ResizeMaxWidth; if (needResize) { double greaterLength=bitmap.Width > bitmap.Height ? bitmap.Width : bitmap.Height; double ratio=screenShotParam.ResizeMaxWidth / greaterLength; bitmap=Resize(bitmap, ratio); }
 bitmap.Save(savePath, System.Drawing.Imaging.ImageFormat.Gif); bitmap.Dispose(); logger.Debug("bitmap.Dispose();"); logger.Debug("finish");
 } catch (Exception ex) { logger.Info($"exception: {ex.Message}"); } }
 private static Bitmap CutImage(Bitmap source, Rectangle section) { // An empty bitmap which will hold the cropped image Bitmap bmp=new Bitmap(section.Width, section.Height); //using (Bitmap bmp=new Bitmap(section.Width, section.Height)) { Graphics g=Graphics.FromImage(bmp);
 // Draw the given area (section) of the source image // at location 0,0 on the empty bitmap (bmp) g.DrawImage(source, 0, 0, section, GraphicsUnit.Pixel);
 return bmp; } }
 private static Bitmap Resize(Bitmap originImage, Double times) { int width=Convert.ToInt32(originImage.Width * times); int height=Convert.ToInt32(originImage.Height * times);
 return ResizeProcess(originImage, originImage.Width, originImage.Height, width, height); }
 private static Bitmap ResizeProcess(Bitmap originImage, int oriwidth, int oriheight, int width, int height) { Bitmap resizedbitmap=new Bitmap(width, height); //using (Bitmap resizedbitmap=new Bitmap(width, height)) { Graphics g=Graphics.FromImage(resizedbitmap); g.InterpolationMode=System.Drawing.Drawing2D.InterpolationMode.High; g.SmoothingMode=System.Drawing.Drawing2D.SmoothingMode.HighQuality; g.Clear(Color.Transparent); g.DrawImage(originImage, new Rectangle(0, 0, width, height), new Rectangle(0, 0, oriwidth, oriheight), GraphicsUnit.Pixel); return resizedbitmap; } }
 }
 class ScreenShotParam { public string Url { get; set; } public string SavePath { get; set; } public int Width { get; set; } public int Height { get; set; } public int Left { get; set; } public int Top { get; set; } /// <summary> /// 長邊縮到指定長度 /// </summary> public int ResizeMaxWidth { get; set; } public bool Wait { get; set; } }
}

效果

完成，達到預期的效果。

在線咨詢

上一篇：自動化生產線的形式劃分
下一篇：《寄生蟲》橫掃奧斯卡，Python告訴你這部電影到底

您的項目需求

*請認真填寫需求信息，我們會在24小時內與您取得聯系。

整合營銷服務商