
Capturing web page video with PhantomJS and Selenium

At first glance, capturing web page video may seem like a trivial task: just google it and pick from a set of ready-made solutions.

However, I found only two approaches on the Internet. The first uses browser extensions and therefore requires a GUI.

The second pairs PhantomJS with FFmpeg: you capture page screenshots with an overlay (for example, 25 screenshots for one second of video at 25 FPS) and then join all these screenshots into a video with FFmpeg.

This simple solution works, but only for short videos.
When you need to capture a 5-minute video at 60 FPS, it requires around 5 min × 60 sec × 60 frames/sec = 18,000 frames (screenshots). As you can imagine, this is very slow because it involves a lot of I/O operations.

You can read more about this solution here.

In this article, I would like to describe a faster approach to recording web page video with:

  • PhantomJS
  • Java
  • Selenium
  • Humble Video

Humble Video (ex Xuggler) allows Java Virtual Machine developers (Java, Scala, Clojure, JRuby, etc.) to decode, analyze/modify and encode audio and video data into 100s of different formats (e.g. H264, AAC, MP3, FLV, etc.). It uses the FFmpeg project under the covers. Humble Video is a mix of Java and native code, and the native code is written in C++, C and Assembly.

Although Humble is based on FFmpeg, I found that in my case it works faster than calling FFmpeg directly (for example, via ffmpeg-cli-wrapper).

Prerequisites

So, first of all, we need to create a new Java project. Let's create a Gradle project and add the following dependencies:

compile('org.seleniumhq.selenium:selenium-java:3.0.0') // Selenium
compile('com.github.detro:ghostdriver:2.0.0') // Driver for PhantomJS
compile('io.humble:humble-video-all:0.2.1') // Humble

And add the repository that provides ghostdriver:

maven { url 'https://jitpack.io' }
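
For reference, the pieces above might be assembled into a minimal build.gradle like this (the plugin line and mavenCentral() are my assumptions about the rest of the build file):

```groovy
apply plugin: 'java'

repositories {
    mavenCentral()
    maven { url 'https://jitpack.io' } // hosts ghostdriver
}

dependencies {
    compile('org.seleniumhq.selenium:selenium-java:3.0.0') // Selenium
    compile('com.github.detro:ghostdriver:2.0.0')          // Driver for PhantomJS
    compile('io.humble:humble-video-all:0.2.1')            // Humble
}
```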

Also, let’s download and install PhantomJS and add its binary to the PATH.

Windows (run cmd as administrator):

setx path "%path%;C:\Program Files\phantomjs-%VERSION%-windows\bin"

Linux:

sudo ln -s /path/to/phantomjs /usr/local/bin/

And check the installation:

phantomjs --version
2.1.1

Capture screenshot

Let’s create a SeleniumService class for capturing a screenshot with Selenium and PhantomJS. Note that a PhantomJS screenshot contains the full page height, not just the viewport; this tall image is what we will later scroll through frame by frame.

package org.dd.webvideo.service;

import org.openqa.selenium.Dimension;
import org.openqa.selenium.OutputType;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;

import java.io.File;

public class SeleniumService {

    private final int width;
    private final int height;

    public SeleniumService(int width, int height){
        this.width = width;
        this.height = height;
    }

    public File getUrlScreenshot(String url) {
        DesiredCapabilities caps = new DesiredCapabilities();
        RemoteWebDriver driver = new PhantomJSDriver(caps);

        try {
            driver.manage().window().setSize(new Dimension(width, height));
            driver.get(url);

            // PhantomJS captures the entire page, not just the visible viewport
            File screenshotFile = driver.getScreenshotAs(OutputType.FILE);
            screenshotFile.deleteOnExit();

            return screenshotFile;
        } finally {
            driver.quit();
        }
    }

}

Render a video

First of all, we need to calculate the required frame count from the requested video duration, and then derive the per-frame scroll offset from the screenshot height.

int framesRequired = duration * fps;
double frameOffset = getFrameOffset(sourceImage, framesRequired);

.....
private double getFrameOffset(BufferedImage sourceImage, final double framesRequired) {
    return (double) (sourceImage.getHeight() - height) / framesRequired;
}
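
To make the arithmetic concrete, here is the same calculation with assumed numbers that match the test run later in the article (60 seconds at 30 FPS, a 1080 px viewport; the 4032 px screenshot height is my assumption, chosen to reproduce the logged offset of 1.64):

```java
public class FrameOffsetDemo {

    // Pixels the crop window moves down per frame
    static double frameOffset(int screenshotHeight, int viewportHeight, int framesRequired) {
        return (double) (screenshotHeight - viewportHeight) / framesRequired;
    }

    public static void main(String[] args) {
        int duration = 60, fps = 30;          // 60 s of video at 30 FPS
        int framesRequired = duration * fps;  // 1800 frames
        // Assumed: 4032 px tall page screenshot, 1080 px viewport
        System.out.println("framesRequired=" + framesRequired);
        System.out.println("frameOffset=" + frameOffset(4032, 1080, framesRequired));
    }
}
```

So each frame scrolls the crop window down by about 1.64 px, which over 1800 frames covers the whole page exactly once.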

Then, create a Muxer and an Encoder with the default codec for the output format (mp4).

private static final PixelFormat.Type DEFAULT_PIXEL_FORMAT = PixelFormat.Type.PIX_FMT_YUV420P;
private static final String DEFAULT_PRESET = "ultrafast";
....
final MediaPacket packet = MediaPacket.make();
final Muxer muxer = Muxer.make(outputFilename.getAbsolutePath(), null, formatName);
final MuxerFormat muxerFormat = muxer.getFormat();
final Encoder encoder = createEncoder(muxerFormat);
muxer.addNewStream(encoder);
muxer.open(null, null);

....
private Encoder createEncoder(MuxerFormat muxerFormat) {
	final Codec codec = Codec.findEncodingCodec(muxerFormat.getDefaultVideoCodecId());

	Encoder encoder = Encoder.make(codec);
	encoder.setProperty("preset", DEFAULT_PRESET);

	encoder.setWidth(width);
	encoder.setHeight(height);
	encoder.setPixelFormat(DEFAULT_PIXEL_FORMAT);
	encoder.setTimeBase(getFrameRate());

	if (muxerFormat.getFlag(MuxerFormat.Flag.GLOBAL_HEADER))
		encoder.setFlag(Encoder.Flag.FLAG_GLOBAL_HEADER, true);

	encoder.open(null, null);
	return encoder;
}

Next, we need to make sure we have the right MediaPicture format objects to encode data with. We also need a MediaPictureConverter to convert each BufferedImage frame into a MediaPicture.

final MediaPicture frameMediaPicture = createMediaPicture(encoder);
MediaPictureConverter converter = null; // Lazy converter creation

And finally, let’s encode and write frames.

for (int frameIndex = 0; frameIndex < framesRequired; frameIndex++) {
	int currentFrameOffset = (int) (frameIndex * frameOffset);
	final BufferedImage frameImage = cropImage(sourceImage, new Rectangle(0, currentFrameOffset, width, height));

	if (converter == null)
		converter = MediaPictureConverterFactory.createConverter(frameImage, frameMediaPicture);

	converter.toPicture(frameMediaPicture, frameImage, frameIndex);
	writeFrame(muxer, encoder, frameMediaPicture, packet);
}

...
private BufferedImage cropImage(BufferedImage src, Rectangle rect) {
        BufferedImage clipped = src.getSubimage(rect.x, rect.y, rect.width, rect.height);
        BufferedImage out = new BufferedImage(clipped.getWidth(), clipped.getHeight(), clipped.getType());
        out.getGraphics().drawImage(clipped, 0, 0, null);
        return out;
}
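
As a side note, the defensive copy in cropImage matters: getSubimage returns a view that shares the parent's pixel data, so drawing the clip into a fresh BufferedImage gives each frame its own buffer. A standalone sketch of the same helper (the 4032 px page height is an assumed example value):

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class CropDemo {

    // Same logic as cropImage above: take a viewport-sized window from the
    // tall page screenshot and copy it into an image that owns its pixels.
    static BufferedImage cropImage(BufferedImage src, Rectangle rect) {
        BufferedImage clipped = src.getSubimage(rect.x, rect.y, rect.width, rect.height);
        BufferedImage out = new BufferedImage(clipped.getWidth(), clipped.getHeight(), clipped.getType());
        out.getGraphics().drawImage(clipped, 0, 0, null);
        return out;
    }

    public static void main(String[] args) {
        BufferedImage page = new BufferedImage(1920, 4032, BufferedImage.TYPE_3BYTE_BGR);
        BufferedImage frame = cropImage(page, new Rectangle(0, 164, 1920, 1080));
        System.out.println(frame.getWidth() + "x" + frame.getHeight()); // 1920x1080
    }
}
```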

...
private void writeFrame(Muxer muxer, Encoder encoder, MediaPicture picture, MediaPacket packet) {
	// The encoder may buffer frames internally, so keep draining until it
	// stops producing complete packets. A null picture flushes the encoder.
	do {
		encoder.encode(packet, picture);
		if (packet.isComplete())
			muxer.write(packet, false);
	} while (packet.isComplete());
}

The last step is to flush the encoder cache and close the Muxer.

flushCache(muxer, encoder, packet);
muxer.close();

....

private void flushCache(Muxer muxer, Encoder encoder, MediaPacket packet) {
	writeFrame(muxer, encoder, null, packet);
}

Then, let’s assemble this code into a ScreenshotVideoService class.

package org.dd.webvideo.service;

import io.humble.video.Codec;
import io.humble.video.Encoder;
import io.humble.video.MediaPacket;
import io.humble.video.MediaPicture;
import io.humble.video.Muxer;
import io.humble.video.MuxerFormat;
import io.humble.video.PixelFormat;
import io.humble.video.Rational;
import io.humble.video.awt.MediaPictureConverter;
import io.humble.video.awt.MediaPictureConverterFactory;

import javax.imageio.ImageIO;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.logging.Logger;

public class ScreenshotVideoService {

    private static final Logger LOG = Logger.getLogger(ScreenshotVideoService.class.getName());
    private static final PixelFormat.Type DEFAULT_PIXEL_FORMAT = PixelFormat.Type.PIX_FMT_YUV420P;
    private static final String DEFAULT_PRESET = "ultrafast";

    private final int width;
    private final int height;
    private final String formatName;
    private final int fps;

    public ScreenshotVideoService(int width, int height, String formatName, int fps) {
        this.width = width;
        this.height = height;
        this.formatName = formatName;
        this.fps = fps;
    }

    public File prepareVideo(File input, int duration) {
        LOG.info("Starting screenshot video encoding");

        File outputFile = new File("./output-" + System.currentTimeMillis() + "." + formatName);

        try {
            BufferedImage sourceImage = openSourceImage(input);
            recordVideo(sourceImage, outputFile, duration);
        } catch (InterruptedException | IOException e) {
            LOG.severe("Can't record page video");
            throw new IllegalStateException(e);
        }

        return outputFile;
    }

    private void recordVideo(BufferedImage sourceImage, final File outputFilename, final int duration) throws
            InterruptedException, IOException {

        int framesRequired = duration * fps;
        double frameOffset = getFrameOffset(sourceImage, framesRequired);

        final MediaPacket packet = MediaPacket.make();
        final Muxer muxer = Muxer.make(outputFilename.getAbsolutePath(), null, formatName);
        final MuxerFormat muxerFormat = muxer.getFormat();
        final Encoder encoder = createEncoder(muxerFormat);
        muxer.addNewStream(encoder);
        muxer.open(null, null);

        final MediaPicture frameMediaPicture = createMediaPicture(encoder);

        LOG.info(String.format("Start decode. Frames required: %s. Duration: %s sec. Frame offset: %s.", framesRequired, duration, frameOffset));
        MediaPictureConverter converter = null;

        long startTime = System.currentTimeMillis();

        for (int frameIndex = 0; frameIndex < framesRequired; frameIndex++) {
            logFps(frameIndex, startTime);

            int currentFrameOffset = (int) (frameIndex * frameOffset);
            final BufferedImage frameImage = cropImage(sourceImage, new Rectangle(0, currentFrameOffset, width, height));

            if (converter == null)
                converter = MediaPictureConverterFactory.createConverter(frameImage, frameMediaPicture);

            converter.toPicture(frameMediaPicture, frameImage, frameIndex);
            writeFrame(muxer, encoder, frameMediaPicture, packet);
        }

        flushCache(muxer, encoder, packet);
        muxer.close();

    }

    private void writeFrame(Muxer muxer, Encoder encoder, MediaPicture picture, MediaPacket packet) {
        do {
            encoder.encode(packet, picture);
            if (packet.isComplete())
                muxer.write(packet, false);
        } while (packet.isComplete());
    }

    private void flushCache(Muxer muxer, Encoder encoder, MediaPacket packet) {
        writeFrame(muxer, encoder, null, packet);
    }

    private void logFps(int currentFrame, long startTime) {
        if (currentFrame > 0 && currentFrame % 100 == 0) {
            // Use fractional seconds to avoid division by zero when the
            // first 100 frames are encoded in under a second
            double elapsedSeconds = (System.currentTimeMillis() - startTime) / 1000.0;
            if (elapsedSeconds > 0) {
                int fps = (int) (currentFrame / elapsedSeconds);
                LOG.info(String.format("Decode video. Frame [%s]. FPS [%s]", currentFrame, fps));
            }
        }
    }

    private MediaPicture createMediaPicture(Encoder encoder) {
        final MediaPicture picture = MediaPicture.make(encoder.getWidth(), encoder.getHeight(), DEFAULT_PIXEL_FORMAT);
        picture.setTimeBase(getFrameRate());
        return picture;
    }

    private Encoder createEncoder(MuxerFormat muxerFormat) {
        final Codec codec = Codec.findEncodingCodec(muxerFormat.getDefaultVideoCodecId());

        Encoder encoder = Encoder.make(codec);
        encoder.setProperty("preset", DEFAULT_PRESET);

        encoder.setWidth(width);
        encoder.setHeight(height);
        encoder.setPixelFormat(DEFAULT_PIXEL_FORMAT);
        encoder.setTimeBase(getFrameRate());

        if (muxerFormat.getFlag(MuxerFormat.Flag.GLOBAL_HEADER))
            encoder.setFlag(Encoder.Flag.FLAG_GLOBAL_HEADER, true);

        encoder.open(null, null);
        return encoder;
    }

    private Rational getFrameRate() {
        return Rational.make(1, fps);
    }

    private BufferedImage openSourceImage(File input) throws IOException {
        return convertImageToBGR(ImageIO.read(input));
    }

    private double getFrameOffset(BufferedImage sourceImage, final double framesRequired) {
        return (double) (sourceImage.getHeight() - height) / framesRequired;
    }

    private BufferedImage convertImageToBGR(BufferedImage sourceImage) {
        BufferedImage image = new BufferedImage(sourceImage.getWidth(), sourceImage.getHeight(), BufferedImage.TYPE_3BYTE_BGR);
        image.getGraphics().drawImage(sourceImage, 0, 0, null);

        return image;
    }

    private BufferedImage cropImage(BufferedImage src, Rectangle rect) {
        BufferedImage clipped = src.getSubimage(rect.x, rect.y, rect.width, rect.height);
        BufferedImage out = new BufferedImage(clipped.getWidth(), clipped.getHeight(), clipped.getType());
        out.getGraphics().drawImage(clipped, 0, 0, null);
        return out;
    }

}

and create a Main class

package org.dd.webvideo;

import org.dd.webvideo.service.ScreenshotVideoService;
import org.dd.webvideo.service.SeleniumService;

import java.io.File;
import java.util.logging.Logger;

public class Main {

    private static final Logger LOG = Logger.getLogger(Main.class.toString());

    private static final int DEFAULT_WIDTH = 1920;
    private static final int DEFAULT_HEIGHT = 1080;
    private static final int DEFAULT_FPS = 30;
    private static final String DEFAULT_FORMAT = "mp4";

    public static void main(String[] args) {

        if (args.length != 2) {
            LOG.severe("Please provide arguments. Example: java -jar webvideo.jar <URL> <DURATION_SECONDS>");
            System.exit(1);
        }

        String url = args[0];

        int durationSeconds = Integer.parseInt(args[1]);
        int width = DEFAULT_WIDTH;
        int height = DEFAULT_HEIGHT;

        SeleniumService seleniumService = new SeleniumService(width, height);
        ScreenshotVideoService screenshotVideoService = new ScreenshotVideoService(width, height, DEFAULT_FORMAT, DEFAULT_FPS);

        File screenshotFile = seleniumService.getUrlScreenshot(url);
        LOG.info(String.format("Screenshot captured to %s", screenshotFile.getAbsolutePath()));

        final File videoFile = screenshotVideoService.prepareVideo(screenshotFile, durationSeconds);
        LOG.info(String.format("Video saved to %s", videoFile.getAbsolutePath()));

    }

}

Tests

INFO: Starting screenshot video encoding
Jul 30, 2017 12:11:57 AM org.dd.webvideo.service.ScreenshotVideoService recordVideo
INFO: Start decode. Frames required: 1800. Duration: 60 sec. Frame offset: 1.64.
Jul 30, 2017 12:11:59 AM org.dd.webvideo.service.ScreenshotVideoService logFps
INFO: Decode video. Frame [100]. FPS [100]
...
Jul 30, 2017 12:12:18 AM org.dd.webvideo.service.ScreenshotVideoService logFps
INFO: Decode video. Frame [1700]. FPS [85]
Jul 30, 2017 12:12:20 AM org.dd.webvideo.service.ScreenshotVideoService recordVideo
INFO: Decoding done. Took 22035 msec
Jul 30, 2017 12:12:20 AM org.dd.webvideo.Main main
INFO: Video saved to C:\output-1501362717572.mp4

Recording a 60-second video at 30 frames per second took 22035 ms (~22 sec) on a laptop with an Intel Core i7-5600U @ 2.60GHz. Average encoding performance was 85 FPS.

Sample video

Build

You can download the build here.

  • Extract the downloaded archive.
  • Navigate to the bin folder.
  • Run:
    • Windows:
      webvideo.bat <URL> <DURATION_SECONDS>
    • Linux:
      ./webvideo <URL> <DURATION_SECONDS>

Sources

Source code will be available soon on GitHub.
