Web Scraping a Client-Side Rendered Page Using Node.js

Published on 5 Aug, 2021

Client-side rendering happens in the browser: the server sends a mostly empty HTML shell, and JavaScript fills in the content afterwards. To scrape such a page we need a headless browser such as puppeteer to execute the JavaScript and render the content. Once we have the rendered HTML, a tool like cheerio makes it easy to select the required elements.
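To make the difference concrete, here is a small sketch that needs no browser. Everything in it is made up for illustration: the HTML shell, the "Sample product" text and the stand-in document object are assumptions, and Node's built-in vm module plays the role of the browser's JavaScript engine. It is not how puppeteer works internally; it only shows why the raw HTML is empty until the script runs.

```javascript
const vm = require("node:vm");

// Hypothetical HTML shell of a client-side rendered page: the product
// title is not in the markup, it is written by the inline script.
const shellHtml = `
<div id="root"></div>
<script>document.getElementById("root").textContent = "Sample product";</script>`;

// What a plain HTTP fetch sees: an empty root element.
const staticRoot = /<div id="root">(.*?)<\/div>/.exec(shellHtml)[1];

// What a browser produces: run the script against a tiny stand-in for `document`.
const root = { textContent: "" };
const fakeDocument = { getElementById: (id) => (id === "root" ? root : null) };
const script = /<script>([\s\S]*?)<\/script>/.exec(shellHtml)[1];
vm.runInNewContext(script, { document: fakeDocument });

console.log(JSON.stringify(staticRoot)); // prints ""
console.log(root.textContent); // prints Sample product
```

A real page does far more than this, which is why a full headless browser is used instead of a hand-written stand-in.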

Here we get an item's name and price from a Flipkart product page.

Create index.js.

Install both packages with npm install puppeteer cheerio, then import them in index.js.

const puppeteer = require("puppeteer");
const cheerio = require("cheerio");

Create a function getData() that retrieves the HTML using puppeteer.

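async function getData() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // PRODUCT_URL: placeholder for the Flipkart product page URL (not given here).
  await page.goto(PRODUCT_URL);
  const data = await page.content();
  await browser.close();
  processData(data);
}

getData();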

The function does the following:

  • Launch a puppeteer browser
  • Open a tab in the browser
  • Navigate to the Flipkart product page
  • Get the rendered content of the page
  • Close the browser
  • Pass the content to processData(), which we create next.
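The steps above can also be sketched in a slightly more defensive form. This is a sketch under stated assumptions, not the original code: getRenderedHtml, its parameters and the injected browserLib argument are hypothetical names. It waits for a selector before reading the HTML, since client-side content may not be present the moment navigation finishes, and it closes the browser in a finally block. The methods it calls (launch, newPage, goto, waitForSelector, content, close) are existing puppeteer methods.

```javascript
// Sketch: fetch the rendered HTML of a page and always close the browser.
// `browserLib` is expected to be the puppeteer module; it is passed in as an
// argument (an assumption of this sketch) so the function can be tried with a stub.
async function getRenderedHtml(browserLib, url, readySelector) {
  const browser = await browserLib.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // Wait until the client-side script has inserted the element we need.
    if (readySelector) {
      await page.waitForSelector(readySelector);
    }
    return await page.content();
  } finally {
    // Runs on success and on failure, so no browser process is left behind.
    await browser.close();
  }
}
```

In index.js this could be called as getRenderedHtml(puppeteer, url, ".B_NuCI"), where url is the product page address and the selector is the title class used later in this article.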

Now create processData(), which extracts the required data using cheerio.

function processData(data) {
  const $ = cheerio.load(data);
  const productTitle = $(".B_NuCI").text();
  const productPrice = $("._16Jk6d").text();
  // Print the extracted values.
  console.log(productTitle);
  console.log(productPrice);
}

The syntax of cheerio is similar to jQuery.
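Class names like the ones above can change when the site is updated, and cheerio's text() returns an empty string when a selector matches nothing. A hedged sketch of a variant that makes this failure visible is shown below. The function name extractProduct and the injected htmlParser argument are hypothetical; the selectors are the ones used in this article.

```javascript
// Sketch: pull the title and price out of rendered HTML.
// `htmlParser` is expected to be the cheerio module (passed in as an
// assumption of this sketch, so it can be exercised without cheerio).
function extractProduct(htmlParser, html) {
  const $ = htmlParser.load(html);
  // .text() returns "" when nothing matches, so an empty string signals
  // that a selector no longer matches the page.
  const title = $(".B_NuCI").text().trim();
  const price = $("._16Jk6d").text().trim();
  if (!title || !price) {
    throw new Error("Selector matched nothing; the page markup may have changed");
  }
  return { title, price };
}
```

With the imports from index.js, this would be called as extractProduct(cheerio, data).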

Run the file using node index.js.

The output looks like this:

acer Aspire 7 Core i5 10th Gen - (8 GB/512 GB SSD/Windows 10 Home/4 GB Graphics/NVIDIA GeForce GTX 1650) A715-75G-50TA/ A715-75G-41G Gaming Laptop  (15.6 inch, Black, 2.15 Kg)
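Scraped text often carries stray whitespace; the title above, for example, has a double space before "(15.6 inch". A minimal sketch of a clean-up helper follows. The name normalizeText is hypothetical and not part of cheerio.

```javascript
// Sketch: collapse runs of whitespace in scraped text and trim the ends.
function normalizeText(text) {
  return text.replace(/\s+/g, " ").trim();
}

// Fragment of the title shown above, which contains a double space.
console.log(normalizeText("Gaming Laptop  (15.6 inch, Black, 2.15 Kg)"));
// prints: Gaming Laptop (15.6 inch, Black, 2.15 Kg)
```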