
Trying To Scrape Table Using Pandas From Selenium's Result

I am trying to scrape a table from a JavaScript website using pandas. For this, I used Selenium to first reach my desired page. I am able to print the table in text format (as show

Solution 1:

You can get the table using the following code:

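import time
from io import StringIO

import pandas as pd
from selenium import webdriver

chrome_path = r"Path to chrome driver"
# Selenium 3 style; Selenium 4 expects webdriver.Chrome(service=Service(chrome_path))
driver = webdriver.Chrome(chrome_path)
url = 'http://www.bursamalaysia.com/market/securities/equities/prices/#/?filter=BS02'

driver.get(url)  # get() returns None, so there is nothing to assign
time.sleep(2)    # crude wait for the JavaScript-rendered table to load

# read_html returns a list of DataFrames; the first one is the price table
df = pd.read_html(StringIO(driver.page_source))[0]
print(df.head())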

This is the output:

No  Code    Name    Rem Last Done   LACP    Chg % Chg   Vol ('00)   Buy Vol ('00)   Buy Sell    Sell Vol ('00)  High    Low
015284CB  LCTITAN-CB  s   0.0250.0200.005   +25.00406550198780.0200.0251066300.0250.015
121201    SUMATEC [S] s   0.0500.050   -   -   389354438150.0500.0551873010.0550.050
235284    LCTITAN [S] s   4.4704.700   -0.230  -4.893673354304.4704.480344.7804.140
340176    KRONO [S]   -   0.8750.8050.070   +8.7030047337700.8700.8757970.9000.775
455284CE  LCTITAN-CE  s   0.1300.135   -0.005  -3.7029237972140.1250.130500.1550.100

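To get data from all pages, you can crawl the remaining pages, read each one into its own DataFrame, and combine them. The original suggestion was df.append, but DataFrame.append was removed in pandas 2.0, so use pd.concat on current versions.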
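
As a minimal sketch of the combining step only: the helper name combine_pages and the stand-in frames below are hypothetical, and the pagination itself is not shown because the thread does not describe how the site moves between pages. In a real script each frame would come from reading driver.page_source after navigating to that page.

```python
import pandas as pd


def combine_pages(page_frames):
    """Stack per-page tables into one DataFrame with a fresh 0..n-1 index."""
    return pd.concat(page_frames, ignore_index=True)


# Stand-in frames (assumption): in the real script each one would be
# pd.read_html(...)[0] for a different page of results.
page_1 = pd.DataFrame({"Code": ["5284CB", "1201"], "Name": ["LCTITAN-CB", "SUMATEC"]})
page_2 = pd.DataFrame({"Code": ["0176"], "Name": ["KRONO"]})

all_rows = combine_pages([page_1, page_2])
print(all_rows)
```

Collecting the frames in a list and concatenating once at the end is usually cheaper than growing a DataFrame inside the loop, since each concatenation copies the data.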

Solution 2:

Answer:

target = driver.find_elements_by_id('bm_equities_prices_table')
df = pd.read_html(target[0].get_attribute('outerHTML'))


Reason for target[0]:

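driver.find_elements_by_id('bm_equities_prices_table') returns a list of Selenium WebElements. In your case the list holds only one element, hence [0]. (In Selenium 4 the same lookup is written driver.find_elements(By.ID, 'bm_equities_prices_table').)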

Reason for get_attribute('outerHTML'):

We want the HTML of the element. get_attribute accepts two values for this: 'innerHTML' and 'outerHTML'. We chose 'outerHTML' because it includes the element itself (the table tag, and with it the table headers, I suppose) rather than only the contents inside the element.

Reason for df[0]:

pd.read_html() returns a list of DataFrames, one per table it finds. The first one is the result we want, hence [0].
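
To illustrate the list return value without a browser, here is a small sketch. The outer_html string is a hypothetical stand-in for what get_attribute('outerHTML') would return, and the example assumes an HTML parser that pandas can use (such as lxml) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for target[0].get_attribute('outerHTML'):
# the markup includes the <table> element itself, not just its rows.
outer_html = """
<table id="bm_equities_prices_table">
  <tr><th>Code</th><th>Name</th></tr>
  <tr><td>5284CB</td><td>LCTITAN-CB</td></tr>
  <tr><td>1201</td><td>SUMATEC</td></tr>
</table>
"""

# read_html always returns a list, even when only one table is present.
tables = pd.read_html(StringIO(outer_html))
df = tables[0]

print(type(tables).__name__, len(tables))
print(df)
```

Wrapping the markup in StringIO avoids the deprecation warning that recent pandas versions emit when a literal HTML string is passed directly.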
