How to Extract Dashboard Data using Python Web Scraping

Combine various data sources to create a meaningful dashboard

Data and information are consumed at a massive pace in modern society, while at the same time the human attention span has noticeably decreased, partly because of the barrage of stimulation from numerous sources.

Data visualization is powerful because it presents condensed pieces of data in a form that is friendly to the human brain, which is why it keeps growing in popularity. Finance and economics have traditionally been among the domains where data visualization is heavily used, and today is no exception.

These observations led to the idea behind this project: merging several macroeconomic and price indices of the Greek economy into a single dashboard. The metrics were selected to give a glimpse into the Greek economy that is meaningful to an average person (the dashboard is not suited to investors because it does not dive deep into the details).

The objective of this blog is to describe the methodology and share a few insights on the challenges we faced, which might be helpful to anyone working on a similar project.

Stage 1: Research

As already stated, the metrics for the dashboard were chosen with an average person in mind: what would somebody living in Greece need to know about how the country's economy is developing, based on the indices? We settled on the following 4 pillars:

1. Stock Market Index

In any modern country, the stock market is an important part of the economic ecosystem and is frequently used as an indicator of the country's performance. We decided to use the main index value of the Athens Stock Exchange (ΧΑΑ) and to skip all the separate indices.

2. Fundamental Macro-Economic Metrics

A proper analysis of a country's financial outlook can involve dozens of KPIs, indices, and metrics. We included a handful of them, the ones we consider most relevant to people's daily lives:

  • GDP Growth: The annual rate at which an economy grows
  • Inflation: The rate at which the prices of goods and services increase
  • Unemployment: The percentage of the economically active population that is jobless

3. Bonds

A common way for corporations and countries to raise money today is by issuing bonds. The rate at which a country can borrow money is a very dependable indicator of the country's credibility in the money market. For Greece, with its current economic turmoil, these indicators have become even more important.

We included the long-term bonds that the country has issued over the past years: both the 10-year and the 20-year bonds.

4. Energy Prices

Of all the price indices, those concerning energy have the most significant impact on economic activity and, consequently, on the buying power of the population. We chose to include graphs showing the price variations of the most frequently used energy sources: diesel and unleaded as representative fuels, and electricity as the key household energy source.

Stage 2: Getting Data Sources

Using an API to retrieve all the data listed above proved not to be an option, because the applicable APIs were either inadequate or too expensive. The next available option was to use web scraping to retrieve the data from existing websites. This alternative carries the following risks:

a) Websites that collect data (particularly those offering paid APIs for access to their data) might enforce a strict anti-scraping policy, which could cut off the dashboard's access to the data.

b) Web scraping depends on the structure of a website at a given moment to locate the required data, so any small change in the HTML or CSS code (e.g. renaming a table class) could break the scraper.

For a non-production project, however, this is a tolerable risk. In addition, we looked for comparatively stable sites as our data sources. The ones we chose:
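Risk (b) can be softened by making the scraper fail with a clear message instead of an obscure error deep in the parsing code. The following is a minimal sketch, not the project's actual code: the function name `find_table`, the exception `LayoutChangedError`, and both HTML snippets are hypothetical and exist only for illustration.

```python
from bs4 import BeautifulSoup


class LayoutChangedError(RuntimeError):
    """Raised when the element the scraper relies on is missing."""


def find_table(html: str, css_class: str):
    """Return the first <table> with the given class, or fail loudly."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=css_class)
    if table is None:
        raise LayoutChangedError(
            f"No <table class='{css_class}'> found; "
            "the site layout may have changed."
        )
    return table


# Hypothetical markup: the same page before and after a class rename.
old_page = "<table class='prices'><tr><td>1.75</td></tr></table>"
new_page = "<table class='price-grid'><tr><td>1.75</td></tr></table>"

print(find_table(old_page, "prices").td.text)

try:
    find_table(new_page, "prices")
except LayoutChangedError as err:
    print("Scraper needs updating:", err)
```

With a check like this, a renamed class shows up as one readable error that points at the selector to update.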

  • Trading Economics: Used for all the macro-economic indices, the stock market index, and bonds
  • Statista: Data source for monthly electricity prices
  • fuelo.net: Data source for average fuel prices (diesel and unleaded)

Stage 3: Writing the Python Code

We used Python together with the powerful BeautifulSoup library to parse the HTML from the source websites. Because X-Byte recognizes only the Pandas DataFrame data structure as potential input, each individual table needs to be converted into a DataFrame.
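As a rough illustration of that conversion, the sketch below parses an HTML table with BeautifulSoup and turns it into a DataFrame. It is not taken from the project's script: the function `table_to_dataframe`, the class name `indicators`, and the sample markup (including its numbers, which are made-up placeholders and not real statistics) are all assumptions.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical fragment with placeholder values; a real page would be
# downloaded from the source website first.
SAMPLE_HTML = """
<table class="indicators">
  <tr><th>Indicator</th><th>Last</th><th>Unit</th></tr>
  <tr><td>GDP Growth</td><td>1.2</td><td>%</td></tr>
  <tr><td>Inflation</td><td>2.5</td><td>%</td></tr>
  <tr><td>Unemployment</td><td>9.8</td><td>%</td></tr>
</table>
"""


def table_to_dataframe(html, css_class, numeric_columns=()):
    """Parse the first table with the given class into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=css_class)
    if table is None:
        raise ValueError(f"table with class {css_class!r} not found")

    rows = table.find_all("tr")
    # The first row holds the column names, the rest hold the data.
    header = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    records = [
        [td.get_text(strip=True) for td in row.find_all("td")]
        for row in rows[1:]
    ]

    df = pd.DataFrame(records, columns=header)
    # Scraped cells are text, so numeric columns are converted explicitly.
    for column in numeric_columns:
        df[column] = pd.to_numeric(df[column])
    return df


indicators = table_to_dataframe(SAMPLE_HTML, "indicators", ["Last"])
print(indicators)
```

Converting the numeric columns inside the function keeps each table ready for charting without further cleanup in Python.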

We divided our code into 4 functions (one for each data retrieval procedure) and, after making sure each returns data in the correct format, we used this code as our data source in X-Byte (described in Stage 4).

The schematic shows the steps, starting with the combined Python script and ending with the final data models used for the dashboard. In a nutshell:

1. HTML retrieval

2. HTML parsing and conversion to DataFrames

3. Power Query conversion of the DataFrames to data tables

We will not go into more detail on the Python script here; you can look at the source code in the repository.
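The first two steps could be wired together roughly as follows. This is a sketch under stated assumptions rather than the repository's code: every function name, the `example.com` URL, and the sample values are hypothetical, and the standard library's `urllib` is used for the download only to keep the example dependency-light. The demo swaps in a fake downloader so it runs without network access.

```python
from urllib.request import Request, urlopen

import pandas as pd
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Step 1: download the raw HTML of a page."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_rows(html: str) -> list:
    """Step 2a: extract the cell text of every table row."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in soup.find_all("tr")
    ]


def rows_to_dataframe(rows: list) -> pd.DataFrame:
    """Step 2b: treat the first row as the header, the rest as data."""
    if not rows:
        return pd.DataFrame()
    return pd.DataFrame(rows[1:], columns=rows[0])


def build_table(url: str, fetch=fetch_html) -> pd.DataFrame:
    """Run retrieval and parsing for one source page."""
    return rows_to_dataframe(parse_rows(fetch(url)))


def fake_fetch(url: str) -> str:
    """Offline stand-in for fetch_html, returning placeholder markup."""
    return (
        "<table><tr><th>Date</th><th>Price</th></tr>"
        "<tr><td>2021-01-01</td><td>1.50</td></tr></table>"
    )


# Step 3 happens outside Python: the importing tool is assumed to pick up
# top-level DataFrames such as this one and turn them into data tables.
fuel_prices = build_table("https://example.com/fuel", fetch=fake_fetch)
print(fuel_prices)
```

Passing the downloader in as a parameter makes the parsing logic testable against saved HTML, which helps when a source site changes its layout.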

Stage 4: X-Byte Configuration & Data Presenting

X-Byte provides many ways to import data into a model, one of which is using a Python script.

Power Query runs the script, gets the data, and stores it in separate tables. The next step is to verify that the data types Power Query has inferred are the appropriate ones, and to change them where necessary.

When the data model looks like the one in the screenshot, we are all set to jump into the visualization process!
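One way to reduce the amount of type correction needed after the import is to set the column types inside the Python script before the data leaves it. The sketch below is an optional complement to checking the types in Power Query, not a description of the original script; the helper `coerce_types`, the column names, and the sample values are assumptions.

```python
import pandas as pd


def coerce_types(df, date_columns=(), numeric_columns=()):
    """Return a copy of df with explicit date and numeric column types.

    Values that cannot be converted become NaT/NaN instead of raising,
    so a single bad cell does not stop the whole refresh.
    """
    out = df.copy()
    for column in date_columns:
        out[column] = pd.to_datetime(out[column], errors="coerce")
    for column in numeric_columns:
        out[column] = pd.to_numeric(out[column], errors="coerce")
    return out


# Scraped cells arrive as text; these are placeholder values.
raw = pd.DataFrame(
    {
        "Date": ["2021-01-01", "2021-02-01", "2021-03-01"],
        "Price": ["1.50", "1.55", "n/a"],
    }
)

clean = coerce_types(raw, date_columns=["Date"], numeric_columns=["Price"])
print(clean.dtypes)
```

Using `errors="coerce"` trades strictness for robustness: unparseable cells become missing values, so it is worth checking how many were coerced before trusting a chart.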

The visualization itself is quite simple; we used only 3 kinds of visuals: Cards, Line Charts, and a custom scroller.

Cards show all the macro values (unemployment, inflation, GDP growth) as well as the bond prices and their variations.

The scroller shows the stock index, while the line charts add some interactivity to the dashboard because they show an entire set of values that can be previewed when a user hovers the mouse over the graphic.

Finally, the slicer linked to the line charts can be used to shorten or extend the period of fuel price observation.

For more details, contact X-Byte Enterprise Crawling or ask for a free quote!
