Simple Python Script to retrieve all stocks data from Google Finance Screener

A simple Python script to retrieve key financial metrics for all stocks from the Google Finance Screener. The Google screener has more metrics available than the SGX screener and also contains comprehensive stock data for various stock exchanges.

In addition, retrieving data from the Google Screener is much faster than retrieving it from Yahoo Finance or the Yahoo Finance API (see the respective blog posts).

The reason for the fast retrieval is that the information for all stocks is stored in a single JSON document, which reduces the number of requests and downloads. Being in JSON format also allows easy conversion to a Pandas DataFrame object.

To retrieve the JSON URL of the stock data, go to the Google Screener and select the criteria (as is normally done when setting up a filter). Open up each criterion to the full range of that metric. This way, all the stocks will be selected instead of being filtered out. Using the developer tools of any browser, retrieve the full URL. For a further description of how to retrieve the URL, you can refer to the following post: “Getting historic financial statistics of stocks using Python”.

Two points to take note of. Firstly, the URL only includes stocks 1 to 20 due to the page setting. Set the end stock (the num parameter) to a large number, e.g. 3000, to include the full stock list. Below is a sample of the corresponding URL.

https://www.google.com/finance?output=json&start=0&num=3000&noIL=1&q=[%28exchange%20%3D%3D%20%22SGX%22%29%20%26%20%28dividend_next_year%20%3E%3D%200%29%20%26%20%28dividend_next_year%20%3C%3D%201.46%29%20%26%20%28price_to_sales_trailing_12months%20%3C%3D%20850%29]&restype=company&ei=BjE7VZmkG8XwuASFn4CoDg

Secondly, as Google only allows 12 criteria to be set at any one time, you would need to repeat the process multiple times to obtain all the parameters. Repeat the above process, selecting different criteria each time, and join all the parameters together.
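To make the two points above concrete, here is a minimal sketch of how such a URL could be assembled in Python. The function name build_screener_url and its (metric, low, high) tuple format are made up for illustration and are not part of the script on GitHub. The trailing ei value from the sample URL is left out, on the assumption that it is a session token; append it if the request needs it.

```python
from urllib.parse import quote

# Start of the URL as given in this post; treat it as an assumption that it still works.
BASE_URL = "https://finance.google.com/finance?output=json"


def build_screener_url(exchange, criteria, start=0, num=3000):
    """Build a screener URL (hypothetical helper, not the author's code).

    criteria is a list of (metric, low, high) tuples; use None to skip a bound.
    num is set to a large value so that all stocks are returned, not just 20.
    """
    clauses = ['(exchange == "{}")'.format(exchange)]
    for metric, low, high in criteria:
        if low is not None:
            clauses.append("({} >= {})".format(metric, low))
        if high is not None:
            clauses.append("({} <= {})".format(metric, high))
    # Percent-encode everything, including brackets, spaces and ampersands.
    query = quote(" & ".join(clauses), safe="")
    return "{}&start={}&num={}&noIL=1&q=[{}]&restype=company".format(
        BASE_URL, start, num, query
    )


# Same criteria as the sample URL above.
url = build_screener_url(
    "SGX",
    [
        ("dividend_next_year", 0, 1.46),
        ("price_to_sales_trailing_12months", None, 850),
    ],
)
print(url)
```

Since only 12 criteria fit in one request, the same function can be called several times with different criteria lists, giving one URL per batch of metrics.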

Once the URL is formed, the rest follows the same process for scraping web data with Python described in most posts on this blog. The main tools are Python Pandas and Python Pattern. Pattern handles the JSON file download, and Pandas converts the JSON file to a DataFrame, which can then be joined with the other parameters.
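The convert-and-join step can be sketched roughly as follows. This is not the script from GitHub: the two JSON strings are invented stand-ins for two downloads with different criteria, the field names (ticker, MarketCap, PE, DividendYield) are assumptions, and the real response layout may need some unpacking before it looks like a flat list of records.

```python
import json
from functools import reduce

import pandas as pd

# Made-up payloads standing in for two separate screener downloads.
payload_a = (
    '[{"ticker": "D05", "MarketCap": 50.1, "PE": 11.2},'
    ' {"ticker": "O39", "MarketCap": 40.3, "PE": 10.5}]'
)
payload_b = (
    '[{"ticker": "D05", "DividendYield": 4.1},'
    ' {"ticker": "O39", "DividendYield": 4.5}]'
)


def json_to_frame(text):
    """Parse a JSON list of records into a DataFrame, one row per stock."""
    return pd.DataFrame(json.loads(text))


def join_frames(frames, key="ticker"):
    """Merge all frames on a shared key column, keeping every stock seen."""
    return reduce(
        lambda left, right: pd.merge(left, right, on=key, how="outer"), frames
    )


combined = join_frames([json_to_frame(payload_a), json_to_frame(payload_b)])
print(combined)
```

An outer merge is used here so that a stock missing from one batch of criteria still appears in the result, with empty values for the metrics it lacks.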

The difficult part of the script is to obtain the url. Once the url is known, other methods can be employed to download and read the data from the json file.

The script (for all stocks in Singapore) is available on GitHub. Because the URL is long, the script forms the full URL by concatenating the start and end URL with the middle portion (all the criteria), which is stored in a file. The file is also on GitHub.
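As a rough illustration of that concatenation (not the actual code on GitHub), the idea could look like this. The function name form_full_url, the end URL value and the temporary criteria file are all assumptions made for the example; the start URL is the one given in the update below.

```python
import os
import tempfile

START_URL = "https://finance.google.com/finance?output=json&start=0&num=3000&noIL=1&q="
END_URL = "&restype=company"  # assumed from the tail of the sample URL


def form_full_url(criteria_path, start_url=START_URL, end_url=END_URL):
    """Join start URL + criteria read from a file + end URL (hypothetical helper)."""
    with open(criteria_path, encoding="utf-8") as handle:
        # The criteria may be split over several lines; strip and rejoin them.
        middle = "".join(line.strip() for line in handle)
    return start_url + middle + end_url


# Demo with a throwaway criteria file holding already-encoded criteria.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("[%28exchange%20%3D%3D%20%22SGX%22%29\n")
    tmp.write("%20%26%20%28dividend_next_year%20%3E%3D%200%29]\n")
    criteria_file = tmp.name

full_url = form_full_url(criteria_file)
os.remove(criteria_file)
print(full_url)
```

Keeping the criteria in a separate file means the long middle portion can be edited without touching the script itself.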

 (Update: Thanks to Bob1: the start url is changed to https://finance.google.com/finance?output=json&start=0&num=3000&noIL=1&q=)


20 comments

    1. Hi Josiah, thanks for your comment. I am not sure of the difference between the first and second URL, but I assume it is the &num=3000 part.
      You are right that increasing num to a very big value will get all the stocks. 🙂 Thanks.

  1. Excellent piece of code. When using it with the LON stock exchange there were some ASCII errors with the £ character. To get round this I altered the statement at the end to force UTF-8 encoding. This seemed to get round the problems and allowed the CSV to generate successfully. I also used the tip mentioned above to retrieve all stocks.
    hh.result_google_ext_df.to_csv(r'c:\data\LON.csv', index=False, encoding='utf-8')
    Thanks, keep up the good work.

    1. Hi Shah, the Google Finance stock screener does not have the day high and day low parameters, so you would need to get them from elsewhere. I am not sure which exchange you are looking at, but you can try pulling them from the local exchange website or Yahoo Finance and joining that information to the data retrieved by the Google script.

      While not exactly relevant, the following post demonstrates how to retrieve information from a local exchange (in this case, SGX). You can apply a similar concept to retrieve daily price data. https://simply-python.com/2015/04/15/retrieving-stock-news-and-ex-date-from-sgx-using-python.

      Hope it helps.

      1. Hi, thanks for your reply. I am looking for data from NSE (India). I am not able to get a single JSON to retrieve the HIGH / LOW of all stocks (1800-odd stocks).

      2. Hi Shah, as mentioned in my earlier comment, the link I provided is only for the Singapore Stock Exchange and can only be used as a reference.

        Do you have a website that has the NSE (India) high and low information? If yes, perhaps I can help take a look.

  2. Does anyone know where I can find all the URL parameters available? The screener itself is down but the URLs to the JSON data still work.

    I’m looking specifically for Quote change (%) parameter.

    1. Hi Scott, the stock screener is still up as far as I know.

      You can refer to the following post to retrieve the JSON URL: “Getting historic financial statistics of stocks using Python”. First select the parameter you require (in this case, Quote Change) in the screener. Then open the developer tools in the Chrome web browser, select Network and find the JSON output. Right-click to copy the URL or open it in a new tab.

      Hope that helps!

      1. Sorry. Do I run the following code on terminal?

        hh = GoogleStockDataExtract()
        hh.target_exchange = 'NASDAQ'
        hh.retrieve_all_stock_data()
        print(hh.result_google_ext_df.head())
        hh.result_google_ext_df.to_csv(r'c:\data\temp.csv', index=False)  # any save file name

  3. Hi, this had been working very well until about 2 weeks ago, when it stopped working properly, and it now only returns 20 rows. I can't work out what the problem is, but I presume Google has changed something on the stock screener? Any ideas? It was working perfectly up until this point…
