Python Integrated Stock data Retrieval and Stock Filter

This post summarizes the work described in previous posts and presents a consolidated Python module that can retrieve multiple stock data sets and act as a simple stock filter. The flowchart below shows the full sequence of steps taken to run a filter. With the alternative time-saving approach shown in the flowchart, scanning around 500 stocks takes less than 15 minutes. The module can generate a different series of filtered stocks for each criteria file created, and it can be scheduled to run each day before the market opens.

Python Integrated Stock retrieval and filter

The list below shows the post in which each individual script was created.

  1. Getting most recent prices and stock info from Yahoo API: “Extracting stocks info from yahoo finance using python (Updates)”
  2. Criteria filtering: “Filter stocks data using python”
  3. Historical data and dividends (several alternatives):
    1. Scraping from Yahoo API: “Getting historical stock quotes and dividend Info using python”.
    2. Scraping using YQL: “Get historical stock prices using Yahoo Query Language (YQL) and Python”.
    3. Retrieving from a database: “Storing and Retrieving Stock data from SQLite database”.
  4. Company info and company financial data (several alternatives):
    1. Direct scraping: “Direct Scraping Stock Data from Yahoo Finance”.
    2. Scraping using YQL: “Scraping Company info using Yahoo Query Language (YQL) and Python”.
  5. Web scraping for stock technical analysis: “Basic Stock Technical Analysis with python”.

A sample run with a few sets of criteria is shown below. The quantity of stocks left after each filter parameter is displayed. Finally, sample results from one of the runs, the “strict” criteria, are shown. Note that the filtered results depend on the accuracy of the data and on whether a particular parameter is present in the Yahoo database.

The combined run script is Stock_Combine_info_gathering.py, and it is available with the rest of the modules on GitHub.

 List of filter for the criteria: lowprice
----------------------------------------
NumYearPayin4Yr > 3
PERATIO > 4
Qtrly Earnings Growth (yoy) > 0
PERATIO < 15
Pre3rdYear_avg greater OPEN 0 # means current  price lower than 3yr ago

Processing each filter...
----------------------------------------
Current Screen criteria: Greater NumYearPayin4Yr
Modified_df qty: 142
Current Screen criteria: Greater PERATIO
Modified_df qty: 110
Current Screen criteria: Less PERATIO
Modified_df qty: 66
Current Screen criteria: Compare Pre3rdYear_avg,OPEN
Modified_df qty: 19

END

List of filter for the criteria: highdivdend
----------------------------------------
NumYearPayin4Yr > 3
LeveredFreeCashFlow > -1
TRAILINGANNUALDIVIDENDYIELDINPERCENT > 5
PRICEBOOK < 1.5
TrailingAnnualDividendYieldInPercent < 100
TotalDebtEquity < 50

Processing each filter...
----------------------------------------
Current Screen criteria: Greater NumYearPayin4Yr
Modified_df qty: 142
Current Screen criteria: Greater LeveredFreeCashFlow
Modified_df qty: 107
Current Screen criteria: Greater TRAILINGANNUALDIVIDENDYIELDINPERCENT
Modified_df qty: 30
Current Screen criteria: Less PRICEBOOK
Modified_df qty: 25
Current Screen criteria: Less TotalDebtEquity
Modified_df qty: 20
END

List of filter for the criteria: strict
----------------------------------------
CurrentRatio > 1.5
EPSESTIMATECURRENTYEAR > 0
DilutedEPS > 0
ReturnonAssets > 0
NumYearPayin4Yr > 2
PERATIO > 4
LeveredFreeCashFlow > 0
TRAILINGANNUALDIVIDENDYIELDINPERCENT > 2
PERATIO < 15
TotalDebtEquity < 70
PRICEBOOK < 1.5
PEGRatio < 1.2
YEARHIGH greater OPEN 0

Processing each filter...
----------------------------------------
Current Screen criteria: Greater CurrentRatio
Modified_df qty: 139
Current Screen criteria: Greater EPSESTIMATECURRENTYEAR
Modified_df qty: 42
Current Screen criteria: Greater DilutedEPS
Modified_df qty: 41
Current Screen criteria: Greater ReturnonAssets
Modified_df qty: 37
Current Screen criteria: Greater NumYearPayin4Yr
Modified_df qty: 32
Current Screen criteria: Greater PERATIO
Modified_df qty: 32
Current Screen criteria: Greater LeveredFreeCashFlow
Modified_df qty: 20
Current Screen criteria: Greater TRAILINGANNUALDIVIDENDYIELDINPERCENT
Modified_df qty: 15
Current Screen criteria: Less PERATIO
Modified_df qty: 8
Current Screen criteria: Less TotalDebtEquity
Modified_df qty: 7
Current Screen criteria: Less PRICEBOOK
Modified_df qty: 5
Current Screen criteria: Less PEGRatio
Modified_df qty: 5
Current Screen criteria: Compare YEARHIGH,OPEN
Modified_df qty: 5
END
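
The step-by-step narrowing shown in the logs above can be approximated with a short sketch. This is not the code from the filter module itself. It is a minimal illustration that assumes each criteria line has the form "FIELD > value" or "FIELD < value" and that a stock with a missing value fails the criterion. The helper names (parse_criteria, apply_criteria) and the sample records are made up for illustration, and plain lists of dicts stand in for the pandas DataFrame. The "greater" column-comparison form (e.g. YEARHIGH greater OPEN 0) is skipped because its exact semantics are not spelled out here.

```python
import operator

# Only the simple "FIELD > value" / "FIELD < value" forms are handled here.
OPERATORS = {">": operator.gt, "<": operator.lt}


def parse_criteria(lines):
    """Turn lines such as 'PERATIO < 15' into (field, symbol, threshold) tuples."""
    criteria = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        parts = line.rsplit(None, 2)          # field names may contain spaces
        if len(parts) != 3 or parts[1] not in OPERATORS:
            continue  # e.g. the 'greater' column-comparison form is skipped
        criteria.append((parts[0], parts[1], float(parts[2])))
    return criteria


def apply_criteria(stocks, criteria):
    """Apply each criterion in turn and record how many stocks remain."""
    remaining = list(stocks)
    counts = []
    for field, symbol, threshold in criteria:
        compare = OPERATORS[symbol]
        # Assumption: a stock with a missing value fails the criterion.
        remaining = [s for s in remaining
                     if s.get(field) is not None and compare(s[field], threshold)]
        counts.append(len(remaining))
        print("Current Screen criteria: %s %s %s -> qty: %d"
              % (field, symbol, threshold, len(remaining)))
    return remaining, counts


# Made-up records for illustration only; not real market data.
sample_stocks = [
    {"SYMBOL": "AAA", "PERATIO": 8.0, "PRICEBOOK": 1.1},
    {"SYMBOL": "BBB", "PERATIO": 22.0, "PRICEBOOK": 0.9},
    {"SYMBOL": "CCC", "PERATIO": 3.0, "PRICEBOOK": 1.2},
    {"SYMBOL": "DDD", "PERATIO": None, "PRICEBOOK": 0.7},
]
sample_criteria = parse_criteria([
    "PERATIO > 4",
    "PERATIO < 15",
    "PRICEBOOK < 1.5",
    "YEARHIGH greater OPEN 0",  # skipped by this sketch
])
filtered, counts = apply_criteria(sample_stocks, sample_criteria)
```

Because the criteria are applied one after another, the order of the lines changes the intermediate quantities but not the final set of stocks.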

 Results from “strict” criteria:

sample stock results

36 comments

  1. I am trying to work with these scripts on Ubuntu 15.04. They require many modules. Is there a list of the modules and where to get them? Please advise.

    1. Hi Zealny, thanks for the feedback. For each submodule, I usually list the required modules and links in the individual post. You can also install them using pip. However, I will try to write a separate post listing the required modules, as I know it can be confusing.

      For now, the key modules (links found in the posts) are as below:
      1. Pandas (data frame and data tables)
      2. Pattern (web related)
      3. simplejson (json handling)

      There are some additional modules that are used by more specific modules, such as:
      1. scipy (regression and other scientific function)
      2. matplotlib (plotting)
      3. difflib (string comparison)
      4. pypushbullet (notification)

      Below are two modules that can be found in my github (spidezad):
      1. Dict_create_fr_text (get dict fr text)
      2. xls_table_extract_module (get table fr excel)

      Note that the main module “yahoo_finance_data_extract” requires Excel (Windows) to extract certain parameters. This can be disabled if you are not running on a Windows system. You can email me if you need help with this. Thanks.
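
      One way to make the Excel dependency optional is to test whether the helper module imports and fall back to a hard-coded list when it does not. The sketch below is not how the repository does it; the function name excel_helper_available is hypothetical, and the fallback symbols are the ones quoted in a later reply in this thread.

```python
import importlib


def excel_helper_available(module_name="xls_table_extract_module"):
    """Return True if the Excel helper module can be imported, else False."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Typically raised on Linux/OSX, where win32com is not installed.
        return False
    return True


if excel_helper_available():
    # Placeholder: read the watchlist from the Excel table as the scripts do.
    watchlist = None
else:
    # Hard-coded fallback list of stock symbols.
    watchlist = ["O5RU", "A17U", "B20"]

print("Using Excel helper: %s" % excel_helper_available())
```

      Only ImportError is caught, so any other failure inside the helper module still surfaces as a normal traceback.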

      1. Hello Kok Hua,

        Thank you for helping me.
        I installed all modules except difflib. I could not find it online.

        I am having a problem, as follows:
        >python Stock_Combine_info_gathering.py
        Unable to use the GUI function.
        Traceback (most recent call last):
        File "Stock_Combine_info_gathering.py", line 40, in <module>
        from SGX_stock_announcement_extract import SGXDataExtract
        File "SGX_stock_announcement_extract.py", line 49, in <module>
        from xls_table_extract_module import XlsExtractor
        ImportError: No module named xls_table_extract_module

        Help me.

      2. Hi Kok Hua,

        How do I disable it so that I can use this in OSX?

        Thank you.
        Jessie

      3. Hi Jessie,

        You can comment out the module (xls extractor). The module is mainly for extracting the list of stock symbols. You can replace the lines where it calls the module function with an appropriate list.

        You can refer to my reply to Alberto on Mar 5, 2016 for more details. Just scroll further down to find it.

        Hope it helps.

    1. Hi Zealny, you should be able to import difflib directly; it is already part of the Python standard library.

      As for pyExcel, it just provides a more convenient way for me to set the parameters needed for the data retrieval. You can bypass it by setting them directly in the yahoo_finance_data_extract module, as below:
      self.enable_form_properties_fr_exceltable = 0 # set to zero
      self.cur_quotes_property_str = 'nsl1opvkj' #default list of properties to copy. can set the properties here.

      The list of properties can be found in the below url link:
      https://code.google.com/p/yahoo-finance-managed/wiki/enumQuoteProperty
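
      To see what a property string such as 'nsl1opvkj' selects, it can be split into its individual tags. The sketch below is only an illustration: split_property_str is a hypothetical helper, and the tag meanings are listed as I recall them from the old Yahoo CSV quote API, so please check them against the property list linked above.

```python
import re

# Tag meanings as recalled from the old Yahoo CSV quote API (verify before use).
QUOTE_TAGS = {
    "n": "Name",
    "s": "Symbol",
    "l1": "Last trade price",
    "o": "Open",
    "p": "Previous close",
    "v": "Volume",
    "k": "52-week high",
    "j": "52-week low",
}


def split_property_str(property_str):
    """Split a tag string into tags: one letter, optionally followed by a digit."""
    return re.findall(r"[a-z]\d?", property_str)


tags = split_property_str("nsl1opvkj")
for tag in tags:
    print("%s -> %s" % (tag, QUOTE_TAGS.get(tag, "unknown")))
```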

      Hope that helps

  2. Hello Kok Hua,

    Thanks for the great work, this code is amazing!

    I’m currently trying to run the combined run script Stock_Combine_info_gathering.py, however I’m getting the following error:

    File "Stock_Combine_info_gathering.py", line 40, in <module>
    from SGX_stock_announcement_extract import SGXDataExtract
    File "/Users/Alberto/Desktop/yahoo_finance_data_extract-master/SGX_stock_announcement_extract.py", line 49, in <module>
    from xls_table_extract_module import XlsExtractor
    File "/Users/Alberto/Desktop/yahoo_finance_data_extract-master/xls_table_extract_module.py", line 32, in <module>
    from pyExcel import UseExcel
    File "/Users/Alberto/Desktop/yahoo_finance_data_extract-master/pyExcel.py", line 126, in <module>
    import win32com.client.dynamic
    ImportError: No module named win32com.client.dynamic

    I’m using a Mac… I read in the comments section that you have a way to avoid this error when you are not using windows. Can you please help me?

    Thanks in advance!

    1. Hi Alberto,

      Thanks for your comments and feedback. I mainly use the XlsExtractor to retrieve certain settings, such as the company list or parameter list, which can easily be replaced by a plain list. You can remove all instances of XlsExtractor and replace them with a list of the required parameters. For example,

      Under SGX_stock_announcement_extract.py (Line 118):
      ## target stocks for announcements -- using excel query
      xls_set_class = XlsExtractor(fname = r'C:\data\stockselection_for_sgx.xls', sheetname= 'stockselection',
      param_start_key = 'stock//', param_end_key = 'stock_end//',
      header_key = 'header#2//', col_len = 2)
      xls_set_class.open_excel_and_process_block_data()
      self.announce_watchlist = xls_set_class.data_label_list #also get the company name

      You can comment all of this out and replace self.announce_watchlist with a list of stock symbols, and self.companyname_watchlist with a list of the corresponding company names:

      self.announce_watchlist = ['O5RU', 'A17U', 'B20']
      self.companyname_watchlist = ['AIMSAMP Cap Reit', 'Ascendas Reit', 'Biosensors']

      Hope that helps.

  3. I’m getting an error and having a lot of trouble figuring out the problem:

    Traceback (most recent call last):
    File "Stock_Combine_info_gathering.py", line 34, in <module>
    from Basic_data_filter import InfoBasicFilter
    File "/home/corncob/Projects/Stocks/yahoo_finance_data_extract-master/Basic_data_filter.py", line 46, in <module>
    from DictParser.Dict_create_fr_text import DictParser
    ImportError: No module named DictParser.Dict_create_fr_text

    I installed the DictParser module using “python setup.py install” and checked the dist-package directory, all of the files seem to be present. Any help you can offer is greatly appreciated.

    Some amplifying information:
    Currently using Ubuntu 14.04, Python 2.7.6

    1. Hi Jake, are you able to find the DictParser module folder in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works.

  4. Hi, first of all thanks for this great work !
    I am new to Python and trying to run your scripts.
    Which version of Python are you using for the scripts?
    I get different problems when running with different versions of Python.
    Thanks

      1. I am now having a problem with the pyExcel part.
        I installed pyexcel using pip, and the pyExcel from your GitHub too.

        However, I got this error:
        File "c:\yahoo_finance\SGX_stock_announcement_extract.py", line 49, in <module>
        from xls_table_extract_module import XlsExtractor
        File "C:\Python27\lib\site-packages\xls_table_extract_module.py", line 36, in <module>
        from pyET_tools.pyExcel import UseExcel
        ImportError: No module named pyET_tools.pyExcel

        Can you help?

      2. Hi Kok Hua, I managed to get past the last error.
        Now I am stuck at the stockselection_for_sgx.xls file.
        Can you share the format of this xls file so that I can create my own list?
        Thanks

  5. Hi, great work on this code. I am getting an error “ImportError: cannot import name pyExcel” but I have already installed pyExcel (from Git). Also, there are some references to pyET_tools which does not exist. Can you help?

    1. Hi, are you able to find the PyExcel module folder in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works.

  6. Hey Kok Hua, I am getting an error saying there is no module named DictParser. I read a previous comment from someone else who was struggling with it. I downloaded the DictParser module and ran setup.py install. Everything ran correctly, but it still won't import. After checking the site-packages folder, I have DictParser-1.0-py2.7.egg-info (a file, not a folder) and Dict_create_fr_text.py, both directly inside site-packages; there is no DictParser folder, so it doesn't look right at all. How do I fix this?

    1. Hi geoff, sorry to hear about the issue. Are you able to find the DictParser module folder in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works. If there is no DictParser folder, please create one in site-packages and copy both the __init__.py and the Dict_create_fr_text.py into it. Hope that helps. Please let me know if that does not work for you. Thanks.
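
      The folder fix described above can be sketched in a few lines. This is only an illustration run against a throwaway temporary directory that stands in for site-packages; make_package is a hypothetical helper, and the one-class file written here is a stand-in for the real Dict_create_fr_text.py.

```python
import os
import shutil
import sys
import tempfile


def make_package(site_packages, package_name, module_file):
    """Create site_packages/package_name with an __init__.py and copy module_file in."""
    package_dir = os.path.join(site_packages, package_name)
    if not os.path.isdir(package_dir):
        os.makedirs(package_dir)
    # An empty __init__.py marks the folder as an importable package.
    open(os.path.join(package_dir, "__init__.py"), "a").close()
    shutil.copy(os.path.join(site_packages, module_file), package_dir)
    return package_dir


# Throwaway directory standing in for site-packages (assumption for the demo).
demo_site = tempfile.mkdtemp()
with open(os.path.join(demo_site, "Dict_create_fr_text.py"), "w") as handle:
    handle.write("class DictParser(object):\n    pass\n")  # stand-in module body

package_dir = make_package(demo_site, "DictParser", "Dict_create_fr_text.py")
package_files = sorted(name for name in os.listdir(package_dir) if name.endswith(".py"))
print(package_files)

# With the package folder in place, the import used by the scripts resolves.
sys.path.insert(0, demo_site)
from DictParser.Dict_create_fr_text import DictParser
print(DictParser.__name__)
```

      For a real installation, site_packages would be the actual site-packages path and the temporary file creation would be dropped.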

      1. Hi IKEL, regarding your question about getting-started instructions, unfortunately I do not have any on hand. I will try to write some in the near future. For now, perhaps you can start with Stock_Combine_info_gathering.py, which runs the scripts. There is an option to run different parts of the module at line 85:
        partial_run = ['a2', 'b', 'c_pre', 'c', 'd_pre', 'e', 'f', 'g']  # e is storing data

        For a simpler version to run, you can try the other script, which is easier to use:
        https://github.com/spidezad/google_screener_data_extract.

        Hope that helps

    1. Hi IKEL, regarding the error “YFinanceDataExtr not defined”, can you check that the yahoo_finance_data_extract.py file is in the same directory as the script you are running (Stock_Combine_info_gathering.py)? It should work if they are all in the same directory.

      Let me know if you still have problems running.

    2. I have some “get blown out” instruction…
      Try Running It!
      I was not able to run it either.
      No DictParser & other stuff.

      self.FeelingStupid = True

      1. Hi Avraham, most of the Yahoo stock APIs are no longer working, so you might have trouble running this. Let me see if I can come up with a new post for a working version.

  7. Hello,

    First of all, great job!

    Secondly, I’m having the following error while compiling it in anaconda navigator environment:

    File "Stock_Combine_info_gathering.py", line 82
    print time.ctime()
    ^
    SyntaxError: invalid syntax

    1. Hi Slasanto, thank you for your compliment. 🙂 For the Anaconda environment, are you using Python 3.x? If so, the scripts will not work, as they are written for Python 2.x.
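
      A small version check can turn the SyntaxError into a clearer message. The sketch below is not part of the repository, and python2_required_message is a hypothetical helper. It would need to live in a separate launcher file that contains no Python 2-only syntax, because a file with `print time.ctime()` fails to compile on Python 3 before any check inside it can run.

```python
import sys


def python2_required_message(version_info=sys.version_info):
    """Return a warning string if the interpreter is not Python 2, else None."""
    if version_info[0] != 2:
        return ("These scripts use Python 2 syntax (e.g. `print time.ctime()`); "
                "found Python %d.%d" % (version_info[0], version_info[1]))
    return None


message = python2_required_message()
if message:
    print(message)
```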

      1. It worked. Thank you. Now I’m having trouble with DictParser. I installed it, but I can’t see any package named “DictParser” or anything similar. What I do see are the following files:
        “DictParser-1.0-py2.7.egg-info
        Dict_create_fr_text.py
        Dict_create_fr_text.pyc”

        All three are in the anaconda site-packages, more detailed:
        /Users/user-name/anaconda/envs/py27/lib/python2.7/site-packages

      2. Hi Slasanto, are you able to find the DictParser module folder in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works.
