This post summarizes the work described in previous posts and presents a consolidated Python module that can retrieve multiple stock data sets and act as a simple stock filter. The flowchart below shows the full set of steps taken to run a filter. With the alternative time-saving approach shown in the flowchart, scanning around 500 stocks takes less than 15 minutes. The module can generate a different series of filtered stocks for each criteria file created, and it can be scheduled to run each day before the market opens.
The list below shows the post in which each individual script was introduced.
- Getting most recent prices and stock info from Yahoo API: “Extracting stocks info from yahoo finance using python (Updates)”
- Criteria filtering: “Filter stocks data using python”
- Historical data and dividends (several alternatives):
  - Scraping from Yahoo API: “Getting historical stock quotes and dividend Info using python”.
  - Scraping using YQL: “Get historical stock prices using Yahoo Query Language (YQL) and Python”.
  - Retrieving from database: “Storing and Retrieving Stock data from SQLite database”.
- Company info and company financial data (several alternatives):
  - Direct scraping: “Direct Scraping Stock Data from Yahoo Finance”.
  - Scraping using YQL: “Scraping Company info using Yahoo Query Language (YQL) and Python”.
- Web scraping for stock technical analysis: “Basic Stock Technical Analysis with python”.
A sample run with a few sets of criteria is shown below. The quantity of stocks left after each filter parameter is applied is displayed. Finally, sample results from one of the runs, the “strict” criteria, are shown. Note that the filtered results depend on the accuracy of the Yahoo database and on whether the particular parameter is present in it.
The combined run script is Stock_Combine_info_gathering.py, and it is available with the rest of the modules on GitHub.
List of filter for the criteria: lowprice
—————————————-
NumYearPayin4Yr > 3
PERATIO > 4
Qtrly Earnings Growth (yoy) > 0
PERATIO < 15
Pre3rdYear_avg greater OPEN 0 # means current price lower than 3yr ago
Processing each filter…
—————————————-
Current Screen criteria: Greater NumYearPayin4Yr
Modified_df qty: 142
Current Screen criteria: Greater PERATIO
Modified_df qty: 110
Current Screen criteria: Less PERATIO
Modified_df qty: 66
Current Screen criteria: Compare Pre3rdYear_avg,OPEN
Modified_df qty: 19
END
List of filter for the criteria: highdivdend
—————————————-
NumYearPayin4Yr > 3
LeveredFreeCashFlow > -1
TRAILINGANNUALDIVIDENDYIELDINPERCENT > 5
PRICEBOOK < 1.5
TrailingAnnualDividendYieldInPercent < 100
TotalDebtEquity < 50
Processing each filter…
—————————————-
Current Screen criteria: Greater NumYearPayin4Yr
Modified_df qty: 142
Current Screen criteria: Greater LeveredFreeCashFlow
Modified_df qty: 107
Current Screen criteria: Greater TRAILINGANNUALDIVIDENDYIELDINPERCENT
Modified_df qty: 30
Current Screen criteria: Less PRICEBOOK
Modified_df qty: 25
Current Screen criteria: Less TotalDebtEquity
Modified_df qty: 20
END
List of filter for the criteria: strict
—————————————-
CurrentRatio > 1.5
EPSESTIMATECURRENTYEAR > 0
DilutedEPS > 0
ReturnonAssets > 0
NumYearPayin4Yr > 2
PERATIO > 4
LeveredFreeCashFlow > 0
TRAILINGANNUALDIVIDENDYIELDINPERCENT > 2
PERATIO < 15
TotalDebtEquity < 70
PRICEBOOK < 1.5
PEGRatio < 1.2
YEARHIGH greater OPEN 0
Processing each filter…
—————————————-
Current Screen criteria: Greater CurrentRatio
Modified_df qty: 139
Current Screen criteria: Greater EPSESTIMATECURRENTYEAR
Modified_df qty: 42
Current Screen criteria: Greater DilutedEPS
Modified_df qty: 41
Current Screen criteria: Greater ReturnonAssets
Modified_df qty: 37
Current Screen criteria: Greater NumYearPayin4Yr
Modified_df qty: 32
Current Screen criteria: Greater PERATIO
Modified_df qty: 32
Current Screen criteria: Greater LeveredFreeCashFlow
Modified_df qty: 20
Current Screen criteria: Greater TRAILINGANNUALDIVIDENDYIELDINPERCENT
Modified_df qty: 15
Current Screen criteria: Less PERATIO
Modified_df qty: 8
Current Screen criteria: Less TotalDebtEquity
Modified_df qty: 7
Current Screen criteria: Less PRICEBOOK
Modified_df qty: 5
Current Screen criteria: Less PEGRatio
Modified_df qty: 5
Current Screen criteria: Compare YEARHIGH,OPEN
Modified_df qty: 5
END
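The step-by-step screening in the log above can be sketched roughly as follows. This is only a minimal illustration and not the actual filter module: it uses plain dicts instead of a pandas DataFrame, the sample rows are made up, and the names parse_criteria and run_filter are hypothetical. Two-column comparisons such as "Pre3rdYear_avg greater OPEN 0" are not handled here.

```python
import operator

# Comparison symbols as they appear in the criteria listing above.
OPS = {">": operator.gt, "<": operator.lt}


def parse_criteria(lines):
    """Turn lines such as 'PERATIO < 15' into (column, op, threshold) tuples."""
    criteria = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        # rsplit keeps column names that contain spaces in one piece
        column, symbol, value = line.rsplit(None, 2)
        criteria.append((column, OPS[symbol], float(value)))
    return criteria


def run_filter(rows, criteria):
    """Apply each criterion in turn and report how many rows are left."""
    remaining = list(rows)
    for column, op, threshold in criteria:
        # a row with the parameter missing is dropped, not treated as a pass
        remaining = [r for r in remaining
                     if r.get(column) is not None and op(r[column], threshold)]
        print("Screen on %s, qty left: %d" % (column, len(remaining)))
    return remaining


# Made-up sample data, for illustration only.
rows = [
    {"SYMBOL": "AAA", "PERATIO": 10.0, "NumYearPayin4Yr": 4},
    {"SYMBOL": "BBB", "PERATIO": 20.0, "NumYearPayin4Yr": 4},
    {"SYMBOL": "CCC", "PERATIO": 8.0, "NumYearPayin4Yr": 1},
    {"SYMBOL": "DDD", "NumYearPayin4Yr": 4},  # PERATIO not available
]
criteria = parse_criteria(["NumYearPayin4Yr > 3", "PERATIO > 4", "PERATIO < 15"])
result = run_filter(rows, criteria)
```

Dropping rows where a parameter is missing mirrors the note above that results depend on whether the parameter is present in the Yahoo database.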
Results from “strict” criteria:
I am trying to work on these scripts using Ubuntu 15.04. These scripts require many modules. Is there a list of the modules and a place to get them? Please suggest.
Hi Zealny, thanks for the feedback. For each sub-module, I usually include the required modules and links in the individual posts. You can also install them using pip. However, I will try to write a separate post listing the required modules, as I know it can be confusing.
For now, the key modules (links found in the posts) are as below:
1. Pandas (data frame and data tables)
2. Pattern (web related)
3. simplejson (json handling)
There are some additional modules used for more specific functions, such as:
1. scipy (regression and other scientific functions)
2. matplotlib (plotting)
3. difflib (string comparison)
4. pypushbullet (notification)
Below are two modules that can be found in my github (spidezad):
1. Dict_create_fr_text (get dict fr text)
2. xls_table_extract_module (get table fr excel)
Note that the main module “yahoo_finance_data_extract” requires Excel (Windows) to extract certain parameters. This can be disabled if you are not running on a Windows system. You can email me if you need help with this. Thanks.
Hello Kok Hua,
Thank you for helping me.
I installed all modules except difflib. I could not find it online.
I am having a problem as follows:
>python Stock_Combine_info_gathering.py
Unable to use the GUI function.
Traceback (most recent call last):
File "Stock_Combine_info_gathering.py", line 40, in <module>
from SGX_stock_announcement_extract import SGXDataExtract
File "SGX_stock_announcement_extract.py", line 49, in <module>
from xls_table_extract_module import XlsExtractor
ImportError: No module named xls_table_extract_module
Help me.
Hi Zealny, I am not sure if difflib is available in the standard Python 2.7 library. Can you try to just import it? If that fails, you may need to use pip install.
xls_table_extract_module can be found on GitHub as follows: https://github.com/spidezad/excel_table_extract. You also need pyExcel, which is available on GitHub as well.
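On the difflib question, a quick check like the one below may help. It is a small sketch assuming a standard CPython install, where difflib ships with the standard library and needs no pip install. The company names are only sample strings.

```python
import difflib  # part of the standard library, so this import should just work

names = ["Ascendas Reit", "Biosensors", "AIMSAMP Cap Reit"]

# get_close_matches returns the candidates most similar to the query,
# best match first (default similarity cutoff is 0.6)
matches = difflib.get_close_matches("Ascendas REIT", names)
print(matches)
```

If the import itself raises an ImportError, the Python installation is likely incomplete, since there is no separate difflib package to download.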
Hi Kok Hua,
How do I disable it so that I can use this in OSX?
Thank you.
Jessie
Hi Jessie,
You can comment out the module (xls extractor). The module is mainly for extracting the list of stock symbols. You can replace the lines where it calls the module function with an appropriate list.
You can refer to my reply to Alberto on Mar 5, 2016 for more details. Just scroll further down to it.
Hope it helps.
Hello Kok Hua,
Thank you for your help.
I found documents for difflib module but not the source code
https://docs.python.org/2/library/difflib.html#a-command-line-interface-to-difflib
pyExcel.py needs the win32com.client.dynamic module. I think I can't make it work on Linux systems.
Hi Zealny, I think you can import the difflib library straight away; it should already be available in the standard library.
As for pyExcel, it just provides a more convenient way for me to set the parameters that I need for the data retrieval. You can bypass it by setting them directly in the yahoo_finance_data_extract module as below:
self.enable_form_properties_fr_exceltable = 0 # set to zero
self.cur_quotes_property_str = 'nsl1opvkj' # default list of properties to copy; can set the properties here.
The list of properties can be found in the below url link:
https://code.google.com/p/yahoo-finance-managed/wiki/enumQuoteProperty
Hope that helps
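To make the property string a little less cryptic, here is a rough sketch of how 'nsl1opvkj' breaks down into individual tags. The helper name split_property_str is hypothetical, and the labels are as I recall them from the old Yahoo CSV quote API, so please verify them against the list in the link above.

```python
import re

# Labels recalled from the old Yahoo CSV quote API (an assumption; check the linked list).
TAG_LABELS = {
    "n": "Name",
    "s": "Symbol",
    "l1": "Last trade price",
    "o": "Open",
    "p": "Previous close",
    "v": "Volume",
    "k": "52-week high",
    "j": "52-week low",
}


def split_property_str(property_str):
    """Split a property string into tags: one letter, optionally followed by a digit."""
    return re.findall(r"[a-z]\d?", property_str)


tags = split_property_str("nsl1opvkj")
for tag in tags:
    print("%s: %s" % (tag, TAG_LABELS.get(tag, "unknown")))
```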
I have been trying to do something like this. Wow, there is so much detail in all your posts. Have to say thanks very much!
Thank you for your comments. Glad it is helpful to you 🙂
Hello Kok Hua,
Thanks for the great work, this code is amazing!
I’m currently trying to run the combined run script Stock_Combine_info_gathering.py, however I’m getting the following error:
File "Stock_Combine_info_gathering.py", line 40, in <module>
from SGX_stock_announcement_extract import SGXDataExtract
File "/Users/Alberto/Desktop/yahoo_finance_data_extract-master/SGX_stock_announcement_extract.py", line 49, in <module>
from xls_table_extract_module import XlsExtractor
File "/Users/Alberto/Desktop/yahoo_finance_data_extract-master/xls_table_extract_module.py", line 32, in <module>
from pyExcel import UseExcel
File "/Users/Alberto/Desktop/yahoo_finance_data_extract-master/pyExcel.py", line 126, in <module>
import win32com.client.dynamic
ImportError: No module named win32com.client.dynamic
I’m using a Mac… I read in the comments section that you have a way to avoid this error when you are not using windows. Can you please help me?
Thanks in advance!
Hi Alberto,
Thanks for your comments and feedback. I mainly use the XlsExtractor to retrieve certain settings, such as the company list or parameter list, which can easily be replaced by a plain list. You can remove all instances of XlsExtractor and replace them with a list of the required parameters. For example,
under SGX_stock_announcement_extract.py (line 118):
## target stocks for announcements -- using excel query
xls_set_class = XlsExtractor(fname = r'C:\data\stockselection_for_sgx.xls', sheetname= 'stockselection',
    param_start_key = 'stock//', param_end_key = 'stock_end//',
    header_key = 'header#2//', col_len = 2)
xls_set_class.open_excel_and_process_block_data()
self.announce_watchlist = xls_set_class.data_label_list #also get the company name
You can comment out all of the above, then set self.announce_watchlist to a list of stock symbols and self.companyname_watchlist to a list of the corresponding company names.
self.announce_watchlist = ['O5RU', 'A17U', 'B20']
self.companyname_watchlist = ['AIMSAMP Cap Reit', 'Ascendas Reit', 'Biosensors']
Hope that helps.
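An alternative to commenting lines out is to make the Excel-based import optional, so the same file runs on Windows, OSX and Linux. The sketch below is only one possible way to do it: the function name get_announce_watchlist and the FALLBACK_WATCHLIST constant are hypothetical, while the XlsExtractor call simply repeats the arguments shown above.

```python
try:
    # Needs pyExcel and win32com, so it is expected to fail outside Windows + Excel.
    from xls_table_extract_module import XlsExtractor
except ImportError:
    XlsExtractor = None

# Hard-coded symbols used when the Excel extractor is not available.
FALLBACK_WATCHLIST = ['O5RU', 'A17U', 'B20']


def get_announce_watchlist(xls_path=r'C:\data\stockselection_for_sgx.xls'):
    """Return the stock symbols from Excel if possible, else the fallback list."""
    if XlsExtractor is None:
        return list(FALLBACK_WATCHLIST)
    xls_set_class = XlsExtractor(fname=xls_path, sheetname='stockselection',
                                 param_start_key='stock//', param_end_key='stock_end//',
                                 header_key='header#2//', col_len=2)
    xls_set_class.open_excel_and_process_block_data()
    return xls_set_class.data_label_list


watchlist = get_announce_watchlist()
print(watchlist)
```

The same pattern would need to be applied wherever else the scripts import XlsExtractor at the top of the file.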
I’m getting an error and having a lot of trouble figuring out the problem:
Traceback (most recent call last):
File "Stock_Combine_info_gathering.py", line 34, in <module>
from Basic_data_filter import InfoBasicFilter
File "/home/corncob/Projects/Stocks/yahoo_finance_data_extract-master/Basic_data_filter.py", line 46, in <module>
from DictParser.Dict_create_fr_text import DictParser
ImportError: No module named DictParser.Dict_create_fr_text
I installed the DictParser module using “python setup.py install” and checked the dist-package directory, all of the files seem to be present. Any help you can offer is greatly appreciated.
Some amplifying information:
Currently using Ubuntu 14.04, Python 2.7.6
Hi Jake, are you able to find the DictParser folder for this module in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works.
Thanks Kok Hua!
You're welcome.
Hi, first of all thanks for this great work !
I am new to python and trying to run your scripts.
Which version of python are you using for the scripts ?
I get different problems when running with different versions of Python.
Thanks
Hi Wai Kun, thank you for your good feedback. I am using python 2.7. Hope that helps.
I am now having a problem with the pyExcel part.
I installed pyexcel using pip, and the pyExcel from your GitHub too.
However, I got this error:
File "c:\yahoo_finance\SGX_stock_announcement_extract.py", line 49, in <module>
from xls_table_extract_module import XlsExtractor
File "C:\Python27\lib\site-packages\xls_table_extract_module.py", line 36, in <module>
from pyET_tools.pyExcel import UseExcel
ImportError: No module named pyET_tools.pyExcel
Can you help?
Hi Wai Kun, can you change the line to: from pyExcel import UseExcel and see if it works? Thanks.
Hi Kok Hua, I managed to get past the last error.
Now I am stuck at the stockselection_for_sgx.xls file.
Can you share the format of this xls file, so that I can create my own list?
Thanks
Hi Wai Kun,
I have added the file to the GitHub repository: https://github.com/spidezad/yahoo_finance_data_extract. Hope it helps.
Hi, great work on this code. I am getting an error “ImportError: cannot import name pyExcel” but I have already installed pyExcel (from Git). Also, there are some references to pyET_tools which does not exist. Can you help?
Hi, are you able to find the pyExcel folder for this module in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works.
Hey Kok Hua, I am getting an issue where there is no module named DictParser. I read a previous comment where someone else was struggling with it. I downloaded the DictParser module and ran the setup.py install. All ran correctly, but it still won't import. After checking the site-packages folder, I have DictParser-1.0-py2.7.egg-info (a file, not a folder) and Dict_create_fr_text.py, both directly inside site-packages; there is no DictParser folder, so it doesn't look right at all. How do I fix this?
Hi Geoff, sorry to hear about the issue. Are you able to find the DictParser folder for this module in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works. If there is no DictParser folder, please create one in site-packages and copy both the __init__.py and the Dict_create_fr_text.py into the folder. Hope that helps. Please let me know if that is not working for you. Thanks.
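The folder layout described above can be tried out safely in a temporary directory before touching site-packages. The sketch below is an illustration only: the temporary directory stands in for site-packages, and the Dict_create_fr_text.py written here is a placeholder with an empty DictParser class, not the real module.

```python
import importlib
import os
import sys
import tempfile

# A temporary directory stands in for site-packages in this sketch.
site_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(site_dir, "DictParser")
os.mkdir(pkg_dir)

# The empty __init__.py is what marks the folder as a package under Python 2.7.
open(os.path.join(pkg_dir, "__init__.py"), "w").close()

# Placeholder module; in practice, copy the real Dict_create_fr_text.py here.
with open(os.path.join(pkg_dir, "Dict_create_fr_text.py"), "w") as f:
    f.write("class DictParser(object):\n    pass\n")

# With the folder on the path, the dotted import from Basic_data_filter.py resolves.
sys.path.insert(0, site_dir)
module = importlib.import_module("DictParser.Dict_create_fr_text")
print(module.DictParser)
```

The resulting layout is site-packages/DictParser/__init__.py plus site-packages/DictParser/Dict_create_fr_text.py, which is what the line "from DictParser.Dict_create_fr_text import DictParser" expects.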
Hello, are there getting-started instructions on how to use it?
Thanks
I ask because I keep getting a “name ‘YFinanceDataExtr’ is not defined” error even though I have already imported it.
Hi IKEL, regarding your question on getting-started instructions, unfortunately I do not have any on hand. I will try to write some in the near future. For now, perhaps you can start with Stock_Combine_info_gathering.py, which runs the scripts. There is an option to run different parts of the module at line 85.
partial_run = ['a2','b','c_pre','c','d_pre','e','f', 'g'] # e is storing data
For a simpler running version, you can try this other script, which is easier to use:
https://github.com/spidezad/google_screener_data_extract.
Hope that helps
Hi IKEL, regarding the “YFinanceDataExtr is not defined” error, can you check that the yahoo_finance_data_extract.py file is in the same directory as the script (Stock_Combine_info_gathering.py) you are running? It should work if they are all in the same directory.
Let me know if you still have problems running.
I have some “get blown out” instruction…
Try Running It!
I was not able to run it either.
No DictParser & other stuff.
self.FeelingStupid = True
Hi Avraham, most of the Yahoo stock APIs are no longer working, so you might have trouble running this. Let me see if I can come up with a new post for a working version.
Hello,
First of all, great job!
Secondly, I’m having the following error while compiling it in anaconda navigator environment:
File "Stock_Combine_info_gathering.py", line 82
print time.ctime()
^
SyntaxError: invalid syntax
Hi Slasanto, thank you for your compliment. 🙂 For the anaconda environment, are you using Python 3.x? If yes, then the scripts will not work as they are based on python 2.x.
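For reference, the line in the error above fails because print is a statement in Python 2 but a function in Python 3. The small sketch below shows a form that parses under both; it is only an illustration and does not make the rest of the scripts Python 3 compatible.

```python
import sys
import time

major = sys.version_info[0]
print("Running under Python %d" % major)

# With a single argument in parentheses, this line is valid in Python 2 and 3,
# unlike the statement form `print time.ctime()` used in the script.
stamp = time.ctime()
print(stamp)
```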
It worked. Thank you. Now I’m having troubles with the DictParser. I installed but I can’t see any package named “DictParser” or something similar. What I do see are the following files:
“DictParser-1.0-py2.7.egg-info
Dict_create_fr_text.py
Dict_create_fr_text.pyc”
All three are in the anaconda site-packages, more detailed:
/Users/user-name/anaconda/envs/py27/lib/python2.7/site-packages
Hi Slasanto, are you able to find the DictParser folder for this module in the Python site-packages directory? If yes, you might be missing an __init__.py. Can you try creating an empty file and naming it __init__.py? See if it works.