Python Forum
Stock clustering and portfolio diversification. Suitable features.


Stock clustering and portfolio diversification. Suitable features. - timurkanaz - Mar-26-2024

Hi. I want to do some research: cluster stocks and then build a safe, diversified portfolio. I want to work with annual data. From the price data I have taken the beta coefficient and the average price change. I would also like to use company reports to get some fundamental data, especially valuation multiples. With the aim of building a diversified portfolio, which multiples can I take from a company's annual reports? Does anyone have experience with this? Thanks in advance for any answers!


RE: Stock clustering and portfolio diversification. Suitable features. - Larz60+ - Mar-27-2024

The yfinance (Yahoo Finance) package will get you the historical price data you need.
Reference: https://pypi.org/project/yfinance/

Simple code to get one year's data:

import yfinance as yf
import pandas as pd
from pathlib import Path

# supply ticker symbol:
symbol = "NVDA"

ticker = yf.Ticker(symbol)

# get one year's historical data (returned as a pandas DataFrame)
data = ticker.history(period="1y")

# save as csv (change path as desired)
# note: index=False drops the Date index; use index=True to keep the dates
currentpath = Path(".")
savefilename = currentpath / f"{symbol}_History_1yr.csv"
data.to_csv(savefilename, index=False)

First few lines of the file:
Output:
Open,High,Low,Close,Volume,Dividends,Stock Splits
268.28396165972663,269.91344399919114,263.5654735665401,265.2249450683594,36102600,0.0,0.0
264.3851962988121,265.0449883242188,258.4171094180526,264.01531982421875,35610400,0.0,0.0
268.1639924579148,270.6931800574298,265.88472470225616,269.75347900390625,39369400,0.0,0.0
272.20270324123067,274.90181922605285,270.9330909208125,273.7421875,36451600,0.0,0.0
271.3129559427999,278.250732726046,270.96306208621024,277.680908203125,43393300,0.0,0.0
275.00179770776697,279.91022713743365,273.272341393005,279.5603332519531,39871600,0.0,0.0
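
If it helps, here is a rough sketch of how the two price features mentioned in the question (beta and average price change) could be computed from closing prices. It assumes you also have a benchmark series, for example the "Close" column of an index downloaded the same way. The function name price_features and the synthetic demo data are made up for illustration, so treat this as a starting point rather than a finished method.

```python
import numpy as np
import pandas as pd


def price_features(stock_close: pd.Series, market_close: pd.Series) -> dict:
    """Return beta and average per-period price change for one stock.

    Both inputs are closing-price series indexed by date. `market_close`
    is an assumed benchmark series (hypothetical, not from the post above).
    """
    # Keep only the dates present in both series.
    prices = pd.concat([stock_close, market_close], axis=1, join="inner")
    prices.columns = ["stock", "market"]

    # Simple returns: price today / price yesterday - 1.
    returns = (prices / prices.shift(1) - 1).dropna()

    # Beta = Cov(stock, market) / Var(market).
    cov = returns["stock"].cov(returns["market"])
    var = returns["market"].var()
    return {"beta": cov / var, "avg_change": returns["stock"].mean()}


# Synthetic demo data so the sketch runs without a network connection.
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-02", periods=250, freq="B")
market_ret = rng.normal(0.0005, 0.01, size=250)
stock_ret = 1.5 * market_ret + rng.normal(0.0, 0.005, size=250)
market = pd.Series(100 * np.cumprod(1 + market_ret), index=dates)
stock = pd.Series(50 * np.cumprod(1 + stock_ret), index=dates)

features = price_features(stock, market)
print(features)
```

With real data you would pass ticker.history(period="1y")["Close"] for the stock and the same for whatever benchmark you choose. The synthetic stock is built to move about 1.5 times as much as the synthetic market, so the estimated beta should come out near that value.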
As for the clustering, see this
(click on Lecture 7 for the slides).
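
For a concrete starting point, below is a minimal k-means sketch in plain NumPy, assuming each stock is described by a small feature vector such as beta and average price change. The tickers and numbers are invented, and the helper names standardize and kmeans are my own. In practice a library implementation such as scikit-learn's KMeans would probably be the usual choice.

```python
import numpy as np


def standardize(x: np.ndarray) -> np.ndarray:
    """Scale each feature column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def kmeans(x: np.ndarray, k: int, n_iter: int = 100, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns one cluster label per row of x."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each row to its nearest center (squared Euclidean distance).
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster ends up empty.
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Made-up tickers and features: [beta, average daily price change].
tickers = ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"]
raw = np.array([
    [0.50, 0.00020],
    [0.60, 0.00030],
    [0.55, 0.00025],
    [1.50, 0.00100],
    [1.60, 0.00120],
    [1.55, 0.00110],
])

# Standardize first so beta does not dominate the distance calculation.
labels = kmeans(standardize(raw), k=2)

# Group tickers by cluster label.
clusters = {}
for name, label in zip(tickers, labels):
    clusters.setdefault(int(label), []).append(name)
print(clusters)
```

The cluster numbers themselves are arbitrary; what matters is which tickers end up together. One simple way to use the result for diversification would be to pick stocks from different clusters instead of several from the same one.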

Also, this blog looks interesting.