Monday, January 12, 2015

Movie Stats .. Bollywood

Hi, I have recently been looking for an area to analyse, and after watching some bad movies I decided to give Bollywood movies a go. Can we crunch the numbers/stats to find out whether a movie will do well or badly (irrespective of what the movie is about)? This project is comprehensive and it will take time to collect the data, so I will go slowly.

I took the 2001-2010 decade and analysed the movie gross by month. Early in the decade the month of Dec did not hold a lot of promise; that was obviously the pre-Aamir Khan era. From 2007 there has been a surge in the gross collected in December: a jump of almost 150%, to 100 crore rupees.

In the above images we can see two shifts:

1) Increase in movie gross Revenue ( in crores )

2) Change in the monthly Gross Revenue ( in crores )

Month alone is clearly not enough to predict earnings. For example, between 2002 and 2009 a few months took most of the revenue: in 2007-2008 alone, Aug, Oct and Dec accounted for 50% of the yearly gross. In 2013 the top 3 months accounted for 38% of the revenue, and the figure was even lower in 2012 and 2011. Releases have begun to spread out more evenly, leaving more opportunities for newcomers and new genres.
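The "top 3 months" share quoted above is simple to compute once the monthly gross figures are collected. Here is a minimal sketch in Python; the function name `top_n_share` and the monthly numbers are made up for illustration and are not the actual box-office data.

```python
def top_n_share(monthly_gross, n=3):
    """Return the fraction of the yearly gross earned by the n best months."""
    total = sum(monthly_gross.values())
    if total <= 0:
        raise ValueError("yearly gross must be positive")
    # Sort the monthly figures from highest to lowest and keep the top n.
    top = sorted(monthly_gross.values(), reverse=True)[:n]
    return sum(top) / total


# Hypothetical monthly gross (in crores), NOT real data.
example_year = {
    "Jan": 40, "Feb": 30, "Mar": 30, "Apr": 40, "May": 50, "Jun": 50,
    "Jul": 60, "Aug": 150, "Sep": 50, "Oct": 170, "Nov": 50, "Dec": 180,
}

share = top_n_share(example_year)
print(f"Top 3 months took {share:.0%} of the yearly gross")
```

Running the same function over every year would show whether the share of the top months is really shrinking over time.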

On running a linear regression with
a) Earnings of Dec : Dependent Variable
b) Earnings of Aug, Sep, Oct, Nov : Independent Variables
the resulting coefficients are:

Intercept 11.28481646
Aug 0.281777793
Sep 0.280503546
Oct -0.094513125
Nov -0.419597559

December earnings are inversely related to November earnings: the Nov coefficient (about -0.42) is negative and the largest in magnitude of the four months.
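To see what these coefficients imply, here is a small Python sketch that plugs them into the fitted equation Dec = Intercept + a*Aug + b*Sep + c*Oct + d*Nov. The coefficients are the ones listed above; the function name `predict_dec` and the input earnings are hypothetical and only meant to show the direction of the effect.

```python
# Coefficients as reported in the regression output above.
COEFFICIENTS = {
    "Intercept": 11.28481646,
    "Aug": 0.281777793,
    "Sep": 0.280503546,
    "Oct": -0.094513125,
    "Nov": -0.419597559,
}


def predict_dec(aug, sep, oct_, nov):
    """Predicted Dec earnings from the earnings of the four preceding months."""
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["Aug"] * aug
        + COEFFICIENTS["Sep"] * sep
        + COEFFICIENTS["Oct"] * oct_
        + COEFFICIENTS["Nov"] * nov
    )


# Hypothetical inputs (in crores): same Aug-Oct, two different Novembers.
weak_november = predict_dec(aug=50, sep=40, oct_=60, nov=10)
strong_november = predict_dec(aug=50, sep=40, oct_=60, nov=30)

# A stronger November lowers the predicted December, per the negative coefficient.
print(weak_november, strong_november)
```

This only illustrates the fitted line; with so few data points the coefficients should be read as a rough indication, not a forecast.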

There are more factors at play:
1) Director/Producer
2) Actor
3) Festival
4) etc.

I will be taking all of these into consideration in my next article. Please let me know if you have any other suggestions/ideas.

Thursday, January 8, 2015

Pulling Company Information through R

For some time now, I have been grappling with getting company fundamental ratios to be able to do analysis. I have written a small R script that downloads the necessary market data for a given stock (AAPL in the example). The source of the data is Yahoo Finance.

After execution the R workspace will contain 2 data frames
ratios    -> Quantitative Data
ratios2  -> Qualitative Data

Download File

I am working on getting more balance sheet related information. It is always better to have more information while doing analysis

If anybody wants the complete list of companies listed on the NSE, you can get it here: Download File

After searching around for some time, I wrote a script that scrapes and lists the tickers etc. for products around the world. The current list contains 110,000 products. You can download the Excel file here: Download File. Timestamp: 2nd March, 2015. (I refrained from using CSV because the company names contain all sorts of characters that can interfere with field separation :) )
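For anyone who still wants the list as CSV, quoting every field is one way around the separator problem. The sketch below uses Python's standard `csv` module; the tickers and company names are invented for illustration, and the helper names `to_csv` and `from_csv` are hypothetical.

```python
import csv
import io

# Invented rows with commas, quotes and other awkward characters in the names.
rows = [
    ["ticker", "name"],
    ["EX1", 'Example, Inc. "Holdings"'],
    ["EX2", "Sample; Co | Ltd"],
]


def to_csv(rows):
    """Write rows as CSV text, quoting every field."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


# The names survive the round trip despite the embedded commas and quotes.
round_tripped = from_csv(to_csv(rows))
print(round_tripped == rows)
```

This only helps if whoever reads the file also uses a proper CSV parser instead of splitting on commas, which is a fair reason to stay with Excel.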