Executive Summary
Task
While talking with a Statistician at an unnamed investment firm during the semester, I was proposed with an idea that would eventually become my final project. I approached him
looking for internship opportunities, however the firm did not have a history
of taking interns because they typically could not add value. When I mentioned I
was taking a class on Visual Basic he became excited and then shared his
problem with me.
The firm had numerous quantitative
investment strategies with few quantitative analysts. He being one of them was
responsible for any data mining or statistical analysis that took place in the
program R. The firm relies heavily on this information. This causes a “bottle
neck” of research, he told me. What they needed was for the non-quantitative employees
to be able to do their own in depth research without typing a single line of
code. “I would like them to do something like a Generalized AutoRegressive
Conditional Heteroskedastic model through excel”, he said. This kind of
analysis obviously could not be done in excel, but through the open source
statistical software R, it could. The task would then be:
1. Use excel only as a front end interface
with R
2 . Use R as a backend to access the server information and perform analysis
3. To do it all remotely and robustly from a simple userform
2 . Use R as a backend to access the server information and perform analysis
3. To do it all remotely and robustly from a simple userform
Solution
The whole program needs to be scalable.
The firm will want to add their own models, many of them proprietary, and
integrate it into the overall investment strategy of the firm. The development
of the application will be ongoing as strategies change depending on investment
climates. There are an unlimited amount of statistical models and test to be ran,
this application provides a template that could be enlarged to meet the needs
of this investment firm. The program runs like so:
1 1. Queries the data base to populate the
userform to let the user know which tables are available for analysis
2. Retrieves table names, column headings, data type or any other information wanted.
3. After the type of statistical test, table and variables are selected excel then, connects to R remotely. The parameterized R script is rewritten with the variables that where selected in the userform. This now rewritten text file is passed to R, the analysis is done and sent to another text file. The R addin package “RODBC” is installed to allow data base interaction and SQL statement compatibility. A Data Source Name must be configured in administrative tools on your control panel, then referenced in the parameterized R code. The parameterized code must coincide the structure of the data being retrieved from the server (for instance commands to a matrix requires different operations).
4. Excel then retrieves the parsed R output using the agent and displays it on a worksheet (this is scalable to retrieve graphical out).
2. Retrieves table names, column headings, data type or any other information wanted.
3. After the type of statistical test, table and variables are selected excel then, connects to R remotely. The parameterized R script is rewritten with the variables that where selected in the userform. This now rewritten text file is passed to R, the analysis is done and sent to another text file. The R addin package “RODBC” is installed to allow data base interaction and SQL statement compatibility. A Data Source Name must be configured in administrative tools on your control panel, then referenced in the parameterized R code. The parameterized code must coincide the structure of the data being retrieved from the server (for instance commands to a matrix requires different operations).
4. Excel then retrieves the parsed R output using the agent and displays it on a worksheet (this is scalable to retrieve graphical out).
Over the summer I will be expanding this
project for the investment firm. I am presenting the program to the head
quantitative analyst at the end of the semester and I am sure they will be
pleased to have the beginning of a solution for the vexing problem.
Files for download
http://files.gove.net/shares/files/13w/jessebo/FinalWriteUp.pdf
http://files.gove.net/shares/files/13w/jessebo/final.xlsm
http://files.gove.net/shares/files/13w/jessebo/VBAFinalcomplete.zip - complete archive of folders and files
Hi, THis is really good but i am unable to understand it.. could anyone explain about this...
ReplyDelete