These are projects posted by the students of Dr. Gove Allen at Brigham Young University. These students have taken one semester-long course on VBA and generally have had no prior programming experience

Tuesday, April 16, 2013

Using Excel as A Front End For R and SQL

Executive Summary


While talking with a Statistician at an unnamed investment firm during the semester, I was proposed with an idea that would eventually become my final project. I approached him looking for internship opportunities, however the firm did not have a history of taking interns because they typically could not add value. When I mentioned I was taking a class on Visual Basic he became excited and then shared his problem with me.
The firm had numerous quantitative investment strategies with few quantitative analysts. He being one of them was responsible for any data mining or statistical analysis that took place in the program R. The firm relies heavily on this information. This causes a “bottle neck” of research, he told me. What they needed was for the non-quantitative employees to be able to do their own in depth research without typing a single line of code. “I would like them to do something like a Generalized AutoRegressive Conditional Heteroskedastic model through excel”, he said. This kind of analysis obviously could not be done in excel, but through the open source statistical software R, it could. The task would then be:
1.      Use excel only as a front end interface with R
      2    .     Use R as a backend to access the server information and perform analysis
      3.     To do it all remotely and robustly from a simple userform

The whole program needs to be scalable. The firm will want to add their own models, many of them proprietary, and integrate it into the overall investment strategy of the firm. The development of the application will be ongoing as strategies change depending on investment climates. There are an unlimited amount of statistical models and test to be ran, this application provides a template that could be enlarged to meet the needs of this investment firm. The program runs like so:                                                                                                                                                                                                                             

1    1.     Queries the data base to populate the userform to let the user know which tables are available for analysis
2.     Retrieves table names, column headings, data type or any other information wanted.
3.     After the type of statistical test, table and variables are selected excel then, connects to R remotely. The parameterized R script is rewritten with the variables that where selected in the userform. This now rewritten text file is passed to R, the analysis is done and sent to another text file. The R addin package “RODBC” is installed to allow data base interaction and SQL statement compatibility. A Data Source Name must be configured in administrative tools on your control panel, then referenced in the parameterized R code. The parameterized code must coincide the structure of the data being retrieved from the server (for instance commands to a matrix requires different operations).
4.     Excel then retrieves the parsed R output using the agent and displays it on a worksheet (this is scalable to retrieve graphical out).

Over the summer I will be expanding this project for the investment firm. I am presenting the program to the head quantitative analyst at the end of the semester and I am sure they will be pleased to have the beginning of a solution for the vexing problem.

Files for download - complete archive of folders and files


  1. Hi, THis is really good but i am unable to understand it.. could anyone explain about this...

  2. BlueHost is the best hosting provider for any hosting services you might require.


Blog Archive