Executive Summary
Background:
This semester I have been working on a website idea that focuses on user comments. There are websites that focus on news, pictures, videos, jokes and microblogs, but none that really focus on comments. Many people spend a lot of time reading and writing comments, yet every commenting system in use today could be improved. We believe people want to hear and be heard, and that there are plenty of good comments that could be crowd-sourced in the right system.
This website would seek to be that improvement by allowing users to comment on anything (videos, articles, pictures, etc.). Currently we are in the validation stage, working to find out what would add the most value for users of this theoretical website. While we experiment with different comment filters, voting systems, rankings and incentive systems, we need a database of comments to use for tests.
System Overview:
The purpose of this system is to have a convenient way to scrape highly scored comments off of websites and store them for use in testing and providing content for the future website. The two websites to be scraped are YouTube and Imgur. Both are extremely popular websites that host videos and pictures respectively. Each of them boasts a large and active user comment system. Some of these comments are entertaining or enlightening, while most are not. This system does the following:
· Accept URLs from either website as inputs.
· Let the user know if the webpage has no comments.
· Parse through the data on the webpage to find the text, author and score of each comment.
· Filter the comment text to eliminate profanity and unnecessary links.
· Write to the data sheet the title, link, and filtered info for the top filtered comments of each webpage.
· Clear and automatically reformat the database at the click of a button.
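The steps listed above can be sketched roughly as follows. This is only a minimal illustration in Python, not the actual implementation: the names classify_url, filter_text, top_comments, Comment and BLOCKED_WORDS are all hypothetical, the blocked-word list is a placeholder, and the fetching and parsing of each page is left out because it depends on each site's page structure.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical placeholder list; a real profanity filter would use a fuller one.
BLOCKED_WORDS = {"darn", "heck"}

# Matches anything that looks like a link (http://, https:// or www.).
LINK_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


@dataclass
class Comment:
    author: str
    text: str
    score: int


def classify_url(url: str) -> str:
    """Return 'youtube' or 'imgur', or raise ValueError for any other host."""
    host = (urlparse(url).hostname or "").lower()
    if host in ("youtu.be", "youtube.com") or host.endswith(".youtube.com"):
        return "youtube"
    if host == "imgur.com" or host.endswith(".imgur.com"):
        return "imgur"
    raise ValueError(f"Unsupported site: {url}")


def filter_text(text: str) -> str:
    """Strip links, mask blocked words, then collapse extra whitespace."""
    text = LINK_PATTERN.sub("", text)

    def mask(match: re.Match) -> str:
        word = match.group(0)
        return "*" * len(word) if word.lower() in BLOCKED_WORDS else word

    text = re.sub(r"[A-Za-z']+", mask, text)
    return " ".join(text.split())


def top_comments(comments, limit=5):
    """Filter each comment and return the highest-scored ones first.

    An empty result tells the caller that the page had no usable comments.
    """
    cleaned = [Comment(c.author, filter_text(c.text), c.score) for c in comments]
    cleaned = [c for c in cleaned if c.text]  # drop comments that were only links
    return sorted(cleaned, key=lambda c: c.score, reverse=True)[:limit]
```

In this sketch, a caller would first run classify_url on the input to pick the right parser, build a list of Comment objects from the page, and then pass that list to top_comments before writing the results to the data sheet.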