Thursday, March 22, 2012

Twitter Infographic



This infographic compares my Twitter account with that of one of my favorite comedians, Rob Delaney. I thought this would be an interesting comparison because I use Twitter casually, while Rob Delaney uses it to further his career. Because he uses it to make impressions on people and promote himself, his usage is much higher, but it is interesting to see just how much more he uses Twitter.

To start off, Rob has a TON more followers than I do. He is usually regarded as one of the funniest people to follow on Twitter, but then again, I consider myself one of the funniest people on Twitter as well. It could be related to the next piece of the infographic, which is the number of people we follow. Rob follows a little more than ten times as many people as I do. This may be why I do not have as many followers as he does. Based on these two facts, it can be seen that I have a significantly lower follower-to-following ratio. This means that my popularity is much smaller than Rob's. The next part shows that I am mentioned significantly fewer times per week than Rob. He is consistently mentioned over 30,000 times a week to my mere 10 or so. This is odd because all of my friends tell me that they really like me, but apparently Rob's friends like him more. Nothing personal, though. I do average one mention per tweet, which I am proud of because I am not a fraction.
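The ratio comparison above can be made concrete by dividing the number of followers by the number of accounts followed. This is only a rough sketch: the function name `follower_ratio` and every count below are hypothetical placeholders, not the real figures from the infographic.

```python
def follower_ratio(followers, following):
    """Followers divided by accounts followed; a higher value suggests a more popular account."""
    if following == 0:
        raise ValueError("following must be greater than zero")
    return followers / following


# Hypothetical counts, chosen only so that one account follows
# ten times as many people as the other (as described in the post).
my_ratio = follower_ratio(followers=100, following=150)
celebrity_ratio = follower_ratio(followers=500_000, following=1_500)

print(f"casual account:    {my_ratio:.2f}")
print(f"celebrity account: {celebrity_ratio:.2f}")
```

Even though the second account follows ten times as many people, its ratio comes out far higher because its follower count grows much faster than its following count.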

The next part is very interesting because it shows the timing of our tweets and the location of our followers. As you can see in the infographic, I do not tweet a lot, and there are times during the day when I don't usually tweet. Conversely, there are only a few hours out of every day when Rob is not tweeting. The next piece shows where all of your followers are. Rob's followers are mostly in North America, but mine are different. My followers are allegedly mostly from Africa, but I do not think that piece is correct.

In the end, it is interesting to see the breakdown between someone like me, who is a nobody and does not use Twitter too much, and someone who is a celebrity of sorts and uses Twitter all the time. Our usage could not be more different, and the infographic does a good job of taking this information and making it easy to understand visually.

Tuesday, March 6, 2012

Twitter Sentiment

Wordle: Twitter Sentiment WRX Wordle: twitter sentiment subaru

I used two different Twitter sentiment sites to find how much Subaru and WRX were mentioned. The first site I used produced the words on the left, and I used Wordle to put them into a word cloud. I retrieved fewer tweets on this site than on the second one. Because the sample was so small, most words were mentioned about the same number of times, so more of them appear large in the word cloud. The second word cloud appears a little strange because the website I used to produce the words pulled many more tweets than the first. With this large pool, there was a lot of disparity between the most used words and the less used words; hence the very large SUBARU tag and the much smaller secondary words. The second site also did a better job of evaluating the sentiment associated with a word and even had a neutral sentiment category.
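The sample-size effect described above can be sketched in a few lines. This is an illustration only: it assumes a word's size in the cloud is proportional to its count relative to the most common word, which is a simplification rather than Wordle's actual layout method, and the tweets and the `relative_sizes` helper are made up for the example.

```python
from collections import Counter


def relative_sizes(tweets):
    """Return each word's count divided by the count of the most common word."""
    counts = Counter(word.lower() for tweet in tweets for word in tweet.split())
    top = max(counts.values())
    return {word: count / top for word, count in counts.items()}


# Hypothetical small sample: every word appears about equally often.
small_sample = ["subaru wrx fast", "wrx turbo fun"]

# Hypothetical large sample: one word dominates everything else.
large_sample = ["subaru"] * 50 + ["subaru wrx", "subaru turbo"]

small = relative_sizes(small_sample)
large = relative_sizes(large_sample)

print(small["fast"])  # 0.5, half the size of the biggest word
print(round(large["wrx"], 3))  # a tiny fraction of the biggest word
```

In the small sample the least used word is still half the size of the most used one, so many words look large. In the large sample the secondary words shrink to a small fraction of the dominant word.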

Monday, March 5, 2012

Think Big Analytics

Think Big Analytics is a company based in Mountain View, California. The company is dedicated to consulting for businesses and discovering opportunities for them to use Hadoop. Think Big Analytics comes into a business and convinces it to start using Hadoop to unlock the value in all of its data. Hadoop allows organizations to store all of their structured and unstructured data. It can then be used to run detailed and complex queries that unlock the value in the unstructured data and find trends that were not obvious, or not even visible, because the data was not being stored and analyzed before Hadoop. Think Big Analytics advises companies on ways they could be using their data to create business value that they had not thought possible with simple relational databases, which can't handle the unstructured data they create every day. Think Big Analytics also offers a seminar and brainstorming sessions so businesses can conceptualize the need for a system to analyze their unstructured data and use it for maximum results.

Think Big Analytics works with a variety of fields and companies to bring them Hadoop and its advantages. However, it is impossible to find any information on jobs that Think Big Analytics has completed without formally requesting whitepapers. Think Big Analytics has helped companies in the fields of advertising, social media, financial services, retail, and healthcare. Since I don't have an actual case, I can only describe the process in theory. First, Think Big sits down with the client and educates them on the benefits of using Hadoop and implementing the system in their business. The next step is the envisioning process, where Think Big helps the client brainstorm what Hadoop will be used for within the organization. The last part is the engineering process, where Think Big Analytics constructs solutions for the client using Hadoop and its supporting elements. I imagine this is what the process would look like in an actual case. For instance, Think Big might help an advertising company use Hadoop to gather unstructured data about the companies it represents and the sentiment surrounding them. Think Big would help the company put this data to use and set up solutions for creating value from it.

My evaluation of Think Big Analytics is that they could be useful to a company. It is difficult, though, because there may not be much of a calling for them in several years. If companies adopt Hadoop at a decent rate and understand its value, I can picture many of them keeping in-house Hadoop professionals to set up and provide Hadoop solutions. Think Big seems to do well at starting up the Hadoop process within companies, but if Hadoop continues to grow in popularity they might fade out as fewer companies need help setting up Hadoop and more people become comfortable setting up and working within the Hadoop infrastructure.

This is the Think Big Analytics reference architecture.



Cloudera


1. Hadoop is a big deal because of the flexibility it provides to companies that need data management. One major advantage of Hadoop is that it is open source, which appeals to lots of companies in and of itself. Another advantage is that it can handle incredibly large amounts of both structured and unstructured data. Lots of major companies are adopting it to handle their data management needs. It's revolutionizing the way we store unstructured data.

2. Cloudera is a company that offers Hadoop as a package that is free under the Apache license. It was formed in … They offer two packages. The first is Hadoop in its raw form without any technical help. The second offers Hadoop along with help consulting on, setting up, and managing it.

3. Pig is a platform that is used in conjunction with Hadoop to analyze large data sets. The query language is humorously called Pig Latin, and users can write their own queries to do special processing of the data sets. The advantage of Pig is that it, much like an actual pig, can consume anything. There is no data set that Pig can't analyze.

4. Hive is similar to Pig in that it analyzes large data sets of structured and unstructured data. The main advantage of Hive is that its query language is based on SQL. Because SQL is already in use in most organizations, it is one less thing to learn when using Hadoop.

5. Cassandra is a hybrid NoSQL, non-relational database. The major advantage of Cassandra is that fields do not have to be predetermined before you add data to the database. The database can also be scaled up without having to manually move data or restart processes. Cassandra also replicates data so that there can never be a single point of failure.

6. Mahout is a machine-learning library that learns from and interprets data sets and gives useful feedback on trends or patterns. In effect, Mahout is an extremely glorified data mining tool that learns from past data and provides useful business information. Mahout is open source and focuses on providing scalable machine-learning algorithms.
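To give a feel for the kind of processing these tools build on, here is a minimal single-machine sketch of a map-and-reduce style word count, the pattern usually used to introduce Hadoop. It does not use Hadoop, Pig, or Hive at all: the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample lines are hypothetical, and a real cluster would spread each step across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key to get a final count per word."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical unstructured input, e.g. lines of tweet text.
lines = ["Subaru WRX", "subaru turbo"]

pairs = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(shuffle(pairs))
print(word_counts)
```

Pig Latin scripts and Hive queries let an analyst describe this sort of job at a higher level instead of writing each step by hand.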

HTML

Sherlock Holmes: A Game of Shadows

The woman Sherlock likes is kidnapped and dies of TB. His friend gets married, and on the way to the honeymoon the couple is almost killed. They find out that the man who is trying to kill them has been buying up all of the resources needed to sustain a war. Sherlock must infiltrate a gathering and discover the man who is set to kill a political candidate in order to start the war. Sherlock saves the target and ends up killing his arch-nemesis.

Shadows
  1. Robert Downey Jr. is awesome.

  2. Jude Law is awesome.

  3. I thought Rachel McAdams would be in it more.

  4. The first one was pretty good.

  5. I enjoy the mystery genre.

Sherlock Holmes: A Game of Shadows