Thursday, April 19, 2012

Internet in China

It is well known that the Chinese government has strict rules when it comes to internet use, but I did not know just how strict they were. Upon further reading, I found out a lot about Chinese internet service and content. The Chinese government bans far more sites than I could have imagined. Sites like Twitter, Facebook, and YouTube are blocked. These are sites that I use every single day here. China has several sites that are meant to mimic these sites but are approved by the government. I can't speak to their usefulness personally because I don't use them, but I imagine they are not all too similar to our social media sites here in America.
This brings us to the next major topic, which is censorship. It is reported that the government employs thousands of individuals to comb through just about everything that is posted on the internet and remove things that they find too rebellious. It is also reported that the search word "freedom" is blocked from Google. Along with all of the sites that are blocked and the messages that are erased, the government employs more people to sway people's opinions. These people go into forums and message boards and make posts that are pro-government or promote the ideas of the government. Basically, they are removing the ability to tell whether what someone has said in a forum is genuine or not. I'm sure that this happens here in the US, but probably not on behalf of the government. I can see different organizations going into forums to promote their views, but I think that the government doing it is a bit much. The government has also gone as far as to remove the sites of groups that promote democratic ideals, even when those sites belong to very intellectual and often high-standing individuals in society. Other blogs, such as blogs pertaining to sexual activity, are often removed as well.
Google has had its fair share of trouble in China. The Chinese government filters Google's results, which conflicts with Google's desire for unfiltered results in its searches. Google shut down its service for a while in 2010, and this angered the Chinese government. Google also complained that several civil rights activists' email accounts had been hacked. All of the issues present with China's internet service are interesting, to say the least. After having used the internet here for so long, I cannot even imagine trying to get around the internet in China. It's a shame, because China has more internet users than anywhere else in the world but they do not have the access that most countries do. More recently, within the last year, two microblog sites were censored so that they could not allow comments. The government also tried to pass a law that would put software on every new computer allowing the government to follow just about everything that you do on it. Almost all activity on the internet there is scanned anyway, but this would be a further extension of that. It would be absolutely terrifying to surf the internet in China, and after reading about their internet I am truly grateful for mine.

Thursday, March 22, 2012

Twitter Infographic



This infographic is a comparison between my account and the account of one of my favorite comedians, Rob Delaney. I thought that this would be an interesting thing to compare because I use Twitter leisurely and Rob Delaney uses it to further his career. Because he uses it to make impressions on people and promote himself, his usage is much higher, but it is interesting to see just how much more he uses Twitter.

To start off, Rob has a TON more followers than I do. He is usually regarded as one of the funniest people to follow on Twitter, but then again, I consider myself one of the funniest people on Twitter as well. It could be related to the next piece of the infographic, which is the number of people we follow. Rob follows a little more than ten times as many people as I do. This may be why I do not have as many followers as him. Based on these two facts, it can be seen that I have a significantly lower follower-to-following ratio. This means that my popularity is much smaller than Rob's. The next thing that can be seen is how many fewer times I am mentioned per week than Rob. He is consistently mentioned over 30,000 times a week to my mere 10 or so. This is odd because all of my friends tell me that they really like me, but apparently Rob's friends like him more. Nothing personal though. I do average one mention per tweet, which I am proud of because I am not a fraction.

The next part is very interesting because it shows the timing and location of my tweets and followers. As you can see in the infographic, I do not tweet a lot and there are times during the day when I don't usually tweet. Conversely, there are not many times of the day when Rob does not tweet. There are only a few hours out of every day when Rob is not tweeting. The next piece shows where all of your followers are. Rob's followers are mostly in North America, but mine are different. My followers are allegedly mostly from Africa, but I do not think that piece is correct.

In the end, it is interesting to see the breakdown between someone like myself, who is a nobody and does not use Twitter too much, and someone who is a celebrity of sorts and uses Twitter all the time. The usage between the two of us could not be more different, and the infographic does a good job of taking this info and making it easily understood visually.

Tuesday, March 6, 2012

Twitter Sentiment

Wordle: Twitter Sentiment WRX Wordle: twitter sentiment subaru

I used two different Twitter sentiment sites to find how much Subaru and WRX were mentioned. The first site I used produced the words on the left, and I used Wordle to put them into a word cloud. I retrieved fewer tweets on this site than on the second one. Because the sample it pulled was so small, most words were mentioned about the same number of times, so more words appear large in the word cloud. The second word cloud appears a little strange because the website I used to produce the words pulled many more tweets than the other. Because of this large pool there was a lot of disparity between the most used words and the less used words. Hence the very large SUBARU tag and the much smaller secondary words. The second site also did a better job of evaluating the sentiment associated with a word and even had a neutral sentiment.
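The size effect described above comes down to word counts. Here is a minimal sketch of how a word cloud could scale words, assuming a simple count-and-normalize approach (this is not necessarily how Wordle or either sentiment site works; the function names, the stopword list, and the sample tweets are all made up for illustration):

```python
from collections import Counter
import re

# Hypothetical stopword list; a real tool would use a much longer one.
STOPWORDS = frozenset({"the", "a", "and", "i", "my", "is", "to"})

def word_frequencies(tweets):
    """Count how often each non-stopword appears across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

def relative_sizes(counts):
    """Scale each count against the most frequent word (1.0 = largest)."""
    if not counts:
        return {}
    top = max(counts.values())
    return {word: n / top for word, n in counts.items()}

# Made-up sample tweets, only to show the effect.
sample = [
    "Love my Subaru WRX",
    "Subaru in the snow",
    "Subaru service was slow",
]
sizes = relative_sizes(word_frequencies(sample))
```

With a small sample most words tie at the same count and so get similar sizes. With a big sample one dominant word like "subaru" pushes every other word's relative size toward zero.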

Monday, March 5, 2012

Think Big Analytics

Think Big Analytics is a company based out of Mountain View, California. The company is dedicated to consulting for companies and discovering opportunities for businesses to use Hadoop. Think Big Analytics comes into businesses and convinces them to start using Hadoop and really unlock the value in all their data. Hadoop allows organizations to store all of their structured and unstructured data. Hadoop can be further utilized by running detailed and complex queries to unlock the value in the unstructured data and find trends that were not obvious, or not even visible, because the data was not being stored and analyzed before Hadoop. Think Big Analytics advises companies on how they could be using their data to create business value in ways that they had not thought possible with their simple relational databases, which can't handle the unstructured data they create every day. Think Big Analytics also offers a seminar and brainstorming sessions so businesses can conceptualize the need for a system to analyze their unstructured data and really use it for maximum results.

Think Big Analytics works with a variety of fields and companies to try to bring Hadoop and its advantages to them. However, it is impossible to find any information on jobs that Think Big Analytics has completed without formally requesting whitepapers. Think Big Analytics has helped companies in the fields of advertising, social media, financial services, retail, and healthcare. In theory, since I don't have an actual case, Think Big sits down with the client and educates them on the benefits of using Hadoop and implementing the system in their business. The next step is the envisioning process. Here Think Big helps the client brainstorm what Hadoop will be used for within their organization. The last part is the engineering process. Here, Think Big Analytics constructs solutions for the client using Hadoop and the supporting elements associated with it. I imagine this is what the process would look like in an actual case. For instance, Think Big might help an advertising company use Hadoop to gather unstructured data pertaining to the companies it represents and the sentiment surrounding them. Think Big would help them put this data to use and set up solutions for creating value from the data.

My evaluation of Think Big Analytics is that they could be useful to a company. It is difficult, though, because there may not be much of a calling for them in several years. If companies adopt Hadoop at a decent rate and understand its value, I can picture many companies keeping in-house Hadoop professionals to set up and provide Hadoop solutions. Think Big seems to do well as a starting point for Hadoop within companies, but if Hadoop continues to grow in popularity they might fade out as fewer companies need help setting up Hadoop and more people become comfortable setting up and working within the Hadoop infrastructure.

This is Think Big Analytics' reference architecture.



Cloudera


1. Hadoop is a big deal because of the flexibility it provides to companies that need data management. One major advantage of Hadoop is that it is open source. That appeals to lots of companies in and of itself. Another advantage of Hadoop is that it can handle incredibly large amounts of both structured and unstructured data. Lots of major companies are adopting it to handle their data management needs. It's revolutionizing the way we store unstructured data.

2. Cloudera is the company that offers Hadoop as a package, which is free under the Apache license. It was formed in … They offer two packages. The first is Hadoop in its raw form without any technical help. The second offers Hadoop along with help consulting on, setting up, and managing Hadoop.

3. Pig is a platform that is used in conjunction with Hadoop to analyze large data sets. The query language is humorously called Pig Latin, and users can write their own functions to do special processing of the data sets. The advantage of Pig is that it, much like an actual pig, can consume anything. There is no data set that Pig can't analyze.

4. Hive is similar to Pig. Hive performs similar functions to Pig in that it analyzes large data sets of structured and unstructured data. The main advantage of Hive is that it is based on SQL. Because SQL is already in use in most organizations, it is one less thing to learn when using Hadoop.

5. Cassandra is a hybrid NoSQL, non-relational database. The major advantage of Cassandra is that fields do not have to be predetermined before you add data to the database. This allows the database to be scaled up without having to manually move data or restart processes. Cassandra also replicates data so that there can never be a single point of failure.

6. Mahout is a machine-learning library that interprets data sets and gives useful feedback on trends or patterns. Mahout is an extremely glorified data mining tool that learns from past data and provides useful business information. Mahout is open source and focuses on providing scalable machine-learning algorithms.

Monday, February 27, 2012

Many Eyes

This graph depicts the relationship between the net number of migrants from different countries and the overall US population. The graph gives information pertaining to past, current, and predicted trends in migration to the U.S. The graph is excellent for showing the different times through history when spikes in migration have occurred from certain countries. Two particular countries are Germany and Russia. You can see the quick spikes in the number of people immigrating and even emigrating. In the early '90s, it's interesting to see the emigration of people from Bosnia leaving the US.

This graph depicts death totals around the world in comparison to overall populations. The graph gives information on past, present, and predicted population and death totals. The most interesting part of the data set is the predictive portion. When you look at China's bubble you see just how large a population it has. You also notice that as the population is predicted to grow, its growth begins to slow and the death total begins to climb significantly. Then the population begins to decrease and the death total continues to increase. Basically, the data set predicts that China will not be able to support its growing population, and because of this the death toll will grow. Once the population reaches its threshold, the deaths begin to add up and the population experiences a downturn. India, on the other hand, appears to have the resources to expand its population. It is predicted to have a larger population than China and a lower death total as well.

This last graph is a depiction of life expectancy at birth as compared to population. The data compares past, present, and predicted future life expectancy at birth and population. When I was looking at the data, I expected the US to be right at the top of the list for life expectancy and expected to see China and India in the middle of the pack. However, that was not so. The United States is not at the top and, surprisingly, China and India are both within 10 years of the US's life expectancy. It is also interesting to see the various nations that have steep drop-offs in life expectancy. Rwanda, Somalia, and Cambodia all have life expectancies that drop to under 20 years, with Rwanda's reaching a low of about 10 years at one point. This can be linked to the historical genocides that have happened in those countries in the last 50 years. Another interesting part of the graph is that the longest life expectancy at birth unexpectedly goes to Monaco, at nearly 90 years.

Energy Production - comparing the US and China

This first graph shows the relationship between per-capita income and CO2 emissions. As expected, the U.S. has one of the highest per-capita GDPs. On the other end of the graph is China, which has a relatively low per-capita GDP. Americans, while not really taking all the steps necessary to overcome global warming, are at least aware of how much CO2 we emit in comparison to other countries. So it is understandable that the U.S. is on the high end of the spectrum for both CO2 emissions and GDP per capita. As the years have advanced, we've seen a steady climb in greenhouse gas output in America. That is not uncommon, because most countries show a gradual increase in CO2 output. However, China goes through an incredible increase in the last decade of the graph. Perhaps the economic growth that China has been experiencing has resulted in an increase in its output of greenhouse gases.

This second graph shows energy production in quarts of oil in relation to GDP per capita. On the right side of the graph the U.S. boasts its high GDP per capita. On the left side is China with its very low GDP per capita. The graph itself shows the changes in energy production and GDP per capita over time. As the years progress there is a general trend of increasing GDP and energy production. One interesting piece of the graph is the Russia bubble. Russia is interesting because it exhibits massive energy production due to the nuclear energy it was harnessing for a while. After some of the issues Russia had with it, its bubble falls significantly down the graph. China increases its energy production, and when compared to my other graph you can see that as its CO2 greenhouse gas output has increased, so has its energy production.

Wednesday, January 25, 2012

Yahoo Pipes 1

My pipe is meant to sort through all of the ESPN stories and return just the ones that contain the words NBA or basketball.
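The same filtering idea can be sketched outside of Yahoo Pipes. This is a rough Python equivalent, not the pipe itself. It assumes each story is a dictionary with hypothetical "title" and "description" fields, and that matching should be case-insensitive on whole words. The sample stories are made up.

```python
import re

# Keywords the pipe looks for; whole-word, case-insensitive matching is an assumption.
KEYWORDS = ("nba", "basketball")
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def filter_stories(stories):
    """Keep only stories whose title or description mentions a keyword."""
    return [
        story
        for story in stories
        if PATTERN.search(story.get("title", "") + " " + story.get("description", ""))
    ]

# Made-up feed items standing in for the ESPN stories.
stories = [
    {"title": "NBA trade deadline recap", "description": "Who moved where."},
    {"title": "Spring training notes", "description": "Baseball is back."},
    {"title": "College hoops", "description": "A basketball tournament preview."},
]
matches = filter_stories(stories)
```

Here the first and third stories pass the filter and the baseball story is dropped.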


Yahoo Pipes 1

Wednesday, January 18, 2012

The Big Switch


·     The cloud is very similar to the idea of a power supplier in the 1800s. Businesses do not need to store and manage their own data but can now pay Google to do so on its cloud.
·     We are on the brink of a shift in the way people and businesses store data.
·     It makes more sense for companies to buy into a giant storage facility like Google's cloud than to invest lots of their own money into building their own data warehouse.
·     Cloud computing challenges the assumption that businesses need to store and manage their own data and data warehouses.
·     By storing data in a cloud you can allocate more of your capital to things like innovation and R&D.
·     Clouds allow companies to be more centralized and efficient in their data storage.
·     People's attitudes towards cloud computing are changing for the better and evolving with the technology.
·     Cloud computing and virtualization allow companies to turn what used to be hard assets into cheaper virtual assets.
·     The biggest sign that cloud computing is going to be big is on the consumer side. We have already begun to adapt: consumers no longer have to go out and buy software in physical form, put in the disc, and install the programs.
·     Businesses are following the trend rather than setting it in cloud computing, and are now starting to make the shift that consumers have already made.

Saturday, January 14, 2012

Electric Telegraph


Similarities
1.       Soon after their creation, electric telegraphs could transmit messages across the world almost instantly.
2.       In both the telegraph and the internet, the message is encoded on one end and decoded on the other.
3.       In both cases, lines are used to transfer data from point to point. The internet's lines just happen to be much more complex now.
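
The encode and decode step from the second similarity can be sketched in a few lines. This is only an illustration using International Morse code for the letters A to Z. The function names, and the choice to separate letters with spaces and words with " / ", are assumptions made for the example.

```python
# International Morse code for A-Z (digits and punctuation left out for brevity).
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}
REVERSE = {code: letter for letter, code in MORSE.items()}

def encode(message):
    """Turn text into Morse; characters outside A-Z are skipped."""
    words = message.upper().split()
    return " / ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )

def decode(signal):
    """Turn Morse produced by encode() back into upper-case text."""
    return " ".join(
        "".join(REVERSE[code] for code in word.split())
        for word in signal.split(" / ")
    )

signal = encode("hello world")
text = decode(signal)
```

A telegraph operator did both steps by hand. On the internet, software does the equivalent encoding and decoding automatically.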

Differences
1.       The internet as we know it today does not require a person to decode the message, as was necessary when the telegraph was getting started. Rather, someone writes code, and systems have ways of decoding that message and displaying it on your device.
2.       One obvious difference is the complexity the internet can handle. A telegraph was used for very basic messages, almost like writing letters to people, whereas the internet allows people to send messages with links to other messages that contain huge amounts of data.
3.       Another obvious difference is the visual aspect of the internet. Telegraphs did not allow for things to be shown visually like the internet does with pictures and videos.

Interesting Fact
1.       According to Wikipedia, 15 years passed between the opening of the first commercial Morse telegraph line and the completion of the first transcontinental telegraph.

HTML

Sherlock Holmes: A Game of Shadows

The girl Sherlock likes is kidnapped and dies of TB. His friend gets married and, while on the way to the honeymoon, almost gets killed. They find out that the man who is trying to kill them has been buying up all of the resources needed to sustain a war. Sherlock must infiltrate a gathering and discover the man who is set to kill a political candidate in order to start the war. Sherlock saves the person and ends up killing his arch-nemesis.

Shadows
  1. Robert Downey Jr. is awesome.

  2. Jude Law is awesome.

  3. I thought Rachel McAdams would be in it more.

  4. The first one was pretty good.

  5. I enjoy the mystery genre.

Sherlock Holmes: A Game of Shadows