Category Archives: Visualization

Visualizing Life Satisfaction data by Multivariate Analysis

This week the OECD relaunched its Better Life Index and provided the data behind it.

I’ve applied multivariate statistical analysis methods to the average values, and you can see the results below. Some quite interesting groups of countries emerged.

The X-axis separates countries with a high Life Satisfaction index from those with a low one. The Y-axis separates countries by job availability.

  • The most satisfied group of countries is in the top right quadrant and consists entirely of highly developed countries.
  • The least satisfied group of countries is in the bottom left quadrant of the plot, with unemployment being the major factor contributing to their dissatisfaction. This is highest for the Eastern European bloc, which has experienced economic difficulties in recent years.
  • Countries in the top left quadrant are less happy than those in the top right quadrant, but not by much. The major factors are a high level of crime and long, hard working hours. The least satisfied in this group is Turkey (farthest on the plot from the Life Satisfaction index). Interestingly, Israel has some of the highest wealth and health indicators and the lowest crime, but at the same time long working hours and worse housing conditions.
  • Countries in the center of the plot are where all indicators balance out. The level of life satisfaction for this group lies in between that of the other groups. It is balanced out by not very high “positive” indicators such as wealth and health and not very high “negative” indicators such as unemployment and crime.
  • Education does not seem to affect life satisfaction as much as the other parameters. It is lowest for the group of countries in the top left quadrant and highest for the group in the top right quadrant, but both of these groups are quite satisfied with life.
A separate analysis of the values for women and men will follow soon.
PLSDA (PLS_Toolbox 6. in Matlab) was used, with autoscaling as the preprocessing option.
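For readers unfamiliar with autoscaling, here is a minimal sketch of the idea in Python rather than in Matlab: each column is mean-centered and divided by its standard deviation, so indicators measured on very different scales get equal weight. The function name `autoscale`, the use of the sample standard deviation, and the numbers are all assumptions for illustration. This is not the PLS_Toolbox code and not the OECD data.

```python
from statistics import mean, stdev


def autoscale(rows):
    """Mean-center each column and scale it to unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical table: rows are countries, columns are two indicators on
# very different scales (an income-like value and an unemployment-like percent).
data = [
    [30000.0, 5.0],
    [20000.0, 9.0],
    [40000.0, 4.0],
    [10000.0, 14.0],
]

scaled = autoscale(data)
for row in scaled:
    print([round(value, 2) for value in row])
```

After this step every column has mean 0 and standard deviation 1, so no single indicator dominates the model simply because of its units.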
Original data used for analysis:

Data analysis for vaccine awareness week

I have combined the following data into one data matrix:
– 2009 vaccination data table (a subset of the most often given vaccines, reflecting the trend);
– American Human Development Index by state, from the American Human Development Project;
– classification of Blue and Red states.

First, I applied PCA to the vaccination rate data alone, using the states' partisan classification as the classes.
No correlation between vaccination rates and partisan class was observed.

Second, Principal Component Analysis was applied to all data (vaccination rates and human development index) with autoscaling.

PC1 captures 39% of the variance in the data and separates samples with a high rank (cumulative index) and high Education, Income and Health indices from those with a low rank. Mostly blue states and some red states (AK, ND, NE, UT, KS) have the highest rank. Vaccination rates do not contribute to PC1 (close to 0), indicating that there is no direct correlation between the human development index and vaccination.

PC2 separates states by vaccination rates. Those at the top of the biplot have higher vaccination rates than those at the bottom. There is a weak correlation (captured in ~16% of the variance in the data) between vaccination rates and the education and income indices and rank.
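To show where statements such as "PC1 captures a given share of the variance" and "a variable does not contribute to PC1" come from, here is a minimal sketch in plain Python. It autoscales a small table, builds the correlation matrix, and finds the first principal component by power iteration. The function names, the choice of power iteration, and the five rows of numbers are assumptions for illustration. They are not the state data or the software used for the analysis above.

```python
import math
from statistics import mean, stdev


def correlation_matrix(rows):
    """Autoscale each column, then return the covariance of the scaled columns."""
    n = len(rows)
    scaled = [
        [(value - mean(col)) / stdev(col) for value in col]
        for col in zip(*rows)
    ]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in scaled]
        for ci in scaled
    ]


def first_component(matrix, iterations=500):
    """Power iteration: return (largest eigenvalue, unit-length loading vector)."""
    p = len(matrix)
    vector = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    eigenvalue = sum(
        vector[i] * sum(matrix[i][j] * vector[j] for j in range(p))
        for i in range(p)
    )
    return eigenvalue, vector


# Hypothetical values: two development-style indices and one vaccination rate.
LABELS = ["education", "income", "vaccination"]
data = [
    [4.0, 4.5, 90.0],
    [5.0, 5.5, 85.0],
    [6.0, 6.0, 92.0],
    [3.0, 3.5, 88.0],
    [7.0, 7.5, 86.0],
]

corr = correlation_matrix(data)
eigenvalue, loadings = first_component(corr)
# For autoscaled data the total variance equals the number of columns.
explained = eigenvalue / len(corr)

for label, loading in zip(LABELS, loadings):
    print(f"{label:12s} PC1 loading {loading:+.2f}")
print(f"PC1 captures {explained:.0%} of the variance")
```

A loading near zero means the variable barely moves along that component, which is the sense in which vaccination rates "do not contribute" to PC1 in the text.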

Four groups of states are identified by PCA:
1. Red states having quite good vaccination rates and a very bad HD index.
2. Mostly blue states and some red states having the best vaccination rates and the best HD index.
3. Purple states having the worst vaccination rates and the worst HD index.
4. Mixed states – some blue, some red and Co with very bad vaccination rates but a high Health Index.

New (milder) version of visualization of sexual data

Map of sexual activities created by PCA

Some additional interpretation is included as well

Visualizing sexual health data from national survey

This is my attempt at the challenge to visualize data from the National Survey of Sexual Health and Behavior.

I applied Principal Component Analysis and used the biplot to create a map, which shows the distribution of various sexual activities and the separation of men and women by age.
The 1st PC (horizontal axis) separates data by gender-specific activities: in the middle there are activities common to both genders, to the left – men-specific and to the right – women-specific activities.
The 2nd PC (vertical axis) separates activities by degree of “advancement”, with masturbation being the most conservative one, common to the very youngest and oldest men and women, and anal sex being the rarest, most common in the most sexually active age group of both men and women.

Gracenote music maps

Gracenote: Music Maps

Not that different – at the end


Me on

1. kartyush’s eclectic score is


If your score is small (lower than 70) your musical preferences are very limited, and if it is large (larger than 80), then you have an eclectic musical preference.

2. AEP for kartyush is: 4.09 – If it is greater than 4 you have very diverse taste in music


Artyushkova at TouchGraph

TouchGraph | Products: Google Browser
Use this free Java application to explore the connections between related websites.


Map of science image in the journal Nature

W. Bradford Paley: Map of science image in the journal Nature

Image was constructed by sorting roughly 800,000 scientific papers (shown as white dots) into 776 different scientific paradigms (red circular nodes) based on how often the papers were cited together by authors of other papers. Links (curved lines) were made between the paradigms that shared common members, then treated as rubber bands, holding similar paradigms nearer one another when a physical simulation had every paradigm repel every other: thus the layout derives directly from the data. Larger paradigms have more papers. Labels list common words unique to each paradigm.
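The layout idea described above (links act as rubber bands while every node repels every other) can be sketched as a tiny physical simulation. This is only an illustration under stated assumptions: the function `layout`, the inverse-square repulsion, the linear spring force, the step size, and the three-node example are all hypothetical, and nothing here is taken from the software that produced the published map.

```python
import math


def layout(positions, links, repulsion=1.0, stiffness=1.0, step=0.05, iterations=300):
    """Relax 2-D node positions under pairwise repulsion and link springs."""
    pos = [list(p) for p in positions]
    n = len(pos)
    for _ in range(iterations):
        forces = [[0.0, 0.0] for _ in range(n)]
        # Every node repels every other node (inverse-square law, an assumption).
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy)
                push = repulsion / dist ** 3  # (k / d^2) times the unit vector (dx, dy) / d
                forces[i][0] -= push * dx
                forces[i][1] -= push * dy
                forces[j][0] += push * dx
                forces[j][1] += push * dy
        # Linked nodes pull on each other like rubber bands (pull grows with distance).
        for i, j in links:
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            forces[i][0] += stiffness * dx
            forces[i][1] += stiffness * dy
            forces[j][0] -= stiffness * dx
            forces[j][1] -= stiffness * dy
        # Move every node a small step along its net force.
        for i in range(n):
            pos[i][0] += step * forces[i][0]
            pos[i][1] += step * forces[i][1]
    return pos


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Three hypothetical "paradigms": A and C are each linked to B, but not to each other.
start = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)]
final = layout(start, links=[(0, 1), (1, 2)])
print("A-B distance:", round(distance(final[0], final[1]), 3))
print("A-C distance:", round(distance(final[0], final[2]), 3))
```

In this toy case the linked pairs settle closer together than the unlinked pair, which is the effect that keeps similar paradigms near one another in the map.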


Do you keep track of these data in your life?

feltron vii
a personal annual report documenting a multitude of events, including the
‘airmiles traveled’, ‘number of emails sent’, ‘number of photos taken’,
‘animals eaten’, ‘books read’, ‘most frequented bar’, ‘best meal’,
‘miles run’, ‘plants killed’ or ‘beverages drunk by type’.
