Data Management and Visualization – Coursera Course
The embedded pdf includes the code along with output.
It's time to get cracking …
The initial part, where the text file “gap minder” is read in and analyzed with “describe”, is self-explanatory. The mean and spread (in terms of standard deviation) can be read directly from its output.
Similarly for the other variable:
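For readers without the PDF at hand, here is a minimal sketch of what that summary step might look like in pandas. The column names `internetuserate` and `urbanrate` are my assumption about how the file labels the two variables, and the rows are made-up placeholders, not real gapminder values.

```python
import pandas as pd

# Placeholder rows for illustration only; these are NOT real gapminder values.
# In practice the frame would come from the data file, e.g. via pd.read_csv(...).
data = pd.DataFrame({
    "internetuserate": [10.0, 20.0, 30.0, 40.0, 50.0],  # assumed column name
    "urbanrate": [30.0, 40.0, 50.0, 60.0, 70.0],        # assumed column name
})

# Coerce to numeric so blank or non-numeric entries become NaN
for col in ["internetuserate", "urbanrate"]:
    data[col] = pd.to_numeric(data[col], errors="coerce")

# describe() reports count, mean, std, min, quartiles and max per column
summary = data[["internetuserate", "urbanrate"]].describe()
print(summary)

# Mean and standard deviation can be picked out of the summary table
internet_mean = summary.loc["mean", "internetuserate"]
internet_std = summary.loc["std", "internetuserate"]
```

Note that the `std` row is the sample standard deviation (divisor n - 1), which is the pandas default.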
As the variables of interest “Internet Rate” and “Urban Rate” are both QUANTITATIVE, we analyze them using Histograms. Options such as the number of bins are varied to study the shape of the data distribution.
And so on for the other variable.
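The bin-count comparison could be sketched roughly as follows with matplotlib. The values are placeholders rather than real data, the series name `internetuserate` is an assumed column name, and the bin counts 5, 10 and 20 are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only; NOT real gapminder data.
internet = pd.Series(
    [5.0, 12.0, 18.0, 25.0, 33.0, 41.0, 48.0, 60.0, 72.0, 85.0],
    name="internetuserate",  # assumed column name
)

bin_counts = [5, 10, 20]  # arbitrary choices, just to compare shapes
fig, axes = plt.subplots(1, len(bin_counts), figsize=(12, 3))

# Draw the same variable once per bin setting, side by side
for ax, bins in zip(axes, bin_counts):
    counts, edges, _ = ax.hist(internet.dropna(), bins=bins)
    ax.set_title(f"{bins} bins")
    ax.set_xlabel("Internet use rate")
    ax.set_ylabel("Number of countries")

fig.tight_layout()
```

Calling `plt.show()` afterwards displays the figure. Too few bins can hide the shape of the distribution, while too many leave mostly empty bars, so it helps to look at a few settings.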
To examine the bivariate relationship, a scatter plot is used.
Instead of using Urban Rate as the independent variable, I have used Internet Rate. The relationship seems to hold this way as well: as the Internet Rate increases, so does the Urban Rate. This may not simply be due to the fact that people move out to bigger cities; it might very well be that smaller cities “grow” or get “modernized” in different aspects, including getting good internet.
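A scatter plot with Internet Rate on the x-axis might be sketched like this. Again the column names are assumed and the rows are placeholders, not real data. The Pearson correlation at the end is an extra I am adding for illustration; it is not part of the original analysis, but it is a handy check because it comes out the same whichever variable is treated as independent.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder rows for illustration only; NOT real gapminder values.
data = pd.DataFrame({
    "internetuserate": [5.0, 20.0, 35.0, None, 80.0],  # assumed column name
    "urbanrate": [25.0, 40.0, 55.0, 60.0, 85.0],       # assumed column name
})

# Keep only rows where both variables are present
pairs = data[["internetuserate", "urbanrate"]].dropna()

fig, ax = plt.subplots()
ax.scatter(pairs["internetuserate"], pairs["urbanrate"])
ax.set_xlabel("Internet use rate")  # treated as the independent variable here
ax.set_ylabel("Urban rate")
ax.set_title("Urban rate vs. internet use rate")

# Pearson correlation is symmetric in its two arguments
r = pairs["internetuserate"].corr(pairs["urbanrate"])
r_swapped = pairs["urbanrate"].corr(pairs["internetuserate"])
print(f"Pearson r = {r:.2f}")
```

A scatter plot and a correlation only describe association, so they cannot say which of the explanations above is the right one.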