18.104.22.168 Lab – Basic Data Analysis (Instructor Version)
Use very simple methods to describe existing data, fill in missing data values, and make simple predictions.
- Part 1: Learn how to Use Data as Information
- Part 2: Plot data and predict values
Data is meaningless in and of itself. Information is meaningful and useful. Data only becomes information when it is used in context to answer specific questions. In this lab, you will use graphs of existing data to estimate missing values and to predict values based on trends.
- PC or mobile device with Internet access
- Browser capable of playing a video from the Internet.
- Audio capability to listen to video narration.
Part 1: Learn how to Use Data as Information
Data analysis can occur in many different ways. The ultimate goal is to discover something in the data that gives insight into what has happened or to predict what may happen in the future. Descriptive statistics summarizes what happened and provides the data in a numeric or graphical way. Predictive analytics answers the question of what may happen in the future based on past data.
Step 1: Describe the data.
a. Review the data chart shown on Worksheet 1.
b. Over the 110-year period for the data set, what is the range (lower and upper limits) of median ages for males and females at first marriage?
Male median age: 22.8 to 26.8
Female median age: 20.3 to 25.1
c. The range is a type of descriptive statistic that summarizes the data. It has been presented above in a numerical format. But if you need to look at the trend of the data, it may be better to graph or plot the data.
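As a minimal sketch, the range described in step b could be computed as shown here. The function name `value_range` and the sample values are assumptions made for illustration; the values are placeholders, not the Worksheet 1 data.

```python
def value_range(values):
    """Return the (lowest, highest) pair for a sequence of numbers."""
    return min(values), max(values)


# Placeholder values only, NOT the Worksheet 1 data.
sample_median_ages = [24.0, 23.5, 22.0, 25.5]

low, high = value_range(sample_median_ages)
print(f"Median age: {low} to {high}")
```

The same function would be applied once to the male column and once to the female column of the worksheet.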
Step 2: Plot the data.
a. Plot the data on the chart that is found on Worksheet 1 at the end of this lab. Use different colors for the male and female data. Do not connect the data points until you are told to do so in the video in Step 3.
Step 3: Perform simple data prediction.
a. Khan Academy is an excellent source for lessons on a wide range of topics including statistics and statistical concepts. You will watch a short video and follow along with the narrator using the data table and chart provided at the end of this lab. Navigate to https://youtu.be/aVDiAGZmcPo and follow the Predicting with Linear Models video provided by Khan Academy. Pause the video and complete the activities along with the video instructor. Use the plot you created.
Watch the entire video. In this lab, you will work only with the first dataset that the instructor discusses. In the video, the instructor demonstrates how to use data points as information to create new estimated data points.
What is a linear model?
Using a line to describe a trend in data.
The instructor in the video teaches the processes of interpolation and extrapolation as tools for estimating or predicting data in a linear model. Define each term.
Interpolation: estimating a value that falls between two known data points.
Extrapolation: looking at the trend in the last known data points and extending it to estimate what might happen if that trend continues.
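Both terms can be illustrated with one straight line through two known points. The sketch below is one possible way to write it; the function name `linear_estimate` and the two sample points are hypothetical and are not taken from the worksheets or the video.

```python
def linear_estimate(x0, y0, x1, y1, x):
    """Estimate y at x using the straight line through (x0, y0) and (x1, y1)."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Hypothetical known points: a value of 20.0 in 1990 and 24.0 in 2000.
# Interpolation: 1995 lies between the two known years.
interpolated = linear_estimate(1990, 20.0, 2000, 24.0, 1995)

# Extrapolation: 2010 lies beyond the last known year.
extrapolated = linear_estimate(1990, 20.0, 2000, 24.0, 2010)

print(interpolated, extrapolated)
```

The only difference between the two calls is whether the requested year falls inside or outside the range of the known points.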
What are two interesting observations that the instructor in the video makes regarding the trends in the median age of marriage and the ages of the males and females who marry?
1- The median age at first marriage for both males and females fell lower and lower until 1960 and then began to rise. This means that around 1960 both males and females were marrying at a younger age than in other years.
2- The difference between the median ages of males and females grew smaller and smaller through 2000.
Part 2: Plot data and Predict Values
Other social trends can make use of the simple linear model that was demonstrated in the video. In this section of the lab, you will interpolate and extrapolate values from a new data set.
a. Plot the data on Worksheet 2 at the end of the lab. Use different colors for the two different variables.
b. Note that the data was collected at different intervals. Before 2000, the data was collected every ten years; after 2000, it was collected every five years. Interpolate values for the three missing years. Plot your values.
|Missing Year||Women hours||Men hours|
c. Extrapolate values for the year 2020 by creating a line that best summarizes the values for the previous five periods.
For men, it will be about 12.2 hours per week, and for women, about 15.5 hours per week.
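On paper, the line in step c is drawn by eye. One common way to compute such a line is an ordinary least-squares fit, sketched below. This is a swapped-in technique, not the method used in the video, and the function names and the five sample periods are hypothetical, not the Worksheet 2 data.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Hypothetical values for five periods, NOT the Worksheet 2 data.
years = [1995, 2000, 2005, 2010, 2015]
hours = [10.0, 10.5, 11.0, 11.5, 12.0]

slope, intercept = fit_line(years, hours)
estimate_2020 = predict(slope, intercept, 2020)
print(round(estimate_2020, 1))
```

A line fitted this way will usually differ slightly from one drawn by hand, so the result should be treated as an estimate rather than the single correct answer.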
d. Another kind of information that can be derived from this data is the gap between the number of hours of housework for men and the number of hours of housework for women. This will display another trend regarding the equality between men and women over this period. Complete the table below by filling it in with the amount of time that men do housework subtracted from the amount of time that women do housework.
|Date||Men housework hours/week||Women housework hours/week||Women hours – Men hours|
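The last column of the table is a simple subtraction for each row. A minimal sketch follows, assuming a hypothetical helper named `housework_gap`; only the 2020 row is filled in, using the extrapolated answers from step c.

```python
def housework_gap(women_hours, men_hours):
    """Return women's weekly hours minus men's weekly hours, rounded to 0.1."""
    return round(women_hours - men_hours, 1)


# Only the 2020 row is shown, using the extrapolated answers from step c.
# The remaining rows would come from the Worksheet 2 table.
rows = [
    # (year, men hours/week, women hours/week)
    (2020, 12.2, 15.5),
]

for year, men, women in rows:
    print(year, men, women, housework_gap(women, men))
```

A gap of 0 in this column would mean that men and women did the same number of hours that year.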
e. Graph the calculated values on the chart provided on Worksheet 3 at the end of this lab.
What has been the trend for equality between men and women doing housework?
I do not believe there is equality in this area, since women still do more housework than men.
If men and women were completely equivalent in the amount of housework they do per week, in 2020, where would the next data point be plotted?
In the IoT, Big Data comes from many sources. Sometimes values are missing because a sensor temporarily lost connectivity or data points were lost in transmission. Interpolation can serve as one strategy for replacing missing data. Extrapolation is used to predict values for events that have not yet occurred. Because the IoT yields so much data, predictive analytic models can be built that reliably see into the future by extrapolating trends from historical data.
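The sensor case described above can be sketched as follows. This is one possible approach, assuming readings are taken at evenly spaced intervals and a lost reading is recorded as `None`; the function name `fill_missing` and the sample readings are hypothetical.

```python
def fill_missing(readings):
    """Return a copy of readings with None gaps filled by linear interpolation.

    Assumes evenly spaced readings. Gaps at the very start or end are left
    as None, because they have a known neighbor on only one side.
    """
    filled = list(readings)
    known = [i for i, value in enumerate(filled) if value is not None]
    # Walk each pair of neighboring known readings and fill the gap between them.
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical temperature readings with three lost values.
readings = [20.0, None, 22.0, None, None, 25.0]
print(fill_missing(readings))
```

Leaving leading and trailing gaps unfilled is a deliberate choice here: filling them would be extrapolation rather than interpolation, which carries more uncertainty.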