Big Data & Analytics Chapter 4 Quiz Answers – Advanced Data Analytics and Machine Learning

1. When you follow the scientific method, which step would occur after testing the hypotheses through experimentation?

  • Communicate the results of the process.
  • Analyze data from an experiment to draw a conclusion.
  • Ask a question about an observation.
  • Perform research.

Explanation: The scientific method is commonly used in scientific discovery and contains the following steps:

Step 1. Ask a question about an observation such as what, when, how, or why.
Step 2. Perform research.
Step 3. Form a hypothesis from this research.
Step 4. Test the hypothesis through experimentation.
Step 5. Analyze the data from the experiments to draw a conclusion.
Step 6. Communicate the results of the process.

2. What is the most commonly used statistical method for analyzing data?

  • regression analysis
  • mean estimation
  • mean analysis
  • sample proportion

Explanation: Regression analysis is the most commonly used statistical method for analyzing data and there are many regression models available. Regression analysis can look for correlations between one predictor variable and one target variable or for correlations between more than one predictor variable and a target variable.
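As a rough illustration of the single-predictor case, the sketch below uses the Pearson correlation coefficient to measure how strongly one predictor variable is linearly correlated with one target variable. The function name `pearson_r` and the sample data are invented for illustration and are not part of the course material.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: hours studied (predictor) and exam score (target).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(round(r, 3))
```

A value of r close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or none.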

3. If the results of a study do not align with previous studies, what question should an evaluator ask?

  • Are there any experts that disagree with the findings?
  • Who paid for the research study?
  • Can the study be replicated to verify the findings?
  • Did the study have an appropriate sample size?

Explanation: When following the evaluation guidelines, if a study does not produce findings that confirm or align with the results of current studies in the field, the study should be replicated to verify the reliability of the findings.

4. In a linear regression, which variable is also known as the target or response variable?

  • independent
  • predictor
  • first
  • dependent

Explanation: The dependent variable is also known as the target or response variable. The independent variable is also known as the predictor or explanatory variable.

5. When a number of items are grouped together, which type of machine learning algorithm can determine which items in the group predict the presence of other items?

  • clustering
  • classification
  • regression
  • association

Explanation: Two types of unsupervised machine learning algorithms are association and clustering. Given a number of items that are grouped together, association algorithms determine which items in the group predict the presence of other items. Given many items, clustering algorithms determine which items most often occur together in clusters.
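One way to make the association idea concrete is the confidence of a rule such as "bread implies butter": of the groups that contain the first item, what fraction also contain the second? The sketch below assumes each group is a Python set. The function name `rule_confidence` and the shopping baskets are hypothetical.

```python
def rule_confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing `antecedent` that also contain `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Keep only the groups that contain every antecedent item.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        return 0.0
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)

# Hypothetical groups of items (for example, shopping baskets).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]
conf = rule_confidence(baskets, {"bread"}, {"butter"})
print(round(conf, 2))
```

A high confidence suggests that the presence of the first item predicts the presence of the second within these groups.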

6. What is the goal of linear regression?

  • to compute a line that interpolates the data, and which can be expressed as a weighted average of the predictor variables and any other function
  • to provide a formula that does not require validation
  • to provide a summary of the data
  • to construct a flow chart

Explanation: Linear regression is used for predicting a value based on gathered data. In a regression analysis, the trend line is drawn on a scatter plot with the target variable on the y-axis and the independent variable on the x-axis.
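A minimal sketch of how such a trend line can be computed, assuming ordinary least squares with a single predictor. The function name `fit_line` and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: predictor values on the x-axis, target values on the y-axis.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# Use the fitted line to predict the target for a new predictor value.
predicted = slope * 5 + intercept
print(slope, intercept, predicted)
```

The fitted line is only a model of the gathered data, so its predictions should still be validated against new observations.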

7. What are two types of supervised machine learning algorithms? (Choose two.)

  • mean
  • clustering
  • regression
  • classification
  • association
  • mode

Explanation: Two algorithms used with supervised machine learning are classification and regression. Supervised machine learning algorithms are the most common algorithms used in big data analytics.

8. What type of error has occurred when a data scientist records a measurement incorrectly after viewing the correct value on the measuring device?

  • systematic
  • gross
  • random
  • instrumental

Explanation: The different types of errors in measurement include the following:

  • Instrumental – Every device is limited in how precise it can be.
  • Gross – An incorrect value is accidentally recorded after the correct value is viewed.
  • Random – The measuring device is correctly measuring an item and providing a varying value.
  • Systematic – The measuring tool is not correctly calibrated.
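The difference between random and systematic error can be sketched with a small simulation. It assumes the random error is Gaussian noise and the systematic error is a constant offset from a miscalibrated tool. All constants below are hypothetical.

```python
import random
import statistics

TRUE_VALUE = 100.0        # hypothetical true length in millimetres
CALIBRATION_OFFSET = 2.0  # hypothetical miscalibration of the tool

rng = random.Random(42)   # fixed seed so the run is repeatable

# Random error: readings scatter around the true value.
random_readings = [TRUE_VALUE + rng.gauss(0, 0.5) for _ in range(10_000)]

# Systematic error: every reading is shifted by the same offset.
systematic_readings = [r + CALIBRATION_OFFSET for r in random_readings]

random_bias = statistics.fmean(random_readings) - TRUE_VALUE
systematic_bias = statistics.fmean(systematic_readings) - TRUE_VALUE

print(f"bias with random error only: {random_bias:+.3f}")
print(f"bias with systematic error:  {systematic_bias:+.3f}")
```

Averaging many readings shrinks the effect of random error, but the systematic offset remains until the tool is recalibrated.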

9. Which type of regression analysis is often used to model variables that have an exponential relationship?

  • mean
  • polynomial
  • median
  • nonlinear

Explanation: Nonlinear regression analysis is often used to model variables that have an exponential relationship. A nonlinear regression plot may appear as a set of points arranged along a curved path.
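As one hedged example, an exponential relationship y = a * exp(b * x) can be fitted by taking the logarithm of y and running a straight-line least-squares fit on (x, ln y). This is a common linearization shortcut, not a general nonlinear least-squares solver, and it assumes every y value is positive. The function name `fit_exponential` and the data are invented.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on (x, ln y). Requires y > 0."""
    if any(y <= 0 for y in ys):
        raise ValueError("all y values must be positive")
    log_ys = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_ly = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, log_ys))
    b = sxy / sxx                         # slope of the line in log space
    a = math.exp(mean_ly - b * mean_x)    # intercept transformed back
    return a, b

# Hypothetical noise-free data generated from y = 2 * exp(0.5 * x).
xs = [0, 1, 2, 3, 4]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(round(a, 3), round(b, 3))
```

With noisy data the log transform changes how errors are weighted, so a dedicated nonlinear fitting routine may give different estimates.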

10. Which type of information can distort the results of an analysis, so that careful consideration should be given to its removal from a data set?

  • z-axis
  • outliers
  • units of measurement
  • azimuth

Explanation: Outliers include corrupt or distorted data that deviates far from expected values and can distort the results of an analysis. After careful consideration has been given, these data points are frequently removed from the dataset.
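One common way to flag candidate outliers is the interquartile range (IQR) rule, sketched below. The 1.5 multiplier is a conventional default rather than something the quiz specifies. The function name `iqr_outliers` and the readings are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
outliers = iqr_outliers(readings)
print(outliers)
```

Flagged values are only candidates. Whether to remove them still calls for the careful consideration described above.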

11. When is an experiment considered reliable?

  • if someone else can modify the experiment and achieve the same conclusions
  • if someone else can modify the experiment and achieve similar conclusions
  • if someone else can repeat the experiment and find the same conclusion
  • if someone else can repeat the experiment and find different conclusions

Explanation: An experiment is considered reliable if someone else can repeat it and achieve the same results as the original scientist achieved.

12. Refer to the exhibit. What is the purpose of the blue sphere?

  • to display the mean
  • to indicate data clusters
  • to measure true error
  • to categorize historical data

Explanation: A scientist must calculate a decision boundary to detect anomalies. Anomalous data points are points that lie beyond the decision boundary sphere.
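A minimal sketch of a spherical decision boundary. It assumes the sphere is centered on the centroid of known-normal points and that its radius is the largest observed distance from the centroid, scaled by a safety margin. Both choices are illustrative assumptions, as are the names `fit_sphere` and `is_anomaly` and the sample points.

```python
import math

def fit_sphere(points, margin=1.2):
    """Return (center, radius): the centroid and margin * the largest distance to it."""
    dims = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(dims))
    radius = margin * max(math.dist(p, center) for p in points)
    return center, radius

def is_anomaly(point, center, radius):
    """A point is anomalous if it lies beyond the decision boundary sphere."""
    return math.dist(point, center) > radius

# Hypothetical three-dimensional measurements considered normal.
normal_points = [(1, 1, 1), (1, -1, 1), (-1, 1, -1), (-1, -1, -1)]
center, radius = fit_sphere(normal_points)

print(is_anomaly((0.5, 0.5, 0.5), center, radius))  # inside the sphere
print(is_anomaly((3, 3, 3), center, radius))        # beyond the sphere
```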

13. A researcher has measured the reliability of a test using the parallel-forms method. What is the expected result of this measurement?

  • How similar are the scores of two different tests that are created from the same content domain?
  • What is the variation of scores for different items in the same test?
  • How much variation exists between scores for the same person taking a test multiple times?
  • How similarly do different people score on the same test?

Explanation: The four different types of reliability that a scientist could examine are as follows:

  • Inter-rater – How similarly do different people score on the same test?
  • Test-retest – How much variation exists between scores for the same person taking a test multiple times?
  • Parallel-forms – How similar are the scores of two different tests that are created from the same content domain?
  • Internal consistency – What is the variation of scores for different items in the same test?

14. Which type of machine learning algorithm uses data sets verified by experts as its learning basis?

  • supervised
  • clustering
  • routing
  • association

Explanation: Supervised machine learning algorithms can learn from a dataset that has already been processed by people. Two types of algorithms used with supervised machine learning are regression algorithms and classification algorithms.
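To illustrate learning from data that people have already labeled, the sketch below uses a one-nearest-neighbor classifier, which is among the simplest classification algorithms. The function name `nearest_neighbor_label`, the features, and the labels are all hypothetical.

```python
import math

def nearest_neighbor_label(labeled_examples, features):
    """Return the label of the training example closest to `features`."""
    _, label = min(labeled_examples, key=lambda ex: math.dist(ex[0], features))
    return label

# Hypothetical training set: (features, label) pairs labeled by a person.
training = [
    ((1.0, 1.2), "small"),
    ((0.8, 1.0), "small"),
    ((5.0, 5.5), "large"),
    ((5.2, 4.9), "large"),
]

print(nearest_neighbor_label(training, (0.9, 1.1)))
print(nearest_neighbor_label(training, (4.8, 5.0)))
```

The algorithm never sees a rule for "small" or "large". It relies entirely on the labels supplied with the training data.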

15. Which type of reliability would a scientist measure if the scientist wants to examine the variation between exam scores for a person taking a single test multiple times?

  • internal consistency
  • inter-rater
  • test-retest
  • parallel-forms

Explanation: The four different types of reliability that a scientist could examine include the following:

  • Inter-rater – How similarly do different people score on the same test?
  • Test-retest – How much variation exists between scores for the same person taking a test multiple times?
  • Parallel-forms – How similar are the scores of two different tests that are created from the same content domain?
  • Internal consistency – What is the variation of scores for different items in the same test?
The correct answer is: test-retest
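As a rough sketch of the test-retest idea, the variation between one person's scores on repeated attempts can be summarized with the sample standard deviation. This is only one possible measure, and the function name `retest_spread`, the people, and the scores are invented.

```python
import statistics

def retest_spread(scores_by_person):
    """Map each person to the sample standard deviation of their repeated scores."""
    return {
        person: statistics.stdev(scores)
        for person, scores in scores_by_person.items()
    }

# Hypothetical scores for the same test taken three times by each person.
attempts = {
    "alice": [80, 82, 81],
    "bob": [60, 75, 90],
}
spread = retest_spread(attempts)
print(spread)
```

A small spread for most people suggests the test gives consistent results over repeated administrations.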

