## 1.3.2.6 Lab – Demonstrate Spreadsheet Limitations in Data Analysis (Instructor Version)

### Objectives

Observe the limitations of a spreadsheet when it is used for data analysis.

• Part 2: Determine the Root Bridge

### Background / Scenario

There are massive amounts of open data available on the Internet for you to download and use to perform analysis. In this lab, you will explore the limitations of spreadsheets in data analysis using an open dataset from the City of Seattle.

### Required Resources

• PC device with high-speed Internet access
• Spreadsheet program, such as Microsoft Excel, Google Sheet, LibreOffice Calc, Apple Numbers, or OpenOffice Calc

You will also explore some of the characteristics of this dataset.

b. Click View Data.

This dataset lists the road and air temperatures from the sensors embedded on bridges and surface
streets within the city limits of Seattle. The data is updated every fifteen minutes.

How many fields (columns) are in the dataset? How many records (rows)?
6(columns) and 100 rows__

What are the field names?

### Part 2: Explore the Limitations of a Spreadsheet

a. Open your spreadsheet program. Microsoft Excel is used in this example.

c. The Text Import Wizard displays. In step 1 of 3, click Next to continue.

d. In step 2 of 3, select Comma as the delimiter and unselect Tab. Click Next to continue.

e. In step 3 of 3, click Finish to continue.

f. Click OK to import the data.

What is the warning message you received?
The size of data is very large than one worksheet can fit it.

How will this limitation impede your analysis of the data?
Need many discrete worksheet to analysis a data and it’s difficult on big data.

g. Open a new spreadsheet. Try to reach the maximum number of rows in the spreadsheet. What is the maximum number of rows in the spreadsheet of your choice?
1048576 in excel 2019.

Compare the maximum number of rows in the spreadsheet to the number of records in the Road Weather Information Stations dataset from https://data.seattle.gov. Using this observation, explain the limitation of the spreadsheet.

### Reflection

If the popular spreadsheet programs cannot handle larger data sets, what data analysis tools are available? Use the Internet to search for some possible tools.
Microsoft HDInsight: is a Spark and Hadoop service in the cloud. It provides big data cloud offerings in two categories, Standard and Premium. It provides an enterprise-scale cluster for the organization to run their big data workloads.

Spark: Apache Spark is a powerful open source big data analytics tool. It offers over 80 high-level operators that make it easy to build parallel apps. It is used at a wide range of organizations to process large datasets.