Coursera Google Data Analytics Professional Certificate: Data Analysis with R Programming (Week 3) Quiz Answers - Working with data in R


3. Working with data in R

Question 1

A data analyst is creating a new data frame. Their dataset has dates, currency, and text strings. What characteristic of data frames is this an instance of?

Columns should contain the same number of items
Columns should be named
Data stored can be many different types
Variables should be named

A data frame is a collection of columns. Characteristics of data frames include: all columns should be named, data stored can be many different types, and all columns should contain the same number of items. The dataset in question has a variety of data types, which is related to the idea that data stored can be many different types.
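As a minimal sketch of these characteristics, the base R snippet below builds a small data frame with a date column, a currency column, and a text column. The name sales and its values are hypothetical and used only for illustration.

```r
# Hypothetical sales data: every column is named, has its own type,
# and contains the same number of items (3).
sales <- data.frame(
  order_date = as.Date(c("2021-01-05", "2021-01-06", "2021-01-07")),
  amount_usd = c(19.99, 5.50, 120.00),
  customer   = c("Ana", "Ben", "Chloe"),
  stringsAsFactors = FALSE
)

# One class per column: Date, numeric, character
col_types <- sapply(sales, class)
print(col_types)
```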

Question 2

A data analyst is considering using tibbles instead of basic data frames. What are some of the limitations of tibbles? Select all that apply.

Tibbles can never change the input type of the data
Tibbles can overload a console
Tibbles can never create row names
Tibbles won’t automatically change the names of variables

Tibbles are useful when working with large datasets because they print only a preview of the data instead of overloading the console. However, tibbles never change the input type of the data, never create row names, and never change the names of variables.
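A short sketch of two of these behaviors, assuming the tibble package is installed. The column name `unit price` is a made-up example; mtcars is a built-in dataset that has row names.

```r
library(tibble)

# Base data frame: check.names = TRUE (the default) rewrites the column name.
df <- data.frame(`unit price` = c(1.5, 2.0))
names(df)   # "unit.price"

# Tibble: the column name is kept exactly as typed.
tb <- tibble(`unit price` = c(1.5, 2.0))
names(tb)   # "unit price"

# Converting a data frame that has row names drops them.
cars_tb <- as_tibble(mtcars)
has_rownames(cars_tb)   # FALSE
```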

Question 3

A data analyst is working with a large data frame. It contains so many columns that they don’t all fit on the screen at once. The analyst wants a quick list of all of the column names to get a better idea of what is in their data. What function should they use?

head()
str()
colnames()
mutate()

The colnames() function returns a character vector of all the column names in a data frame for easy reference.
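For example, using the built-in ToothGrowth dataset (chosen here only because it ships with R):

```r
# ToothGrowth is part of the datasets package that loads with R.
cols <- colnames(ToothGrowth)
print(cols)   # "len"  "supp" "dose"
```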

Question 4

A data analyst is working with the ToothGrowth dataset in R. What code chunk will allow them to get a quick summary of the dataset?

separate(ToothGrowth)
glimpse(ToothGrowth)
min(ToothGrowth)
colnames(ToothGrowth)

The code chunk is glimpse(ToothGrowth). The glimpse() function provides the analyst with a quick summary of the data in the ToothGrowth dataset. It shows the number of rows and columns, the name and type of every column, and the first few values in each column.
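A minimal sketch, assuming the dplyr package (which provides glimpse()) is installed:

```r
library(dplyr)

# Prints the row and column counts, then one line per column
# with its type and first few values.
glimpse(ToothGrowth)

# The same dimensions, retrieved programmatically.
dims <- dim(ToothGrowth)   # 60 rows, 3 columns
```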

Question 5

A data analyst is working with the penguins dataset. What code chunk does the analyst write to make sure all the column names are unique and consistent and contain only letters, numbers, and underscores?

drop_na(penguins)
rename(penguins)
clean_names(penguins)
select(penguins)

The code chunk is clean_names(penguins). The clean_names() function ensures that the column names in the data frame are unique and contain only letters, numbers, and underscores.
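The sketch below assumes clean_names() comes from the janitor package, and uses a hypothetical data frame named messy with deliberately untidy column names instead of the real penguins data.

```r
library(janitor)

# Hypothetical data frame with spaces, parentheses, dots, and capitals
# in its column names.
messy <- data.frame(
  `Bill Length (mm)` = c(39.1, 46.5),
  `Body.Mass.G`      = c(3750, 3500),
  check.names = FALSE
)

cleaned <- clean_names(messy)

# The names now use only lowercase letters, digits, and underscores.
print(names(cleaned))
```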

Question 6

A data analyst is working with the penguins data. They write the following code:

penguins %>%

The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. What code chunk does the analyst add to create a data frame that only includes the Gentoo species?

filter(species == "Gentoo")
filter(Gentoo == species)
filter(species <- "Gentoo")
filter(species == "Adelie")

The code chunk is filter(species == "Gentoo"). The filter() function lets the data analyst specify which rows of the data they want to keep. Two equal signs in an argument mean "exactly equal to." Using == rather than the assignment operator <- returns only the rows for Gentoo penguins.
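To try the pipeline without the full penguins dataset, the sketch below uses a small hypothetical stand-in called penguins_sample and assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for the penguins data.
penguins_sample <- data.frame(
  species     = c("Adelie", "Gentoo", "Chinstrap", "Gentoo"),
  body_mass_g = c(3750, 5000, 3800, 5400)
)

# Keep only the rows where species is exactly "Gentoo".
gentoo <- penguins_sample %>%
  filter(species == "Gentoo")

print(gentoo)
```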

Question 7

A data analyst is working with the penguins dataset. They write the following code:

penguins %>%
    group_by(species) %>%

What code chunk does the analyst add to find the mean value for the variable body_mass_g?

summarize(mean(body_mass_g))
summarize(=body_mass_g)
summarize(max(body_mass_g))
summarize(body_mass_g(mean))

The code chunk is summarize(mean(body_mass_g)). The summarize() function gives high-level information about a dataset. Because it follows group_by(species), it returns one mean body mass per species.
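A minimal sketch of the grouped summary, again with a hypothetical penguins_sample data frame and assuming dplyr is installed. The column name mean_mass_g is an illustrative choice. Note that mean() returns NA when a column contains missing values unless na.rm = TRUE is passed.

```r
library(dplyr)

# Hypothetical stand-in for the penguins data.
penguins_sample <- data.frame(
  species     = c("Adelie", "Adelie", "Gentoo", "Gentoo"),
  body_mass_g = c(3700, 3800, 5000, 5400)
)

# One row per species, with the mean body mass for that group.
mass_by_species <- penguins_sample %>%
  group_by(species) %>%
  summarize(mean_mass_g = mean(body_mass_g, na.rm = TRUE))

print(mass_by_species)   # Adelie 3750, Gentoo 5200
```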

Question 8

A data analyst is working with a data frame named salary_data. They want to create a new column named wages that includes data from the rate column multiplied by 40. What code chunk lets the analyst create the wages column?

mutate(salary_data, wages = rate * 40)
mutate(salary_data, rate = wages * 40)
mutate(wages = rate * 40)
mutate(salary_data, wages = rate + 40)

The code chunk is mutate(salary_data, wages = rate * 40). The analyst can use the mutate() function to create a new column called wages that includes data from the rate column multiplied by 40. The mutate() function can create a new column without affecting any existing columns.
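As an illustration, with made-up values for salary_data and assuming dplyr is installed:

```r
library(dplyr)

# Hypothetical hourly rates.
salary_data <- data.frame(
  employee = c("A", "B"),
  rate     = c(20, 32.5)
)

# Add a wages column; the existing columns are left unchanged.
salary_data <- mutate(salary_data, wages = rate * 40)

print(salary_data)   # wages: 800 and 1300
```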

Question 9

A data analyst is working with a data frame named customers. It has separate columns for area code (area_code) and phone number (phone_num). The analyst wants to combine the two columns into a single column called phone_number, with the area code and phone number separated by a hyphen. What code chunk lets the analyst create the phone_number column?

unite(customers, "phone_number", area_code, phone_num, sep="-")
unite(customers, area_code, phone_num, sep="-")
unite(customers, "phone_number", area_code, phone_num)
unite(customers, "phone_number", area_code, sep="-")

The code chunk unite(customers, "phone_number", area_code, phone_num, sep="-") lets the analyst create the phone_number column. The unite() function lets the analyst combine the area code and phone number data into a single column. In the parentheses of the function, the analyst writes the name of the data frame, then the name of the new column in quotation marks, followed by the names of the two columns they want to combine. Finally, the argument sep="-" places a hyphen between the area code and phone number data in the phone_number column.
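A small runnable sketch, assuming the tidyr package (which provides unite()) is installed. The customer names and numbers are invented for the example.

```r
library(tidyr)

# Hypothetical customer records.
customers <- data.frame(
  name      = c("Ana", "Ben"),
  area_code = c("212", "415"),
  phone_num = c("555-0100", "555-0199")
)

# Combine the two columns; by default the input columns are removed.
customers <- unite(customers, "phone_number", area_code, phone_num, sep = "-")

print(customers$phone_number)   # "212-555-0100" "415-555-0199"
```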

Question 10

A data analyst wants to summarize their data with the sd(), cor(), and mean() functions. What kind of measures are these?

Numerical
Summary
Statistical
Standard

Standard deviation, correlation, mean, maximum, and minimum are statistical measures which can be used to summarize data.
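For instance, these base R functions can be applied to the len column of the built-in ToothGrowth dataset (picked here only as a convenient numeric column):

```r
# Statistical measures for a single numeric column.
len_mean <- mean(ToothGrowth$len)
len_sd   <- sd(ToothGrowth$len)
len_min  <- min(ToothGrowth$len)
len_max  <- max(ToothGrowth$len)

print(c(mean = len_mean, sd = len_sd, min = len_min, max = len_max))
```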

Question 11

In R, which statistical measure demonstrates how strong the relationship is between two variables?

Maximum
Standard deviation
Correlation
Average

Correlation measures how strong the relationship between two variables is. This is represented by the cor() function.
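A quick sketch with invented vectors shows the two extremes: cor() returns a value between -1 and 1, where 1 is a perfect positive linear relationship and -1 a perfect negative one.

```r
# Hypothetical study hours and two made-up score patterns.
hours      <- c(1, 2, 3, 4, 5)
score_up   <- 10 * hours + 50    # rises in step with hours
score_down <- 100 - 10 * hours   # falls in step with hours

r_up   <- cor(hours, score_up)     # 1
r_down <- cor(hours, score_down)   # -1
```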

Question 12

A data analyst is studying weather data. They write the following code chunk:

bias(actual_temp, predicted_temp)

What will this code chunk calculate?

The average difference between the actual and predicted values
The minimum difference between the actual and predicted values
The maximum difference between the actual and predicted values
The total average of the values

The bias() function can be used to calculate the average amount a predicted outcome and actual outcome differ in order to determine if the data model is biased.
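The quiz does not say which package supplies bias(); the SimDesign package is one that provides a function with this name, which is an assumption here. To avoid depending on it, the sketch below computes the average difference by hand in base R, using invented temperatures.

```r
# Hypothetical temperature readings.
actual_temp    <- c(70, 72, 68, 75)
predicted_temp <- c(71, 73, 70, 74)

# Average of (actual - predicted); a value near 0 suggests little bias.
avg_diff <- mean(actual_temp - predicted_temp)

print(avg_diff)   # -0.75
```

A negative result here means the predictions ran higher than the actual values on average.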


Feel free to ask questions in the comment section. I will do my best to answer them.
If you find this helpful, kindly comment and share the post.
It is the simplest way to encourage me to keep doing this work.

Thanks & Regards,
- Wolf
