Data Science Interview Questions and Answers Set 7

61. How will you combine multiple strings such as “Data”, “Science”, “in”, “R”, “Programming” into the single string “Data_Science_in_R_Programming”?

paste("Data", "Science", "in", "R", "Programming", sep = "_")
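When the words are already stored in a vector, the collapse argument of paste() does the same job. The following is a minimal sketch; the variable names are chosen only for illustration.

```r
# Illustrative variable names; the words are the ones from the question
words <- c("Data", "Science", "in", "R", "Programming")

# sep joins separate arguments; collapse joins the elements of one vector
joined_sep <- paste("Data", "Science", "in", "R", "Programming", sep = "_")
joined_collapse <- paste(words, collapse = "_")

print(joined_sep)
print(joined_collapse)
```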

62. Write a function to extract the first name from the string “Mr. Tom White”.

substr("Mr. Tom White", start = 5, stop = 7)
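The substr() call relies on fixed character positions, so it only works for this exact string. A more general version can be sketched as a small function. The helper name extract_first_name is hypothetical, and the sketch assumes the input always has the form "Title First Last" separated by single spaces.

```r
# Hypothetical helper: assumes the form "<Title> <First> <Last>"
extract_first_name <- function(full_name) {
  # Split on single spaces and take the second piece (the first name)
  parts <- strsplit(full_name, split = " ", fixed = TRUE)[[1]]
  parts[2]
}

first_name <- extract_first_name("Mr. Tom White")
print(first_name)
```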

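63. Can you tell if the equation given below is linear or not?

Emp_sal = 2000 + 2.5(emp_age)^2

Yes, it is a linear equation in the regression sense: it is linear in its coefficients (2000 and 2.5), even though Emp_sal is a quadratic function of emp_age.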
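Because the equation is linear in its coefficients, an ordinary linear model can fit it once the squared age is treated as the predictor. The sketch below uses made-up ages and generates salaries directly from the equation, so the data is purely illustrative.

```r
# Made-up ages for illustration; salaries follow the equation exactly
emp_age <- c(25, 30, 35, 40, 45, 50)
emp_sal <- 2000 + 2.5 * emp_age^2

# I() makes ^ mean arithmetic squaring inside the model formula
fit <- lm(emp_sal ~ I(emp_age^2))

# The fitted intercept and slope recover the two coefficients
print(coef(fit))
```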

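64. What will be the output of the following R programming code?

var2<- c("I","Love,"DeZyre")

var2

It will give a syntax error: the closing quote after Love is missing, so R cannot parse the call and var2 is never created.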
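The failure can be reproduced safely by parsing the line as text and catching the error. This is only a sketch: the names bad_code and parse_result are illustrative, and the corrected vector assumes the intended words were "I", "Love" and "DeZyre".

```r
# The original line, kept as text so that the syntax error can be caught
bad_code <- 'var2<- c("I","Love,"DeZyre")'

parse_result <- tryCatch(
  {
    parse(text = bad_code)
    "parsed"
  },
  error = function(e) "syntax error"
)
print(parse_result)

# With the closing quote restored, the vector is created as expected
var2 <- c("I", "Love", "DeZyre")
print(var2)
```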

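65. What will be the output of the following R programming code?

x<-5
if(x%%2==0)
print("X is an even number")
else
print("X is an odd number")

Executing the above code will result in an error as shown below –

## Error: <text>:4:1: unexpected 'else'
## 3: print("X is an even number")
## 4: else
##    ^

When this code is run at the top level, the if statement is already syntactically complete at the end of line 3, so R evaluates it immediately. The else on line 4 then starts a new statement, which is not valid on its own. Placing else on the same line as the end of the if branch, or wrapping the whole statement in braces, avoids the error.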
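Two common ways to write the same check without the error are sketched below. The variable names msg and msg_vectorised are illustrative.

```r
x <- 5

# Option 1: braces, with else on the same line as the closing brace
if (x %% 2 == 0) {
  msg <- "X is an even number"
} else {
  msg <- "X is an odd number"
}
print(msg)

# Option 2: the vectorised ifelse() function
msg_vectorised <- ifelse(x %% 2 == 0, "X is an even number", "X is an odd number")
print(msg_vectorised)
```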

66. I have a string “contact@dezyre.com”. Which string function can be used to split the string into two different strings “contact@dezyre” and “com”?

This can be accomplished using the strsplit() function, which splits a string at the delimiter given in the function call. The split argument is treated as a regular expression by default, so a literal dot needs fixed = TRUE (or the escaped pattern "\\."). The output of strsplit() is a list.

strsplit("contact@dezyre.com", split = ".", fixed = TRUE)

Output of the strsplit function is –

## [[1]]
## [1] "contact@dezyre" "com"
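One detail worth checking when splitting on a dot: strsplit() interprets split as a regular expression unless told otherwise, and an unescaped "." matches every character. The sketch below compares the three variants; the variable names are illustrative.

```r
email <- "contact@dezyre.com"

# Unescaped "." is a regex wildcard: every character is a delimiter,
# so only empty strings are returned
regex_split <- strsplit(email, split = ".")[[1]]

# Treat the dot literally with fixed = TRUE ...
fixed_split <- strsplit(email, split = ".", fixed = TRUE)[[1]]

# ... or escape it in the regular expression
escaped_split <- strsplit(email, split = "\\.")[[1]]

print(fixed_split)
```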

67. What is the R base package?

The R base package is the package that is loaded by default whenever the R programming environment is started. It provides the basic functionality of the R environment, such as arithmetic calculations and input/output.

68. How will you merge two dataframes in R programming language?

The merge() function is used to combine two dataframes by matching rows on common columns (or row names). By default it returns only the rows whose key values appear in both dataframes, that is, the intersection of the two sets of data.

The main arguments of merge() are as follows:

merge(x, y, by.x, by.y, all, all.x, all.y)

• x represents the first dataframe.
• y represents the second dataframe.
• by.x – Name of the key column in dataframe x that matches a column in y.
• by.y – Name of the key column in dataframe y that matches a column in x.

• all.x – A logical value that specifies the type of merge. Set it to TRUE to keep all the observations from dataframe x, including those with no match in y. This results in a left join.

• all.y – A logical value that specifies the type of merge. Set it to TRUE to keep all the observations from dataframe y, including those with no match in x. This results in a right join.

• all – The default value is FALSE, which means that only matching rows are returned, resulting in an inner join. Set it to TRUE to keep all the observations from both x and y, resulting in an outer join.
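The four join types described above can be sketched with two small dataframes. The dataframes employees and salaries and their columns are made up for illustration only.

```r
# Made-up example data: the key column is named differently in each dataframe
employees <- data.frame(emp_id = c(1, 2, 3), name = c("Ann", "Bob", "Cara"))
salaries <- data.frame(id = c(2, 3, 4), salary = c(5000, 6000, 7000))

# Inner join (default): only keys present in both dataframes
inner <- merge(employees, salaries, by.x = "emp_id", by.y = "id")

# Left join: every row of x, with NA where y has no match
left <- merge(employees, salaries, by.x = "emp_id", by.y = "id", all.x = TRUE)

# Right join: every row of y, with NA where x has no match
right <- merge(employees, salaries, by.x = "emp_id", by.y = "id", all.y = TRUE)

# Outer join: every row of both dataframes
outer <- merge(employees, salaries, by.x = "emp_id", by.y = "id", all = TRUE)

print(outer)
```

The key column in the result takes its name from by.x (here emp_id).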

69. Write the R programming code for an array of words so that the output is displayed in decreasing frequency order.

R Programming Code to display output in decreasing frequency order –

tt <- sort(table(c("a", "b", "a", "a", "b", "c", "a1", "a1", "a1")), decreasing = TRUE)
depth <- 3
tt[1:depth]

Output –

##  a a1  b
##  3  3  2

70. How to check the frequency distribution of a categorical variable?

The frequency distribution of a categorical variable can be checked using the table() function in R. table() counts the number of observations in each category of a categorical variable.

gender = factor(c("M", "F", "M", "F", "F", "F"))
table(gender)

Output of the above R Code –

## gender
## F M
## 4 2

Programmers can also calculate the percentage of values in each category by storing the output in a dataframe and dividing each count by the total, as shown below –

t = data.frame(table(gender))
t$percent= round(t$Freq / sum(t$Freq)*100,2)

Gender	Frequency	Percent
F	4	66.67
M	2	33.33
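The same percentages can also be obtained without building a dataframe by using prop.table(), which converts counts to proportions. This is a minimal sketch using the gender data from the question; the names counts and percent are illustrative.

```r
gender <- factor(c("M", "F", "M", "F", "F", "F"))

# Counts per category
counts <- table(gender)

# prop.table() converts counts to proportions; scale to percent and round
percent <- round(prop.table(counts) * 100, 2)

print(counts)
print(percent)
```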