Data Science Interview Questions and Answers Set 5

41. What do you understand by element recycling in R?

If two vectors with different lengths perform an operation –the elements of the shorter vector will be re-used to complete the operation. This is referred to as element recycling.

Example – Vector A <-c(1,2,0,4) and Vector B<-(3,6) then the result of A*B will be ( 3,12,0,24). Here 3 and 6 of vector B are repeated when computing the result.

42. How can you verify if a given object “X” is a matrix data object?

If the function call is.matrix(X) returns true then X can be considered as a matrix data object otheriwse not.

43. How will you measure the probability of a binary response variable in R language?

Logistic regression can be used for this and the function glm () in R language provides this functionality.

44. What is the use of sample and subset functions in R programming language?

Sample () function can be used to select a random sample of size ‘n’ from a huge dataset.

Subset () function is used to select variables and observations from a given dataset.

45. There is a function fn(a, b, c, d, e) a + b * c – d / e.

Write the code to call fn on the vector c(1,2,3,4,5) such that the output is same as fn(1,2,3,4,5).

do.call (fn, as.list(c (1, 2, 3, 4, 5)))

DATA SCIENCE TRAINING
Weekend / Weekday Batch


46. How can you resample statistical tests in R language?

Coin package in R provides various options for re-randomization and permutations based on statistical tests. When test assumptions cannot be met then this package serves as the best alternative to classical methods as it does not assume random sampling from well-defined populations.

47. What is the purpose of using Next statement in R language?

If a developer wants to skip the current iteration of a loop in the code without terminating it then they can use the next statement. Whenever the R parser comes across the next statement in the code, it skips evaluation of the loop further and jumps to the next iteration of the loop.

48. How will you create scatterplot matrices in R language?

A matrix of scatterplots can be produced using pairs. Pairs function takes various parameters like formula, data, subset, labels, etc.

The two key parameters required to build a scatterplot matrix are –

• formula- A formula basically like ~a+b+c . Each term gives a separate variable in the pairs plots where the terms should be numerical vectors. It basically represents the series of variables used in pairs.

• data- It basically represents the dataset from which the variables have to be taken for building a scatterplot.

49. How will you check if an element 25 is present in a vector? There are various ways to do this-

i. It can be done using the match () function- match () function returns the first appearance of a particular element.

ii. The other is to use %in% which returns a Boolean value either true or false.

iii. Is.element () function also returns a Boolean value either true or false based on whether it is present in a vector or not.

50. What is the difference between library() and require() functions in R language?

There is no real difference between the two if the packages are not being loaded inside the function. require () function is usually used inside function and throws a warning whenever a particular package is not found. On the flip side, library () function gives an error message if the desired package cannot be loaded.