We have two options for serving ads within Newsfeed:
1 – out of every 25 stories, one will be an ad
2 – every story has a 4% chance of being an ad
For each option, what is the expected number of ads shown in 100 news stories? If we go with option 2, what is the chance a user will be shown only a single ad in 100 stories? What about no ads at all?

For option 1, the expected number of ads shown in 100 news stories is 4 (100/25 = 4). For option 2, the expected number of ads shown in 100 news stories is also 4: each story is an ad with probability 4/100 = 1/25, which suggests that on average one out of every 25 stories will be an ad, …
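As a rough check, the figures asked for in the question can be computed directly. The sketch below assumes that, under option 2, each story is an ad independently of the others, so the number of ads in 100 stories follows a binomial distribution. The helper name `binomial_pmf` and the constants are ours, for illustration only.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N_STORIES = 100
P_AD = 0.04  # option 2: each story is an ad with probability 4%

# Option 1 is deterministic: exactly one ad in every block of 25 stories.
expected_option_1 = N_STORIES // 25

# Option 2 (assumed independent stories): the mean of Binomial(n, p) is n * p.
expected_option_2 = N_STORIES * P_AD

p_one_ad = binomial_pmf(1, N_STORIES, P_AD)  # about 0.0703
p_no_ads = binomial_pmf(0, N_STORIES, P_AD)  # about 0.0169

print(expected_option_1, expected_option_2)
print(round(p_one_ad, 4), round(p_no_ads, 4))
```

Both options average 4 ads per 100 stories. The difference is that option 1 always shows exactly 4, while under option 2 the count varies, with roughly a 7% chance of a single ad and under a 2% chance of none.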

There’s a game where you are asked to roll two fair six-sided dice. If the sum of the values on the dice equals seven, then you win $21. However, you must pay $5 to play each time you roll both dice. Do you play this game? Follow-up: if you play 6 times, what is the probability of making money from this game?

The first condition states that if the sum of the values on the two dice equals 7, you win $21. In all other cases you pay $5. First, let’s calculate the number of possible outcomes. Since we have two six-sided dice, the total number of outcomes is 6*6 = …
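The counting argument can be checked by enumerating every outcome of the two dice. This is a minimal sketch that follows the question's wording and assumes the $5 fee is paid on every play, with the $21 prize paid when the sum is seven. All names here are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Enumerate every outcome of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))
wins = [roll for roll in outcomes if sum(roll) == 7]
p_win = Fraction(len(wins), len(outcomes))

PRIZE, FEE = 21, 5  # assumption: the fee is paid on every play

# Expected net result of a single play.
expected_net = p_win * PRIZE - FEE

# Follow-up: six plays cost 6 * FEE, so a profit needs enough wins to cover that.
PLAYS = 6
min_wins = next(k for k in range(PLAYS + 1) if k * PRIZE > PLAYS * FEE)


def p_exactly(k: int) -> Fraction:
    """Probability of exactly k wins in PLAYS independent plays."""
    return comb(PLAYS, k) * p_win**k * (1 - p_win) ** (PLAYS - k)


p_profit = sum(p_exactly(k) for k in range(min_wins, PLAYS + 1))

print(len(outcomes), p_win, expected_net)
print(min_wins, p_profit, float(p_profit))
```

Under these assumptions each play loses $1.50 on average, so the game is not worth playing. Six plays end in profit only with two or more wins, which happens about 26% of the time.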

You are given a data set in which some variables have more than 30% missing values. Let’s say, out of 50 variables, 8 variables have more than 30% of their values missing. How will you deal with them?

There are a few options. We can assign a unique category to the missing values; the missingness itself might uncover a trend. We can remove those variables outright. Or, more sensibly, we can check how the missing values are distributed against the target variable: where a pattern shows up, we keep the missing values and assign them a new category, and we remove the other variables.
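The third option can be sketched in a few lines. The example below is a hedged illustration in plain Python on a made-up toy data set: the column names, the `"Missing"` label, and the `min_gap` cutoff are all assumptions chosen for the example, not values from the original answer. With pandas, the per-column missing fraction would typically come from `df.isna().mean()`.

```python
# Hypothetical toy data: None marks a missing value, "target" is binary.
ROWS = [
    {"plan": None, "city": "A", "target": 1},
    {"plan": None, "city": "B", "target": 1},
    {"plan": None, "city": "A", "target": 1},
    {"plan": None, "city": None, "target": 0},
    {"plan": "basic", "city": "B", "target": 0},
    {"plan": "pro", "city": "A", "target": 0},
    {"plan": "basic", "city": "B", "target": 0},
    {"plan": "basic", "city": "A", "target": 0},
    {"plan": "pro", "city": "B", "target": 0},
    {"plan": "pro", "city": "A", "target": 1},
]
THRESHOLD = 0.30  # "more than 30% missing", as in the question


def missing_fraction(rows, column):
    """Share of rows in which the column is missing."""
    return sum(r[column] is None for r in rows) / len(rows)


def target_rate_by_missingness(rows, column, target="target"):
    """Mean of the target where the column is missing vs. where it is present."""
    missing = [r[target] for r in rows if r[column] is None]
    present = [r[target] for r in rows if r[column] is not None]

    def rate(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return rate(missing), rate(present)


def decide(rows, column, min_gap=0.2):
    """Keep missingness as a category if the target rate differs noticeably.

    min_gap is an arbitrary illustrative cutoff, not a recommended value.
    """
    rate_missing, rate_present = target_rate_by_missingness(rows, column)
    if abs(rate_missing - rate_present) >= min_gap:
        return "keep_as_category"
    return "drop"


def fill_missing_category(rows, column, label="Missing"):
    """Return new rows with missing values replaced by an explicit category."""
    return [{**r, column: label if r[column] is None else r[column]} for r in rows]


sparse_columns = [c for c in ("plan", "city") if missing_fraction(ROWS, c) > THRESHOLD]
decisions = {c: decide(ROWS, c) for c in sparse_columns}
filled = fill_missing_category(ROWS, "plan")

print(sparse_columns, decisions)
print(target_rate_by_missingness(ROWS, "plan"))
```

In this toy data the target rate is much higher where `plan` is missing than where it is present, so the sketch keeps the column and labels the gaps as their own category. On real data a formal test of association would be a more careful check than a fixed gap.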

How are NumPy and SciPy related?

NumPy is a separate package that SciPy depends on; both belong to the wider SciPy ecosystem. NumPy defines arrays along with basic numerical operations such as indexing, sorting, and reshaping. SciPy builds on NumPy’s functionality to implement computations such as numerical integration, optimization, and statistics. NumPy and SciPy are closely related Python libraries, often used together for scientific computing and data analysis …
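A short example may make the division of labour concrete. This is only a sketch, assuming both `numpy` and `scipy` are installed: NumPy supplies the array and its basic operations, and SciPy routines (`scipy.integrate.quad` and `scipy.optimize.minimize` here) accept NumPy functions and return NumPy arrays. The sample values are made up.

```python
import numpy as np
from scipy import integrate, optimize

# NumPy: the array type plus basic operations (sorting, reshaping, indexing).
values = np.array([3.0, 1.0, 2.0, 6.0, 5.0, 4.0])
grid = np.sort(values).reshape(2, 3)

# SciPy: numerical integration of a NumPy function over [0, pi].
# The exact value of this integral is 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# SciPy: optimization. The objective is written with NumPy operations,
# and the solution comes back as a NumPy array.
target = np.array([1.0, -2.0, 0.5])
result = optimize.minimize(lambda v: np.sum((v - target) ** 2), x0=np.zeros(3))

print(grid)
print(area)
print(result.x)
```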

Name a few Python libraries used for data analysis and scientific computation.

Here is a list of Python libraries mainly used for data analysis: NumPy, SciPy, Pandas, scikit-learn, Matplotlib, Seaborn, and Bokeh. In Python, there are several libraries commonly used for data analysis and scientific computations. Some of the most popular ones include: NumPy: This library provides support for large, multi-dimensional arrays and matrices, along with a …