Introduction

The following is a study guide created by Mark Stephan, Founder of ProductHired, to help others study for the more difficult Product Manager Interviews.

  1. Did you find a question not listed in these resources? Email it to us!
  2. Did you answer one of the questions here and think it’s good? Email it to us! We’ll add it and give you credit if you like.
  3. Have a company you want added? Email your info to us!

Did you use this document to help with your interview preparation? Did you just finish the interview process?

Help us improve the community knowledge by giving us data right after your interview. I know it’s the last thing you’re thinking about, but do try to remember. I’m not saying break the NDA; simply adjust your data/questions appropriately and let us know via email how it went.

Finally, help us help you study for other interviews. Once you finish any interview, help the community by filling out this form. This helps us improve how we serve you.

Other Connected Documents:

Because the original document has grown too large, they are broken out here into separate documents.


Google Recruiter Overview

The following is one of the emails recruiters send to PM candidates. Although this email changes over time, the following should give you a good idea on what recruiters tell candidates:

As an overview, our PM’s bring to fruition new products and features that genuinely benefit our users while at the same time make good business sense. They act as general managers of our products, providing leadership across functional teams to conceptualize, build and deliver Google’s next great app. PM’s find our entrepreneurial culture to be exciting and challenging, because they are never stuck maintaining an existing product, but are instead focused on developing new product ideas and strategies.

We have openings across all of our products in areas such as Consumer, Mobile, Apps, Enterprise and Infrastructure to name a few. As a brief outline, we have an agnostic interview process in which we aim to hire PM “generalists”, who may have niche experience but can easily float through our evolving product lines. We find this keeps our Product Managers fresh and with distributed, homogeneous experiences for our project teams. So, in a nutshell, we do not hire for a specific product, but rather, are seeking generalists who can work on multiple products. As such, you’ll interview with PM’s working on any number of our various products. At a later point, our leadership reviews your interests, background, and interviews to identify relevant projects that align with business need.

What to Expect

There are five components to the Google product manager (PM) interview:

1. Product design

Google PMs put users first. PMs are zealous about providing the best user experiences. It starts with customer empathy and always includes a passion for products, down to the smallest details. They can sketch a wireframe to convey an idea to a designer. Sample questions include:

How would you improve Google Maps?

How would you reduce Gmail storage size?

How would you improve restaurant search?

What’s your favorite Google product? What do you like or not like about it?

If you were to build the next killer feature for Google, what would it be?

You’re part of the Google Search web spam team. How would you detect duplicate websites?

2. Analytical

Google PMs are fluent with numbers. They define the right metrics. They can interpret and make decisions from A/B test results.

They don’t mind getting their hands dirty. Sometimes they write SQL queries; other times, they run scripts to extract data from logs. They make their point by crisply communicating their analysis. Some examples of analytical questions:

How many queries per second does Gmail get?

How many iPhones are sold in the US each year?

As the PM for Google Glass ‘Enterprise Edition,’ which metrics would you track? How do you know if the product is successful?

3. Cultural fit

Google PMs dream of the next moonshot idea. They lead and influence effectively. They have a bias for action and get things done. If Google PMs were working anywhere else, they’d probably be CEOs of their own company. Sample questions to assess cultural fit:

Why Google?

Why PM?

4. Technical

Google PMs lead product development teams. To lead effectively, PMs must have influence and credibility with engineers. During the final round (aka onsite) interview, a senior member of the engineering team will evaluate your technical competence. Be prepared for whiteboard coding questions at the onsite interview. Example questions include:

Write an algorithm that detects meeting conflicts.

5. Strategy

Google PMs are business leaders. As a result, they must be familiar with business issues. It’s not necessary for PMs to have business experience or formal business training. However, they do expect you to pick up business intuition and judgment quickly. Sample interview questions include:

  • If you were Google’s CEO, would you be concerned about Microsoft?
  • Should Google offer a StubHub competitor? That is, sell sports, concert, and theater tickets?

Also be prepared for behavioral interview questions such as “Tell me about a time when you had to influence engineering to build a particular feature.” Google PM interviewers have been relying more on behavioral interview questions in recent months.

What Not to Expect

Brain teasers, such as logic puzzles, are rarely used in today’s Google PM interviews. Google’s HR department found a low correlation between job performance and a candidate’s ability to solve brain teasers. Examples of brain teasers include:

  • I roll two dice. What is the probability that the 2nd number is greater than the 1st?
  • What’s 27 x 27 without using a calculator or paper?

However, hypothetical questions have not been banned at all. Hypothetical questions are imaginary situations that ARE related to the job. (This is in contrast with brain teasers, which ARE NOT related to the job.) Examples of hypothetical questions include “How would you design an algorithm to source data from the USDA and display it on Google nutrition?”

How to Prepare

Here’s what I’d recommend to get ready for the Google PM interview:

Review tech blogs, such as Stratechery.

  • Product design. Practice leading design discussions using a framework. (Need a framework? Try CIRCLES Method: http://qr.ae/i6kRM). Start with possible personas and detail use cases. Prioritize use cases and brainstorm solutions. Many PM candidates (wrongly) suggest solutions that are incremental or derivatives of a competitor’s feature set. The Google interviewers are evaluating your creativity, and they place a big emphasis on big ideas (aka “moonshots”). Inspire them with unique, compelling ideas. Drawing wireframes on a whiteboard will help illustrate your ideas. To practice, download a wireframing tool like Balsamiq. Also study popular web and mobile design patterns for inspiration.
  • Technical. Coding questions are unlikely during the phone interviews. But if you are invited to an on-site interview, you must prepare for programming interviews. The technical interviewer does not expect your programming syntax to be perfect, but you should have sufficient mastery of technical concepts so that you can participate in technical discussions and help make technical trade-offs. I would recommend going over computer science fundamentals and practicing a couple of coding questions. One of my favorite resources is How to Ace the Software Engineering Interview. Also be prepared to describe key technologies including search engines, machine learning, and MapReduce.
  • Analytical. Prepare for estimation questions such as “How many queries per second does Gmail get?” Get well-versed in product launch metrics and A/B testing, including interpretation of results.
  • Strategy. Use a framework to structure your strategy discussions. If you’re not familiar with strategy or frameworks, Porter’s Five Forces is a good start.
  • Cultural fit. Understand what it means to be Googley by reading Google’s corporate philosophy. Review Google’s Android design principles. Optional readings: Google’s visual asset guidelines and Steven Levy’s 2007 (but still useful) article on the Google APM program. For another optional, but more in-depth (and recent) perspective, read Steven Levy’s “In the Plex: How Google Thinks, Works, and Shapes Our Lives.”

Google Technical Interview Questions

The following are technical interview questions from Google as found on Glassdoor and other sites. Some of the questions have been generalized by the users so that they don’t conflict with their NDAs.

2018/19 Questions

  • How would you make networking and performance improvements on a website?
  • List and apply various machine learning algorithms to a situation.
  • Which would you prioritize: a bug fix or a revenue maker?
  • Users are complaining that a mobile app is slow. As a product manager, what will you do?
  • System design for a feature I’d build for Google Cloud.
  • What happens when you enter a URL in a browser?
  • What is object-oriented programming?
  • What is an interface?
  • What is a cache and how does it work? (Beyond the Google Chrome cache)
  • Write the algorithm for a mathematical problem.
  • Write the SQL code for solving a problem that was given.
  • Draw a class diagram that reflects a little league.
  • What is HTTPS, how does it work, and how does it benefit users?

Older Questions

Explain SSO (Single Sign On)

Example 1)

  • The customer uses one set of ID and password to log into multiple services/websites. Those services are usually related and connected on the backend. Basically there’s one database containing the user’s login and password, and each of the different services/websites checks the user’s credentials against it. Often used for a network of websites under the same domain (different subdomains).
  • Cookies are often used to automatically log the user in on another subdomain of the same domain. Cookies can’t be shared between different second-level domains, but they can be shared between subdomains of the same second-level domain.
  • There’s a tricky way to make it work with different domains, though. When the user authenticates on site-a.com, you set a cookie on the site-a.com domain. Then on site-b.com, you link a dynamic JavaScript file from site-a.com, generated by a server-side script (PHP, etc.) that has access to the created cookie (site-a.com already has the cookie that proves your identity), and then copy the same cookie to site-b.com on the client side using JS. Now both sites have the same cookie, without the need to ask the user to log in again. http://stackoverflow.com/questions/1784219/cookie-based-sso
  • Make sure to learn the difference between authentication and authorization: http://stackoverflow.com/questions/6367865/is-there-a-difference-between-authentication-and-authorization
  • JWTs (JSON Web Tokens) are another way and are increasingly popular. Check out jwt.io for more info.
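
The token idea in the last bullet can be sketched with the Python standard library alone. This is a simplified illustration of a signed token that a login service issues and any participating site verifies with a shared secret. It is not a real JWT implementation, and the names SHARED_SECRET, issue_token, and verify_token are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret shared by the login service and every site that trusts it.
SHARED_SECRET = b"demo-secret-do-not-use-in-production"


def issue_token(user_id: str) -> str:
    """Login service: sign a small payload so other sites can trust it."""
    payload = base64.urlsafe_b64encode(json.dumps({"sub": user_id}).encode())
    signature = hmac.new(SHARED_SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token: str):
    """Participating site: return the user id, or None if the signature is wrong."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SHARED_SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))["sub"]
```

The point to make in an interview: the second site never sees the password. It only checks that the token was signed by something it trusts (authentication); what the user is then allowed to do is a separate question (authorization).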

Explain OAuth (Open standard Authorization)

Explain MapReduce to your grandma

MapReduce breaks a task down into smaller chunks and processes them in parallel. MapReduce processes a large dataset using a cluster (Master-Slave technique). Map works on each chunk independently, and Reduce combines the partial results into the final answer. Think of your grandma needing to make cupcakes. She is the master: she explains the plan to her daughters (the workers) and assigns each of them a task to do. The result is the cupcakes.
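
A minimal sketch of the same idea in code, using the classic word-count example. Everything here runs in one process, so it only shows the shape of the computation; on a real cluster each call to map_phase would run on a different worker. The function names are illustrative, not part of any MapReduce library.

```python
from collections import defaultdict


def map_phase(chunk):
    """Worker: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values by key so each key can be reduced on its own."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the partial results for each key into one final value."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # Sequential here; in a cluster the map calls happen in parallel.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(mapped))
```

For example, word_count(["a b a", "b c"]) counts two occurrences each of "a" and "b" and one of "c".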

Design a load balancer using data structures

The cost model of the data structure to be used depends on the cloud infrastructure. (Think of a McDonald’s waiter as the load balancer and the kitchen staff as the resources behind a queue, or of Netflix video streaming servers.)

Round Robin

  • Use a queue, not a stack or hash table (talk about the pitfalls)
  • Least recently used (if requests are fixed)
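
A minimal sketch of queue-based round robin, assuming a fixed list of server names. The class name RoundRobinBalancer is hypothetical. The server at the front of the queue takes the request and then goes to the back of the line.

```python
from collections import deque


class RoundRobinBalancer:
    def __init__(self, servers):
        # A queue keeps the rotation order; a stack would keep hitting the same server.
        self._queue = deque(servers)

    def next_server(self):
        server = self._queue.popleft()  # take the server at the front
        self._queue.append(server)      # and send it to the back of the line
        return server
```

Both operations are O(1) on a deque, which is one reason a queue is a better fit here than a stack or a hash table.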

Randomization

  • Problem: it could hit the same server repeatedly

Polling (most expensive)

Other ways of load balancing:

Round Robin

  • A simple method of load balancing servers or for providing simple fault tolerance. Multiple identical servers are configured to provide exactly the same services. All are configured to use the same Internet domain name but each has a unique IP address. A DNS server has a list of all the unique IP addresses that are associated with the Internet domain name. When requests for the IP Address associated with the Internet domain name are received, the addresses are returned in a rotating sequential manner.

Weighted Round Robin

  • This builds on the simple Round Robin load balancing method. In the weighted version, each server in the pool is given a static numerical weight. Servers with higher weights get more requests sent to them.

Least Connection

  • Neither Round Robin nor Weighted Round Robin takes the current server load into consideration when distributing requests. The Least Connection method does. The current request goes to the server that is servicing the fewest active sessions at the current time.

Weighted Least Connection

  • Builds on the Least Connection method. As in the Weighted Round Robin method, each server is given a numerical weight. The load balancer uses this when allocating requests to servers. If two servers have the same number of active connections, the server with the higher weight is allocated the new request.
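
Least Connection and Weighted Least Connection, as described above, can be sketched in a few lines. This is an illustration under simple assumptions: connection counts are tracked in memory, and the class name, method names, and weights are hypothetical.

```python
class LeastConnectionBalancer:
    def __init__(self, weights):
        # weights: mapping of server name -> static weight (illustrative values)
        self.weights = dict(weights)
        self.active = {server: 0 for server in self.weights}

    def acquire(self):
        """Pick the server with the fewest active connections; higher weight breaks ties."""
        server = min(self.active, key=lambda s: (self.active[s], -self.weights[s]))
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a connection closes so the server becomes eligible again."""
        self.active[server] -= 1
```

With equal weights this behaves as plain Least Connection; the weights only matter when two servers are tied.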

Agent Based Adaptive Load Balancing

  • Each server in the pool has an agent that reports on its current load to the load balancer. This real time information is used when deciding which server is best placed to handle a request. This is used in conjunction with other techniques such as Weighted Round Robin and Weighted Least Connection.

Chained Failover (Fixed Weighted)

  • In this method a predetermined order of servers is configured in a chain. All requests are sent to the first server in the chain. If it can’t accept any more requests, all requests are sent to the next server in the chain, then to the third server, and so on.

Weighted Response Time

  • This method uses the response information from a server health check to determine the server that is responding fastest at a particular time. The next server access request is then sent to that server. This ensures that any servers that are under heavy load, and which will respond more slowly, are not sent new requests. This allows the load to even out on the available server pool over time.

Source IP Hash

  • Source IP Hash load balancing uses an algorithm that takes the source and destination IP address of the client and server and combines them to generate a unique hash key. This key is used to allocate the client to a particular server. As the key can be regenerated if the session is broken this method of load balancing can ensure that the client request is directed to the same server that it was using previously. This is useful if it’s important that a client should connect to a session that is still active after a disconnection. For example, to retain items in a shopping cart between sessions.
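
A simplified sketch of the hashing idea. As an assumption made for brevity, it hashes only the client IP rather than combining source and destination addresses, and the function name pick_server is hypothetical. The same client address always maps to the same server as long as the server list does not change.

```python
import hashlib


def pick_server(client_ip: str, servers: list) -> str:
    """Map a client IP to a server deterministically via a hash of the address."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

A pitfall worth mentioning: if a server is added or removed, the modulo changes and most clients are remapped to different servers.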

Explain OOP

Example 1)

Introduction

This article provides a brief description about the various Object Oriented Programming concepts.

Object Oriented Programming

It is a type of programming in which programmers define not only the data type of a data structure, but also the types of operations (functions) that can be applied to the data structure. In this way, the data structure becomes an object that includes both data and functions. In addition, programmers can create relationships between one object and another. For example, objects can inherit characteristics from other objects.

One of the principal advantages of object-oriented programming techniques over procedural programming techniques is that they enable programmers to create modules that do not need to be changed when a new type of object is added. A programmer can simply create a new object that inherits many of its features from existing objects. This makes object-oriented programs easier to modify.

Object

Objects are the basic run-time entities in an object-oriented system. A programming problem is analyzed in terms of objects and the nature of the communication between them. When a program is executed, objects interact with each other by sending messages. Different objects can also interact with each other without knowing the details of their data or code.

An object is an instance of a class. A class must be instantiated into an object before it can be used in the software. More than one instance of the same class can be in existence at any one time.

Class

A class is a collection of objects of a similar type. Once a class is defined, any number of objects can be created which belong to that class. A class is a blueprint, or prototype, that defines the variables and the methods common to all objects of a certain kind.

Instance

The instance is the actual object created at runtime. One can have an instance of a class or a particular object.

State

The set of values of the attributes of a particular object is called its state. The object consists of state and the behaviour that’s defined in the object’s class.

Method

Method describes the object’s abilities. A Dog has the ability to bark. So bark() is one of the methods of the Dog class.
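
The terms so far (class, instance, state, method) can be seen together in a short sketch built on the Dog example. The attribute names and the learn method are illustrative additions.

```python
class Dog:
    """The class is the blueprint; each Dog(...) call creates an instance."""

    def __init__(self, name):
        self.name = name    # state: attribute values held by this instance
        self.tricks = []    # state: starts empty for every new dog

    def bark(self):
        """Method: an ability shared by every Dog."""
        return f"{self.name} says Woof!"

    def learn(self, trick):
        """Method that changes this instance's state."""
        self.tricks.append(trick)


rex = Dog("Rex")     # two instances of the same class,
fido = Dog("Fido")   # each with its own state
rex.learn("sit")
```

Calling rex.bark() is what the next section calls message passing: one piece of code asks an object to invoke one of its methods.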

Message Passing

The process by which an object sends data to another object or asks the other object to invoke a method. Message passing corresponds to “method calling”.

Abstraction

Abstraction refers to the act of representing essential features without including the background details or explanations. Classes use the concept of abstraction and are defined as a list of abstract attributes.

Encapsulation

It is the mechanism that binds together code and the data it manipulates, and keeps both safe from outside interference and misuse. In short, it isolates particular code and data from all other code and data. A well-defined interface controls access to that code and data. Encapsulation is the act of placing data and the operations that act on that data in the same class. The class then becomes the ‘capsule’ or container for the data and operations.

Storing data and functions in a single unit (class) is encapsulation. The data is not accessible to the outside world; only the functions stored in the class can access it.

Inheritance

It is the process by which one object acquires the properties of another object. This supports the hierarchical classification. Without the use of hierarchies, each object would need to define all its characteristics explicitly. However, by use of inheritance, an object need only define those qualities that make it unique within its class. It can inherit its general attributes from its parent. A new sub-class inherits all of the attributes of all of its ancestors.

Polymorphism

Polymorphism means the ability to take more than one form. An operation may exhibit different behaviours in different instances. The behaviour depends on the data types used in the operation.

It is a feature that allows one interface to be used for a general class of actions. The specific action is determined by the exact nature of the situation. In general, polymorphism means “one interface, multiple methods”. This means that it is possible to design a generic interface to a group of related activities. This helps reduce complexity by allowing the same interface to be used to specify a general class of action. It is the compiler’s job to select the specific action (that is, method) as it applies to each situation.

Generalization

Generalization describes an is-a relationship which represents a hierarchy between classes of objects. For example, “fruit” is a generalization of “apple”, “orange”, “mango” and many others. Animal is the generalization of pet.

Specialization

Specialization means an object can inherit the common state and behavior of a generic object. However, each object needs to define its own special and particular state and behavior. Specialization means to subclass. Animal is the generalization and pet is the specialization, indicating that a pet is a special kind of animal.
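
Inheritance, polymorphism, and the animal/pet relationship can be sketched together. The class and method names are illustrative. Note that in Python the choice of which speak method to run is made at run time rather than by a compiler.

```python
class Animal:
    """Generalization: state and behavior common to all animals."""

    def __init__(self, name):
        self.name = name

    def speak(self):
        return f"{self.name} makes a sound"

    def describe(self):
        return f"{self.name} is an animal"


class Pet(Animal):
    """Specialization: a Pet is-a Animal with extra state and its own speak()."""

    def __init__(self, name, owner):
        super().__init__(name)   # inherit the general attributes
        self.owner = owner       # add what makes a pet unique

    def speak(self):             # same interface, different behavior
        return f"{self.name} greets {self.owner}"


def introduce(animals):
    """Polymorphism: the caller uses one interface and each object picks its own method."""
    return [animal.speak() for animal in animals]
```

Pet does not define describe, so it inherits that method from Animal unchanged; it only overrides what is different.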

Advantages of OOP

Object-Oriented Programming has the following advantages over conventional approaches:

  • OOP provides a clear modular structure for programs which makes it good for defining abstract data types where implementation details are hidden and the unit has a clearly defined interface.
  • OOP makes it easy to maintain and modify existing code as new objects can be created with small differences to existing ones.
  • OOP provides a good framework for code libraries where supplied software components can be easily adapted and modified by the programmer.

Explain Machine Learning to your grandma

Example 1) Analogy: Growing up in the World

From childhood you have been meeting, observing and interacting with people. Their behavior and the impression they make on you get stored in your brain. Your brain becomes a huge data center. You keep on adding more data as you meet new people. Soon you are able to guess how your experience will be with the next person you meet. The person smiles well, wears spectacles and has short hair. You become friendly with him because other smiling people who wear specs have been good to you. Then a big 6′ man with a beard and broken tooth comes and you run away as a kid. This is all part of Data Mining within your brain.

As you grow up, you realize that spectacles, beards and size are not the only things that can tell you what people are like. You begin to see their position in society and their behavior in new situations. So the relevant attributes may change. Your algorithms improve by themselves. This is machine learning.

Example 2) Let’s be short, sweet and non-bookish.

Machine Learning (ML), as the name suggests, means enabling machines/computers to learn (over time) to make decisions. What those decisions are, and how they are made, forms the core of ML, and there is a lot of fancy maths to support it.

In general, ML is used to perform three types of actions, viz. classification, clustering and prediction.

In classification, we normally have abundant ‘labelled’ data and the scope of our decision space is known a priori. For example, if a classifier is exposed only to labelled pictures of cats and dogs and a new image is shown, it will classify it as one of those categories, even if it is a cow. However, it can be useful in many situations where the domain of decisions is pre-set, for example credit card fraud detection, where you only want to know if a transaction is legitimate or fraudulent.

Clustering deals with cases when you have data but no information about it. Imagine you are shown images but they are not labelled, so it is for the classifier (or you) to form certain clusters or groups based on some notion of similarity. For example, among a set of pictures you may build clusters based on whether they are animals (cat, dog, cow) or birds (crow, sparrow), or another set as herbivorous (cow, sparrow) and carnivorous (cat, dog, crow). Here the notion of similarity plays a major role, because we don’t know what we are looking at beforehand. Our brain does this clustering and grouping of objects all the time based on our perceived notions about things and objects.

Finally, prediction means learning from historical data to predict the future, or at least the present. Weather forecasts, stock price movements, customer churn rates, etc. are some examples.

However, ML is not magic or rocket science. There is no ‘one-size-fits-all’ algorithm, as stated by the No Free Lunch theorems. ML algorithms are data dependent, parameter dependent, and objective dependent. However, it is amazing when you start to observe its connection with the actual decision making of our mind, which we generally take for granted.

Example 3) Mango Shopping

Suppose you go shopping for mangoes one day. The vendor has laid out a cart full of mangoes. You can handpick the mangoes, the vendor will weigh them, and you pay according to a fixed Rs per Kg rate (typical story in India).

Obviously, you want to pick the sweetest, most ripe mangoes for yourself (since you are paying by weight and not by quality). How do you choose the mangoes?

You remember your grandmother saying that bright yellow mangoes are sweeter than pale yellow ones. So you make a simple rule: pick only from the bright yellow mangoes. You check the color of the mangoes, pick the bright yellow ones, pay up, and return home. Happy ending?

Not quite.

Suppose you go home and taste the mangoes. Some of them are not as sweet as you’d like. You are worried. Apparently, your grandmother’s wisdom is insufficient. There is more to mangoes than just color.

After a lot of pondering (and tasting different types of mangoes), you conclude that the bigger, bright yellow mangoes are guaranteed to be sweet, while the smaller, bright yellow mangoes are sweet only half the time (i.e. if you buy 100 bright yellow mangoes, out of which 50 are big in size and 50 are small, then the 50 big mangoes will all be sweet, while out of the 50 small ones, on average only 25 mangoes will turn out to be sweet).

You are happy with your findings, and you keep them in mind the next time you go mango shopping. But next time at the market, you see that your favorite vendor has gone out of town. You decide to buy from a different vendor, who supplies mangoes grown from a different part of the country. Now, you realize that the rule which you had learnt (that big, bright yellow mangoes are the sweetest) is no longer applicable. You have to learn from scratch. You taste a mango of each kind from this vendor, and realize that the small, pale yellow ones are in fact the sweetest of all.

Now, a distant cousin visits you from another city. You decide to treat her to mangoes. But she mentions that she doesn’t care about the sweetness of a mango; she only wants the juiciest ones. Once again, you run your experiments, tasting all kinds of mangoes, and realize that the softer ones are more juicy.

Now, you move to a different part of the world. Here, mangoes taste surprisingly different from your home country. You realize that the green mangoes are in fact tastier than the yellow ones.

You marry someone who hates mangoes. She loves apples instead. You go apple shopping. Now, all your accumulated knowledge about mangoes is worthless. You have to learn everything about the correlation between the physical characteristics and the taste of apples, by the same method of experimentation. You do it, because you love her.

Enter computer programs

Now, imagine that all this while, you were writing a computer program to help you choose your mangoes (or apples). You would write rules of the following kind:

if (color is bright yellow and size is big and sold by favorite vendor): mango is sweet.

if (soft): mango is juicy.

etc.

You would use these rules to choose the mangoes. You could even send your younger brother with this list of rules to buy the mangoes, and you would be assured that he will pick only the mangoes of your choice.

But every time you make a new observation from your experiments, you have to manually modify the list of rules. You have to understand the intricate details of all the factors affecting the quality of mangoes. If the problem gets complicated enough, it can get really difficult to make accurate rules by hand that cover all possible types of mangoes. Your research could earn you a PhD in Mango Science (if there is one).

But not everyone has that kind of time.

Enter Machine Learning algorithms

ML algorithms are an evolution over normal algorithms. They make your programs “smarter”, by allowing them to automatically learn from the data you provide.

You take a randomly selected specimen of mangoes from the market (training data), make a table of all the physical characteristics of each mango, like color, size, shape, grown in which part of the country, sold by which vendor, etc (features), along with the sweetness, juiciness, ripeness of that mango (output variables). You feed this data to the machine learning algorithm (classification/regression), and it learns a model of the correlation between an average mango’s physical characteristics, and its quality.

Next time you go to the market, you measure the characteristics of the mangoes on sale (test data), and feed it to the ML algorithm. It will use the model computed earlier to predict which mangoes are sweet, ripe and/or juicy. The algorithm may internally use rules similar to the rules you manually wrote earlier (for example, a decision tree), or it may use something more involved, but you don’t need to worry about that, to a large extent.
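
To make the train-then-predict loop concrete, here is a deliberately tiny sketch. It is not a real ML library: the "model" is just the fraction of sweet mangoes seen for each (color, size) combination, and the training data is invented for illustration.

```python
from collections import defaultdict


def train(samples):
    """Learn, for each (color, size) combination, the fraction of sweet mangoes seen."""
    counts = defaultdict(lambda: [0, 0])  # features -> [sweet count, total count]
    for color, size, is_sweet in samples:
        counts[(color, size)][0] += int(is_sweet)
        counts[(color, size)][1] += 1
    return {features: sweet / total for features, (sweet, total) in counts.items()}


def predict(model, color, size):
    """Predict 'sweet' only if more than half of similar training mangoes were sweet."""
    return model.get((color, size), 0.0) > 0.5


# Made-up training data: (color, size, was it sweet?)
training_data = [
    ("bright", "big", True), ("bright", "big", True),
    ("bright", "small", True), ("bright", "small", False),
    ("pale", "big", False), ("pale", "small", False),
]
model = train(training_data)
```

Nobody wrote the rule "big, bright yellow mangoes are sweet" by hand; it falls out of the data. Feed the same train function data from a different vendor, or about apples, and it learns a different model.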

Voila, you can now shop for mangoes with great confidence, without worrying about the details of how to choose the best mangoes. And what’s more, you can make your algorithm improve over time (reinforcement learning), so that it will improve its accuracy as it reads more training data, and modifies itself when it makes a wrong prediction. But the best part is, you can use the same algorithm to train different models, one each for predicting the quality of apples, oranges, bananas, grapes, cherries and watermelons, and keep all your loved ones happy 🙂

And that is Machine Learning for you. Tell me if it isn’t cool.

Machine Learning: Making your algorithms smart, so that you don’t need to be. 😉

Explain recursion to grandma or 7 yr old

Example 1)

Someone in a movie theater asks you what row you’re sitting in. You don’t want to count, so you ask the person in front of you what row they are sitting in, knowing that you will respond one greater than their answer. The person in front will ask the person in front of them. This will keep happening until word reaches the front row, and it is easy to respond: “I’m in row 1!” From there, the correct message (incremented by one each row) will eventually make its way back to the person who asked.

Why is this a good explanation?

It gets across three points:

  1. Some questions are inherently recursive, and some questions are easier to solve recursively.
  2. The question I am asking (“what row am I in?”) can be rephrased recursively as: “how many people are in front of me + 1?” with a base case of zero people in front of me.
  3. It also illustrates the idea of a recursive call stack and how calls are pushed on then popped off the stack.
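
The movie theater explanation translates almost word for word into code. In this sketch the function name and its parameter are illustrative: each call "asks the person in front" by calling itself with one fewer person, until the front row answers directly.

```python
def row_number(people_in_front: int) -> int:
    """Return the row of someone with this many people seated in front of them."""
    if people_in_front == 0:
        return 1  # base case: the front row knows its own answer
    # Recursive case: ask the person in front, then add one to their answer.
    return row_number(people_in_front - 1) + 1
```

Each pending call waits on the call stack until the front row answers, and then the answers are passed back one row at a time.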

Example 2)

Imagine people standing in line (a queue). The person at the front needs to step back, but there’s someone behind him. He asks that person to step back too, but there’s yet another person behind. This goes all the way to the end of the queue, where the last person is actually able to take a step back and tell the person in front of him, “Done, you can now step back too.” The answer travels all the way back to the front, and every person can take an action (step back) based on the positive feedback they got from the person behind them. Oh, and at the same time they could be counting how many of them there are! :)

Example 3) A pack of cards

We have a stack of cards and are looking for a particular card. We move the cards from one stack to another until we find our card. This is recursion. (It is also iteration, which is why I liked the queue example much better.)

Explain PageRank algorithm

Example 1)

  • It measures importance of the web pages.
  • Probably the way it was at day 1 of its existence is it way counting what pages link to what pages. The pages that got the biggest amount of “referrals” would be higher in the search results.
  • Then they started to estimate the “quality” of the links. What’s the easiest way to do it? Well, if the page that is already up in the search results links to another page, this other page is probably “legit” too.
  • Pages basically share their values with the other pages that they link to.
  • Wikipedia has a good “simplified” algorithm explanation.
  • PR(node) = f(number of hyperlinks to the node, quality of the pages those links come from)
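The "pages share their value with the pages they link to" idea can be sketched as a loop that keeps redistributing scores. This is a simplified toy version, not Google's production algorithm: the damping factor of 0.85, the iteration count, and the four example pages are all assumptions chosen for illustration.

```javascript
// Simplified PageRank by repeated "value sharing".
// `links` maps each page to the pages it links to (hypothetical toy data).
function pageRank(links, damping = 0.85, iterations = 50) {
  const pages = Object.keys(links);
  const n = pages.length;
  // Start with every page equally important.
  let rank = Object.fromEntries(pages.map((p) => [p, 1 / n]));

  for (let i = 0; i < iterations; i++) {
    // Every page keeps a small baseline score...
    const next = Object.fromEntries(pages.map((p) => [p, (1 - damping) / n]));
    for (const page of pages) {
      const outgoing = links[page];
      // ...and a page with no outgoing links shares its value with everyone.
      const targets = outgoing.length > 0 ? outgoing : pages;
      for (const target of targets) {
        next[target] += (damping * rank[page]) / targets.length;
      }
    }
    rank = next;
  }
  return rank;
}

const ranks = pageRank({
  home: ["about", "blog"],
  about: ["home"],
  blog: ["home", "about"],
  orphan: ["home"],
});
console.log(ranks);
```

In this toy graph `home` ends up with the highest score because every other page links to it, while `orphan`, which nobody links to, keeps only the baseline score.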

Example 2)

How would you describe a database to a third grader?  

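Ask clarifying questions: is it a relational or a non-relational database?

As a rough rule of thumb for the analogies below: if it has an index or catalog that relates items to one another, call it relational. (Real non-relational databases can have indexes too; the stricter distinction is whether the data lives in related tables with a fixed schema.)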

Example 1)

A refrigerator is non-relational, unless it comes with a recipe.

Example 2)

A library is relational: it has a card catalog, etc.

Example 3)

Non-relational

You know how you have to put your toys away when you’re done playing, so you can find them easily the next time you want to play with them?

A database is like a shelf to put your toys away. Every toy has an index (number). Except the toys are data instead.

Now, the database can always find the right data easily – whether it’s “all the dinosaur toys” or “all the yellow toys,” the computer can get everything out to play with very quickly, because it’s in a database.

Example 4)

Non-relational. Explanation for Mom. Blog and comments example. Relational vs Non-Relational and their pitfalls.

https://www.ignoredbydinosaurs.com/posts/210-explaining-non-relational-databases-my-mom

Other Examples)

  • A phone book is relational
  • A book with an index is relational
  • A book with a table of contents is relational

Tell me what happens when you press enter after typing google.com into the browser

Example 1)

In an extremely rough and simplified sketch, assuming the simplest possible HTTP request, no proxies, IPv4 and no problems in any step:

  1. browser checks cache; if requested object is in cache and is fresh, skip to #10
  2. browser asks OS for website’s IP address
  3. OS makes a DNS lookup and replies the IP address to the browser
  4. browser opens a TCP connection to server (this step is much more complex with HTTPS)
    1. If HTTPS, the browser validates the certificate sent by the server against the certificate authorities it trusts.
    2. If the certificate is valid, the browser establishes a secure connection.
  5. browser sends the HTTP request through TCP connection
  6. the server receives the request, may distribute it across multiple servers to process it, pulls the data it needs from the database, and serves the result (JavaScript, HTML, CSS, images) to the requester
  7. browser receives HTTP response and may close the TCP connection, or reuse it for another request
  8. browser checks if the response is a redirect or a conditional response (3xx result status codes), authorization request (401), error (4xx and 5xx), etc.; these are handled differently from normal responses (2xx)
  9. if cacheable, response is stored in cache
  10. browser decodes response (e.g. if it’s gzipped)
  11. browser determines what to do with response (e.g. is it an HTML page, is it an image, is it a sound clip?)
  12. browser renders response, or offers a download dialog for unrecognized types

In short: browser gets the IP address from DNS -> browser opens a TCP connection to the server hosting the services and apps -> server processes the HTTP request and serves data (JavaScript, HTML, CSS, images) -> browser decodes the response and determines what to do with it.

Again, discussion of each of these points has filled countless pages; take this only as a short summary. Also, there are many other things happening in parallel to this (processing typed-in address, speculative prefetching, adding page to browser history, displaying progress to user, notifying plugins and extensions, rendering the page while it’s downloading, pipelining, connection tracking for keep-alive, checking for malicious content etc.) – and the whole operation gets an order of magnitude more complex with HTTPS (certificates and ciphers and pinning, oh my!).
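The status-code check in the list above is the part that is easiest to show in code. The sketch below is only an illustration of that one decision: `classifyStatus` and the category names it returns are made up here, and a real browser does far more per code.

```javascript
// Maps an HTTP status code to the kind of handling described in the list above.
// The category names are invented for this sketch.
function classifyStatus(status) {
  if (status >= 200 && status < 300) return "render"; // normal response
  if (status === 304) return "use-cached-copy"; // conditional response: not modified
  if (status >= 300 && status < 400) return "follow-redirect";
  if (status === 401) return "ask-for-credentials";
  if (status >= 400 && status < 600) return "show-error";
  return "unknown";
}

console.log(classifyStatus(301)); // "follow-redirect"
console.log(classifyStatus(404)); // "show-error"
```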

Proxies:

  • caching (the cache can be public or private, like the browser cache)
  • filtering (like an antivirus scan, parental controls, …)
  • load balancing (to allow multiple servers to serve the different requests)
  • authentication (to control access to different resources)
  • logging (allowing the storage of historical information)

HTTP is stateless, i.e. each request is handled independently and the server keeps no memory of earlier requests. However, applications can build stateful sessions on top of it using HTTP cookies.

How would you reduce the size of Gmail?

Example 1)

Ask if deletion is an option. What does reduce Gmail storage size mean? Reduce per user quota? Free up disk space? Less data overhead in the DB? Clarify what exactly “reduce Gmail storage size” is without sounding like an idiot.

For example, the question might be rephrased as:

“You have a 15GB Gmail account and Google has decided to make all Gmail accounts 10GB. What do you do now?” Yeah, that’s dumb and Google won’t pull that stunt, but that’s not the point. The point is to show Google that you aren’t a pencil pusher.

You can also talk about allocating space as it is needed. For example, everyone is allocated space on an as-needed basis: while 15GB is the maximum, the server does not actually reserve 15GB per account up front, which saves a lot of money.

Is there an option to provide offline downloads of e-mail before deletion? As in a very easy way (one click download – think of Facebook data download) instead of forcing users to use POP3/IMAP clients.

Is there an option to sort emails by size and/or date? [There is but this isn’t user friendly].

Is there an option to mass-archive e-mails? Text lends itself to very good compression (it needs to be lossless; RLE is one simple lossless scheme). However, attachments might still be a problem (could images be compressed lossily?).
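To make "lossless" concrete, here is a minimal run-length encoding (RLE) sketch. It is only meant to show that decoding gives back exactly the original text; the function names are made up, and a real mail archive would more likely use a general-purpose compressor, since plain RLE only helps when characters repeat in long runs.

```javascript
// Run-length encoding: collapse runs of a repeated character into [char, count] pairs.
function rleEncode(text) {
  const runs = [];
  for (const ch of text) {
    const last = runs[runs.length - 1];
    if (last && last[0] === ch) {
      last[1] += 1; // same character as the previous run: extend it
    } else {
      runs.push([ch, 1]); // different character: start a new run
    }
  }
  return runs;
}

// Decoding expands every pair back out, so no information is lost.
function rleDecode(runs) {
  return runs.map(([ch, count]) => ch.repeat(count)).join("");
}

const encoded = rleEncode("aaaabbbcca");
console.log(encoded.length); // 4 runs: a x4, b x3, c x2, a x1
console.log(rleDecode(encoded) === "aaaabbbcca"); // true
```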

Provide an option to save e-mails and/or DOC attachments to something like Google Docs (unlimited space!). Offer conversion of unsupported documents to formats that Google Docs supports.

Look into databases/backend with least data overhead?

How would you reduce bandwidth consumption in search when users in third-world countries search?

Example 1)

  • Don’t show Previews
  • Disable type-ahead or “live search” (Google Instant?)
  • Don’t have AJAX interactivity at all: let all be static web pages
  • Maybe set up local caching servers in the country to cache popular searches?
  • Don’t load anything from 3rd-party servers (like ads)

Example 2)

This is a very tricky question.

You must be able to say that there is a tradeoff between user experience and bandwidth.

All of the ideas above degrade the user experience, so you need to identify which degradations are acceptable.

Compressing the data sent to the user is another way of saving bandwidth, and it does not degrade the user experience.

Assume that you put the [code sample app] up on the web, and it became really popular. What are some optimizations you could do to speed up response times under load?

Example 1)

  • Caching. Both database caching (database gives you a response very fast to a popular query) and web server caching (web server gives a user cached page for pages that are static and have been requested by someone within the last N minutes).
  • Set up a load balancer for the web server with which you can have multiple servers responding to customer’s requests. Same for databases.
  • Set up the server in a “cloud” with servers located in different locations, distributed, closer to the customer
  • Serve all static content from CDNs
  • Use Google PageSpeed and similar tools to optimize the page in other ways
  • AMP – Accelerated mobile pages – if the sample app has content as the hero
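The web server caching idea ("serve the cached page if it was requested within the last N minutes") can be sketched as a small time-to-live cache. This is an assumption-laden illustration: `TtlCache`, `getOrCompute`, and the five-minute window are names and values invented here, not part of any specific framework.

```javascript
// A tiny time-to-live cache. `now` is injectable so the example can be tested
// without waiting; by default it uses the real clock.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Returns the cached value, or calls `compute` (the slow path) and stores it.
  getOrCompute(key, compute) {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) {
      return hit.value; // still fresh: skip the expensive work
    }
    const value = compute();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

// Usage: render a page at most once per five minutes.
let renders = 0;
const cache = new TtlCache(5 * 60 * 1000);
const renderHome = () => {
  renders += 1;
  return "<html>home</html>";
};
cache.getOrCompute("/home", renderHome);
cache.getOrCompute("/home", renderHome);
console.log(renders); // 1
```

The same shape works for database query results: the key is the query, and `compute` is the call to the database.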

Example 2)

You can add an in-memory data store such as Redis to speed up reads.

This can be combined with the algorithms you have used for distributed systems.

Reading about prefetching will also help. Prefetching marks a resource to be fetched ahead of time and stored in the local cache, which improves response times.

Separating client-side and server-side optimizations is useful as well.

Design a streaming service (live stream video and send it to people interested in watching it)

  • Would love an example here…
    • Some not exactly tech answers… https://www.quora.com/How-can-I-build-my-own-Netflix-like-subscription-streaming-video-site
    • (Manu’s suggestion) My go-to answer is “Provide the quality that users can afford”. This is how Netflix works: it starts at the lowest quality, probes the available bandwidth, and slowly steps the quality up until it finds the level where the service is optimal. If the user sees a buffering or loading icon, that is a bad experience; on the other hand, the user should get the best quality their connection can afford. So we experiment by stepping the bitrate from lowest to highest until we hit a buffering condition.
      It can be implemented in multiple ways: for example, as a continuous adjustment of the bitrate (like an analog current) or in discrete steps (like a digital signal).
  • How to Design Youtube (via Gainlo)
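The "start low and step up until buffering" idea from the suggestion above can be sketched as follows. Everything specific here is an assumption for illustration: the bitrate ladder values, the 20% headroom, and the `nextRendition` name are made up and are not how any particular service is known to work.

```javascript
// Available renditions, lowest quality first (bitrates in kbps are made-up values).
const LADDER = [400, 1000, 2500, 5000];

// Pick the next rendition index given the current one and what the player observed.
function nextRendition(current, measuredKbps, isBuffering) {
  // Visible buffering is the worst experience: step down immediately.
  if (isBuffering) {
    return Math.max(0, current - 1);
  }
  // Step up one level only if the next bitrate fits with some headroom.
  const next = current + 1;
  if (next < LADDER.length && measuredKbps >= LADDER[next] * 1.2) {
    return next;
  }
  return current;
}

// Start at the lowest quality and climb while the connection keeps up.
let level = 0;
for (const kbps of [3000, 3200, 3100, 3000]) {
  level = nextRendition(level, kbps, false);
}
console.log(LADDER[level]); // 2500
```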

Implement a Fibonacci algorithm, first naively and then optimize (iterative vs. recursive, memoizing, etc…)
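One possible way to walk through this in an interview, sketched in JavaScript. The function names are made up, and the sketch assumes the common convention F(0) = 0, F(1) = 1.

```javascript
// Naive recursion: recomputes the same subproblems, roughly exponential time.
function fibNaive(n) {
  if (n < 2) return n;
  return fibNaive(n - 1) + fibNaive(n - 2);
}

// Memoized recursion: each value is computed once and remembered, O(n) time.
function fibMemo(n, memo = new Map()) {
  if (n < 2) return n;
  if (memo.has(n)) return memo.get(n);
  const value = fibMemo(n - 1, memo) + fibMemo(n - 2, memo);
  memo.set(n, value);
  return value;
}

// Iterative: O(n) time, O(1) extra space, no deep call stack.
function fibIter(n) {
  let prev = 0;
  let curr = 1;
  for (let i = 0; i < n; i++) {
    [prev, curr] = [curr, prev + curr];
  }
  return prev;
}

console.log(fibNaive(10), fibMemo(50), fibIter(50)); // 55 12586269025 12586269025
```

The naive version is fine for explaining the idea but becomes painfully slow around n = 40; the other two stay fast.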

How would you design the system around implementing Google Predictive search suggestions.

Vague question.

Clarifying questions :

1. What data do we have access to?

2. What is the kind of data we are collecting?

Based on these two (or more) factors:

1. Identify user needs using machine learning and then implement it.

You have a ladder of N steps (rungs). You can go up the ladder by taking either 1 step or two steps at a time, in any combination. How many different routes are there (combinations of 1 steps or 2 steps) to make it up the ladder?  

Example 1)

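For N = 10, the answer is 89.

You can use a recursive Fibonacci-style function (here n is 10):

function countP(n) {
   // Base cases: one way to climb 1 rung, two ways to climb 2 rungs
   if (n === 1 || n === 2) return n;
   // The last move is either a single step or a double step
   return countP(n-1) + countP(n-2);
}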

Or you can use combinations. The climb can consist of 10 ones; 8 ones and 1 two; 6 ones and 2 twos; 4 ones and 3 twos; 2 ones and 4 twos; or 5 twos. In each case we choose which of the moves are the ones (a climb with k twos has 10 - k moves in total):

C(10,10)+C(9,8)+C(8,6)+C(7,4)+C(6,2)+C(5,0) = 89

Example 2)

You don’t need to be familiar with the Fibonacci series. Simply test the first few cases manually and you can deduce that there’s a pattern.

A ladder with 2 rungs (that is, the floor, rung #1 and rung#2): 2 ways to climb. 1+1, 2.

A ladder with 3 rungs: 3 ways to climb. 1+1+1, 1+2, 2+1.

A ladder with 4 rungs: 5 ways to climb. Think of it as climbing 1 rung, after which you’re at a 3-rung ladder (3 ways to climb), or climbing 2 rungs, after which you’re at a 2-rung ladder (2 ways to climb). Overall you have 3+2 ways.

A ladder with 5 rungs: like the previous case, you climb 1 and reach a 4-rung ladder, or climb 2 and reach a 3-rung ladder. Overall 5+3, or 8 ways.

..

..

..

A ladder with N rungs: the sum of the ways to climb an (N-1)-rung ladder and an (N-2)-rung ladder:

Ways(N) = Ways(N-1) + Ways(N-2).

This can be solved with recursion (brute force) or, more efficiently, with a simple bottom-up loop.
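As a sanity check, the recurrence above can be computed bottom-up and compared against a count by combinations. The function names below are made up for this sketch; both approaches should agree for any N.

```javascript
// Bottom-up version of Ways(N) = Ways(N-1) + Ways(N-2).
function waysIterative(n) {
  let prev = 1; // Ways(0): one way to climb zero rungs (do nothing)
  let curr = 1; // Ways(1)
  for (let i = 2; i <= n; i++) {
    [prev, curr] = [curr, prev + curr];
  }
  return curr;
}

// Binomial coefficient C(n, k), computed without factorials.
function choose(n, k) {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i;
  }
  return result;
}

// Same count via combinations: with `twos` two-rung moves there are
// n - twos moves in total, and we choose which of them are the twos.
function waysByCombinations(n) {
  let total = 0;
  for (let twos = 0; 2 * twos <= n; twos++) {
    total += choose(n - twos, twos);
  }
  return total;
}

console.log(waysIterative(10), waysByCombinations(10)); // 89 89
```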

How does a Web App work?

When are Bayesian methods (probability of A given B) more appropriate than “Artificial Intelligence” techniques for predictive analytics?

Ask clarifying questions…

What does the interviewer mean by AI? Machine learning? Bayesian methods are used in much of machine learning, so the distinction matters.

Again, this is a hazy question, so you should definitely ask the interviewer what he or she means by artificial intelligence.

Traditional AI is based on knowledge-based systems, which follow a set of rules expressed in something like first-order (predicate) logic.

These can be used in natural language systems. For example, we know that English word order is Subject-Verb-Object.

Another example is a social network where we know the size of the network and everyone in it. We can find the most influential people without any Bayesian inference.

Bayesian inference is based on the premise that, given what we know about the past, we can estimate how likely a different event is to occur.

For example, how likely is Trump to become President despite the media not supporting him?

Much of machine learning is Bayesian in nature: it asks for the most likely outcome of an event given a lot of past information about it.

In short, if we know there are rules in a system and these rules cannot be broken, we use traditional AI methods. For example, we will always know for sure where to place a book in a library; if we do not, we have to improve our cataloging system, i.e. add a new rule.

With stock markets, however, we know a lot about the past but our predictions will always carry some error that we cannot eliminate, hence Bayesian methods.
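Since the question defines Bayesian methods as "probability of A given B", it can help to have the formula at your fingertips. The sketch below applies Bayes' theorem to a spam example; the numbers and the `posterior` function name are invented purely for illustration.

```javascript
// Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B),
// with P(B) = P(B|A) * P(A) + P(B|not A) * P(not A).
function posterior(priorA, probBGivenA, probBGivenNotA) {
  const probB = probBGivenA * priorA + probBGivenNotA * (1 - priorA);
  return (probBGivenA * priorA) / probB;
}

// Made-up numbers: 1% of emails are spam (A); a keyword (B) appears in
// 90% of spam and in 5% of legitimate mail.
const pSpamGivenKeyword = posterior(0.01, 0.9, 0.05);
console.log(pSpamGivenKeyword.toFixed(3)); // "0.154"
```

Even with a keyword that shows up in 90% of spam, the posterior is only about 15%, because spam is rare to begin with. That kind of uncertainty-aware reasoning is what a fixed rule cannot express.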

Optimize algorithms to create an ordered list

What is Big O Notation?

How would you design a scalable system that can stream YouTube video to Google Glass, provided the screen resolution is X?

Design a simple load balancer for google.com.

  • What data structures would you use? Why?
  • Define access/delete/add complexity (Order of) for each data structure and explain your choices.
  • Design an algorithm to add/delete nodes to/from the data structure.
  • How would you pick which server to send a request to? Why? Why not?

Design the Google search service – essential pieces, logic, discuss why/why not/trade offs.

How would you perform statistical frequency analysis on a random raw data source to get a set of results that are most relevant to humans?

Steps to work through:

1. Relevancy is defined by the answers we are seeking, so defining the question is important.

2. Look at the dataset. Understand the data.

3. Convert the data into variables.

4. Find out basic stats (mean, median, variance, mode, interquartile range).

5. Visualize using different charts.

6. Look at the relationships.

“How would you ‘write a program’ to do [X]?”

This is a very vague question, and the interviewer became dissatisfied when I answered from an architecture standpoint. What he really wanted to ask was “How would you design an ALGORITHM to do [X]?”

How does online authentication work?

https://medium.com/@bitshadow/how-basic-http-authentication-and-session-works-d29af9caec31

What does the distribution of page load times look like? Why? Which is bigger, the mean or the median? What about mobile sites?

How would you explain cloud computing to a 6 year old  

  • Example 1)
  • Example 2)
    • You design a great new type of pizza and you want everyone to be able to eat it
    • You decide to start a restaurant and you hire 10 chefs for your first week. Monday to Friday hardly anyone comes to the restaurant so only 2 of your chefs are needed and the other 8 are playing board games on your payroll.
    • The weekend comes around and suddenly your restaurant is swamped. Your chefs are struggling to make enough pizza – you’d need 20 chefs to make this much pizza, and you have to turn away half your customers.
    • So you are wasting money on chefs you don’t need in the week, and you don’t have enough on the weekend. Chef contracts are long and expensive to set up and they don’t like working half a week. What’s more, the pattern of people coming to eat pizza isn’t always predictable.
    • Enter the pizza chef agency. They have 100 pizza chefs all around the city. You pay them to supply chefs to you, and they always supply the number of chefs needed at any time, so you always have enough and never too many. You pay a flat fee to the agency, and then you only ever pay for however many chefs are making pizza for you at any one time.
    • Let’s say it’s a Thursday and you have 3 chefs in the kitchen making pizza. Suddenly, a massive group arrives and you need a lot more chefs. The agency automatically detects this and sends over 5 more chefs ASAP on their motorbikes to make pizza. When the group leaves, the restaurant is really empty, so only one of the chefs stays, while the rest leave for another suddenly busy pizza place that hires the pizza chef agency.
    • You managing your own restaurant is you running your own servers. Pizza chefs are the servers. Requests for pizza are web requests to your servers, pizza being the website/response. The pizza chef agency is a cloud computing company providing scalable servers/PaaS a la AWS.
    • Bit of a silly example but I think it’s quite fun!

What happens from the point when you type in a URL in your browser to the point the page gets displayed?  

  • Example 2

1. You enter a URL into the browser

2. The browser looks up the IP address for the domain name

3. The browser sends an HTTP request to the web server

4. The server may respond with a permanent redirect (the example this list is drawn from used Facebook’s server)

5. The browser follows the redirect

6. The server ‘handles’ the request

7. The server sends back an HTML response

8. The browser begins rendering the HTML

9. The browser sends requests for objects embedded in HTML

10. The browser sends further asynchronous (AJAX) requests

How do you code integer division without using the division operator (‘/’)?

  • Division is effectively repeated subtraction, e.g. let’s do 12/4. You have to subtract 4 from 12 three times to get 0, so the answer is 3. This is integer division, so we can ignore any remainder.
  • Code would look like:
    function divide(total, divisor) {
      // Assumes total >= 0 and divisor > 0
      let answer = 0;
      // Keep subtracting while a full divisor still fits
      while (total >= divisor) {
        total = total - divisor;
        answer = answer + 1;
      }
      return answer;
    }
    divide(12, 4); // 3

What are the network implications of a handheld device meant for elementary school kids

  • I suspect there are two ways to look at this.
    • The network of the school, wireless, etc…
  • Kids like to play games, watch videos / cartoons.

So bandwidth should be pretty high.

Talk about optimizing.

How does TCP/IP work?

Blogpost: TCP vs UDP

Debugging – YouTube has a performance problem with page load in some region, how do you diagnose.

We are assuming here that the page load time has gone up. There could be various reasons for it:

Has the traffic from the region increased?

  1. Compare the current traffic to the historical value for the same day of week and hour of day to get an estimate of expected traffic.
  2. If an anomaly is detected and the traffic has increased, then we need to bring up additional servers to handle the new traffic.
  3. Traffic could be increased due to the following reasons:
    1. Video(s) going viral
    2. Marketing activity in the area which is bringing in a lot of traffic
    3. A localised event (like a national holiday) which is bringing in a lot of traffic to YouTube

No Anomaly detected in the traffic. This means the problem is at our end.

Data Centres

  1. Are the data centres serving the region working normally, or are there faults that are causing a lot of connections to be dropped?

Tracking Code

  1. Did we change the tracking code in a way that might be reporting misleading load times (a false alarm)?

Segment by Device

  1. Is page load affected on mobile or desktops?
  2. If desktop, is it specific to a particular browser or browser version?
  3. If it’s specific to mobile then which OS? iOS or Android? Any specific versions?

Services

What is the status of the services being used to render this page?

  • CDN
  • Streaming
  • Authentication
  • Recommendations
  • Analytics
  • Logging

Did we deploy something new?

  1. Frontend changes – Like a new JS which is not loading properly
  2. Backend Changes – fetching or streaming changes which might be affecting the page loads
  3. UI changes – maybe the CDNs are not serving the UI images, which makes the page load slowly.

Other Answers

  • Example 1)
    • First let’s see what regions we have and which server is supposed to be responding in that region
    • Then see if it’s actually THAT server responding (its IP address)
    • Check if DNS is quick enough in that region
  • Example 2)

1. Number of requests: has it suddenly increased? Is the increase regular? If so, we might have to add new hardware.

2. Server logs.

3. Provider performance slowdown.

What is hash table, how does it compare to other data structures, and when would you use one?  

In computing, a hash table (hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

Ideally, the hash function will assign each key to a unique bucket, but it is possible for two keys to generate an identical hash, causing both keys to point to the same bucket. In practice, most hash table designs assume that hash collisions (different keys that are assigned by the hash function to the same bucket) will occur and must be accommodated in some way.

In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized) constant average cost per operation.

In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.
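To make buckets and collisions concrete, here is a toy hash table that resolves collisions by chaining (each bucket holds a small list of entries). It is a teaching sketch only: the class, its method names, and the simple string hash are assumptions made here, and in real JavaScript code you would just use the built-in `Map`.

```javascript
// A toy hash table with separate chaining: each bucket is an array of [key, value] pairs.
class HashTable {
  constructor(bucketCount = 8) {
    this.buckets = Array.from({ length: bucketCount }, () => []);
  }

  // Simple string hash (not production quality), reduced to a bucket index.
  indexFor(key) {
    let hash = 0;
    for (const ch of String(key)) {
      hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
    }
    return hash % this.buckets.length;
  }

  set(key, value) {
    const bucket = this.buckets[this.indexFor(key)];
    const entry = bucket.find(([k]) => k === key);
    if (entry) {
      entry[1] = value; // key already present: overwrite
    } else {
      bucket.push([key, value]); // new key, possibly colliding: append to the chain
    }
  }

  get(key) {
    const bucket = this.buckets[this.indexFor(key)];
    const entry = bucket.find(([k]) => k === key);
    return entry ? entry[1] : undefined;
  }
}

const table = new HashTable(2); // tiny on purpose, to force collisions
table.set("apple", 1);
table.set("banana", 2);
table.set("cherry", 3);
console.log(table.get("banana")); // 2
```

With only two buckets and three keys, at least two keys must share a bucket, yet every lookup still returns the right value. With enough buckets the chains stay short, which is where the constant average lookup cost comes from.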

What makes up the octet of an IP range? Why is it important in routing?  

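An IPv4 address is made up of four octets, each an 8-bit number from 0 to 255. The leading bits (the network prefix, whose length is given by the subnet mask) identify the network and then the subnetwork; the remaining bits identify the specific device. Routers forward packets by matching the network prefix rather than every individual address, which keeps routing tables small. Some IP ranges are reserved (e.g. addresses starting with 127 denote the local host).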
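The split between the network part and the device part of an address can be sketched in code. The helper names below are made up, and the example addresses are ordinary private and loopback ranges chosen for illustration.

```javascript
// Split a dotted-quad IPv4 address into its four octets (each 0-255).
function toOctets(address) {
  return address.split(".").map(Number);
}

// Pack the four octets into one unsigned 32-bit number.
function toUint32(address) {
  return toOctets(address).reduce((acc, octet) => ((acc << 8) | octet) >>> 0, 0);
}

// True when `address` falls inside the network `base/prefixLength`.
function inNetwork(address, base, prefixLength) {
  // The mask has `prefixLength` leading one-bits; /0 matches everything.
  const mask = prefixLength === 0 ? 0 : (0xffffffff << (32 - prefixLength)) >>> 0;
  return ((toUint32(address) & mask) >>> 0) === ((toUint32(base) & mask) >>> 0);
}

console.log(toOctets("192.168.1.42")); // the four octets 192, 168, 1, 42
console.log(inNetwork("192.168.1.42", "192.168.1.0", 24)); // true
console.log(inNetwork("192.168.2.42", "192.168.1.0", 24)); // false
console.log(inNetwork("127.0.0.1", "127.0.0.0", 8)); // true
```

A router makes essentially this comparison: it only needs to know which prefix a destination falls under, not where every single device is.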

Here’s a data set. Something changed and here’s the new data set. Explain what happened.

What changed?

Is the change statistically significant? This is a classic A/B testing question.

If it is significant, what caused the change?

What is the CAP theorem?

Explain Polymorphism to your grandmother

  • A dog, a spider, and a human can all walk at some speed. Everyone knows this. But, when talking about the specifics of how they walk, each implementation is different. A spider has 8 legs and will ‘scurry’ around very quickly (relatively). A dog has 4 legs and will leap and jump and run around. A human has 2 legs and will place one in front of the other, enabling them to walk.
    Well, if you were in control of a group of misfits, namely a group of spiders, dogs and humans, and you were their leader, you expect them to be competent and think for themselves. So, when you command them to, “Walk” you don’t want to look at spiders and yell, “Alright, scurry and move all 8 legs!” and then look at dogs and yell, “Alright, leap using your 4 legs!” and then finally look at the human and yell, “Alright, and you use 2 legs, go!”
    You simply want to look at your group of misfits, no matter what they might be, and say, “Walk.” They know how to handle themselves from there.
    Implementing Polymorphism is a lot like this, in my opinion. Your main driver program, or whatever happens to be using a variety of objects at the moment, does not want to do the guess work of figuring out what object it is dealing with and their specific implementation of something as basic as walking.
    So, we use things like inheritance which guarantees things that have something similar / in common will have similar properties. Dogs, humans, and spiders will all have walk methods. So, no matter what object we are currently working with, we do not care. We tell that object to walk, and it will handle the ‘How’ based on its implementation.
    That is basically polymorphism for me.
    Disclaimer: I’m a first year student so that’s just my interpretation, if this makes no sense, I’m sorry. I’m learning too.
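The "group of misfits" story translates almost directly into code. The class names below follow the story and are otherwise made up; the point is only that the caller says `walk()` and each object decides how.

```javascript
// Base class: every walker promises a walk() method.
class Walker {
  walk() {
    throw new Error("subclasses decide how to walk");
  }
}

class Spider extends Walker {
  walk() {
    return "scurries on 8 legs";
  }
}

class Dog extends Walker {
  walk() {
    return "leaps and runs on 4 legs";
  }
}

class Human extends Walker {
  walk() {
    return "puts one foot in front of the other";
  }
}

// The "leader" just says walk; each object supplies its own how.
const misfits = [new Spider(), new Dog(), new Human()];
const report = misfits.map((m) => m.walk());
console.log(report);
```

The loop over `misfits` never checks what kind of object it is dealing with, which is the whole benefit described above.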

Product Design Questions

The following are Product Design questions from Google that have been found on Glassdoor and other places on the web. If you have any answers you want to add to the questions below, or if you have more questions to add, email them to us!

  • List your top 3 Google products you use everyday, pick one, what three features do you like about it, what three things would you improve? How would you grow it 2x?
  • How would you improve YouTube?
  • Design a home automation system
  • Recommend 5 YouTube videos to a user and how will you measure success/failure of your suggestions?
  • Design a washer and dryer.
  • Discuss the Netflix model and what can be improved.
  • Suppose you were Google CEO, what would your strategy be?
  • Suppose I want to launch a death star, how would you go about that?
  • Design a replacement for the set top box in hotel room.
  • How would you design a Mars rover
  • Design an app for the DMV
  • List some use cases for driverless cars
  • How would you implement an Amazon Mayday button on Gmail?
  • Design an app for celiac patients
  • The alarm clock industry is really waning as of late. What could you do to curb this trend?
  • Design scenario: Let’s say you have a tv remote with 3 buttons, mute, vol up, vol down. What would you expect to happen if a user hits vol up button when it’s muted? Talk through the scenarios and what the user is trying to do. What would you expect to happen if you hit vol down button when it’s muted?  
  • How would you design a dictionary look-up for scrabble?
  • Differences between iOS and Android
  • 8% drop in hits to Google.com. Larry Page walks into your office asks you to think about what the reasons might be. Enumerate.
  • 100 floors in a building – 3 main occupants – top 10 floors (company 1), next 90 floors evenly split, top 45 company 2, bottom 45, company 3. Design the elevator system.
  • You’re a PM for Twitter and you’re launching a feature that tracks the geographical position of every tweet. Go!
  • What would you do to double the revenue for an existing Google product in the next 12 months?  
  • Let’s talk about Google Docs. Let’s say we wanted to offer offline functionality, and we want to automatically make some documents available offline. How would you go about deciding which ones?
  • Choose a consumer product. You have unlimited funds. What’s your strategy?  
  • How would you design a product that increases the consumption of fruit in the US?  
  • What would you want to put up in a reporting dashboard for a web service like adwords?  
  • Imagine you were creating a search engine for events, how would you go about it?  
  • You have localized a website in Spain and the traffic has now reduced, what could be the reasons?
  • How would you improve throughput of an airport by 100%? What are the implications?
  • How would you change the security/check process if you had unlimited resources?
  • Design a streaming service (live stream video and send it to people interested in watching it)  
  • Given Engineers, Computers and Servers, how would you build a travel and accommodation web service, and would you recommend doing so?  

Designing a New Product

If you have any answers you want to add to the questions below, or if you have more questions to add, email them to us!

How do you design a bookcase for children?

Assume that children are 5-10 years old and the bookshelf will be used in children’s rooms.

  • Think about the use cases for using the bookshelf.
  • Think about any special use cases that the kids might have which adults don’t.
  • Identify gaps. Solve those gaps.
  • See the links below after you have designed the solution to verify it against real-world ideas.

https://www.linkedin.com/pulse/design-everyday-things-bookcase-children-amey-bhalerao

https://in.pinterest.com/explore/kids-room-shelves/

https://www.babble.com/home/10-inspiring-book-displays-for-kids-rooms-2/

https://www.linkedin.com/pulse/product-owner-interview-tip-how-improve-x-uglje%C5%A1a-eri%C4%87

Estimation / Analytical / Case Questions

The following are Estimation and Case Study questions that have been found on Glassdoor and other places on the web. If you have any answers you want to add to the questions below, or if you have more questions to add, email them to us!

Unanswered Questions

  • How would you measure the success of the WWDC event?
  • What is the number of book titles published in the US each year?
  • What is the market size for rocket boots?
  • Estimate total size of storage needed for Inbox
  • What is the total bandwidth used by a Gmail server?
  • Estimate the required bandwidth for a college campus
  • How many Gmail users are there in the world?
  • How much revenue does YouTube make in a day?
  • Let’s say we wanted to build a service to index all the images in the world. How would you think about the cost to do that?  
  • Size the digital advertising market? 
  • At what rate is the photo upload traffic on Facebook growing per year?
  • You have a grocery delivery service (similar to Amazon Fresh) which delivers food within 24 hours.
    Estimate how many trucks you need to operate this service.
  • If Nest had to launch an automatic garage door opener, what would the market size be in the USA? Would it be worth it? How would you launch the first-generation product?
  • How many calories are in a grocery store?  
  • How much storage space is required to host all the images of Google Street view?
  • How would you solve homelessness in San Francisco?
  • What is the market for driverless cars by 2020?
  • The NBA championships are about to happen and you produce merchandise showcasing the winning team–but, you don’t know which team that will be. What do you produce and how much do you produce to dress the stadium visitors with merchandise?
  • Calculate the storage needs for all videos on YouTube for current and future years  
  • How much time is spent waiting in SFO airport every year?
  • If you were issued a Cease and Desist for your website, how would you go about taking it down?  
  • Estimate the bandwidth needed if you built an optical fiber connection to a colony on Mars
  • After you have designed a home automation system, they ask an estimation question: how much revenue in the first year?
  • How many hard drives do you need to store all the data on gmail for all users?

Answered Questions

Number of buses needed to transport Google employees between Mountain View and Bay Area residences.

Assumption by Richard:

1. Only consider the morning rush hour (6-9 AM); the buses will be used in the afternoon for the opposite route.

2. Each trip takes 1 hour, so the 6 AM bus arrives at campus around 7 AM, is back at the residences around 8 AM, and takes 1 more trip.

3. Each bus can take around 50 people.

4. Total Google employees in Mountain View: 15K.

5. Only 40% of employees would like to take the bus. The other 60% will either use public transport or prefer to drive.

6. Google commuters are evenly distributed over the 3 hours.

Calculation:

Total Google commuters: 15K*40% = 6K

The 6 AM commuters are 1/3 of that: 6K/3 = 2K

Buses needed at 6 AM: 2K/50 = 40 buses

Buses needed at 7 AM: 2K/50 = 40 buses

The 8 AM commuters use the buses that came back after the 6 AM trip.

Result:

The total number of buses needed is 40+40 = 80 buses.
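The arithmetic above can be wrapped in a small function so that each assumption becomes a parameter you can vary in the interview. The function and parameter names are made up for this sketch; it assumes departures are one hour apart and uses the 15K employee figure from the assumptions list.

```javascript
// All inputs are the assumptions listed above; change them to test other scenarios.
function busesNeeded({ employees, busSharePercent, seatsPerBus, departureSlots, roundTripHours }) {
  const riders = (employees * busSharePercent) / 100; // 15,000 * 40% = 6,000
  const ridersPerSlot = riders / departureSlots; // spread evenly over 6, 7 and 8 AM
  const busesPerSlot = Math.ceil(ridersPerSlot / seatsPerBus);
  // A bus is free again after one round trip, so only the hourly slots that
  // fall within one round trip need their own buses.
  const concurrentSlots = Math.min(departureSlots, roundTripHours);
  return busesPerSlot * concurrentSlots;
}

const total = busesNeeded({
  employees: 15000,
  busSharePercent: 40,
  seatsPerBus: 50,
  departureSlots: 3,
  roundTripHours: 2,
});
console.log(total); // 80
```

Stating the estimate this way makes it easy to answer follow-ups such as "what if 60% of employees take the bus?" without redoing the whole calculation.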

You see Starbucks on either side of the road sometimes, why do you think they do this?

https://www.quora.com/You-see-Starbucks-on-either-side-of-the-road-sometimes-why-do-you-think-Starbucks-Corp-does-this

Data Driven decision making

How long should you run an A/B test on your site before you declare one a winner? (Question on Quora)

Conversion optimization A-B testing guidelines

  1. Always run a test to a statistical confidence of 90% or greater. Note that different A/B testing systems use different algorithms, such as chi-squared, z-score and derivatives of these. This means there is some variance when comparing data between different testing systems, but if you are testing to above 90% you are OK.
  2. Always run a test for at least seven days, even if you have a statistical winner earlier. This will account for advertising, user personas and user device variants. Of course, this also depends on the amount of traffic and conversions you are getting. It is not unusual to run a test for a month on a low-volume page with a couple of thousand visitors.
  3. Look at each test variant’s cumulative conversion rate compared to the control and see if the lines cross or are consistently above or below:
    1. Above the control – most likely a solid lift.
    2. Criss-crossing the control – the test has not run its course yet and is not statistically accurate.
    3. Below the control and not going positive – most likely a negative lift; this type of variant can be stopped early to speed up testing.
  4. Check the data by device. Often devices can cancel each other’s results out. Desktop may have a positive result but mobile may not, making the total conversion lift not as great as it could be. This also tips you off that mobile-only tests are needed.
  5. Be conscious of other factors during your testing period, such as holidays, ad campaign changes, and lifts in traffic due to other corporate initiatives.
  6. Doing an A/A test can help determine the length of the test and the minimum conversion lift.
  7. If you are not sure, it does not hurt to continue running the test unless the results are negligible.
  8. Things to consider: the representative nature of your data, the normalization of your data and the representation of short- versus long-term behavior, how much differentiation there is (you need a lot less data for a 20% change than for a 4% change), and the normal variance of your site (90% of sites are right at 2% with enough data, so you can use that if not sure). Once you have ALL of that, you can evaluate a statistical test with enough information to leverage its answers in light of the assumptions that make up that model.
  9. It all depends on statistical significance. As a rule of thumb, I like to have at least 50 data points (25 for A, 25 for B). If you have a highly trafficked web page, you could get this result in five minutes. For a less popular page, it might take two weeks.
  10. The main thing is that you need to have enough data that you can confidently pick one variant over the other, and not have the results thrown off by a few flukes.
  11. Keep type 1 and type 2 errors (false positives and false negatives) in mind.
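The guidelines mention z-scores and a 90% confidence threshold; the sketch below shows one common way to compute that, a two-proportion z-test. It is an illustration rather than the method of any particular testing tool: the function names and visitor numbers are made up, and 1.645 is the standard two-sided critical value for 90% confidence.

```javascript
// Two-proportion z-test for conversion rates of control (A) and variant (B).
function zScore(conversionsA, visitorsA, conversionsB, visitorsB) {
  const rateA = conversionsA / visitorsA;
  const rateB = conversionsB / visitorsB;
  // Pooled rate under the assumption that A and B truly perform the same.
  const pooled = (conversionsA + conversionsB) / (visitorsA + visitorsB);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / visitorsA + 1 / visitorsB));
  return (rateB - rateA) / standardError;
}

// 1.645 is the two-sided critical value for 90% confidence.
function isSignificantAt90(conversionsA, visitorsA, conversionsB, visitorsB) {
  return Math.abs(zScore(conversionsA, visitorsA, conversionsB, visitorsB)) > 1.645;
}

console.log(isSignificantAt90(200, 1000, 250, 1000)); // true  (z is about 2.68)
console.log(isSignificantAt90(200, 1000, 205, 1000)); // false (z is about 0.28)
```

A passing check like this is still subject to the other guidelines above: run for at least a full week and look at the data by device before declaring a winner.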