First, let me tell you how I selected these questions. I simply trusted Google. I typed the title of this very post into the search bar, then looked through all the posts on the first two pages of results (I did not have all day). Then I selected 12 questions that were common to almost all of them.
Then I used parts of those questions as keywords, looked up their search volumes, and selected the top five. Well, a machine learning algorithm trained to do the same task could have done it in a fraction of a second. However, it could take a pretty long time to train that algorithm. Keep this in mind; I am going to refer to this statement later in the article.
1. Are data science and data analytics the same?
I found a number of variants of this question. Basic questions like these show one thing for sure: a lot of people want to get in the know about analytics. Now, let us try to answer it.
No. Data science and data analytics are not the same. Data science is a discipline that combines computer science and statistics to find patterns in data. The researchers in data science are concerned with finding new ways of looking at data, improving the existing models, and creating new models to drive prediction, classification, and other such processes.
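Data analytics is the practice of analyzing data to answer specific questions. Data analytics professionals are concerned with cleaning and structuring data, running it through predictive models, and using the results to find actionable insights for a business.
The two might sound similar, but they are not: data science builds and improves the models, while data analytics applies them to business problems.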
2. What is the difference between classification and prediction in the case of a decision tree?
Well, this is a jump; I am trying to follow some kind of order, though. Before I answer the question, a quick definition: a decision tree is a model that maps decisions to their possible outcomes, factoring in a set of parameters.
Let us use an example to answer the question. Let us say there are 15 patients in the cardiac ward of a hospital. You build a model that classifies those patients according to their specific ailment and records the outcomes of the treatment already applied. This is an example of classification.
A new patient arrives with an ailment shared by a few of the existing patients. Now, if you use the classification model to estimate the outcome of a certain treatment for that patient, it would be a prediction.
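The patient example can be sketched in a few lines of Python. This is only a minimal illustration that follows the framing used here, not a production model: it builds a one-level decision tree (a "stump") that branches on the ailment and stores the most common outcome in each leaf. The ailments, outcomes, and function names (fit_stump, predict) are all invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical records for the 15 existing patients: (ailment, treatment outcome).
# Every value here is invented purely for illustration.
existing_patients = (
    [("arrhythmia", "recovered")] * 4 + [("arrhythmia", "readmitted")] * 1
    + [("heart failure", "recovered")] * 2 + [("heart failure", "readmitted")] * 3
    + [("angina", "recovered")] * 5
)


def fit_stump(records):
    """Classification step: group known patients by ailment and keep
    the most common outcome observed in each group (one leaf per ailment)."""
    branches = defaultdict(Counter)
    for ailment, outcome in records:
        branches[ailment][outcome] += 1
    return {
        ailment: counts.most_common(1)[0][0]
        for ailment, counts in branches.items()
    }


def predict(tree, ailment, default="unknown"):
    """Prediction step: look up the leaf for a new patient's ailment.
    Ailments the tree has never seen fall back to `default`."""
    return tree.get(ailment, default)


tree = fit_stump(existing_patients)

# A new patient arrives with an ailment shared by some existing patients.
predicted_outcome = predict(tree, "heart failure")
print(tree)
print(predicted_outcome)
```

Building the tree from the 15 known patients is the classification part; calling predict for the newcomer is the prediction part. A real decision tree would split on several parameters and choose its splits from the data, which libraries such as scikit-learn handle for you.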
3. What are the best practices in data mining?
There are, in fact, quite a few golden rules that you should learn by heart.
- Identify and focus upon actionable data. In order to do this, it is important to know the problem you are trying to solve very well.
- Do not lose focus while cleaning the data.
- Find and eliminate false predictors.
- Be critical if the results are way too good to be true.
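The last two rules lend themselves to a quick sanity check. The sketch below is one possible way to do it, under my own assumptions: it flags any single column that predicts the target almost perfectly on its own, which is often a sign of a false predictor (for example, a field that is only filled in after the outcome is known). The churn dataset, the column names, and the 0.95 threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def single_feature_accuracy(rows, feature, target):
    """Accuracy on the given rows when the target is guessed from one
    feature alone, using the majority target value per feature value."""
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row[target]] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(rows)


def suspicious_features(rows, target, threshold=0.95):
    """Return features that look too good to be true on their own."""
    features = [name for name in rows[0] if name != target]
    return [
        name for name in features
        if single_feature_accuracy(rows, name, target) >= threshold
    ]


# Hypothetical customer records, invented for illustration.
# "cancellation_fee_charged" is only known after a customer has churned,
# so it leaks the answer.
rows = [
    {"plan": "basic", "cancellation_fee_charged": "yes", "churned": "yes"},
    {"plan": "basic", "cancellation_fee_charged": "no", "churned": "no"},
    {"plan": "premium", "cancellation_fee_charged": "yes", "churned": "yes"},
    {"plan": "premium", "cancellation_fee_charged": "no", "churned": "no"},
    {"plan": "basic", "cancellation_fee_charged": "yes", "churned": "yes"},
    {"plan": "premium", "cancellation_fee_charged": "no", "churned": "no"},
]

flagged = suspicious_features(rows, target="churned")
print(flagged)
```

A flagged column is not automatically wrong; it is a prompt to ask whether that value would really be available at the time you need to make the prediction. Note that columns with a unique value per row, such as IDs, will also be flagged by this check.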
4. Why does analytics fail?
Yes, I actually found this one. It is not funny, though, because the concern behind the question is real. A staggering number of companies are disappointed with their analytics results. Let us see what the primary reasons might be.
The most cited reasons behind analytics failure are:
- Lack of skilled and experienced analysts.
- Want of planning on the management’s part.
- Failure to turn analytics into an integral part of the business processes.
5. What is the difference between business analytics and business intelligence?
This is an important one. The number of people who use these two terms interchangeably is alarming.
Business intelligence studies the business and the market through structured, numerical data, such as the figures available on balance sheets and in Excel spreadsheets. BI professionals use visualization tools to create reports and keep track of the profitability and efficiency of business processes.
Business analytics includes business intelligence practices and adds the analysis of unstructured data. The datasets used and processed by business analytics professionals are usually much larger, which necessitates the use of advanced analytics tools. This is why a business analytics course focuses as much on the analytical side of things as on the business side.
Do you think I am missing out on some important questions? Well, I am quite sure that I have had to leave out a lot. I wanted to focus on answering the questions well rather than stacking up a lot of them. The goal is quality over quantity, just like in data analytics.