Understanding machine learning largely comes down to understanding the machine learning process. In the last post we covered some machine learning concepts; now we will look at how the process actually works. The machine learning process is:
- Iterative – In the machine learning process you repeat steps at both large and small scales.
- Challenging – The machine learning process is rarely easy. You are typically working with large amounts of potentially complex data and trying to find patterns that are both significant and predictive, which can be difficult. This is why we work with data scientists and why they are so important.
- Often rewarding – When the process succeeds, the benefits can be substantial. Success is not guaranteed, though, so be aware that you may fail along the way.
Asking the Right Question
Choosing the Right Question – The first problem you face in the machine learning process is deciding what question to ask. This is the most important part of the process, for an obvious reason: if you ask the wrong question, you won’t get the answer you need.
Getting the Right Data – After choosing the right question, the next thing to ask yourself is whether you have the right data to answer it.
For example: Suppose you want to determine whether a credit card transaction is fraudulent. You may need data such as whether the customer is a homeowner or a renter, or how long the customer has lived at the current address. If you don’t have that data, you cannot reach a reliable conclusion. So having the right data is essential to getting the correct answer to the right question.
Measuring Success – After choosing the right question and gathering the right data, you need to decide how you will determine whether the process has succeeded. Ultimately you will get a model that makes predictions, so you need to decide how good those predictions should be.
For example: Let’s say the model gives the right answer for 8 out of 10 cases of credit card fraud. Would you say that is good enough to call the model a success? It is important to decide this up front; otherwise you will never know when you are done.
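The "8 out of 10" check can be written down as a small calculation. The sketch below is only an illustration: the labels, the `fraud_catch_rate` helper, and the 0.8 target are all assumptions made up for this example, not part of any particular product.

```python
# Hypothetical evaluation labels: 1 = fraudulent, 0 = legitimate.
actual    = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

def fraud_catch_rate(actual, predicted):
    """Fraction of truly fraudulent cases that the model flagged."""
    flagged = [p for a, p in zip(actual, predicted) if a == 1]
    if not flagged:
        raise ValueError("no fraudulent cases in the evaluation data")
    return sum(flagged) / len(flagged)

TARGET = 0.8  # assumed success threshold, agreed on before modelling starts

rate = fraud_catch_rate(actual, predicted)
print(f"catch rate: {rate:.2f}, good enough: {rate >= TARGET}")
```

Here the model catches 8 of the 10 fraudulent cases, so it just meets the assumed target. Whether 0.8 is the right bar is a business decision, not something the code can answer.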
Machine Learning Process Illustration
Let’s look at the machine learning process in a little more detail. To start, you choose the data you want to work with. You often work with a domain expert to do this: someone who knows a lot about a domain such as credit card fraud or robot failure detection, and who has a good sense of which data is likely to be predictive.
Raw Data – Raw data is almost never in the right form. It contains duplicates, missing values, or extraneous material. Raw data needs preprocessing, and machine learning products provide a variety of data preprocessing modules.
Prepared Data – The result of preprocessing is prepared data, which is suitable for machine learning. Getting there usually means repeating the preprocessing step again and again. In a typical machine learning project, most of the time is spent getting the data prepared.
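Going from raw data to prepared data might look something like the following minimal sketch. The records, the field names, and the `prepare` function are hypothetical; real projects would normally use the preprocessing modules of whichever machine learning product they work with.

```python
# Hypothetical raw transaction records; field names are illustrative only.
raw = [
    {"id": 1, "amount": 25.0, "country": "US"},
    {"id": 2, "amount": None, "country": "US"},   # missing amount
    {"id": 1, "amount": 25.0, "country": "US"},   # duplicate of the first record
    {"id": 3, "amount": 310.5, "country": "DE", "debug_note": "test"},  # extra field
]

KEEP_FIELDS = ("id", "amount", "country")

def prepare(records):
    """Drop extra fields, rows with missing values, and exact duplicates."""
    seen = set()
    prepared = []
    for record in records:
        row = tuple(record.get(field) for field in KEEP_FIELDS)
        if any(value is None for value in row):
            continue  # skip rows with missing data
        if row in seen:
            continue  # skip duplicates
        seen.add(row)
        prepared.append(dict(zip(KEEP_FIELDS, row)))
    return prepared

prepared = prepare(raw)
print(prepared)
```

Dropping incomplete rows is only one possible choice; filling in missing values is another, and which one is appropriate depends on the data.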
Machine Learning Algorithm – Once the data is prepared, a machine learning algorithm can be applied to it. Machine learning products commonly provide a number of algorithms.
Model – The result of this step is a model, but not a final one. It is a candidate model, the first one you create in the process. It is almost certainly not the best one, and you cannot know how good it is until you have produced several. So you iterate again and again until the model is good enough to deploy.
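The loop of producing candidate models until one is good enough can be sketched as below. Everything here is a toy assumption: the data, the idea of treating a simple amount threshold as a "model", and the 0.85 deployment bar. A real project would also score candidates on data held back from training.

```python
# Hypothetical prepared data: (transaction amount, is_fraud) pairs.
data = [(20, 0), (35, 0), (50, 0), (80, 0), (120, 1), (200, 1), (90, 1), (400, 1)]

def accuracy(threshold, rows):
    """Share of rows where 'amount > threshold' matches the fraud label."""
    correct = sum((amount > threshold) == bool(label) for amount, label in rows)
    return correct / len(rows)

GOOD_ENOUGH = 0.85  # assumed deployment bar

best = None
for threshold in (150, 30, 60, 85):  # each threshold is one candidate model
    score = accuracy(threshold, data)
    if best is None or score > best[1]:
        best = (threshold, score)
    if score >= GOOD_ENOUGH:
        break  # stop iterating once a candidate clears the bar

print(best)
```

The first two candidates fall short, so the loop keeps going; it stops at the third, which clears the assumed bar.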
Application – Once you deploy the model, an application can use it to detect patterns in the data and predict results.
Model Re-creation – As we saw, iteration happens at a small scale while preparing the data and while creating a model by applying various algorithms. The machine learning process is also iterative at a large scale: the entire process needs to be repeated over and over to re-create the model regularly, since you have to keep the model up to date with trends.
Model re-creation may involve processing new data, applying a new algorithm, or something else. In summary, you need to re-create the model regularly, and the process is iterative at both small and large scales.
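As a rough illustration of re-creating a model when new data arrives, the sketch below retrains a toy "model" each time a new batch comes in. The batches, the `train` function, and the midpoint rule it uses are all hypothetical stand-ins for a real training step.

```python
def train(amounts_and_labels):
    """Toy 'model': midpoint between the mean legitimate and mean fraudulent amount."""
    legit = [a for a, label in amounts_and_labels if label == 0]
    fraud = [a for a, label in amounts_and_labels if label == 1]
    return (sum(legit) / len(legit) + sum(fraud) / len(fraud)) / 2

# Hypothetical batches of labelled transactions arriving over time.
batches = [
    [(20, 0), (40, 0), (200, 1), (300, 1)],
    [(60, 0), (80, 0), (500, 1), (700, 1)],  # amounts drift upward
]

history = []
models = []
for batch in batches:
    history.extend(batch)          # fold the new data into the training set
    models.append(train(history))  # re-create the model from scratch

print(models)
```

The cutoff moves as the data drifts, which is the reason for re-creating the model regularly rather than training it once.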