In the last post, we saw what machine learning is. Now let's see how a model is trained in machine learning.
Let us look closely at the process of creating and training a model. We start with training data and work on it until it is clean, pristine, and just what we need. Since we are using supervised learning, the target value is part of the training data.
In the credit card example, for instance, the target value is whether a transaction is fraudulent or not. Our first problem is to choose the features that we think will be most predictive of that target value.
In the credit card example, we might decide that the country in which the card was issued, the country in which it is used, and the age of the user are the features most likely to help us predict whether a transaction is fraudulent. Imagine we have chosen features 1, 3, and 6 in our training data.
Now we input this training data into our chosen learning algorithm. But notice that we send only 75% of the data, restricted to the features we have chosen, i.e. features 1, 3, and 6.
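To make this step concrete, here is a minimal sketch in plain Python of picking the chosen features and holding back part of the data. The toy rows, the helper names `select_features` and `split_train_test`, and the reading of "features 1, 3, and 6" as column indices are all assumptions made for illustration; they are not part of any particular library.

```python
import random

def select_features(rows, feature_indices):
    """Keep only the chosen feature columns from each row."""
    return [[row[i] for i in feature_indices] for row in rows]

def split_train_test(features, targets, train_fraction=0.75, seed=0):
    """Shuffle the examples, then split them into training and test portions."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train_idx, test_idx = indices[:cut], indices[cut:]
    X_train = [features[i] for i in train_idx]
    y_train = [targets[i] for i in train_idx]
    X_test = [features[i] for i in test_idx]
    y_test = [targets[i] for i in test_idx]
    return X_train, y_train, X_test, y_test

# Hypothetical toy data: 8 transactions with 7 numeric features each (columns 0..6).
rows = [[r * 10 + c for c in range(7)] for r in range(8)]
targets = [0, 1, 0, 0, 1, 0, 1, 0]  # assumed labels: 1 = fraudulent, 0 = legitimate

chosen = [1, 3, 6]  # the features we decided to use
X = select_features(rows, chosen)
X_train, y_train, X_test, y_test = split_train_test(X, targets)

print(len(X_train), "training examples,", len(X_test), "test examples")
```

The shuffle before the split is a design choice: it avoids putting, say, only the oldest transactions in the training portion. The fixed `seed` just makes the sketch repeatable.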
- Choosing Features in Machine Learning – You may be wondering how to choose the features and the algorithm that will give the results we are looking for. If the problem is simple, the choices are limited and the decision is not too hard.
- The Role of Data Scientist – With a complex problem, lots of data, and a powerful machine learning technology offering plenty of algorithms, this choice can be hard. Consider data with 100 or 200 features: which ones are predictive, and how many should we use? This is what a data scientist is for. Data scientists are people with machine learning expertise and domain knowledge. For such complex problems, they can help us get to a model.
- Generate Target Value – The result of training is a candidate model. The next question is whether or not this model is any good. In supervised learning, we input test data to the candidate model, using features 1, 3, and 6, which we chose earlier. The test data can be the remaining 25% of the data for those features. The candidate model can now generate target values from the test data.
- Check Model Predictiveness – Because this is supervised learning, the actual target values are already available in our data. All we have to do is compare the target values generated by the candidate model with the actual target values in the test data. This is how we determine whether or not our model is predictive.
- Improving a Model – Suppose the model is not predictive after the above process. What do we do then? We can try different options.
We may have chosen the wrong features, so this time we can choose another set. We may have the wrong data, or we may need new data or more example data. The algorithm could also be the issue, so we can modify some of its parameters or choose an entirely different one.
- Iterative – Whichever of the above options we use to improve the model, we generate a new candidate model and the process repeats, i.e. iterates. The machine learning process is iterative, which is a fancy way of saying trial and error. Notice also that human effort is required to decide on the features, the algorithm, and the parameters. The process is very much human, although it is called the machine learning process.
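The whole loop described above (train a candidate model, test it, and try again if it is not predictive) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy data is made up, the "learning algorithm" is a simple nearest-centroid classifier chosen for brevity, and the names `train_nearest_centroid`, `predict`, `accuracy`, and the `good_enough` threshold are hypothetical. A real project would normally use a machine learning library instead.

```python
def select_features(rows, feature_indices):
    """Keep only the chosen feature columns from each row."""
    return [[row[i] for i in feature_indices] for row in rows]

def train_nearest_centroid(X_train, y_train):
    """Train a candidate model: the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for x, y in zip(X_train, y_train):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(model, x):
    """Generate a target value: the class whose centroid is closest to x."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: squared_distance(model[y]))

def accuracy(model, X_test, y_test):
    """Compare generated target values with the actual ones in the test data."""
    correct = sum(predict(model, x) == y for x, y in zip(X_test, y_test))
    return correct / len(y_test)

# Hypothetical toy data, already split 75% / 25%.
# Column 0 is noise; column 1 tracks the target closely.
train_rows = [[1.0, 0.1], [1.0, 0.9], [9.0, 0.2], [9.0, 0.8], [4.0, 0.0], [8.0, 1.0]]
train_targets = [0, 1, 0, 1, 0, 1]
test_rows = [[9.0, 0.1], [1.0, 0.9]]
test_targets = [0, 1]

candidate_feature_sets = [[0], [1]]  # feature choices to try, in order
good_enough = 0.9                    # assumed accuracy threshold
history = []
best_features, best_accuracy = None, -1.0

for features in candidate_feature_sets:
    model = train_nearest_centroid(select_features(train_rows, features), train_targets)
    score = accuracy(model, select_features(test_rows, features), test_targets)
    history.append((features, score))
    if score > best_accuracy:
        best_features, best_accuracy = features, score
    if score >= good_enough:
        break  # this candidate model is predictive enough, so stop iterating

print(history)
print(best_features, best_accuracy)
```

With this toy data, the first feature choice gets both test examples wrong (accuracy 0.0), so the loop tries the second choice, which gets both right (accuracy 1.0). Here only the features change between iterations; changing the data, the algorithm, or its parameters would fit into the same loop.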