In regression problems, we are trying to predict continuous values as the output. This differs from classification, where the output is a category or class. There are a number of different types of regression problems we support using the following algorithms:
- Linear Regression (OLS)
- Radial Base Functions
- Regression Trees (e.g. Random Forest)
- Support Vector Regression (SVR)
As an example, we will build a predictive model to predict house price (price is a number from some defined range, so it will be regression task). We will be using linear regression to predict sales price based on multiple attributes.
You can download the house price dataset here.
Let's suppose you want to sell your house and you are wondering what you can get for it. You usually look for other homes similar to yours, in the same area and close to the same age as yours. We will do something similar, but with Linear Regression Machine Learning.
Attribute Information:
1. CRIM per capita crime rate by town
2. ZN proportion of residential land zoned for lots over 25,000 sq.ft.
3. INDUS proportion of non-retail business acres per town
4. CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
5. NOX nitric oxides concentration (parts per 10 million)
6. RM average number of rooms per dwelling
7. AGE proportion of owner-occupied units built prior to 1940
8. DIS weighted distances to five Boston employment centers
9. RAD index of accessibility to radial highways
10. TAX full-value property-tax rate per $10,000
11. PTRATIO pupil-teacher ratio by town
12. B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13. LSTAT % lower status of the population
14. PRICE True value of owner-occupied homes in $1000's
We will be training our model using PRICE.
Now, navigate over to our Workspaces page and you will be lead through all the necessary steps to create your model.