Project information

  • Category: Data Analytics
  • Authors: @aadhityasw @sumitajmera
  • Project date: March 2021 - June 2021
  • Project URL: GitHub Repository
  • Tags: Exploratory Data Analytics, Machine Learning, scikit-learn, Decision Trees, LightGBM, Naive Bayes

CRIME ANALYSIS AND PREDICTION

    Machine learning has swept the world and has shown what computers are capable of when given the right data. In this paper, we apply machine learning algorithms to the field of crime analysis and prediction, and evaluate how accurately several widely used algorithms can predict crimes.

    Classification is a supervised learning technique that identifies the category of new observations on the basis of training data. A program learns from a labelled dataset and then assigns each new observation to one of a number of classes or groups, so these algorithms are mainly used to predict categorical outputs. A familiar example is the spam classifier in a mailbox, which labels each mail as spam or not spam.

    Here we use a multi-class classifier, that is, one built for classification tasks with more than two class labels. Unlike binary classification, multi-class classification has no notion of normal and abnormal outcomes. Instead, each example is assigned to one of a range of known classes. The number of class labels can be very large on some problems, as it is in the current case.
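
As a minimal sketch of multi-class classification, the snippet below implements a tiny k-nearest-neighbour classifier in plain Python. The feature values, the three crime categories, and the names `knn_predict`, `train_points`, and `train_labels` are hypothetical and chosen only for illustration; they are not taken from the project's dataset or code.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training label with its Euclidean distance to the query point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k nearest neighbours and take a majority vote over their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (x, y) grid location -> crime category.
# Three class labels make this a multi-class problem rather than a binary one.
train_points = [
    (1, 1), (2, 1), (1, 2),        # cluster labelled "burglary"
    (13, 5), (14, 5), (12, 6),     # cluster labelled "theft"
    (22, 9), (23, 9), (21, 8),     # cluster labelled "assault"
]
train_labels = ["burglary"] * 3 + ["theft"] * 3 + ["assault"] * 3

# Classify a new, unseen observation into one of the known classes.
prediction = knn_predict(train_points, train_labels, (13, 6))
print(prediction)
```

With this toy data, the query `(13, 6)` sits inside the cluster labelled `theft`, so its three nearest neighbours all vote for that class. A real crime dataset would need categorical features to be encoded and numeric features to be scaled before a distance-based vote like this is meaningful.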

We use a total of 7 classifiers and compare their performance:

  1. Decision Tree (ID3)
  2. Random Forest
  3. Extra Tree
  4. K-Nearest Neighbors
  5. Bernoulli Naive Bayes
  6. Gaussian Naive Bayes
  7. LightGBM

    We applied these 7 machine learning algorithms to a total of 3 scenarios to see how they perform under varying conditions. In these scenarios, the LightGBM model achieved the best metric scores of the lot. For now, this suggests that LightGBM is the algorithm to use, but the fight against crime does not stop at this level of accuracy.
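
The comparison described above can be sketched as a small harness that fits every model on the same training split and scores it on the same test split. This is an illustrative sketch only: the two stand-in classifiers, the toy data, the helper names (`compare_models`, `accuracy`), and the choice of plain accuracy as the metric are all assumptions, not the project's actual models, scenarios, or metrics. The stand-ins expose `fit` and `predict` methods in the same style as scikit-learn estimators, so real estimators could be passed in their place.

```python
from collections import Counter


class MajorityClassifier:
    """Baseline stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroidClassifier:
    """Stand-in model: predicts the class whose feature mean is closest."""

    def fit(self, X, y):
        grouped = {}
        for row, label in zip(X, y):
            grouped.setdefault(label, []).append(row)
        # One centroid per class: the per-feature mean of its training rows.
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda lbl: sq_dist(row, self.centroids_[lbl]))
            for row in X
        ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare_models(models, X_train, y_train, X_test, y_test):
    """Fit each model on the same split and return {name: test accuracy}."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy(y_test, model.predict(X_test))
    return scores


# Hypothetical toy data: (x, y) grid location -> crime category.
X_train = [(1, 1), (2, 1), (1, 2), (2, 2), (13, 5), (14, 5), (12, 6), (22, 9), (23, 9)]
y_train = ["burglary"] * 4 + ["theft"] * 3 + ["assault"] * 2
X_test = [(2, 3), (13, 6), (22, 8)]
y_test = ["burglary", "theft", "assault"]

scores = compare_models(
    {
        "majority_baseline": MajorityClassifier(),
        "nearest_centroid": NearestCentroidClassifier(),
    },
    X_train, y_train, X_test, y_test,
)
best_model = max(scores, key=scores.get)
print(scores, best_model)
```

Running each scenario through the same harness keeps the comparison fair, because every model sees identical training and test data. Including a trivial baseline alongside the real models also shows how much of a score comes from class imbalance alone.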

Refer to the PDF document below for a more detailed explanation:

Project Report