0.1 What is machine learning?

  • Machine learning learns a function f that maps input X to output Y:

\[ Y = f(X) \]

  • Machine learning provides two major things:
    • Prediction
    • Feature selection
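  • Models are categorized into:
    • Parametric (f has a fixed number of parameters)
    • Non-parametric (the complexity of f can grow with the amount of data)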
  • Unsupervised learning
    • Data-driven approach
  • Supervised learning
    • Hypothesis-driven approach
  • Which features are characteristic of the type of cells etc. you want to predict?
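
The parametric / non-parametric distinction can be illustrated with a minimal sketch. The data values and the function names (fit_linear, predict_linear, predict_nearest) are made up for illustration; least squares and nearest-neighbour lookup are just two convenient representatives of each category, not the only ones.

```python
# Toy data (made up for illustration): y is roughly 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def fit_linear(xs, ys):
    """Parametric: f is fully described by two numbers (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_linear(params, x):
    """Evaluate Y = f(X) for the fitted straight line."""
    slope, intercept = params
    return slope * x + intercept


def predict_nearest(xs, ys, x):
    """Non-parametric: f keeps the training data and returns the y of the closest x."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[nearest]


params = fit_linear(xs, ys)
print("linear prediction at 2.6:", predict_linear(params, 2.6))
print("nearest-neighbour prediction at 2.6:", predict_nearest(xs, ys, 2.6))
```

After fitting, the linear model no longer needs the training data, whereas the nearest-neighbour predictor has to keep all of it.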

0.1.1 Main steps of machine learning

  • Clean the data: correct, normalize, standardize etc.
  • Identify features in the data (deep learning skips this step because it builds its own features)
  • The machine learning model is fitted on the training subset and evaluated on an independent subset
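
As one possible illustration of the cleaning step, the sketch below standardizes a feature to zero mean and unit standard deviation (z-scores). The numbers and the helper names fit_standardizer and standardize are hypothetical. It also assumes that the mean and standard deviation are estimated on the training subset only and then reused for the independent subset, so that the evaluation data does not influence the preprocessing.

```python
import statistics


def fit_standardizer(values):
    """Estimate mean and (population) standard deviation from training values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean, std


def standardize(values, mean, std):
    """Convert raw values to z-scores using previously estimated statistics."""
    return [(v - mean) / std for v in values]


# Hypothetical raw measurements of a single feature.
train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
held_out = [3.0, 8.0]

mean, std = fit_standardizer(train)               # statistics come from training data only
train_z = standardize(train, mean, std)
held_out_z = standardize(held_out, mean, std)     # same mean/std reused for unseen data

print("mean:", mean, "std:", std)
print("held-out z-scores:", held_out_z)
```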

0.1.2 How does machine learning work?

  • Five steps:

    • Split the data set into train, validation and test subsets
      • E.g. randomly assign approx. 70 % to training and 30 % to test; the validation subset is then taken from the training portion
    • Fit the model on the train subset
    • Validate the model on the validation subset
    • Repeat steps 1-3 a number of times to optimize the model
    • Test the accuracy of the optimized model on the test subset
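
The steps above can be sketched in plain Python. This is a minimal illustration under several assumptions that are not in the notes: the data are synthetic, the model is a k-nearest-neighbour classifier (chosen only because it is short to write), the test subset is held out once, and the validation subset is re-drawn from the remaining training portion in every repetition. All function names are hypothetical.

```python
import random


def make_toy_data(rng, n_per_class=20):
    """Hypothetical 1-D data: class 0 scattered around 0.0, class 1 around 10.0."""
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(10.0, 1.0), 1) for _ in range(n_per_class)]
    return data


def split(data, first_fraction, rng):
    """Randomly split data into two parts; the first gets about first_fraction."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = round(first_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes_for_one = sum(label for _, label in neighbours)
    return 1 if 2 * votes_for_one > len(neighbours) else 0


def accuracy(train, subset, k):
    """Fraction of points in subset that are classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in subset)
    return hits / len(subset)


rng = random.Random(0)
data = make_toy_data(rng)

# Hold out approx. 30 % as the test subset (assumption: done once, up front).
train_val, test = split(data, 0.7, rng)

candidate_ks = [1, 3, 5]   # hyperparameter values to compare
n_repeats = 10
scores = {k: [] for k in candidate_ks}

for _ in range(n_repeats):                                 # step 4: repeat steps 1-3
    train, validation = split(train_val, 0.7, rng)         # step 1: split
    for k in candidate_ks:
        # Steps 2 and 3: for k-NN, "fitting" is just storing the train subset;
        # the model is then validated on the validation subset.
        scores[k].append(accuracy(train, validation, k))

best_k = max(candidate_ks, key=lambda k: sum(scores[k]) / n_repeats)

# Step 5: test the chosen model once on the untouched test subset.
test_accuracy = accuracy(train_val, test, best_k)
print("best k:", best_k, "test accuracy:", test_accuracy)
```

The test subset is used only in the last line of the workflow, so the reported accuracy is not influenced by the choice of k.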

0.1.3 What is a hyperparameter?

  • Design parameters of a machine learning model that are set before the learning process starts, rather than learned from the data
    • E.g. the number of covariates for which the main variable of interest x is adjusted
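
To make the difference between hyperparameters and learned parameters concrete, here is a small sketch that fits a straight line by gradient descent. It uses a different example from the one in the notes: learning_rate and n_epochs are the hyperparameters (chosen before training), while slope and intercept are the parameters learned from the data. The data and the function name fit_line are made up for illustration.

```python
def fit_line(xs, ys, learning_rate, n_epochs):
    """Fit y = slope * x + intercept by gradient descent on the mean squared error.

    learning_rate and n_epochs are hyperparameters: they are set before
    training starts. slope and intercept are learned during training.
    """
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(n_epochs):
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys, learning_rate=0.05, n_epochs=2000)
print("learned slope:", slope, "learned intercept:", intercept)
```

Changing learning_rate or n_epochs changes how (and whether) training converges, which is why such values are usually tuned on the validation subset.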

0.2 Random Forest

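  • Bases predictions on an ensemble of decision trees, each built from TRUE/FALSE splits
  • Each tree makes its prediction by following the TRUE/FALSE answers from the root down to a leaf; the forest combines the predictions of all trees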
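
The traversal idea can be sketched as follows. The three trees below are written by hand with hypothetical features ("size", "granularity"), thresholds and labels; in a real random forest the trees are learned from random subsets of the data and features. The sketch only shows how a prediction is read off by walking through the TRUE/FALSE nodes and how the trees' votes are combined.

```python
from collections import Counter

# Each internal node asks a TRUE/FALSE question: sample[feature] <= threshold?
# Leaves carry a class label. All values here are hypothetical.
tree_a = {
    "feature": "size", "threshold": 10.0,
    "if_true": {"label": "type_A"},
    "if_false": {
        "feature": "granularity", "threshold": 0.5,
        "if_true": {"label": "type_A"},
        "if_false": {"label": "type_B"},
    },
}
tree_b = {
    "feature": "granularity", "threshold": 0.3,
    "if_true": {"label": "type_A"},
    "if_false": {"label": "type_B"},
}
tree_c = {
    "feature": "size", "threshold": 12.0,
    "if_true": {"label": "type_A"},
    "if_false": {"label": "type_B"},
}
forest = [tree_a, tree_b, tree_c]


def predict_tree(node, sample):
    """Walk from the root to a leaf, answering one TRUE/FALSE question per node."""
    while "label" not in node:
        answer = sample[node["feature"]] <= node["threshold"]
        node = node["if_true"] if answer else node["if_false"]
    return node["label"]


def predict_forest(trees, sample):
    """Combine the individual tree predictions by majority vote."""
    votes = Counter(predict_tree(tree, sample) for tree in trees)
    return votes.most_common(1)[0][0]


sample = {"size": 11.0, "granularity": 0.4}
print([predict_tree(tree, sample) for tree in forest])
print(predict_forest(forest, sample))
```

For this sample the trees disagree (two vote for type_A, one for type_B), and the majority vote decides the forest's prediction.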

0.3 What is Deep Learning?

  • Artificial neural networks with multiple layers
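
A minimal sketch of what "multiple layers" means: the input is passed through two hidden layers and one output unit. The weights and biases are hypothetical fixed numbers chosen for illustration; in practice they are learned from data. The choice of ReLU and sigmoid activations is also an assumption, not something stated in the notes.

```python
import math


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(v):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical weights: two hidden layers (2 -> 2 -> 2) and one output unit.
hidden_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, 1.0], [-1.0, 2.0]], [0.0, 0.5]),
]
output_weights = [1.0, -1.0]
output_bias = 0.0


def forward(x):
    """Pass the input through every layer in turn and return a value in (0, 1)."""
    activation = x
    for weights, biases in hidden_layers:
        activation = relu(dense(activation, weights, biases))
    logit = sum(w * a for w, a in zip(output_weights, activation)) + output_bias
    return sigmoid(logit)


print(forward([2.0, 1.0]))
```

Each hidden layer transforms the output of the previous one, which is how the network builds its own intermediate features from the raw input.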