[DNN] Check Wrong Predictions and Draw number with OpenCV and Predict

    INTRODUCTION


    What is 'MNIST'? It's a dataset of handwritten digits, and it also frustrates a lot of newcomers.
    https://en.wikipedia.org/wiki/MNIST_database

     


     
    So, I'm going to show how to train a model on the 'MNIST' data and pick out its wrong predictions.
    We'll also draw a digit with the open-source package 'OpenCV' and have the model predict it!
     
     
    Let's check the Result and Code below.
     

    RESULT

     

    CODE

    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # must be set before importing TensorFlow

    import cv2
    import numpy as np
    import tensorflow as tf
    import matplotlib.pyplot as plt
    import random
    
    # Global Variables
    mouse_mode = False
    pt = (0, 0)
    color = (200, 200, 200)
    thickness = 9
    image = np.full((280, 280, 3), 0, np.uint8)
    Test = np.full((280, 280, 3), 0, np.uint8)
    Predict_Window = np.full((280, 500, 3), 255, np.uint8)
    prediction = None
    
    # MNIST Data Set import
    mnist = tf.keras.datasets.mnist
    (train_data, train_label), (test_data, test_label) = mnist.load_data()
    
    # Normalize
    train_data, test_data = train_data/255.0, test_data/255.0
    
    # Flatten
    train_data = train_data.reshape(60000, 784).astype('float32')
    test_data = test_data.reshape(10000, 784).astype('float32')
    
    
    # Model
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    
    # Model Compile
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    
    
    model.fit(train_data, train_label, epochs=10)
    test_result = model.evaluate(test_data, test_label)  # evaluate on the held-out test set
    print('loss :', test_result[0])
    print('Acc  :', test_result[1])
    
    predict_result = model.predict(test_data)
    predict_label = np.argmax(predict_result, axis=1)
    
    
    # Selection
    wrong_result = []
    
    for i in range(len(test_label)):
        if predict_label[i] != test_label[i]:
            wrong_result.append(i)
    
    print("Error : " + str(len(wrong_result)))
    
    sample = random.choices(population=wrong_result, k=16)
    
    # Plot
    plt.figure(figsize=(14, 12))
    
    for i, id in enumerate(sample):
        plt.subplot(4, 4, i+1)
        plt.imshow(test_data[id].reshape(28, 28), cmap='gray')
        plt.title("Label : " + str(test_label[id]) + " | Predict : " + str(predict_label[id]))
        plt.axis('off')
    
    plt.show()
    
    # Save (call before plt.show(), otherwise the saved figure is blank)
    # plt.savefig("fig1.png", dpi=1500)
    
    def onMouse(event, x, y, flags, param):
        global pt, mouse_mode, color, thickness, image, Test, prediction, Predict_Window
    
        if event == cv2.EVENT_LBUTTONDOWN:
            pt = (x, y)
            mouse_mode = True
    
        elif event == cv2.EVENT_MOUSEMOVE:
            if mouse_mode:
                cv2.line(image, pt, (x, y), color, thickness)
                pt = (x, y)
    
        elif event == cv2.EVENT_LBUTTONUP:
            mouse_mode = False
            cv2.line(image, pt, (x, y), color, thickness)
    
        elif event == cv2.EVENT_RBUTTONDOWN:
            Test = cv2.resize(image, (28, 28), interpolation=cv2.INTER_LINEAR)
            Test = cv2.cvtColor(Test, cv2.COLOR_BGR2GRAY)
            Test = Test / 255.0
            Test = Test.reshape(1, 784).astype('float32')
            predict_result = model.predict(Test)
            prediction = np.argmax(predict_result, axis=1)
            print(prediction)
            image = np.full((280, 280, 3), 0, np.uint8)
            Predict_Window = np.full((280, 500, 3), 255, np.uint8)
            cv2.putText(Predict_Window, 'Predict : ' + str(prediction[0]), (50, 100), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 0), 2)
            cv2.imshow("Predict", Predict_Window)
    
    
    cv2.imshow("PaintCV", image)
    cv2.putText(Predict_Window, 'Predict : ', (50, 100), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 0), 2)
    cv2.imshow("Predict", Predict_Window)
    cv2.setMouseCallback("PaintCV", onMouse)
    
    while True:
    
        cv2.imshow("PaintCV", image)
        if cv2.waitKey(1) == 27:
            break
    
    cv2.destroyAllWindows()

     

    Fig. 1. Wrong Predictions
    Fig. 2. Training Result

     

    SOFTWARE SPECIFICATION

    OS Ubuntu 18.04 LTS
    Python 3.9.16
    OpenCV 3.4.11
    TensorFlow 2.7.0

     

    CODE EXPLANATION

    Let's walk through the code block by block.

     

     

    Import Tensorflow

    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # set before importing TensorFlow

    import tensorflow as tf

    The os line silences TensorFlow's C++ log messages: '0' shows everything, '1' hides INFO, '2' also hides WARNING, and '3' leaves only fatal errors. It has to be set before TensorFlow is imported, which is why it comes first.

     

     

    Load Dataset

    # MNIST Data Set import
    mnist = tf.keras.datasets.mnist
    (train_data, train_label), (test_data, test_label) = mnist.load_data()
    
    # Normalize
    train_data, test_data = train_data/255.0, test_data/255.0
    
    # Flatten
    train_data = train_data.reshape(60000, 784).astype('float32')
    test_data = test_data.reshape(10000, 784).astype('float32')

    The first line of the code block above loads the MNIST data from the Keras datasets bundled with TensorFlow and splits it into train and test sets.
     
    The second step is called 'normalization': dividing by 255 scales every pixel from [0, 255] into [0, 1], which keeps the inputs in a small, consistent range and helps gradient descent converge.

     


    Next, we flatten each 28x28 image into a 784-dimensional vector so it matches the input expected by the Dense layers.
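
    As a quick sanity check (a minimal sketch using the same variable names as the code above), you can print the shapes and value ranges before and after these steps:

    import numpy as np
    import tensorflow as tf

    (train_data, train_label), (test_data, test_label) = tf.keras.datasets.mnist.load_data()
    print(train_data.shape, train_data.dtype)   # (60000, 28, 28) uint8, pixels in [0, 255]

    # Normalize: scale pixels into [0, 1]
    train_data = train_data / 255.0
    print(train_data.min(), train_data.max())   # 0.0 1.0

    # Flatten: each 28x28 image becomes a 784-dim vector
    train_data = train_data.reshape(60000, 784).astype('float32')
    print(train_data.shape)                     # (60000, 784)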

     

     

    Model

    # Model
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    Our model uses the 'ReLU' activation function, short for 'Rectified Linear Unit'.
    It literally rectifies its input: positive values pass through unchanged, and negative values are clamped to zero.

    Fig. 3. ReLU

    Again, I'll leave the details of the ReLU function to the link below.
    https://en.wikipedia.org/wiki/Rectifier_(neural_networks)

     


     
    There's another well-known activation function, 'Sigmoid'.

    Fig. 4. Sigmoid

    https://en.wikipedia.org/wiki/Sigmoid_function

     


    But these days people usually prefer ReLU over Sigmoid: it's cheaper to compute and suffers less from the vanishing-gradient problem.
     
    And there's one more function in our model, 'Softmax', which turns the last layer's raw scores into a probability distribution over the ten digit classes.

    Fig. 5. Softmax

    https://en.wikipedia.org/wiki/Softmax_function

     


    Please check the links to see why these functions are used.
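
    To make the three functions concrete, here is a minimal NumPy sketch (my own illustration, not how Keras implements them internally):

    import numpy as np

    def relu(x):
        # Positive inputs pass through; negative inputs are clamped to zero
        return np.maximum(0, x)

    def sigmoid(x):
        # Squashes any real number into the range (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    def softmax(x):
        # Turns a score vector into probabilities that sum to 1
        e = np.exp(x - np.max(x))   # subtract the max for numerical stability
        return e / e.sum()

    x = np.array([-2.0, 0.0, 3.0])
    print(relu(x))      # [0. 0. 3.]
    print(sigmoid(x))   # roughly [0.119 0.5 0.953]
    print(softmax(x))   # three probabilities summing to 1.0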

     

     

    Model Optimizer

    # Model Compile
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    Next, this is how we compile the model.
     
    First, the optimizer 'Adam'. An optimizer is the rule for minimizing the loss function; common choices include 'SGD' (Stochastic Gradient Descent), 'Adam', 'AdaGrad', and so on.
    https://keras.io/api/optimizers/adam/

     

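    Passing the string 'adam' uses Adam with its default settings. If you want to tune the learning rate, you can pass an optimizer object instead; this is standard Keras usage, and the 0.001 shown below is just the Adam default:

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])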

     
    The next ingredient is the loss function, 'cross-entropy'.
     
    Fig. 6. Cross-Entropy
     
    In deep learning, the p(x) inside cross-entropy is the one-hot encoding of the true label: 1 for the correct class and 0 for every other class. Only the predicted probability of the correct class therefore contributes to the loss, and training pushes the outputs for the wrong classes toward 0.
    https://en.wikipedia.org/wiki/Cross_entropy

     

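    To make this concrete, here is a minimal NumPy sketch for a single sample whose true label is 2 (my own example values). Because p(x) is one-hot, the whole sum collapses to minus the log of the probability the model assigned to the true class:

    import numpy as np

    q = np.array([0.1, 0.2, 0.6, 0.1])   # predicted probabilities (softmax output)
    p = np.array([0.0, 0.0, 1.0, 0.0])   # one-hot encoding of true label 2

    # H(p, q) = -sum(p * log(q)) = -log(q[2])
    print(-np.sum(p * np.log(q)))   # ~0.5108

    The 'sparse' in sparse_categorical_crossentropy just means Keras accepts the integer labels (0-9) directly and does the one-hot step for us, which is why we never encode the labels ourselves.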

     

     

    Train Result

    model.fit(train_data, train_label, epochs=10)
    test_result = model.evaluate(test_data, test_label)  # evaluate on the held-out test set
    print('loss :', test_result[0])
    print('Acc  :', test_result[1])

    An 'epoch' is one complete pass over the entire training set. Learning proceeds iteratively: the weights improve a little on each pass, so we run several epochs (here, 10).
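    If you want to watch accuracy evolve epoch by epoch, one optional variation (not in the original code) is to hold out part of the training set for validation:

    history = model.fit(train_data, train_label, epochs=10, validation_split=0.1)
    print(history.history['accuracy'])       # training accuracy after each epoch
    print(history.history['val_accuracy'])   # held-out accuracy after each epoch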

     

     

    Plot Wrong Predictions

    # Selection
    wrong_result = []
    
    for i in range(len(test_label)):
        if predict_label[i] != test_label[i]:
            wrong_result.append(i)
    
    print("Error : " + str(len(wrong_result)))
    
    sample = random.choices(population=wrong_result, k=16)
    
    # Plot
    plt.figure(figsize=(14, 12))
    
    for i, id in enumerate(sample):
        plt.subplot(4, 4, i+1)
        plt.imshow(test_data[id].reshape(28, 28), cmap='gray')
        plt.title("Label : " + str(test_label[id]) + " | Predict : " + str(predict_label[id]))
        plt.axis('off')
    
    plt.show()
    
    # Save (call before plt.show(), otherwise the saved figure is blank)
    # plt.savefig("fig1.png", dpi=1500)

    This code first appends the index of every wrong prediction to the wrong_result list, then picks 16 random samples from it with Python's 'random' library. 'Matplotlib' makes it easy to plot them in a 4x4 grid.
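
    One caveat: random.choices samples with replacement, so the same wrong prediction can show up twice in the 4x4 grid. If you want 16 distinct samples, random.sample is a drop-in alternative (a small tweak, not the original code):

    sample = random.sample(wrong_result, k=min(16, len(wrong_result)))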

     

     

    Callback to Draw

    def onMouse(event, x, y, flags, param):
        global pt, mouse_mode, color, thickness, image, Test, prediction, Predict_Window
    
        if event == cv2.EVENT_LBUTTONDOWN:
            pt = (x, y)
            mouse_mode = True
    
        elif event == cv2.EVENT_MOUSEMOVE:
            if mouse_mode:
                cv2.line(image, pt, (x, y), color, thickness)
                pt = (x, y)
    
        elif event == cv2.EVENT_LBUTTONUP:
            mouse_mode = False
            cv2.line(image, pt, (x, y), color, thickness)
    
        elif event == cv2.EVENT_RBUTTONDOWN:
            Test = cv2.resize(image, (28, 28), interpolation=cv2.INTER_LINEAR)
            Test = cv2.cvtColor(Test, cv2.COLOR_BGR2GRAY)
            Test = Test / 255.0
            Test = Test.reshape(1, 784).astype('float32')
            predict_result = model.predict(Test)
            prediction = np.argmax(predict_result, axis=1)
            print(prediction)
            image = np.full((280, 280, 3), 0, np.uint8)
            Predict_Window = np.full((280, 500, 3), 255, np.uint8)
            cv2.putText(Predict_Window, 'Predict : ' + str(prediction[0]), (50, 100), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 0), 2)
            cv2.imshow("Predict", Predict_Window)
    
    
    cv2.imshow("PaintCV", image)
    cv2.putText(Predict_Window, 'Predict : ', (50, 100), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 0), 2)
    cv2.imshow("Predict", Predict_Window)
    cv2.setMouseCallback("PaintCV", onMouse)
    
    while True:
    
        cv2.imshow("PaintCV", image)
        if cv2.waitKey(1) == 27:
            break
    
    cv2.destroyAllWindows()

    Let's look at the final part of the code. The onMouse 'callback' function receives mouse events and draws lines on the paint image. When you press the right mouse button, the code hands the drawn image over to the model, and the model predicts the digit.
     
    The 'while' loop keeps redrawing the paint window and exits when you press ESC (key code 27).
     
    Since our 280x280 paint image doesn't match the model's 28x28 input, we shrink it with 'bilinear interpolation' via cv2.INTER_LINEAR.
    https://en.wikipedia.org/wiki/Bilinear_interpolation

     


    You can use other interpolation methods if they fit your case better.
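
    For instance, OpenCV generally recommends cv2.INTER_AREA for shrinking images, and swapping it in is a one-line change:

    Test = cv2.resize(image, (28, 28), interpolation=cv2.INTER_AREA)   # often preferred for downscaling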
     
     

     

    Are you wondering how to recognize multi-digit handwriting?

    Next version: https://loookup.tistory.com/12 ([DNN] Recognize Multi-Digits Handwritten by MNIST & DNN and Monitor with the Tensorboard)

     


     

     

    Thank you for reading!
