Convolutional Neural Networks (CNNs) are among the most widely used models in deep learning, and they are particularly strong at image recognition. Handwritten digit recognition on the MNIST dataset is a classic benchmark commonly used to validate a model. Modern frameworks such as TensorFlow and PyTorch make it easy to build a CNN, but implementing one by hand in NumPy is a good way to understand what happens under the hood.
This article walks through building a simple convolutional neural network from scratch with NumPy and applying it to MNIST handwritten digit recognition. We implement the convolution layer, the pooling layer, the fully connected layers, and backpropagation step by step, then train the model and evaluate its performance.
Before starting, make sure the following Python libraries are installed:
pip install numpy matplotlib
(The code below also imports tensorflow, but only to download the MNIST dataset via keras.datasets.)
MNIST contains 60,000 training images and 10,000 test images, each a 28x28 grayscale image. We can load it with keras.datasets:
from tensorflow.keras.datasets import mnist
import numpy as np

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Convert labels to one-hot encoding
def one_hot_encode(labels, num_classes=10):
    return np.eye(num_classes)[labels]

y_train = one_hot_encode(y_train)
y_test = one_hot_encode(y_test)

# Reshape inputs to (num_samples, height, width, channels)
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)
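A minimal sketch of what these preprocessing steps produce, on toy data instead of the real download (the values are illustrative, not taken from MNIST):

```python
import numpy as np

# One-hot encoding: np.eye(10) is the 10x10 identity matrix, so indexing
# it with a label array picks the matching rows.
labels = np.array([0, 3, 9])
one_hot = np.eye(10)[labels]              # shape (3, 10), one 1.0 per row

# Adding a trailing channel axis turns (N, 28, 28) into (N, 28, 28, 1).
images = np.random.rand(3, 28, 28)         # stand-in for grayscale images
images = np.expand_dims(images, axis=-1)   # shape (3, 28, 28, 1)

print(one_hot.shape, images.shape)
```

Each row of `one_hot` has exactly one nonzero entry, at the position given by the label.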
The convolution layer is the core component of a CNN: it extracts features from the input image by sliding learned filters (kernels) over it. Here is an implementation (the filters are created lazily on the first forward pass, once the number of input channels is known):
class Conv2D:
    def __init__(self, num_filters, filter_size, stride=1, padding=0):
        self.num_filters = num_filters
        self.filter_size = filter_size
        self.stride = stride
        self.padding = padding
        # Initialized on the first forward pass, when the number of
        # input channels is known
        self.filters = None

    def forward(self, input):
        # Zero-pad the spatial dimensions only
        if self.padding > 0:
            p = self.padding
            input = np.pad(input, ((0, 0), (p, p), (p, p), (0, 0)))
        self.input = input
        if self.filters is None:
            in_channels = input.shape[3]
            self.filters = np.random.randn(self.filter_size, self.filter_size,
                                           in_channels, self.num_filters) * 0.1
        input_height, input_width = input.shape[1], input.shape[2]
        # Padding is already applied, so the output size is (H - F) // S + 1
        output_height = (input_height - self.filter_size) // self.stride + 1
        output_width = (input_width - self.filter_size) // self.stride + 1
        output = np.zeros((input.shape[0], output_height, output_width, self.num_filters))
        for i in range(output_height):
            for j in range(output_width):
                h_start = i * self.stride
                h_end = h_start + self.filter_size
                w_start = j * self.stride
                w_end = w_start + self.filter_size
                region = input[:, h_start:h_end, w_start:w_end, :]
                # Contract height, width and input channels against every
                # filter at once: (N, F, F, C) x (F, F, C, K) -> (N, K)
                output[:, i, j, :] = np.tensordot(region, self.filters,
                                                  axes=([1, 2, 3], [0, 1, 2]))
        return output
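To see where the output-size formula `(H - F + 2P) // S + 1` comes from, here is a minimal self-contained sketch, independent of the class above: a single 3x3 filter of ones sliding over a 5x5 input with stride 1 and no padding (the values are toy data):

```python
import numpy as np

x = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 input
f = np.ones((3, 3))                            # one 3x3 filter

H, F, S, P = 5, 3, 1, 0
out_size = (H - F + 2 * P) // S + 1            # (5 - 3 + 0) // 1 + 1 = 3
out = np.zeros((out_size, out_size))
for i in range(out_size):
    for j in range(out_size):
        # Element-wise product of the window with the filter, then sum
        out[i, j] = np.sum(x[i:i+F, j:j+F] * f)

print(out.shape)  # (3, 3)
```

Each valid position of the filter contributes one output element, which is exactly what the formula counts.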
The pooling layer reduces the spatial size of the feature maps while keeping the most salient information. Here is a max-pooling layer:
class MaxPool2D:
    def __init__(self, pool_size=2, stride=2):
        self.pool_size = pool_size
        self.stride = stride

    def forward(self, input):
        self.input = input
        input_height, input_width = input.shape[1], input.shape[2]
        output_height = (input_height - self.pool_size) // self.stride + 1
        output_width = (input_width - self.pool_size) // self.stride + 1
        output = np.zeros((input.shape[0], output_height, output_width, input.shape[3]))
        for i in range(output_height):
            for j in range(output_width):
                h_start = i * self.stride
                h_end = h_start + self.pool_size
                w_start = j * self.stride
                w_end = w_start + self.pool_size
                output[:, i, j, :] = np.max(input[:, h_start:h_end, w_start:w_end, :], axis=(1, 2))
        return output
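For the common case of 2x2 pooling with stride 2 (non-overlapping windows), the same result can also be obtained without loops via a reshape. A minimal single-channel sketch on toy data:

```python
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 feature map
# Split each spatial axis into (blocks, within-block) and take the max
# over the within-block axes: equivalent to 2x2 max pooling, stride 2.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[ 5.  7.] [13. 15.]]
```

Each output element is the maximum of one non-overlapping 2x2 block of the input.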
The fully connected layer takes the flattened feature vector and applies a linear transformation through a weight matrix:
class Dense:
    def __init__(self, input_size, output_size):
        self.weights = np.random.randn(input_size, output_size) * 0.1
        self.bias = np.zeros(output_size)

    def forward(self, input):
        self.input = input
        return np.dot(input, self.weights) + self.bias
Activation functions introduce non-linearity, allowing the network to learn complex patterns. Here is the ReLU activation:
def relu(x):
    return np.maximum(0, x)
The softmax function, used for multi-class classification, turns the raw outputs into a probability distribution:
def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)
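Subtracting the row maximum before exponentiating matters numerically: it leaves the result unchanged in exact arithmetic but prevents overflow for large logits. A small sketch with deliberately huge toy logits:

```python
import numpy as np

def softmax(x):
    # Shift by the max so the largest exponent is exp(0) = 1
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

logits = np.array([[1000.0, 1001.0, 1002.0]])  # np.exp(1000) alone would overflow
p = softmax(logits)
print(p.sum())  # 1.0 — a valid probability distribution despite the huge logits
```

Without the shift, `np.exp(1000.0)` returns `inf` and the division produces NaNs.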
Now we combine these components into a complete CNN:
class CNN:
    def __init__(self):
        self.conv1 = Conv2D(num_filters=8, filter_size=3, stride=1, padding=1)
        self.pool1 = MaxPool2D(pool_size=2, stride=2)
        self.conv2 = Conv2D(num_filters=16, filter_size=3, stride=1, padding=1)
        self.pool2 = MaxPool2D(pool_size=2, stride=2)
        self.dense1 = Dense(input_size=7*7*16, output_size=128)
        self.dense2 = Dense(input_size=128, output_size=10)

    def forward(self, input):
        x = self.conv1.forward(input)
        x = relu(x)
        x = self.pool1.forward(x)
        x = self.conv2.forward(x)
        x = relu(x)
        x = self.pool2.forward(x)
        x = x.reshape(x.shape[0], -1)  # flatten
        x = self.dense1.forward(x)
        x = relu(x)
        x = self.dense2.forward(x)
        return softmax(x)
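As a sanity check on the `input_size=7*7*16` used for `dense1`, we can trace the spatial dimensions through the network, assuming 28x28 MNIST inputs: 3x3 convolutions with `padding=1` preserve the spatial size, and each 2x2 pooling halves it.

```python
def conv_out(h, f, p, s):
    # Output size of a convolution: (H - F + 2P) // S + 1
    return (h - f + 2 * p) // s + 1

h = 28
h = conv_out(h, f=3, p=1, s=1)  # conv1: still 28
h = h // 2                      # pool1: 14
h = conv_out(h, f=3, p=1, s=1)  # conv2: still 14
h = h // 2                      # pool2: 7
print(h * h * 16)               # 784 — matches input_size=7*7*16 of dense1
```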
We use the cross-entropy loss to measure the gap between predictions and true labels:
def cross_entropy_loss(y_pred, y_true):
    return -np.mean(np.sum(y_true * np.log(y_pred + 1e-10), axis=1))
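A quick sketch on toy probabilities (not model output) showing the loss behaves as expected: it is small when the predicted probability of the true class is high and large when it is low. The `1e-10` term only guards against `log(0)`.

```python
import numpy as np

def cross_entropy_loss(y_pred, y_true):
    return -np.mean(np.sum(y_true * np.log(y_pred + 1e-10), axis=1))

y_true = np.array([[0.0, 1.0, 0.0]])                       # true class is 1
good = cross_entropy_loss(np.array([[0.05, 0.90, 0.05]]), y_true)
bad = cross_entropy_loss(np.array([[0.70, 0.10, 0.20]]), y_true)
print(good < bad)  # True: roughly -log(0.9) vs -log(0.1)
```

Only the probability assigned to the true class enters the sum, because the one-hot `y_true` zeroes out the other terms.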
Backpropagation computes the gradients and uses them to update the parameters. For simplicity, the version below only updates the two fully connected layers:
def backward(model, y_pred, y_true, learning_rate=0.01):
    batch_size = y_pred.shape[0]
    # Gradient of cross-entropy w.r.t. the softmax input simplifies to
    # (p - y); divide by the batch size to match the mean loss
    grad_output = (y_pred - y_true) / batch_size

    # Back-propagate through dense2 before its weights are updated;
    # dense2.input is the ReLU output of dense1, so zero the gradient
    # wherever that activation was clipped
    grad_hidden = np.dot(grad_output, model.dense2.weights.T)
    grad_hidden[model.dense2.input <= 0] = 0  # ReLU gradient

    # Update the second fully connected layer
    grad_dense2 = np.dot(model.dense2.input.T, grad_output)
    model.dense2.weights -= learning_rate * grad_dense2
    model.dense2.bias -= learning_rate * np.sum(grad_output, axis=0)

    # Update the first fully connected layer; dense1.input is the
    # flattened output of pool2
    grad_dense1 = np.dot(model.dense1.input.T, grad_hidden)
    model.dense1.weights -= learning_rate * grad_dense1
    model.dense1.bias -= learning_rate * np.sum(grad_hidden, axis=0)
    # Back-propagation through the pooling and convolution layers is omitted
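The first step of `backward` relies on a well-known identity: the gradient of cross-entropy composed with softmax, taken with respect to the logits, is simply `p - y`. A small self-contained numerical check on toy logits (independent of the model above) confirms it:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(z, y):
    return -np.sum(y * np.log(softmax(z)))

z = np.array([0.5, -1.0, 2.0])   # toy logits
y = np.array([0.0, 0.0, 1.0])    # one-hot true label
analytic = softmax(z) - y        # the claimed gradient

# Central finite differences on each logit
eps = 1e-6
numeric = np.zeros_like(z)
for k in range(3):
    zp, zm = z.copy(), z.copy()
    zp[k] += eps
    zm[k] -= eps
    numeric[k] = (loss(zp, y) - loss(zm, y)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # near zero: the gradients agree
```

This identity is why the code never needs to differentiate softmax and the log separately.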
model = CNN()
epochs = 5
batch_size = 32

for epoch in range(epochs):
    for i in range(0, len(x_train), batch_size):
        x_batch = x_train[i:i+batch_size]
        y_batch = y_train[i:i+batch_size]
        # Forward pass
        y_pred = model.forward(x_batch)
        # Compute the loss (log it every 100 batches to keep output readable)
        loss = cross_entropy_loss(y_pred, y_batch)
        if (i // batch_size) % 100 == 0:
            print(f"Epoch {epoch+1}, Batch {i//batch_size}, Loss: {loss:.4f}")
        # Backward pass and parameter update
        backward(model, y_pred, y_batch)
Finally, evaluate the model on the test set:
y_pred_test = model.forward(x_test)
accuracy = np.mean(np.argmax(y_pred_test, axis=1) == np.argmax(y_test, axis=1))
print(f"Test Accuracy: {accuracy * 100:.2f}%")
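On toy arrays, the accuracy computation looks like this: `argmax` recovers the predicted and true class indices from each row, and the mean of the element-wise comparison is the fraction of correct predictions.

```python
import numpy as np

y_pred = np.array([[0.1, 0.8, 0.1],    # predicts class 1
                   [0.6, 0.3, 0.1],    # predicts class 0
                   [0.2, 0.2, 0.6]])   # predicts class 2
y_true = np.array([[0, 1, 0],          # true class 1 (correct)
                   [0, 0, 1],          # true class 2 (wrong)
                   [0, 0, 1]])         # true class 2 (correct)
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))
print(accuracy)  # 2 of 3 correct -> 0.666...
```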
In this article we implemented a simple convolutional neural network from scratch with NumPy and applied it to MNIST handwritten digit recognition. Writing a CNN by hand is laborious, but it builds a solid understanding of convolution, pooling, fully connected layers, and backpropagation. For real applications, a mature deep learning framework such as TensorFlow or PyTorch remains the better choice for both development speed and model performance.