Published: 2021-07-10 13:45:05 | Author: Leah
Source: 亿速云

# How to Simulate the Perceptron Algorithm in Python

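## 1. Introduction to the Perceptron

The perceptron, proposed by Frank Rosenblatt in 1957, is a linear model for binary classification and a precursor of neural networks and support vector machines. It makes its decision from a linear combination of the feature vector and a weight vector, and it is one of the simplest supervised learning algorithms in machine learning.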

### 1.1 Basic Concepts

- **Input vector**: x = (x₁, x₂, ..., xₙ)
- **Weight vector**: w = (w₁, w₂, ..., wₙ)
- **Bias term**: b
- **Activation function**: usually a step function

### 1.2 Mathematical Model

The perceptron's decision function can be written as:

f(x) = sign(w·x + b)

where sign is the sign function:
sign(a) = +1 if a ≥ 0
         -1 otherwise
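
As a quick numeric check of the decision function, the following minimal sketch evaluates f(x) = sign(w·x + b) on three points. The weights, bias, sample points, and the helper name `perceptron_decision` are illustrative assumptions, not values from the article.

```python
import numpy as np

def perceptron_decision(X, w, b):
    """Return +1 where w·x + b >= 0 and -1 otherwise (hypothetical helper)."""
    scores = X @ w + b
    return np.where(scores >= 0, 1, -1)

# Made-up parameters, chosen only so that the scores are easy to check by hand
w = np.array([2.0, -1.0])
b = -1.0
X_demo = np.array([
    [1.0, 0.0],   # score = 2*1.0 - 0.0 - 1 =  1.0
    [0.0, 1.0],   # score = 0.0 - 1.0 - 1   = -2.0
    [0.5, 0.0],   # score = 1.0 - 0.0 - 1   =  0.0 (exactly on the boundary)
])

labels = perceptron_decision(X_demo, w, b)
print(labels)
```

The third point lies exactly on the boundary and is assigned +1, which follows the convention sign(0) = +1 used in the definition above.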

## 2. The Perceptron Learning Algorithm

### 2.1 Primal Form

```python
import numpy as np

class Perceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate   # learning rate (step size)
        self.n_iters = n_iters    # number of passes over the training data
        self.weights = None       # weight vector w
        self.bias = None          # bias term b

    def fit(self, X, y):
        n_samples, n_features = X.shape

        # Initialize the parameters to zero
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        # Make sure the labels are -1 and +1 (anything <= 0 becomes -1)
        y_ = np.where(np.asarray(y) > 0, 1, -1)

        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                condition = y_[idx] * (np.dot(x_i, self.weights) + self.bias)
                if condition <= 0:  # misclassified point (or exactly on the boundary)
                    self.weights += self.lr * y_[idx] * x_i
                    self.bias += self.lr * y_[idx]

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        # Follow sign(a) = +1 for a >= 0; np.sign would return 0 when a == 0
        return np.where(linear_output >= 0, 1, -1)
```
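
To make the mistake-driven update easier to follow, here is a small self-contained sketch that applies the same rule to the logical AND function and records how many updates each pass needs. The function name `train_perceptron`, the choice `lr=1.0`, and the AND dataset are assumptions made for this illustration; the sketch also stops as soon as a pass makes no mistakes, which the class above does not do.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, max_epochs=100):
    """Mistake-driven perceptron training (illustrative helper).

    Returns the weights, the bias, and the number of updates made in each pass.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    mistakes_per_epoch = []
    for _ in range(max_epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            if y_i * (np.dot(w, x_i) + b) <= 0:  # wrong side of, or on, the boundary
                w += lr * y_i * x_i
                b += lr * y_i
                mistakes += 1
        mistakes_per_epoch.append(mistakes)
        if mistakes == 0:  # a clean pass means every sample is classified correctly
            break
    return w, b, mistakes_per_epoch

# Logical AND with -1/+1 labels: linearly separable
X_and = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([-1, -1, -1, 1])

w, b, history = train_perceptron(X_and, y_and)
preds = np.where(X_and @ w + b >= 0, 1, -1)
print("updates per pass:", history)
print("predictions:", preds)
```

Because AND is linearly separable, the update counts reach zero after a handful of passes and the loop exits well before `max_epochs`.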

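### 2.2 Dual Form

The dual form works from the Gram matrix of pairwise inner products between training samples, which makes it a good fit when the feature dimension is high: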

```python
class DualPerceptron:
    def __init__(self, learning_rate=0.01, n_iters=1000):
        self.lr = learning_rate
        self.n_iters = n_iters
        self.alpha = None    # dual variables, one per training sample
        self.bias = None
        self.X_train = None
        self.y_train = None

    def fit(self, X, y):
        n_samples, _ = X.shape
        self.alpha = np.zeros(n_samples)
        self.bias = 0.0
        self.X_train = X
        self.y_train = np.where(np.asarray(y) > 0, 1, -1)

        # Precompute the Gram matrix of inner products x_i · x_j
        gram_matrix = np.dot(X, X.T)

        for _ in range(self.n_iters):
            for i in range(n_samples):
                score = np.sum(self.alpha * self.y_train * gram_matrix[i]) + self.bias
                if self.y_train[i] * score <= 0:  # misclassified point
                    self.alpha[i] += self.lr
                    self.bias += self.lr * self.y_train[i]

    def predict(self, X):
        # Recover the weight vector w = sum_i alpha_i * y_i * x_i
        w = np.sum(self.alpha[:, None] * self.y_train[:, None] * self.X_train, axis=0)
        # Follow sign(a) = +1 for a >= 0; np.sign would return 0 when a == 0
        return np.where(np.dot(X, w) + self.bias >= 0, 1, -1)
```
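
The primal and dual forms should trace out the same sequence of updates when both start from zero and visit the samples in the same order, since w = Σ αᵢ yᵢ xᵢ at every step. The sketch below checks this on a three-point toy set. The function names, the dataset, and `lr=1.0` are assumptions for illustration, and both loops stop after the first pass without mistakes.

```python
import numpy as np

def fit_primal(X, y, lr=1.0, max_epochs=100):
    """Primal perceptron: update w and b directly (illustrative helper)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        updated = False
        for x_i, y_i in zip(X, y):
            if y_i * (np.dot(w, x_i) + b) <= 0:
                w += lr * y_i * x_i
                b += lr * y_i
                updated = True
        if not updated:
            break
    return w, b

def fit_dual(X, y, lr=1.0, max_epochs=100):
    """Dual perceptron: update one coefficient alpha_i per sample (illustrative helper)."""
    alpha, b = np.zeros(X.shape[0]), 0.0
    gram = X @ X.T  # gram[i, j] = x_i · x_j
    for _ in range(max_epochs):
        updated = False
        for i in range(X.shape[0]):
            if y[i] * (np.sum(alpha * y * gram[i]) + b) <= 0:
                alpha[i] += lr
                b += lr * y[i]
                updated = True
        if not updated:
            break
    return alpha, b

# Small separable toy set with integer coordinates, so the arithmetic is exact
X_toy = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y_toy = np.array([1, 1, -1])

w_primal, b_primal = fit_primal(X_toy, y_toy)
alpha, b_dual = fit_dual(X_toy, y_toy)
w_dual = (alpha * y_toy) @ X_toy  # w = sum_i alpha_i * y_i * x_i

print("primal:", w_primal, b_primal)
print("dual:  ", w_dual, b_dual, "alpha =", alpha)
```

Each entry of `alpha` equals `lr` times the number of updates triggered by that sample, so samples that were never misclassified keep a coefficient of zero.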

## 3. Implementation Details

### 3.1 Data Preprocessing

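```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Generate synthetic data. flip_y=0.1 assigns random labels to about 10% of
# the samples, so the two classes are not perfectly separable.
X, y = make_classification(
    n_samples=1000,
    n_features=2,
    n_redundant=0,
    n_clusters_per_class=1,
    flip_y=0.1,
    random_state=42
)
y = np.where(y == 0, -1, 1)  # convert the labels to -1 and +1

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize with statistics from the training set only,
# so that no information from the test set leaks into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```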

### 3.2 Training and Evaluation

```python
# Create the perceptron
perceptron = Perceptron(learning_rate=0.1, n_iters=1000)

# Train the model
perceptron.fit(X_train, y_train)

# Predict on the held-out test set
y_pred = perceptron.predict(X_test)

# Evaluate the accuracy
accuracy = np.mean(y_pred == y_test)
print(f"Accuracy: {accuracy:.2f}")
```

## 4. Visualizing the Decision Boundary

```python
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

def plot_decision_boundary(model, X, y):
    cmap = ListedColormap(["#FFAAAA", "#AAFFAA"])

    # Build a grid that covers the data with a margin of 1 on each side
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                         np.arange(y_min, y_max, 0.02))

    # Predict the class of every grid point
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    # Draw the decision regions and overlay the samples
    plt.contourf(xx, yy, Z, alpha=0.4, cmap=cmap)
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap, edgecolor='k')
    plt.title("Perceptron Decision Boundary")
    plt.xlabel("Feature 1")
    plt.ylabel("Feature 2")
    plt.show()

plot_decision_boundary(perceptron, X_test, y_test)
```
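
For two features, the boundary between the two coloured regions is the straight line w₁x₁ + w₂x₂ + b = 0, so it can also be computed in closed form instead of via a grid. The sketch below solves that equation for x₂. The weights, bias, and the helper name `boundary_x2` are made-up values for illustration, and it assumes w₂ ≠ 0.

```python
import numpy as np

def boundary_x2(x1, w, b):
    """Solve w[0]*x1 + w[1]*x2 + b = 0 for x2 (assumes w[1] != 0)."""
    return -(w[0] * x1 + b) / w[1]

# Hypothetical trained parameters, used only to illustrate the formula
w = np.array([1.5, -2.0])
b = 0.5

x1 = np.linspace(-2.0, 2.0, 5)
x2 = boundary_x2(x1, w, b)

# Every point on the line has a score of zero (up to rounding)
scores = w[0] * x1 + w[1] * x2 + b
print(np.column_stack([x1, x2]))
print(scores)
```

With a fitted model, the same two arrays could be passed to `plt.plot(x1, x2)` to draw the boundary as a line on top of the scatter plot.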

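## 5. Limitations of the Perceptron

### 5.1 Linearly Inseparable Problems

The perceptron learning rule is only guaranteed to converge on linearly separable data. On linearly inseparable problems such as exclusive-or (XOR), no straight line classifies all points correctly, so the updates never settle: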

```python
# Build the XOR data
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([-1, 1, 1, -1])

xor_perceptron = Perceptron(n_iters=1000)
xor_perceptron.fit(X_xor, y_xor)
print("XOR predictions:", xor_perceptron.predict(X_xor))
```

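### 5.2 Possible Solutions

- **Multilayer perceptron (MLP)**: stack multiple perceptron units into a neural network
- **Kernel methods**: map the data into a higher-dimensional space in which it becomes linearly separable
- **Other algorithms**: for example, support vector machines (SVM)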
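
The idea of mapping the data into a higher-dimensional space can be illustrated with an explicit feature expansion, which is a simpler relative of kernel methods rather than a kernel method itself. In the sketch below, adding the product x₁·x₂ as a third feature makes XOR linearly separable, and the ordinary perceptron rule then converges. The helper `fit_perceptron`, the choice of the product feature, and `lr=1.0` are assumptions for this example.

```python
import numpy as np

def fit_perceptron(X, y, lr=1.0, max_epochs=100):
    """Plain perceptron rule; stops after the first pass without mistakes."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        updated = False
        for x_i, y_i in zip(X, y):
            if y_i * (np.dot(w, x_i) + b) <= 0:
                w += lr * y_i * x_i
                b += lr * y_i
                updated = True
        if not updated:
            break
    return w, b

def predict(X, w, b):
    return np.where(X @ w + b >= 0, 1, -1)

X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_xor = np.array([-1, 1, 1, -1])

# Raw features: no line separates the classes, so at least one point stays wrong
w_raw, b_raw = fit_perceptron(X_xor, y_xor)
pred_raw = predict(X_xor, w_raw, b_raw)

# Expanded features (x1, x2, x1*x2): the classes become linearly separable
X_expanded = np.column_stack([X_xor, X_xor[:, 0] * X_xor[:, 1]])
w_exp, b_exp = fit_perceptron(X_expanded, y_xor)
pred_exp = predict(X_expanded, w_exp, b_exp)

print("raw features:     ", pred_raw)
print("expanded features:", pred_exp)
```

One separating plane in the expanded space is x₁ + x₂ - 2·x₁x₂ - 0.5 = 0. The perceptron is free to find a different one.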

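## 6. Comparison with Logistic Regression

| Property | Perceptron | Logistic regression |
| --- | --- | --- |
| Output | Hard label (-1/+1) | Probability in (0, 1) |
| Loss function | Perceptron loss over misclassified points (a surrogate for the 0-1 loss) | Log loss (negative log-likelihood) |
| Optimization | Stochastic gradient descent | Gradient descent / Newton's method |
| Typical use | Linearly separable data | General classification problems |
| Convergence | Finite number of updates when the data is linearly separable | Convex objective, so gradient methods approach the optimum; on separable data regularization is needed to keep the weights finite |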
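
The difference in the "Output" row can be seen by feeding the same linear scores through both output functions. This is a minimal sketch with made-up scores: the perceptron output uses the sign convention from section 1.2, and the logistic output uses the standard sigmoid function.

```python
import numpy as np

# Made-up linear scores w·x + b for five hypothetical samples
scores = np.array([-2.0, -0.1, 0.0, 0.1, 2.0])

# Perceptron: hard label, with no notion of confidence
hard_labels = np.where(scores >= 0, 1, -1)

# Logistic regression: sigmoid turns the score into P(y = +1 | x)
probabilities = 1.0 / (1.0 + np.exp(-scores))

for s, h, p in zip(scores, hard_labels, probabilities):
    print(f"score={s:+.1f}  perceptron={h:+d}  logistic P(y=+1)={p:.3f}")
```

Scores of -0.1 and -2.0 receive the same perceptron label, while the logistic output separates a near-boundary sample from a confidently classified one.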

## 7. Practical Examples

### 7.1 Handwritten Digit Recognition

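```python
from sklearn.datasets import load_digits
from sklearn.metrics import classification_report

# Load the data
digits = load_digits()
X = digits.data
y = digits.target

# Binary task: recognize the digit 0 versus all other digits
y = np.where(y == 0, 1, -1)

# Standardize (StandardScaler was imported in section 3.1)
X = StandardScaler().fit_transform(X)

# Train the perceptron
perceptron = Perceptron(n_iters=1000)
perceptron.fit(X, y)

# Note: predictions are made on the training data, so the report is optimistic
y_pred = perceptron.predict(X)

print(classification_report(y, y_pred))
```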

### 7.2 Breast Cancer Classification

```python
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X = data.data
y = data.target
y = np.where(y == 1, 1, -1)  # convert to -1 and +1 (benign = +1, malignant = -1)

# Feature selection
X = X[:, [0, 1]]  # keep the first two features (mean radius, mean texture) for plotting

# Train the model
perceptron = Perceptron(n_iters=1000)
perceptron.fit(X, y)
plot_decision_boundary(perceptron, X, y)
```

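## 8. Performance Tips

- **Feature scaling**: standardization or normalization speeds up convergence
- **Learning-rate schedule**: use a decaying learning rate such as lr = 1/(t+1)
- **Batch computation**: replace Python loops with matrix operations
- **Early stopping**: stop training once the accuracy no longer improves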

```python
class ImprovedPerceptron(Perceptron):
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        y_ = np.where(np.asarray(y) > 0, 1, -1)

        best_acc = 0
        no_improve = 0  # consecutive epochs without a better training accuracy

        for epoch in range(self.n_iters):
            # Decaying learning rate: lr / (epoch + 1)
            current_lr = self.lr / (epoch + 1)

            for idx, x_i in enumerate(X):
                condition = y_[idx] * (np.dot(x_i, self.weights) + self.bias)
                if condition <= 0:
                    self.weights += current_lr * y_[idx] * x_i
                    self.bias += current_lr * y_[idx]

            # Early-stopping check on the training accuracy
            y_pred = self.predict(X)
            acc = np.mean(y_pred == y_)
            if acc > best_acc:
                best_acc = acc
                no_improve = 0
            else:
                no_improve += 1
                if no_improve >= 10:
                    print(f"Early stopping at epoch {epoch}")
                    break
```
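
The "batch computation" tip is not covered by the class above, so here is one way it could look. This is a sketch of the batch perceptron variant, which computes all margins with one matrix product and applies a single summed update per pass. It is a different update schedule from the per-sample rule used elsewhere in this article. The function name, the toy data, and `lr=1.0` are assumptions.

```python
import numpy as np

def fit_batch_perceptron(X, y, lr=1.0, max_epochs=100):
    """Batch perceptron: one vectorized update per pass over the data."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(max_epochs):
        margins = y * (X @ w + b)      # margins of all samples at once
        wrong = margins <= 0           # boolean mask of misclassified samples
        if not wrong.any():            # nothing left to fix
            break
        w += lr * (y[wrong] @ X[wrong])  # sum of y_i * x_i over misclassified samples
        b += lr * y[wrong].sum()
    return w, b

# Tiny, clearly separable toy set
X_toy = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y_toy = np.array([1, 1, -1, -1])

w, b = fit_batch_perceptron(X_toy, y_toy)
preds = np.where(X_toy @ w + b >= 0, 1, -1)
print("w =", w, "b =", b)
print("predictions:", preds)
```

The inner Python loop over samples disappears, so a pass costs a few NumPy calls regardless of the number of samples.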

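## 9. Summary

As the simplest neural network unit, the perceptron has the following characteristics:

1. Its principle is simple and intuitive, and it is easy to implement.
2. It is guaranteed to converge on linearly separable data.
3. It lays the groundwork for more complex neural network models.
4. Each update is cheap to compute, which makes it suitable for large datasets.

Although modern deep learning has produced far more complex models, understanding how the perceptron works remains an important foundation for studying machine learning. Implementing the perceptron in Python helps build a solid understanding of core concepts such as gradient descent and parameter updates.

The complete code for this article has been uploaded to the GitHub repository: perceptron-implementation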