DL4S
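This framework contains an implementation of reverse-mode automatic differentiation, vectorized implementations of common matrix and vector operators, and high-level neural network operations such as convolution and recurrent units.
Overview
Installation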
CocoaPods
target 'Your-App-Name' do
    use_frameworks!
    pod 'DL4S', '~> 0.1.0'
end
Swift Package Manager
Add the dependency to your Package.swift file:
.package(url: "https://github.com/palle-k/DL4S.git", .branch("master"))
Then add DL4S as a dependency of your target:
.target(name: "MyPackage", dependencies: ["DL4S"])
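For orientation, the two fragments above could sit in a complete manifest roughly as follows. This is a minimal sketch: the tools version and the package name `MyPackage` are assumptions for illustration, and only the `.package` and `.target` lines come from this README.

```swift
// swift-tools-version:5.1
// Assumed tools version; use whichever version your project already targets.
import PackageDescription

let package = Package(
    name: "MyPackage", // placeholder name, matching the target below
    dependencies: [
        // Tracks the master branch, as in the fragment above.
        .package(url: "https://github.com/palle-k/DL4S.git", .branch("master"))
    ],
    targets: [
        // The target declares DL4S as a dependency by name.
        .target(name: "MyPackage", dependencies: ["DL4S"])
    ]
)
```

Tracking a branch means every `swift package update` may pull in new commits; pinning to a released version is the usual alternative when reproducible builds matter.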
Features
Layers
- Convolution
- Dense/Linear/Fully Connected
- LSTM
- Gated Recurrent Unit (GRU)
- Vanilla RNN
- Bidirectional RNNs
- Max Pooling
- Average Pooling
- ReLU
- tanh
- Sigmoid
- Softmax
- Embedding
- Batch Norm
- Layer Norm
- Lambda
- Sequential
- Dropout
Optimizers
- SGD
- Momentum
- Adam
- AdaGrad
- AdaDelta
- RMSProp
Losses
- Binary Cross-Entropy
- Categorical Cross-Entropy
- Mean Squared Error (MSE)
- L1 & L2 regularization
Tensor Operations
- broadcast-add
- broadcast-sub
- broadcast-mul
- broadcast-div
- matrix multiplication (matmul)
- neg
- exp
- pow
- log
- sqrt
- sin
- cos
- tan
- tanh
- l1 norm
- l2 norm
- sum
- max
- ReLU
- leaky ReLU
- reduce sum
- reduce max
- 2D convolution (conv2d)
- max pool
- avg pool
- subscript
- subscript range
- transpose
- axis permute
- reverse
- im2col
- col2im
- stack / concat
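Engines
- CPU (Accelerate framework)
- GPU (Metal)
Architectures
Default implementations are provided for the following architectures:
- ResNet (currently only ResNet-18)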
- VGG
- AlexNet
Examples
Arithmetic & Differentiation
import DL4S
let a = Tensor<Float, CPU>([[1,2],[3,4],[5,6]], requiresGradient: true)
let prod = mmul(a.T, a)
let s = sum(prod)
let l = log(s)
print(l) // 5.1873856
// Backpropagate
l.backwards()
print(a.gradientDescription!)
/*
[[0.03351955, 0.03351955],
[0.07821229, 0.07821229],
[0.12290502, 0.12290502]]
*/
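The printed values can be checked by hand. The sum of all entries of `a.T * a` equals the sum of the squared row sums of `a`, so `s = 3² + 7² + 11² = 179`, and the derivative of `log(s)` with respect to `a[i][j]` is `2 * rowSum[i] / s`. The sketch below reproduces this in plain Swift without DL4S; the variable names are illustrative only.

```swift
import Foundation

// Same 3x2 matrix as in the DL4S example above.
let a: [[Float]] = [[1, 2], [3, 4], [5, 6]]

// sum(a^T * a) equals the sum over rows of (row sum)^2.
let rowSums = a.map { $0.reduce(0, +) }        // [3, 7, 11]
let s = rowSums.map { $0 * $0 }.reduce(0, +)   // 9 + 49 + 121 = 179
let l = log(s)                                 // natural logarithm

// d log(s) / d a[i][j] = 2 * rowSum[i] / s, the same for every column j.
let gradient = a.enumerated().map { (i, row) in
    row.map { _ in 2 * rowSums[i] / s }
}

print(l)
print(gradient)
```

Each gradient row is constant across columns, which matches the repeated values in the output above.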
Convolutional Networks
Example for MNIST classification
import DL4S
// Input must be 1x28x28
let model = Sequential<Float, CPU>(
    Conv2D(inputChannels: 1, outputChannels: 6, kernelSize: 5, padding: 0).asAny(), // 6x24x24
    Relu().asAny(),
    MaxPool2D(windowSize: 2, stride: 2).asAny(), // 6x12x12
    Conv2D(inputChannels: 6, outputChannels: 16, kernelSize: 5, padding: 0).asAny(), // 16x8x8
    Relu().asAny(),
    MaxPool2D(windowSize: 2, stride: 2).asAny(), // 16x4x4
    Flatten().asAny(), // 256
    Dense(inputFeatures: 256, outputFeatures: 120).asAny(),
    Relu().asAny(),
    Dense(inputFeatures: 120, outputFeatures: 10).asAny(),
    Softmax().asAny()
)
let optimizer = Adam(parameters: model.trainableParameters, learningRate: 0.001)
// Single iteration of minibatch gradient descent
optimizer.zeroGradient()
let batch: Tensor<Float, CPU> = ... // shape: [batchSize, 1, 28, 28]
let y_true: Tensor<Int32, CPU> = ... // shape: [batchSize]
let pred = model.forward(batch)
let loss = categoricalCrossEntropy(expected: y_true, actual: pred)
loss.backwards()
optimizer.step()
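The shape comments in the model above follow the usual output-size rule for convolution and pooling windows, `(input + 2 * padding - window) / stride + 1`. The following sketch recomputes them in plain Swift; `outputSize` is a hypothetical helper written for this illustration, not part of DL4S, and it assumes no dilation.

```swift
// Output extent of a convolution or pooling window along one spatial axis.
// Hypothetical helper; assumes no dilation and symmetric padding.
func outputSize(input: Int, window: Int, stride: Int = 1, padding: Int = 0) -> Int {
    return (input + 2 * padding - window) / stride + 1
}

var side = 28                                         // MNIST images are 28x28
side = outputSize(input: side, window: 5)             // Conv2D, kernel size 5: 24
side = outputSize(input: side, window: 2, stride: 2)  // MaxPool2D: 12
side = outputSize(input: side, window: 5)             // Conv2D, kernel size 5: 8
side = outputSize(input: side, window: 2, stride: 2)  // MaxPool2D: 4

// 16 output channels of 4x4 each are flattened into one feature vector.
let flattened = 16 * side * side
print(side, flattened) // prints "4 256"
```

The result, 256, is the `inputFeatures` value of the first `Dense` layer.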
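Recurrent Networks
Example for MNIST classification
The LSTM scans the image from top to bottom and uses the final hidden state for classification.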
import DL4S
let model = Sequential<Float, CPU>(
    LSTM(inputSize: 28, hiddenSize: 128).asAny(),
    Dense(inputFeatures: 128, outputFeatures: 10).asAny(),
    Softmax().asAny()
)
let optimizer = Adam(parameters: model.trainableParameters, learningRate: 0.001)
// Single iteration of minibatch gradient descent
optimizer.zeroGradient()
let batch: Tensor<Float, CPU> = ... // shape: [batchSize, 28, 28]
let y_true: Tensor<Int32, CPU> = ... // shape: [batchSize]
let x = batch.permuted(to: 1, 0, 2) // Swap first and second axis: [28, batchSize, 28]
let pred = model.forward(x)
let loss = categoricalCrossEntropy(expected: y_true, actual: pred)
loss.backwards()
optimizer.step()
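The axis swap in the example above turns a batch of images into a sequence of image rows. A rough illustration of the shape bookkeeping in plain Swift follows; `permutedShape` is a hypothetical helper and the batch size of 32 is an assumed value. Reading the leading axis as the sequence axis is inferred from the description above rather than from the DL4S API.

```swift
// Hypothetical helper: reorders a shape so that result[i] == shape[axes[i]].
func permutedShape(_ shape: [Int], to axes: [Int]) -> [Int] {
    precondition(axes.count == shape.count && Set(axes) == Set(shape.indices),
                 "axes must be a permutation of 0..<rank")
    return axes.map { shape[$0] }
}

let batchSize = 32                        // assumed value for illustration
let imageBatch = [batchSize, 28, 28]      // [batch, rows, columns]
let sequence = permutedShape(imageBatch, to: [1, 0, 2])
print(sequence) // prints "[28, 32, 28]": [rows, batch, columns]
```

Under that reading, the LSTM sees 28 steps, each carrying one 28-pixel row per image, which is consistent with `inputSize: 28`.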
Generative Adversarial Networks
Example that generates random images similar to those in the MNIST database
import AppKit // NSImage and NSBitmapImageRep, used to save the generated images
import DL4S
let images: Tensor<Float, CPU> = ... // shape [numImages x 28 x 28]
let d1 = Dropout<Float, CPU>(rate: 0.5)
let d2 = Dropout<Float, CPU>(rate: 0.5)
let generator = Sequential<Float, CPU>(
    Dense(inputFeatures: 20, outputFeatures: 200).asAny(),
    Tanh().asAny(),
    d1.asAny(),
    Dense(inputFeatures: 200, outputFeatures: 800).asAny(),
    Tanh().asAny(),
    d2.asAny(),
    Dense(inputFeatures: 800, outputFeatures: 28 * 28).asAny(),
    Sigmoid().asAny(),
    Reshape(shape: 28, 28).asAny()
)
let discriminator = Sequential<Float, CPU>(
    Flatten().asAny(),
    Dense(inputFeatures: 28 * 28, outputFeatures: 400).asAny(),
    Tanh().asAny(),
    Dense(inputFeatures: 400, outputFeatures: 100).asAny(),
    Tanh().asAny(),
    Dense(inputFeatures: 100, outputFeatures: 1).asAny(),
    Sigmoid().asAny()
)
// Generator followed by discriminator, used to score generated images
let network = Sequential(generator.asAny(), discriminator.asAny())
let optimGen = Adam(parameters: generator.trainableParameters, learningRate: 0.0003)
let optimDis = Adam(parameters: discriminator.trainableParameters, learningRate: 0.0003)
let batchSize = 32
let epochs = 10_000
let regularization: Float = 0.001
let genInputs = Tensor<Float, CPU>(repeating: 0, shape: batchSize, 20)
for epoch in 1 ... epochs {
    // Discriminator update: one step on a real batch and a generated batch
    optimDis.zeroGradient()
    let real = Random.minibatch(from: images, count: batchSize)
    Random.fillNormal(genInputs)
    let realResult = discriminator.forward(real)
    let fakeResult = network.forward(genInputs)
    let dRegLoss = optimDis.parameters.map {l2loss($0, loss: regularization)}.reduce(0, +)
    let discriminatorLoss = -mean(log(realResult)) - mean(log(1 - fakeResult)) + dRegLoss
    discriminatorLoss.backwards()
    optimDis.step()
    // Generator update: four steps per discriminator step
    var generatorLoss = Tensor<Float, CPU>(0)
    for _ in 0 ..< 4 {
        optimGen.zeroGradient()
        Random.fillNormal(genInputs)
        let genResult = network.forward(genInputs)
        let gRegLoss = optimGen.parameters.map {l2loss($0, loss: regularization)}.reduce(0, +)
        generatorLoss = -0.5 * mean(log(genResult)) + gRegLoss // heuristic non-saturating loss
        generatorLoss.backwards()
        optimGen.step()
    }
    if epoch % 100 == 0 {
        print(" [\(epoch)/\(epochs)] loss d: \(discriminatorLoss.item), g: \(generatorLoss.item)")
    }
}
// Generate one batch of images and write each one to a PNG file
Random.fillNormal(genInputs)
let genResult = generator.forward(genInputs)
for i in 0 ..< batchSize {
    let slice = genResult[i].T.unsqueeze(at: 0)
    guard let image = NSImage(slice), let imgData = image.tiffRepresentation else {
        continue
    }
    guard let rep = NSBitmapImageRep(data: imgData) else {
        continue
    }
    let png = rep.representation(using: .png, properties: [:])
    try? png?.write(to: URL(fileURLWithPath: "generated_\(i).png"))
}
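Written out, the two losses in the training loop above correspond to the following expressions. This is a reading of the code, not a statement from the DL4S documentation: $D$ is the discriminator, $G$ the generator, $x_i$ a real image, $z_i$ a normally distributed input vector, and $N$ the batch size. $R_D$ and $R_G$ stand for the summed `l2loss` terms over the respective parameters; their exact form is not specified here.

```latex
\begin{aligned}
L_D &= -\frac{1}{N}\sum_{i=1}^{N} \log D(x_i)
       \;-\; \frac{1}{N}\sum_{i=1}^{N} \log\bigl(1 - D(G(z_i))\bigr)
       \;+\; R_D \\
L_G &= -\frac{1}{2}\cdot\frac{1}{N}\sum_{i=1}^{N} \log D(G(z_i)) \;+\; R_G
\end{aligned}
```

The generator term maximizes $\log D(G(z))$ rather than minimizing $\log(1 - D(G(z)))$, which is the non-saturating heuristic named in the code comment.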