mirror of https://github.com/lordmathis/CUDANet.git synced 2026-07-18 18:08:47 +00:00

Convolutional Neural Network inference library running on CUDA

convolutional-neural-networks cpp cuda pytorch

C++ 44.6%
Cuda 34.5%
Python 19.8%
CMake 1%
Shell 0.1%

LordMathis 8fdbe60a75 Add initial dense forward test		2026-03-12 22:57:40 +01:00
examples	Refactor Backend and Layer interfaces	2025-11-18 18:27:57 +01:00
include	Add missing tensor default contructor	2026-03-12 21:59:23 +01:00
src	Fix namespace usage in tensor.cpp	2026-03-12 21:41:21 +01:00
test	Add initial dense forward test	2026-03-12 22:57:40 +01:00
tools	Update inception v3 readme	2024-09-04 21:32:05 +02:00
.clang-format	Format source code using clang-format	2024-02-27 18:52:12 +01:00
.gitignore	Update path passing in generators	2026-03-12 22:46:16 +01:00
CMakeLists.txt	Fix some test compilation errors	2026-02-28 23:29:54 +01:00
Doxyfile	Add doxygen config	2024-03-12 22:09:37 +01:00
LICENSE	Initial commit	2024-02-07 20:06:30 +01:00
README.md	Update README	2024-04-28 21:47:12 +02:00
run_tests.sh	Add run_tests utility script	2026-03-05 21:35:58 +01:00

README.md

CUDANet

Convolutional Neural Network inference library running on CUDA.

Quickstart Guide

requirements

cmake
CUDA
Google Test (for testing only)

build

mkdir build
cd build
cmake -S .. -DCMAKE_CUDA_ARCHITECTURES=75  # Replace with your CUDA architecture
make

build and run tests

make test_main
./test/test_main

Create Layers and Model

CUDANet::Model *model =
    new CUDANet::Model(inputSize, inputChannels, outputSize);

// Conv2d
CUDANet::Layers::Conv2d *conv2d = new CUDANet::Layers::Conv2d(
    inputSize, inputChannels, kernelSize, stride, numFilters,
    CUDANet::Layers::Padding::VALID,
    CUDANet::Layers::ActivationType::NONE
);

if (setWeights) {
    conv2d->setWeights(getConv1Weights().data());
}
model->addLayer("conv1", conv2d);
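When chaining layers, the output size of one layer becomes the input size of the next. The sketch below shows the usual arithmetic for a convolution without padding, assuming square inputs and kernels and that CUDANet's Padding::VALID follows the standard convention. validOutputSize is a hypothetical helper written for this illustration; it is not part of the CUDANet API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of CUDANet): side length of the output of a
// convolution with no padding, for a square input and a square kernel.
// Standard convention: floor((input - kernel) / stride) + 1.
std::size_t validOutputSize(std::size_t inputSize,
                            std::size_t kernelSize,
                            std::size_t stride) {
    assert(stride > 0 && "stride must be positive");
    assert(kernelSize > 0 && kernelSize <= inputSize &&
           "kernel must fit inside the input");
    // Integer division on unsigned values performs the floor.
    return (inputSize - kernelSize) / stride + 1;
}
```

For example, a 32x32 input with a 3x3 kernel and stride 1 gives a 30x30 output per filter, so a following layer would take 30 as its input size and numFilters as its input channels.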

Sequential and Functional API

Run prediction by passing the input through the layers in the order they have been added.

std::vector<float> input = {...};
model->predict(input.data());
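The sequential behaviour described above can be illustrated with a small host-only toy. This is not CUDANet's implementation: ToyLayer and ToySequential are hypothetical names, and the layers are plain functions on std::vector<float> rather than CUDA kernels. It only shows how named layers kept in insertion order yield a default forward pass.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a layer: any function from a vector to a vector.
using ToyLayer = std::function<std::vector<float>(const std::vector<float>&)>;

// Toy sequential container (hypothetical, not CUDANet::Model).
class ToySequential {
  public:
    // Store the layer and remember its position under the given name.
    void addLayer(const std::string& name, ToyLayer layer) {
        assert(index.count(name) == 0 && "layer names must be unique");
        index[name] = layers.size();
        layers.push_back(std::move(layer));
    }

    // Look a layer up by name; throws std::out_of_range if it is missing.
    const ToyLayer& getLayer(const std::string& name) const {
        return layers.at(index.at(name));
    }

    // Feed the input through every layer in the order it was added.
    std::vector<float> predict(std::vector<float> x) const {
        for (const ToyLayer& layer : layers) {
            x = layer(x);
        }
        return x;
    }

  private:
    std::vector<ToyLayer> layers;
    std::unordered_map<std::string, std::size_t> index;
};
```

Because the layers run in insertion order, adding a "double" layer before an "add one" layer maps 1 to 3, while the reverse order would map 1 to 4.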

For a more complex forward pass, for example one that uses Concat or Add layers, subclass the Model class and override the default predict function.

class MyModel : public CUDANet::Model {
    ...
};

...

float* MyModel::predict(const float* input) {
    float* d_input = inputLayer->forward(input);

    // Run both branches on the same input
    float* d_conv1 = getLayer("conv1")->forward(d_input);
    float* d_conv2 = getLayer("conv2")->forward(d_input);

    // Merge the two branch outputs
    float* d_output = concatLayer->forward(d_conv1, d_conv2);

    // Pass the merged result, not the raw input, to the output layer
    return outputLayer->forward(d_output);
}

Load Pre-trained Weights

CUDANet loads weights and biases from a binary format similar to safetensors:

[u_short version, u_int64 header size, header, tensor values]

where the header is in CSV format, with one entry per tensor:

<tensor_name>,<tensor_size>,<tensor_offset>

To load weights, call the load_weights function on the Model object. To export weights from PyTorch, use the export_model_weights function from the tools/utils.py script. Currently, only float32 weights are supported.
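To make the layout above concrete, here is a minimal sketch of reading the header from an in-memory copy of such a file. It is not CUDANet's loader. It assumes a 2-byte version, an 8-byte header size, native byte order, and newline-separated header entries, none of which is confirmed by this README; TensorEntry, ParsedHeader and parseHeader are hypothetical names. The units of the size and offset fields are not specified here, so the sketch only records them.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One header entry: <tensor_name>,<tensor_size>,<tensor_offset>
struct TensorEntry {
    std::string name;
    std::uint64_t size;
    std::uint64_t offset;
};

struct ParsedHeader {
    std::uint16_t version;
    std::vector<TensorEntry> tensors;
    std::size_t dataStart;  // byte index where the tensor values begin
};

// Hypothetical parser. Assumes: 2-byte version, 8-byte header size,
// native byte order, and one CSV entry per line.
ParsedHeader parseHeader(const std::string& bytes) {
    const std::size_t prefix = sizeof(std::uint16_t) + sizeof(std::uint64_t);
    if (bytes.size() < prefix) throw std::runtime_error("file too short");

    ParsedHeader out{};
    std::uint64_t headerSize = 0;
    std::memcpy(&out.version, bytes.data(), sizeof(out.version));
    std::memcpy(&headerSize, bytes.data() + sizeof(out.version),
                sizeof(headerSize));
    if (headerSize > bytes.size() - prefix)
        throw std::runtime_error("truncated header");

    std::istringstream header(
        bytes.substr(prefix, static_cast<std::size_t>(headerSize)));
    std::string line;
    while (std::getline(header, line)) {
        if (line.empty()) continue;
        // Split on the last two commas so the name itself may contain commas.
        const std::size_t c2 = line.rfind(',');
        if (c2 == std::string::npos || c2 == 0)
            throw std::runtime_error("malformed header entry");
        const std::size_t c1 = line.rfind(',', c2 - 1);
        if (c1 == std::string::npos)
            throw std::runtime_error("malformed header entry");

        TensorEntry entry;
        entry.name = line.substr(0, c1);
        entry.size = std::stoull(line.substr(c1 + 1, c2 - c1 - 1));
        entry.offset = std::stoull(line.substr(c2 + 1));
        out.tensors.push_back(entry);
    }
    out.dataStart = prefix + static_cast<std::size_t>(headerSize);
    return out;
}
```

A caller would read the whole file into a std::string, call parseHeader, and then locate each tensor's float32 values relative to dataStart using the recorded offset.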