PyTorch
PyTorch is a Python library that started out as a fast engine for matrix multiplication and derivative computation (autograd). It also has modules for deep learning.
There are multiple other well-known libraries for machine learning, most notably TensorFlow + Keras.
When I started doing machine learning projects, I was a big TensorFlow user and ran into several issues. First, the API is not stable between versions, so a lot of projects that use TensorFlow 1 simply break with TensorFlow 2, and they are not easy to fix. This gets worse when your version of Keras (the high-level library you actually interact with to use TensorFlow) breaks in weird ways because of changes to TensorFlow. Also, TensorFlow requires a Python version between 3.6 and 3.9, so your project can break if another library depends on a newer Python version. All in all, I don't think I was ever able to run somebody else's TensorFlow project from GitHub on my Windows machine.
Moreover, TensorFlow hides the lower-level objects you are actually manipulating (matrices). With PyTorch, you get more low-level flexibility, which lets you truly experiment and get a sense of how everything works. PyTorch also works with every Python version from 3.8 up.
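For instance, the core objects PyTorch exposes are tensors (n-dimensional matrices), and autograd records every operation on them so it can compute derivatives. A minimal sketch:

import torch

# A 3x3 matrix whose gradient we want; requires_grad tells
# autograd to record operations involving it.
A = torch.randn(3, 3, requires_grad=True)
x = torch.randn(3)

y = A @ x           # fast matrix multiplication
y.sum().backward()  # derivative computation: d(sum(Ax))/dA

print(A.grad)       # same shape as A; here every row equals x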
Here is an example of training a neural network for image recognition in PyTorch, from https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):  # loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')
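One piece is missing to run this as-is: trainloader. In the linked tutorial it is a DataLoader over torchvision's CIFAR-10 dataset, built roughly like this:

import torch
import torchvision
import torchvision.transforms as transforms

# Convert images to tensors and scale pixel values to [-1, 1].
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)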
I think this example showcases how the PyTorch API is simple yet powerful. You can write whatever you like inside your forward function, and PyTorch will compute the derivatives for you.
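To illustrate that last point, here is a hypothetical module (not from the tutorial) whose forward pass contains a plain Python loop whose length depends on the input; autograd differentiates through whatever actually ran:

import torch
import torch.nn as nn

class DynamicNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 10)

    def forward(self, x):
        # The number of times the layer is applied is decided at
        # runtime, per input; autograd records the ops as they execute.
        for _ in range(int(x.abs().sum()) % 3 + 1):
            x = torch.relu(self.fc(x))
        return x.sum()

net = DynamicNet()
loss = net(torch.randn(10))
loss.backward()  # gradients flow through however many layers ran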