DenseNet, ResNet – ITpasjonat!

1. DenseNet

In DenseNet, every layer is connected to all preceding layers through concatenation. Below is an implementation of a fully connected network that uses this property.

import torch
import torch.nn as nn

class Net4_densenet(nn.Module):
    def __init__(self, input_dim, hidden_dim1=48, hidden_dim2=24, hidden_dim3=12, output_dim=1, dropout_rate=0.0001):
        super(Net4_densenet, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim1)
        self.swish1 = nn.SiLU()
        self.dropout1 = nn.Dropout(p=dropout_rate)
        
        # Input to fc2 is input_dim + hidden_dim1
        self.fc2 = nn.Linear(input_dim + hidden_dim1, hidden_dim2)
        self.swish2 = nn.SiLU()
        self.dropout2 = nn.Dropout(p=dropout_rate)
        
        # Input to fc3 is input_dim + hidden_dim1 + hidden_dim2
        self.fc3 = nn.Linear(input_dim + hidden_dim1 + hidden_dim2, hidden_dim3)
        self.swish3 = nn.SiLU()
        self.dropout3 = nn.Dropout(p=dropout_rate)
        
        # Input to fc4 is input_dim + hidden_dim1 + hidden_dim2 + hidden_dim3
        self.fc4 = nn.Linear(input_dim + hidden_dim1 + hidden_dim2 + hidden_dim3, output_dim)
    
    def forward(self, x):
        x1 = self.fc1(x)
        x1 = self.swish1(x1)
        x1 = self.dropout1(x1)
        
        # Concatenate the input x with the output x1
        x2_input = torch.cat([x, x1], dim=1)
        x2 = self.fc2(x2_input)
        x2 = self.swish2(x2)
        x2 = self.dropout2(x2)
        
        # Concatenate the input x with the outputs x1 and x2
        x3_input = torch.cat([x, x1, x2], dim=1)
        x3 = self.fc3(x3_input)
        x3 = self.swish3(x3)
        x3 = self.dropout3(x3)
        
        # Concatenate the input x with the outputs x1, x2 and x3
        x4_input = torch.cat([x, x1, x2, x3], dim=1)
        x4 = self.fc4(x4_input)
        
        return x4
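
The growth of the layer inputs can be checked with plain arithmetic. The sketch below is only an illustration: `dense_input_widths` is a hypothetical helper, not part of the network above, and `input_dim=10` is an arbitrary example value. The layer widths 48, 24, 12 and 1 are the defaults of `Net4_densenet`.

```python
def dense_input_widths(input_dim, layer_widths):
    """Return the input width seen by each layer of a densely connected MLP.

    Each layer receives the original input concatenated with the outputs
    of all earlier layers, so its input width is input_dim plus the sum
    of the widths of those earlier layers.
    """
    widths = []
    running = input_dim
    for out_width in layer_widths:
        widths.append(running)   # width fed into the current layer
        running += out_width     # its output is appended for the next one
    return widths


# input_dim=10 is an assumed example value; 48, 24, 12, 1 are the
# default widths of fc1..fc4 in Net4_densenet.
print(dense_input_widths(10, [48, 24, 12, 1]))  # [10, 58, 82, 94]
```

The last value, 94, is the `in_features` of `fc4` for this example input size.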

2. ResNet

ResNet uses skip connections (residual connections), where the output of one layer is added to the output of the next layer. The addition requires both tensors to have the same shape. Here is how you can implement this architecture:

import torch
import torch.nn as nn

class Net4_resnet(nn.Module):
    def __init__(self, input_dim, hidden_dim1=48, hidden_dim2=24, hidden_dim3=12, output_dim=1, dropout_rate=0.0001):
        super(Net4_resnet, self).__init__()
        self.fc1 = nn.Linear(input_dim, hidden_dim1)
        self.swish1 = nn.SiLU()
        self.dropout1 = nn.Dropout(p=dropout_rate)
        
        # Layer 2
        self.fc2 = nn.Linear(hidden_dim1, hidden_dim2)
        self.swish2 = nn.SiLU()
        self.dropout2 = nn.Dropout(p=dropout_rate)
        # Shortcut for layer 2: identity if the widths match, otherwise a
        # linear projection, because addition needs equal shapes
        self.skip2 = nn.Identity() if hidden_dim1 == hidden_dim2 else nn.Linear(hidden_dim1, hidden_dim2, bias=False)
        
        # Layer 3
        self.fc3 = nn.Linear(hidden_dim2, hidden_dim3)
        self.swish3 = nn.SiLU()
        self.dropout3 = nn.Dropout(p=dropout_rate)
        # Shortcut for layer 3, same rule as above
        self.skip3 = nn.Identity() if hidden_dim2 == hidden_dim3 else nn.Linear(hidden_dim2, hidden_dim3, bias=False)
        
        # Layer 4
        self.fc4 = nn.Linear(hidden_dim3, output_dim)
    
    def forward(self, x):
        x1 = self.fc1(x)
        x1 = self.swish1(x1)
        x1 = self.dropout1(x1)
        
        x2 = self.fc2(x1)
        x2 = self.swish2(x2)
        x2 = self.dropout2(x2)
        x2 = x2 + self.skip2(x1)  # Skip connection
        
        x3 = self.fc3(x2)
        x3 = self.swish3(x3)
        x3 = self.dropout3(x3)
        x3 = x3 + self.skip3(x2)  # Skip connection
        
        x4 = self.fc4(x3)
        # No activation after fc4, since it is the output layer
        
        return x4
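
The repeated "layer plus shortcut" pattern can also be factored into a reusable block. The following is a minimal sketch under stated assumptions: `ResidualLinear` is a hypothetical name, and using a bias-free linear projection when the widths differ is one common choice, not necessarily the one intended by the listing above.

```python
import torch
import torch.nn as nn


class ResidualLinear(nn.Module):
    """Hypothetical helper: Linear -> SiLU -> Dropout with a shortcut."""

    def __init__(self, in_dim, out_dim, dropout_rate=0.0):
        super().__init__()
        self.fc = nn.Linear(in_dim, out_dim)
        self.act = nn.SiLU()
        self.dropout = nn.Dropout(p=dropout_rate)
        # Identity shortcut when the widths match; otherwise a learned
        # projection so that the addition is shape-compatible.
        if in_dim == out_dim:
            self.shortcut = nn.Identity()
        else:
            self.shortcut = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x):
        out = self.dropout(self.act(self.fc(x)))
        # Out-of-place addition; avoids modifying a tensor in place.
        return out + self.shortcut(x)


block_same = ResidualLinear(16, 16)   # identity shortcut
block_proj = ResidualLinear(48, 24)   # projected shortcut
y_same = block_same(torch.randn(4, 16))
y_proj = block_proj(torch.randn(4, 48))
print(y_same.shape, y_proj.shape)
```

With equal widths the shortcut adds no parameters. With different widths the projection adds `in_dim * out_dim` weights per block.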

Key differences:

DenseNet:
- Concatenation: each layer's output is concatenated with the original input and the outputs of all earlier layers to form the input of the next layer.
- The input dimension of successive layers grows as new outputs are appended.
ResNet:
- Skip connections: the output of one layer is added to the output of the next layer. Addition requires equal shapes, so tensor widths stay constant along a skip path; where the width changes, the shortcut needs a projection.
- Residual learning: adding the outputs of earlier layers gives the gradient a direct path backward, which mitigates the vanishing-gradient problem and makes deep networks easier to train.
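
The vanishing-gradient remark can be made concrete with a short derivation. The notation is introduced here for illustration only: $x_l$ is the input of layer $l$, $F_l$ is the transformation applied by that layer (linear map, activation, dropout), $\mathcal{L}$ is the loss, and the shortcut is assumed to be the identity.

```latex
% Forward pass of one residual layer with an identity shortcut
x_{l+1} = x_l + F_l(x_l)

% Backward pass through that layer (chain rule)
\frac{\partial \mathcal{L}}{\partial x_l}
  = \frac{\partial \mathcal{L}}{\partial x_{l+1}}
    \left( I + \frac{\partial F_l}{\partial x_l} \right)

% Unrolled from layer l to a deeper layer L
x_L = x_l + \sum_{i=l}^{L-1} F_i(x_i)
\quad\Longrightarrow\quad
\frac{\partial \mathcal{L}}{\partial x_l}
  = \frac{\partial \mathcal{L}}{\partial x_L}
    \left( I + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F_i(x_i) \right)
```

The identity term $I$ passes the gradient from $x_L$ to $x_l$ unchanged, so the signal does not rely solely on a product of layer Jacobians that may shrink toward zero. With a projected shortcut, the identity is replaced by the projection matrix and this guarantee is weaker.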
