losses module

Transfer Losses module.

This module provides a set of transfer losses for domain adaptation applications.

The code can be found here: https://github.com/jindongwang/transferlearning/tree/master/code/DeepDA

This file gathers some of the available losses so that the functionality is available in a single module.


MIT License

Copyright (c) 2018 Jindong Wang

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

class losses.AdversarialLoss(gamma=1.0, max_iter=1000, use_lambda_scheduler=True, input_dim=256, **kwargs)

Bases: Module

AdversarialLoss.

This class inherits from the PyTorch Module class and represents a loss function for domain adaptation based on the adversarial criterion.

Acknowledgement: The adversarial loss implementation is inspired by http://transfer.thuml.ai/

__init__.

This method initializes the AdversarialLoss object and sets the attributes according to the parameters. It creates a domain classifier as a submodule and, if the use_lambda_scheduler flag is True, a lambda scheduler as well.

Parameters:
  • gamma – coefficient for the scheduler

  • max_iter – maximum number of iterations for the scheduler

  • use_lambda_scheduler – True to use lambda scheduler

  • input_dim – data input dimension

  • kwargs – other arguments

forward(source, target)

forward.

This method computes and returns the adversarial loss for domain adaptation.

  1. it takes the source and target tensors as inputs and computes the lambda value for the gradient reversal using the lambda scheduler

  2. it computes the adversarial loss for the source and target data using the get_adversarial_result method

  3. it returns the average of the source and target adversarial losses as a scalar tensor

Parameters:
  • source – source tensor

  • target – target tensor
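
A minimal usage sketch (the 256-dimensional feature batches are illustrative):

   import torch

   adv_loss = AdversarialLoss(input_dim=256)
   source_feat = torch.randn(32, 256)  # batch of source features
   target_feat = torch.randn(32, 256)  # batch of target features
   loss = adv_loss(source_feat, target_feat)  # scalar adversarial loss
   loss.backward()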

get_adversarial_result(x, source=True, lamb=1.0)

get_adversarial_result.

This method computes and returns the adversarial loss for a given input and a domain label.

  1. it takes the input as a tensor and applies the ReverseLayerF function to it, which reverses the gradient flow during the backward pass

  2. it passes the reversed input to the domain classifier, which predicts the probability of the input belonging to the source or the target domain

  3. it creates the domain label as a tensor of ones or zeros, depending on whether the input is from the source or the target domain

  4. it computes the binary cross-entropy loss between the domain prediction and the domain label

  5. it returns the loss as a scalar tensor

Parameters:
  • x – tensor of the input data

  • source – True if the domain is source, False for target

  • lamb – parameter controlling the gradient reversal strength
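
The steps above can be sketched as follows; the domain_classifier submodule comes from the __init__ description, while the use of nn.BCELoss and the exact label shapes are assumptions:

   import torch
   import torch.nn as nn

   def get_adversarial_result_sketch(self, x, source=True, lamb=1.0):
       # reverse the gradient flow during the backward pass, scaled by lamb
       x_reversed = ReverseLayerF.apply(x, lamb)
       # probability that each sample belongs to the source domain
       domain_pred = self.domain_classifier(x_reversed)
       # domain label: ones for source samples, zeros for target samples
       fill = 1.0 if source else 0.0
       domain_label = torch.full((len(x), 1), fill, device=domain_pred.device)
       # binary cross-entropy between prediction and label
       return nn.BCELoss()(domain_pred, domain_label)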

losses.BNM(source, target)

BNM. Batch nuclear-norm maximization, CVPR 2020.

This function computes and returns the batch nuclear-norm maximization (BNM) loss for domain adaptation. It is based on the paper “Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations” by Cui et al., CVPR 2020. The function does not require the source domain data, only the target domain data.

Parameters:
  • source – source tensor. Not used

  • target – a tensor, softmax target output
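
A hedged sketch of the computation: the loss is the negative nuclear norm of the batch softmax matrix, averaged over the batch, so minimizing it maximizes the nuclear norm:

   import torch

   def bnm_sketch(target):
       # target: (batch_size, num_class) softmax outputs
       return -torch.norm(target, p='nuc') / target.shape[0]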

losses.CORAL(source, target, **kwargs)

CORAL.

This function computes and returns the CORAL loss for domain adaptation. It is based on the paper “Deep CORAL: Correlation Alignment for Deep Domain Adaptation” by Sun and Saenko, ECCV 2016. The function takes the source and target tensors as inputs, calculates their covariance matrices, computes the Frobenius norm of the difference between the covariance matrices, and returns it as the loss.

Parameters:
  • source – Source data tensor, with shape (batch_size, input_dim)

  • target – Target data tensor, with shape (batch_size, input_dim)

  • kwargs – Additional arguments. Not used
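
A sketch following the paper's definition, which scales the squared Frobenius norm by 1/(4 d^2); the implementation described above may use the unsquared norm or a different scaling:

   import torch

   def coral_sketch(source, target):
       d = source.size(1)
       # feature covariance matrices (torch.cov expects variables in rows)
       cs = torch.cov(source.T)
       ct = torch.cov(target.T)
       # squared Frobenius norm of the covariance difference
       return (cs - ct).pow(2).sum() / (4 * d * d)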

class losses.DAANLoss(input_dim, num_class, gamma=1.0, max_iter=1000, **kwargs)

Bases: AdversarialLoss, LambdaScheduler

DAANLoss.

This class inherits from the AdversarialLoss and LambdaScheduler classes and represents a loss function for domain adaptation based on the dynamic adversarial adaptation network (DAAN) criterion.

__init__.

This method initializes the DAANLoss object and sets the attributes according to the parameters. It creates a list of local classifiers, one per class, as submodules, and initializes the dynamic factor and the variables tracking the global and local losses.

Parameters:
  • input_dim – input data dimension

  • num_class – number of classes in classification task. Set it to 1 for regression

  • gamma – coefficient for the lambda scheduler

  • max_iter – maximum number of iterations for the lambda scheduler

  • kwargs – additional arguments

forward(source, target, source_logits, target_logits)

forward.

This method computes and returns the DAAN loss for domain adaptation.

  1. it takes the source and target tensors and their logits as inputs and computes the global and local adversarial losses using the get_adversarial_result and the get_local_adversarial_result methods

  2. it updates the d_g and d_l variables with the global and local losses

  3. it computes the DAAN loss as a weighted combination of the global and local losses, using the dynamic factor as the weight

  4. it then returns the DAAN loss as a scalar tensor

Parameters:
  • source – source tensor

  • target – target tensor

  • source_logits – source logits

  • target_logits – target logits
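
A hedged sketch of the combination in step 3; per the update rule described under update_dynamic_factor, a larger dynamic factor shifts the weight toward the local loss:

   def combine_daan_losses(global_loss, local_loss, dynamic_factor):
       # dynamic_factor in [0, 1]: 0 keeps only the global loss, 1 only the local loss
       return (1 - dynamic_factor) * global_loss + dynamic_factor * local_loss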

get_local_adversarial_result(x, logits, c, source=True, lamb=1.0)

get_local_adversarial_result.

This method computes and returns the local adversarial loss for domain adaptation.

  1. it takes the input data, the logits, the class index, and the domain label as inputs and computes the local features by multiplying the input data with the logits of the given class

  2. it passes the local features to the corresponding local classifier and obtains the domain prediction

  3. it computes the binary cross-entropy loss between the domain prediction and the domain label and returns it as a scalar tensor

Parameters:
  • x – input data

  • logits – input logits

  • c – class index

  • source – True if input is source. False if target

  • lamb – controls the gradient reversal strength
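
A hedged sketch of the method; the local_classifiers list comes from the __init__ description, and the nn.BCELoss criterion and label shapes are assumptions:

   import torch
   import torch.nn as nn

   def get_local_adversarial_result_sketch(self, x, logits, c, source=True, lamb=1.0):
       # weight the features by the predicted probability of class c
       local_feat = x * logits[:, c].unsqueeze(1)
       # reverse the gradient flow, scaled by lamb
       local_feat = ReverseLayerF.apply(local_feat, lamb)
       # per-class domain prediction from the c-th local classifier
       domain_pred = self.local_classifiers[c](local_feat)
       fill = 1.0 if source else 0.0
       domain_label = torch.full((len(x), 1), fill, device=domain_pred.device)
       return nn.BCELoss()(domain_pred, domain_label)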

update_dynamic_factor(epoch_length)

update_dynamic_factor.

This method updates the dynamic factor for the DAAN loss based on the global and local losses.

  1. it takes the epoch length as an input and computes the average global and local losses by dividing the d_g and d_l variables by the epoch length

  2. it computes the dynamic factor as one minus the ratio of the average global loss to the sum of the average global and local losses

  3. it resets the d_g and d_l variables to zero

Parameters:

epoch_length – length of current epoch
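
A sketch of the update described above; the fallback value when both averages are zero is an assumption:

   def update_dynamic_factor_sketch(self, epoch_length):
       # average the accumulated global and local losses over the epoch
       d_g = self.d_g / epoch_length
       d_l = self.d_l / epoch_length
       if d_g + d_l == 0:
           self.dynamic_factor = 0.5  # assumption: neutral weighting
       else:
           self.dynamic_factor = 1 - d_g / (d_g + d_l)
       # reset the accumulators for the next epoch
       self.d_g, self.d_l = 0, 0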

class losses.Discriminator(input_dim=256, hidden_dim=32)

Bases: Module

Discriminator.

This class inherits from the PyTorch Module class and represents a discriminator network that can be used for domain adaptation.

__init__.

This method initializes the Discriminator object and sets the attributes according to the parameters. It also creates a list of layers as a sequential module of the Discriminator object. The layers consist of two linear layers with batch normalization and ReLU activation, followed by a linear layer with sigmoid activation.

Parameters:
  • input_dim – input data dimension

  • hidden_dim – hidden layers dimension
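
A sketch of the layer stack described above, using the documented defaults:

   import torch.nn as nn

   input_dim, hidden_dim = 256, 32  # documented default dimensions
   layers = nn.Sequential(
       nn.Linear(input_dim, hidden_dim),
       nn.BatchNorm1d(hidden_dim),
       nn.ReLU(),
       nn.Linear(hidden_dim, hidden_dim),
       nn.BatchNorm1d(hidden_dim),
       nn.ReLU(),
       nn.Linear(hidden_dim, 1),
       nn.Sigmoid(),  # probability of belonging to the source domain
   )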

forward(x)

forward.

This method performs the forward pass of the discriminator network and returns the output as a tensor. The method takes the input data as a tensor and passes it through the layers of the discriminator network. The output is a tensor of the probability of the input data belonging to the source domain.

Parameters:

x – input

class losses.LambdaScheduler(gamma=1.0, max_iter=1000, **kwargs)

Bases: Module

LambdaScheduler.

This class inherits from the PyTorch Module class and represents a lambda scheduler that can be used for DAAN and Adversarial losses.

__init__.

This method initializes the LambdaScheduler object and sets the attributes according to the parameters. It also initializes the current iteration to zero.

Parameters:
  • gamma – coefficient for the lambda scheduler

  • max_iter – maximum number of iterations for the lambda scheduler

  • kwargs – additional arguments

lamb()

lamb.

This method computes and returns the lambda value for the lambda scheduler.

It uses the current iteration, the maximum iteration, and the gamma coefficient to calculate the lambda value as a sigmoid function.

step()

step.

This method updates the current iteration for the lambda scheduler.

It increments the current iteration by one, but does not exceed the maximum iteration.
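
A sketch of both methods, assuming the common sigmoid ramp-up schedule from the domain-adversarial training literature:

   import math

   def lamb_sketch(curr_iter, max_iter=1000, gamma=1.0):
       # sigmoid ramp over training progress p in [0, 1]
       p = curr_iter / max_iter
       return 2.0 / (1.0 + math.exp(-gamma * p)) - 1.0

   def step_sketch(curr_iter, max_iter=1000):
       # advance the iteration counter without exceeding max_iter
       return min(curr_iter + 1, max_iter)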

class losses.MMDLoss(kernel_type='rbf', kernel_mul=2.0, kernel_num=5, fix_sigma=None, **kwargs)

Bases: Module

MMDLoss.

This class inherits from the PyTorch Module class and represents a loss function for domain adaptation based on the maximum mean discrepancy (MMD) criterion.

__init__.

Parameters:
  • kernel_type – type of kernel to use. Possible values are ‘linear’ and ‘rbf’

  • kernel_mul – multiplier for the kernel bandwidth

  • kernel_num – number of kernels to use

  • fix_sigma – fixed bandwidth for the kernel. If None, the bandwidth will be computed from the data

  • kwargs – additional arguments

forward(source, target)

forward.

This method computes and returns the MMD loss for domain adaptation.

  1. it takes the source and target tensors as inputs and checks the kernel type attribute:

     • if the kernel type is ‘linear’, it calls the linear_mmd2 method and returns the result

     • if the kernel type is ‘rbf’, it calls the gaussian_kernel method and computes the MMD loss as the mean difference between the kernel values of the source-source, target-target, source-target, and target-source pairs

  2. it returns the MMD loss as a scalar tensor

Parameters:
  • source – source tensor

  • target – target tensor
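
A minimal usage sketch with the default RBF kernel (feature shapes are illustrative):

   import torch

   mmd = MMDLoss(kernel_type='rbf', kernel_mul=2.0, kernel_num=5)
   source_feat = torch.randn(32, 256)
   target_feat = torch.randn(32, 256)
   loss = mmd(source_feat, target_feat)  # scalar MMD estimate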

gaussian_kernel(source, target, kernel_mul, kernel_num, fix_sigma)

gaussian_kernel.

This method computes and returns the sum of Gaussian kernels for the MMD computation.

  1. it takes the source and target tensors as inputs and concatenates them into a total tensor

  2. it computes the pairwise L2 distance between the rows of the total tensor

  3. it computes the bandwidth for the Gaussian kernels, either using the fix_sigma parameter or the data statistics

  4. it creates a list of bandwidth values by multiplying the base bandwidth by the kernel_mul parameter to the power of the kernel index

  5. it computes a list of kernel values by applying the Gaussian kernel function to the L2 distance matrix with each bandwidth value

  6. it returns the sum of the kernel values as a tensor

Parameters:
  • source – source tensor

  • target – target tensor

  • kernel_mul – multiplier for the kernel bandwidth

  • kernel_num – number of kernels to use for the MMD computation

  • fix_sigma – fixed bandwidth for the kernel. If None, the bandwidth will be computed from the data
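
A hedged sketch of the six steps above; the use of squared distances and the off-diagonal mean as the bandwidth estimate are assumptions:

   import torch

   def gaussian_kernel_sketch(source, target, kernel_mul=2.0,
                              kernel_num=5, fix_sigma=None):
       n = source.size(0) + target.size(0)
       total = torch.cat([source, target], dim=0)
       # pairwise squared L2 distances between all rows
       l2 = torch.cdist(total, total, p=2).pow(2)
       # base bandwidth: fixed, or estimated from the mean pairwise distance
       bandwidth = fix_sigma if fix_sigma is not None else l2.sum() / (n * n - n)
       # geometric ladder of bandwidths centered on the base value
       bandwidth = bandwidth / kernel_mul ** (kernel_num // 2)
       bw_list = [bandwidth * kernel_mul ** i for i in range(kernel_num)]
       # sum of one Gaussian kernel per bandwidth
       return sum(torch.exp(-l2 / bw) for bw in bw_list)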

linear_mmd2(f_of_X, f_of_Y)

linear_mmd2.

This method computes and returns the linear MMD loss for the MMD computation.

  1. it takes the source and target tensors as inputs

  2. it computes the mean difference between them

  3. it computes the dot product of the mean difference with itself

  4. it returns it as a scalar tensor

Parameters:
  • f_of_X – source tensor

  • f_of_Y – target tensor
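
The four steps above amount to a few lines:

   import torch

   def linear_mmd2_sketch(f_of_X, f_of_Y):
       # mean difference between source and target features
       delta = f_of_X.mean(dim=0) - f_of_Y.mean(dim=0)
       # squared L2 norm of the mean difference
       return delta.dot(delta)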

class losses.ReverseLayerF(*args, **kwargs)

Bases: Function

ReverseLayerF.

This class inherits from the PyTorch Function class and implements a custom function that reverses the gradient flow during the backward pass.

static backward(ctx, grad_output)

backward.

This method performs the backward pass of the function: it returns the incoming gradient negated and multiplied by the alpha parameter as the gradient for the input. It also returns None as the gradient for the alpha parameter, since alpha does not require a gradient.

Parameters:
  • ctx – context object storing information for the backward pass

  • grad_output – gradient of the output w.r.t. the loss

static forward(ctx, x, alpha)

forward.

This method performs the forward pass of the function and returns the input as the output. It also stores the alpha parameter in the context object for the backward pass.

Parameters:
  • ctx – context object storing information for the backward pass

  • x – input data tensor

  • alpha – controls the strength of the gradient reversal
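
As a custom autograd Function, ReverseLayerF is invoked through apply rather than instantiated. A minimal sketch, assuming the backward pass negates the incoming gradient and scales it by alpha as described above:

   import torch

   x = torch.randn(8, 256, requires_grad=True)
   y = ReverseLayerF.apply(x, 1.0)  # identity in the forward pass
   y.sum().backward()
   # d(sum)/dx would be +1 everywhere, so the reversed gradient is -1
   assert torch.allclose(x.grad, -torch.ones_like(x))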

class losses.TransferLoss(loss_type, **kwargs)

Bases: Module

TransferLoss.

This class inherits from the PyTorch Module class and provides a single entry point to the losses in this module: the loss_type argument selects which loss to instantiate.

__init__.

Parameters:
  • loss_type – name of the transfer loss to use; selects one of the losses in this module

  • kwargs – additional arguments forwarded to the selected loss

forward(source, target, **kwargs)

forward.

This method computes and returns the selected transfer loss for the given source and target tensors.

Parameters:
  • source – source tensor

  • target – target tensor

  • kwargs – additional arguments forwarded to the underlying loss
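
A minimal usage sketch; the ‘mmd’ loss-type string and the forwarded kernel_type argument are assumptions based on the losses available in this module:

   import torch

   criterion = TransferLoss(loss_type='mmd', kernel_type='rbf')
   source_feat = torch.randn(32, 256)
   target_feat = torch.randn(32, 256)
   loss = criterion(source_feat, target_feat)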